Why You Should Be Using Location Data in the Development of Your AI Models
January 26, 2023
Artificial intelligence (AI) algorithms are only as good as the data on which they are trained. Using bad data in an AI algorithm produces results that can be biased, whether intentionally or not, and can lead organizations toward misguided strategic business decisions. High-quality, timely, and accurate data that is fully representative of the population is imperative to developing robust AI solutions that can help businesses enhance customer experience, brand loyalty, supply chain flow, and more.
But what defines a “quality” dataset? How can AI developers ensure that the data they use to train their algorithms is unbiased and inclusive of all segments of the population? While we might be biased (pun fully intended), we believe location data belongs among the data sources regardless of the AI’s application.
Here’s why. Nearly every American (97%) owns a mobile device, and 85% own a smartphone. We carry these devices quite literally everywhere we go – when was the last time you saw someone without one? Because they are so widely adopted and go wherever their owners go, mobile devices can capture a nearly universal accounting of the activities of a given population, especially when used alongside other data sources.
By using the insights produced by location data, businesses can get up-to-date, accurate information about how their customers engage with them in the real world. Businesses can also use these insights to identify consumer trends, monitor their supply chain operations, and choose where to open their next store.
How to Tell Good Datasets From Bad
The analyst firm IDC forecasts that the artificial intelligence category will increase by more than 25% in value by 2026. As AI moves beyond the technology industry and is used by teams that are potentially less data-savvy, companies will still need to ensure the datasets selected to train their models are high-quality, robust, and inclusive. At Gravy, we believe there are four types of questions to ask when evaluating datasets:
- Do you know the source of the data? Is the data provider transparent about its origin? And does it include the right attributes for your analysis?
- Is the data accurate? How has it been verified and qualified for inclusion in the dataset?
- Is the dataset representative of the population you are interested in analyzing? Is it large enough to generate reliable analysis?
- How timely is the data? How often does it get updated to add new data points or remove outdated information?
This last set of questions is vital for teams seeking to launch AI solutions today. Nearly a third (32%) of sales and marketing executives who launched an AI program during the pandemic told consulting firm McKinsey that their machine learning models failed because they had been trained on pre-COVID data.
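The timeliness and representativeness questions above lend themselves to simple numeric checks. The Python sketch below is illustrative only, not a description of how Gravy evaluates data: the function names, the 30-day freshness window, and the sample segments are all assumptions chosen for the example. It computes the share of records that are recent, and compares each segment's share of the dataset with its share of the population.

```python
from datetime import datetime, timedelta


def freshness_share(timestamps, as_of, max_age_days=30):
    """Return the fraction of records no older than max_age_days at `as_of`.

    The 30-day default is an arbitrary, illustrative threshold.
    """
    if not timestamps:
        return 0.0
    cutoff = as_of - timedelta(days=max_age_days)
    recent = sum(1 for ts in timestamps if ts >= cutoff)
    return recent / len(timestamps)


def representation_gaps(sample_counts, population_shares):
    """Return (sample share - population share) for each population segment.

    A positive gap means the segment is over-represented in the dataset;
    a negative gap means it is under-represented.
    """
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("sample_counts must contain at least one record")
    return {
        segment: sample_counts.get(segment, 0) / total - share
        for segment, share in population_shares.items()
    }


# Hypothetical inputs, made up for illustration.
observed = [
    datetime(2023, 1, 20),
    datetime(2023, 1, 1),
    datetime(2022, 6, 1),
    datetime(2020, 2, 1),  # a pre-COVID record
]
print(freshness_share(observed, as_of=datetime(2023, 1, 26)))
print(representation_gaps({"urban": 80, "rural": 20},
                          {"urban": 0.5, "rural": 0.5}))
```

A low freshness share, or a large gap for any one segment, would be a signal to ask the data provider more questions before training a model on the dataset. What counts as "low" or "large" depends on the application.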
How Location Data Is Being Used Today
A wide array of organizations use location data in unique ways. For example, National CineMedia (NCM), which sells ads that are shown in movie theaters, wanted to augment the U.S. Census data it traditionally relied on to develop demographic profiles of moviegoers. When it added Gravy’s data analysis, the company was able to identify the types of ads that audiences in its partner theaters wanted to see. This improved NCM’s ability to place ads for the local brands, amusement parks, and concert venues of most interest to those audiences.
Location data can be used to generate datasets that identify meaningful consumer personas. Brands can use this data to train their AI models to better understand consumer behavior and purchase intent for various groups, from frequent shoppers and in-market auto buyers to new homeowners and recent graduates. Analysis of this data could influence where a supermarket chain might decide to open a new location in an expanding suburb, for example.
When seeking insight into the actions of your target consumer, it’s important that the dataset fully represents the population of the community. Care needs to be taken, however, so that potentially identifiable data about individuals remains anonymous. Aggregated, anonymized data is the best source for analysis. Stripping personal data from a source that represents a population before using it to train an algorithm helps reduce unconscious bias and protects the privacy of individuals.
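To make "aggregated, anonymized" concrete, here is a minimal Python sketch of one common approach: counting distinct devices per map grid cell and suppressing cells with too few devices. It is an illustration under stated assumptions, not Gravy's actual pipeline. The function name `aggregate_visits`, the 0.01-degree cell size, and the minimum of five devices are all hypothetical choices, and a suppression threshold on its own does not guarantee anonymity.

```python
import math
from collections import defaultdict


def aggregate_visits(pings, cell_size=0.01, min_devices=5):
    """Count distinct devices per grid cell and drop sparsely populated cells.

    pings: iterable of (device_id, latitude, longitude) tuples.
    Returns {(lat_index, lon_index): distinct_device_count}. Device IDs
    are used only for de-duplication and never appear in the output.
    cell_size (degrees) and min_devices are illustrative defaults.
    """
    devices_by_cell = defaultdict(set)
    for device_id, lat, lon in pings:
        # Snap each coordinate to the index of the grid cell containing it.
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        devices_by_cell[cell].add(device_id)

    # Suppress cells seen by fewer than min_devices distinct devices.
    return {
        cell: len(devices)
        for cell, devices in devices_by_cell.items()
        if len(devices) >= min_devices
    }


# Hypothetical pings, made up for illustration.
sample_pings = [
    ("device-a", 38.955, -77.345),
    ("device-a", 38.956, -77.346),  # same device, same cell: counted once
    ("device-b", 38.955, -77.345),
    ("device-c", 38.957, -77.344),
    ("device-d", 40.005, -75.005),  # lone device: its cell is suppressed
]
print(aggregate_visits(sample_pings, min_devices=3))
```

The output contains only cell-level counts, so a model trained on it learns from where groups of devices go rather than from any one person's movements.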
Businesses should pay close attention to the accuracy of the location data they source to train their AI algorithms. Only after data has been verified and tested for accuracy can companies extract the insights that lead to a near real-time understanding of their customers, operations, and other elements of the business.
These insights can help organizations identify key prospects and customers, as well as determine the right ways to engage with each group. Businesses can also leverage these insights to develop new products and services, better manage supply chain operations, determine the best location for a new store, and more. When location data is added to the AI algorithm training mix, companies (and consumers) can only benefit.
This post is part of our five-post blog series on location intelligence, which aims to educate readers about location data and its uses while also dispelling common misconceptions. To learn more about how your organization can benefit from the use of location intelligence, contact an expert from Gravy Analytics today.