A methodology for combining multiple commercial data sources to improve measurement of the food and alcohol environment: applications of geographical information systems

  • Dara D. Mendez | ddm11@pitt.edu University of Pittsburgh, Graduate School of Public Health, Department of Epidemiology, Pittsburgh, United States.
  • Jessica Duell University of Pittsburgh, Graduate School of Public Health, Department of Epidemiology, Pittsburgh, United States.
  • Sarah Reiser University of Pittsburgh, Graduate School of Public Health, Department of Epidemiology, Pittsburgh, United States.
  • Deborah Martin University of Pittsburgh, Graduate School of Public Health, Department of Epidemiology, Pittsburgh, United States.
  • Robert Gradeck University of Pittsburgh, Center for Social and Urban Research, Pittsburgh, United States.
  • Anthony Fabio University of Pittsburgh, Graduate School of Public Health, Department of Epidemiology, Pittsburgh, United States.

Abstract

Commercial data sources have been increasingly used to measure and locate community resources. We describe a methodology for combining commercial data sources on the food and alcohol environment and for comparing the differences between them. We used data from two commercial databases (InfoUSA and Dun&Bradstreet) for 2003 and 2009 to obtain information on food and alcohol establishments. We developed a matching process that combined computer algorithms with manual review and relied on three elements: addresses geocoded in ArcGIS, standard industrial classification and North American industry classification taxonomy for type of establishment, and establishment name. We constructed population- and area-based density measures (e.g. for grocery stores), assessed differences across data sources and used ArcGIS to map the densities. The matching process resulted in 8,705 and 7,078 unique establishments for 2003 and 2009, respectively. The combined dataset captured more establishments than either data source alone, and the additional establishments captured ranged from 1,255 to 2,752 in 2009. The correlations for the density measures between the two data sources were highest for alcohol outlets (r = 0.75 and 0.79 for per capita and area, respectively) and lowest for grocery stores/supermarkets (r = 0.32 for both). This process for applying geographical information systems to combine multiple commercial data sources and develop measures of the food and alcohol environment captured more establishments than relying on one data source alone. This replicable methodology was found to be useful for understanding the food and alcohol environment when local or public data are limited.
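The matching and density steps summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the study matched records using ArcGIS geocoding, classification codes and manual review, whereas this example substitutes plain string comparison from the Python standard library. The record fields, the sample establishments, the 0.85 name-similarity threshold and the per-1,000 scaling are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, strip punctuation and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(text.split())


def is_match(a, b, name_threshold=0.85):
    """Assumed rule: same normalized address and sufficiently similar name.

    Establishment type (SIC/NAICS) is left out of this sketch.
    """
    if normalize(a["address"]) != normalize(b["address"]):
        return False
    ratio = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    return ratio >= name_threshold


def combine(source_a, source_b):
    """Keep every record from source_a, then add unmatched source_b records."""
    combined = list(source_a)
    for rec in source_b:
        if not any(is_match(rec, kept) for kept in combined):
            combined.append(rec)
    return combined


def density_per_1000(count, population):
    """Population-based density: establishments per 1,000 residents."""
    return 1000.0 * count / population


# Hypothetical records standing in for the two commercial databases.
infousa = [
    {"name": "Joe's Market", "address": "123 Main St."},
    {"name": "Corner Tavern", "address": "45 Oak Ave"},
]
dnb = [
    {"name": "Joes Market", "address": "123 main st"},
    {"name": "Fresh Foods", "address": "9 Elm Rd"},
]

combined = combine(infousa, dnb)
print(len(combined), "unique establishments")
print(density_per_1000(len(combined), 4000), "per 1,000 residents")
```

An area-based density would follow the same pattern, with land area in place of population in the denominator. In practice, records that fall just below the similarity threshold are the natural candidates for the manual review the abstract mentions.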

Published
2014-11-01
Section
Original Articles
Keywords:
food and alcohol establishments, neighbourhood, geocoding, geographical information systems, database, USA.

How to Cite
Mendez, D. D., Duell, J., Reiser, S., Martin, D., Gradeck, R., & Fabio, A. (2014). A methodology for combining multiple commercial data sources to improve measurement of the food and alcohol environment: applications of geographical information systems. Geospatial Health, 9(1), 71-96. https://doi.org/10.4081/gh.2014.7