Pivot Address-Level Point Data into Polygon Data

If you deal with geographical data, you may find yourself in a situation where you have a list of addresses which need to be counted (aggregated) by area and displayed as a polygon map. In this case, a simple pivot table in a spreadsheet software can solve the problem.

Note: A special case of a polygon map is a choropleth map, which represents polygons that are colored in a particular way to represent underlying values. A lot of polygon maps end up being choropleth maps, so we will be using this term a lot in this book.

Let’s take a look at a list of all hospitals that are registered with the Medicare program in the United States. The dataset is stored and displayed by Socrata, a web database popular among government agencies and city administrations. This particular dataset has information on each hospital’s name, location (nicely divided into Address, City, State, and ZIP Code columns), a phone number and some other indicators, such as mortality and patient experience.

Now, imagine you are given a task to create a choropleth map of total hospitals by US state. Instead of showing individual hospitals as points (as in Figuere 11.5a ), you want darker shades of blue to represent states with more hospitals (as in Figure 11.5b).

(ref:pivot-address) You can count addresses by state (or other area) to produce polygon, or choropleth, maps instead of point maps.

(ref:pivot-address)

Figure 11.5: (ref:pivot-address)

First, save the database to your local machine by going to Export > Download > CSV of Socrata interface. Figure 11.6 shows where you can find the Export button.

In Socrata, you can export the entire dataset as a CSV.

Figure 11.6: In Socrata, you can export the entire dataset as a CSV.

Next, open the file in your favorite spreadsheet tool. If you use Google Sheets, use File > Import > Upload to import CSV data. Make sure your address columns are present, and move on to creating a pivot table (in Google Sheets, go to Data > Pivot table, make sure the entire data range is selected, and click Create). In the pivot table, set Rows to State, because we want to get counts by state. Next, set pivot table’s Values to State—or really any other column that has no missing values—and choose Summarize by: COUNTA. Voila!

Use pivot tables in any spreadsheet software to count addresses per area (such as state, county, of zip code).

Figure 11.7: Use pivot tables in any spreadsheet software to count addresses per area (such as state, county, of zip code).

Your aggregated dataset is ready, so save it as a CSV. If you use Google Sheets, go to File > Download > Comma-separated values (.csv, current sheet). You can now merge this dataset with your polygons manually using editing capabilities of GeoJson.io, or merge it all in one go using powerful Mapshaper.

We will introduce both tools in the next few sections. But before we do that, let’s talk about data normalization and why showing counts of hospitals per state doesn’t really tell a good story.