Convert to GeoJSON format

Geospatial data comes in an overwhelming number of file formats. We will tell you about a few most common ones so that you have a general idea of what tools you can use to work with them. But before we do that, let’s talk about the basics of geospatial (map) data.

About geospatial data

The first thing to know about geospatial data is that it consists of two components, location and attribute. When you use Google Maps to search for a restaurant, you get a red marker on the screen that points to the latitude and longitude of the physical location of the restaurant in the real world. These latitude and longitude (two numbers) are your location component. The name of the restaurant, its human-friendly address, and guest reviews are the attributes, which bring value to your location data.

Second, geospatial data can be raster or vector, as illustrated in Figure 14.4. Raster data, as shown in Figure 14.4a, is a grid of cells (“pixels”) of a certain size (for example, 1 meter by 1 meter). For example, satellite images of the Earth that you see on Google Maps are raster geospatial data. Each pixel contains the color of Earth that satellite cameras were able to capture. People and algorithms can then use raster data (images) to create outlines of buildings, lakes, roads, and other objects. These outlines become vector data. For example, most of OpenStreetMap was built by volunteers tracing outlines of objects from satellite images.

Geospatial data can be raster or vector.

Figure 14.4: Geospatial data can be raster or vector.

In this book, we will focus on vector data, which is based on features, which can be points, lines, and polygons, as shown in Figure 14.4b. Vector data can be much more precise than raster data, because point coordinates can be expressed with precise decimals. In addition, vector data can contain as much extra attribute information about each object as desired, whereas raster data is generally limited to 1 value per cell, whether it is the surface color (as is the case with satellite imagery), temperature, or altitude. Moreover, vector map files tend to be smaller in size than raster ones, which is important if you share files on the web.

Let’s take a look at some of the most common vector file formats.

GeoJSON

GeoJSON is a newer, popular open format for map data that comes in .geojson or .json files. It was first developed in 2008, and then standardized in 2016 by the Internet Engineering Task Force (IETF). The code snippet below represents a single point with latitude of 41.76 and longitude of -72.67 in GeoJSON format. That point has a name attribute (property) whose value is Hartford.

{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [-72.67, 41.76]
  },
  "properties": {
    "name": "Hartford"
  }
}

Other feature types in GeoJSON can be polygons (Polygon) and line strings (LineString, also known as polylines), which are represented as arrays of points.

The simplicity and readability of GeoJSON allows you to edit it even in the most simple text editor. We strongly recommend you use and share your map data in GeoJSON. Web-based maps, such as those built with Leaflet, Mapbox, Google Maps JS API, and Carto, as well as ArcGIS and QGIS all support GeoJSON. By having your geospatial data stored and shared in GeoJSON, you ensure you can use it on the web with nearly any mapping tool. You can also be confident that other people will be able to use and extract data from the file without bulky and often expensive GIS software installed.

Also, your GitHub repository will automatically display any GeoJSON files in a map view, like is shown in Figure 14.5.

GitHub can show previews of GeoJSON files stored in repositories.

Figure 14.5: GitHub can show previews of GeoJSON files stored in repositories.

Warning: In GeoJSON, coordinates are ordered in longitude-latitude format, the same as X-Y coordinates in mathematics. This is the opposite of Google Maps and some other web map tools, which order values as latitude-longitude. For example, Hartford, Conn. is located at (-72.67, 41.76) according to GeoJSON, but at (41.76, -72.67) in Google Maps. Neither notation is right or wrong, just make sure you know which one you are dealing with. Tom MacWright created a great summary table showing lat/lon order of different geospatial formats and technologies.

Shapefiles

The shapefile format was created in the 1990s by Esri, the company that develops ArcGIS software. Shapefiles typically appear as a folder of subfiles with suffixes such as .shp, .shx, and .dbf. The folder with shapefiles is often compressed in a .zip file.

Although government agencies commonly distribute map data in shapefile format, the standard tools for editing these files—ArcGIS and its free and open-source cousin, QGIS—are not as easy to learn as other tools in this book. For this reason, we recommend converting shapefiles into GeoJSON files if possible. Mapshaper, discussed a bit later in the chapter, can perform such conversion.

GPS Exchange Format (GPX)

If you ever exported your Strava run or a bike ride from a GPS device, chances are you ended up with a .gpx file. GPX is an open standard and is based on XML markup language. Like GeoJSON, you can inspect a GPX file in any simple text editor to see its contents. Most likely, you will see a collection timestamps and latitude/longitude coordinates of the recording GPS device at that particular time. You should be able to convert GPX to GeoJSON with GeoJson.io utility discussed later in this chapter.

Keyhole Markup Language (or KML)

The KML format rose in popularity during the late 2000s. It was developed for Google Earth, a free and user-friendly tool that allowed many people to view and edit two- and three-dimensional geographic data. KML files were often used with maps powered by Google Fusion Tables, but that became history in late 2019. GeoJson.io should be able to convert your KML file into a GeoJSON.

Sometimes .kml files are distributed in a compressed .kmz format. See Converting from KMZ to KML format section of this book to learn to convert.

MapInfo TAB

Similar to Esri’s shapefiles, MapInfo’s TAB format comes as a folder with .tab, .dat, .ind, and some other files. It is a proprietary format created and supported by MapInfo, Esri’s competitor, and is designed to work well with MapInfo Pro GIS software. Unfortunately, you will most likely need MapInfo Pro, QGIS, or ArcGIS to re-save these as GeoJSON or a Shapefile.

We’ve mentioned only a handful of the most common geospatial file formats. There is a myriad of other, less known formats for both raster and vector data. Remember that GeoJSON is one of the best, most universal formats for your vector data, and we strongly recommend to store and share your map data in GeoJSON. In the next section, we will look at free online tools to create, convert, join, crop, and in other ways manipulate GeoJSON files.