Design Choropleth Colors & Intervals
When you build a choropleth map, your choices about how to represent data with colors will determine its overall appearance, so it’s important to learn key principles. Good choropleth maps make true and insightful geographic patterns clearly visible to readers, whether they are printed in black-and-white on paper or displayed in color on a computer screen. Furthermore, the best choropleth maps are designed to be interpreted correctly by people with colorblindness. For an excellent overview of visualization colors in general, see Lisa Charlotte Rost’s “Your Friendly Guide to Colors in Data Visualisation” and “How to Pick More Beautiful Colors for Your Data Visualizations,” both on the Datawrapper blog.[23]
To illustrate key concepts about colors in choropleth map design, let’s explore a wonderful tool called ColorBrewer, created by Cynthia Brewer and Mark Harrower. See the interface in Figure 8.4. Since ColorBrewer is a design assistant, do not expect to upload your data into it to create a map. Instead, ColorBrewer will recommend color palettes that work best with our map data and the type of story we wish to tell, and allow us to export those color codes into our preferred mapping tool.
In this section, we’ll focus on two important decisions you’ll need to make when designing choropleth maps: choosing the type of color palette (sequential, diverging, or qualitative) and choosing the intervals that group similar data values under the same color.
When you open ColorBrewer, the top row asks you to select the number of data classes in your choropleth map, which means the number of intervals or steps in your color range. This design tool can recommend distinct colors for up to twelve data classes, depending on the type of scheme you select. But for now, use the default setting of 3, and we’ll return to this topic when we discuss intervals in more detail below.
Choose Choropleth Palettes to Match Your Data
One of the most important decisions you’ll make when designing a choropleth map is to select the type of palette. You’re not simply choosing a color, but the arrangement of colors to help readers correctly interpret your information. The rule is straightforward: choose an appropriate color palette that matches your data format, and the story you wish to tell.
ColorBrewer groups palettes into three types—sequential, diverging, and qualitative—as shown in Figure 8.5.
Sequential palettes work best to show low-to-high numeric values. Examples include anything that can be placed in sequence on a scale, such as median income, amount of rainfall, or percent of the population who voted in the prior election. Sequential palettes can be single-hue (such as different shades of blue) or multi-hue (such as yellow-orange-red). Darker colors usually represent higher values, but not always.
Diverging palettes work best to show numbers above and below a standard level (such as zero, the average, or median value). They typically have two distinct hues to represent positive and negative directions, with darker colors at the extremes, and a neutral color in the middle. Examples include income above or below the median level, rainfall above or below seasonal average, or percentage of voters above or below the norm.
Qualitative palettes work best to show categorical data, rather than numeric scales. They typically feature unique colors that stand apart from one another to emphasize differences. Examples include different types of land use (residential, commercial, open space, water). They also can represent categories such as a warning system that resembles a stoplight (green, yellow, and red), as these specific colors must be manually assigned to be correctly interpreted.
Choose an appropriate palette that matches your data format and the story you wish to tell. For example, we began with the same data on income per capita in the contiguous US states in 2018, but modified it to demonstrate the interpretive strengths of each palette, as shown in Figure 8.6.
The first map shows a sequential color scheme with five shades of blue to illustrate the low-to-high range of income levels. This map works best for a data story that emphasizes the highest income levels, shown by the darker blue colors along the Northeastern coast from Maryland to Massachusetts.
The second map shows a diverging color scheme to illustrate income levels at the low and high extremes. We modified the data by subtracting the average US per capita income value of $33,381 from each state’s value. This new relative measure is dark orange for states far below the average, and dark purple for states far above it, while a neutral color represents the middle. This map works best for a data story that emphasizes an economic division between lower-income Southern states versus higher-income East Coast and West Coast states.
The third map shows a qualitative color scheme that divides the 48 contiguous states into 3 equal groups based on their per capita incomes, and paints them in the colors of a stoplight (red, yellow, and green) to represent low, middle, and top thirds.
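To make the data preparation behind the second and third maps concrete, here is a minimal Python sketch. The state names and income values are invented for illustration (they are not the 2018 data), and the variable names are our own assumptions; only the national average of $33,381 comes from the text above.

```python
# Hypothetical per capita incomes in US dollars; NOT the real 2018 figures.
incomes = {
    "State A": 25000,
    "State B": 30000,
    "State C": 33000,
    "State D": 36000,
    "State E": 41000,
    "State F": 45000,
}

US_AVERAGE = 33381  # average US per capita income cited in the text

# Diverging map: subtract the national average from each state's value,
# so negative numbers fall below the average and positive numbers above it.
relative = {state: value - US_AVERAGE for state, value in incomes.items()}

# Third map: rank states by income and split them into three equal groups.
ranked = sorted(incomes, key=incomes.get)
group_size = len(ranked) // 3
labels = ["low", "middle", "top"]
thirds = {
    state: labels[min(index // group_size, 2)]
    for index, state in enumerate(ranked)
}

# Stoplight colors are assigned by hand so they carry the intended meaning.
stoplight = {"low": "red", "middle": "yellow", "top": "green"}
colors = {state: stoplight[group] for state, group in thirds.items()}

print(relative)
print(colors)
```

The sequential map needs no transformation at all, since it colors the original income values directly from low to high.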
After you select data classes and a color palette, ColorBrewer displays alphanumeric codes that web browsers translate into colors. You can select hexadecimal codes (#ffffff is white), RGB codes (255,255,255 is white), or CMYK codes (0,0,0,0 is white), and copy or export them into your preferred map tool, if it allows color palettes to be imported.
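If you are curious how these three notations relate, the following Python sketch converts between them. The function names are our own, and the CMYK formula is a naive textbook conversion that ignores printer color profiles, so do not expect it to reproduce the exact CMYK values a design tool lists for a given palette.

```python
def hex_to_rgb(hex_code):
    """Convert a hex code such as '#ffffff' into an (R, G, B) tuple of 0-255 values."""
    digits = hex_code.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_cmyk(r, g, b):
    """Naive RGB-to-CMYK conversion, returned as whole percentages (0-100)."""
    if (r, g, b) == (0, 0, 0):
        return (0, 0, 0, 100)  # pure black: avoid dividing by zero below
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    k = min(c, m, y)
    cmy = tuple(round(100 * (x - k) / (1 - k)) for x in (c, m, y))
    return cmy + (round(100 * k),)


white_rgb = hex_to_rgb("#ffffff")
white_cmyk = rgb_to_cmyk(*white_rgb)
print(white_rgb, white_cmyk)
```

Running it on white confirms that the three codes given above describe the same color.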
Choose Color Intervals to Group Choropleth Map Data
Another important design choice for choropleth maps is the color intervals, which determine how you group and display data by using similar colors. Since your ability to set intervals varies across different mapping tools, this section will explain broad concepts, and specific map tutorials will demonstrate how to apply them.
Some mapping tools allow you to choose between two different types of color intervals to show movement up or down a data scale, as shown in Figure 8.7. Steps are clearly marked color dividers, like a staircase, while continuous is a gradual change in hue, like a ramp.
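The staircase-versus-ramp distinction can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names, the three blue hex codes, and the dividers at 33 and 66 are invented, and the ramp blends colors in plain RGB, whereas real mapping tools may blend in other color spaces.

```python
from bisect import bisect_right


def hex_to_rgb(hex_code):
    digits = hex_code.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def stepped_color(value, breaks, palette):
    """Staircase: every value between two dividers gets exactly the same color.

    `breaks` holds the dividers in ascending order, and `palette` must
    contain one more color than there are dividers.
    """
    return palette[bisect_right(breaks, value)]


def continuous_color(value, low, high, start_hex, end_hex):
    """Ramp: blend gradually from start_hex (at low) to end_hex (at high)."""
    t = (value - low) / (high - low)
    t = max(0.0, min(1.0, t))  # clamp values that fall outside the scale
    start, end = hex_to_rgb(start_hex), hex_to_rgb(end_hex)
    return rgb_to_hex(tuple(round(s + t * (e - s)) for s, e in zip(start, end)))


blues = ["#deebf7", "#9ecae1", "#3182bd"]  # three illustrative blues, light to dark
for value in (10, 40, 60, 90):
    print(value,
          stepped_color(value, [33, 66], blues),
          continuous_color(value, 0, 100, "#ffffff", "#0000ff"))
```

In the output, the values 40 and 60 share one stepped color but receive two slightly different continuous colors, which is the tradeoff discussed next.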
If both options exist, which one is best? There is no clear map design rule about this. On one hand, some recommend using continuous intervals to show greater geographical diversity, except when it’s important to your data story to display a threshold, where steps make sense to show areas above or below a certain line. On the other hand, some point out that people are quite bad at distinguishing different hues on a continuous scale, and so recommend using clearly defined steps to help readers match colors to data values in your legend. Therefore, our general advice is to make design choices that are both honest and insightful: tell the truth about the data and also draw attention to what matters about this interpretation.
Some mapping tools also allow you to choose how to interpolate your data, meaning the method for grouping numbers to represent similar colors on your map. This may involve a two-part decision about step dividers and numerical methods.
First, if you choose steps, how many dividers should you use to slice up your data? Once again, there is no clear rule. Fewer steps create a coarse map that highlights broad differences, while more steps create a granular map that emphasizes geographic diversity between areas. But adding more steps also makes the differences between neighboring colors harder to see. Remember that simply adding more colors does not necessarily make a better map. We recommend experimenting with ColorBrewer to raise or lower the Number of data classes (also known as steps or dividers) for different types of color palettes, to visualize the consequences of your possible design choices, as shown in Figure 8.8. Make decisions with the best interests of your readers in mind, to represent your data in honest and insightful ways.
Second, whether you choose steps or continuous, which interpolation method is the best way to group your data into similar colors on your scale? Map tools may display the options in different ways, as seen in Figure 8.9, and they also may vary depending on whether you selected steps or continuous colors.
- Linear means that the values are placed in a straight line, from lowest to highest. This method works best when the data are evenly distributed, because the colors draw attention to the outliers at the low and high ends of the scale.
- Quantiles means that the values are divided into groups of an equal number. More specifically, quartiles, quintiles, and deciles mean dividing the values into four, five, or ten groups of equal quantity. This method works best when the data are not evenly distributed, because it uses colors to draw more attention to the diversity of groups inside the scale, not just the low and high ends.
- Rounded values are similar to quantiles, but decimals in the scale are replaced with rounded numbers that look nicer to readers’ eyes.
- Natural breaks (Jenks) offers a compromise between linear (which emphasizes the extreme ends) and quantiles (which emphasize internal diversity).
- Custom allows you to manually place dividers wherever you wish along the color scale. We generally recommend against using custom settings because they are more likely to create a misleading map, as you’ll learn in Chapter 15: Detect Lies and Reduce Data Bias.
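To see how the first two methods in the list above divide the same numbers differently, here is a rough Python sketch. The twelve values are made up, the function names are our own, and each mapping tool may calculate its dividers somewhat differently, so treat this as an illustration of the concept rather than a description of any particular tool.

```python
from bisect import bisect_right
from statistics import quantiles


def linear_breaks(values, classes):
    """Dividers spaced evenly between the lowest and highest values."""
    low, high = min(values), max(values)
    width = (high - low) / classes
    return [low + width * i for i in range(1, classes)]


def quantile_breaks(values, classes):
    """Dividers placed so that each class holds about the same number of values."""
    return quantiles(values, n=classes, method="inclusive")


def count_per_class(values, breaks):
    """Count how many values land in each color class."""
    counts = [0] * (len(breaks) + 1)
    for value in values:
        counts[bisect_right(breaks, value)] += 1
    return counts


# Made-up data: eleven similar values plus one extreme outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 100]

for name, breaks in [("linear", linear_breaks(values, 4)),
                     ("quantiles", quantile_breaks(values, 4))]:
    print(name, breaks, count_per_class(values, breaks))
```

With this sample, the linear dividers place eleven values in the lowest class and leave the outlier alone in the highest, while the quartile dividers put three values in each class and reveal differences among the smaller numbers.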
Which interpolation method is best? While there are no rigid rules, some methods above work better for different types of data stories. If you wish to emphasize the lows and highs in your data, choose linear because it uses color to draw attention to the extreme ends of the scale. Or if you wish to emphasize geographic diversity in your data, consider quantiles (or any of its cousins) because they use color to differentiate the middle portions of the scale. Or if you’re not sure what your data looks like, create a histogram, as you learned in Chapter 7: Chart Your Data, to visualize its distribution and help you make wise map design choices.
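If you do not have a charting tool at hand, even a rough text histogram can reveal whether your values bunch together or spread out evenly. The sketch below uses made-up whole numbers, a bin width of 10, and a hypothetical helper function of our own naming.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Return (bin_start, count) pairs for whole-number values, including empty bins."""
    bins = Counter((value // bin_width) * bin_width for value in values)
    return [(start, bins[start])
            for start in range(min(bins), max(bins) + bin_width, bin_width)]


# Made-up data: most values are small, and one is an extreme outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 100]

for start, count in histogram_counts(values, 10):
    print(f"{start:>4} | {'#' * count}")
```

A long bar at one end followed by mostly empty rows signals skewed data, where quantiles may reveal more variation than a linear scale.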
In any case, be very aware of how color palettes and interpolation dramatically shape the appearance of choropleth maps and how readers perceive the data. Always create maps that show the story and tell the truth.
[23] Rost, “Your Friendly Guide to Colors in Data Visualisation”; Rost, “How to Pick More Beautiful Colors for Your Data Visualizations.”