XY Scatter and Bubble Charts with Google Sheets

Consider using XY scatter charts, also known as scatterplots, to display data coordinates to show the relationship between two variables. The first example below compares the relationships between life expectancy (shown on the X axis) and fertility (shown on the Y axis), which each nation is represented as a dot (an X-Y coordinate). Bubble charts are basically scatter charts on steroids, meaning that they can display the relationship of up to five variables. Further below you’ll build a bubble chart based on the same XY life expectancy-fertility dataset, with added variables for population (displayed as circle size) and region of the world (displayed as circle color).

Fancier bubble charts animate the circles to represent one more variable: change over time. Such animated bubble charts were popularized by Hans Rosling, a renowned Swedish professor of global health.

Note: We recommend you watch one of Hans Rosling’s famous TED talks to see animated bar charts in action. You can also visit Gapminder Foundation website to see more data visualizations and learn more about Hans’s work and legacy.

XY Scatter chart

The scatter chart in Figure 5.29 uses World Bank data to reveal a downward slope: nations with lower fertility (births per woman) tend to have higher life expectancy. You can also phrase is the other way, nations with higher life expectancy at birth have lower fertility. Remember that correlation does not mean causation, so you cannot use this chart to argue that fewer births result in longer lives, or that longer-living females give birth to fewer children.

Figure 5.29: This scatter chart shows that nations with lower fertility tend to have higher life expectancy. See data by World Bank. Explore the full-screen interactive version.

The data used in Figure 5.29 is available from our Google Sheets Scatter chart template. You can copy it to your own Google Drive so that you’re able to edit it (go to File > Make a copy).

Figure 5.30 shows the first few rows of the dataset. Notice that the data is structured in three columns. The first column, Life Expectancy, is plotted on the x-axis (horizontal). The second column, Fertility, is plotted on the y-axis (vertical). The third column contains Country labels.

Data for a scatterplot is usually represented in 3 columns: x-values, y-values, and labels.

Figure 5.30: Data for a scatterplot is usually represented in 3 columns: x-values, y-values, and labels.

To build a scatter chart, select the two columns that contain your numeric data, and go to Insert > Chart. Google Sheets will likely to guess the chart type and you will see a scatterplot, but if not, you can always manually pick Scatter chart from the Chart type dropdown. Make sure your x-axis is set to Life Expectancy, and your Series shows Fertility. Note that both Life Expectancy and Fertility have 123 icon, meaning they are numeric.

You will see a lot of scatter charts out there that do not label data points, and that’s okay. Some scatter plots are designed to show whether or not there is a correlation, and knowing which points are which is not important. But sometimes labels are important for your storytelling.

In Chart editor, open the kebab menu (3 dots) of your Series dataset (Fertility), and then Add labels (see Figure 5.31). The labels added by default will be the x-values of points. To make Google Sheets read labels from the third column (Country), click the name of your label dataset (Life Expectancy), then Select a data range button in the upper-right corner of the dropdown, and choose cells in the relevant columns. Make sure to include the header (first row) if all other data ranges include it.

In the chart’s Setup window, choose Add labels to the Series.

Figure 5.31: In the chart’s Setup window, choose Add labels to the Series.

Tip: You may notice that some data points are too close to edges, and their labels are cut off. To fix this, go to Customize tab of the Chart editor. There, you can set minimum and maximum values for both horizontal and vertical axes. Unlike in bar charts, axes in scatter plots do not have to start at zero. You can set your minimum and maximum values to be a few units below and above the extreme points of your data range.

Bubble chart with 3 columns

In this tutorial, we will show you a little trick that you can use if you want a scatter chart with both data values displayed in a tooltip. We will use the same World Bank dataset as we did for the scatter plot.

The bubble chart (more about the proper use of bubble charts in the next section) in Figure 5.32 shows the same data as our scatterplot on life expectancy vs fertility.

In the interactive version of the chart, hover your cursor over each bubble (dot) to reveal a tooltip with the country name and the two data points.

Figure 5.32: This bubble chart is essentially a scatter chart, because no other dimensions (colors, sizes) are used. See data by World Bank. Explore the full-screen interactive version.

The data for this example is available in Google Sheets Bubble chart with 3 columns template.

Notice that we moved the labels column (Country) to be the first one in the dataset, but the order shouldn’t matter in this case. So our first column is the label for each bubble, the second column is the data to be plotted on horizontal x-axis, and the third column (fertility) will be placed on the y-axis.

Select all three columns, and go to Insert > Chart. Google Sheets will likely create a stacked column chart by default, so choose Bubble from the Chart type dropdown window.

If you want to remove labels from the bubbles, remove the ID series (click on the kebab menu > Remove).

Unfortunately, there is no easy way to reduce all bubbles to a uniformly smaller size. In the following section, we will introduce you to the proper way of using bubble charts.

Bubble chart with 5 columns

Bubble charts are a good alternative to scatter charts if you need to include one or two extra series in addition to your x- and y-coordinates. One of those can be expressed through bubble size (bigger bubbles represent larger values). Another one can make use of color (best for categorical data).

The bubble chart in Figure 5.33 shows fertility and life expectancy for a subset of the nations, with population (shown by bubble size) and region (shown by bubble color). Float your cursor over bubbles to view data details in the interactive version of the chart.

Figure 5.33: This bubble chart shows fertility and life expectancy for several countries, including their population (shown by bubble size) and region (shown by bubble color). See data by World Bank. Explore the full-screen interactive version.

The five-column dataset is available in this Google Sheets Bubble chart with 5 columns template. The columns are arranged in the following order: country label, x-axis value, y-axis value, color, and bubble size.

Bubble chart data. Bubble size represents population, color – region.

Figure 5.34: Bubble chart data. Bubble size represents population, color – region.

Select all data and go to Insert > Chart, and choose Bubble as the Chart type. Make sure your ID, X-axis, Y-axis, Series, and Size fields contain the series you want to display, and make sure to have Use row 1 as headers option checked.

To change labels color, go to Customize tab of the Chart editor, and set Text color under the Bubble menu. Make it gray or black, so that it won’t interfere with the bubble colors themselves.

Tip: If some of your bubbles are too close to the borders, set Min and Max values for the axis manually under Horizontal axis and Vertical axis menus.