React for Data Visualization

Render a choropleth map of the US

With our data in hand, it's time to draw some pictures. A choropleth map will show us the best places to be in tech.

We're showing the delta between median household salary in a statistical county and the average salary of an individual tech worker on a visa. The darker the blue, the higher the difference.

The more a single salary can out-earn an entire household, the better off you are.

Choropleth map with shortened dataset

There's a lot of gray on this map because the shortened dataset doesn't cover many counties. The full dataset is going to look better, I promise.

Turns out immigration visa opportunities for techies aren't evenly distributed throughout the country. Who knew?

Just like before, we're going to start with changes in our App component, then build the new bit.

Step 1: Prep App.js

You might guess the pattern already: add an import, add a helper method or two, update render.

// src/App.js
import Preloader from "./components/Preloader"
import { loadAllData } from "./DataHandling"
// Insert the line(s) between here...
import CountyMap from "./components/CountyMap"
// ...and here.

That imports the CountyMap component from components/CountyMap/. Your browser will show an error overlay about some file or another until we're done.

In App.js, we add a countyValue function. It takes a county entry and a map of tech salaries, and it returns the delta between median household income and a single tech salary. Because it reads medianIncomes, define it where that dataset is in scope.

// src/App.js
function countyValue(county, techSalariesMap) {
  const medianHousehold = medianIncomes[county.id],
    salaries = techSalariesMap[county.name]
  if (!medianHousehold || !salaries) {
    return null
  }
  const median = d3.median(salaries, (d) => d.base_salary)
  return {
    countyID: county.id,
    value: median - medianHousehold.medianIncome,
  }
}

We use medianIncomes to get the median household salary and the techSalariesMap input to get salaries for a specific census area. Then we use d3.median to calculate the median value for salaries and return a two-element dictionary with the result.

countyID specifies the county and value is the delta that our choropleth displays.
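
To make the return value concrete, here is a minimal standalone sketch with made-up numbers. The sample county, incomes, and salaries are hypothetical, and the median helper is a plain-JavaScript stand-in for d3.median so the snippet runs without any dependencies.

```javascript
// Stand-in for d3.median (hypothetical helper): middle value of the sorted numbers.
function median(numbers) {
  const sorted = [...numbers].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

// Made-up sample data, shaped like the datasets used in this chapter.
const medianIncomes = { "06075": { medianIncome: 80000 } }
const techSalariesMap = {
  "San Francisco": [
    { base_salary: 100000 },
    { base_salary: 120000 },
    { base_salary: 140000 },
  ],
}

function countyValue(county, techSalariesMap) {
  const medianHousehold = medianIncomes[county.id]
  const salaries = techSalariesMap[county.name]
  if (!medianHousehold || !salaries) {
    return null // no data for this county, so the map leaves it blank
  }
  const medianSalary = median(salaries.map((d) => d.base_salary))
  return {
    countyID: county.id,
    value: medianSalary - medianHousehold.medianIncome,
  }
}

const known = countyValue({ id: "06075", name: "San Francisco" }, techSalariesMap)
const unknown = countyValue({ id: "00000", name: "Nowhere" }, techSalariesMap)
console.log(known) // { countyID: '06075', value: 40000 }
console.log(unknown) // null
```

Counties that return null get filtered out before they reach the map, which is why they show up gray.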

In the render method, we'll:

  • prep a list of county values
  • remove the "data loaded" indicator
  • render the map
// src/App.js
render() {
  // ...
  if (techSalaries.length < 1) {
    return (
      <Preloader />
    );
  }
  // Insert the line(s) between here...
  const filteredSalaries = techSalaries,
    filteredSalariesMap = _.groupBy(filteredSalaries, 'countyID'),
    countyValues = countyNames.map(
      county => countyValue(county, filteredSalariesMap)
    ).filter(d => !_.isNull(d));
  let zoom = null;
  // ...and here.
  return (
    <div className="App container">
      <svg width="1100" height="500">
        <CountyMap usTopoJson={usTopoJson}
                   USstateNames={USstateNames}
                   values={countyValues}
                   x={0}
                   y={0}
                   width={500}
                   height={500}
                   zoom={zoom} />
      </svg>
    </div>
  );
}

We call our dataset filteredSalaries because we're going to add filtering in the subchapter about adding user controls. Using the proper name now means less code to change later. The magic of foresight 😄

We use _.groupBy to build a dictionary mapping each countyID to an array of salaries, and we use our countyValue method to build an array of counties for our map.
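
If _.groupBy is new to you, this plain-JavaScript approximation shows the shape of what it returns. The groupBy helper and the sample rows are hypothetical; in the app itself Lodash does this work.

```javascript
// Plain-JavaScript approximation of _.groupBy(rows, key), for illustration only.
function groupBy(rows, key) {
  return rows.reduce((groups, row) => {
    const k = row[key]
    if (!groups[k]) {
      groups[k] = []
    }
    groups[k].push(row)
    return groups
  }, {})
}

// Made-up salary rows.
const sampleSalaries = [
  { countyID: "A", base_salary: 90000 },
  { countyID: "B", base_salary: 110000 },
  { countyID: "A", base_salary: 130000 },
]

const grouped = groupBy(sampleSalaries, "countyID")
console.log(Object.keys(grouped)) // [ 'A', 'B' ]
console.log(grouped.A.length) // 2
```

Each key holds every salary row for that county, which is the input countyValue needs to compute a median.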

We set zoom to null for now. We'll use this later.

In the return statement, we remove our "data loaded" indicator, and add an <svg> element that's 1100 pixels wide and 500 pixels high. Inside, we place the CountyMap component with a bunch of properties. Some dataset stuff, some sizing and positioning stuff.

Step 1.1: Simplify App.js state

We simplify state into a single object that holds all datasets. That way we can save some re-renders and make our life easier too.

// src/App.js
const [datasets, setDatasets] = useState({
  techSalaries: [],
  medianIncomes: [],
  countyNames: [],
  usTopoJson: null,
  USstateNames: null,
})
const {
  techSalaries,
  medianIncomes,
  countyNames,
  usTopoJson,
  USstateNames,
} = datasets
async function loadData() {
  const datasets = await loadAllData()
  setDatasets(datasets)
}

Step 3: CountyMap.js

Here comes the fun part - declaratively drawing a map. You'll see why I love using React for dataviz.

We're using the full-feature integration approach and a lot of D3 maps magic. I'm always surprised how little code it takes to draw a map with D3.

Start with the imports: React, D3, lodash, topojson, County component.

// src/components/CountyMap.js
import React from "react"
import * as d3 from "d3"
import * as topojson from "topojson"
import _ from "lodash"

Out of these, we haven't built County yet, and you haven't seen topojson before.

TopoJSON is a geographical data format based on JSON. We're using the topojson library to translate our geographical datasets into GeoJSON, which is another way of defining geo data with JSON.

I don't know why there are two, but TopoJSON produces smaller files, and GeoJSON can be fed directly into D3's geo functions. ¯\_(ツ)_/¯

Maybe it's a case of competing standards.

Constructor

We stub out the CountyMap component then fill it in with logic.

// src/components/CountyMap/CountyMap.js
const CountyMap = ({
  usTopoJson,
  USstateNames,
  x,
  y,
  width,
  height,
  zoom,
  values,
}) => {
  if (!usTopoJson) {
    return null;
  } else {
    // Placeholder until we fill in the map markup
    return <g />;
  }
}
export default CountyMap;

We need three D3 objects to build a choropleth map: a geographical projection, a path generator, and a quantize scale for colors.

// src/components/CountyMap.js
const projection = d3.geoAlbersUsa().scale(1280)
const geoPath = d3.geoPath().projection(projection)
const quantize = d3.scaleQuantize().range(d3.range(9))

You might remember geographical projections from high school. They map a sphere to a flat surface. We use geoAlbersUsa because it's made specifically for maps of the USA.

D3 offers many other projections. You can see them on d3-geo's GitHub page.

A geoPath generator takes a projection and returns a function that generates the d attribute of <path> elements. This is the most general way to specify SVG shapes. I won't go into explaining the d here, but it's an entire language for describing shapes.

quantize is a D3 scale. We've talked about the basics of scales in the D3 Axis example. This one splits a domain into 9 equal-width segments and assigns each segment a value from the range.

Let's say our domain goes from 0 to 90. Calling the scale with any number from 0 up to (but not including) 10 returns 0, 10 up to 20 returns 1, and so on up to 8. We'll use it to pick colors from an array.
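
Here is a hand-rolled sketch of the same idea, in case the bucketing feels abstract. makeQuantize is a hypothetical helper written for this example, not D3's implementation; it only mimics how a quantize scale maps equal-width slices of the domain onto range values.

```javascript
// Equal-width domain slices mapped to range values (sketch, not d3.scaleQuantize itself).
function makeQuantize([min, max], range) {
  const step = (max - min) / range.length
  return (x) => {
    const i = Math.floor((x - min) / step)
    // Clamp so values outside the domain land in the first or last bucket.
    return range[Math.max(0, Math.min(range.length - 1, i))]
  }
}

const quantize = makeQuantize([0, 90], [0, 1, 2, 3, 4, 5, 6, 7, 8])
console.log(quantize(5)) // 0
console.log(quantize(10)) // 1
console.log(quantize(45)) // 4
console.log(quantize(1000)) // 8
```

The returned number is an index, which is what lets us look up a color in an array later on.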

the D3 magic sauce

Keeping our geo path and quantize scale up to date is simple, but we'll make it harder by adding a zoom feature. It won't work until we build the filtering, but hey, we'll already have it by then! 😄

// src/components/CountyMap.js
const projection = d3
  .geoAlbersUsa()
  .scale(1280)
  .translate([width / 2, height / 2])
  .scale(width * 1.3)
const geoPath = d3.geoPath().projection(projection)
const quantize = d3.scaleQuantize().range(d3.range(9))
if (zoom && usTopoJson) {
  const us = usTopoJson,
    USstatePaths = topojson.feature(us, us.objects.states).features,
    id = _.find(USstateNames, { code: zoom }).id
  projection.scale(width * 4.5)
  const centroid = geoPath.centroid(_.find(USstatePaths, { id: id })),
    translate = projection.translate()
  projection.translate([
    translate[0] - centroid[0] + width / 2,
    translate[1] - centroid[1] + height / 2,
  ])
}
if (values) {
  quantize.domain([
    d3.quantile(values, 0.15, (d) => d.value),
    d3.quantile(values, 0.85, (d) => d.value),
  ])
}

There's a lot going on here.

We create projection, geoPath, and quantize at the top of the component. These are the D3 objects we're about to update.

First up is the projection. We translate (move) it to the center of our drawing area and set the scale property. You have to play around with this value until you get a nice result because it's different for every projection.

Then we do some weird stuff if zoom is defined.

We get the list of all US state features in our geo data, find the one we're zooming in on, and use the geoPath.centroid method to calculate its center point. This gives us a new coordinate to translate our projection onto.

The calculation in .translate() helps us align the center point of our zoom US state with the center of the drawing area.
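
The arithmetic is easier to follow with numbers plugged in. This is a small sketch with hypothetical values; recenter is a helper invented for the example and only repeats the formula from the code above.

```javascript
// Shift the projection so a given pixel point ends up at the center of the drawing area.
function recenter(translate, centroid, width, height) {
  return [
    translate[0] - centroid[0] + width / 2,
    translate[1] - centroid[1] + height / 2,
  ]
}

// Hypothetical numbers: a 500x500 area, with the state's centroid currently drawn at (400, 300).
const current = [250, 250]
const centroid = [400, 300]
const next = recenter(current, centroid, 500, 500)
console.log(next) // [ 100, 200 ]

// Everything on the map moves by the same offset, so the centroid lands on the center.
const movedCentroid = [
  centroid[0] + (next[0] - current[0]),
  centroid[1] + (next[1] - current[1]),
]
console.log(movedCentroid) // [ 250, 250 ]
```

In other words, we measure how far the state's center is from the middle of the drawing area and slide the whole projection by that amount.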

While all of this is going on, we also tweak the .scale property to make the map bigger. This creates a zooming effect.

After all that, we update the quantize scale's domain with new values. Using d3.quantile lets us offset the scale to produce a more interesting map. Again, I discovered these values through experiment - they cut off the top and bottom of the range because there isn't much there. This brings higher contrast to the richer middle of the range.
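
To see why clipping at the 15th and 85th percentiles helps, here is a rough sketch with made-up deltas. The quantile helper is a plain-JavaScript stand-in that uses linear interpolation; treat it as an assumption about the general technique rather than a copy of d3.quantile.

```javascript
// Linear-interpolation quantile over a sorted copy (hypothetical stand-in for d3.quantile).
function quantile(numbers, p) {
  const sorted = [...numbers].sort((a, b) => a - b)
  const pos = (sorted.length - 1) * p
  const lo = Math.floor(pos)
  const hi = Math.ceil(pos)
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo)
}

// Made-up deltas: mostly mid-range, with one extreme value on each end.
const deltas = [
  -50000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 250000,
]

const fullDomain = [Math.min(...deltas), Math.max(...deltas)]
const clippedDomain = [quantile(deltas, 0.15), quantile(deltas, 0.85)]
console.log(fullDomain) // [ -50000, 250000 ]
console.log(clippedDomain) // roughly [ 15000, 85000 ]
```

With the full domain, two outliers stretch the scale and most counties would share a couple of colors. The clipped domain spends all 9 colors on the range where most of the data lives.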

render

After all that hard work, rendering is a breeze. We prep our data then loop through it and render a County element for each entry.

// src/components/CountyMap.js
if (!usTopoJson) {
  return null
} else {
  const us = usTopoJson,
    USstatesMesh = topojson.mesh(us, us.objects.states, (a, b) => a !== b),
    counties = topojson.feature(us, us.objects.counties).features
  const countyValueMap = _.fromPairs(values.map((d) => [d.countyID, d.value]))
  return (
    <g>
      {counties.map((feature) => (
        <County
          geoPath={geoPath}
          feature={feature}
          zoom={zoom}
          key={feature.id}
          quantize={quantize}
          value={countyValueMap[feature.id]}
        />
      ))}
      <path
        d={geoPath(USstatesMesh)}
        style={{
          fill: "none",
          stroke: "#fff",
          strokeLinejoin: "round",
        }}
      />
    </g>
  )
}

We use the TopoJSON library to grab data out of the usTopoJson dataset.

.mesh calculates a mesh for US states – a thin line around the edges. .feature calculates a feature for each county – the shape we fill in with color.

Mesh and feature aren't tied to US states or counties by the way. It's just a matter of what you get back: borders or flat areas. What you need depends on what you're building.

We use Lodash's _.fromPairs to build a dictionary that maps county identifiers to their values. Building it beforehand makes our code faster. You can read some details about the speed optimizations here.
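
A quick sketch of the lookup map, using the built-in Object.fromEntries in place of _.fromPairs so it runs without Lodash. The sample values are made up.

```javascript
// Made-up county values, shaped like the output of countyValue.
const values = [
  { countyID: "01001", value: 12000 },
  { countyID: "06075", value: 40000 },
]

// Build the dictionary once, before rendering.
const countyValueMap = Object.fromEntries(
  values.map((d) => [d.countyID, d.value])
)

// Each lookup is now a property access instead of a search through the array.
console.log(countyValueMap["06075"]) // 40000
console.log(countyValueMap["99999"]) // undefined
```

Counties without an entry come back as undefined, which the County component can treat as "no data".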

As promised, the return statement loops through the list of counties and renders County components. Each gets a bunch of attributes and returns a <path> element that looks like a specific county.

For US state borders, we render a single <path> element and use geoPath to generate the d attribute.

Step 4: County component

The County component is built from two parts: imports and color constants, and a component that returns a <path>. All the hard calculation happens in CountyMap.

// src/components/CountyMap.js
const ChoroplethColors = _.reverse([
  "rgb(247,251,255)",
  "rgb(222,235,247)",
  "rgb(198,219,239)",
  "rgb(158,202,225)",
  "rgb(107,174,214)",
  "rgb(66,146,198)",
  "rgb(33,113,181)",
  "rgb(8,81,156)",
  "rgb(8,48,107)",
])
const BlankColor = "rgb(240,240,240)"

We import React and Lodash, and define some color constants. I got the ChoroplethColors from some example online, and BlankColor is a pleasant gray.

Now we need the County component.

// src/components/CountyMap.js
// key is consumed by React itself and is not passed down as a prop, so we don't destructure it
const County = ({ geoPath, feature, zoom, quantize, value }) => {
  let color = BlankColor
  if (value) {
    color = ChoroplethColors[quantize(value)]
  }
  return (
    <path d={geoPath(feature)} style={{ fill: color }} title={feature.id} />
  )
}

The component uses the quantize scale to pick the right color and returns a <path> element. geoPath generates the d attribute, we set style to fill the color, and we give our path a title.
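
The color lookup on its own can be sketched without React. The palette names and fakeQuantize below are placeholders invented for this example. One deliberate difference from the component: this sketch checks explicitly for missing values, so a delta of exactly 0 still gets a color, whereas a plain truthiness check would leave it blank.

```javascript
// Placeholder palette and blank color, standing in for ChoroplethColors and BlankColor.
const colors = ["color-0", "color-1", "color-2"]
const blank = "blank"

function pickColor(value, quantize) {
  // Only undefined or null counts as "no data".
  if (value === undefined || value === null) {
    return blank
  }
  return colors[quantize(value)]
}

// Hypothetical 3-bucket scale in place of the real quantize scale.
const fakeQuantize = (v) => (v < 0 ? 0 : v < 50000 ? 1 : 2)

console.log(pickColor(undefined, fakeQuantize)) // blank
console.log(pickColor(-10000, fakeQuantize)) // color-0
console.log(pickColor(0, fakeQuantize)) // color-1
console.log(pickColor(80000, fakeQuantize)) // color-2
```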

Your browser should now show a map.

Choropleth map with shortened dataset

Tech work visas just aren't that evenly distributed. Even with the full dataset most counties are gray.

If that didn't work, consult this diff on GitHub.

Step 5: optimize D3 code with custom hooks

Our CountyMap component got quite messy with all that D3 logic in there. We can clean it up with custom hooks.

The goal is to extract logic into self-contained custom hooks that return the end result we're looking for. Logic in the function, do the math, return the final D3 object.

Remember: any function that uses hooks is a hook. We prefix them with use out of convention.

While we're at it we can also wrap the code in useMemo so our code runs faster. useMemo ensures we recreate the objects only when something relevant changes.

// src/components/CountyMap.js
// useMemo comes from React: import React, { useMemo } from "react"
function useQuantize(values) {
  return useMemo(() => {
    const scale = d3.scaleQuantize().range(d3.range(9))
    if (values) {
      scale.domain([
        d3.quantile(values, 0.15, (d) => d.value),
        d3.quantile(values, 0.85, (d) => d.value),
      ])
    }
    return scale
  }, [values])
}
function useProjection({ width, height, zoom, usTopoJson, USstateNames }) {
  return useMemo(() => {
    const projection = d3
      .geoAlbersUsa()
      .scale(1280)
      .translate([width / 2, height / 2])
      .scale(width * 1.3)
    const geoPath = d3.geoPath().projection(projection)
    if (zoom && usTopoJson) {
      const us = usTopoJson,
        USstatePaths = topojson.feature(us, us.objects.states).features,
        id = _.find(USstateNames, { code: zoom }).id
      projection.scale(width * 4.5)
      const centroid = geoPath.centroid(_.find(USstatePaths, { id: id })),
        translate = projection.translate()
      projection.translate([
        translate[0] - centroid[0] + width / 2,
        translate[1] - centroid[1] + height / 2,
      ])
    }
    return geoPath
  }, [width, height, zoom, usTopoJson, USstateNames])
}
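
If the dependency array feels like magic, this tiny sketch shows the idea behind it outside of React. createMemo is a hypothetical helper written for this example, not React's implementation: it reruns the factory only when one of the dependencies changes.

```javascript
// Minimal dependency-based memo, illustrating the idea behind useMemo.
function createMemo() {
  let lastDeps = null
  let lastValue
  return (factory, deps) => {
    const changed =
      lastDeps === null ||
      deps.length !== lastDeps.length ||
      deps.some((d, i) => !Object.is(d, lastDeps[i]))
    if (changed) {
      lastValue = factory()
      lastDeps = deps
    }
    return lastValue
  }
}

let runs = 0
const memo = createMemo()
const buildScale = (width) =>
  memo(() => {
    runs += 1 // count how often the expensive work actually happens
    return { scale: width * 1.3 }
  }, [width])

const a = buildScale(500)
const b = buildScale(500) // same width: reuses the previous object
const c = buildScale(600) // new width: rebuilds
console.log(a === b) // true
console.log(a === c) // false
console.log(runs) // 2
```

That is also why the dependency list matters: leave out a value the factory reads, and the hook keeps handing back a stale D3 object.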

You can then use them as any other hook:

// src/components/CountyMap.js
const CountyMap = ({
  usTopoJson,
  USstateNames,
  x,
  y,
  width,
  height,
  zoom,
  values,
}) => {
  const geoPath = useProjection({
    width,
    height,
    zoom,
    usTopoJson,
    USstateNames,
  });
  const quantize = useQuantize(values);
  // ...the rest of the component stays the same
}

If that doesn't work, check out this diff on GitHub.

Created by Swizec with ❤️