Christopher Sardegna's Blog

Thoughts on technology, design, data analysis, and data visualization.


Lessons in D3 Labeling

The Problem

When building charts with the JavaScript D3 library for SixteenZero, I often ran into a problem where labels for data points overlapped each other, obscuring the data. In D3, this can be solved with d3.forceSimulation().

Original Data Structure

We build our charts by mapping data to an object like so:

const formattedData = Object.keys(data).map(n => ({ name: n, x: data[n], y: data[n] }));

In this instance, the data is passed to a visualization as the data property and then scaled along the values we assign to x and y. However, for a force simulation to work, we need to add some additional properties. These will be used to position both the dot on a 2-axis chart and the label attached to that dot.

Modified Data Structure

We need to add three properties to our data. First, we add an fx property, or “fixed x” [1], which the force simulation respects when determining where to move things. With fx set, the simulation holds each node's x at that value on every tick, so the labels do not stray horizontally from their intended positions.

Next, we rename the y property to realY. The force simulation overwrites y on every node, and we still need the original value to position the dot. Only the label's y should change; the dot is the actual data point and must stay where the data puts it.

Finally, we want to add a property called targetY, which represents the scaled Y value of where we want the label to be. The force simulation will use this value to place the label as close as it can to the targetY value. Since this is the scaled value and not the actual value, we convert it using the normal scale function (i.e. targetY: yScale(data[n])):

const yScale = d3.scaleLinear()
                 .domain([yMinValue, yMaxValue])
                 .rangeRound([0, height]);

The result is that our data should now look like this:

const formattedData = Object.keys(data).map(n => ({ name: n, fx: data[n], x: data[n], realY: data[n], targetY: yScale(data[n]) }));
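
To make the resulting shape concrete, here is a minimal sketch that runs without D3. The sample data object, the height value, and makeLinearScale are assumptions for illustration only; makeLinearScale is a hand-written stand-in for the d3.scaleLinear().domain(...).rangeRound(...) call above, not part of D3.

```javascript
// Hypothetical sample input: label name -> numeric value.
const data = { alpha: 10, beta: 12, gamma: 40 };
const height = 200;

// Stand-in for d3.scaleLinear().domain([min, max]).rangeRound([0, size]).
const makeLinearScale = (min, max, size) =>
    value => Math.round(((value - min) / (max - min)) * size);

const values = Object.values(data);
const yScale = makeLinearScale(Math.min(...values), Math.max(...values), height);

// Same mapping as in the article: note there is no plain y property yet;
// the force simulation will create it.
const formattedData = Object.keys(data).map(n => ({
    name: n,
    fx: data[n],
    x: data[n],
    realY: data[n],
    targetY: yScale(data[n]),
}));

console.log(formattedData.map(d => `${d.name}: realY=${d.realY}, targetY=${d.targetY}`));
```

In this sample, alpha and beta land only 13 pixels apart in targetY, which is exactly the kind of near-collision the simulation is meant to resolve.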

Once we have the data ready, we can write our force simulation code.

Force Simulation Code

This section describes the steps needed to arrange labels without disturbing their associated points.

Clamping

We do not want the force simulation to place labels outside of the boundaries of the chart, so we use the following function to prevent that from occurring:

const forceClamp = (min, max) =>
{
    let nodes;
    const force = () =>
    {
        nodes.forEach((n) =>
        {
            if (n.y > max) { n.y = max; }
            if (n.y < min) { n.y = min; }
        });
    };
    force.initialize = f => nodes = f;
    return force;
};
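
The clamp can be exercised on its own to see what it does on each tick. This is a small sketch with hypothetical node values; the manual initialize() and clamp() calls mimic what the simulation does when a force is registered and then applied.

```javascript
// Same forceClamp as above, repeated so this snippet runs on its own.
const forceClamp = (min, max) =>
{
    let nodes;
    const force = () =>
    {
        nodes.forEach((n) =>
        {
            if (n.y > max) { n.y = max; }
            if (n.y < min) { n.y = min; }
        });
    };
    force.initialize = (f) => { nodes = f; };
    return force;
};

// Hypothetical nodes: two outside the [0, 200] range, one inside.
const sampleNodes = [{ y: -15 }, { y: 120 }, { y: 260 }];

const clamp = forceClamp(0, 200);
clamp.initialize(sampleNodes); // the simulation calls this when the force is added
clamp();                       // ...and calls the force itself once per tick

console.log(sampleNodes.map(n => n.y)); // [0, 120, 200]
```

Because the clamp mutates n.y directly rather than adjusting velocity, it acts as a hard boundary: whatever the other forces did during the tick, no label ends up outside the chart.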

Force Simulation Function

This is the actual force simulation function. In this case, formattedData is the modified data structure described above, labelFontHeight is the height of the label font (half of it serves as the collision radius, so two labels are pushed at least one font height apart), and d.targetY accesses the scaled targetY property.

const force = d3.forceSimulation()
                .nodes(formattedData)
                .force('collide', d3.forceCollide(labelFontHeight / 2))
                .force('y', d3.forceY(d => d.targetY).strength(3))
                .force('clamp', forceClamp(0, height))
                .stop();

for (let i = 0; i < 300; i += 1) { force.tick(); }

This simulation runs for 300 ticks, which is generally enough for proper label placement; it is also the number of iterations D3's default settings produce. More ticks generally mean more accurate placement, at the cost of extra computation.
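
One way to judge whether the tick count was enough is to check whether any labels still sit closer together than one label height. The sketch below assumes the labels share a column, so it only compares y positions; countOverlaps and the sample node values are hypothetical and not part of D3.

```javascript
// Count neighbouring labels (sorted by y) that are closer than minGap.
const countOverlaps = (nodes, minGap) =>
{
    const ys = nodes.map(n => n.y).sort((a, b) => a - b);
    let overlaps = 0;
    for (let i = 1; i < ys.length; i += 1)
    {
        if (ys[i] - ys[i - 1] < minGap) { overlaps += 1; }
    }
    return overlaps;
};

const labelFontHeight = 12;

// Hypothetical positions before and after running the simulation.
const before = [{ y: 10 }, { y: 12 }, { y: 40 }];
const after = [{ y: 4 }, { y: 18 }, { y: 40 }];

console.log(countOverlaps(before, labelFontHeight)); // 1
console.log(countOverlaps(after, labelFontHeight));  // 0
```

If the count is still above zero after the loop, raising the tick count or revisiting the force strengths is a reasonable next step.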

Accessing the new Positions

This code generates a new y property that we need to access to place the labels in the new force-directed positions.

Original Code

The original code for the visualization accessed the y properties of the data structure and scaled them using yScale():

const dotDivs = g.selectAll('.dot')
                 .data(formattedData)
                 .enter().append('g');
dotDivs.append('circle')
       .attr('class', 'scatter-dot')
       .attr('r', 2)
       .attr('cx', d => xScale(d.x))
       .attr('cy', d => height - yScale(d.y))
       .style('visibility', 'hidden')
       .attr('id', (d, i) => `scatter-dot-${i}`);

const imgWidth = 30;
dotDivs.append('text')
       .attr('class', 'scatter-label')
       .attr('x', d => xScale(d.x) + imgWidth / 3)
       .attr('y', d => (height - yScale(d.y)) + imgWidth / 3)
       .attr('alignment-baseline', 'hanging')
       .attr('id', (d, i) => `scatter-text-${i}`)
       .text(d => d.name);

Since we are only trying to move the label [2] and not the dot [3], we only need to change two lines:

Circle Position

.attr('cy', d => height - yScale(d.y))

Would become:

.attr('cy', d => height - yScale(d.realY))

because we want to access the original y value, i.e., the actual position the data should be in.

Label Position

.attr('y', d => (height - yScale(d.y)) + imgWidth / 3)

Would become:

.attr('y', d => (height - d.y) + imgWidth / 3)

We do not need to call yScale() here because the simulation already works in scaled coordinates: we set targetY to yScale(data[n]) in the modified data structure, so the d.y it produces is already a pixel value.
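
The difference between the two accessors can be shown side by side. In this sketch the node values, the height, and the simple yScale stand-in (domain [0, 100]) are assumptions for illustration; the y value of 62 is a made-up example of where the simulation might leave a label whose target was 50.

```javascript
const height = 200;
const imgWidth = 30;

// Stand-in for the d3 yScale: maps [0, 100] onto [0, height].
const yScale = v => Math.round((v / 100) * height);

// Hypothetical node after the simulation has run: realY is raw data,
// targetY and y are already in pixels.
const node = { name: 'beta', realY: 25, targetY: yScale(25), y: 62 };

// Dot: raw data value, so it goes through the scale.
const dotCy = d => height - yScale(d.realY);

// Label: simulation output, already scaled, so no yScale() call.
const labelY = d => (height - d.y) + imgWidth / 3;

console.log(dotCy(node));  // 150
console.log(labelY(node)); // 148
```

The dot stays at the position dictated by the data, while the label follows whatever y the simulation settled on.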

Conclusions

There are probably other ways to accomplish this task, but this approach works both for bar graphs, where the labels all sit in a column along a single axis, and for scatter plots, where the data points can be at any (x, y) pair. It is fast and easy to implement, and tuning generally involves only the strength() value passed to d3.forceY(). Note that the D3 documentation recommends keeping that strength within [0, 1], so the value of 3 used above is a pragmatic choice rather than a recommended one.


  1. fx is a node property that d3-force treats specially: when it is set, the simulation holds the node's x coordinate at that value. fy does the same for the y axis.

  2. dotDivs.append('text')

  3. dotDivs.append('circle')