Tuesday, 8 July 2014

d3.js multi-line graph with automatic (interactive) legend

The following post is a portion of the D3 Tips and Tricks book, which is free to download. To use this post in context, consider it with the others in the blog or just download the book as a pdf / epub or mobi.
----------------------------------------------------------

Multi-line graph with automatic legend and toggling show / hide lines.

Purpose

Creating a multi-line graph is a pretty handy thing to be able to do and we worked through an example earlier in the book as an extension of our simple graph. In that example we used a csv file that had the data arranged with each line's values in a separate column.
date,close,open
1-May-12,68.13,34.12
30-Apr-12,63.98,45.56
27-Apr-12,67.00,67.89
26-Apr-12,89.70,78.54
25-Apr-12,99.00,89.23
24-Apr-12,130.28,99.23
23-Apr-12,166.70,101.34
This is a common way to have data stored, but if you are retrieving information from a database, you may not have the luxury of having it laid out in columns. It may be presented in a more linear fashion where each line's values are stored on a unique row with the identifier for the line on the same row. For instance, the data above could just as easily be presented as follows;
price,date,value
close,1-May-12,68.13
close,30-Apr-12,63.98
close,27-Apr-12,67.00
close,26-Apr-12,89.70
close,25-Apr-12,99.00
close,24-Apr-12,130.28
close,23-Apr-12,166.70
open,1-May-12,34.12
open,30-Apr-12,45.56
open,27-Apr-12,67.89
open,26-Apr-12,78.54
open,25-Apr-12,89.23
open,24-Apr-12,99.23
open,23-Apr-12,101.34
In this case, we would need to ‘pivot’ the data to produce the same multi-column representation as the original format. This is not always easy, but it can be achieved using the d3 nest function which we will examine.
As well as this we will want to automatically encode the lines to make them different colours and to add a legend with the line name and the colour of the appropriate line.
Finally, because we will build a graph script that can cope with any number of lines (within reason), we will need to be able to show / hide the individual lines to try and clarify the graph if it gets too cluttered.
All of these features have been covered individually in the book, so what we’re going to do is combine them in a way that presents us with an elegant multi-line graph that looks a bit like this;
Multi-line graph with legend

The Code

The following is the code for the initial example which is a slight derivative of the original simple graph. A live version is available online at bl.ocks.org or GitHub. It is also available as the files ‘super-multi-lines.html’ and ‘stocks.csv’ as a download with the book D3 Tips and Tricks (in a zip file) when you download the book from Leanpub.
<!DOCTYPE html>
<meta charset="utf-8">
<style> /* set the CSS */

body { font: 12px Arial;}

path { 
    stroke: steelblue;
    stroke-width: 2;
    fill: none;
}

.axis path,
.axis line {
    fill: none;
    stroke: grey;
    stroke-width: 1;
    shape-rendering: crispEdges;
}

</style>
<body>

<!-- load the d3.js library -->    
<script src="http://d3js.org/d3.v3.min.js"></script>

<script>

// Set the dimensions of the canvas / graph
var margin = {top: 30, right: 20, bottom: 30, left: 50},
    width = 600 - margin.left - margin.right,
    height = 270 - margin.top - margin.bottom;

// Parse the date / time
var parseDate = d3.time.format("%b %Y").parse; 

// Set the ranges
var x = d3.time.scale().range([0, width]);
var y = d3.scale.linear().range([height, 0]);

// Define the axes
var xAxis = d3.svg.axis().scale(x)
    .orient("bottom").ticks(5);

var yAxis = d3.svg.axis().scale(y)
    .orient("left").ticks(5);

// Define the line
var priceline = d3.svg.line()
    .x(function(d) { return x(d.date); })
    .y(function(d) { return y(d.price); });
    
// Adds the svg canvas
var svg = d3.select("body")
    .append("svg")
        .attr("width", width + margin.left + margin.right)
        .attr("height", height + margin.top + margin.bottom)
    .append("g")
        .attr("transform", 
              "translate(" + margin.left + "," + margin.top + ")");

// Get the data
d3.csv("stocks.csv", function(error, data) {
    data.forEach(function(d) {
        d.date = parseDate(d.date);
        d.price = +d.price;
    });

    // Scale the range of the data
    x.domain(d3.extent(data, function(d) { return d.date; }));
    y.domain([0, d3.max(data, function(d) { return d.price; })]); 

    // Nest the entries by symbol
    var dataNest = d3.nest()
        .key(function(d) {return d.symbol;})
        .entries(data);

    // Loop through each symbol / key
    dataNest.forEach(function(d) {

        svg.append("path")
            .attr("class", "line")
            .attr("d", priceline(d.values)); 

    });

    // Add the X Axis
    svg.append("g")
        .attr("class", "x axis")
        .attr("transform", "translate(0," + height + ")")
        .call(xAxis);

    // Add the Y Axis
    svg.append("g")
        .attr("class", "y axis")
        .call(yAxis);

});

</script>
</body>

Description

NESTING THE DATA
The example code above differs from the simple graph in two main ways.
Firstly, the script loads the file stocks.csv which was used by Mike Bostock in his small multiples example. This means that the variable names used are different (price for the value of the stocks, symbol for the name of the stock and good old date for the date) and we have to adjust the parseDate function to parse a modified date value.
Secondly, we add the code blocks that take the stocks.csv information that we load as data, apply the d3.nest function to it and draw each line.
The following code nests the data
    var dataNest = d3.nest()
        .key(function(d) {return d.symbol;})
        .entries(data);
We declare our new array’s name as dataNest and we initiate the nest function;
 var dataNest = d3.nest()
We assign the key for our new array as symbol. A ‘key’ is like a way of saying “This is the thing we will be grouping on”. In other words our resultant array will have a single entry for each unique symbol or stock which will itself be an array of dates and values.
  .key(function(d) {return d.symbol;})
Then we tell the nest function which data array we will be using for our source of data.
  .entries(data);
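To get a feel for what the nested array looks like, the following is a minimal plain-JavaScript sketch of the same grouping. It is not d3's implementation; groupByKey is a hypothetical helper written only for illustration, and the AMZN rows are illustrative sample values. It assumes the groups come out in the order the keys are first seen in the data.

```javascript
// Plain-JavaScript sketch of the shape produced by
// d3.nest().key(...).entries(data): one {key, values} object per group.
// groupByKey is a hypothetical helper, not part of d3.
function groupByKey(rows, keyFn) {
    var index = {};   // key -> position of that key's group in result
    var result = [];  // [{key: ..., values: [...]}, ...] in first-seen order
    rows.forEach(function(row) {
        var key = String(keyFn(row));
        if (!Object.prototype.hasOwnProperty.call(index, key)) {
            index[key] = result.length;
            result.push({key: key, values: []});
        }
        result[index[key]].values.push(row);
    });
    return result;
}

// A few rows laid out like stocks.csv (AMZN values are illustrative)
var sampleRows = [
    {symbol: "MSFT", date: "Jan 2000", price: 39.81},
    {symbol: "MSFT", date: "Feb 2000", price: 36.35},
    {symbol: "AMZN", date: "Jan 2000", price: 64.56},
    {symbol: "AMZN", date: "Feb 2000", price: 68.87}
];

var sampleNest = groupByKey(sampleRows, function(d) { return d.symbol; });
// sampleNest[0] is {key: "MSFT", values: [two MSFT rows]}
// sampleNest[1] is {key: "AMZN", values: [two AMZN rows]}
```

Each entry's values array is what gets handed to the line generator, which is why the rows inside it keep their date and price properties.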
Then we use the nested data to loop through our stocks and draw the lines
    dataNest.forEach(function(d) {

        svg.append("path")
            .attr("class", "line")
            .attr("d", priceline(d.values)); 

    });  
Applying the forEach function to dataNest means that it will take each of the keys that we have just declared with d3.nest (each stock) and append a line using the values for that stock.
The end result looks like the following;
A very plain multi-line graph
You would be justified in thinking that this is more than a little confusing. Clearly while we have been successful in making each stock draw a corresponding line, unless we can tell them apart, the graph is pretty useless.
APPLYING THE COLOURS
Making sure that the colours that are applied to our lines (and ultimately our legend text) are unique from line to line is actually pretty easy.
The code that we will implement for this change is available online at bl.ocks.org or GitHub. It is also available as the files ‘super-multi-colours.html’ and ‘stocks.csv’ as a download with the book D3 Tips and Tricks (in a zip file) when you download the book from Leanpub.
The changes that we will make to our code are captured in the following code snippet.
    var color = d3.scale.category10();

    // Loop through each symbol / key
    dataNest.forEach(function(d) {

        svg.append("path")
            .attr("class", "line")
            .style("stroke", function() {
                return d.color = color(d.key); })
            .attr("d", priceline(d.values));

    });
Firstly we need to declare an ordinal scale for our colours with var color = d3.scale.category10();. This is a set of ten categorical colours that are nicely distinct from each other and pleasant on the eye.
We then use the colour scale to assign a unique stroke (line colour) for each unique key (symbol) in our dataset.
    .style("stroke", function() {
        return d.color = color(d.key); })
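To see why the same key always comes back with the same colour, here is a rough plain-JavaScript sketch of how an ordinal scale hands out colours. It is an approximation for illustration only, not d3's source; makeOrdinalScale is a hypothetical name. The palette is the ten colours of category10.

```javascript
// The ten colours of d3's category10 palette
var category10 = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
                  "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf"];

// makeOrdinalScale is a hypothetical stand-in for d3.scale.category10()
function makeOrdinalScale(palette) {
    var seen = [];  // keys in the order they were first requested
    return function(key) {
        var i = seen.indexOf(key);
        if (i === -1) {          // a new key takes the next free slot
            seen.push(key);
            i = seen.length - 1;
        }
        return palette[i % palette.length];  // wraps after the last colour
    };
}

var sketchColor = makeOrdinalScale(category10);
var msftColour = sketchColor("MSFT");  // first key, first colour
var amznColour = sketchColor("AMZN");  // second key, second colour
var msftAgain = sketchColor("MSFT");   // same key, same colour as before
```

The wrap-around is worth bearing in mind: with more than ten lines, colours would start to repeat.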
It seems easy when it’s implemented, but in all reality, it is the product of some very clever thinking behind the scenes when designing d3.js and even picking the colours that are used. The end result is a far more usable graph of the stock prices.
Multi-line graph with unique colours
Of course now we’re faced with the problem of not knowing which line represents which stock price. Time for a legend.
ADDING THE LEGEND
If we think about the process of adding a legend to our graph, what we’re trying to achieve is to take every unique data series we have (stock) and add a relevant label showing which colour relates to which stock. At the same time, we need to arrange the labels in such a way that they are presented in a manner that is not offensive to the eye. In the example I will go through I have chosen to arrange them neatly spaced along the bottom of the graph so that the final result looks like the following;
Multi-line graph with legend
Bear in mind that the end result will align the legend completely automatically. If there are three stocks they will be equally spaced, if there are six stocks they will be equally spaced. The following is a reasonable mechanism to facilitate this, but if the labels for the data values are of radically different lengths, the final result will look ‘odd’. Likewise, if there are a LOT of data values, the legend will start to get crowded.
The code that we will implement for this change is available online at bl.ocks.org or GitHub. It is also available as the files ‘super-multi-legend.html’ and ‘stocks.csv’ as a download with the book D3 Tips and Tricks (in a zip file) when you download the book from Leanpub.
There are three broad categories of changes that we will want to make to our current code;
  1. Declare a style for the legend font
  2. Change the area and margins for the graph to accommodate the additional text
  3. Add the text
Declaring the style for the legend text is as easy as making an appropriate entry in the <style> section of the code. For this example I have chosen the following;
.legend {
    font-size: 16px;
    font-weight: bold;
    text-anchor: middle;
} 
To change the area and margins of the graph we can make the following small changes to the code.
var margin = {top: 30, right: 20, bottom: 70, left: 50}, 
    width = 600 - margin.left - margin.right,
    height = 300 - margin.top - margin.bottom;  
The bottom margin is now 70 pixels high and the overall height of the area that the graph (including the margins) covers is increased to 300 pixels.
Adding the legend text is slightly more work, but only slightly more. The following code incorporates the changes and I have placed commented-out asterisks at the end of the lines that have been added
    legendSpace = width/dataNest.length; // spacing for legend // ******

    // Loop through each symbol / key
    dataNest.forEach(function(d,i) {                           // ******

        svg.append("path")
            .attr("class", "line")
            .style("stroke", function() { // Add the colours dynamically
                return d.color = color(d.key); })
            .attr("d", priceline(d.values));

        // Add the Legend
        svg.append("text")                                    // *******
            .attr("x", (legendSpace/2)+i*legendSpace) // spacing // ****
            .attr("y", height + (margin.bottom/2)+ 5)         // *******
            .attr("class", "legend")    // style the legend   // *******
            .style("fill", function() { // dynamic colours    // *******
                return d.color = color(d.key); })             // *******
            .text(d.key);                                     // *******

    });
The first added line finds the spacing between each legend label by dividing the width of the graph area by the number of symbols (keys or stocks).
    legendSpace = width/dataNest.length;
Then there is a small and subtle change that might otherwise go unnoticed, but is nonetheless significant. We add an i to the forEach function;
    dataNest.forEach(function(d,i) {
This might not seem like much of a big deal, but declaring i allows us to access the index of the returned data. This means that each unique key (stock or symbol) has a unique number. In our example those numbers would be from 0 to 3 (MSFT = 0, AMZN = 1, IBM = 2 and AAPL = 3), which is the order in which the stocks appear in our csv file.
Now we get to adding our text. Again, this is a fairly simple exercise which is following the route that we have taken several times already in the book but using some of our prepared values.
        svg.append("text")
            .attr("x", (legendSpace/2)+i*legendSpace)
            .attr("y", height + (margin.bottom/2)+ 5)
            .attr("class", "legend")
            .style("fill", function() {
                return d.color = color(d.key); })
            .text(d.key); 
The horizontal spacing for the labels is achieved by setting each label to the position set by the index associated with the label and the space available on the graph. To make it work out nicely we add half a legendSpace at the start (legendSpace/2) and then add the product of the index (i) and legendSpace (i*legendSpace).
We position the legend vertically so that it is in the middle of the bottom margin (height + (margin.bottom/2)+ 5).
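The arithmetic for the label positions can be checked on its own without drawing anything. The following sketch uses the margins and sizes from this example and the four stock symbols; legendPositions is a hypothetical helper introduced here purely to show the numbers.

```javascript
var margin = {top: 30, right: 20, bottom: 70, left: 50},
    width = 600 - margin.left - margin.right,    // 530
    height = 300 - margin.top - margin.bottom;   // 200

// legendPositions is a hypothetical helper, not part of the example code
function legendPositions(keys, width, height, marginBottom) {
    var legendSpace = width / keys.length;  // horizontal room per label
    return keys.map(function(key, i) {
        return {
            key: key,
            x: (legendSpace / 2) + i * legendSpace,  // centre of slot i
            y: height + (marginBottom / 2) + 5       // middle of bottom margin
        };
    });
}

var positions = legendPositions(["MSFT", "AMZN", "IBM", "AAPL"],
                                width, height, margin.bottom);
// x values: 66.25, 198.75, 331.25, 463.75; y is 240 for every label
```

Because the labels use text-anchor: middle, each x value is the centre of the label rather than its left edge.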
And we apply the same colour function to the text as we did to the lines earlier;
            .style("fill", function() {
                return d.color = color(d.key); })
The final result is a neat and tidy legend at the bottom of the graph;
Multi-line graph with legend
If you’re looking for an exercise to test your skills you could adapt the code to show the legend to the right of the graph. And if you wanted to go one better, you could arrange the order of the legend to reflect the final numeric value on the right of the graph (i.e. in this case AAPL would be on the top and MSFT on the bottom).
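As a possible starting point for the second exercise, the ordering could be worked out from the nested data before any labels are drawn. This is only a sketch: nestedSample and its prices are made up for illustration, lastPrice is a hypothetical helper, and it assumes each group's values are already in date order.

```javascript
// Hypothetical nested data; the prices are made up for illustration
var nestedSample = [
    {key: "MSFT", values: [{price: 30}, {price: 28}]},
    {key: "AMZN", values: [{price: 60}, {price: 128}]},
    {key: "AAPL", values: [{price: 25}, {price: 223}]}
];

// Final value of a series (assumes values are in date order)
function lastPrice(d) {
    return d.values[d.values.length - 1].price;
}

// Copy the array, then sort from highest final value to lowest
var legendOrder = nestedSample.slice().sort(function(a, b) {
    return lastPrice(b) - lastPrice(a);
}).map(function(d) { return d.key; });
// legendOrder is ["AAPL", "AMZN", "MSFT"]
```

Sorting a copy (via slice) leaves the original array, and therefore the colour assignment order, untouched.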
MAKING IT INTERACTIVE
The last step we’ll take in this example is to provide ourselves with a bit of control over how the graph looks. Even with the multiple colours, the graph could still be said to be ‘busy’. To clean it up or at least to provide the ability to more clearly display the data that a user wants to see we will add code that will allow us to click on a legend label and this will toggle the corresponding graph line on or off.
This is a progression from the example of how to show / hide an element by clicking on another element that was introduced in the ‘Assorted tips and tricks’ chapter.
The only changes to our code that need to be implemented are in the forEach section below. I have left some comments with asterisks in the code below to illustrate lines that are added.
    dataNest.forEach(function(d,i) { 

        svg.append("path")
            .attr("class", "line")
            .style("stroke", function() {
                return d.color = color(d.key); })
            .attr("id", 'tag'+d.key.replace(/\s+/g, '')) // assign ID **
            .attr("d", priceline(d.values));

        // Add the Legend
        svg.append("text")
            .attr("x", (legendSpace/2)+i*legendSpace)
            .attr("y", height + (margin.bottom/2)+ 5)
            .attr("class", "legend")
            .style("fill", function() {
                return d.color = color(d.key); })
            .on("click", function(){                     // ************
                // Determine if current line is visible 
                var active   = d.active ? false : true,  // ************ 
                newOpacity = active ? 0 : 1;             // ************
                // Hide or show the elements based on the ID
                d3.select("#tag"+d.key.replace(/\s+/g, '')) // *********
                    .transition().duration(100)          // ************
                    .style("opacity", newOpacity);       // ************
                // Update whether or not the elements are active
                d.active = active;                       // ************
                })                                       // ************
            .text(d.key); 

    });
The full code for the complete working example is available online at bl.ocks.org or GitHub. It is also available as the files ‘super-multi.html’ and ‘stocks.csv’ as a download with the book D3 Tips and Tricks (in a zip file) when you download the book from Leanpub.
The first piece of code that we need to add assigns an id to each line (the path that the legend label will control).
        .attr("id", 'tag'+d.key.replace(/\s+/g, ''))
Being able to use our key value as the id means that each line will have a unique identifier. “What’s with adding the 'tag' piece of text to the id?” I hear you ask. Good question. If our key starts with a number we could strike trouble (in fact I’m sure there are plenty of other ways we could strike trouble too, but this was one I came across). As well as that we include a little regular expression goodness to strip any spaces out of the key with .replace(/\s+/g, '').
The .replace calls the regular expression action on our key. \s is the regex for “whitespace”, and g is the “global” flag, meaning match ALL \s (whitespaces). The + allows for any contiguous string of space characters to be replaced with the empty string (''). This was a late addition to the example and kudos go to the participants in the Stack Overflow question here.
Then we use the .on("click", function(){ call to carry out some actions when the label is clicked on. We toggle the .active descriptor for our element with var active = d.active ? false : true,. Then we set the value of newOpacity to 0 if active is true or 1 if it is false.
From here we can select our line using its unique id and adjust its opacity to either 0 (transparent) or 1 (opaque);
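The effect of the 'tag' prefix and the regular expression is easy to try on a few keys. In this small sketch lineId is a hypothetical wrapper around the same expression, and the keys other than IBM are invented to show the whitespace handling.

```javascript
// lineId is a hypothetical wrapper around the expression used for the id
function lineId(key) {
    return 'tag' + key.replace(/\s+/g, '');
}

// "New York" and "3 M  Co" are invented keys for illustration
var ids = ["IBM", "New York", "3 M  Co"].map(lineId);
// ids is ["tagIBM", "tagNewYork", "tag3MCo"]
```

Note that this only deals with whitespace and leading digits. Keys containing other characters that mean something in a selector (a full stop or a hash, for instance) would probably need further cleaning.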
        d3.select("#tag"+d.key.replace(/\s+/g, ''))
            .transition().duration(100)
            .style("opacity", newOpacity);
Just because we can, we also add in a transition statement so that the change in transparency doesn't occur in a flash (it takes 100 milliseconds in fact (.duration(100))).
Lastly we update our d.active variable to whatever the active state is so that it can toggle correctly the next time it is clicked on.
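The toggling logic can be followed without any drawing at all. In the sketch below, toggle is a hypothetical function that repeats the three statements from the click handler and returns the opacity instead of applying it. The active property is never in the csv data; it is simply undefined (and so treated as false) until the first click creates it.

```javascript
// toggle is a hypothetical function mirroring the click handler's logic
function toggle(d) {
    var active = d.active ? false : true,  // flip the stored state
        newOpacity = active ? 0 : 1;       // active means the line is hidden
    d.active = active;                     // remember for the next click
    return newOpacity;
}

var series = {key: "MSFT"};        // no active property to begin with
var firstClick = toggle(series);   // 0: the line is hidden
var secondClick = toggle(series);  // 1: the line is shown again
```

So in this example 'active' really means 'has been toggled off', which is why an active line ends up with an opacity of 0.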
Head on over to the live example on bl.ocks.org to see the toggling awesomeness that could be yours!

The description above (and heaps of other stuff) is in the D3 Tips and Tricks book that can be downloaded for free (or donate if you really want to :-)).

18 comments:

  1. D3Noob, I love this example, it's really awesome!

    How would I add Dots/Points to these Line Charts?

    Replies
    1. Try the section here on adding points and see how you get on. https://leanpub.com/D3-Tips-and-Tricks/read#leanpub-auto-change-a-line-chart-into-a-scatter-plot

  2. Hey so do you think you could tell me how this would work with a JSON file instead of a CSV? Thanks a lot :)

    Replies
    1. Hey! Sorry for the really late reply. The good news is that to use a json file instead of a csv it's as simple as replacing the `d3.csv("stocks.csv"` part of the main file with something like `d3.json("stocks.json"`, where the file 'stocks.json' has the data encoded as json. Check out here for a bit of a primer on json https://leanpub.com/D3-Tips-and-Tricks/read#leanpub-auto-understanding-javascript-object-notation-json.

  3. Thanks for this example, this might be what I'm looking for!

    But I ran into an issue with the legend: I have many many data-values with different lengths, and I can't get the legend-area to put those in multiple lines. It seems all "symbols"/"keys" are placed in one line and they merge into each other. I tried to set the height of the legend itself without success. I also changed the "legendSpace" value to "200"...then the symbols/keys don't get merged into each other, but there are only four symbols/keys displayed/visible.

    Any hint on how I could solve this individual issue?

    Replies
    1. Good question! I'm sure it could be done, but it would be quite a significant change to the code presented here. Perhaps some sort of 'packing' approach depending on the space made available.

  4. Your code is working fine with Stock.csv
    #stocks.csv#
    symbol,date,price
    MSFT,Jan 2000,39.81
    MSFT,Feb 2000,36.35
    MSFT,Mar 2000,43.22
    MSFT,Apr 2000,28.37

    But it breaks when i tried to use
    #New.csv#
    symbol,date,price
    close,1-May-12,68.13
    close,30-Apr-12,63.98
    close,27-Apr-12,67.00
    close,26-Apr-12,89.70
    close,25-Apr-12,99.00
    close,24-Apr-12,130.28
    close,23-Apr-12,166.70
    open,1-May-12,34.12
    open,30-Apr-12,45.56
    open,27-Apr-12,67.89
    open,26-Apr-12,78.54
    open,25-Apr-12,89.23
    open,24-Apr-12,99.23
    open,23-Apr-12,101.34


    Please advise me how to resolve issue.
    Many thanks in advance.

    Replies
    1. Hi there, The most likely problem you will be facing is that your 'date' column is formatted differently and D3 will want to know how it is formatted. Instead of me solving the problem for you I will let you know where to find the right information and see if you can work it out yourself (this will really help the learning process and this particular piece of information is super useful). So I've explained it in the book here https://leanpub.com/D3-Tips-and-Tricks/read#leanpub-auto-formatting-the-date--time or there is the official reference here https://github.com/d3/d3/wiki/Time-Formatting. As a hint, you should be focussing on the line `var parseDate = d3.time.format("%b %Y").parse;`. Good luck.

  5. You are right , I will go through the given links and will fix it .
    Thanks once again.

  6. Hi there, thank you for this example. I have a doubt, in the following line: var active = d.active ? false : true, where does it get the active property from? I don't see active property in your data.

    Replies
    1. Check out the very end of the post. There is a short description of how the active variable is used and the line you mention is the first of two that reference that variable, the second is 7 lines later. It is a little tricky to get straight in the head. I imagine it as a process that simply swaps active states. I hope that helps

  7. Hi there, thank you for this example, it was easy to follow, but I have a doubt. The example given has the same dates for opening and closing. If the opening dates and the closing dates were completely different, or only some of them matched, what could be done in that case?

    Replies
    1. That's a good question and the good news is that the code is still valid. Test it and see :-).

  8. I'm getting an error "d3.scale" is undefined trying to run "var cpucolor = d3.scale.category10();" I was on d3 version 4.0.0 so I upgraded to the latest (4.3.0) and same error. This is on a page that's been running with d3 visualizations, so I know that d3 is present and other d3 methods work fine.

    Replies
    1. Hi, The color setting functions have undergone some changes in v4. What you'll be looking for is something like
      var cpucolor = d3.scaleOrdinal(d3.schemeCategory10);
      The best suggestion I could make (since I haven't added this section to the new v4 book yet) is to check out the example here (http://bl.ocks.org/d3noob/ced1b9b18bd8192d2c898884033b5529) that has most of the necessary changes implemented that you will need. Check it out and see how you can adapt it.

    2. In response to your question I have now updated that particular code example for the book (https://leanpub.com/d3-t-and-t-v4/read#leanpub-auto-multi-line-graph-with-automatic-legend-and-toggling-show--hide-lines) and for bl.ocks.org (http://bl.ocks.org/d3noob/08af723fe615c08f9536f656b55755b4)

  9. Not working in IE and Chrome. Firefox works fine though.

    Replies
    1. That's weird. Although this would fit with a situation where you were hosting your files locally and didn't have a web server running. Firefox is more tolerant of that sort of thing. Might that be the case? A better test might be seeing if the various browsers can display the same graph here http://bl.ocks.org/d3noob/e99a762017060ce81c76. If they can, it will probably be a problem with your web server setup.
