Thursday, 4 August 2016

Create a simple line graph using d3.js v4

The following post is a section of the book 'D3 Tips and Tricks v4.x'.  The entire book can be downloaded in pdf format for free from Leanpub or you can read it online here.
Since this post is a snapshot in time, I recommend that you download a copy of the book, which is updated frequently to improve and expand the content.
---------------------------------------

Starting with a simple graph

We’ll start with the full code for a simple graph and then we can go through it piece by piece.
Here’s what the basic graph looks like;
Basic Graph

And here’s the code that makes it happen;

<!DOCTYPE html>
<meta charset="utf-8">
<style> /* set the CSS */

.line {
  fill: none;
  stroke: steelblue;
  stroke-width: 2px;
}

</style>
<body>

<!-- load the d3.js library -->     
<script src="https://d3js.org/d3.v4.min.js"></script>
<script>

// set the dimensions and margins of the graph
var margin = {top: 20, right: 20, bottom: 30, left: 50},
    width = 960 - margin.left - margin.right,
    height = 500 - margin.top - margin.bottom;

// parse the date / time
var parseTime = d3.timeParse("%d-%b-%y");

// set the ranges
var x = d3.scaleTime().range([0, width]);
var y = d3.scaleLinear().range([height, 0]);

// define the line
var valueline = d3.line()
    .x(function(d) { return x(d.date); })
    .y(function(d) { return y(d.close); });

// append the svg object to the body of the page
// appends a 'group' element to 'svg'
// moves the 'group' element to the top left margin
var svg = d3.select("body").append("svg")
    .attr("width", width + margin.left + margin.right)
    .attr("height", height + margin.top + margin.bottom)
  .append("g")
    .attr("transform",
          "translate(" + margin.left + "," + margin.top + ")");

// Get the data
d3.csv("data.csv", function(error, data) {
  if (error) throw error;

  // format the data
  data.forEach(function(d) {
      d.date = parseTime(d.date);
      d.close = +d.close;
  });

  // Scale the range of the data
  x.domain(d3.extent(data, function(d) { return d.date; }));
  y.domain([0, d3.max(data, function(d) { return d.close; })]);

  // Add the valueline path.
  svg.append("path")
      .data([data])
      .attr("class", "line")
      .attr("d", valueline);

  // Add the X Axis
  svg.append("g")
      .attr("transform", "translate(0," + height + ")")
      .call(d3.axisBottom(x));

  // Add the Y Axis
  svg.append("g")
      .call(d3.axisLeft(y));

});

</script>
</body>

The full code for this example can be found on github or in the code samples bundled with this book (simple-graph.html and data.csv). A live example can be found on bl.ocks.org. Please note that the <head></head> tags are omitted which is a common thing for d3 examples (I don’t know why). This can cause problems for some browsers in certain conditions.
Once we’ve finished working through the explanation of the functional blocks that make up the graph, we’ll start looking at what we need to add in and adjust so that we can incorporate other useful functions that are completely reusable in other diagrams as well.
Working on the premise that we can break the file down into component parts we will explain the major blocks as HTML, CSS and JavaScript. I’m going to play kind of fast and loose here, but never fear, it’ll all make sense.

HTML

Here’s the HTML portion of the code;

<!DOCTYPE html>
<meta charset="utf-8">
<style>

    The CSS is in here

</style>
<body>
<script src="https://d3js.org/d3.v4.min.js"></script>

<script>

    The D3 JavaScript code is here

</script>
</body>

Compare it with the full code. It kind of looks like a wrapping for the CSS and JavaScript. You can see that it really doesn’t boil down to much at all (that doesn’t mean it’s not important).
There are plenty of good options for adding additional HTML stuff into this very basic part of the file, but for what we’re going to be doing, we really don’t need to bother too much.
One thing probably worth mentioning is the line;

<script src="https://d3js.org/d3.v4.min.js"></script>

That’s the line that identifies the file that needs to be loaded to get D3 up and running. In this case the file is sourced from the official d3.js site on the internet (that way we are using the most up to date release of version 4). The D3 file is actually called d3.v4.min.js which may come as a bit of a surprise. That tells us that this is version 4 of the d3.js file (the v4 part) which is an indication that it is separate from the v3 release, which was superseded in the middle of 2016. The other point to note is that this version of d3.js is the minimised version (hence min). This means that any extraneous information (such as comments and whitespace) has been removed from the file to make it quicker to load.
Later when doing things like implementing integration with bootstrap (a pretty layout framework) we will be doing a great deal more, but for now, that’s the basics done.
The two parts that we left out are the CSS and the D3 JavaScript.

Cascading Style Sheets (CSS)

The CSS is as follows;

.line {
  fill: none;
  stroke: steelblue;
  stroke-width: 2px;
}

Cascading Style Sheets (CSS) give you control over the look / feel / presentation of web content. The idea is to assign a set of properties to objects in the web page.
They are made up of ‘rules’. Each rule has a ‘selector’ and one or more ‘declarations’ and each declaration has a property and a value (or a group of properties and values).
For instance in the example code for this web page we have the following rule;

.line {
  fill: none;
  stroke: steelblue;
  stroke-width: 2px;
}

.line is the selector. The period (.) in front of line indicates that it is a ‘class’ selector. This tells us that on the web page, any element which we decorate with the class ‘line’ (and we are going to apply this rule to the line of our graph) will have the various declarations applied to it.
There are three declarations as part of the rule. These are contained within the curly braces and separated by semi-colons.
One of the declarations is for the width of the graph line (stroke-width: 2px;). The property is stroke-width and the value is 2px (2 pixels). This tells the web page that any element in the web page that has the class line will have lines drawn that are (amongst other things) 2 pixels wide.
Sure enough if we look at the line of the graph…
Graph line with stroke-width of 2 pixels

That looks as if the line might actually be 2 pixels wide!
Let’s try a test. We can change that particular declaration to the following;

  stroke-width: 20px;

and the result is…
Graph line with stroke-width of 20 pixels

Ahh…. 20 pixels of goodness!
Because we’re getting the hang of things now, let’s change the colour declaration to…

  stroke: red;

and we get…
Graph line with stroke colour changed to red

Awesome! I think we can safely say that this has had the desired effect.
So what else is there?
Since there’s only one declaration left, it seems like a shame not to try something different with it;

    fill: blue;

We’ll get…
Basic Graph with changed CSS

So the ‘fill’ property looks like it will change the colour of the area that would be closed by the line. Nice.
The one thing to take away from this small exercise is that there is a good deal of flexibility in adjusting properties of elements on the web page via CSS.

D3 JavaScript

The D3 JavaScript part of the code is as follows;

// set the dimensions and margins of the graph
var margin = {top: 20, right: 20, bottom: 30, left: 50},
    width = 960 - margin.left - margin.right,
    height = 500 - margin.top - margin.bottom;

// parse the date / time
var parseTime = d3.timeParse("%d-%b-%y");

// set the ranges
var x = d3.scaleTime().range([0, width]);
var y = d3.scaleLinear().range([height, 0]);

// define the line
var valueline = d3.line()
    .x(function(d) { return x(d.date); })
    .y(function(d) { return y(d.close); });

// append the svg object to the body of the page
// appends a 'group' element to 'svg'
// moves the 'group' element to the top left margin
var svg = d3.select("body").append("svg")
    .attr("width", width + margin.left + margin.right)
    .attr("height", height + margin.top + margin.bottom)
  .append("g")
    .attr("transform",
          "translate(" + margin.left + "," + margin.top + ")");

// Get the data
d3.csv("data.csv", function(error, data) {
  if (error) throw error;

  // format the data
  data.forEach(function(d) {
      d.date = parseTime(d.date);
      d.close = +d.close;
  });

  // Scale the range of the data
  x.domain(d3.extent(data, function(d) { return d.date; }));
  y.domain([0, d3.max(data, function(d) { return d.close; })]);

  // Add the valueline path.
  svg.append("path")
      .data([data])
      .attr("class", "line")
      .attr("d", valueline);

  // Add the X Axis
  svg.append("g")
      .attr("transform", "translate(0," + height + ")")
      .call(d3.axisBottom(x));

  // Add the Y Axis
  svg.append("g")
      .call(d3.axisLeft(y));

});

Again there’s quite a bit of detail in the code, but it’s not so long that we can’t work out what’s doing what.
The first thing to note is that throughout the code we have lines that are adding a description of what the code does. These have two forward-stroke characters (//) preceding them which the computer will recognise as a line that only contains comments. I recommend that you add them into your own code where you think that you might want reminding of a function or description.
Let’s examine the blocks bit by bit to get a feel for it.

Setting up the margins and the graph area.

The part of the code responsible for defining the canvas (or the area where the graph and associated bits and pieces are placed) is this part.

var margin = {top: 20, right: 20, bottom: 30, left: 50},
    width = 960 - margin.left - margin.right,
    height = 500 - margin.top - margin.bottom;

This is really (really) well explained on Mike Bostock’s page on margin conventions here http://bl.ocks.org/3019563, but at the risk of confusing you here’s my crude take on it.
The first line defines the four margins which surround the block where the graph (as an object) is positioned.

var margin = {top: 20, right: 20, bottom: 30, left: 50},

So there will be a border of 20 pixels at the top, 20 at the right and 30 and 50 at the bottom and left respectively. Now the cool thing about how these are set up is that they use an object to define everything. That means if you want to do calculations in the JavaScript later, you don’t need to put the numbers in, you just use the property that has been set up. In this case margin.right = 20!
So when we go to the next line;

    width = 960 - margin.left - margin.right,

The width of the inner block of the area where the graph will be drawn is 960 pixels – margin.left – margin.right or 960-50-20 or 890 pixels wide. Of course now we have another variable ‘width’ that we can use later in the code.
Obviously the same treatment is given to height.
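The arithmetic above can be checked directly in plain JavaScript. This is only a minimal sketch: the innerSize helper is a hypothetical name used for illustration and is not part of D3.

```javascript
// Margins and outer size copied from the graph's script.
var margin = {top: 20, right: 20, bottom: 30, left: 50};

// Hypothetical helper (not part of D3): subtract the margins from the
// outer size to get the inner drawing area.
function innerSize(outerWidth, outerHeight, margin) {
  return {
    width: outerWidth - margin.left - margin.right,
    height: outerHeight - margin.top - margin.bottom
  };
}

var size = innerSize(960, 500, margin);
console.log(size.width);  // 890
console.log(size.height); // 450
```

Because the margins live in one object, changing a single number there flows through to every later calculation that uses width or height.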
Another cool thing about all of this is that just because we appear to have defined separate areas for the graph and the margins, the whole area in there is available for use. It just makes it really useful to have areas designated for the axis labels and graph labels without having to juggle them and the graph proper at the same time.
So, let’s have a play and change some values.

var margin = {top: 80, right: 20, bottom: 80, left: 50},
    width = 400 - margin.left - margin.right,
    height = 270 - margin.top - margin.bottom;

The effect of changing the margins

Here we’ve made the graph narrower (400 pixels) but retained the left / right margins and increased the top / bottom margins while changing the overall height of the canvas to 270 pixels. The really cool thing that you can tell from this is that while we shrank the dimensions of the area that we had to draw the graph in, it was still able to dynamically adapt the axes and line to fit properly (Although the x axis values got a bit squished. Don’t worry we’ll work through that shortly). That is the really cool part of this whole business. D3 is running in the background looking after the drawing of the objects, while you get to concentrate on how the data looks without too much maths!


Getting the Data

We’re going to jump forward a little bit here to the portion of the JavaScript code that loads the data for the graph.
I’m going to go out of the sequence of the code here, because if you know what the data is that you’re using, it will make explaining some of the other functions much easier.
The section that grabs the data is this bit.

d3.csv("data.csv", function(error, data) {
  if (error) throw error;
  
  // format the data
  data.forEach(function(d) {
      d.date = parseTime(d.date);
      d.close = +d.close;
  });

There are lots of different ways that we can get data into our web page and turn it into graphics. The method that we’ll want to use will probably depend more on the format that the data is in than the mechanism we want to use for importing.
For instance, if it’s only a few points of data we could include the information directly in the JavaScript.
That would make it look something like;

var data = [
    {date:"1-May-12",close:"58.13"},
    {date:"30-Apr-12",close:"53.98"},
    {date:"27-Apr-12",close:"67.00"},
    {date:"26-Apr-12",close:"89.70"},
    {date:"25-Apr-12",close:"99.00"}
];

The format of the data shown above is closely related to JSON (JavaScript Object Notation; strict JSON would also wrap the property names in double quotes) and it’s a great way to include data since it’s easy for humans to read what’s in there and it’s easy for computers to parse the data out. For a brief overview of JSON there is a separate section in the “Assorted Tips and Tricks Chapter” that may assist.
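As a small aside, the built-in JSON object converts between an array like the one above and JSON text. This sketch uses only standard JavaScript (JSON.stringify and JSON.parse) and two of the rows shown above.

```javascript
var data = [
    {date: "1-May-12", close: "58.13"},
    {date: "30-Apr-12", close: "53.98"}
];

// Serialise the array to JSON text. In JSON text the property names
// are always wrapped in double quotes.
var text = JSON.stringify(data);
console.log(text);
// [{"date":"1-May-12","close":"58.13"},{"date":"30-Apr-12","close":"53.98"}]

// Parse the text back into an array of objects.
var roundTrip = JSON.parse(text);
console.log(roundTrip[1].close); // 53.98 (still a string, not a number)
```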
But if you’ve got a fair bit of data or if the data you want to include is dynamic and could be changing from one moment to the next, you’ll want to load it from an external source. That’s when we call on D3’s ‘Request’ functions.


The different types of data that can be requested by D3 are;
  • text: A plain old piece of text that has options to be encoded in a particular way.
  • json: This is the aforementioned JavaScript Object Notation.
  • xml: Extensible Markup Language is a language that is widely used for encoding documents in a human readable form.
  • html: HyperText Markup Language is the language used for displaying web pages.
  • csv: Comma Separated Values is a widely used format for storing data where plain text information is separated by (wait for it) commas.
  • tsv: Tab Separated Values is a widely used format for storing data where plain text information is separated by a tab-stop character.
Details on these ingestion methods and the formats for the requests are well explained on the D3 Wiki page. In this particular script we will look at the csv request method.


Back to our request…

d3.csv("data.csv", function(error, data) {
  if (error) throw error;
  
  // format the data
  data.forEach(function(d) {
      d.date = parseTime(d.date);
      d.close = +d.close;
  });

The first line of that piece of code invokes the d3.csv request (d3.csv) and then the function is pointed to the data file that should be loaded (data.csv). This is referred to as the ‘URL’ (Uniform Resource Locator) of the file. In this case the file is stored locally (in the same directory as the simple-graph.html file), but the URL could just as easily point to a file somewhere on the Internet.
The format of the data in the data.csv file looks a bit like this (although the file is longer (about 26 data points));

date,close
1-May-12,58.13
30-Apr-12,53.98
27-Apr-12,67.00
26-Apr-12,89.70
25-Apr-12,99.00

The ‘date’ and the ‘close’ heading labels are separated by a comma as are each subsequent date and number. Hence the ‘comma separated values’ :-).
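To picture what arrives in ‘data’ once the file has loaded, the following is a much simplified stand-in for the parsing step. The parseSimpleCsv function is a hypothetical helper written for this illustration, not D3's parser, and it assumes there are no quoted fields or commas inside values. Like d3.csv, it produces one object per row, keyed by the header names, with every value still a string.

```javascript
// Hypothetical, simplified CSV parser for illustration only.
// Assumes a header row, no quoted fields and no commas inside values.
function parseSimpleCsv(text) {
  var lines = text.trim().split("\n");
  var headers = lines[0].split(",");
  return lines.slice(1).map(function (line) {
    var values = line.split(",");
    var row = {};
    headers.forEach(function (header, i) {
      row[header] = values[i];
    });
    return row;
  });
}

var csvText = "date,close\n1-May-12,58.13\n30-Apr-12,53.98\n27-Apr-12,67.00";
var data = parseSimpleCsv(csvText);

console.log(data.length);          // 3
console.log(data[0].date);         // 1-May-12
console.log(typeof data[0].close); // string
```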
The next part is part of the coolness of JavaScript. With the request for the file made, the script is told to carry out a function on the data (which will now be called ‘data’).

function(error, data) {

The function receives two arguments: any error that was generated by the request, and the data that was ingested, as the array ‘data’. The following line (if (error) throw error;) ensures that if an error was generated it is ‘thrown’ rather than silently ignored. If nothing catches it, the browser reports the error in its console and the rest of the function does not run.
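The ‘error first’ callback pattern described above can be sketched without D3. Here loadRows is a hypothetical stand-in for a request function; it calls the callback straight away to keep the example short, whereas a real request such as d3.csv calls it later, once the file has arrived.

```javascript
// Hypothetical loader: passes (error, data) to the callback, error first.
function loadRows(rows, callback) {
  if (!Array.isArray(rows)) {
    callback(new Error("no rows available"), undefined);
  } else {
    callback(null, rows);
  }
}

var loaded = null;
var failure = null;

// Success: error is null, so the callback carries on and uses the data.
loadRows([{date: "1-May-12", close: "58.13"}], function (error, data) {
  if (error) throw error;
  loaded = data;
});

// Failure: the callback throws, and here a try / catch block picks it up.
try {
  loadRows(undefined, function (error, data) {
    if (error) throw error;
    loaded = data; // never reached
  });
} catch (e) {
  failure = e.message;
}

console.log(loaded.length); // 1
console.log(failure);       // no rows available
```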
There are actually more things that get acted on as part of the function call (which we will examine soon), but the one we will consider here is contained in the following lines;

  data.forEach(function(d) {
      d.date = parseTime(d.date);
      d.close = +d.close;
  });

This block of code ensures that all the values that are pulled out of the csv file are set and formatted correctly. The first line declares that the data array called ‘data’ (confusingly) is being dealt with and tells the block of code that, for each group within the ‘data’ array it should carry out a function on it. Furthermore, when it carries out the formatting of each part of the array, it should designate the equivalent of each row as being ‘d’.

    data.forEach(function(d) { 

The information in the array can be considered as being stored in rows. Each row consists of two values: one value for ‘date’ and another value for ‘close’.
The function is pulling out values of ‘date’ and ‘close’ one row at a time.
Each time (Get it? forEach?) it gets a value of ‘date’ and ‘close’ it carries out the following operations;

      d.date = parseTime(d.date);

For each value of date being operated on (d.date), d3.js changes it into a date format that is processed via a separate function ‘parseTime’. (The parseTime function is defined in a separate part of the script, and we will examine that later.) For the moment, be satisfied that it takes the raw date information from the CSV file in each row and converts it into a format that D3 can recognise as a date/time. That value is then re-saved in the same variable space.
The next line then sets the ‘close’ variable to a numeric value (if it isn’t already) using the ‘+’ operator.

      d.close = +d.close;

At the end of this section of code, we have gone out and picked up a file with data in it of a particular type (comma separated values) and ensured that it is formatted in a way that the rest of the script can use correctly.
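The effect of the ‘+’ operator can be seen in isolation. This sketch uses only standard JavaScript; the two rows are taken from the sample data and the date conversion is left out so that it runs without D3.

```javascript
var data = [
  {date: "1-May-12", close: "58.13"},
  {date: "30-Apr-12", close: "53.98"}
];

console.log(typeof data[0].close); // string

// Convert each 'close' value from a string to a number in place.
data.forEach(function (d) {
  d.close = +d.close;
});

console.log(typeof data[0].close);          // number
console.log(data[0].close + data[1].close); // a numeric sum, not "58.1353.98"
```

A value that cannot be read as a number (for instance +"abc") becomes NaN, so a stray non-numeric entry in the file will show up as a gap rather than an error.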
Now, the astute amongst you will have noticed that in the first line of that block of code (d3.csv("data.csv", function(error, data) {) we opened a normal bracket ( ( ) and a curly bracket ( { ), but we never closed them. That’s because they stay open until the very end of the script (the final }); line). That means that all those blocks that occur after the d3.csv bit are referenced to the data array. Or put another way, it uses the data in the data array to draw stuff!
But anyway, let’s get back to figuring what the code is doing by jumping back to the end of the margins block.

Formatting the Date / Time.

One of the glorious things about the World is that we all do things a bit differently. One of those things is how we refer to dates and time.
In my neck of the woods, it’s customary to write the date as day - month - year. E.g. 23-12-2012. But in the United States the more common format would be 12-23-2012. Likewise, the data may be in formats that name the months or weekdays (E.g. January, Tuesday) or combine dates and time together (E.g. 2012-12-23 15:45:32). So, if we were to attempt to load in some data and get D3 to recognise it as date / time information, we really need to tell it what format the date / time is in.


Time for a little demonstration (see what I did there).
We will change our data.csv file so that it only includes two points: the first one and the last one, with a separation of a month and a bit. It will therefore look a little like this;

date,close
1-May-12,58.13
26-Mar-12,606.98

The graph now looks like this;
Simple line graph

Nothing too surprising here, a very simple graph (note the time scale on the x axis).
Now we will change the later date in the data.csv file so that it is a lot closer to the starting date;

date,close
29-Mar-12,58.13
26-Mar-12,606.98

So, just a three day difference. Let’s see what happens.
Simple line graph over three days

Ahh…. Not only did we not have to make any changes to our JavaScript code, but it was able to recognise the dates were closer and fill in the intervening gaps with appropriate time / day values. Now, one more time for giggles.
This time we’ll stretch the interval out by a few years.

date,close
29-Mar-21,58.13
26-Mar-12,606.98

and the result is…
Simple line graph over several years

Hopefully that’s enough encouragement to impress upon you that formatting the time is a REALLY good thing to get right. Trust me, it will never fail to impress :-).
Back to formatting.
The line in the JavaScript that parses the time is the following;

var parseTime = d3.timeParse("%d-%b-%y");

This line is used when the data.forEach(function(d) portion of the code (that we looked at a couple of pages back) used d.date = parseTime(d.date) as a way to take a date in a specific format and to get it recognised by D3. In effect it said “take this value that is supposedly a date and make it into a value I can work with”.
The function used is the d3.timeParse(specifier) function where the specifier in this case is the mysterious combination of characters %d-%b-%y. The good news is that these are just a combination of directives specific for the type of date we are presenting.
The % signs are used as prefixes to each separate format type and the ‘-’ (minus) signs are literals for the actual ‘-’ (minus) signs that appear in the date to be parsed.
The d refers to a zero-padded day of the month as a decimal number [01,31].
The b refers to an abbreviated month name.
And the y refers to the year (without the centuries) as a decimal number.
If we look at a subset of the data from the data.csv file we see that indeed, the dates therein are formatted in this way.

1-May-12,58.13
30-Apr-12,53.98
27-Apr-12,67.00
26-Apr-12,89.70
25-Apr-12,99.00
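To make the three directives concrete, here is a hand-written stand-in that handles only the %d-%b-%y pattern. parseDayMonthYear is a hypothetical name and this is not how d3.timeParse is implemented; it assumes English month abbreviations and, as a simplification, treats every two-digit year as belonging to the 2000s.

```javascript
var MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Hypothetical parser for strings such as "1-May-12" (day-month-year).
// Returns a Date at local midnight, or null if the string doesn't match.
function parseDayMonthYear(text) {
  var parts = text.split("-");
  if (parts.length !== 3) return null;
  var day = parseInt(parts[0], 10);         // %d: day of the month
  var month = MONTHS.indexOf(parts[1]);     // %b: abbreviated month name
  var year = 2000 + parseInt(parts[2], 10); // %y: year without century (assumed 20xx)
  if (isNaN(day) || month === -1 || isNaN(year)) return null;
  return new Date(year, month, day);
}

var parsed = parseDayMonthYear("1-May-12");
console.log(parsed.getFullYear()); // 2012
console.log(parsed.getMonth());    // 4 (months count from 0, so 4 is May)
console.log(parsed.getDate());     // 1
console.log(parseDayMonthYear("not a date")); // null
```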

That’s all well and good, but what if your data isn’t formatted exactly like that?
Good news. There are multiple different formatters for different ways of telling time and you get to pick and choose which one you want. Check out the Time Formatting page on the D3 Wiki for the authoritative list and some great detail, but the following is the list of currently available formatters (from the d3 wiki);
  • %a - abbreviated weekday name.
  • %A - full weekday name.
  • %b - abbreviated month name.
  • %B - full month name.
  • %c - date and time, as “%a %b %e %H:%M:%S %Y”.
  • %d - zero-padded day of the month as a decimal number [01,31].
  • %e - space-padded day of the month as a decimal number [ 1,31].
  • %H - hour (24-hour clock) as a decimal number [00,23].
  • %I - hour (12-hour clock) as a decimal number [01,12].
  • %j - day of the year as a decimal number [001,366].
  • %m - month as a decimal number [01,12].
  • %M - minute as a decimal number [00,59].
  • %p - either AM or PM.
  • %S - second as a decimal number [00,61].
  • %U - week number of the year (Sunday as the first day of the week) as a decimal number [00,53].
  • %w - weekday as a decimal number [0(Sunday),6].
  • %W - week number of the year (Monday as the first day of the week) as a decimal number [00,53].
  • %x - date, as “%m/%d/%y”.
  • %X - time, as “%H:%M:%S”.
  • %y - year without century as a decimal number [00,99].
  • %Y - year with century as a decimal number.
  • %Z - time zone offset, such as “-0700”.
  • There is also a literal “%” character that can be presented by using double % signs.
As an example, if you wanted to input date / time formatted as a generic MySQL ‘YYYY-MM-DD HH:MM:SS’ TIMESTAMP format the D3 parse script would look like;

var parseTime = d3.timeParse("%Y-%m-%d %H:%M:%S");

Setting Scales, Domains and Ranges

This is another example where, if you set it up right, D3 will look after you forever.


From our basic web page we have now moved to the section that includes the following lines;

// set the ranges
var x = d3.scaleTime().range([0, width]);
var y = d3.scaleLinear().range([height, 0]);

The purpose of these portions of the script is to ensure that the data we ingest fits onto our graph correctly. Since we have two different types of data (date/time and numeric values) they need to be treated separately (but d3 manages them in almost the same way). To examine this whole concept of scales, domains and ranges properly, we will also move slightly out of sequence and (in conjunction with the earlier scale statements) take a look at the lines of script that occur later and set the domain. They are as follows;

  // Scale the range of the data
  x.domain(d3.extent(data, function(d) { return d.date; }));
  y.domain([0, d3.max(data, function(d) { return d.close; })]);

The idea of scaling is to take the range of values of data that we have and to fit them into the space we have available.
If we have data that goes from 53.98 to 636.23 (as the data we have for ‘close’ in our csv file does), but we have a graph that is 450 pixels high (height = 500 - margin.top – margin.bottom;) we clearly need to make an adjustment.
Not only that. Even though our data goes from 53.98 to 636.23, that would look slightly misleading on the graph and it should really go from 0 to a bit over 636.23. It sounds really complicated, so let’s simplify it a bit.
First we make sure that any quantity we specify on the x axis fits onto our graph.

var x = d3.scaleTime().range([0, width]);

Here we set our variable (x) that will tell D3 where to draw something on the x axis. By using the d3.scaleTime() function we make sure that D3 knows to treat the values as date / time entities (with all their ingrained peculiarities). Then we specify the range that those values will cover (.range) and we specify the range as being from 0 to the width of our graphing area (See! Setting those variables for margins and widths is starting to pay off now!).
Then we do the same for the Y axis.

var y = d3.scaleLinear().range([height, 0]);

There’s a different function call (d3.scaleLinear()) but the .range setting is still there. In the interests of drawing a (semi) pretty picture to try and explain, hopefully this will assist;
Scaling the data to the graph size

I know, I know, it’s a little misleading because nowhere have we actually said to D3 this is our data from 53.98 to 636.23. All we’ve said is when we get the data, we’ll be scaling it into this space.
Now hang on, what’s going on with the [height, 0] part in the y axis scale statement? The astute amongst you will note that for the time scale we set the range as [0, width] but for this one ([height, 0]) the values look backwards.
Well spotted.
This is all to do with how the screen is laid out and referenced. Take a look at the following diagram showing how the coordinates for drawing on your screen work;
Coordinates that the browser expects

The top left hand of the screen is the origin or 0,0 point and as we go right or down the corresponding x and y values increase to the full values defined by width and height.
That’s good enough for the time values on the x axis that will start at lower values and increase, but for the values on the y axis we’re trying to go against the flow. We want the low values to be at the bottom and the high values to be at the top.
No problem. We just tell D3 via the statement y = d3.scaleLinear().range([height, 0]); that the low data values should be drawn at the position given by height (the bottom of the graph area, furthest from the origin) and the high data values at position 0 (the top). As you most probably will have guessed by this stage, the .range statement uses the format .range([position_of_the_lowest_value, position_of_the_highest_value]). So when we put the height variable first, the lowest values are associated with the bottom of the screen.
Coordinates with adjusted ranges
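The mapping that a linear scale performs can be written out by hand. The linearScale function below is a hypothetical, simplified stand-in for d3.scaleLinear (no clamping, ticks or other features), using the 450 pixel height and the 0 to 636.23 span of values discussed in this section.

```javascript
// Hypothetical stand-in for a linear scale: maps a value from the
// domain [d0, d1] onto the range [r0, r1] by linear interpolation.
function linearScale(domain, range) {
  return function (value) {
    var fraction = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + fraction * (range[1] - range[0]);
  };
}

var height = 450;
var y = linearScale([0, 636.23], [height, 0]);

console.log(y(0));      // 450 (lowest value sits at the bottom of the graph)
console.log(y(636.23)); // 0   (highest value sits at the top)
```

Swapping the range to [0, height] would flip the picture upside down, which is exactly why the y axis range is written ‘backwards’.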

We’ve scaled our data to the graph size and ensured that the range of values is set appropriately. What’s with the domain part that was in this section’s title?
Come on, you remember this little piece of script don’t you?

  x.domain(d3.extent(data, function(d) { return d.date; }));
  y.domain([0, d3.max(data, function(d) { return d.close; })]);

While it exists in a separate part of the file from the scale / range part, it is certainly linked.
That’s because there’s something missing from what we have been describing so far with the set up of the data ranges for the graphs. We haven’t actually told D3 what the range of the data is. That’s also the reason this part of the script occurs where it does. It is within the section where the data.csv file has been loaded as ‘data’ and it’s therefore ready to use it.
So, the .domain function is designed to let D3 know what the scope of the data will be. This is what is then passed to the scale function.
Looking at the first part that is setting up the x axis values, it is saying that the domain for the x axis values will be determined by the d3.extent function which in turn is acting on a separate function which looks through all the ‘date’ values that occur in the ‘data’ array. In this case the .extent function returns the minimum and maximum value in the given array.
  • function(d) { return d.date; } returns the ‘date’ value of each row in ‘data’. These are then passed to…
  • The .extent function that finds the minimum and maximum values in the array and then…
  • The .domain function which hands those minimum and maximum values to the scale as the domain for the x axis.
Pretty neat really. At first you might think it was overly complex, but breaking the function down into these components allows additional functionality with differing scales, values and quantities. In short, don’t sweat it. It’s a good thing.
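What the extent step works out can be sketched in plain JavaScript. The extentOf function is a hypothetical stand-in for d3.extent, written only to show the idea: apply the accessor to each row and keep the smallest and largest results.

```javascript
// Hypothetical stand-in for d3.extent: returns [minimum, maximum] of the
// values the accessor picks out of each row.
function extentOf(rows, accessor) {
  var min, max;
  rows.forEach(function (d) {
    var value = accessor(d);
    if (min === undefined || value < min) min = value;
    if (max === undefined || value > max) max = value;
  });
  return [min, max];
}

// Rows as they look after the formatting step (dates and numbers).
var data = [
  {date: new Date(2012, 4, 1), close: 58.13},
  {date: new Date(2012, 3, 30), close: 53.98},
  {date: new Date(2012, 3, 27), close: 67.00}
];

var dateExtent = extentOf(data, function (d) { return d.date; });
var closeExtent = extentOf(data, function (d) { return d.close; });

console.log(dateExtent[0].getDate(), dateExtent[1].getDate()); // 27 1
console.log(closeExtent); // [ 53.98, 67 ]
```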
The x axis values are dates, so the domain for them is basically from the 26th of March 2012 till the 1st of May 2012. The y axis is done slightly differently;

  y.domain([0, d3.max(data, function(d) { return d.close; })]);

Because the range of values desired on the y axis goes from 0 to the maximum in the data range, that’s exactly what we tell D3. The ‘0’ in the .domain function is the starting point and the finishing point is found by employing a separate function that sorts through all the ‘close’ values in the ‘data’ array and returns the largest one. Therefore the domain is from 0 to 636.23.
Let’s try a small experiment. Let’s change the y axis domain to use the .extent function (the same way the x axis does) to see what it produces.
The JavaScript for the y domain will be;

  y.domain(d3.extent(data, function(d) { return d.close; }));

You can see that, apart from a quick copy and paste of the internals, all I had to change was the reference to ‘close’ rather than ‘date’.
And the result is…
Graph using .extent for data values

Look at that! The starting point for the y axis looks like it’s pretty much on the 53.98 mark and the graph itself certainly touches the x axis where the data would indicate it should.
Now, I’m not really advocating making a graph like this since I think it looks a bit nasty (and a casual observer might be fooled into thinking that the x axis was at 0). However, this would be a useful thing to do if the data was concentrated in a narrow range of values that are quite distant from zero.
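The difference between the two domains can be put into numbers. This sketch reuses a hand-written linear mapping (linearScale, a hypothetical stand-in for d3.scaleLinear) with the 450 pixel height and the smallest and largest ‘close’ values quoted in this section.

```javascript
// Hypothetical stand-in for a linear scale, for illustration only.
function linearScale(domain, range) {
  return function (value) {
    var fraction = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + fraction * (range[1] - range[0]);
  };
}

var height = 450;
var yFromZero = linearScale([0, 636.23], [height, 0]);       // [0, max]
var yFromExtent = linearScale([53.98, 636.23], [height, 0]); // [min, max]

// Pixel position of the smallest value under each domain.
console.log(yFromExtent(53.98));           // 450, right on the x axis
console.log(Math.round(yFromZero(53.98))); // 412, about 38 pixels above it
```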
For instance, if I change the data.csv file so that the values are represented like the following;
Concentrated data range graph

Then it kind of loses the ability to distinguish between values around the median of the data.
But, if I put in our magic .extent function for the y axis and redraw the graph…
Expanded concentrated data range using .extent

How about that?
The same data as the previous graph, but with one simple piece of the script changed and D3 takes care of the details.

Adding data to the line function

We’re getting towards the end of our journey through the script now. The next step is to associate the array ‘data’ with a new array that consists of a set of coordinates that we are going to plot.
I’m aware that the statement above may be somewhat ambiguous. You would be justified in thinking that we already had the data stored and ready to go. But that’s not strictly correct.

// define the line
var valueline = d3.line()
    .x(function(d) { return x(d.date); })
    .y(function(d) { return y(d.close); });

What we have is data in a raw format. We have added pieces of code that will allow the data to be adjusted for scale and range to fit in the area that we want to draw, but we haven’t actually taken our raw data and adjusted it for our desired coordinates. That’s what the code above does.
The main function that gets used here is the d3.line() function. It uses accessor functions to work out where each point goes, and in the case above those are the x and y accessors (that would be the bits that are .x and .y). The d3.line() function is called a ‘path generator’ and this is an indication that it can carry out some pretty clever things of its own accord. But in essence its job is to turn a set of data points into coordinates in a form that can be used to draw a line.
Each time this line function is called on, it will go through the data and will assign coordinates to ‘date’ and ‘close’ pairs using the ‘x’ and ‘y’ functions that we set up earlier (which are responsible for scaling and setting the correct range / domain).
Of course, it doesn’t get the data all by itself, we still need to actually call the valueline function with ‘data’ as the source to act on. But never fear, that’s coming up soon.
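
If it helps to picture what a path generator hands back, here is a rough sketch in plain JavaScript. The makeLine function is a hypothetical stand-in that I've written for illustration only (it is not D3's implementation), and the toy scales and data points are made up.

```javascript
// Hypothetical stand-in for a line generator: NOT D3's implementation.
// It takes x and y accessor functions and returns a function that turns
// an array of data points into an SVG path string.
function makeLine(xAccessor, yAccessor) {
  return function (data) {
    return data
      .map(function (d, i) {
        // "M" moves to the first point, "L" draws a line to each later one.
        return (i === 0 ? "M" : "L") + xAccessor(d) + "," + yAccessor(d);
      })
      .join("");
  };
}

// Assumed toy scales: x doubles the value, y flips it within a height of 100.
function xScale(v) { return v * 2; }
function yScale(v) { return 100 - v; }

var valuelineSketch = makeLine(
  function (d) { return xScale(d.date); },
  function (d) { return yScale(d.close); }
);

// Made-up points; 'date' is a plain number here to keep the sketch short.
var points = [
  { date: 0, close: 10 },
  { date: 5, close: 40 },
  { date: 10, close: 25 }
];

var pathString = valuelineSketch(points);
console.log(pathString); // M0,90L10,60L20,75
```

Nothing is drawn until the generator is actually called with some data, which is the same pattern valueline follows.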

Adding the SVG element

As the title states, the next piece of script forms and adds the SVG element to the web page that D3 will then use to draw on.

var svg = d3.select("body").append("svg")
    .attr("width", width + margin.left + margin.right)
    .attr("height", height + margin.top + margin.bottom)
  .append("g")
    .attr("transform",
          "translate(" + margin.left + "," + margin.top + ")");

So what exactly does that all mean?
Well, D3 needs to have a space defined for it to draw things in. When you define that space, you can also give it an identifying name and attributes.
In the example we’re using here, we are ‘appending’ an SVG element (an element designed for drawing graphics on) to the <body> of the HTML page.


We also add a group element ‘g’ that is referenced to the top left corner of the actual graph area on the canvas. ‘g’ is a grouping element in the sense that it is normally used for grouping together several related elements. So in this case those grouped elements will have a common reference.
Graph area and margins

(the image above is definitely not to scale, but I hope you get the general idea)
There are some interesting things to note about the code. The .attr("stuff in here") parts set attributes on the appended elements they follow.
For instance;

   .append("svg")
    .attr("width", width + margin.left + margin.right)
    .attr("height", height + margin.top + margin.bottom)

tells us that the ‘svg’ element has a “width” of width + margin.left + margin.right and the “height” of height + margin.top + margin.bottom.
Likewise…

  .append("g")
    .attr("transform",
          "translate(" + margin.left + "," + margin.top + ")");

tells us that the group element ‘g’ has been transformed by moving (translating) it to the point margin.left, margin.top, or in other words to the top left of the graph space proper. This way when we tell something to be drawn on our page, we can use this reference point ‘g’ to make sure everything is in the right place.
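
Working the numbers through makes the margin arithmetic a little more concrete. This is just a sketch in plain JavaScript using the margin values from the script at the start of the post. The names svgWidth, svgHeight and groupTransform are ones I've made up for the illustration.

```javascript
// Values taken from the script at the start of the post.
var margin = { top: 20, right: 20, bottom: 30, left: 50 };
var width = 960 - margin.left - margin.right;   // 890
var height = 500 - margin.top - margin.bottom;  // 450

// Size given to the 'svg' element: the graph area plus the margins.
var svgWidth = width + margin.left + margin.right;   // 960
var svgHeight = height + margin.top + margin.bottom; // 500

// The string handed to the 'transform' attribute of the 'g' element.
var groupTransform = "translate(" + margin.left + "," + margin.top + ")";

console.log(svgWidth, svgHeight, groupTransform);
```

So the svg element ends up 960 by 500 pixels, while the graph area inside the margins is 890 by 450 and starts 50 pixels in and 20 pixels down.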

Actually Drawing Something!

Up until now we have spent a lot of time defining, loading and setting up. Good news! We’re about to finally draw something!
Drawing the line
We jump lightly over some of the code that we have already explained and land on the part that draws the line.

  // Add the valueline path.
  svg.append("path")
      .data([data])
      .attr("class", "line")
      .attr("d", valueline);

This area occurs in the part of the code that has the data loaded (via the d3.csv block) and it’s ready for action.
The svg.append("path") portion adds a new path element. A path element represents a shape that can be manipulated in lots of different ways (see more here: http://www.w3.org/TR/SVG/paths.html).
We join our array of data (confusingly the array is called ‘data’) to the path element with the .data([data]) line. We could have used an alternative method here with a line that read .datum(data). Both are completely valid to use, but have different strengths.
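
The reason for the extra square brackets in .data([data]) can be sketched in plain JavaScript. The bindByIndex function below is a hypothetical stand-in that only mimics the idea of pairing selected elements with array entries by position (it is not D3 code), and the data values are made up.

```javascript
// Hypothetical stand-in for a data join: pairs each selected element with the
// array entry at the same index. This only mimics the idea, it is not D3 code.
function bindByIndex(elements, values) {
  return elements.map(function (element, i) {
    return { element: element, datum: values[i] };
  });
}

// Made-up rows standing in for the loaded csv data.
var data = [{ close: 10 }, { close: 20 }, { close: 30 }];

// Appending one path gives us a selection holding a single element.
var selection = ["path"];

// Wrapped in an array: the one path is paired with the whole data array.
var wrapped = bindByIndex(selection, [data]);

// Not wrapped: the one path would only be paired with the first row.
var unwrapped = bindByIndex(selection, data);

console.log(wrapped[0].datum.length); // 3
console.log(unwrapped[0].datum);
```

The line generator needs the whole array to build its path, which is why the array gets wrapped. Using .datum(data) gets to the same place by setting the data on the element directly, without a join.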


The next line down applies the ‘line’ styles from the CSS section that we experimented with earlier.
In the final line (.attr("d", valueline);) we add the attribute ‘d’ to the path, filled in with the path description that the valueline function we declared earlier generates from the data.


Drawing the Axes
Then we get to draw in the axes;

  // Add the X Axis
  svg.append("g")
      .attr("transform", "translate(0," + height + ")")
      .call(d3.axisBottom(x));

  // Add the Y Axis
  svg.append("g")
      .call(d3.axisLeft(y));

Both axes start by appending a group element (‘g’). Each axis will be bound to its own element.
The y axis can be drawn from the default position at the origin of our translated group element (which we recall is 0,0 at the top left of the graph). However the x axis needs to be moved to the bottom of our graph.
On the x axis, we have a transform statement (.attr("transform", "translate(0," + height + ")")). If we want our x axis to be on the bottom of the graph, we need to move (translate) it down by a set amount. The set amount in this case is the height of the graph proper (height). So, for the purposes of demonstration, we will remove the transform line and see what happens;
x axis transformed to the top of the graph

Yep, pretty much as anticipated.
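
A little arithmetic shows why the x axis has to move down by exactly ‘height’. The sketch below is plain JavaScript with a hypothetical linearScale helper standing in for the idea behind d3.scaleLinear (it is not D3's code). The 450 pixel height and the 0 to 636.23 domain are the values used earlier for this graph.

```javascript
// Hypothetical stand-in for a linear scale: maps a domain onto a range.
function linearScale(domain, range) {
  return function (value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

var height = 450; // 500 minus the top (20) and bottom (30) margins

// The range is [height, 0], so larger values map to smaller pixel numbers.
var y = linearScale([0, 636.23], [height, 0]);

var zeroPixel = y(0);      // where a value of 0 lands
var maxPixel = y(636.23);  // where the largest value lands

// The x axis group is shifted down to the pixel row where 0 lands.
var xAxisTransform = "translate(0," + height + ")";

console.log(zeroPixel, maxPixel, xAxisTransform); // 450 0 translate(0,450)
```

SVG pixel coordinates count downwards from the top, so a value of 0 sits at pixel 450. Without the transform the x axis is drawn at pixel 0, which is the top of the graph.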
The last part of the two sections of script ( .call(d3.axisBottom(x)); and .call(d3.axisLeft(y)); ) call the D3 x and y axis functions respectively and initiate the drawing action.
The method by which D3 orientates the axes is relatively self-evident and there are four options;
  • .axisTop: An axis with values and ticks drawn above a horizontal axis.
  • .axisRight: An axis with values and ticks drawn to the right of a vertical axis.
  • .axisBottom: An axis with values and ticks drawn below a horizontal axis.
  • .axisLeft: An axis with values and ticks drawn to the left of a vertical axis.
Just to illustrate the point, we can swap .axisBottom for .axisTop and .axisLeft for .axisRight to see what it looks like;
x and y axes reversed

There we go.
It is worth stating that the axes as presented for this simple graph are very much a ‘straight out of the box’ configuration. Later in the book we will look at options for configuring and styling axes in more depth.

Wrap Up

Well that’s it. In theory, you should now be a complete D3 ninja.
OK, perhaps a slight exaggeration. In fact there is a strong possibility that the information I have laid out here is at best borderline useful and at worst laden with evil practices and gross inaccuracies.
But look on the bright side. Irrespective of the nastiness of the way that any of it was accomplished or the inelegance of the code, if the picture drawn on the screen is pretty, you can walk away with a smile. :-)
This section concludes a very basic description of one type of a graphic that can be built with D3. We will look at adding value to it in subsequent chapters.

I’ve said it before and I’ll say it again. This is not a how-to for learning D3. This is how I have managed to muddle through and achieve what I wanted to do. If some small part of it helps you, all good. Those with a smattering of knowledge of any of the topics I have butchered above (or below) are fully justified in feeling a large degree of righteous indignation. To those I say, please feel free to amend where practical and possible, but please bear in mind this was written from the point of view of someone with no experience in the topic and therefore try to keep any instructions at a level where a new entrant can step in :-).

The post above (and heaps of other stuff) is in the book 'D3 Tips and Tricks v4.x' that can be downloaded for free (or donate to encourage further development :-)).
