In the Tableau charts series, we aim to help Tableau users learn how to create different charts and graphs, equipping them with different techniques for telling each data story. This article is about handling distribution data, and as you may be aware, there are different ways developers and analysts can present such data to their users. Box plots (box-and-whisker plots) and histograms are among the most widely used graphs for showing the distribution of data. Since a previous article explored how to create a box plot, this article is all about the histogram.
Related: How to create a box plot in Tableau.
So, what is a histogram? Wikipedia defines a histogram as an accurate representation of the distribution of numeric data. Using the Superstore data set that comes pre-packaged with Tableau, we'll demonstrate the distribution of Profit. As with the box plot, Tableau lets us create a histogram in just two clicks using the 'Show Me' tab located at the top right corner.
Using the 'Show Me' tab to create a histogram.
Once you've connected the Superstore data set to Tableau:
Drag the measure field Profit to the rows shelf or the columns shelf.
Open the 'Show Me' tab and select histogram.
These steps produce a histogram of Profit.
Looking carefully at this graph, you'll note that the bins are of uniform size, 283 to be precise. Tableau allows us to edit the bin size by selecting the field 'Profit (bin)' in the dimensions area >> Edit... (Note: the field 'Profit (bin)' was generated automatically when we chose histogram in the 'Show Me' tab, but we can also create it ourselves by selecting Profit >> Create >> Bin...).
In the dialog that opens, you can change the bin size or even use a parameter so that end users can select a bin size of their choice.
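To make the idea of a uniform bin size concrete, here is a minimal Python sketch of fixed-width binning outside Tableau. It assumes each value is assigned to the bin whose lower edge is the largest multiple of the bin size not exceeding the value; the profit values are hypothetical and are not taken from Superstore.

```python
import math
from collections import Counter

def uniform_bin(value, bin_size):
    """Return the lower edge of the fixed-width bin that contains value."""
    return math.floor(value / bin_size) * bin_size

# Hypothetical profit values for illustration only (not Superstore data)
profits = [-300.0, -10.5, 0.0, 120.0, 282.99, 283.0, 600.0]
BIN_SIZE = 283  # the bin size reported for the histogram above

# Count how many values land in each bin
counts = Counter(uniform_bin(p, BIN_SIZE) for p in profits)
for edge in sorted(counts):
    print(f"bin starting at {edge}: {counts[edge]} record(s)")
```

Changing BIN_SIZE here plays the same role as editing the bin size, or driving it from a parameter, in Tableau.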
This first example uses a uniform bin size to create a histogram, which sometimes does not meet business demands. Since most of the data in this example lies between -1,000 and 1,000, a better histogram would explore the distribution of data within that range and group the rest of the data together (as outliers). To achieve this, we need to create a histogram with varying bin sizes.
Histogram with varying bin size.
We'll use a calculated field to create bins of size 250 for all values between -1,000 and 1,000, and then merge the rest into 'Profit>1,000' and 'Profit<-1,000'.
Right-click anywhere in the dimensions or measures area >> Create >> Calculated Field…
Enter the calculation below:
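The logic such a calculated field needs to express can be sketched in Python as follows. This is an illustration of the binning rule described above, not Tableau syntax and not the exact calculation used in this article. The function name, the label format for the inner bins, and the handling of values sitting exactly on a boundary (each bin includes its lower edge, and exactly 1,000 falls into the top group) are all assumptions.

```python
import math

def varying_size_bin(profit):
    """Label a profit value: 250-wide bins in [-1000, 1000), outliers merged."""
    if profit < -1000:
        return "Profit<-1,000"   # all large losses grouped together
    if profit >= 1000:
        return "Profit>1,000"    # assumption: exactly 1,000 joins the top group
    lower = math.floor(profit / 250) * 250  # lower edge of the 250-wide bin
    return f"{lower:,} to {lower + 250:,}"

# Hypothetical values for illustration only
for p in (-2000, -1000, -1, 100, 999.99, 1500):
    print(p, "->", varying_size_bin(p))
```

In Tableau the same rule would typically be written as a chain of conditions, one branch per bin label.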
Now, let's create our histogram:
Drag the calculated field 'Varying Size Bin' to the columns shelf.
Drag the measure field 'Number of Records' to the rows shelf.
Right-click the field 'Varying Size Bin' on the columns shelf and sort it manually so that the bins run from the lowest profit range to the highest.
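The manual sort in the last step matters because the bin labels are text, and text labels typically sort alphabetically rather than numerically. The short Python sketch below shows the difference; the label strings are hypothetical and depend on how the calculated field is written.

```python
# Hypothetical bin labels, listed in the intended left-to-right order
edges = range(-1000, 1000, 250)
ordered = (
    ["Profit<-1,000"]
    + [f"{lo:,} to {lo + 250:,}" for lo in edges]
    + ["Profit>1,000"]
)

# Plain alphabetical sorting scrambles the numeric order of the bins
alphabetical = sorted(ordered)

print("alphabetical:", alphabetical)
print("intended:    ", ordered)
```

Alphabetically, '-250 to 0' lands before '-500 to -250' and both outlier groups end up at the far right, which is why the bins are arranged by hand.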
Following the steps above gives us the histogram with varying bin sizes.
From this histogram, a total of 7,808 records fall in the profit range between zero and 250 for the Superstore data, with very few records generating high profits and a good number of them being unprofitable.
With a varying bin size histogram like the one in this last example, a business can map its requirements back onto the data, better understand what is happening, and respond appropriately.
I hope this article was helpful to you. Thanks for reading.