**Box and whisker plots**, or **boxplots**, are a hugely useful data visualisation tool for clearly comparing the performance of algorithm configurations (or any experiment data with multiple dimensions). However, using Python’s Matplotlib library to implement them suitably for comparisons by group used to be tough. To make them attractive and clear, you had to stitch together documentation and examples covering grids, line colours, axis labels and some very hacky legend use, each taken from across the Matplotlib site and beyond. So I wrote a couple of scripts to simplify **grouped boxplots that can be directly reused**.

Here’s the **Grouped Boxplot (on left)** and **Ungrouped Boxplot (on right)**:

**The code to create the plots is below.**

**Notes:**

- The red line shows data **centrality** using the independent median statistic.
- The upper and lower borders of the box show **data spread** using the relatively independent interquartile range (IQR) statistics: the box runs from the lower quartile (25th percentile) to the upper quartile (75th percentile).
- The whiskers (typically) extend to the furthest data point that lies within 1.5 times the IQR of the box edge, in both the upper and lower directions. The multiplier is Matplotlib's `whis` parameter, which defaults to 1.5.
- The red + (plus symbols) show the **outliers**: the data points that fall beyond the whiskers.
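To make those notes concrete, here is a minimal sketch that computes the same summary values by hand. It assumes Python 3.8+ (for `statistics.quantiles`) and Matplotlib's default convention, where the whiskers stop at the furthest data point within `whis` × IQR of the box edges. The function name `box_stats` and the sample values are made up for illustration.

```python
from statistics import quantiles


def box_stats(values, whis=1.5):
    """Return the values a boxplot draws: quartiles, whisker ends, outliers."""
    data = sorted(values)
    # 'inclusive' interpolates linearly between data points
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    # Whiskers end at actual data points, not at the fences themselves
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical sample with one wild value
sample = [2, 4, 4, 5, 5, 6, 6, 7, 15]
stats = box_stats(sample)
print(stats)
```

Here the value 15 lies beyond the upper fence, so it would be drawn as a red + rather than stretching the whisker.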

## Independent vs Dependent Summary Statistics

**Box plots are independent**, in the sense that the plotted **median** and **interquartile range** do not factor *wild* outliers into the summary (such statistics are often called *robust*). The outliers are not ignored, though: they are drawn as individual points. This is as opposed to **dependent** plots that show the **mean** and **standard deviation**, where outliers are incorporated into the plotted summary. **Which you choose depends on what you want to show**.
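A quick way to see the difference is to add one wild value to a small data set and compare how far each statistic moves. This is only an illustrative sketch, assuming Python 3.8+; the outlier value of 50 is invented, and `summarise` is a hypothetical helper name.

```python
from statistics import mean, median, quantiles, stdev


def summarise(values):
    """Return (median, IQR, mean, standard deviation) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1, mean(values), stdev(values)


clean = [5, 5, 4, 3, 3, 5]      # data1 from the scripts in this post
with_outlier = clean + [50]     # the same data plus one invented wild value

med_a, iqr_a, mean_a, sd_a = summarise(clean)
med_b, iqr_b, mean_b, sd_b = summarise(with_outlier)

print("median shift:", med_b - med_a)
print("IQR shift:   ", iqr_b - iqr_a)
print("mean shift:  ", mean_b - mean_a)
print("stdev shift: ", sd_b - sd_a)
```

The median and IQR barely change, while the mean and standard deviation are pulled strongly towards the single outlier.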

# Code

You will need Matplotlib on Python 2.7.x or Python 3 (the scripts below are written for both). The Anaconda distribution has everything you need to make boxplots using this code.

You can download the code files from GitHub.

## Grouped Boxplots: *(Python 2.7/3 Code)*

import numpy as np

import matplotlib.pyplot as plt

# --- Your data, e.g. results per algorithm:

data1 = [5,5,4,3,3,5]

data2 = [6,6,4,6,8,5]

data3 = [7,8,4,5,8,2]

data4 = [6,9,3,6,8,4]

# --- Combining your data:

data_group1 = [data1, data2]

data_group2 = [data3, data4]

# --- Labels for your data:

labels_list = ['a','b']

xlocations = range(len(data_group1))

width = 0.3

symbol = 'r+'

ymin = 0

ymax = 10

ax = plt.gca()

ax.set_ylim(ymin,ymax)

ax.set_xticks(xlocations) # fix the tick locations before labelling them

ax.set_xticklabels( labels_list, rotation=0 )

ax.grid(True, linestyle='dotted')

ax.set_axisbelow(True)

plt.xlabel('X axis label')

plt.ylabel('Y axis label')

plt.title('title')

# --- Offset the positions per group:

positions_group1 = [x-(width+0.01) for x in xlocations]

positions_group2 = xlocations

plt.boxplot(data_group1,
            sym=symbol,
            labels=['',''],
            positions=positions_group1,
            widths=width,
            # notch=False,
            # vert=True,
            # whis=1.5,
            # bootstrap=None,
            # usermedians=None,
            # conf_intervals=None,
            # patch_artist=False,
            )

plt.boxplot(data_group2,
            labels=['a','b'],
            sym=symbol,
            positions=positions_group2,
            widths=width,
            # notch=False,
            # vert=True,
            # whis=1.5,
            # bootstrap=None,
            # usermedians=None,
            # conf_intervals=None,
            # patch_artist=False,
            )

plt.savefig('boxplot_grouped.png')

plt.savefig('boxplot_grouped.pdf') # when publishing, use high quality PDFs

#plt.show() # uncomment to show the plot.
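The script above offsets two groups by hand. If you have more groups, the offsets could be generated instead. The helper below is a hypothetical sketch, not part of the original scripts: `group_positions` and its `gap` parameter are names invented here, and unlike the script (which puts the second group directly on the tick) it centres each cluster of boxes on its tick.

```python
def group_positions(n_categories, n_groups, width=0.3, gap=0.01):
    """Return one list of x positions per group, clustered around ticks 0..n-1."""
    step = width + gap
    # Shift the first group left so the whole cluster is centred on the tick
    start = -step * (n_groups - 1) / 2.0
    return [
        [category + start + group * step for category in range(n_categories)]
        for group in range(n_groups)
    ]


# Two categories ('a', 'b') and two groups, as in the script above
positions = group_positions(n_categories=2, n_groups=2)
for group_index, xs in enumerate(positions):
    print(group_index, xs)
```

Each inner list can then be passed as `positions=` to one `plt.boxplot` call per group, in the same way as `positions_group1` and `positions_group2`.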

## Ungrouped Boxplots: *(Python 2.7/3 Code)*

import numpy as np

import matplotlib.pyplot as plt

# --- Your data, e.g. results per algorithm:

data1 = [5,5,4,3,3,5]

data2 = [6,6,4,6,8,5]

data3 = [7,8,4,5,8,2]

data4 = [6,9,3,6,8,4]

# --- Combining your data:

data = [data1, data3, data2, data4]

# --- Labels for your data:

labels_list = ['a','b','c','d']

width = 0.3

symbol = 'r+'

ymin = 0

ymax = 10

ax = plt.gca()

ax.set_ylim(ymin,ymax)

ax.grid(True)

ax.set_axisbelow(True)

plt.xlabel('X axis label')

plt.ylabel('Y axis label')

# --- Pass the labels and outlier symbol to boxplot itself, so they apply to the boxes it draws:

plt.boxplot(data, labels=labels_list, sym=symbol, widths=width)

# --- Save to file:

plt.savefig('boxplot.png')

plt.savefig('boxplot.pdf') # when publishing, use high quality PDFs

#plt.show() # uncomment to show the plot.

hth.

If you’re interested in reading more via occasional content/project updates, etc., feel free to keep in touch via email or contact me on social @pmdscully (below).