Creating Dynamic and Engaging Bar Charts in Python
Chapter 1: Introduction to Animated Bar Charts
Bar charts are fundamental and widely used in data visualization. While many plotting libraries support basic bar plots, this guide will delve into creating animated versions. I aim to share code snippets for dynamic bar charts that are not only informative but also visually captivating. For those new to this concept, I hope you find it enjoyable.
To begin, I recommend starting with basic plots. The following examples are inspired by various sources. To save the animations created, I installed 'ImageMagick' in my Anaconda environment with the following command:
conda install -c conda-forge imagemagick
Next, let's gather the necessary imports:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
from matplotlib.animation import FuncAnimation
Here’s a complete code snippet for generating a basic animated bar chart. I will explain how it works afterward:
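%matplotlib qt
# Set the style first so it applies to the figure and axes created below.
# Newer Matplotlib releases (3.6+) name this style "seaborn-v0_8".
plt.style.use("seaborn")
fig = plt.figure(figsize=(8,6))
axes = fig.add_subplot(1,1,1)
axes.set_ylim(0, 120)
lst1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
lst2 = [0, 5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100]
def animate(i):
    # Draw both bars at the heights stored for frame i
    y1 = lst1[i]
    y2 = lst2[i]
    plt.bar(["one", "two"], [y1, y2], color=["red", "blue"])
    plt.title("Intro to Animated Bar Plot", fontsize=14)
# Use every index in the lists so the final values are drawn too
anim = FuncAnimation(fig, animate, frames=len(lst1))
anim.save("bar1.gif", writer="imagemagick")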
Now, let’s break down the code. First, two lists are created, one for each bar. The animate function receives the frame index i, reads the i-th element of each list, and draws the bars labeled 'one' and 'two' at those heights. FuncAnimation drives the animation: it takes the figure, the animate function, and the number of frames (here, the length of the data), and calls animate once per frame. The result is saved as 'bar1.gif' using the 'imagemagick' writer we installed earlier.
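The example above draws a new pair of bars on top of the old ones in every frame. A common alternative is to create the bars once and only change their heights. The following is a minimal sketch of that idea, not the approach used in the rest of this guide; the name update and the headless "Agg" backend are my own choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
from matplotlib import pyplot as plt
from matplotlib.animation import FuncAnimation

lst1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
lst2 = [0, 5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100]

fig, ax = plt.subplots(figsize=(8, 6))
ax.set_ylim(0, 120)
ax.set_title("Intro to Animated Bar Plot", fontsize=14)

# Create the two bars once, starting at height zero
bars = ax.bar(["one", "two"], [0, 0], color=["red", "blue"])

def update(i):
    # Resize the existing rectangles instead of drawing new ones
    for bar, height in zip(bars, (lst1[i], lst2[i])):
        bar.set_height(height)
    return bars

anim = FuncAnimation(fig, update, frames=len(lst1), repeat=False)
# To write a file, call anim.save(...) with a writer available on your system.

# Calling the update function by hand shows the state of the last frame
update(len(lst1) - 1)
print([bar.get_height() for bar in bars])
```

Because the rectangles are reused, the figure holds two bars no matter how many frames are drawn, which also means the bars can shrink as well as grow.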
The next example is similar to the first but features additional bars:
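%matplotlib qt
# Set the style first so it applies to the figure and axes created below.
# Newer Matplotlib releases (3.6+) name this style "seaborn-v0_8".
plt.style.use("seaborn")
fig = plt.figure(figsize=(8,6))
axes = fig.add_subplot(1,1,1)
axes.set_ylim(0, 100)
# Each list counts up from 0 and then stays at its cap
l1 = [i if i < 20 else 20 for i in range(100)]
l2 = [i if i < 85 else 85 for i in range(100)]
l3 = [i if i < 30 else 30 for i in range(100)]
l4 = [i if i < 65 else 65 for i in range(100)]
palette = list(reversed(sns.color_palette("seismic", 4).as_hex()))
def animate(i):
    y1 = l1[i]
    y2 = l2[i]
    y3 = l3[i]
    y4 = l4[i]
    # sorted() orders the heights ascending, so a label is not tied to one list
    plt.bar(["one", "two", "three", "four"], sorted([y1, y2, y3, y4]), color=palette)
    plt.title("Animated Bars", color="blue")
# Use every index in the lists so the final values are drawn too
anim = FuncAnimation(fig, animate, frames=len(l1), interval=1)
anim.save("bar2.gif", writer="imagemagick")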
Notice that I set the y-axis limits before animating. Without a fixed limit, Matplotlib rescales the axis as each frame is drawn, so the bars can appear to stay the same height while the tick labels change. Recently, I encountered a viral animated bar plot depicting COVID-19 death tolls, which inspired me to explore this further.
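The effect of the y-axis limit can be checked without running a full animation. The sketch below draws two hypothetical frames (bar heights 10 and 50, values chosen only for illustration) on an autoscaled axis and on an axis with a fixed limit, and records the limits after each frame.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
from matplotlib import pyplot as plt

fig, (ax_auto, ax_fixed) = plt.subplots(1, 2, figsize=(8, 3))
ax_fixed.set_ylim(0, 100)  # fixed once, before any bars are drawn

auto_limits, fixed_limits = [], []
for height in (10, 50):  # two made-up frames
    ax_auto.bar(["one"], [height], color="red")
    ax_fixed.bar(["one"], [height], color="red")
    auto_limits.append(ax_auto.get_ylim())
    fixed_limits.append(ax_fixed.get_ylim())

print("autoscaled:", auto_limits)   # the upper limit grows with the data
print("fixed:     ", fixed_limits)  # stays at the limits set above
```

On the autoscaled axis the upper limit follows the tallest bar, so the bar always fills roughly the same share of the plot. With the limit fixed, the growth of the bar is what the viewer sees.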
For the upcoming plots, I will utilize a superstore dataset. Feel free to download it for educational purposes.
df = pd.read_csv("Superstore.csv", encoding='cp1252')
df.columns
The dataset is extensive, but we will focus on the 'Order Date', 'Sales', and 'State' columns for our analysis of monthly sales per state. To prepare the data, the 'Order Date' must be converted to datetime format:
df['Order Date'] = pd.to_datetime(df['Order Date'])
Next, I will create a pivot table to summarize sales by state:
pv = df.pivot_table("Sales", index="Order Date", columns=["State"], aggfunc="sum")
This will yield some null values, since not every state has sales on every order date. I will sort by 'Order Date' and replace these nulls with zeros:
pv.sort_index(inplace=True, ascending=True)
pv = pv.fillna(0)
To analyze monthly sales, I will extract the month and year from the 'Order Date':
pv['month_year'] = pv.index.strftime('%Y-%m')
pv_month = pv.set_index("month_year")
After grouping by month-year, we obtain the monthly sales data needed to create an animated bar chart.
pv_monthgr = pv_month.groupby('month_year').sum()
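To make the reshaping steps easier to follow, here is the same pipeline on a tiny made-up table. The rows and the names toy, daily, and monthly are hypothetical and only mirror the column names used above; grouping directly on the formatted index is a shortcut I use here in place of the separate month_year column.

```python
import pandas as pd

# Four made-up orders with the same column names as the superstore data
toy = pd.DataFrame({
    "Order Date": ["2020-01-03", "2020-01-03", "2020-01-20", "2020-02-05"],
    "State": ["Texas", "Ohio", "Texas", "Ohio"],
    "Sales": [100.0, 50.0, 25.0, 80.0],
})
toy["Order Date"] = pd.to_datetime(toy["Order Date"])

# One row per order date, one column per state; missing combinations become 0
daily = (
    toy.pivot_table("Sales", index="Order Date", columns="State", aggfunc="sum")
    .sort_index()
    .fillna(0)
)

# Collapse the daily rows into one row per year-month
monthly = daily.groupby(daily.index.strftime("%Y-%m")).sum()
print(monthly)
```

The result has one row per month and one column per state, which is the wide layout the later plotting calls receive as pv_monthgr.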
Creating the animated bar chart race uses the bar_chart_race function from the bar_chart_race package, which can be installed with pip install bar_chart_race. Here’s a basic implementation:
import bar_chart_race as bcr
bcr.bar_chart_race(df=pv_monthgr, filename="by_month.gif", filter_column_colors=True, cmap="prism", title="Sales By Months")
The chart is sorted in descending order by default, but the rapid animation makes it challenging to read state names.
In the following example, I slow the animation down with period_length (the number of milliseconds each period lasts) and use n_bars to display only the top 10 states by sales:
bcr.bar_chart_race(df=pv_monthgr, filename="bar_race3.gif", filter_column_colors=True, cmap='prism', sort='desc', n_bars=10, fixed_max=True, period_length=2500, title='Sales By Months', figsize=(10, 6))
Next, I will include total sales for each month in the plot:
def summary(values, ranks):
    # Round the period's total sales to the nearest hundred for display
    total_sales = int(round(values.sum(), -2))
    s = f'Total Sales - {total_sales:,.0f}'
    return {'x': .99, 'y': .1, 's': s, 'ha': 'right', 'size': 8}
bcr.bar_chart_race(df=pv_monthgr,
filename="bar_race4.gif", filter_column_colors=True,
cmap='prism', sort='desc', n_bars=15,
fixed_max=True, steps_per_period=3, period_length=1500,
bar_size=0.8,
period_label={'x': .99, 'y':.15, 'ha': 'right', 'color': 'coral'},
bar_label_size=6, tick_label_size=10,
bar_kwargs={'alpha':0.4, 'ec': 'black', 'lw': 2.5},
title='Sales By Months',
period_summary_func=summary,
figsize=(7, 5))
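The summary callback can be tried on its own before handing it to bar_chart_race. In this sketch I assume values arrives as a pandas Series of the current period's sales; the three numbers are made up, and ranks is passed as None because the function does not use it.

```python
import pandas as pd

def summary(values, ranks):
    # Round the period's total sales to the nearest hundred for display
    total_sales = int(round(values.sum(), -2))
    s = f'Total Sales - {total_sales:,.0f}'
    return {'x': .99, 'y': .1, 's': s, 'ha': 'right', 'size': 8}

# Made-up sales for three states in one period
sample_values = pd.Series([1234.5, 2345.6, 99.9], index=["Texas", "Ohio", "Utah"])
label = summary(sample_values, None)
print(label["s"])
```

The returned dictionary holds the text (s), its position (x, y), and styling keys such as ha and size, so changing the numbers in it moves or restyles the label.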
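Lastly, I will introduce a function that adds the 90th percentile (the 0.9 quantile) of the current values to the animation as a perpendicular reference line:
def func(values, ranks):
    # 0.9 quantile of the values shown in the current frame
    return values.quantile(0.9)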
bcr.bar_chart_race(df=pv_monthgr, filename="bar_race5.gif", filter_column_colors=True,
cmap='prism', sort='desc', n_bars=15,
steps_per_period=3, period_length=2000,
bar_size=0.8,
period_label={'x': .99, 'y':.15, 'ha': 'right', 'color': 'coral'},
bar_label_size=6, tick_label_size=10,
bar_kwargs={'alpha':0.4, 'ec': 'black', 'lw': 2.5},
title='Sales By Months',
period_summary_func=summary,
perpendicular_bar_func=func,
figsize=(7, 5))
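To get a feel for where that reference line lands, the quantile function can be called on a small Series. As before, this sketch assumes values is a pandas Series; the five numbers are invented for illustration.

```python
import pandas as pd

def func(values, ranks):
    # 0.9 quantile of the values shown in the current frame
    return values.quantile(0.9)

# Five made-up sales figures
sample_values = pd.Series([10, 20, 30, 40, 50])
line_position = func(sample_values, None)
print(line_position)
```

With pandas' default linear interpolation the result falls between the two largest values, so in the chart only the top few bars extend past the line.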
For those who prefer vertical bars, pass orientation='v'. The bar_chart_race documentation describes the remaining parameters you can use to refine your plots.
Conclusion
While some may argue that traditional bar plots convey information more clearly, animated visuals can reveal how data changes over time. Watching state rankings shift month by month, alongside total sales or a quantile line, offers a perspective that a single static plot cannot. Engaging visuals also tend to attract more attention and foster interest in the data being presented.