Introduction to Plotly¶
Introduction¶
This notebook and accompanying webinar were developed and released by the enDAQ team. This is the third “chapter” of our series on Python for Mechanical Engineers:
Introduction to Plotly for Plotting Data
To sign up for future webinars and watch previous ones, visit our webinars page.
Recap of Python Introduction¶
Python is popular for good reason
There are many ways to interact with Python
Here we are in Google Colab based on Jupyter Notebooks
There are many open source libraries to use
Today we are covering Plotly for plotting
Why Do We Plot?¶
To Understand Relationships, Make an Observation
Interactivity Matters!
To Share & Present
Beautiful Matters!
Overview of Python Plotting Libraries¶
As with everything in Python, there are several options for plotting data! We’ll focus on Plotly but quickly cover:
Matplotlib
Seaborn
ggplot
bokeh
Plotly
Data Source¶
For some shock data, we’ll use the motorcycle crash test data discussed in our blog post on pseudo velocity. The signal was first cleaned with a 150 Hz low-pass filter using the endaq library (which uses SciPy under the hood).
[ ]:
import pandas as pd
df = pd.read_csv('https://info.endaq.com/hubfs/data/motorcycle-crash.csv',index_col=0)
[ ]:
df
timestamp | X (500g) | Y (500g) | Z (500g) |
---|---|---|---|
0.00000 | -0.055582 | 0.033337 | 0.032217 |
0.00010 | -0.056731 | 0.041513 | 0.032720 |
0.00020 | -0.056972 | 0.049888 | 0.033106 |
0.00030 | -0.056273 | 0.058399 | 0.033419 |
0.00040 | -0.054605 | 0.066982 | 0.033706 |
... | ... | ... | ... |
0.19952 | 0.383610 | 0.117662 | -0.292337 |
0.19962 | 0.411270 | 0.135782 | -0.275515 |
0.19972 | 0.438324 | 0.153832 | -0.258591 |
0.19982 | 0.464640 | 0.171657 | -0.241783 |
0.19992 | 0.490098 | 0.189112 | -0.225302 |
2000 rows × 3 columns
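The low-pass filtering mentioned above was done with the endaq library, and the exact call isn't shown here. As a rough sketch of what such a step can look like with SciPy directly, the block below filters a synthetic signal. The 4th-order Butterworth design, the zero-phase `sosfiltfilt` pass, the 150 Hz cutoff unit, and the 10 kHz sample rate (inferred from the 0.0001 s timestamp spacing) are all assumptions for illustration, not the settings actually used on the crash data.

```python
import numpy as np
import pandas as pd
from scipy import signal

# Synthetic stand-in for the crash data (assumed 10 kHz sampling, 0.2 s long)
fs = 10_000  # Hz
t = np.arange(0, 0.2, 1 / fs)
raw = pd.DataFrame(
    {"X": np.sin(2 * np.pi * 20 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)},
    index=pd.Index(t, name="timestamp"),
)

# 4th-order Butterworth low-pass with a 150 Hz cutoff (order is an assumption)
sos = signal.butter(4, 150, btype="low", fs=fs, output="sos")

# sosfiltfilt runs the filter forward and backward, so there is no phase shift
filtered = pd.DataFrame(
    signal.sosfiltfilt(sos, raw.to_numpy(), axis=0),
    index=raw.index,
    columns=raw.columns,
)

print(raw["X"].std(), filtered["X"].std())
```

The 2000 Hz component is removed while the 20 Hz component passes through essentially unchanged, which is the kind of cleanup you want before plotting a shock event.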
Matplotlib¶
The most popular plotting library and the default go-to. But… there is a new game in town!
[ ]:
import matplotlib.pyplot as plt
plt.plot(df)
plt.title('Motorcycle Crash Data')
plt.ylabel('Acceleration (g)')
plt.xlabel('Time (s)')
plt.legend(df.columns)
plt.grid(color='grey')
plt.show()
Seaborn¶
A wrapper around Matplotlib to simplify the interface, beautify the plots, and support more stats-based analysis. Here is their documentation specifically on aesthetics. They care too!
[ ]:
import seaborn as sns
sns.set_theme()
p = sns.lineplot(data=df)
p.set_title('Motorcycle Crash Data')
p.set_ylabel('Acceleration (g)')
p.set_xlabel('Time (s)')
Text(0.5, 0, 'Time (s)')
ggplot¶
Introduces a “grammar of graphics” logic to plotting data, which allows explicit mapping of data to the visual representation. This is something Plotly Express excels at.
[ ]:
from plotnine import ggplot, aes, geom_line
df_ggplot = df.copy()
df_ggplot['time'] = df.index
df_ggplot = pd.melt(df_ggplot, id_vars='time')
df_ggplot
(
ggplot(df_ggplot) # What data to use
+ aes(x="time", y='value',color='variable') # What variable to use
+ geom_line() # Geometric object to use for drawing
)
/usr/local/lib/python3.7/dist-packages/plotnine/utils.py:1246: FutureWarning: is_categorical is deprecated and will be removed in a future version. Use is_categorical_dtype instead
if pdtypes.is_categorical(arr):
<ggplot: (8785480354521)>
Bokeh¶
Finally, we have interactivity!
[ ]:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
output_notebook()
p = figure(title='Motorcycle Crash Data',
           x_axis_label='Time (s)',
           y_axis_label='Acceleration (g)')
colors = ['red','green','blue']
for c,color in zip(df.columns,colors):
    p.line(df.index, df[c], legend_label = c, line_color = color, line_width = 2)
show(p)
Plotly¶
Interactive, beautiful, easy!
[ ]:
!pip install -U -q plotly
[ ]:
import plotly.express as px
import plotly.io as pio; pio.renderers.default = "iframe"
fig = px.line(df)
fig.update_layout(
title_text = 'Motorcycle Crash Data',
xaxis_title_text = "Time (s)",
yaxis_title_text = "Acceleration (g)")
fig.show()
How Does Plotly Work?¶
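There are three main components to how/why Plotly works: the Python API (Plotly Express, built on top of Graph Objects), the figure object (data, layout, and frames), and the JavaScript library (Plotly.js) that renders it.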
To illustrate this relationship, let’s make a Plotly Treemap!
[ ]:
df_tree = pd.DataFrame({
'names': ["Plotly","Python", "Figure Object", "JavaScript", "Express", "Graph Objects","Data","Layout",'Frames',"Plotly.js"],
'parents' : ["", "Plotly", "Plotly", "Plotly", "Python",'Express',"Figure Object","Figure Object","Figure Object","JavaScript"]
})
[ ]:
fig = px.treemap(
names = df_tree.names,
parents = df_tree.parents
)
fig.update_traces(root_color="lightgrey")
fig.update_layout(
font_family='Open Sans',
font_size=32)
fig.show()
Plotly Express was introduced in March 2019 and is a GAME CHANGER! Here is the introduction article:
Shoutout to Plotly’s Docs¶
Before we proceed, I want to really stress how good the docs are from Plotly. They have a TON of examples, and very good documentation. We’ll be focusing on Plotly Express.
They also have their own community forum that is pretty rich with examples and people helping each other out.
Styling Figures¶
Example Data¶
Here, to mix it up, we’ll calculate a shock response spectrum from the acceleration data and plot it on a log-log scale to show how to style that kind of plot.
[ ]:
!pip install -q endaq
import endaq
import numpy as np
[ ]:
def get_log_freqs(df,init_freq=1,bins_per_octave=12):
    """Given a timebased dataframe, return a log spaced frequency array up to the Nyquist frequency"""
    fs = (df.shape[0]-1)/(df.index[-1]-df.index[0])
    return 2 ** np.arange(np.log2(init_freq),
                          np.log2(fs/2),
                          1/bins_per_octave)
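To see what that spacing produces, here is a small worked sketch. `log_freqs` is a hypothetical variant of the function above that takes the sample rate directly instead of a dataframe, and the 10 kHz rate is an assumption based on the 0.0001 s timestamp spacing of the crash data.

```python
import numpy as np

def log_freqs(fs, init_freq=1.0, bins_per_octave=12):
    """Log-spaced frequencies from init_freq up to (but not including) Nyquist."""
    return 2 ** np.arange(np.log2(init_freq),
                          np.log2(fs / 2),
                          1 / bins_per_octave)

freqs = log_freqs(fs=10_000)

# Each step multiplies the frequency by 2 ** (1 / 12), about 1.0595
ratio = freqs[1] / freqs[0]

# Twelve steps later the frequency has doubled, i.e. one octave
octave = freqs[12] / freqs[0]

print(len(freqs), ratio, octave, freqs[-1])
```

With 12 bins per octave and a 5 kHz Nyquist frequency this gives 148 points, the last one at about 4871 Hz.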
[ ]:
df_pvss = endaq.calc.shock.pseudo_velocity(df,
get_log_freqs(df,init_freq=1,bins_per_octave=12),
damp=0.05, two_sided=False)
df_pvss = df_pvss*9.81*39.37 #convert to in/s
[ ]:
df_pvss
frequency (Hz) | X (500g) | Y (500g) | Z (500g) |
---|---|---|---|
1.000000 | 47.804683 | 473.402407 | 148.658683 |
1.059463 | 49.240508 | 493.307994 | 154.812945 |
1.122462 | 50.532563 | 513.039780 | 160.893921 |
1.189207 | 51.637517 | 532.374454 | 166.829370 |
1.259921 | 52.551121 | 551.046158 | 172.533775 |
... | ... | ... | ... |
3866.109185 | 0.437852 | 1.025865 | 0.681822 |
4096.000000 | 0.413256 | 0.968276 | 0.643536 |
4339.560834 | 0.390047 | 0.913923 | 0.607404 |
4597.604550 | 0.368145 | 0.862623 | 0.573305 |
4870.992343 | 0.347478 | 0.814205 | 0.541124 |
148 rows × 3 columns
Manually Defining the Theme¶
First let’s start by plotting with the standard theme, modifying only the titles. We’ll also define the scale on the x and y axis to be a log scale.
[ ]:
fig = px.line(df_pvss)
fig.update_layout(
title_text = 'Motorcycle Crash Data',
xaxis_title_text = "Natural Frequency (Hz)",
yaxis_title_text = "Pseudo Velocity (in/s)",
xaxis_type = "log",
yaxis_type = "log")
fig.show()
Now let’s get crazy and customize all the “common” settings. But note that there are a LOT of different parameters that can be explicitly defined. Remember, Plotly has very thorough documentation, so check it out!
Figure Layout
X Axis
Y Axis
You will need to download Open Sans if you like the font as much as I do! To pick colors, I suggest Color Hex.
[ ]:
fig = px.line(df_pvss)
fig.update_layout(
font_family='Open Sans',
font_size=16,
font_color='#404041',
title_text = 'Motorcycle Crash Data',
title_font_family = 'Showcard Gothic',
title_font_size = 32,
title_font_color = '#e77025',
title_x = 0.5,
xaxis_title_text = "Natural Frequency (Hz)",
xaxis_title_font_family = 'Algerian',
xaxis_title_font_size = 24,
xaxis_title_font_color = '#7f3f98',
xaxis_type = 'log',
yaxis_title_text ="Pseudo Velocity (in/s)",
yaxis_title_font_family = 'Playbill',
yaxis_title_font_size = 24,
yaxis_title_font_color = '#be1e2d',
yaxis_type = 'log',
legend_bgcolor = 'yellow',
legend_title_text = 'Legend',
legend_title_font_size = 24,
legend_orientation = 'v',
legend_y = 1.0,
legend_yanchor = 'top',
legend_x = 1.0,
legend_xanchor = 'right',
plot_bgcolor = '#f3f3f3',
width = 800,
height = 600)
fig.show()
One thing that Plotly does which is REALLY cool is that they use “magic underscore notation” which means that this:
yaxis_title_font_family = 'Open Sans ExtraBold'
is the equivalent of this:
{'yaxis':
{'title':
{'font':
{'family': 'Open Sans ExtraBold'}
}
}
}
Now, I am particular about my plots and how they look because I think aesthetics matter! So I can create a few custom themes that can be applied to figures when making them.
[ ]:
template_light = dict(
template="presentation",
font_family='Open Sans',
font_size=16,
font_color='#404041',
title_font_family = 'Open Sans ExtraBold',
title_font_size = 24,
title_x = 0.5,
xaxis_title_font_family = 'Open Sans ExtraBold',
xaxis_title_font_size = 20,
yaxis_title_font_family = 'Open Sans ExtraBold',
yaxis_title_font_size = 20,
legend_title='',
legend_orientation='h',
legend_y = -0.2,
plot_bgcolor = '#f3f3f3',
yaxis_gridcolor = '#dad9d8',
yaxis_linecolor = '#404041',
yaxis_mirror = True,
xaxis_gridcolor = '#dad9d8',
xaxis_linecolor = '#404041',
xaxis_mirror = True,
)
template_dark = template_light.copy()
template_dark['template'] = "plotly_dark"
template_dark['font_color'] = '#f3f3f3'
template_dark['plot_bgcolor'] = "#111111"
template_dark['yaxis_linecolor'] = "#404041"
template_dark['xaxis_linecolor'] = "#404041"
template_dark['yaxis_gridcolor'] = "#404041"
template_dark['xaxis_gridcolor'] = "#404041"
I also tend to make a lot of similar plots, so it can be helpful to define some axes labels and types as variables.
[ ]:
template_pvss = dict(
xaxis_title_text = "Natural Frequency (Hz)",
xaxis_type = 'log',
yaxis_title_text ="Pseudo Velocity (in/s)",
yaxis_type = 'log'
)
template_psd = dict(
xaxis_title_text = "Frequency (Hz)",
xaxis_type = 'log',
yaxis_title_text ="Acceleration (g^2/Hz)",
yaxis_type = 'log'
)
template_accel = dict(
xaxis_title_text = "Time (s)",
yaxis_title_text ="Acceleration (g)",
)
[ ]:
fig = px.line(df_pvss)
fig.update_layout(
{**template_light, **template_pvss},
title_text = 'Custom Light Theme')
fig.show()
[ ]:
fig = px.line(df_pvss)
fig.update_layout(
{**template_dark, **template_pvss},
title_text = 'Custom Dark Theme')
fig.show()
Themes¶
As you may have noticed, I used some templates in my custom theme. There are a few to pick from and you can make your own as well. This is well documented with examples on Plotly’s website.
[ ]:
themes = ["plotly", "plotly_white", "plotly_dark", "ggplot2", "seaborn", "simple_white", 'presentation', "none"]
for theme in themes:
    fig = px.line(df_pvss)
    fig.update_layout(
        template_pvss,
        template = theme,
        title_text = theme,
    )
    fig.show()
Colors¶
Plotly also supports a whole range of different color schemes you can use. There are a lot of built-in ones and, remember, great documentation!
[ ]:
px.colors.qualitative.swatches().show()
[ ]:
px.colors.qualitative.Alphabet
['#AA0DFE',
'#3283FE',
'#85660D',
'#782AB6',
'#565656',
'#1C8356',
'#16FF32',
'#F7E1A0',
'#E2E2E2',
'#1CBE4F',
'#C4451C',
'#DEA0FD',
'#FE00FA',
'#325A9B',
'#FEAF16',
'#F8A19F',
'#90AD1C',
'#F6222E',
'#1CFFCE',
'#2ED9FF',
'#B10DA1',
'#C075A6',
'#FC1CBF',
'#B00068',
'#FBE426',
'#FA0087']
[ ]:
px.colors.sequential.swatches_continuous().show()
[ ]:
px.colors.sequential.Blackbody
['rgb(0,0,0)',
'rgb(230,0,0)',
'rgb(230,210,0)',
'rgb(255,255,255)',
'rgb(160,200,255)']
Here I’m going to make my own swatch of some kind with a custom color scale.
[ ]:
colors = ['#EE7F27', '#6914F0', '#2DB473', '#D72D2D', '#3764FF', '#FAC85F','#27eec0','#b42d4d','#82d72d','#e35ffa']
fig = px.bar(x=np.arange(10),
y=np.zeros(10)+1,
color=colors,
color_discrete_sequence=colors,
height=200)
fig.update_layout(template_dark,
font_color="#111111",
legend_font_color="white")
fig.show()
When creating a figure you can specify the color sequence you’d like as a parameter. You can use a custom list like what I’m showing here or you can use one of the px.colors.qualitative lists.
[ ]:
fig = px.line(df_pvss,
color_discrete_sequence=colors)
fig.update_layout(
{**template_dark, **template_pvss},
title_text = 'Custom Colors')
fig.show()
Saving Figures¶
To save static images, first you need to install kaleido.
[ ]:
!pip install -U kaleido
Collecting kaleido
Downloading kaleido-0.2.1-py2.py3-none-manylinux1_x86_64.whl (79.9 MB)
Installing collected packages: kaleido
Successfully installed kaleido-0.2.1
In Colab only, you’ll also need to run the following install.
[ ]:
!wget https://github.com/plotly/orca/releases/download/v1.2.1/orca-1.2.1-x86_64.AppImage -O /usr/local/bin/orca
!chmod +x /usr/local/bin/orca
!apt-get install xvfb libgtk2.0-0 libgconf-2-4
Now saving an image is easy, and a few file types are supported. More information and options are available in their docs.
[ ]:
fig.write_image("fig.png")
fig.write_image("fig.jpeg")
fig.write_image("fig.svg")
fig.write_image("fig.pdf")
You can also keep the interactivity by saving an HTML file, either with or without the Plotly JavaScript library embedded. You’ll need the embedded library if you want to view the plots when you don’t have internet access. For more, see their docs on saving HTML files.
[ ]:
fig.write_html('fig-full.html')
fig.write_html('fig-small.html',full_html=False,include_plotlyjs='cdn')
Plot Types¶
Remember Plotly has GREAT documentation, and LOTS of plot types. Below is a GIF of their main page on all the different plot types:
Now here is the gallery of plots available in plotly.express, meaning they are super easy to create! Again, check out the docs:
Now we’ll go through a few relevant examples for mechanical engineers and vibration analysis.
Bubble¶
[ ]:
df_table = pd.read_csv('https://info.endaq.com/hubfs/data/endaq-cloud-table.csv')
df_table = df_table[['serial_number_id', 'file_name', 'file_size', 'recording_length', 'recording_ts',
'accelerationPeakFull', 'psuedoVelocityPeakFull', 'accelerationRMSFull',
'velocityRMSFull', 'displacementRMSFull', 'pressureMeanFull', 'temperatureMeanFull']].copy()
df_table['recording_ts'] = pd.to_datetime(df_table['recording_ts'], unit='s')
df_table = df_table.sort_values(by=['recording_ts'], ascending=False)
df_table
| serial_number_id | file_name | file_size | recording_length | recording_ts | accelerationPeakFull | psuedoVelocityPeakFull | accelerationRMSFull | velocityRMSFull | displacementRMSFull | pressureMeanFull | temperatureMeanFull |
---|---|---|---|---|---|---|---|---|---|---|---|---|
11 | 11456 | 50_Joules_900_lbs-1629315312.ide | 1597750 | 20.201752 | 2021-07-26 19:56:39 | 231.212 | 2907.650 | 2.423 | 54.507 | 1.066 | 98.745 | 24.175 |
10 | 11456 | 100_Joules_900_lbs-1629315313.ide | 1596714 | 20.200623 | 2021-07-26 19:21:55 | 218.634 | 2961.256 | 2.877 | 53.875 | 1.053 | 98.751 | 24.180 |
22 | 9695 | Tilt_000000-1625156721.IDE | 719403 | 23.355163 | 2021-07-01 16:21:01 | 0.378 | 330.946 | 0.044 | 11.042 | 0.345 | 99.510 | 26.410 |
7 | 11162 | Calibration-Shake-1632515140.IDE | 2218130 | 27.882690 | 2021-05-17 19:16:10 | 8.783 | 1142.282 | 2.712 | 46.346 | 0.617 | 102.251 | 24.545 |
17 | 11071 | surgical-instrument-1625829182.ide | 541994 | 6.951172 | 2021-04-22 16:53:10 | 5.739 | 387.312 | 1.568 | 24.418 | 0.242 | 99.879 | 21.889 |
8 | 10916 | FUSE_HSTAB_000005-1632515139.ide | 537562 | 18.491791 | 2021-04-22 16:13:24 | 0.202 | 53.375 | 0.011 | 1.504 | 0.036 | 90.706 | 18.874 |
2 | 10118 | Bolted-1632515144.ide | 6149229 | 29.396118 | 2021-04-21 21:44:07 | 15.343 | 148.276 | 2.398 | 14.101 | 0.154 | 99.652 | 23.172 |
20 | 9680 | LOC__6__DAQ41551_25_01-1625170793.IDE | 8664238 | 63.878937 | 2021-03-25 04:53:27 | 564.966 | 2357.599 | 54.408 | 145.223 | 3.088 | 102.875 | 26.031 |
19 | 9680 | LOC__4__DAQ41551_15_05-1625170794.IDE | 6927958 | 64.486054 | 2021-03-25 04:22:10 | 585.863 | 2153.020 | 46.528 | 148.591 | 2.615 | 105.750 | 32.202 |
18 | 9680 | LOC__3__DAQ41551_11_01_02-1625170795.IDE | 2343292 | 28.456818 | 2021-03-25 04:06:19 | 622.040 | 8907.949 | 94.197 | 372.049 | 9.580 | 105.682 | 33.452 |
21 | 9680 | LOC__2__DAQ38060_06_03_05-1625170793.IDE | 1519172 | 27.057647 | 2021-03-25 02:54:22 | 995.670 | 5845.241 | 131.087 | 323.287 | 3.144 | 104.473 | 25.616 |
12 | 11046 | Drive-Home_07-1626805222.ide | 36225758 | 634.732056 | 2021-03-19 19:35:57 | 23.805 | 356.128 | 0.097 | 6.117 | 0.135 | 101.988 | 28.832 |
5 | 11046 | Drive-Home_01-1632515142.ide | 3632799 | 61.755371 | 2021-03-19 18:35:55 | 0.479 | 40.197 | 0.021 | 1.081 | 0.023 | 100.284 | 29.061 |
14 | 10030 | 200922_Moto_Max_Run5_Control_Larry-1626297441.ide | 4780893 | 99.325134 | 2020-09-22 23:47:35 | 29.864 | 1280.349 | 3.528 | 55.569 | 1.060 | NaN | NaN |
0 | 9695 | train-passing-1632515146.ide | 10492602 | 73.612335 | 2020-04-29 18:20:36 | 7.513 | 419.944 | 0.372 | 6.969 | 0.061 | 104.620 | 23.432 |
15 | 9695 | ford_f150-1626296561.ide | 96097059 | 1207.678344 | 2020-03-13 23:35:08 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | 9295 | Seat-Base_21-1632515142.ide | 5248836 | 83.092255 | 2019-12-08 10:16:50 | 1.085 | 251.009 | 0.130 | 7.318 | 0.190 | 98.930 | 17.820 |
1 | 9316 | Seat-Top_09-1632515145.ide | 10491986 | 172.704559 | 2019-12-08 10:14:31 | 1.105 | 86.595 | 0.082 | 1.535 | 0.040 | 98.733 | 20.133 |
16 | 7530 | Motorcycle-Car-Crash-1626277852.ide | 10489262 | 151.069336 | 2019-07-03 17:02:52 | 480.737 | 12831.590 | 1.732 | 143.437 | 3.988 | 100.363 | 26.989 |
6 | 0 | HiTest-Shock-1632515141.ide | 2655894 | 20.331848 | 2018-12-04 15:22:54 | 619.178 | 6058.093 | 11.645 | 167.835 | 4.055 | 101.126 | 9.538 |
13 | 5120 | Mining-SSX28803_06-1626457584.IDE | 402920686 | 3238.119202 | 2018-09-14 19:28:24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
9 | 9874 | Coffee_002-1631722736.IDE | 60959516 | 769.299896 | 2000-03-03 20:02:24 | 2.698 | 1338.396 | 0.059 | 5.606 | 0.104 | 100.339 | 24.540 |
3 | 10309 | RMI-2000-1632515143.ide | 5909632 | 60.250855 | 1970-01-01 00:00:24 | 0.332 | 17.287 | 0.079 | 1.247 | 0.005 | 100.467 | 21.806 |
[ ]:
fig = px.scatter(df_table,
x="recording_ts",
y="accelerationRMSFull",
size="recording_length",
color="serial_number_id",
hover_name="file_name",
log_y=True,
size_max=60)
fig.update_layout(
template_dark,
title_text ='Scatter Plot with Numeric "Color"',
xaxis_title_text ="Date of Recording",
yaxis_title_text ="Acceleration RMS (g)"
)
fig.show()
[ ]:
df_table['device'] = df_table["serial_number_id"].astype(str)
fig = px.scatter(df_table,
x="accelerationPeakFull",
y="velocityRMSFull",
size="recording_length",
color="device",
color_discrete_sequence=px.colors.qualitative.Light24, #I'll want to use a list with as many discrete values as I can
hover_name="file_name",
log_y=True,
log_x=True,
size_max=60)
fig.update_layout(
template_dark,
title_text ='Scatter Plot with Text "Color"',
xaxis_title_text ="Acceleration Peak (g)",
yaxis_title_text ="Velocity RMS (mm/s)"
)
fig.show()