Introduction to Plotly

Introduction

This notebook and the accompanying webinar were developed and released by the enDAQ team. This is the third “chapter” of our series on Python for Mechanical Engineers:

  1. Get Started with Python

  2. Introduction to Numpy & Pandas for Data Analysis

  3. Introduction to Plotly for Plotting Data

  4. Introduction of the enDAQ Library

To sign up for future webinars and watch previous ones, visit our webinars page.

Recap of Python Introduction

  1. Python is popular for good reason

  2. There are many ways to interact with Python

    • Here we are in Google Colab based on Jupyter Notebooks

  3. There are many open source libraries to use

    • Today we are covering Plotly for plotting

Why Do We Plot?

  • To Understand Relationships, Make an Observation

    • Interactivity Matters!

  • To Share & Present

    • Beautiful Matters!

all-plotly-images.png

Overview of Python Plotting Libraries

Like everything in Python, there are a few options for plotting data! We’ll focus on Plotly but quickly cover:

  • Matplotlib

  • Seaborn

  • ggplot

  • Bokeh

  • Plotly

Data Source

For some shock data, we’ll use the motorcycle crash test data discussed in our blog post on pseudo velocity. The signal was first cleaned with a 150 Hz low-pass filter using the endaq library (which uses SciPy under the hood).
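For readers curious what such a filtering step might look like, here is a minimal sketch that uses SciPy directly. It is not the endaq library's implementation: the Butterworth design, the filter order, the zero-phase `sosfiltfilt` call, the cutoff being in Hz, the 10 kHz sample rate, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np
from scipy import signal

def lowpass(x, fs, cutoff_hz=150.0, order=4):
    """Zero-phase Butterworth low-pass filter (illustrative parameters)."""
    sos = signal.butter(order, cutoff_hz, btype="lowpass", fs=fs, output="sos")
    return signal.sosfiltfilt(sos, x)

fs = 10_000.0                      # sample rate in Hz (assumed)
t = np.arange(2000) / fs           # 0.2 s of samples
slow = np.sin(2 * np.pi * 20 * t)  # 20 Hz content we want to keep
noisy = slow + 0.5 * np.sin(2 * np.pi * 2000 * t)  # add 2 kHz "noise"

filtered = lowpass(noisy, fs)

# Compare against the clean signal, ignoring a margin at each end
# where filter start-up effects can appear.
residual = np.max(np.abs(filtered[200:-200] - slow[200:-200]))
print(f"max residual away from the edges: {residual:.4f}")
```

Filtering forward and backward (which is what `sosfiltfilt` does) avoids shifting the signal in time, which matters when you care about when a shock peak occurs.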

[ ]:
import pandas as pd

df = pd.read_csv('https://info.endaq.com/hubfs/data/motorcycle-crash.csv',index_col=0)
[ ]:
df
timestamp    X (500g)   Y (500g)   Z (500g)
0.00000     -0.055582   0.033337   0.032217
0.00010     -0.056731   0.041513   0.032720
0.00020     -0.056972   0.049888   0.033106
0.00030     -0.056273   0.058399   0.033419
0.00040     -0.054605   0.066982   0.033706
...               ...        ...        ...
0.19952      0.383610   0.117662  -0.292337
0.19962      0.411270   0.135782  -0.275515
0.19972      0.438324   0.153832  -0.258591
0.19982      0.464640   0.171657  -0.241783
0.19992      0.490098   0.189112  -0.225302

2000 rows × 3 columns

Matplotlib

The most popular plotting library and the default go-to. But… there is a new game in town!

[ ]:
import matplotlib.pyplot as plt

plt.plot(df)
plt.title('Motorcycle Crash Data')
plt.ylabel('Acceleration (g)')
plt.xlabel('Time (s)')
plt.legend(df.columns)
plt.grid(color='grey')
plt.show()
../_images/webinars_Webinar_Introduction_Plotly_8_0.png

Seaborn

A wrapper around Matplotlib that simplifies the interface, beautifies the plots, and supports more stats-based analysis. Their documentation has a section specifically on aesthetics; they care too!

[ ]:
import seaborn as sns

sns.set_theme()

p = sns.lineplot(data=df)
p.set_title('Motorcycle Crash Data')
p.set_ylabel('Acceleration (g)')
p.set_xlabel('Time (s)')
Text(0.5, 0, 'Time (s)')
../_images/webinars_Webinar_Introduction_Plotly_10_1.png

ggplot

Introduces a “grammar of graphics” logic to plotting data, which allows explicit mapping of data to the visual representation. This is something Plotly Express excels at.

[ ]:
from plotnine import ggplot, aes, geom_line

df_ggplot = df.copy()
df_ggplot['time'] = df.index
df_ggplot = pd.melt(df_ggplot, id_vars='time')
df_ggplot

(
    ggplot(df_ggplot)  # What data to use
    + aes(x="time", y='value',color='variable')  # What variable to use
    + geom_line()  # Geometric object to use for drawing
)
/usr/local/lib/python3.7/dist-packages/plotnine/utils.py:1246: FutureWarning: is_categorical is deprecated and will be removed in a future version.  Use is_categorical_dtype instead
  if pdtypes.is_categorical(arr):
../_images/webinars_Webinar_Introduction_Plotly_12_1.png
<ggplot: (8785480354521)>
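The reshaping step in the cell above is what makes the explicit mapping possible: `pd.melt` turns the wide table (one column per axis) into a long table with one row per observation, so color can be mapped to a single `variable` column. Here is a minimal sketch on a made-up three-row table (the column names and values are illustrative, not the crash data):

```python
import pandas as pd

wide = pd.DataFrame(
    {"X": [0.1, 0.2, 0.3], "Y": [1.0, 1.1, 1.2]},
    index=pd.Index([0.0, 0.1, 0.2], name="timestamp"),
)

# reset_index() turns the time index into a regular column, then melt()
# stacks the X and Y columns into (variable, value) pairs.
long = wide.reset_index().melt(id_vars="timestamp")
print(long)
```

Each original column contributes one block of rows, so a 3-row, 2-column table becomes 6 rows with the columns `timestamp`, `variable`, and `value`.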

Bokeh

Finally, we have interactivity!

[ ]:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

output_notebook()

p = figure(title='Motorcycle Crash Data',
           x_axis_label='Time (s)',
           y_axis_label='Acceleration (g)')

colors = ['red','green','blue']
for c,color in zip(df.columns,colors):
  p.line(df.index, df[c], legend_label = c, line_color = color, line_width = 2)

show(p)

Plotly

Interactive, beautiful, easy!

[ ]:
!pip install -U -q plotly

[ ]:
import plotly.express as px
import plotly.io as pio; pio.renderers.default = "iframe"

fig = px.line(df)
fig.update_layout(
    title_text = 'Motorcycle Crash Data',
    xaxis_title_text = "Time (s)",
    yaxis_title_text = "Acceleration (g)")
fig.show()

How Does Plotly Work?

There are three main components to how/why Plotly works:

  1. Python Library

  2. Figure Objects

  3. JavaScript Library

To illustrate this relationship, let’s make a Plotly Treemap!

[ ]:
df_tree = pd.DataFrame({
    'names': ["Plotly","Python", "Figure Object", "JavaScript", "Express", "Graph Objects","Data","Layout",'Frames',"Plotly.js"],
    'parents' : ["", "Plotly", "Plotly", "Plotly", "Python",'Express',"Figure Object","Figure Object","Figure Object","JavaScript"]
})
[ ]:
fig = px.treemap(
    names = df_tree.names,
    parents = df_tree.parents
)
fig.update_traces(root_color="lightgrey")
fig.update_layout(
  font_family='Open Sans',
  font_size=32)
fig.show()

Plotly Express was introduced in March 2019 and is a GAME CHANGER! Here is the introduction article:

plotly-introduces-express.png

Shoutout to Plotly’s Docs

Before we proceed, I want to stress how good Plotly’s docs are: they are thorough and have a TON of examples. We’ll be focusing on Plotly Express.

plotly-express-plot-types.gif

They also have their own community forum that is pretty rich with examples and people helping each other out.

Styling Figures

Example Data

Here, to mix it up, we’ll calculate a shock response spectrum from the acceleration data and plot it on a log-log scale to show how to style that kind of figure.

[ ]:
!pip install -q endaq
import endaq
import numpy as np
[ ]:
def get_log_freqs(df,init_freq=1,bins_per_octave=12):
  """Given a timebased dataframe, return a log spaced frequency array up to the Nyquist frequency"""
  fs = (df.shape[0]-1)/(df.index[-1]-df.index[0])
  return 2 ** np.arange(np.log2(init_freq),
                        np.log2(fs/2),
                        1/bins_per_octave)
[ ]:
df_pvss = endaq.calc.shock.pseudo_velocity(df,
                                           get_log_freqs(df,init_freq=1,bins_per_octave=12),
                                           damp=0.05, two_sided=False)
df_pvss = df_pvss*9.81*39.37 #convert to in/s
[ ]:
df_pvss
frequency (Hz)      X (500g)    Y (500g)    Z (500g)
1.000000           47.804683  473.402407  148.658683
1.059463           49.240508  493.307994  154.812945
1.122462           50.532563  513.039780  160.893921
1.189207           51.637517  532.374454  166.829370
1.259921           52.551121  551.046158  172.533775
...                      ...         ...         ...
3866.109185         0.437852    1.025865    0.681822
4096.000000         0.413256    0.968276    0.643536
4339.560834         0.390047    0.913923    0.607404
4597.604550         0.368145    0.862623    0.573305
4870.992343         0.347478    0.814205    0.541124

148 rows × 3 columns
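The frequency column above steps by a constant ratio rather than a constant difference, which is what `bins_per_octave=12` asks for. To check that, the sketch below repeats the `get_log_freqs` helper from above on a synthetic stand-in for the crash data (2000 samples at an assumed 10 kHz; `df_demo` and its zero-filled column are made up):

```python
import numpy as np
import pandas as pd

def get_log_freqs(df, init_freq=1, bins_per_octave=12):
    """Log-spaced frequencies from init_freq up to the Nyquist frequency."""
    fs = (df.shape[0] - 1) / (df.index[-1] - df.index[0])  # sample rate, Hz
    return 2 ** np.arange(np.log2(init_freq),
                          np.log2(fs / 2),
                          1 / bins_per_octave)

# Synthetic stand-in for the crash data: 2000 samples at 10 kHz.
df_demo = pd.DataFrame({"X": np.zeros(2000)}, index=np.arange(2000) / 10_000)
freqs = get_log_freqs(df_demo)

# Neighboring frequencies differ by a factor of 2**(1/12), so every
# 12th entry is exactly one octave (a factor of 2) higher.
ratios = freqs[1:] / freqs[:-1]
print(len(freqs), freqs[12], ratios[0])
```

The sample rate is estimated from the index itself (number of intervals divided by elapsed time), so the helper works on any evenly sampled, time-indexed dataframe.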

Manually Defining the Theme

First, let’s start by plotting with the standard theme, modifying only the titles. We’ll also set both the x and y axes to a log scale.

[ ]:
fig = px.line(df_pvss)
fig.update_layout(
    title_text = 'Motorcycle Crash Data',
    xaxis_title_text = "Natural Frequency (Hz)",
    yaxis_title_text = "Pseudo Velocity (in/s)",
    xaxis_type = "log",
    yaxis_type = "log")
fig.show()

Now let’s get crazy and customize all the “common” settings. Note that there are a LOT of different parameters that can be explicitly defined. Remember, Plotly has very thorough documentation, so check it out:

  • Figure Layout

  • X Axis

  • Y Axis

If you like the Open Sans font as much as I do, you will need to download it. To pick colors, I suggest Color Hex.

[ ]:
fig = px.line(df_pvss)
fig.update_layout(
  font_family='Open Sans',
  font_size=16,
  font_color='#404041',

  title_text = 'Motorcycle Crash Data',
  title_font_family = 'Showcard Gothic',
  title_font_size = 32,
  title_font_color = '#e77025',
  title_x = 0.5,

  xaxis_title_text = "Natural Frequency (Hz)",
  xaxis_title_font_family = 'Algerian',
  xaxis_title_font_size = 24,
  xaxis_title_font_color = '#7f3f98',
  xaxis_type = 'log',

  yaxis_title_text ="Pseudo Velocity (in/s)",
  yaxis_title_font_family = 'Playbill',
  yaxis_title_font_size = 24,
  yaxis_title_font_color = '#be1e2d',
  yaxis_type = 'log',

  legend_bgcolor = 'yellow',
  legend_title_text = 'Legend',
  legend_title_font_size = 24,
  legend_orientation = 'v',
  legend_y = 1.0,
  legend_yanchor = 'top',
  legend_x = 1.0,
  legend_xanchor = 'right',

  plot_bgcolor = '#f3f3f3',

  width = 800,
  height = 600)
fig.show()

One REALLY cool thing Plotly does is support “magic underscore notation”, which means that this:

yaxis_title_font_family = 'Open Sans ExtraBold'

is the equivalent of this:

{'yaxis':
 {'title':
  {'font':
   {'family': 'Open Sans ExtraBold'}
  }
 }
}

I am particular about how my plots look; I think aesthetics matter! So I create a few custom themes that can be applied to figures when making them.

[ ]:
template_light = dict(
  template="presentation",

  font_family='Open Sans',
  font_size=16,
  font_color='#404041',

  title_font_family = 'Open Sans ExtraBold',
  title_font_size = 24,
  title_x = 0.5,

  xaxis_title_font_family = 'Open Sans ExtraBold',
  xaxis_title_font_size = 20,

  yaxis_title_font_family = 'Open Sans ExtraBold',
  yaxis_title_font_size = 20,

  legend_title='',
  legend_orientation='h',
  legend_y = -0.2,

  plot_bgcolor = '#f3f3f3',
  yaxis_gridcolor = '#dad9d8',
  yaxis_linecolor = '#404041',
  yaxis_mirror = True,
  xaxis_gridcolor = '#dad9d8',
  xaxis_linecolor = '#404041',
  xaxis_mirror = True,
  )

template_dark = template_light.copy()
template_dark['template'] = "plotly_dark"
template_dark['font_color'] = '#f3f3f3'
template_dark['plot_bgcolor'] = "#111111"
template_dark['yaxis_linecolor'] = "#404041"
template_dark['xaxis_linecolor'] = "#404041"
template_dark['yaxis_gridcolor'] = "#404041"
template_dark['xaxis_gridcolor'] = "#404041"

I also tend to make a lot of similar plots, so it can be helpful to define some axis labels and types as variables.

[ ]:
template_pvss = dict(
  xaxis_title_text = "Natural Frequency (Hz)",
  xaxis_type = 'log',
  yaxis_title_text ="Pseudo Velocity (in/s)",
  yaxis_type = 'log'
)

template_psd = dict(
  xaxis_title_text = "Frequency (Hz)",
  xaxis_type = 'log',
  yaxis_title_text ="Acceleration (g^2/Hz)",
  yaxis_type = 'log'
  )

template_accel = dict(
  xaxis_title_text = "Time (s)",
  yaxis_title_text ="Acceleration (g)",
  )
[ ]:
fig = px.line(df_pvss)
fig.update_layout(
  {**template_light, **template_pvss},
  title_text = 'Custom Light Theme')
fig.show()
[ ]:
fig = px.line(df_pvss)
fig.update_layout(
  {**template_dark, **template_pvss},
  title_text = 'Custom Dark Theme')
fig.show()
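The `{**template_light, **template_pvss}` expression used in the two cells above is plain Python dictionary unpacking, nothing Plotly-specific. The sketch below shows the behavior on two small made-up dicts (`base` and `axes` are illustrative stand-ins for the theme and axis templates):

```python
base = {"font_size": 16, "xaxis_type": "linear", "title_x": 0.5}
axes = {"xaxis_type": "log", "yaxis_type": "log"}

# Unpack both dicts into a new one; on duplicate keys the later dict wins.
merged = {**base, **axes}
print(merged)

# The inputs are left untouched, so they can be reused for other figures.
print(base["xaxis_type"])
```

Because later entries win, the order matters: put the general theme first and the more specific settings last.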

Themes

As you may have noticed, I used some templates in my custom theme. There are a few to pick from and you can make your own as well. This is well documented with examples on Plotly’s website.

[ ]:
themes = ["plotly", "plotly_white", "plotly_dark", "ggplot2", "seaborn", "simple_white", 'presentation', "none"]

for theme in themes:
  fig = px.line(df_pvss)
  fig.update_layout(
    template_pvss,
    template = theme,
    title_text = theme,
    )
  fig.show()

Colors

Plotly also supports a whole range of different color schemes you can implement. There are a lot of built-in ones, and remember: great documentation!

[ ]:
px.colors.qualitative.swatches().show()
[ ]:
px.colors.qualitative.Alphabet
['#AA0DFE',
 '#3283FE',
 '#85660D',
 '#782AB6',
 '#565656',
 '#1C8356',
 '#16FF32',
 '#F7E1A0',
 '#E2E2E2',
 '#1CBE4F',
 '#C4451C',
 '#DEA0FD',
 '#FE00FA',
 '#325A9B',
 '#FEAF16',
 '#F8A19F',
 '#90AD1C',
 '#F6222E',
 '#1CFFCE',
 '#2ED9FF',
 '#B10DA1',
 '#C075A6',
 '#FC1CBF',
 '#B00068',
 '#FBE426',
 '#FA0087']
[ ]:
px.colors.sequential.swatches_continuous().show()
[ ]:
px.colors.sequential.Blackbody
['rgb(0,0,0)',
 'rgb(230,0,0)',
 'rgb(230,210,0)',
 'rgb(255,255,255)',
 'rgb(160,200,255)']

Here I’m going to make my own swatch from a custom list of colors.

[ ]:
colors = ['#EE7F27', '#6914F0', '#2DB473', '#D72D2D', '#3764FF', '#FAC85F','#27eec0','#b42d4d','#82d72d','#e35ffa']
fig = px.bar(x=np.arange(10),
             y=np.zeros(10)+1,
             color=colors,
             color_discrete_sequence=colors,
             height=200)
fig.update_layout(template_dark,
                  font_color="#111111",
                  legend_font_color="white")
fig.show()

When creating a figure, you can specify the color sequence you’d like as a parameter. You can use a custom list like the one shown here, or you can use one of the built-in px.colors.qualitative lists.

[ ]:
fig = px.line(df_pvss,
              color_discrete_sequence=colors)

fig.update_layout(
  {**template_dark, **template_pvss},
  title_text = 'Custom Colors')

fig.show()

Saving Figures

To save static images, first you need to install kaleido.

[ ]:
!pip install -U kaleido
Collecting kaleido
  Downloading kaleido-0.2.1-py2.py3-none-manylinux1_x86_64.whl (79.9 MB)
     |████████████████████████████████| 79.9 MB 85 kB/s
Installing collected packages: kaleido
Successfully installed kaleido-0.2.1

In Colab only, you’ll also need to run this install.

[ ]:
!wget https://github.com/plotly/orca/releases/download/v1.2.1/orca-1.2.1-x86_64.AppImage -O /usr/local/bin/orca
!chmod +x /usr/local/bin/orca
!apt-get install xvfb libgtk2.0-0 libgconf-2-4

Now saving an image is easy, and several file types are supported. More information and options are available in their docs.

[ ]:
fig.write_image("fig.png")
fig.write_image("fig.jpeg")
fig.write_image("fig.svg")
fig.write_image("fig.pdf")

You can also keep the interactivity by saving an HTML file, either with or without the Plotly JavaScript library embedded. You’ll need to embed the library if you want to view the plots without internet access. For more, see their docs on saving HTML files.

[ ]:
fig.write_html('fig-full.html')
fig.write_html('fig-small.html',full_html=False,include_plotlyjs='cdn')

Plot Types

Remember Plotly has GREAT documentation, and LOTS of plot types. Below is a GIF of their main page on all the different plot types:

plotly-python-plot-types.gif

And here is the gallery of plot types available in plotly.express, meaning they are super easy to create! Again, check out the docs:

plotly-express-plot-types.gif

Now we’ll go through a few relevant examples for mechanical engineers and vibration analysis.

Bubble

[ ]:
df_table = pd.read_csv('https://info.endaq.com/hubfs/data/endaq-cloud-table.csv')
df_table = df_table[['serial_number_id', 'file_name', 'file_size', 'recording_length', 'recording_ts',
         'accelerationPeakFull', 'psuedoVelocityPeakFull', 'accelerationRMSFull',
         'velocityRMSFull', 'displacementRMSFull', 'pressureMeanFull', 'temperatureMeanFull']].copy()

df_table['recording_ts'] = pd.to_datetime(df_table['recording_ts'], unit='s')
df_table = df_table.sort_values(by=['recording_ts'], ascending=False)
df_table
serial_number_id file_name file_size recording_length recording_ts accelerationPeakFull psuedoVelocityPeakFull accelerationRMSFull velocityRMSFull displacementRMSFull pressureMeanFull temperatureMeanFull
11 11456 50_Joules_900_lbs-1629315312.ide 1597750 20.201752 2021-07-26 19:56:39 231.212 2907.650 2.423 54.507 1.066 98.745 24.175
10 11456 100_Joules_900_lbs-1629315313.ide 1596714 20.200623 2021-07-26 19:21:55 218.634 2961.256 2.877 53.875 1.053 98.751 24.180
22 9695 Tilt_000000-1625156721.IDE 719403 23.355163 2021-07-01 16:21:01 0.378 330.946 0.044 11.042 0.345 99.510 26.410
7 11162 Calibration-Shake-1632515140.IDE 2218130 27.882690 2021-05-17 19:16:10 8.783 1142.282 2.712 46.346 0.617 102.251 24.545
17 11071 surgical-instrument-1625829182.ide 541994 6.951172 2021-04-22 16:53:10 5.739 387.312 1.568 24.418 0.242 99.879 21.889
8 10916 FUSE_HSTAB_000005-1632515139.ide 537562 18.491791 2021-04-22 16:13:24 0.202 53.375 0.011 1.504 0.036 90.706 18.874
2 10118 Bolted-1632515144.ide 6149229 29.396118 2021-04-21 21:44:07 15.343 148.276 2.398 14.101 0.154 99.652 23.172
20 9680 LOC__6__DAQ41551_25_01-1625170793.IDE 8664238 63.878937 2021-03-25 04:53:27 564.966 2357.599 54.408 145.223 3.088 102.875 26.031
19 9680 LOC__4__DAQ41551_15_05-1625170794.IDE 6927958 64.486054 2021-03-25 04:22:10 585.863 2153.020 46.528 148.591 2.615 105.750 32.202
18 9680 LOC__3__DAQ41551_11_01_02-1625170795.IDE 2343292 28.456818 2021-03-25 04:06:19 622.040 8907.949 94.197 372.049 9.580 105.682 33.452
21 9680 LOC__2__DAQ38060_06_03_05-1625170793.IDE 1519172 27.057647 2021-03-25 02:54:22 995.670 5845.241 131.087 323.287 3.144 104.473 25.616
12 11046 Drive-Home_07-1626805222.ide 36225758 634.732056 2021-03-19 19:35:57 23.805 356.128 0.097 6.117 0.135 101.988 28.832
5 11046 Drive-Home_01-1632515142.ide 3632799 61.755371 2021-03-19 18:35:55 0.479 40.197 0.021 1.081 0.023 100.284 29.061
14 10030 200922_Moto_Max_Run5_Control_Larry-1626297441.ide 4780893 99.325134 2020-09-22 23:47:35 29.864 1280.349 3.528 55.569 1.060 NaN NaN
0 9695 train-passing-1632515146.ide 10492602 73.612335 2020-04-29 18:20:36 7.513 419.944 0.372 6.969 0.061 104.620 23.432
15 9695 ford_f150-1626296561.ide 96097059 1207.678344 2020-03-13 23:35:08 NaN NaN NaN NaN NaN NaN NaN
4 9295 Seat-Base_21-1632515142.ide 5248836 83.092255 2019-12-08 10:16:50 1.085 251.009 0.130 7.318 0.190 98.930 17.820
1 9316 Seat-Top_09-1632515145.ide 10491986 172.704559 2019-12-08 10:14:31 1.105 86.595 0.082 1.535 0.040 98.733 20.133
16 7530 Motorcycle-Car-Crash-1626277852.ide 10489262 151.069336 2019-07-03 17:02:52 480.737 12831.590 1.732 143.437 3.988 100.363 26.989
6 0 HiTest-Shock-1632515141.ide 2655894 20.331848 2018-12-04 15:22:54 619.178 6058.093 11.645 167.835 4.055 101.126 9.538
13 5120 Mining-SSX28803_06-1626457584.IDE 402920686 3238.119202 2018-09-14 19:28:24 NaN NaN NaN NaN NaN NaN NaN
9 9874 Coffee_002-1631722736.IDE 60959516 769.299896 2000-03-03 20:02:24 2.698 1338.396 0.059 5.606 0.104 100.339 24.540
3 10309 RMI-2000-1632515143.ide 5909632 60.250855 1970-01-01 00:00:24 0.332 17.287 0.079 1.247 0.005 100.467 21.806
[ ]:
fig = px.scatter(df_table,
           x="recording_ts",
           y="accelerationRMSFull",
           size="recording_length",
           color="serial_number_id",
           hover_name="file_name",
           log_y=True,
           size_max=60)

fig.update_layout(
  template_dark,
  title_text ='Scatter Plot with Numeric "Color"',
  xaxis_title_text ="Date of Recording",
  yaxis_title_text ="Acceleration RMS (g)"
)

fig.show()
[ ]:
df_table['device'] = df_table["serial_number_id"].astype(str)

fig = px.scatter(df_table,
           x="accelerationPeakFull",
           y="velocityRMSFull",
           size="recording_length",
           color="device",
           color_discrete_sequence=px.colors.qualitative.Light24, #I'll want to use a list with as many discrete values as I can
           hover_name="file_name",
           log_y=True,
           log_x=True,
           size_max=60)

fig.update_layout(
  template_dark,
  title_text ='Scatter Plot with Text "Color"',
  xaxis_title_text ="Acceleration Peak (g)",
  yaxis_title_text ="Velocity RMS (mm/s)"
)

fig.show()