enDAQ Custom Analysis¶
Introduction¶
This notebook and accompanying webinar was developed and released by the enDAQ team. This is the fifth “chapter” of our series on Python for Mechanical Engineers:
Introduction of the enDAQ Library
There are lots of examples in this!
More Examples! (Today’s Webinar)
Frequency Analysis (FFTs and PSDs)
Simple Shock Response Spectrums
Peak Analysis
Preview
endaq.batch
To sign up for future webinars and watch previous ones, visit our webinars page.
Installation¶
Available on PyPi via: > pip install endaq
For the most recent features that are still under development, you can also use pip to install endaq directly from GitHub: > pip install git+https://github.com/MideTechnology/endaq-python.git@development
[ ]:
!pip install -q endaq
!pip install -q kaleido #this is for rendering images with plotly
exit() #needed in Colab because they pre-load some libraries, wouldn't be neccessary if running locally
|████████████████████████████████| 71 kB 3.0 MB/s
|████████████████████████████████| 11.3 MB 13.0 MB/s
|████████████████████████████████| 62 kB 749 kB/s
|████████████████████████████████| 38.1 MB 1.3 MB/s
|████████████████████████████████| 93 kB 1.0 MB/s
|████████████████████████████████| 83 kB 1.0 MB/s
|████████████████████████████████| 25.3 MB 51.8 MB/s
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas~=1.1.0; python_version >= "3.0", but you have pandas 1.3.4 which is incompatible.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.26.0 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
albumentations 0.1.12 requires imgaug<0.2.7,>=0.2.5, but you have imgaug 0.2.9 which is incompatible.
|████████████████████████████████| 79.9 MB 1.2 MB/s
[ ]:
import endaq
endaq.plot.utilities.set_theme(theme='endaq')
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio; pio.renderers.default = "iframe"
import pandas as pd
import numpy as np
import scipy
PSDs & FFTs¶
Simple Sine Wave¶
[ ]:
time = np.linspace(0,2,200,endpoint=False)
sine_waves = pd.DataFrame(index=time)
sine_waves['0.8g @ 2 Hz'] = 0.8*np.sin(2*np.pi*2 * time)
sine_waves['1g @ 3 Hz'] = np.sin(2*np.pi*3 * time)
sine_waves['0.6g @ 5 Hz'] = 0.6*np.sin(2*np.pi*5 * time)
sine_waves['0.5g @ 4 & 6 Hz'] = 0.5*np.sin(2*np.pi*4 * time) + 0.5*np.sin(2*np.pi*6 * time)
sine_waves['0.3g @ 7 Hz'] = 0.3*np.sin(2*np.pi*7 * time)
sine_waves
0.8g @ 2 Hz | 1g @ 3 Hz | 0.6g @ 5 Hz | 0.5g @ 4 & 6 Hz | 0.3g @ 7 Hz | |
---|---|---|---|---|---|
0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
0.01 | 0.100267 | 0.187381 | 0.185410 | 0.308407 | 0.127734 |
0.02 | 0.198952 | 0.368125 | 0.352671 | 0.583150 | 0.231154 |
0.03 | 0.294500 | 0.535827 | 0.485410 | 0.794687 | 0.290575 |
0.04 | 0.385403 | 0.684547 | 0.570634 | 0.921177 | 0.294686 |
... | ... | ... | ... | ... | ... |
1.95 | -0.470228 | -0.809017 | -0.600000 | -0.951057 | -0.242705 |
1.96 | -0.385403 | -0.684547 | -0.570634 | -0.921177 | -0.294686 |
1.97 | -0.294500 | -0.535827 | -0.485410 | -0.794687 | -0.290575 |
1.98 | -0.198952 | -0.368125 | -0.352671 | -0.583150 | -0.231154 |
1.99 | -0.100267 | -0.187381 | -0.185410 | -0.308407 | -0.127734 |
200 rows × 5 columns
[ ]:
fig = px.line(sine_waves)
fig.update_layout(
title_text='Comparison of Fabricated Sine Waves',
yaxis_title_text='Acceleration (g)',
xaxis_title_text='Time (s)',
legend_title_text=''
)
Now to compute the PSD on this using endaq.calc.psd.welch()
, see docs
[ ]:
psd = endaq.calc.psd.welch(sine_waves, bin_width=0.5)
[ ]:
fig = px.line(psd)
fig.update_layout(
title_text='Comparison of the PSD of Two Fabricated Sine Waves',
yaxis_title_text='Acceleration (g^2/Hz)',
xaxis_title_text='Frequency (Hz)',
xaxis_type='log',
legend_title_text=''
)
PSDs are the more appropriate way to analyze vibration data for a number of reasons (see blog) but we typically see people want to see the FFT because it’s units are easier to intuitively understand.
In my experience the best way to do this is to compute a PSD using our function with a few modifications: - Use scaling as parseval
which means it is g^2
instead of g^2/Hz
- Use a boxcar window so in effect there is no windowing - Define a very fine bin width - Set the overlap between FFTs to 0 (not necessary if the bin width is small enough) - Scale from g^2 as RMS to g-peak via **2 * (2**0.5)
[ ]:
fft = endaq.calc.psd.welch(
sine_waves,
scaling='parseval',
window='boxcar',
noverlap=0,
bin_width=0.1,
)
fft = fft**0.5 * (2**0.5) #scale from g^2 as RMS to g-peak
fig = px.line(fft)
fig.update_layout(
title_text='Comparison of the FFT (from PSD) of Two Fabricated Sine Waves',
yaxis_title_text='Acceleration (g)',
xaxis_title_text='Frequency (Hz)',
xaxis_type='log',
legend_title_text='',
template='endaq_light'
)
/usr/local/lib/python3.7/dist-packages/scipy/signal/spectral.py:1966: UserWarning:
nperseg = 1000 is greater than input length = 200, using nperseg = 200
Real Sine Wave¶
Let’s load a IDE file with get_doc
docs.
[ ]:
doc = endaq.ide.get_doc('https://info.endaq.com/hubfs/100Hz_shake_cal.ide')
endaq.ide.get_channel_table(doc)
channel | name | type | units | start | end | duration | samples | rate | |
---|---|---|---|---|---|---|---|---|---|
0 | 8.0 | X (100g) | Acceleration | g | 00:00.0218 | 00:25.0532 | 00:25.0314 | 126575 | 5000.07 Hz |
1 | 8.1 | Y (100g) | Acceleration | g | 00:00.0218 | 00:25.0532 | 00:25.0314 | 126575 | 5000.07 Hz |
2 | 8.2 | Z (100g) | Acceleration | g | 00:00.0218 | 00:25.0532 | 00:25.0314 | 126575 | 5000.07 Hz |
3 | 80.0 | X (40g) | Acceleration | g | 00:00.0218 | 00:25.0547 | 00:25.0328 | 100934 | 3985.00 Hz |
4 | 80.1 | Y (40g) | Acceleration | g | 00:00.0218 | 00:25.0547 | 00:25.0328 | 100934 | 3985.00 Hz |
5 | 80.2 | Z (40g) | Acceleration | g | 00:00.0218 | 00:25.0547 | 00:25.0328 | 100934 | 3985.00 Hz |
6 | 36.0 | Pressure/Temperature:00 | Pressure | Pa | 00:00.0217 | 00:25.0702 | 00:25.0485 | 27 | 1.06 Hz |
7 | 36.1 | Pressure/Temperature:01 | Temperature | °C | 00:00.0217 | 00:25.0702 | 00:25.0485 | 27 | 1.06 Hz |
8 | 70.0 | X | Quaternion | q | 00:00.0284 | 00:24.0682 | 00:24.0397 | 2424 | 99.35 Hz |
9 | 70.1 | Y | Quaternion | q | 00:00.0284 | 00:24.0682 | 00:24.0397 | 2424 | 99.35 Hz |
10 | 70.2 | Z | Quaternion | q | 00:00.0284 | 00:24.0682 | 00:24.0397 | 2424 | 99.35 Hz |
11 | 70.3 | W | Quaternion | q | 00:00.0284 | 00:24.0682 | 00:24.0397 | 2424 | 99.35 Hz |
12 | 59.0 | Control Pad Pressure | Pressure | Pa | 00:00.0251 | 00:25.0399 | 00:25.0148 | 252 | 10.02 Hz |
13 | 59.1 | Control Pad Temperature | Temperature | °C | 00:00.0251 | 00:25.0399 | 00:25.0148 | 252 | 10.02 Hz |
14 | 59.2 | Relative Humidity | Relative Humidity | RH | 00:00.0251 | 00:25.0399 | 00:25.0148 | 252 | 10.02 Hz |
15 | 76.0 | Lux | Light | Ill | 00:00.0000 | 00:25.0278 | 00:25.0278 | 100 | 3.96 Hz |
16 | 76.1 | UV | Light | Index | 00:00.0000 | 00:25.0278 | 00:25.0278 | 100 | 3.96 Hz |
Now get the actual data out of one channel with to_pandas()
docs.
Then generate a plot with rolling_min_max_envelope()
(docs) that will instead of plotting all data points make a shaded plot which will look identical to plotting all points but be more responsive and faster (not entirely necessary on this dataset, but becomes so with larger ones).
[ ]:
accel = endaq.ide.to_pandas(doc.channels[80], time_mode='seconds')
accel = accel-accel.median() #remove DC offset
endaq.plot.rolling_min_max_envelope(
accel,
plot_as_bars=True,
desired_num_points=1000,
opacity=0.7
)
Here we’ll plot the raw data around the peak event with around_peak()
docs.
[ ]:
endaq.plot.around_peak(accel, num=500)
[ ]:
fft = endaq.calc.psd.welch(
accel[4.5:9.5], #just do it in time of first dwell
scaling='parseval',
window='boxcar',
noverlap=0,
bin_width=0.5,
)
fft = fft**0.5 * (2**0.5) #scale from g^2 as RMS to g-peak
fig = px.line(fft[:500])
fig.update_layout(
title_text='FFT (from PSD) of Real World 10.3g Sine Wave',
yaxis_title_text='Acceleration (g)',
xaxis_title_text='Frequency (Hz)',
legend_title_text='',
)
Comparing FFT Options¶
FFT Option Overview¶
There are a lot of different ways to compute a FFT in Python. I’ve been using Welch’s method so far but I want to compare that to some other more “direct” methods. So let’s compare them! First let’s conveniently wrap our manipulation of Welch’s method PSD into a FFT inside a function.
[ ]:
def welch_fft(df, bin_width=0.5):
fft = endaq.calc.psd.welch(df, bin_width=bin_width, scaling='parseval', window='boxcar', noverlap=0)
return fft**0.5 * (2**0.5) #scale from g^2 as RMS to g-peak
Now let’s wrap something around Numpy’s FFT functions for a real discrete Fourier transform.
Notice here we actually have phase information! So we’ll return that too.
[ ]:
def numpy_fft(df):
"""
Using Numpy's rfft functions compute a discrete Fourier Transform
"""
freq = np.fft.rfftfreq(len(df), d=endaq.calc.utils.sample_spacing(df))
df_fft = pd.DataFrame(
np.fft.rfft(df.to_numpy(), axis=0),
index=pd.Series(freq, name="frequency (Hz)"),
columns=df.columns
)
df_mag = df_fft.apply(np.abs, raw=True) / len(df_fft)
df_phase = df_fft.apply(np.angle, raw=True)
return df_mag, df_phase
Now we will use the FFTW algorithm which is available in the pyFFTW library under a GPL license (which makes it potentially difficult for us to use because we use the more premissive MIT license).
First let’s download it.
[ ]:
!pip install -q pyfftw
|████████████████████████████████| 2.6 MB 5.3 MB/s
Now let’s use it in a function which allows for a drop-in replacement to the Numpy code. This algorithm is generally regarded as the fatest for computing discrete Fourier transforms - so we’ll put it to the test!
[ ]:
import pyfftw
def fftw_fft(df):
"""
Using the FFTW algorithm, compute a discrete Fourier Transform
"""
freq = pyfftw.interfaces.numpy_fft.rfftfreq(len(df), d=endaq.calc.utils.sample_spacing(df))
df_fft = pd.DataFrame(
pyfftw.interfaces.numpy_fft.rfft(df.to_numpy(), axis=0),
index=pd.Series(freq, name="frequency (Hz)"),
columns=df.columns
)
df_mag = df_fft.apply(np.abs, raw=True) / len(df_fft)
df_phase = df_fft.apply(np.angle, raw=True)
return df_mag, df_phase
Now let’s see a FFT result with this library. Note though this will be relatively large to plot…
[ ]:
fft, phase = fftw_fft(accel[4.5:9.5])
fig = px.line(fft[80:120])
fig.update_layout(
title_text='FFT using FFTW of Real World 10.3g Sine Wave',
yaxis_title_text='Acceleration (g)',
xaxis_title_text='Frequency (Hz)',
legend_title_text='',
)
Remember one of the benefits of the DFT is to get phase. Although not particularly useful for this dataset let’s plot it, notice the shift for the Z axis at the drive frequency.
[ ]:
fig = px.line(phase[80:120])
fig.update_layout(
title_text='Phase using FFTW of Real World 10.3g Sine Wave',
yaxis_title_text='Phase Angle (radians)',
xaxis_title_text='Frequency (Hz)',
legend_title_text='',
)
FFT Option Comparison¶
Now let’s do the fun part to compare the three approaches! First let’s make a sine wave with a bit over 1M points.
[ ]:
time = np.linspace(0,200,2**20,endpoint=False)
sine_waves = pd.DataFrame(index=time)
sine_waves['10g @ 100 Hz'] = 10*np.sin(2*np.pi*100 * time)
sine_waves['8g @ 99 Hz'] = 8*np.sin(2*np.pi*99 * time)
sine_waves['6g @ 100.25 Hz'] = 6*np.sin(2*np.pi*100.25 * time)
[ ]:
fig = px.line(sine_waves[:0.01])
fig.update_layout(
title_text='Comparison of Fabricated Sine Waves',
yaxis_title_text='Acceleration (g)',
xaxis_title_text='Time (s)',
legend_title_text=''
)
[ ]:
from time import process_time
import timeit
pyfftw.forget_wisdom() #FFTW will be faster on subsequent runs
lengths = [2**14, 16411, 2**17, 131101, 2**20, 1000003]
ffts = pd.DataFrame()
times = pd.DataFrame()
def melt_fft(fft,name,L):
fft = fft.reset_index().melt(id_vars='frequency (Hz)')
fft['Algo'] = name
fft['Length'] = L
return fft
def time_fft(func):
return np.array(timeit.repeat(func, number=1, repeat=10)).mean()
def time_pyfftw(x, threads=1):
pyfftw.forget_wisdom()
rfft = pyfftw.builders.rfft(x, threads=threads, auto_align_input=True)
rfft()
return np.array(timeit.repeat(lambda: rfft(), number=1, repeat=10)).mean()
for L in lengths:
x = sine_waves.iloc[:L, 0].to_numpy()
t1 = process_time()
wfft_bin = welch_fft(sine_waves.iloc[:L], bin_width=0.5)
t2 = process_time()
wfft = welch_fft(sine_waves.iloc[:L], bin_width=1/(L*endaq.calc.utils.sample_spacing(sine_waves)))
t3 = process_time()
nfft, phase = numpy_fft(sine_waves.iloc[:L])
t4 = process_time()
fftw, phase = fftw_fft(sine_waves.iloc[:L])
t5 = process_time()
times_t = pd.DataFrame(
{'Welch w/ 0.5 Hz Bin':t2-t1,
'Welch': t3-t2,
'Numpy':t4-t3,
'FFTW':t5-t4,
'Numpy - No OH': time_fft(lambda: np.fft.rfft(x)),
'FFTW - No OH': time_pyfftw(x, threads=1)
},
index=pd.Series([L],name='Length')
)
times = pd.concat([times,times_t.reset_index().melt(id_vars='Length')])
wfft_bin = melt_fft(wfft_bin.copy(),'Welch w/ 0.5 Hz Bin',L)
wfft = melt_fft(wfft.copy(),'Welch',L)
nfft = melt_fft(nfft.copy(),'Numpy',L)
fftw = melt_fft(fftw.copy(),'FFTW',L)
ffts = pd.concat([ffts,wfft_bin,wfft,nfft,fftw])
[ ]:
times['str_len'] = times['Length'].astype('str')
fig = px.bar(
times,
x='str_len',
y='value',
color='variable',
log_y=True,
barmode='group',
labels={'value':'Computation Time (s)','str_len':'Array Length','variable':''}
)
fig.show()
[ ]:
fig = px.scatter(
times,
x='str_len',
y='value',
color='variable',
log_y=True,
log_x=True,
labels={'value':'Computation Time (s)','str_len':'Array Length','variable':''}
)
fig.show()
So what does this mean!? FFTW is the fastest as expected, but only if we first structure the data in a more efficient way. But typically you will not have the data structured in this “optimal” way for FFTW which means the time it takes to restucture it is real.
Long story short for this audience, using Welch’s method is fastest!
[ ]:
fig = px.line(
ffts[(ffts['frequency (Hz)']>95) & (ffts['frequency (Hz)']<105)],
x='frequency (Hz)',
y='value',
color='Algo',
facet_col='Length',
facet_row='variable',
title='FFTs: Acceleration (g) vs Frequency (Hz)',
labels={'value':'','frequency (Hz)':''}
)
fig.for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))
fig.update_layout(width=1200,height=1000)
fig.show()