endaq.calc

endaq.calc is a module comprising a collection of common calculations for vibration analysis. It leverages the standard Python scientific stack (NumPy, SciPy, Pandas) in order to enable engineers to perform domain-specific calculations in a few lines of code, without having to first master Python and its scientific stack in their entireties.

endaq.calc.fft

endaq.calc.fft.aggregate_fft(df, **kwargs)

Calculate the FFT using scipy.signal.welch() with a specified frequency spacing. The data returned is in the same units as the data input.

Parameters:
  • df – The input data

  • bin_width – The desired width of the resulting frequency bins, in Hz; defaults to 1 Hz

  • kwargs – Other parameters to pass directly to scipy.signal.welch()

Returns:

A periodogram of the input data in the same units as the input.

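A minimal usage sketch, assuming endaq is installed; the sinusoidal test signal below is an illustrative stand-in for real measurement data:

import numpy as np
import pandas as pd
import endaq

# Stand-in time series: a 10 Hz sinusoid sampled at 1000 Hz for 2 s
t = np.arange(0, 2, 1/1000)
df = pd.DataFrame({'X': np.sin(2*np.pi*10*t)}, index=pd.Index(t, name='time'))

# Compute the spectrum with 1 Hz frequency bins; output units match the input
fft_df = endaq.calc.fft.aggregate_fft(df, bin_width=1.0)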

endaq.calc.fft.dct(df, nfft=None, norm=None, **kwargs)

Calculate the DCT of the data in df, using SciPy’s DCT method from scipy.fft.dct().

Parameters:
  • df (DataFrame) – the input data

  • nfft (int | None) – Length of the transformed axis of the output. If nfft is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If nfft is not given, the length of the input along the axis specified by axis is used.

  • norm (Literal[None, 'unit', 'forward', 'ortho', 'backward'] | None) – Normalization mode. Default is “unit”, meaning a normalization of 2/n is applied on the forward transform, and a normalization of 1/2 is applied on the idct. The “unit” normalization means that the units of the FFT are the same as the units of the data put into it and that a sinusoid of amplitude A will peak with amplitude A in the frequency domain. “forward” instead applies a normalization of 1/n on the forward transforms and no normalization is applied on the idct. “backward” applies no normalization on the forward transform and 1/n on the backward. For norm="ortho", both directions are scaled by 1/sqrt(n).

  • kwargs – Further keywords passed to scipy.fft.dct(). Note that the nfft parameter of this function is passed to scipy.fft.dct() as n.

Returns:

The DCT of each channel in df.

Return type:

DataFrame


endaq.calc.fft.dst(df, nfft=None, norm=None, **kwargs)

Calculate the DST of the data in df, using SciPy’s DST method from scipy.fft.dst().

Parameters:
  • df (DataFrame) – the input data

  • nfft (int | None) – Length of the transformed axis of the output. If nfft is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If nfft is not given, the length of the input along the axis specified by axis is used.

  • norm (Literal[None, 'unit', 'forward', 'ortho', 'backward'] | None) – Normalization mode. Default is “unit”, meaning a normalization of 2/n is applied on the forward transform, and a normalization of 1/2 is applied on the idst. The “unit” normalization means that the units of the FFT are the same as the units of the data put into it and that a sinusoid of amplitude A will peak with amplitude A in the frequency domain. “forward” instead applies a normalization of 1/n on the forward transforms and no normalization is applied on the idst. “backward” applies no normalization on the forward transform and 1/n on the backward. For norm="ortho", both directions are scaled by 1/sqrt(n).

  • kwargs – Further keywords passed to scipy.fft.dst(). Note that the nfft parameter of this function is passed to scipy.fft.dst() as n.

Returns:

The DST of each channel in df.

Return type:

DataFrame


endaq.calc.fft.fft(df, output=None, nfft=None, norm=None, optimize=True, **kwargs)

Perform the FFT of the data in df, using SciPy’s FFT method from scipy.fft.fft(). If the data in df is all real, the output will be symmetric between positive and negative frequencies, so it is instead recommended that you use the endaq.calc.fft.rfft() method.

Parameters:
  • df (DataFrame) – The input data

  • output (Literal[None, 'magnitude', 'angle', 'complex'] | None) – The type of the output of the FFT. Default is “magnitude”. “magnitude” will return the absolute value of the FFT, “angle” will return the phase angle in radians, “complex” will return the complex values of the FFT.

  • nfft (int | None) – Length of the transformed axis of the output. If nfft is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If nfft is not given, the length of the input along the axis specified by axis is used.

  • norm (Literal[None, 'unit', 'forward', 'ortho', 'backward'] | None) – Normalization mode. Default is “unit”, meaning a normalization of 2/n is applied on the forward transform, and a normalization of 1/2 is applied on the ifft. The “unit” normalization means that the units of the FFT are the same as the units of the data put into it and that a sinusoid of amplitude A will peak with amplitude A in the frequency domain. “forward” instead applies a normalization of 1/n on the forward transforms and no normalization is applied on the ifft. “backward” applies no normalization on the forward transform and 1/n on the backward. For norm="ortho", both directions are scaled by 1/sqrt(n).

  • optimize (bool) – If optimize is set to True, the length of the FFT will automatically be padded to a length which can be calculated more quickly. Default is True.

  • kwargs – Further keywords passed to scipy.fft.fft(). Note that the nfft parameter of this function is passed to scipy.fft.fft() as n.

Returns:

The FFT of each channel in df.

Return type:

DataFrame


endaq.calc.fft.rfft(df, output=None, nfft=None, norm=None, optimize=True, **kwargs)

Perform the real valued FFT of the data in df, using SciPy’s RFFT method from scipy.fft.rfft().

Parameters:
  • df (DataFrame) – The input data

  • output (Literal[None, 'magnitude', 'angle', 'complex'] | None) – The type of the output of the FFT. Default is “magnitude”. “magnitude” will return the absolute value of the FFT, “angle” will return the phase angle in radians, “complex” will return the complex values of the FFT.

  • nfft (int | None) – Length of the transformed axis of the output. If nfft is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If nfft is not given, the length of the input along the axis specified by axis is used.

  • norm (Literal[None, 'unit', 'forward', 'ortho', 'backward'] | None) – Normalization mode. Default is “unit”, meaning a normalization of 2/n is applied on the forward transform, and a normalization of 1/2 is applied on the ifft. The “unit” normalization means that the units of the FFT are the same as the units of the data put into it and that a sinusoid of amplitude A will peak with amplitude A in the frequency domain. “forward” instead applies a normalization of 1/n on the forward transforms and no normalization is applied on the ifft. “backward” applies no normalization on the forward transform and 1/n on the backward. For norm="ortho", both directions are scaled by 1/sqrt(n).

  • optimize (bool) – If optimize is set to True, the length of the FFT will automatically be padded to a length which can be calculated more quickly. Default is True.

  • kwargs – Further keywords passed to scipy.fft.rfft(). Note that the nfft parameter of this function is passed to scipy.fft.rfft() as n.

Returns:

The RFFT of each channel in df.

Return type:

DataFrame

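A minimal sketch of rfft() on real-valued data, using an illustrative sinusoid; with the default “unit” normalization, a sinusoid of amplitude 2 should peak near 2 in the magnitude spectrum:

import numpy as np
import pandas as pd
import endaq

# A 50 Hz sinusoid of amplitude 2, sampled at 1 kHz for 1 s
t = np.arange(0, 1, 1/1000)
df = pd.DataFrame({'X': 2*np.sin(2*np.pi*50*t)}, index=pd.Index(t, name='time'))

# Only non-negative frequencies are returned for real-valued input
mag = endaq.calc.fft.rfft(df, output='magnitude')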

endaq.calc.fft.rolling_fft(df, bin_width=1.0, num_slices=100, indexes=None, index_values=None, slice_width=None, add_resultant=True, disable_warnings=True, **kwargs)

Compute FFTs for defined slices of a time series data set using aggregate_fft()

Parameters:
  • df (DataFrame) – the input dataframe with an index defining the time in seconds or datetime

  • bin_width (float) – the bin width or resolution in Hz for the FFT

  • num_slices (int) – the number of slices to split the time series into, default is 100, this is ignored if indexes is defined

  • indexes (array | None) – the center index locations (not value) of each slice to compute the FFT

  • index_values (array | None) – the index values of each peak event to quantify (slower but more intuitive than using indexes)

  • slice_width (float | None) – the time in seconds to center about each slice index, if none is provided it will calculate one based upon the number of slices

  • add_resultant (bool) – if True the root sum of squares of each FFT column will also be computed

  • disable_warnings (bool) – if True (default) it disables the warnings on the PSD length

  • kwargs – Other parameters to pass directly to aggregate_fft()

Returns:

a dataframe containing all the FFTs, stacked on each other

Return type:

DataFrame

See example use cases and syntax at spectrum_over_time() which visualizes the output of this function in Heatmaps, Waterfall plots, Surface plots, and Animations

endaq.calc.filters

endaq.calc.filters.band_limited_noise(min_freq=0.0, max_freq=None, duration=1.0, sample_rate=1000.0, norm='peak')

Generate a time series with noise in a defined frequency range.

Parameters:
  • min_freq (float) – minimum frequency (Hz) where noise starts; defaults to 0

  • max_freq (float | None) – maximum frequency (Hz) where noise ends; defaults to 1/2 the sample rate

  • duration (float) – the duration of the time series returned, in seconds

  • sample_rate (float) – sample rate (Hz) of the time series

  • norm (Literal['rms', 'peak']) –

    how to normalize the amplitude so that one of the following equals 1:

    • ”rms” - the root mean square

    • ”peak” - the peak value (default)

Returns:

a dataframe of the generated time series

Return type:

pd.DataFrame

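A minimal sketch generating a test signal; the parameter values below are illustrative:

import endaq

# 5 s of noise between 10 and 100 Hz, sampled at 1 kHz, peak-normalized to 1
noise = endaq.calc.filters.band_limited_noise(
    min_freq=10.0, max_freq=100.0, duration=5.0, sample_rate=1000.0, norm='peak')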

endaq.calc.filters.bessel(df, low_cutoff=1.0, high_cutoff=None, half_order=3, tukey_percent=0.0, norm='mag')

Apply a lowpass and/or a highpass Bessel filter to an array.

This function uses Bessel filter designs, and implements the filter(s) as bi-directional digital biquad filters, split into second-order sections.

Parameters:
  • df (pd.DataFrame) – the input data; cutoff frequencies are relative to the timestamps in df.index

  • low_cutoff (Optional[float]) – the low-frequency cutoff, if any; frequencies below this value are rejected, and frequencies above this value are preserved

  • high_cutoff (Optional[float]) – the high-frequency cutoff, if any; frequencies above this value are rejected, and frequencies below this value are preserved

  • half_order (int) – half of the order of the filter; higher orders provide more aggressive stopband reduction

  • tukey_percent (float) – the alpha parameter of a preconditioning Tukey filter; if 0 (default), no filter is applied

  • norm (Literal['phase', 'delay', 'mag']) –

    how to normalize relative to the critical frequency:

    • ”phase” - the “phase-matched” case, which is the default in SciPy & MATLAB

    • ”delay” - the “natural” type obtained by solving Bessel polynomials

    • ”mag” - the gain magnitude is -3 dB at the cutoff frequency; the default for this implementation, to match Butterworth

Returns:

the filtered data

Return type:

pd.DataFrame


endaq.calc.filters.butterworth(df, low_cutoff=1.0, high_cutoff=None, half_order=3, tukey_percent=0.0)

Apply a lowpass and/or a highpass Butterworth filter to an array.

This function uses Butterworth filter designs, and implements the filter(s) as bi-directional digital biquad filters, split into second-order sections.

Parameters:
  • df (DataFrame) – the input data; cutoff frequencies are relative to the timestamps in df.index

  • low_cutoff (float | None) – the low-frequency cutoff, if any; frequencies below this value are rejected, and frequencies above this value are preserved

  • high_cutoff (float | None) – the high-frequency cutoff, if any; frequencies above this value are rejected, and frequencies below this value are preserved

  • half_order (int) – half of the order of the filter; higher orders provide more aggressive stopband reduction

  • tukey_percent (float) – the alpha parameter of a preconditioning Tukey filter; if 0 (default), no filter is applied

Returns:

the filtered data

Return type:

DataFrame

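A minimal band-pass sketch, using band_limited_noise() above as a stand-in for measured data; the cutoff values are illustrative:

import endaq

# Stand-in signal: 5 s of noise up to 400 Hz at 1 kHz
df = endaq.calc.filters.band_limited_noise(max_freq=400.0, duration=5.0, sample_rate=1000.0)

# Band-pass between 10 and 100 Hz; half_order=3 yields a 6th-order filter
filtered = endaq.calc.filters.butterworth(df, low_cutoff=10.0, high_cutoff=100.0, half_order=3)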

endaq.calc.filters.cheby1(df, low_cutoff=1.0, high_cutoff=None, half_order=3, tukey_percent=0.0, rp=3.0)

Apply a lowpass and/or a highpass Chebyshev type I filter to an array.

This function uses Chebyshev type I filter designs, and implements the filter(s) as bi-directional digital biquad filters, split into second-order sections.

Parameters:
  • df (DataFrame) – the input data; cutoff frequencies are relative to the timestamps in df.index

  • low_cutoff (float | None) – the low-frequency cutoff, if any; frequencies below this value are rejected, and frequencies above this value are preserved

  • high_cutoff (float | None) – the high-frequency cutoff, if any; frequencies above this value are rejected, and frequencies below this value are preserved

  • half_order (int) – half of the order of the filter; higher orders provide more aggressive stopband reduction

  • tukey_percent (float) – the alpha parameter of a preconditioning Tukey filter; if 0 (default), no filter is applied

  • rp (float) – the maximum ripple allowed in the passband, specified in decibels

Returns:

the filtered data

Return type:

DataFrame


endaq.calc.filters.cheby2(df, low_cutoff=1.0, high_cutoff=None, half_order=3, tukey_percent=0.0, rs=30.0)

Apply a lowpass and/or a highpass Chebyshev type II filter to an array.

This function uses Chebyshev type II filter designs, and implements the filter(s) as bi-directional digital biquad filters, split into second-order sections.

Parameters:
  • df (DataFrame) – the input data; cutoff frequencies are relative to the timestamps in df.index

  • low_cutoff (float | None) – the low-frequency cutoff, if any; frequencies below this value are rejected, and frequencies above this value are preserved

  • high_cutoff (float | None) – the high-frequency cutoff, if any; frequencies above this value are rejected, and frequencies below this value are preserved

  • half_order (int) – half of the order of the filter; higher orders provide more aggressive stopband reduction

  • tukey_percent (float) – the alpha parameter of a preconditioning Tukey filter; if 0 (default), no filter is applied

  • rs (float) – the minimum attenuation allowed in the stopband, specified in decibels

Returns:

the filtered data

Return type:

DataFrame


endaq.calc.filters.ellip(df, low_cutoff=1.0, high_cutoff=None, half_order=3, tukey_percent=0.0, rp=3.0, rs=30.0)

Apply a lowpass and/or a highpass Elliptic filter to an array.

This function uses Elliptic filter designs, and implements the filter(s) as bi-directional digital biquad filters, split into second-order sections.

Parameters:
  • df (DataFrame) – the input data; cutoff frequencies are relative to the timestamps in df.index

  • low_cutoff (float | None) – the low-frequency cutoff, if any; frequencies below this value are rejected, and frequencies above this value are preserved

  • high_cutoff (float | None) – the high-frequency cutoff, if any; frequencies above this value are rejected, and frequencies below this value are preserved

  • half_order (int) – half of the order of the filter; higher orders provide more aggressive stopband reduction

  • tukey_percent (float) – the alpha parameter of a preconditioning Tukey filter; if 0 (default), no filter is applied

  • rp (float) – the maximum ripple allowed in the passband, specified in decibels

  • rs (float) – the minimum attenuation allowed in the stopband, specified in decibels

Returns:

the filtered data

Return type:

DataFrame


endaq.calc.filters.rolling_mean(df, duration=5.0)

Remove the rolling mean from an input time series dataframe.

Parameters:
  • df (DataFrame) – the input data

  • duration (float | None) – the rolling window size in seconds; if None is given, the mean of the entire series is removed

Returns:

a dataframe of the filtered data

Return type:

DataFrame

endaq.calc.integrate

endaq.calc.integrate.integrals(df, n=1, zero='start', highpass_cutoff=None, tukey_percent=0.0)

Calculate n integrations of the given data.

Parameters:
  • df (DataFrame) – the data to integrate, indexed with timestamps

  • n (int) – the number of integrals to calculate; defaults to 1

  • zero (Literal['start', 'mean', 'median'] | ~typing.Callable) – the output quantity driven to zero by the integration constant; “start” (default) chooses an integration constant of -output[0], “mean” chooses -np.mean(output) & “median” chooses -np.median(output)

  • highpass_cutoff (float | None) – the cutoff frequency for the initial highpass filter; this is used to remove artifacts caused by DC trends

  • tukey_percent (float) – the alpha parameter of a preconditioning Tukey filter; if 0 (default), no filter is applied

Returns:

a length n+1 list of the kth-order integrals from 0 to n (inclusive)

Return type:

List[DataFrame]

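A minimal sketch, again using band_limited_noise() as a stand-in for measured acceleration; with n=2 the returned list unpacks as [acceleration, velocity, displacement]:

import endaq

# Stand-in acceleration signal
accel = endaq.calc.filters.band_limited_noise(max_freq=200.0, duration=2.0, sample_rate=1000.0)

# The highpass filter suppresses drift from DC trends before integrating
accel_out, vel, disp = endaq.calc.integrate.integrals(
    accel, n=2, highpass_cutoff=1.0, tukey_percent=0.05)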

endaq.calc.integrate.iter_integrals(df, zero='start', highpass_cutoff=None, tukey_percent=0.0)

Iterate over conditioned integrals of the given original data.

Parameters:
  • df (DataFrame) – the input data

  • zero (Literal['start', 'mean', 'median'] | ~typing.Callable) – the output quantity driven to zero by the integration constant; “start” (default) chooses an integration constant of -output[0], “mean” chooses -np.mean(output) & “median” chooses -np.median(output)

  • highpass_cutoff (float | None) – the cutoff frequency of a preconditioning highpass filter; if None, no filter is applied

  • tukey_percent (float) – the alpha parameter of a preconditioning Tukey filter; if 0 (default), no filter is applied

Returns:

an iterable over the data’s successive integrals; the first item is the preconditioned input data

Return type:

Iterable[DataFrame]


endaq.calc.psd

endaq.calc.psd.differentiate(df, n=1.0)

Perform time-domain differentiation on periodogram data.

Parameters:
  • df (DataFrame) – a periodogram

  • n (float) – the time derivative order; negative orders represent integration

Returns:

a periodogram of the time-differentiated data

Return type:

DataFrame
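
A minimal sketch converting an acceleration PSD into a velocity PSD; the stand-in signal comes from band_limited_noise():

import endaq

# Stand-in acceleration data and its PSD
accel = endaq.calc.filters.band_limited_noise(max_freq=400.0, duration=10.0, sample_rate=1000.0)
accel_psd = endaq.calc.psd.welch(accel, bin_width=1.0)

# n=-1 integrates once in the frequency domain: acceleration PSD -> velocity PSD
vel_psd = endaq.calc.psd.differentiate(accel_psd, n=-1)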

endaq.calc.psd.rolling_psd(df, bin_width=1.0, octave_bins=None, fstart=1.0, scaling=None, agg='mean', freq_splits=None, num_slices=100, indexes=None, index_values=None, slice_width=None, add_resultant=True, disable_warnings=True, **kwargs)

Compute PSDs for defined slices of a time series data set using welch()

Parameters:
  • df (DataFrame) – the input dataframe with an index defining the time in seconds or datetime

  • bin_width (float) – the bin width or resolution in Hz for the PSD, defaults to 1, this is ignored if octave_bins is defined

  • octave_bins (float | None) – the number of frequency bins in each octave, defaults to None

  • fstart (float) – the lowest frequency for an octave PSD, defaults to 1

  • scaling (Literal[None, 'density', 'spectrum', 'parseval', 'unit', 'rms'] | None) –

    the scaling of the output:

    • ”density” & “spectrum” correspond to the same options in scipy.signal.welch

    • ”parseval” will maintain the “energy” between the input & output, s.t. welch(df, scaling="parseval").sum(axis="rows") is roughly equal to df.abs().pow(2).sum(axis="rows")

    • ”unit” will maintain the units and scale of the input data

    • ”rms” will use “parseval” for the PSD calculations and set agg to “sum”, but then take the square root at the end

  • agg (Literal['mean', 'sum'] | ~typing.Callable[[~numpy.ndarray, int], float]) – the method for aggregating values into bins (only used if converting to octave or jagged); ‘mean’ preserves the PSD’s area-under-the-curve, ‘sum’ preserves the PSD’s “energy”

  • freq_splits (array | None) – the boundaries of the frequency bins to pass to to_jagged()

  • num_slices (int) – the number of slices to split the time series into, default is 100, this is ignored if indexes is defined

  • indexes (array | None) – the center index locations (not value) of each slice to compute the PSD

  • index_values (array | None) – the index values of each peak event to quantify (slower but more intuitive than using indexes)

  • slice_width (float | None) – the time in seconds to center about each slice index, if none is provided it will calculate one based upon the number of slices

  • add_resultant (bool) – if True (default) the root sum of squares of each PSD column will also be computed

  • disable_warnings (bool) – if True (default) it disables the warnings on the PSD length

  • kwargs – Other parameters to pass directly to psd.welch()

Returns:

a dataframe containing all the PSDs, stacked on each other

Return type:

DataFrame

See example use cases and syntax at spectrum_over_time() which visualizes the output of this function in Heatmaps, Waterfall plots, Surface plots, and Animations
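
A minimal sketch, with a stand-in signal from band_limited_noise(); the slice count is illustrative:

import endaq

# Stand-in signal: 30 s of band-limited noise
df = endaq.calc.filters.band_limited_noise(max_freq=400.0, duration=30.0, sample_rate=1000.0)

# 20 PSD slices with 1 Hz bins, returned stacked in one long-format dataframe
psds = endaq.calc.psd.rolling_psd(df, bin_width=1.0, num_slices=20)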

endaq.calc.psd.spectrogram(df, num_slices=100, scaling=None, bin_width=1.0, octave_bins=None, fstart=1.0, agg='mean', freq_splits=None, add_resultant=True, disable_warnings=True, **kwargs)

Compute a spectrogram for a time series data set using scipy.signal.spectrogram()

Parameters:
  • df (DataFrame) – the input dataframe with an index defining the time in seconds or datetime

  • num_slices (int) – the number of slices to split the time series into, default is 100

  • scaling (Literal[None, 'density', 'spectrum', 'parseval', 'unit', 'rms'] | None) –

    the scaling of the output:

    • ”density” & “spectrum” correspond to the same options in scipy.signal.welch

    • ”parseval” will maintain the “energy” between the input & output, s.t. welch(df, scaling="parseval").sum(axis="rows") is roughly equal to df.abs().pow(2).sum(axis="rows")

    • ”unit” will maintain the units and scale of the input data

    • ”rms” will use “parseval” for the PSD calculations and set agg to “sum”, but then take the square root at the end

  • bin_width (float) – the bin width or resolution in Hz for the PSD or FFT, defaults to 1

  • octave_bins (float | None) – the number of frequency bins in each octave, defaults to None

  • fstart (float) – the lowest frequency for an octave spaced output, defaults to 1

  • agg (Literal['mean', 'sum'] | ~typing.Callable[[~numpy.ndarray, int], float]) – the method for aggregating values into bins (only used if converting to octave or jagged); ‘mean’ preserves the PSD’s area-under-the-curve, ‘sum’ preserves the PSD’s “energy”

  • freq_splits (array | None) – the boundaries of the frequency bins to pass to to_jagged()

  • add_resultant (bool) – if True (default) the root sum of squares of each PSD column will also be computed

  • disable_warnings (bool) – if True (default) it disables the warnings in helper functions

  • kwargs – Other parameters to pass directly to scipy.signal.spectrogram()

Returns:

a dataframe containing all the spectrograms “melted” with columns defining the value, frequency, timeslice, and original column from the input dataframe

Return type:

DataFrame


endaq.calc.psd.to_jagged(df, freq_splits, agg='mean')

Calculate a periodogram over non-uniformly spaced frequency bins.

Parameters:
  • df (DataFrame) – the returned values from endaq.calc.psd.welch()

  • freq_splits (array) – the boundaries of the frequency bins; must be strictly increasing

  • agg (Literal['mean', 'sum'] | ~typing.Callable[[~numpy.ndarray, int], float]) – the method for aggregating values into bins; ‘mean’ preserves the PSD’s area-under-the-curve, ‘sum’ preserves the PSD’s “energy”

Returns:

a periodogram with the given frequency spacing

Return type:

DataFrame

endaq.calc.psd.to_octave(df, fstart=1.0, octave_bins=12.0, **kwargs)

Calculate a periodogram over log-spaced frequency bins.

Parameters:
  • df (DataFrame) – the returned values from endaq.calc.psd.welch()

  • fstart (float) – the first frequency bin, in Hz; defaults to 1 Hz

  • octave_bins (float) – the number of frequency bins in each octave; defaults to 12

  • kwargs – other parameters to pass directly to to_jagged()

Returns:

a periodogram with the given logarithmic frequency spacing

Return type:

DataFrame
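
A minimal sketch rebinning a PSD into 1/3-octave bands, chaining welch() (documented below) with to_octave(); the signal is an illustrative stand-in:

import endaq

# Stand-in signal and its 1 Hz-resolution PSD
df = endaq.calc.filters.band_limited_noise(max_freq=400.0, duration=10.0, sample_rate=1000.0)
psd = endaq.calc.psd.welch(df, bin_width=1.0)

# Rebin into 1/3-octave bands starting at 1 Hz
octave_psd = endaq.calc.psd.to_octave(psd, fstart=1.0, octave_bins=3.0)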

endaq.calc.psd.vc_curves(accel_psd, fstart=1.0, octave_bins=12.0)

Calculate Vibration Criterion (VC) curves from an acceleration periodogram.

Parameters:
  • accel_psd (DataFrame) – a periodogram of the input acceleration

  • fstart (float) – the first frequency bin

  • octave_bins (float) – the number of frequency bins in each octave; defaults to 12

Returns:

the Vibration Criterion (VC) curve of the input acceleration

Return type:

DataFrame

endaq.calc.psd.welch(df, bin_width=1.0, scaling=None, **kwargs)

Perform scipy.signal.welch with a specified frequency spacing.

Parameters:
  • df (DataFrame) – the input data

  • bin_width (float) – the desired width of the resulting frequency bins, in Hz; defaults to 1 Hz

  • scaling (Literal[None, 'density', 'spectrum', 'parseval', 'unit'] | None) –

    the scaling of the output:

    • ”density” & “spectrum” correspond to the same options in scipy.signal.welch

    • ”parseval” will maintain the “energy” between the input & output, s.t. welch(df, scaling="parseval").sum(axis="rows") is roughly equal to df.abs().pow(2).sum(axis="rows")

    • ”unit” will maintain the units and scale of the input data

  • kwargs – other parameters to pass directly to scipy.signal.welch

Returns:

a periodogram

Return type:

DataFrame


endaq.calc.rotation

endaq.calc.rotation.quaternion_to_euler(df, mode='x-y-z')

Convert quaternion data in the dataframe df to Euler angles. This can be done with either intrinsic or extrinsic rotations, determined automatically based on mode.

Parameters:
  • df (DataFrame) – The input quaternions to convert. Must have columns labelled ‘X’, ‘Y’, ‘Z’, and ‘W’.

  • mode (str) – The order of the axes to rotate. The default is intrinsic rotation about x-y-z.

Returns:

A dataframe with the Euler angles of the quaternion data.

Return type:

DataFrame
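
A minimal sketch with hypothetical input data; real quaternion data would come from a recording, but the required ‘X’, ‘Y’, ‘Z’, ‘W’ column layout is the same:

import numpy as np
import pandas as pd
import endaq

# Hypothetical quaternions (identity rotations); columns must be 'X', 'Y', 'Z', 'W'
quat = pd.DataFrame(
    {'X': np.zeros(10), 'Y': np.zeros(10), 'Z': np.zeros(10), 'W': np.ones(10)},
    index=pd.Index(np.arange(10) / 10, name='time'))

# Intrinsic rotation about x-y-z (the default)
euler = endaq.calc.rotation.quaternion_to_euler(quat, mode='x-y-z')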

endaq.calc.shock

class endaq.calc.shock.HalfSineWavePulse(amplitude, duration)

The output data type for enveloping_half_sine().

The significant data members are amplitude and duration, which can simply be unpacked as if from a plain tuple:

ampl, T = enveloping_half_sine(df_pvss)

However, users can also elect to use the other methods of this class to generate other kinds of outputs.

Note

This class is not intended to be instantiated manually.

to_time_series(tstart=None, tstop=None, dt=None, tpulse=None)

Generate a time-series of the half-sine pulse.

Parameters:
  • tstart (float | None) – the starting time of the resulting waveform; if None (default), the range starts at tpulse

  • tstop (float | None) – the ending time of the resulting waveform; if None (default), the range ends at tpulse + duration

  • dt (float | None) – the sampling period of the resulting waveform; defaults to 1/20th of the pulse duration

  • tpulse (float | None) –

    the starting time of the pulse within the resulting waveform; if None (default), the pulse starts at either:

    • tstart, if provided

    • tstop - self.duration.max(), if tstop is provided

    • 0.0 otherwise

Returns:

a time-series of the half-sine pulse

Return type:

DataFrame

endaq.calc.shock.absolute_acceleration(accel, omega, damp=0.0)

Calculate the absolute acceleration for a SDOF system.

The absolute acceleration follows the transfer function:

H(s) = L{x”(t)}(s) / L{y”(t)}(s) = X(s)/Y(s)

for the PDE:

x” + (2ζω)x’ + (ω²)x = (2ζω)y’ + (ω²)y

Parameters:
  • accel (DataFrame) – the absolute acceleration y”

  • omega (float) – the natural frequency ω of the SDOF system

  • damp (float) – the damping coefficient ζ of the SDOF system

Returns:

the absolute acceleration x” of the SDOF system

Return type:

DataFrame

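A minimal sketch of the SDOF response functions in this module; this assumes omega is an angular frequency in radians/second (2π times the natural frequency in Hz), and uses noise as a stand-in for the base excitation:

import numpy as np
import endaq

# Stand-in base acceleration y"
accel = endaq.calc.filters.band_limited_noise(max_freq=200.0, duration=2.0, sample_rate=1000.0)

# Response of a 50 Hz SDOF system with 5% damping
omega = 2 * np.pi * 50  # assumed to be in rad/s
x_dd = endaq.calc.shock.absolute_acceleration(accel, omega=omega, damp=0.05)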

endaq.calc.shock.enveloping_half_sine(pvss, damp=0.0)

Characterize a half-sine pulse whose PVSS envelopes the input.

Parameters:
  • pvss (DataFrame) – the PVSS to envelope

  • damp (float) – the damping factor used to generate the input PVSS

Returns:

a tuple of amplitudes and periods, each pair of which describes a half-sine pulse

Return type:

HalfSineWavePulse
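
A minimal end-to-end sketch: compute a PVSS with shock_spectrum() (documented below), envelope it, then synthesize the equivalent half-sine pulse; the input signal is an illustrative stand-in:

import endaq

# Stand-in shock data
accel = endaq.calc.filters.band_limited_noise(max_freq=400.0, duration=2.0, sample_rate=1000.0)

# PVSS -> enveloping half-sine pulse -> synthesized time series
pvss = endaq.calc.shock.shock_spectrum(accel, mode='pvss', damp=0.05)
pulse = endaq.calc.shock.enveloping_half_sine(pvss, damp=0.05)
ampl, duration = pulse  # unpacks like a plain tuple
pulse_ts = pulse.to_time_series()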

endaq.calc.shock.pseudo_velocity(accel, omega, damp=0.0)

Calculate the pseudo-velocity for a SDOF system.

The pseudo-velocity follows the transfer function:

H(s) = L{ωz(t)}(s) / L{y”(t)}(s) = (ω/s²)(Z(s)/Y(s))

for the PDE:

z” + (2ζω)z’ + (ω²)z = -y”

Parameters:
  • accel (DataFrame) – the absolute acceleration y”

  • omega (float) – the natural frequency ω of the SDOF system

  • damp (float) – the damping coefficient ζ of the SDOF system

Returns:

the pseudo-velocity of the SDOF system

Return type:

DataFrame


endaq.calc.shock.relative_displacement(accel, omega, damp=0.0)

Calculate the relative displacement for a SDOF system.

The relative displacement follows the transfer function:

H(s) = L{z(t)}(s) / L{y”(t)}(s) = (1/s²)(Z(s)/Y(s))

for the PDE:

z” + (2ζω)z’ + (ω²)z = -y”

Parameters:
  • accel (DataFrame) – the absolute acceleration y”

  • omega (float) – the natural frequency ω of the SDOF system

  • damp (float) – the damping coefficient ζ of the SDOF system

Returns:

the relative displacement z of the SDOF system

Return type:

DataFrame


endaq.calc.shock.relative_displacement_static(accel, omega, damp=0.0)

Calculate the relative displacement expressed as equivalent static acceleration for a SDOF system.

The relative displacement as static acceleration follows the transfer function:

H(s) = L{ω²z(t)}(s) / L{y”(t)}(s) = (ω²/s²)(Z(s)/Y(s))

for the PDE:

z” + (2ζω)z’ + (ω²)z = -y”

Parameters:
  • accel (DataFrame) – the absolute acceleration y”

  • omega (float) – the natural frequency ω of the SDOF system

  • damp (float) – the damping coefficient ζ of the SDOF system

Returns:

the relative displacement of the SDOF system expressed as equivalent static acceleration

Return type:

DataFrame


endaq.calc.shock.relative_velocity(accel, omega, damp=0.0)

Calculate the relative velocity for a SDOF system.

The relative velocity follows the transfer function:

H(s) = L{z’(t)}(s) / L{y”(t)}(s) = (1/s)(Z(s)/Y(s))

for the PDE:

z” + (2ζω)z’ + (ω²)z = -y”

Parameters:
  • accel (DataFrame) – the absolute acceleration y”

  • omega (float) – the natural frequency ω of the SDOF system

  • damp (float) – the damping coefficient ζ of the SDOF system

Returns:

the relative velocity z’ of the SDOF system

Return type:

DataFrame


endaq.calc.shock.rolling_shock_spectrum(df, damp=0.05, mode='srs', add_resultant=True, freqs=None, init_freq=0.5, bins_per_octave=12.0, num_slices=100, indexes=None, index_values=None, slice_width=None, disable_warnings=True)

Compute Shock Response Spectrums for defined slices of a time series data set using shock_spectrum()

Parameters:
  • df (DataFrame) – the input dataframe with an index defining the time in seconds or datetime

  • damp (float) – the damping coefficient ζ, related to the Q-factor by ζ = 1/(2Q); defaults to 0.05

  • mode (Literal['srs', 'pvss']) –

    the type of spectrum to calculate:

    • ’srs’ (default) specifies the Shock Response Spectrum (SRS)

    • ’pvss’ specifies the Pseudo-Velocity Shock Spectrum (PVSS)

  • add_resultant (bool) – if True (default) the column-wise resultant will also be computed and returned with the spectra per-column

  • freqs (ndarray | None) – the natural frequencies across which to calculate the spectrum, if None (the default) it uses init_freq and bins_per_octave to define them

  • init_freq (float) – the initial frequency in the sequence; if None, use the frequency corresponding to the data’s duration, default is 0.5 Hz

  • bins_per_octave (float) – the number of frequencies per octave, default is 12

  • num_slices (int) – the number of slices to split the time series into, default is 100, this is ignored if indexes is defined

  • indexes (array | None) – the center index locations (not value) of each slice to compute the shock spectrum

  • index_values (array | None) – the index values of each peak event to quantify (slower but more intuitive than using indexes)

  • slice_width (float | None) – the time in seconds to center about each slice index, if none is provided it will calculate one based upon the number of slices

  • disable_warnings (bool) – if True (default) it disables the warnings on the initial frequency

Returns:

a dataframe containing all the shock spectrums, stacked on each other

Return type:

DataFrame

See example use cases and syntax at spectrum_over_time() which visualizes the output of this function in Heatmaps, Waterfall plots, Surface plots, and Animations

endaq.calc.shock.shock_spectrum(accel, freqs=None, init_freq=0.5, bins_per_octave=12.0, damp=0.05, mode='srs', max_time=2.0, peak_threshold=0.1, two_sided=False, aggregate_axes=False)

Calculate the shock spectrum of an acceleration signal. Note that, by default, this first finds peak events and then computes the spectrum at those peaks to reduce processing time.

Parameters:
  • accel (DataFrame) – the absolute acceleration y”

  • freqs (ndarray | None) – the natural frequencies across which to calculate the spectrum, if None (the default) it uses init_freq and bins_per_octave to define them

  • init_freq (float) – the initial frequency in the sequence; if None, use the frequency corresponding to the data’s duration, default is 0.5 Hz

  • bins_per_octave (float) – the number of frequencies per octave, default is 12

  • damp (float) – the damping coefficient ζ, related to the Q-factor by ζ = 1/(2Q); defaults to 0.05

  • mode (Literal['srs', 'pvss']) –

    the type of spectrum to calculate:

    • ’srs’ (default) specifies the Shock Response Spectrum (SRS)

    • ’pvss’ specifies the Pseudo-Velocity Shock Spectrum (PVSS)

  • max_time (float) – the maximum duration in seconds over which to compute the shock spectrum directly; if the data’s duration exceeds max_time, find_peaks() is used to locate peak events, and the shock spectrum at each peak is calculated with rolling_shock_spectrum(), with max_time defining the slice_width. Set max_time to None to force the function to skip the peak finding.

  • peak_threshold (float) – if the duration is greater than max_time all peaks that are greater than peak_threshold will be calculated, and the aggregate max per frequency will be reported

  • two_sided (bool) – whether to return for each frequency: both the maximum negative and positive shocks (True), or simply the maximum absolute shock (False; default)

  • aggregate_axes (bool) – whether to calculate the column-wise resultant (True) or calculate spectra along each column independently (False; default)

Returns:

the shock spectrum

Return type:

DataFrame

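A minimal SRS sketch with a stand-in signal; the frequency range and damping values are illustrative:

import endaq

# Stand-in acceleration; in practice this is measured shock data
accel = endaq.calc.filters.band_limited_noise(max_freq=400.0, duration=2.0, sample_rate=1000.0)

# SRS from 1 Hz upward, 12 frequencies per octave, damp=0.05 (Q = 10)
srs = endaq.calc.shock.shock_spectrum(
    accel, init_freq=1.0, bins_per_octave=12.0, damp=0.05, mode='srs')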

endaq.calc.stats

endaq.calc.stats.L2_norm(array, axis=None, keepdims=False)

Compute the L2 norm (a.k.a. the Euclidean Norm).

Parameters:
  • array (np.ndarray) – the input array

  • axis (Union[None, SupportsIndex, Sequence[SupportsIndex]]) – the axis/axes along which to aggregate; if None (default), the L2 norm is computed along the flattened array

  • keepdims (bool) – if True, the axes which are reduced are left in the result as dimensions of size one; if False (default), the reduced axes are removed

Returns:

an array containing the computed values

Return type:

np.ndarray

endaq.calc.stats.find_peaks(df, time_distance=1.0, add_resultant=False, threshold=None, threshold_reference='peak', threshold_multiplier=0.1, use_abs=True, display_plots=False)

Find the peak events of a given time series using the maximum across all input columns

Parameters:
  • df (DataFrame) – the input dataframe with an index defining the time in seconds or datetime

  • time_distance (float) – the minimum time in seconds between events, default is 1.0

  • add_resultant (bool) – add a resultant (root sum of the squares) to the time series prior to finding peaks, calculated from the other input dataframe columns

  • threshold (float | None) – if None (default) this is ignored, but if defined this value is passed as the minimum threshold to define a shock event

  • threshold_reference (Literal['rms', 'peak']) –

    if the threshold isn’t defined, calculate it from:

    • ”peak” (default) the overall peak

    • ”rms” the RMS value of the time series

  • threshold_multiplier (float) –

    if the threshold isn’t defined, multiply this by the threshold_reference; suggestions are:

    • 0.1 (default) when using “peak”, to get all events greater than 10% of the overall peak

    • 4.0 when using “rms”; a typical Gaussian signal has a kurtosis of 3.0

  • use_abs (bool) – use the absolute value of the data to define peak events, default is True

  • display_plots (bool) – display a plotly figure of the time series with the peak events plotted over it (default as False)

Returns:

an array of index locations

Return type:

ndarray


Here’s an example implementation using a 60M dataset that loads the data, finds the peaks, and plots the time series with the peak events identified, all very quickly:

import endaq
endaq.plot.utilities.set_theme()
import plotly.graph_objects as go

# Get Accel
accel = endaq.ide.get_primary_sensor_data('https://info.endaq.com/hubfs/ford_f150.ide',
    measurement_type='accel', time_mode='datetime')

# Filter
accel = endaq.calc.filters.butterworth(accel, low_cutoff=2)

# Get Peak Indexes
indexes = endaq.calc.stats.find_peaks(
    accel, time_distance=2,
    threshold_reference="rms", threshold_multiplier=5.0)

# Generate a Dataframe with Just the Peak Events
df_peaks = accel.iloc[indexes]

# Generate Shaded Bar Plot of All Data
fig = endaq.plot.rolling_min_max_envelope(accel, plot_as_bars=True, opacity=0.7)

# Add Peaks & Display
fig.add_trace(
    go.Scatter(
        x=df_peaks.index,
        y=df_peaks.abs().max(axis=1).to_numpy(),
        mode='markers', name='Peak Events', marker_symbol='x', marker_color='white'
    )
)
fig.show()


endaq.calc.stats.max_abs(array, axis=None, keepdims=False)

Compute the maximum of the absolute value of an array.

This function is equivalent to, but generally uses less memory than, np.amax(np.abs(array)).

Specifically, it generates the absolute-value maximum from np.amax(array) and -np.amin(array). Thus instead of allocating space for the intermediate array np.abs(array), it allocates for the axis-collapsed smaller arrays np.amax(array) & np.amin(array).

Note

This method does not work on complex-valued arrays.

Parameters:
  • array (np.ndarray) – the input data

  • axis (Union[None, SupportsIndex, Sequence[SupportsIndex]]) – the axis/axes along which to aggregate; if None (default), the absolute maximum is computed along the flattened array

  • keepdims (bool) – if True, the axes which are reduced are left in the result as dimensions with size one; if False (default), the reduced axes are removed

Returns:

an array containing the computed values

Return type:

np.ndarray

endaq.calc.stats.rms(array, axis=None, keepdims=False)

Calculate the root-mean-square (RMS) along a given axis.

Parameters:
  • array (np.ndarray) – the input array

  • axis (Union[None, SupportsIndex, Sequence[SupportsIndex]]) – the axis/axes along which to aggregate; if None (default), the RMS is computed along the flattened array

  • keepdims (bool) – if True, the axes which are reduced are left in the result as dimensions with size one; if False (default), the reduced axes are removed

Returns:

an array containing the computed values

Return type:

np.ndarray
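
A minimal sketch of the array statistics above (L2_norm(), max_abs(), rms()) on a small hand-built array:

import numpy as np
import endaq

arr = np.array([[3.0, -4.0], [1.0, 2.0]])

endaq.calc.stats.L2_norm(arr, axis=0)  # column-wise Euclidean norms: [sqrt(10), sqrt(20)]
endaq.calc.stats.max_abs(arr, axis=0)  # column-wise max absolute values: [3., 4.]
endaq.calc.stats.rms(arr, axis=0)      # column-wise RMS: [sqrt(5), sqrt(10)]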

endaq.calc.stats.rolling_metrics(df, indexes=None, index_values=None, num_slices=100, slice_width=None, **kwargs)

Quantify a series of time slices of a given time series

Parameters:
  • df (DataFrame) – the input dataframe with an index defining the time in seconds or datetime

  • indexes (array | None) – the index locations (not value) of each peak event to quantify like what is returned by find_peaks()

  • index_values (array | None) – the index values of each peak event to quantify (slower but more intuitive than using indexes)

  • num_slices (int) – the number of slices to split the time series into, default is 100, this is ignored if indexes or index_values are defined

  • slice_width (float | None) – the time in seconds to center about each slice index, if none is provided it will calculate one based upon the number of slices

  • kwargs – Other parameters to pass directly to shock_vibe_metrics()

Returns:

a dataframe containing all the metrics, one computed per column of the input dataframe, and one per peak event

Return type:

DataFrame

Here’s a continuation of the example shown in find_peaks() that generates a table of metrics for a few defined time stamps, and then a row of subplots for each metric calculated.

import endaq
endaq.plot.utilities.set_theme()
import plotly.express as px
import pandas as pd

# Get Accel
accel = endaq.ide.get_primary_sensor_data('https://info.endaq.com/hubfs/ford_f150.ide',
    measurement_type='accel', time_mode='datetime')

# Filter
accel = endaq.calc.filters.butterworth(accel, low_cutoff=2)

# Calculate for 3 Specific Times
metrics = endaq.calc.stats.rolling_metrics(
    accel,
    index_values = pd.DatetimeIndex(['2020-03-13 23:40:13', '2020-03-13 23:45:00', '2020-03-13 23:50:00'],tz='UTC'),
    slice_width=5.0)

# Simplify Timestamp Column
metrics.timestamp = metrics.timestamp.astype(str).map(lambda x: x[10:19])

# Generate Plot Table of Metrics
table_plot = endaq.plot.table_plot(metrics)
table_plot.show()

# Calculate for 50 Equally Spaced & Sized Slices, Turning off Pseudo Velocity (Only Recommended for Smaller Time Slices)
metrics = endaq.calc.stats.rolling_metrics(
    accel, num_slices=50, highpass_cutoff=2,
    tukey_percent=0.0, include_pseudo_velocity=False)


# Generate Row with Subplots for each metric
metrics_fig = px.scatter(
    metrics,
    x='timestamp',
    y='value',
    color='variable',
    facet_col='calculation',
    facet_col_spacing=0.03
)
metrics_fig.update_yaxes(title_text='', matches=None, showticklabels=True).update_xaxes(title_text='')
metrics_fig.update_layout(width=3000, legend_y=1.2, legend_title_text='').for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))
metrics_fig.show()


endaq.calc.stats.rolling_rms(df, window_len, *args, **kwargs)

Calculate a rolling root-mean-square (RMS) over a pandas DataFrame.

This function is equivalent to, but computationally faster than the following:

df.rolling(window_len).apply(endaq.calc.stats.rms)

Parameters:
  • df (DataFrame | Series) – the input data

  • window_len (int) – the length of the rolling window

  • args – the positional arguments to pass into df.rolling().mean

  • kwargs – the keyword arguments to pass into df.rolling().mean

Returns:

the rolling-windowed RMS

Return type:

DataFrame | Series

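A minimal sketch; the window length below assumes a 1 kHz sample rate, so 1000 samples spans one second:

import endaq

# Stand-in vibration signal sampled at 1 kHz
df = endaq.calc.filters.band_limited_noise(max_freq=400.0, duration=5.0, sample_rate=1000.0)

# RMS over a rolling 1 s (1000-sample) window
rolling = endaq.calc.stats.rolling_rms(df, window_len=1000)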

endaq.calc.stats.shock_vibe_metrics(df, tukey_percent=0.1, highpass_cutoff=None, accel_units='gravity', disp_units='in', freq_splits=(0, 65, 300, 1500, None), detrend='median', zero='start', include_integration=True, include_pseudo_velocity=False, damp=0.05, init_freq=1.0, bins_per_octave=12, include_resultant=False, display_plots=False)

Compute the following shock and vibration metrics for a given time series dataframe:
  • Peak Absolute Acceleration

  • RMS Acceleration

  • Peak Frequency

  • RMS Acceleration in Defined Frequency Ranges with freq_splits

  • Peak Absolute Velocity

  • RMS Velocity

  • Peak Absolute Displacement

  • RMS Displacement

  • Peak Pseudo Velocity & Corresponding Frequency

Parameters:
  • df (DataFrame) – the input dataframe with an index defining the time in seconds or datetime and units of accel_units

  • tukey_percent (float) –

    the portion of the time series to apply a Tukey window (a taper that forces beginning and end to 0), default is 0.1

    • Note that the RMS metrics will only be computed on the portion of time that isn’t tapered

  • highpass_cutoff (float | None) – the cutoff frequency of a preconditioning highpass filter; if None, no filter is applied. For shock events it is recommended to leave this as None (the default); for vibration, a highpass filter is recommended.

  • accel_units (str) – the units to display acceleration as, default is “gravity” which will be shortened to ‘g’ in labels, the unit conversion is handled using convert_units()

  • disp_units (str) – the units to display displacement as and velocity (divided by seconds), default is “in”, the unit conversion is handled using convert_units()

  • freq_splits (ndarray | list | tuple) – the boundaries of the frequency bins for the RMS calculations; must be strictly increasing. If the last value is None (the default), the sampling rate is used.

  • detrend (Literal['start', 'mean', 'median', None]) –

    the output quantity driven to zero prior to the calculations

    • None does nothing

    • ”start” forces the first datapoint to 0,

    • ”mean” chooses -np.mean(output)

    • ”median” (default) chooses -np.median(output)

  • zero (Literal['start', 'mean', 'median']) –

    the output quantity driven to zero inside each integration call

    • ”start” (default) forces the first datapoint to 0,

    • ”mean” chooses -np.mean(output)

    • ”median” chooses -np.median(output)

  • include_integration (bool) – if True, include the calculations of velocity and displacement. Defaults to True.

  • include_pseudo_velocity (bool) – if True, include the more time-consuming calculation of pseudo velocity. Defaults to False.

  • damp (float) – the damping coefficient used in the shock response calculation ζ, related to the Q-factor by ζ = 1/(2Q); defaults to 0.05

  • init_freq (float) – the initial frequency in the sequence for the shock response calculation; if None, use the frequency corresponding to the data’s duration, default is 1.0 Hz

  • bins_per_octave (float) – the number of frequencies per octave for the shock response calculation

  • include_resultant (bool) – add a resultant (root sum of the squares) for each metric, calculated from the other input dataframe columns

  • display_plots (bool) – display plotly figures of the min/max envelope of acceleration, velocity, displacement and PVSS (default as False)

Returns:

a dataframe containing all the metrics, one computed per column of the input dataframe

Return type:

DataFrame

Here is an example calculating and displaying these metrics for the bearing dataset discussed in our blog Top 12 Vibration Metrics to Monitor & How to Calculate Them

import endaq
endaq.plot.utilities.set_theme('endaq_light')
import pandas as pd
import plotly.express as px

# Get Acceleration Data
accel = pd.read_csv('https://info.endaq.com/hubfs/Plots/bearing_data.csv', index_col=0)

# Calculate Metrics
metrics = endaq.calc.stats.shock_vibe_metrics(accel, include_resultant=False, freq_splits=[0, 65, 300, None])

# Generate Figure with Bar Plots
fig = px.bar(
    metrics,
    x="variable", y="value", color='variable',
    hover_data = ["units"],
    facet_col="calculation", facet_col_wrap=3)
fig.update_yaxes(matches=None, visible=False)
fig.update_xaxes(visible=False)
fig.for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))
fig.show()


endaq.calc.utils

endaq.calc.utils.convert_units(units_in, units_out, df=None)

Apply a unit conversion to a provided unit-unaware dataframe using the Pint library.

Parameters:
  • units_in (str) – a text string defining the base units to convert from like “in” for inches

  • units_out (str) – a text string defining the destination units to convert to like “mm” for millimeters

  • df (DataFrame | None) – the input dataframe; if None, only the conversion factor from units_in to units_out is returned

Returns:

a dataframe with the values scaled according to the unit conversion; if no dataframe is provided, a scalar conversion factor is returned

Return type:

DataFrame

Some examples are provided below, including a table of common unit conversions. A full list is available from the Pint library.

import endaq
endaq.plot.utilities.set_theme()
import pandas as pd
import plotly.express as px

# Simple conversion factor from inches to millimeters
in_2_mm = endaq.calc.utils.convert_units('in', 'mm')

# Get idelib dataset
doc = endaq.ide.get_doc('https://info.endaq.com/hubfs/data/All-Channels.ide')

# Get acceleration data in 'g' and convert to in/s^2
accel_in_gs = endaq.ide.get_primary_sensor_data(doc=doc, measurement_type='accel')
accel_in_inches = endaq.calc.utils.convert_units('gravity', 'in/s**2', accel_in_gs)

# Get temperature in Celsius and convert to Fahrenheit
temp_in_C = endaq.ide.get_primary_sensor_data(doc=doc, measurement_type='temp')
temp_in_F = endaq.calc.utils.convert_units('degC', 'degF', temp_in_C)

# Merge C & F in one dataframe
temp_in_C.columns = ['Temperature in Degrees C']
temp_in_F.columns = ['Temperature in Degrees F']
temp = pd.concat([temp_in_C, temp_in_F], axis=1)

# Display plot with both C and F
fig = px.line(
    temp.reset_index().melt(id_vars='timestamp'),
    x='timestamp',
    y='value',
    facet_col='variable',
    facet_col_spacing=0.07
).for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))
fig.update_yaxes(matches=None, showticklabels=True, title_text='').update_xaxes(title_text='')
fig.show()

# Get Table of Unit Conversions
df = pd.read_csv('https://info.endaq.com/hubfs/Unit-Conversion-Examples.csv')
df['output'] = 0
for i in df.index:
    df.loc[i, 'output'] = endaq.calc.utils.convert_units(
        units_in = df.loc[i, 'units_in'],
        units_out = df.loc[i, 'units_out'])

# Generate Plot Table
plot_table = endaq.plot.table_plot(df, num_round=6)
plot_table.show()


endaq.calc.utils.logfreqs(df, init_freq=None, bins_per_octave=12.0)

Calculate a sequence of log-spaced frequencies for a given dataframe.

Parameters:
  • df (DataFrame) – the input data

  • init_freq (float | None) – the initial frequency in the sequence; if None (default), use the frequency corresponding to the data’s duration

  • bins_per_octave (float) – the number of frequencies per octave

Returns:

an array of log-spaced frequencies

Return type:

ndarray

endaq.calc.utils.resample(df, sample_rate=None)

Resample a dataframe to a desired sample rate (in Hz)

Parameters:
  • df (DataFrame) – The DataFrame to resample, indexed by time

  • sample_rate (float | None) – The desired sample rate to resample the given data to. If not supplied, the current average sample rate is used and the time stamps are made uniformly spaced.

Returns:

The resampled data in a DataFrame

Return type:

DataFrame

endaq.calc.utils.sample_spacing(data, convert='to_seconds')

Calculate the average spacing between individual samples.

For time indices, this calculates the sampling period dt.

Parameters:
  • data (ndarray | DataFrame) – the input data; either a pandas DataFrame with the samples spaced along its index, or a 1D-array-like of sample times

  • convert (Literal[None, 'to_seconds']) – if “to_seconds” (default), convert any time objects into floating-point seconds

Returns:

the average spacing between samples (for time indices, the sampling period dt), as floating-point seconds when convert="to_seconds"
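
A minimal sketch of resample() and sample_spacing() together; the rates below are illustrative:

import endaq

# Stand-in data; resample onto a uniform 500 Hz time base
df = endaq.calc.filters.band_limited_noise(max_freq=200.0, duration=2.0, sample_rate=1000.0)
resampled = endaq.calc.utils.resample(df, sample_rate=500.0)

# Average sampling period of the result, in seconds (~0.002)
dt = endaq.calc.utils.sample_spacing(resampled)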

endaq.calc.utils.to_dB(data, reference, squared=False)

Scale data into units of decibels.

Decibels are a log-scaled ratio of some value against a reference; typically this is expressed as follows:

dB = 10 · log10(x / x_ref)

By convention, “decibel” units tend to operate on units of power. For units that are proportional to power when squared (e.g., volts, amps, pressure, etc.), their “decibel” representation is typically doubled (i.e., dB = 20 · log10(x / x_ref)). Users can specify which scaling to use with the squared parameter.

Note

Decibels can NOT be calculated from negative values.

For example, to calculate dB on arbitrary time-series data, the data is typically first aggregated via:

  • a total RMS (e.g., endaq.calc.stats.rms()),

  • a rolling RMS (e.g., endaq.calc.stats.rolling_rms()), or

  • a rolling maximum absolute value (e.g., endaq.calc.stats.max_abs()),

and the non-negative result from the aggregation is then scaled into decibels.

Parameters:
  • data (ndarray) – the input data

  • reference (float | Literal['SPL', 'audio_intensity']) – the reference value corresponding to 0dB

  • squared (bool) – whether the input data & reference value are pre-squared; defaults to False
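
A minimal sketch; reference=1.0 is an illustrative choice (the reference parameter also accepts the named references “SPL” and “audio_intensity”):

import numpy as np
import endaq

# Amplitude-like data (not pre-squared), so the 20*log10 convention applies
data = np.array([0.5, 1.0, 2.0])
db = endaq.calc.utils.to_dB(data, reference=1.0, squared=False)  # approx. [-6.02, 0., 6.02] dB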