endaq.ide
The submodule endaq.ide contains utility functions for accessing data in Mide Instrumentation Data Exchange (.IDE) files, the native format generated by enDAQ data recorders.
- endaq.ide.extract_time(doc, out, start=0, end=None, channels=None, **kwargs)
Efficiently extract data within a certain interval from an IDE file, writing it to another file. Note that due to the way data is stored in an IDE, the exported interval will be slightly wider than the specified start and end times; this ensures the data is copied verbatim and without loss.
The start and end times, if used, may be specified in several ways:
int/float (microseconds from the recording start)
str (formatted as a time from the recording start, e.g., MM:SS, HH:MM:SS, DDd HH:MM:SS). More examples:
":01" or ":1" or "1s" (1 second)
"22:11" (22 minutes, 11 seconds)
"3:22:11" (3 hours, 22 minutes, 11 seconds)
"1d 3:22:11" (1 day, 3 hours, 22 minutes, 11 seconds)
datetime.timedelta or pandas.Timedelta (time from the recording start)
datetime.datetime (an explicit UTC time)
- Parameters:
doc – A Dataset or the name of a local IDE file. Dataset objects do not have to be fully imported.
out – A filename or stream to which to save the extracted data.
start – The starting time. Defaults to the start of the recording.
end – The ending time. Defaults to the end of the recording.
channels – A list of channel IDs to specifically export. If None, all channels will be exported. Note that excluded channels will still appear in the new IDE’s channels dictionary, but the file will contain no data for them.
- Returns:
The total number of bytes written, and total number of ChannelDataBlock elements copied.
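The accepted start and end formats listed above can all be reduced to a single microsecond offset from the recording start. The following is a minimal sketch of that normalization, not the library's own parser: to_microseconds() and its recording_start argument are hypothetical names introduced here for illustration, and the string handling covers only the example formats shown above.

```python
import datetime

US_PER_SECOND = 1_000_000


def to_microseconds(t, recording_start=None):
    """Hypothetical helper: convert a time spec to microseconds from the recording start."""
    if isinstance(t, datetime.datetime):
        # An explicit UTC time needs the recording's UTC start as a reference point.
        if recording_start is None:
            raise ValueError("recording_start is required for datetime values")
        return (t - recording_start) // datetime.timedelta(microseconds=1)
    if isinstance(t, datetime.timedelta):
        # Also covers pandas.Timedelta, which subclasses datetime.timedelta.
        return t // datetime.timedelta(microseconds=1)
    if isinstance(t, (int, float)):
        # Numbers are already microseconds from the recording start.
        return int(t)
    if isinstance(t, str):
        text = t.strip().lower()
        days = 0
        if "d" in text:
            day_part, text = text.split("d", 1)
            days = int(day_part)
        # Drop an optional trailing "s", then split the "HH:MM:SS" style fields.
        fields = [float(f) if f else 0.0 for f in text.strip().rstrip("s").split(":")]
        if len(fields) > 3:
            raise ValueError(f"Unrecognized time string: {t!r}")
        hours, minutes, seconds = [0.0] * (3 - len(fields)) + fields
        total = days * 86400 + hours * 3600 + minutes * 60 + seconds
        return round(total * US_PER_SECOND)
    raise TypeError(f"Unsupported time type: {type(t).__name__}")


print(to_microseconds("22:11"))
print(to_microseconds("1d 3:22:11"))
print(to_microseconds(datetime.timedelta(seconds=2)))
```

Reducing every format to one integer unit keeps later comparisons against block timestamps simple; the real functions accept these formats directly, so a helper like this is only needed when reasoning about the values by hand.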
- endaq.ide.filter_channels(channels, measurement_type=<MeasurementType: Any/all>)
Filter a list of Channel and/or SubChannel instances by their measurement type(s).
- Parameters:
channels – A list or dictionary of channels/subchannels to filter.
measurement_type – A MeasurementType, a measurement type ‘key’ string, or a string of multiple keys generated by adding and/or subtracting MeasurementType objects. Any ‘subtracted’ types will be excluded.
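The include/exclude behavior described above can be pictured with a small standalone sketch. Everything in it is an assumption made for illustration: the key strings ("acc", "pre", "tmp", and "*" for any), the leading "-" that marks a subtracted type, and the dictionary-based channels are hypothetical stand-ins, not the library's actual representation.

```python
def split_keys(measurement_type):
    """Split a key string into (included, excluded) sets of keys (assumed format)."""
    included, excluded = set(), set()
    for token in str(measurement_type).split():
        if token.startswith("-"):
            excluded.add(token[1:])  # a 'subtracted' type
        else:
            included.add(token)
    return included, excluded


def filter_by_type(channels, measurement_type="*"):
    """Keep channels whose type is included and not excluded."""
    included, excluded = split_keys(measurement_type)
    match_all = not included or "*" in included
    return [
        ch for ch in channels
        if ch["type"] not in excluded and (match_all or ch["type"] in included)
    ]


# Hypothetical channels; real code would pass Channel/SubChannel instances.
channels = [
    {"name": "X (100g)", "type": "acc"},
    {"name": "Pressure", "type": "pre"},
    {"name": "Temperature", "type": "tmp"},
]

print([ch["name"] for ch in filter_by_type(channels, "acc pre")])
print([ch["name"] for ch in filter_by_type(channels, "* -acc")])
```

Exclusions are checked first, so a type that is both added and subtracted ends up excluded in this sketch.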
- endaq.ide.get_channel_table(dataset, measurement_type=<MeasurementType: Any/all>, start=0, end=None, formatting=None, index=True, precision=4, timestamps=False, **kwargs)
Get summary data for all SubChannel objects in a Dataset that contain one or more types of sensor data. By using the optional start and end parameters, information can be retrieved for a specific interval of time.
The start and end times, if used, may be specified in several ways:
int/float (microseconds from the recording start)
str (formatted as a time from the recording start, e.g., MM:SS, HH:MM:SS, DDd HH:MM:SS). More examples:
":01" or ":1" or "1s" (1 second)
"22:11" (22 minutes, 11 seconds)
"3:22:11" (3 hours, 22 minutes, 11 seconds)
"1d 3:22:11" (1 day, 3 hours, 22 minutes, 11 seconds)
datetime.timedelta or pandas.Timedelta (time from the recording start)
datetime.datetime (an explicit UTC time)
- Parameters:
dataset (Union[idelib.dataset.Dataset, list]) – An idelib.dataset.Dataset or a list of channels/subchannels from which to build the table.
measurement_type – A MeasurementType, a measurement type ‘key’ string, or a string of multiple keys generated by adding and/or subtracting MeasurementType objects to filter the results. Any ‘subtracted’ types will be excluded.
start (Union[int, float, str, datetime.datetime, datetime.timedelta]) – The starting time. Defaults to the start of the recording.
end (Optional[int, float, str, datetime.datetime, datetime.timedelta]) – The ending time. Defaults to the end of the recording.
formatting (Optional[dict]) – A dictionary of additional style/formatting items (see pandas.DataFrame.style.format()). If False, no additional formatting is applied.
index (bool) – If True, show the index column on the left.
precision (int) – The default decimal precision to display. Can be changed later.
timestamps (bool) – If True, show the start and end as raw microsecond timestamps.
- Returns:
A table (pandas.io.formats.style.Styler) of summary data.
- Return type:
pandas.DataFrame
- endaq.ide.get_channels(dataset, measurement_type=<MeasurementType: Any/all>, subchannels=True)
Get a list of Channel or SubChannel instances from a Dataset by their measurement type(s).
- Parameters:
dataset – The Dataset from which to retrieve the list.
measurement_type – A MeasurementType, a measurement type ‘key’ string, or a string of multiple keys generated by adding and/or subtracting MeasurementType objects. Any ‘subtracted’ types will be excluded.
subchannels – If False, get only Channel objects. If True, get only SubChannel objects.
- Returns:
A list of matching Channel or SubChannel instances from the Dataset.
- endaq.ide.get_doc(name=None, filename=None, url=None, parsed=True, start=0, end=None, localfile=None, params=None, headers=None, cookies=None, timeout=60, **kwargs)
Retrieve an IDE file from either a file or URL.
Note: name, filename, and url are mutually exclusive arguments. One and only one must be specified. Attempting to supply more than one will generate an error.
Example usage:
get_doc("my_recording.ide")
get_doc("https://example.com/remote_recording.ide")
get_doc(filename="my_recording.ide")
get_doc(url="https://example.com/remote_recording.ide")
get_doc(filename="my_recording.ide", start="1:23")
The start and end times, if used, may be specified in several ways:
int/float (microseconds from the recording start)
str (formatted as a time from the recording start, e.g., MM:SS, HH:MM:SS, DDd HH:MM:SS). More examples:
":01" or ":1" or "1s" (1 second)
"22:11" (22 minutes, 11 seconds)
"3:22:11" (3 hours, 22 minutes, 11 seconds)
"1d 3:22:11" (1 day, 3 hours, 22 minutes, 11 seconds)
datetime.timedelta or pandas.Timedelta (time from the recording start)
datetime.datetime (an explicit UTC time)
- Parameters:
name – The name or URL of the IDE. The method of fetching it will be automatically chosen based on how it is formatted.
filename – The name of an IDE file. Supplying a name this way will force it to be read from a file, avoiding the possibility of accidentally trying to retrieve it via URL.
url – The URL of an IDE file. Supplying a name this way will force it to be read from a URL, avoiding the possibility of accidentally trying to retrieve it from a local file.
parsed – If True (default), the IDE will be fully parsed after it is fetched. If False, only the file metadata will be initially loaded, and a later call to idelib.importer.readData() will be required to load the data. This can save time.
start – The starting time. Defaults to the start of the recording. Only applicable if parsed is True.
end – The ending time. Defaults to the end of the recording. Only applicable if parsed is True.
localfile – The name of the file to which to write data received from a URL. If none is supplied, a temporary file will be used. Only applicable when opening a URL.
params – Additional URL request parameters. Only applicable when opening a URL.
headers – Additional URL request headers. Only applicable when opening a URL.
cookies – Additional browser cookies for use in the URL request. Only applicable when opening a URL.
timeout – Seconds to wait for a response to the URL request. Only applicable when opening a URL.
- Returns:
The fetched IDE data.
Additionally, get_doc() will accept the keyword arguments for idelib.importer.importFile() or idelib.importer.openFile().
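The rule that exactly one of name, filename, and url may be supplied, with name resolved automatically from its formatting, can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: resolve_source() is a hypothetical helper, and treating only http and https schemes as URLs is a guess at what "based on how it is formatted" means.

```python
from urllib.parse import urlparse


def resolve_source(name=None, filename=None, url=None):
    """Hypothetical helper: decide whether an IDE source is a local file or a URL."""
    supplied = [arg for arg in (name, filename, url) if arg is not None]
    if len(supplied) != 1:
        raise TypeError("Specify exactly one of name, filename, or url")
    if filename is not None:
        return ("file", filename)  # forced local read
    if url is not None:
        return ("url", url)  # forced remote read
    # Assumption: http(s) schemes mean a URL, anything else is a local path.
    scheme = urlparse(name).scheme.lower()
    return ("url", name) if scheme in ("http", "https") else ("file", name)


print(resolve_source("my_recording.ide"))
print(resolve_source("https://example.com/remote_recording.ide"))
print(resolve_source(filename="my_recording.ide"))
```

Checking the scheme against an explicit allow-list avoids misreading a Windows drive letter such as C: as a URL scheme.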
- endaq.ide.get_measurement_type(channel)
Get the appropriate MeasurementType object for a given SubChannel. Calling with a Channel returns a list of MeasurementType objects, with one for each child SubChannel.
- Parameters:
channel – A Channel or SubChannel instance (e.g., from a Dataset).
- Returns:
A MeasurementType object (for a SubChannel), or a list of MeasurementType objects (one for each child) if a Channel was supplied.
- endaq.ide.get_primary_sensor_data(name='', doc=None, measurement_type=<MeasurementType: Any/all>, criteria='samples', time_mode='datetime', tz='utc', least=False, force_data_return=False)
Get the data from the primary sensor in a given .ide file using to_pandas().
- Parameters:
name (str) – The file location to pull the data from, see get_doc() for more. This can be a local file location or a URL.
doc (idelib.dataset.Dataset) – An open Dataset object, see get_doc() for more. If one is provided, name will not be used to load a new one.
measurement_type (Union[str, MeasurementType]) – The sensor type to return data from, see measurement for more. The default is "any", but to get the primary accelerometer, for example, set this to "accel".
criteria (Literal[('samples', 'rate', 'duration')]) – How to determine the "primary" sensor using the summary information provided by get_channel_table():
"samples" - the number of samples (default behavior)
"rate" - the sampling rate in Hz
"duration" - the duration from the start to the end of data from that sensor
time_mode (Literal[('seconds', 'timedelta', 'datetime')]) – How to temporally index samples; each mode uses either relative times (with respect to the start of the recording) or absolute times (i.e., date-times):
"seconds" - a pandas.Float64Index of relative timestamps, in seconds
"timedelta" - a pandas.TimedeltaIndex of relative timestamps
"datetime" - a pandas.DatetimeIndex of absolute timestamps
tz (Union[pytz.timezone, dateutil.tz.tzfile, datetime.tzinfo, Literal[('device', 'local', 'utc')]]) – Optional time zone information for displaying the "datetime" time mode. It can be a standard time zone object (pytz.timezone, dateutil.tz.tzfile, datetime.tzinfo) or one of three special strings:
"utc" - standard UTC time (default).
"local" - the current computer’s local time zone (note: may not be the user’s actual time zone when used on enDAQ Cloud).
"device" - the time zone specified by the original recording device’s configured UTC offset.
least (bool) – If set to True it will return the channels ranked lowest by the given criteria.
force_data_return (bool) – If set to True and the specified measurement_type is not included in the file, it will return the data from any sensor instead of raising an error, which is the default behavior.
- Returns:
a pandas.DataFrame containing the sensor’s data
- Return type:
pd.DataFrame
Here are some examples:
# Get sensor with the most samples, may return data of mixed units
data = get_primary_sensor_data('https://info.endaq.com/hubfs/data/All-Channels.ide')
# Instead get just the primary accelerometer data defined by number of samples
accel = get_primary_sensor_data('https://info.endaq.com/hubfs/data/All-Channels.ide', measurement_type='accel')
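How the criteria and least parameters interact can be shown with a small standalone sketch of the ranking step. It assumes the summary information is available as one row per sensor with "samples", "rate", and "duration" values; pick_primary() and the sample rows are hypothetical and only approximate how a "primary" sensor might be chosen.

```python
def pick_primary(summary_rows, criteria="samples", least=False):
    """Hypothetical helper: choose the row ranked highest (or lowest) by a criterion."""
    if criteria not in ("samples", "rate", "duration"):
        raise ValueError(f"Unknown criteria: {criteria!r}")
    if not summary_rows:
        raise ValueError("No sensors to choose from")
    chooser = min if least else max
    return chooser(summary_rows, key=lambda row: row[criteria])


# Made-up summary rows standing in for a channel table.
rows = [
    {"name": "Acceleration", "samples": 50000, "rate": 5000.0, "duration": 10.0},
    {"name": "Pressure", "samples": 120, "rate": 10.0, "duration": 12.0},
    {"name": "Temperature", "samples": 12, "rate": 1.0, "duration": 11.5},
]

print(pick_primary(rows)["name"])
print(pick_primary(rows, criteria="duration")["name"])
print(pick_primary(rows, criteria="rate", least=True)["name"])
```

With these made-up rows, the default criterion favors the high-rate accelerometer, while "duration" favors the slower sensor that recorded for longer, which is why the choice of criteria can change which data is returned.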
- endaq.ide.to_pandas(channel, time_mode='datetime', tz='utc')
Read IDE data into a pandas DataFrame.
- Parameters:
channel (Union[idelib.dataset.Channel, idelib.dataset.SubChannel]) – a Channel object, as produced from Dataset.channels or endaq.ide.get_channels()
time_mode (Literal[('seconds', 'timedelta', 'datetime')]) – How to temporally index samples; each mode uses either relative times (with respect to the start of the recording) or absolute times (i.e., date-times):
"seconds" - a pandas.Float64Index of relative timestamps, in seconds
"timedelta" - a pandas.TimedeltaIndex of relative timestamps
"datetime" - a pandas.DatetimeIndex of absolute timestamps
tz (Union[pytz.timezone, dateutil.tz.tzfile, datetime.tzinfo, Literal[('device', 'local', 'utc')]]) – Optional time zone information for displaying the "datetime" time mode. It can be a standard time zone object (pytz.timezone, dateutil.tz.tzfile, datetime.tzinfo) or one of three special strings:
"utc" - standard UTC time (default).
"local" - the current computer’s local time zone (note: may not be the user’s actual time zone when used on enDAQ Cloud).
"device" - the time zone specified by the original recording device’s configured UTC offset.
- Returns:
a pandas.DataFrame containing the channel’s data
- Return type:
pd.DataFrame
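The three time_mode options differ only in how the same microsecond timestamps are expressed. The sketch below illustrates that relationship with the standard library alone; index_times() and its start_utc argument are hypothetical, it returns plain lists rather than pandas index objects, and the fixed UTC offset merely stands in for a recording device's configured offset.

```python
import datetime


def index_times(timestamps_us, time_mode="datetime", start_utc=None,
                tz=datetime.timezone.utc):
    """Hypothetical helper: express microsecond offsets in one of three time modes."""
    if time_mode == "seconds":
        # Relative times as floating-point seconds.
        return [t / 1_000_000 for t in timestamps_us]
    deltas = [datetime.timedelta(microseconds=t) for t in timestamps_us]
    if time_mode == "timedelta":
        # Relative times as timedelta objects.
        return deltas
    if time_mode == "datetime":
        # Absolute times need the recording's (timezone-aware) UTC start.
        if start_utc is None:
            raise ValueError("start_utc is required for the 'datetime' mode")
        return [(start_utc + d).astimezone(tz) for d in deltas]
    raise ValueError(f"Unknown time_mode: {time_mode!r}")


start = datetime.datetime(2024, 1, 1, 12, 0, 0, tzinfo=datetime.timezone.utc)
device_tz = datetime.timezone(datetime.timedelta(hours=-5))  # assumed device offset

print(index_times([0, 500_000], "seconds"))
print(index_times([1_500_000], "timedelta"))
print(index_times([1_000_000], "datetime", start, device_tz))
```

Changing tz alters only how the absolute times are displayed; the instants themselves stay the same.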