Minimal Working Example
This is a quick guide for creating and running a simple pipeline that extracts missed, outgoing, and incoming call features for 24-hour (00:00:00 to 23:59:59) and night (00:00:00 to 05:59:59) time segments, for every day of data from one participant who was monitored on the US East Coast with an Android smartphone.
- Install RAPIDS and make sure your conda environment is active (see Installation).
- Download this CSV file and save it as data/external/aware_csv/calls.csv.
- Make the changes listed below for the corresponding Configuration step (we provide an example of what the relevant sections in your config.yaml will look like after you are done).

Required configuration changes
- Supported data streams.
  Based on the docs, we decided to use the aware_csv data stream because we are processing AWARE data saved in a CSV file. We will use this label in a later step; there’s no need to type it or save it anywhere yet.
- Create your participants file.
  Since we are processing data from a single participant, you only need to create a single participant file called p01.yaml in data/external/participant_files. This participant file only has a PHONE section because this hypothetical participant was only monitored with a smartphone. Note that for a real analysis, you can do this automatically with a CSV file.
  - Add p01 to [PIDS] in config.yaml.
  - Create a file in data/external/participant_files/p01.yaml with the following content:

        PHONE:
          DEVICE_IDS: [a748ee1a-1d0b-4ae9-9074-279a2b6ba524] # the participant's AWARE device id
          PLATFORMS: [android] # or ios
          LABEL: MyTestP01 # any string
          START_DATE: 2020-01-01 # this can also be empty
          END_DATE: 2021-01-01 # this can also be empty

- Select what time segments you want to extract features on.
  - Set [TIME_SEGMENTS][FILE] to data/external/timesegments_periodic.csv.
  - Create a file in data/external/timesegments_periodic.csv with the following content:

        label,start_time,length,repeats_on,repeats_value
        daily,00:00:00,23H 59M 59S,every_day,0
        night,00:00:00,5H 59M 59S,every_day,0

- Choose the timezone of your study.
  We will use the default time zone settings since this example is processing data collected on the US East Coast (America/New_York):

      TIMEZONE:
        TYPE: SINGLE
        SINGLE:
          TZCODE: America/New_York

- Modify your device data stream configuration.
  - Set [PHONE_DATA_STREAMS][USE] to aware_csv.
  - We will use the default value for [PHONE_DATA_STREAMS][aware_csv][FOLDER] since we already stored the test calls CSV file there.
- Select what sensors and features you want to process.
  - Set [PHONE_CALLS][CONTAINER] to calls.csv in the config.yaml file.
  - Set [PHONE_CALLS][PROVIDERS][RAPIDS][COMPUTE] to True in the config.yaml file.
- Example of the config.yaml sections after the changes outlined above.
  This will be your config.yaml after following the instructions above.

      PIDS: [p01]

      TIMEZONE:
        TYPE: SINGLE
        SINGLE:
          TZCODE: America/New_York

      # ... other irrelevant sections

      TIME_SEGMENTS: &time_segments
        TYPE: PERIODIC
        FILE: "data/external/timesegments_periodic.csv"
        INCLUDE_PAST_PERIODIC_SEGMENTS: FALSE

      PHONE_DATA_STREAMS:
        USE: aware_csv
        aware_csv:
          FOLDER: data/external/aware_csv

      # ... other irrelevant sections

      ############## PHONE ###########################################################
      ################################################################################

      # ... other irrelevant sections

      # Communication call features config, TYPES and FEATURES keys need to match
      PHONE_CALLS:
        CONTAINER: calls.csv
        PROVIDERS:
          RAPIDS:
            COMPUTE: True
            CALL_TYPES: ...

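As a quick sanity check on the time segments file from the configuration steps, the sketch below adds each segment's length to its start time and prints the resulting window. It is an illustration only, not part of RAPIDS: `parse_length` and `segment_bounds` are hypothetical helpers, and the parser only understands the `H`, `M`, and `S` units used in this guide.

```python
import csv
import io
import re
from datetime import datetime, timedelta

# Same content as data/external/timesegments_periodic.csv in this guide.
SEGMENTS_CSV = """label,start_time,length,repeats_on,repeats_value
daily,00:00:00,23H 59M 59S,every_day,0
night,00:00:00,5H 59M 59S,every_day,0
"""


def parse_length(text):
    """Turn a length such as '23H 59M 59S' into a timedelta (hypothetical helper)."""
    units = {"H": "hours", "M": "minutes", "S": "seconds"}
    parts = {units[unit]: int(number)
             for number, unit in re.findall(r"(\d+)\s*([HMS])", text)}
    return timedelta(**parts)


def segment_bounds(csv_text):
    """Return {label: (start, end)} with both times formatted as HH:MM:SS."""
    bounds = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = datetime.strptime(row["start_time"], "%H:%M:%S")
        end = start + parse_length(row["length"])
        bounds[row["label"]] = (start.strftime("%H:%M:%S"),
                                end.strftime("%H:%M:%S"))
    return bounds


if __name__ == "__main__":
    for label, (start, end) in segment_bounds(SEGMENTS_CSV).items():
        print(f"{label}: {start} to {end}")
```

With the two rows from this guide, the daily segment should span 00:00:00 to 23:59:59 and the night segment 00:00:00 to 05:59:59, matching the windows described in the introduction.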
- Run RAPIDS:

      ./rapids -j1

- The call features for the daily and night time segments will be in data/processed/features/all_participants/all_sensor_features.csv.
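Once the pipeline has finished, you may want to look at just the call features in the output file. The following is a minimal sketch using only the Python standard library. The column names are assumptions rather than something this guide states: it assumes that call feature columns start with `phone_calls_` and that the time segment label is stored in a `local_segment_label` column, so adjust both to match the header of your file. `call_features` is a hypothetical helper.

```python
import csv
from pathlib import Path

FEATURES_FILE = Path(
    "data/processed/features/all_participants/all_sensor_features.csv"
)


def call_features(path, prefix="phone_calls_", label_column="local_segment_label"):
    """Return a list of (segment label, {feature: value}) pairs, one per row.

    Both `prefix` and `label_column` are assumed names; check your CSV header.
    """
    rows = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            features = {name: value for name, value in row.items()
                        if name.startswith(prefix)}
            rows.append((row.get(label_column, ""), features))
    return rows


if FEATURES_FILE.exists():
    for label, features in call_features(FEATURES_FILE):
        print(label, features)
else:
    print(f"{FEATURES_FILE} not found; run ./rapids -j1 first")
```

Values are returned as strings exactly as they appear in the CSV; convert them to numbers before doing any arithmetic.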