An HTTP API to query timeseries data.
Like the V1 Stream API, the V2 API can be used to query timeseries which are derived from data uploaded to the Rune Platform. However, V2 introduces new concepts that streamline how data is organized, stored, and accessed. While the data from both APIs is derived from the same uploaded datasets, the underlying storage is entirely independent.
Patient: The person about whom data was measured. This may include a StrivePD user, a person with a neural implant, etc. Each patient has a unique identifier ("patient ID"). Patient IDs are generated by Rune and cannot be changed.
Device: The sensor serving as the data source (e.g. a DBS implant, iOS application, wearable device, etc.). A patient may have multiple devices; each one has a unique identifier ("device ID"). Device IDs are generated by Rune and cannot be changed. They do not correspond to any identifier or information from the device itself.
Stream: A single timeseries, defined by a stream type and a unique combination of parameters (key/value pairs). Each stream has a unique identifier ("stream ID").
To access data from this API, you must know which stream IDs you're interested in querying. Use Rune's GraphQL API to discover the stream IDs that are available to you.
Stream type: A categorization of streams, both semantic and structural. Roughly speaking, it represents the physical quantity being measured (voltage, acceleration, etc.), including the unit of measurement. It also describes the shape of the stream's data, as one or more dimensions. Think of a dimension as a column in a table, where each value in the timeseries is a row. Stream type definitions also include human-readable labels and descriptions.
Algorithm: A label associated with each stream, which describes how data was processed to produce the stream values (including how it is parameterized). Each algorithm label is formatted as algorithm-name.version (e.g. ingest-strive-applewatch-motion.0). The term "algorithm" does not necessarily imply meaningful data manipulation or calculations (though it can). For example, the ingest-strive-applewatch-motion algorithm simply ingests the raw accelerometry data that is recorded on an Apple Watch and uploaded by the Strive iOS application.
Parameters: Key/value pairs that are used to label streams. All streams are labeled with the following:

patient_id: the anonymized ID of the person whose data this is.

device_id: the device from which the stream was recorded.

algorithm: describes how data was processed to produce the stream (see the "Algorithm" definition).

Streams may also have any number of parameters, with other key/value pairs. By convention, all of our streams include the following parameters:

category: a broad categorization of the data type (neural, vital_sign, etc.).

measurement: a specific label for what is being measured (heart_rate, step_count, etc.).

For details about how streams are parameterized, refer to the documentation for a particular algorithm.
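To make the labeling concrete, here is a sketch of the labels and parameters a single stream might carry. The values are illustrative only; the category and measurement shown here are assumptions, not taken from a real stream:

stream_labels = {
    "patient_id": "patient-4kqrjjfw",   # hypothetical anonymized patient ID
    "device_id": "device-8f2k1c",       # hypothetical device ID
    "algorithm": "ingest-strive-applewatch-motion.0",
    "category": "motion",               # hypothetical category value
    "measurement": "acceleration",      # hypothetical measurement value
}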
Authentication is required for all API requests. The V2 API uses the same authentication mechanisms as V1: see the V1 API documentation for details.
Standard HTTP status codes are used to indicate success or failure. For most errors, the response body is JSON, with the following format:
{
  "success": false,
  "error": {
    "message": "Human-readable error description",
    "type": "EnumeratedErrorKind"
  }
}
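As a minimal sketch of handling this envelope (assuming the Python requests library; the auth headers are placeholders to be filled in per the V1 API documentation):

import requests

AUTH_HEADERS = {}  # fill in credentials per the V1 API documentation

def get_or_raise(url: str, params: dict) -> requests.Response:
    """Raise a descriptive error when the V2 API reports a failure."""
    resp = requests.get(url, params=params, headers=AUTH_HEADERS)
    if not resp.ok:
        try:
            # Most errors return the JSON envelope shown above.
            err = resp.json()["error"]
            raise RuntimeError(f'{err["type"]}: {err["message"]}')
        except (ValueError, KeyError):
            resp.raise_for_status()  # non-JSON error body
    return resp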
There is a limit on the amount of data that is returned in a response. This page size is determined by the server. If the queried time range contains more than one page’s worth of data, you must make multiple API requests to fetch the complete result.
If an API response includes an X-Rune-Next-Page-Token header, make another request to fetch the next page of data. In the next request, provide the value of this header in the page_token query parameter. Keep the rest of the query parameters identical. When the end of pagination has been reached, the X-Rune-Next-Page-Token header field will not exist in the response.
Each pagination token is valid for 5 minutes.
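For example, a client loop might follow these rules like so (a sketch assuming the Python requests library; the URL and auth headers are placeholders rather than part of this API's documented surface):

import requests

AUTH_HEADERS = {}  # fill in credentials per the V1 API documentation

def fetch_all_pages(url: str, params: dict) -> list[str]:
    """Collect every page of a paginated V2 API response."""
    pages = []
    while True:
        resp = requests.get(url, params=params, headers=AUTH_HEADERS)
        resp.raise_for_status()
        pages.append(resp.text)
        token = resp.headers.get("X-Rune-Next-Page-Token")
        if token is None:
            break  # header absent: end of pagination
        # Keep every other query parameter identical; each token expires after 5 minutes.
        params = {**params, "page_token": token}
    return pages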
A successful request returns data from a single stream. The response format is determined largely by the stream type of the data: CSV columns (or JSON keys) are named for the dimensions of the corresponding stream type.
There is always a column/key that contains the duration of each measurement in the timeseries (in nanoseconds): "measurement_duration_ns". The duration value is extracted from the device data, with the following order of precedence:

1. end_time - start_time
2. 1 / sample rate
3. 0 (which indicates that it is unknown)
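A sketch of that precedence in plain Python, with hypothetical field names standing in for whatever the device payload actually provides:

def measurement_duration_ns(sample: dict) -> int:
    """Resolve a measurement's duration using the order of precedence above."""
    # 1. Prefer an explicit start/end pair (hypothetical field names).
    if "start_time_ns" in sample and "end_time_ns" in sample:
        return sample["end_time_ns"] - sample["start_time_ns"]
    # 2. Otherwise derive the duration from the sample rate.
    if sample.get("sample_rate_hz"):
        return int(1_000_000_000 / sample["sample_rate_hz"])
    # 3. Fall back to 0, meaning the duration is unknown.
    return 0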
Query parameters:

stream_id (string, required): Unique ID for a stream. Example: a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571

start_time (float): Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (returned data points will have time >= start_time). It is invalid to specify both start_time and start_time_ns.

start_time_ns (int): The same as start_time, but in nanoseconds. It is invalid to specify both start_time and start_time_ns.

end_time (float): Unix timestamp (in seconds, with optional fraction) of the end of the time range to query. This is an exclusive boundary (returned data points will have time < end_time). It is invalid to specify both end_time and end_time_ns.

end_time_ns (int): The same as end_time, but in nanoseconds. It is invalid to specify both end_time and end_time_ns.

format (string; default "csv"; enum "json", "csv"): Determines the content type of the response.

limit (int): The maximum number of timestamps to return, across all pages of the response.

page_token (string <byte>): A token provided in the request to obtain the subsequent page of records in a dataset. The value is obtained from the X-Rune-Next-Page-Token header of the previous response.

timestamp (string; default "unix"; enum "unix", "unixns", "iso"): Determines how timestamps are formatted in the response: unix (Unix seconds, with optional fraction), unixns (Unix nanoseconds), or iso (an ISO 8601 string).

timezone (integer; default 0; example: timezone=-28800): The timezone offset, in seconds, used to calculate string-based timestamp formats such as iso.

timezone_name (string; default ""; example: timezone_name=America/Los_Angeles): The name from the IANA timezone database used to calculate string-based timestamp formats such as iso.

translate_enums (string; default false; enum true, false): When fetching a stream that includes an enum dimension, determines whether the enum values are translated into their human-readable labels. All possible enum values for a given stream can be queried from the GraphQL API, as part of the stream metadata.
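As a sketch of a query against this endpoint (using the Python requests library; the host and path below are placeholders, so substitute the actual V2 endpoint for your environment):

import requests

AUTH_HEADERS = {}  # credentials per the V1 API documentation
BASE_URL = "https://example.runelabs.io"  # placeholder host
STREAM_ID = "a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571"

resp = requests.get(
    f"{BASE_URL}/v2/streams/{STREAM_ID}",  # placeholder path
    headers=AUTH_HEADERS,
    params={
        "start_time": 1648412619,
        "end_time": 1648412624,
        "format": "csv",
        "timestamp": "unix",
    },
)
resp.raise_for_status()
print(resp.text)  # a CSV body like the sample below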
Example response (CSV):

time,acceleration,measurement_duration_ns
1648412619.2589417,-0.00386,20000000
1648412620.2600197,-0.00816,20000000
1648412621.2671237,-0.01420,20000000
1648412622.2715627,-0.00312,20000000
1648412623.2800017,-0.00118,20000000
Returns a boolean timeseries, indicating whether stream data exists in a given window of time. The resolution parameter determines the granularity of this timeseries (i.e. the interval between each timestamp in the series).

The timeseries is aligned such that each timestamp (as Unix seconds) is evenly divisible by the resolution. If the requested start_time is not evenly divisible by the resolution, the query range is extended to the nearest preceding aligned time. If the end_time is not evenly divisible by the resolution, the query range is extended to the nearest following aligned time. This ensures that the returned timestamps are evenly spaced and that each one represents a duration of time equal to the resolution (rather than getting truncated).
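A minimal sketch of that alignment arithmetic:

def align_range(start_time: int, end_time: int, resolution: int) -> tuple[int, int]:
    """Extend [start_time, end_time) so both ends are multiples of resolution."""
    aligned_start = (start_time // resolution) * resolution  # nearest preceding
    aligned_end = -(-end_time // resolution) * resolution    # nearest following
    return aligned_start, aligned_end

# e.g. align_range(1010, 1990, 300) -> (900, 2100)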
Query parameters:

stream_id (string, required): Unique ID for a stream. Example: a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571

start_time (float, required): Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (returned timestamps will be >= start_time).

end_time (float, required): Unix timestamp (in seconds, with optional fraction) of the end of the time range to query. This is an exclusive boundary (returned timestamps will be < end_time).

resolution (integer [300 .. 86400], required): Interval between returned timestamps, in seconds.

format (string; default "csv"; enum "json", "csv"): Determines the content type of the response.

limit (int): The maximum number of timestamps to return, across all pages of the response.

page_token (string <byte>): A token provided in the request to obtain the subsequent page of records in a dataset. The value is obtained from the X-Rune-Next-Page-Token header of the previous response.

partition_size (integer): If set, break up result data for each stream into subsets of this size. Only applicable for JSON-formatted responses (format=json).

timestamp (string; default "unix"; enum "unix", "unixns", "iso"): Determines how timestamps are formatted in the response: unix (Unix seconds, with optional fraction), unixns (Unix nanoseconds), or iso (an ISO 8601 string).

timezone (integer; default 0; example: timezone=-28800): The timezone offset, in seconds, used to calculate string-based timestamp formats such as iso.

timezone_name (string; default ""; example: timezone_name=America/Los_Angeles): The name from the IANA timezone database used to calculate string-based timestamp formats such as iso.
{- "approx_available_duration_s": 900,
- "cardinality": 5,
- "data": {
- "time": [
- 1648412600,
- 1648412900,
- 1648413200,
- 1648413500,
- 1648413800
], - "availability": [
- 1,
- 1,
- 0,
- 0,
- 1
]
}
}
Returns an average aggregated day divided into intervals. This endpoint is useful for powering visualizations that highlight daily patterns based on time of day.
Daily aggregates are only supported for stream types with two dimensions (i.e. a timestamp and a numeric value).
To calculate a daily aggregate, all data points within n_days
from start_time
are considered.
Each data point is grouped together into intervals by rounding down the time portion of the
timestamp to the nearest resolution
seconds. Then, the mean is calculated from the data points
in each interval to arrive at an aggregate value.
If no data points appeared within an interval for any of the days being aggregated, a null
is returned for the value.
Consider the following example data points across two days, imagining that we want to take a daily aggregate for two days with a resolution of six hours:

1. 2022-06-01T07:26:00Z: 5.0 (in the second 6-hour window of day 1)
2. 2022-06-01T14:39:00Z: 47.3 (in the third 6-hour window of day 1)
3. 2022-06-02T08:01:00Z: 10.0 (in the second 6-hour window of day 2)
4. 2022-06-02T20:42:00Z: 2.7 (in the fourth 6-hour window of day 2)

Note that there are no data points anywhere in the first 6-hour window of either day.

If we query for two days from midnight UTC on June 1st (start_time=1654041600) with a 6-hour resolution (resolution=21600) across two days (n_days=2), we will receive a response with the following corresponding items in the arrays under the "offset" and "values" keys:

"00:00": null (since no data points fell within this interval)
"06:00": 7.5 (the mean of data points (1) and (3) above, which both fell in this interval)
"12:00": 47.3 (the mean of the single data point (2) above that fell in this interval)
"18:00": 2.7 (the mean of the single data point (4) above that fell in this interval)

The array under the "n_days_with_data" key also contains one item corresponding to each item in the "offset" array, giving the number of days in which any data points fell within that interval. Given the same example data points, the "n_days_with_data" array would contain the following counts (again listed with the corresponding offset):

"00:00": 0 (since no data points fell in this interval)
"06:00": 2 (since two data points, (1) and (3) above, fell in this interval)
"12:00": 1 (since a single data point, (2) above, fell in this interval)
"18:00": 1 (since a single data point, (4) above, fell in this interval)

The endpoint also returns a number of summary statistics about the data points under the top-level "summary" key. The meanings of these fields are described below.
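A sketch of that grouping in plain Python (not the service's actual implementation), which can be used to check the worked example above:

from collections import defaultdict

def daily_aggregate(points, start_time, n_days, resolution):
    """Group (unix_ts, value) points into time-of-day intervals and average them.

    points: iterable of (timestamp, value) pairs; days begin at start_time.
    """
    values = defaultdict(list)   # interval offset (s) -> values in that interval
    days = defaultdict(set)      # interval offset (s) -> day indices with data
    for ts, value in points:
        if not (start_time <= ts < start_time + n_days * 86400):
            continue  # outside the n_days window
        elapsed = int(ts - start_time)
        offset = (elapsed % 86400) // resolution * resolution  # round down
        values[offset].append(value)
        days[offset].add(elapsed // 86400)
    offsets = range(0, 86400, resolution)
    return {
        "offset": [f"{o // 3600:02d}:{o % 3600 // 60:02d}" for o in offsets],
        "values": [sum(values[o]) / len(values[o]) if values[o] else None
                   for o in offsets],
        "n_days_with_data": [len(days[o]) for o in offsets],
    }

Running this on the four example points with start_time=1654041600, n_days=2, and resolution=21600 reproduces the offsets, values, and counts listed above.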
Query parameters:

stream_id (string, required): Unique ID for a stream. Example: a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571

start_time (float, required): Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (data points considered in calculating averages will be >= start_time).

resolution (int [60 .. 86400], required): Length of time (in seconds) for the window used to calculate intervals. Must evenly divide into a 24 hour period.

n_days (int [1 .. 14], required): Number of days across which to compute the daily aggregate. Each day is a 24 hour period beginning from the start_time.
{- "approx_available_duration_s": 900,
- "data": {
- "offset": [
- "03:00",
- "06:00",
- "09:00",
- "12:00",
- "15:00",
- "18:00"
], - "values": [
- 42.1,
- 0.123,
- 47.0001,
- null,
- 943,
- 0
], - "n_days_with_data": [
- 14,
- 8,
- 10,
- 0,
- 7,
- 1
]
}, - "cardinality": 6,
- "summary": {
- "n_days_with_data_total": 14,
- "duration_mean_per_day": 900000000.5,
- "duration_min_per_day": 300000000,
- "duration_max_per_day": 1800000000,
- "value_mean": 655.5,
- "value_min": 2.321,
- "value_max": 1001.1,
- "value_med": 303.8,
- "value_std": 1230.123
}
}
Downsamples data into windows of resolution seconds and applies the specified aggregate_function to the data points that fall within each window.

Downsampled data is returned as a "time" array containing timestamps representing the first timestamp (inclusive) of each window, and a corresponding "aggregate_values" array where each item represents the aggregated value of all values that fell within that window.

Corresponding values in the "duration_sum" array represent the sum (in nanoseconds) of the durations of all of the data points in the window. This can be used to determine the amount of data that was used to calculate the aggregate value (for example, so that the client can decide not to display data points from days with less than 6 hours of data).

The response also contains summary statistics of the values in "aggregate_values".
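For instance, a client could use "duration_sum" to drop under-sampled windows (a sketch over the parsed JSON body, like the sample shown after the parameters below):

SIX_HOURS_NS = 6 * 3600 * 10**9

def well_sampled(body: dict) -> list[tuple[int, float]]:
    """Keep only (time, aggregate_value) pairs backed by >= 6 hours of data."""
    data = body["data"]
    return [
        (t, v)
        for t, v, dur in zip(
            data["time"], data["aggregate_values"], data["duration_sum"]
        )
        if dur >= SIX_HOURS_NS and v is not None
    ]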
Query parameters:

stream_id (string, required): Unique ID for a stream. Example: a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571

start_time (any, required): Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (data points considered will be >= start_time). The value will be automatically truncated to be evenly divisible by resolution.

end_time (float, required): Unix timestamp (in seconds, with optional fraction) of the end of the time range to query. This is an exclusive boundary (data points considered will be < end_time). The value will be automatically padded to be evenly divisible by resolution.

resolution (int [60 .. 604800], required): Length of time (in seconds) for the window to aggregate over. Must evenly divide into a 24 hour period.

aggregate_function (string, required; enum "sum", "mean"): Determines the aggregate function to apply to the values of data points within each window.

timestamp (string; default "unix"; enum "unix", "unixns", "iso"): Determines how timestamps are formatted in the response: unix (Unix seconds, with optional fraction), unixns (Unix nanoseconds), or iso (an ISO 8601 string).

timezone (integer; default 0; example: timezone=-28800): The timezone offset, in seconds, used to calculate string-based timestamp formats such as iso.

timezone_name (string; default ""; example: timezone_name=America/Los_Angeles): The name from the IANA timezone database used to calculate string-based timestamp formats such as iso.
{- "cardinality": 5,
- "data": {
- "time": [
- 1661990400,
- 1662076800,
- 1662163200,
- 1662249600,
- 1662336000
], - "aggregate_values": [
- 1.11,
- null,
- 30,
- 4.5,
- 5
], - "duration_sum": [
- 3600000000000,
- 0,
- 60000000000,
- 10800000000000,
- 7200000000000
]
}, - "summary": {
- "value_mean": 10.1525,
- "value_min": 1.11,
- "value_max": 30,
- "value_med": 4.75,
- "value_std": 13.34402581682155
}
}
Returns a boolean timeseries, indicating whether data exists in a given window of time for a batch of streams. The resolution parameter determines the granularity of this timeseries (i.e. the interval between each timestamp in the series). The batch_operation parameter determines whether data is considered available when data exists for all streams, or when data exists for any of the streams.

The timeseries is aligned such that each timestamp (as Unix seconds) is evenly divisible by the resolution. If the requested start_time is not evenly divisible by the resolution, the query range is extended to the nearest preceding aligned time. If the end_time is not evenly divisible by the resolution, the query range is extended to the nearest following aligned time. This ensures that the returned timestamps are evenly spaced and that each one represents a duration of time equal to the resolution (rather than getting truncated).
Query parameters:

stream_id (Array of strings, required): Multiple unique IDs for streams. Example: stream_id=a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571&stream_id=37319fc03b749037aec18a1ef0a12a7f31bc300e22dee9928990b98c274cd528

batch_operation (string, required; enum "any", "all"): Determines what type of batch calculation will determine availability for the batch of streams. If "any", a window is considered available when data exists for any of the streams; if "all", a window is considered available only when data exists for all of the streams.

start_time (float, required): Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (returned timestamps will be >= start_time).

end_time (float, required): Unix timestamp (in seconds, with optional fraction) of the end of the time range to query. This is an exclusive boundary (returned timestamps will be < end_time).

timezone_name (string; default ""; example: timezone_name=America/Los_Angeles): The name from the IANA timezone database used to calculate string-based timestamp formats such as iso.

resolution (integer [300 .. 86400], required): Interval between returned timestamps, in seconds.

format (string; default "csv"; enum "json", "csv"): Determines the content type of the response.

limit (int): The maximum number of timestamps to return, across all pages of the response.

page_token (string <byte>): A token provided in the request to obtain the subsequent page of records in a dataset. The value is obtained from the X-Rune-Next-Page-Token header of the previous response.

partition_size (integer): If set, break up result data for each stream into subsets of this size. Only applicable for JSON-formatted responses (format=json).

timestamp (string; default "unix"; enum "unix", "unixns", "iso"): Determines how timestamps are formatted in the response: unix (Unix seconds, with optional fraction), unixns (Unix nanoseconds), or iso (an ISO 8601 string).

timezone (integer; default 0; example: timezone=-28800): The timezone offset, in seconds, used to calculate string-based timestamp formats such as iso.
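A sketch of a batch query with repeated stream_id parameters (using the Python requests library; the host and path are placeholders, and requests encodes a list value as repeated stream_id=... query parameters):

import requests

AUTH_HEADERS = {}  # credentials per the V1 API documentation
BASE_URL = "https://example.runelabs.io"  # placeholder host

resp = requests.get(
    f"{BASE_URL}/v2/batch/availability",  # placeholder path
    headers=AUTH_HEADERS,
    params={
        "stream_id": [
            "a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571",
            "37319fc03b749037aec18a1ef0a12a7f31bc300e22dee9928990b98c274cd528",
        ],
        "batch_operation": "any",
        "start_time": 1648412600,
        "end_time": 1648414100,
        "resolution": 300,
        "format": "json",
    },
)
resp.raise_for_status()
print(resp.json())  # a JSON body like the sample below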
{- "approx_available_duration_s": 900,
- "cardinality": 5,
- "data": {
- "time": [
- 1648412600,
- 1648412900,
- 1648413200,
- 1648413500,
- 1648413800
], - "availability": [
- 1,
- 1,
- 0,
- 0,
- 1
]
}
}