Rune Labs V2 Stream API

An HTTP API to query timeseries data.

Overview

V1 vs V2

Like the V1 Stream API, the V2 API can be used to query timeseries which are derived from data uploaded to the Rune Platform. However, V2 introduces new concepts that streamline how data is organized, stored, and accessed. While the data from both APIs is derived from the same uploaded datasets, the underlying storage is entirely independent.

Definitions

Patient

The person about whom data was measured. This may include a StrivePD user, a person with a neural implant, etc. Each patient has a unique identifier ("patient ID"). Patient IDs are generated by Rune and cannot be changed.

Device

The sensor serving as the data source (e.g. a DBS implant, iOS application, wearable device, etc.). A patient may have multiple devices: each one has a unique identifier ("device ID"). Device IDs are generated by Rune and cannot be changed. They do not correspond to any identifier or information from the device itself.

Stream

A single timeseries, defined by a stream type and a unique combination of parameters (key/value pairs). Each stream has a unique identifier ("stream ID").

To access data from this API, you must know which stream IDs you're interested in querying. Use Rune's GraphQL API to discover the stream IDs that are available to you.

Stream Type

A categorization of streams, both semantically and structurally. Roughly speaking, it represents the physical quantity being measured (voltage, acceleration, etc), including the unit of measurement. It also describes the shape of the stream’s data, as one or more dimensions. Think of a dimension as a column in a table, where each value in the timeseries is a row. Stream type definitions also include human-readable labels and descriptions.

Algorithm

A label associated with each stream, which describes how data was processed to produce the stream values (including how it is parameterized). Each algorithm label is formatted as algorithm-name.version (e.g. ingest-strive-applewatch-motion.0).

The term "algorithm" does not necessarily imply meaningful data manipulation or calculations (though it can). For example, the ingest-strive-applewatch-motion algorithm simply ingests the raw accelerometry data that is recorded on an Apple Watch and uploaded by the Strive iOS application.

Stream Parameters

Key/value pairs that are used to label streams.

All streams are labeled with the following:

  • patient_id — the anonymized ID of the person whose data this is.
  • device_id — the device from which the stream was recorded.
  • algorithm — describes how data was processed to produce the stream (see "Algorithm" definition).

Streams may also have any number of additional parameters (other key/value pairs). By convention, all of our streams include the following parameters:

  • category – a broad categorization of the data type.
    • E.g. neural, vital_sign, etc
  • measurement – a specific label for what is being measured
    • E.g. heart_rate, step_count, etc

For details about how streams are parameterized, refer to the documentation for a particular algorithm.
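
For illustration, a single stream's labels and parameters might look like the following. This is a hypothetical example: the IDs and the algorithm name are placeholders, and the exact parameter set depends on the algorithm that produced the stream.

{
  "patient_id": "<patient ID>",
  "device_id": "<device ID>",
  "algorithm": "example-algorithm.0",
  "category": "vital_sign",
  "measurement": "heart_rate"
}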

Authentication

Authentication is required for all API requests. The V2 API uses the same authentication mechanisms as V1: see the V1 API documentation for details.

Errors

Standard HTTP status codes are used to indicate success or failure. For most errors, the response body is JSON with the following format:

{
  "success": false,
  "error": {
    "message": "Human-readable error description",
    "type": "EnumeratedErrorKind"
  }
}
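
As a minimal sketch, a client could surface these errors like this in Python (using the requests library). The base URL, endpoint path, and authorization header below are placeholders, not documented values; authenticate as described in the V1 documentation.

import requests

resp = requests.get(
    "https://example.invalid/v2/streams/<stream_id>",     # hypothetical base URL and path
    headers={"Authorization": "Bearer <access token>"},   # placeholder; see the V1 auth docs
    params={"format": "json"},
)
if not resp.ok:
    err = resp.json().get("error", {})
    raise RuntimeError(f"{err.get('type')}: {err.get('message')}")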

Pagination

There is a limit on the amount of data that is returned in a response. This page size is determined by the server. If the queried time range contains more than one page’s worth of data, you must make multiple API requests to fetch the complete result.

If an API response includes an X-Rune-Next-Page-Token header, make another request to fetch the next page of data. In the next request, provide the value of this header in the page_token query parameter, and keep the rest of the query parameters identical. When the end of pagination has been reached, the X-Rune-Next-Page-Token header will not be present in the response.

Each pagination token is valid for 5 minutes.
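
As a sketch, the pagination loop could look like the following in Python (using the requests library). The URL and headers are placeholders; only the header name, the page_token parameter, and the termination condition come from this documentation.

import requests

def fetch_all_pages(url, params, headers):
    """Follow X-Rune-Next-Page-Token until the server stops returning it."""
    pages = []
    while True:
        resp = requests.get(url, params=params, headers=headers)
        resp.raise_for_status()
        pages.append(resp.json())
        token = resp.headers.get("X-Rune-Next-Page-Token")
        if token is None:
            break  # header absent: end of pagination
        # Keep the rest of the query parameters identical; only add page_token.
        params = {**params, "page_token": token}
    return pages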

Single Stream

Fetch data for a single stream by its stream ID.

Stream Data

A successful request returns data from a single stream. The response format is determined largely by the stream type of the data: CSV columns (or JSON keys) are named for the dimensions of the corresponding stream type.

There is always a column/key that contains the duration of each measurement in the timeseries (in nanoseconds): "measurement_duration_ns". The duration value is extracted from the device data, with the following order of precedence:

  • Some devices specify a start and end time for each measured value. Whenever this is the case, the start time of each interval is used as the timestamp for each observation in the timeseries, and the duration is end_time – start_time.
  • Other devices specify a sampling rate for a particular measurement. When this is the case (and the true duration is unknown), the duration of each observation is set to 1/sample rate.
  • Otherwise, the duration is 0 (which indicates that it is unknown).
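
As a rough sketch, a CSV query and a look at measurement_duration_ns could be done like this in Python. The base URL, endpoint path, auth header, and time range values are placeholders, not documented values; the stream ID is the example ID used throughout this page.

import csv
import io
import requests

stream_id = "a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571"
resp = requests.get(
    f"https://example.invalid/v2/streams/{stream_id}",    # hypothetical base URL and path
    headers={"Authorization": "Bearer <access token>"},   # placeholder; see the V1 auth docs
    params={
        "start_time": 1648412619,   # inclusive (>= start_time)
        "end_time": 1648412625,     # exclusive (< end_time)
        "format": "csv",
    },
)
resp.raise_for_status()

for row in csv.DictReader(io.StringIO(resp.text)):
    # Columns are named for the stream type's dimensions, plus the duration column.
    duration_ns = int(row["measurement_duration_ns"])
    if duration_ns == 0:
        pass  # duration unknown (no interval or sample rate in the device data)
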
path Parameters
stream_id
required
string
Example: a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571

Unique ID for a Stream.

query Parameters
start_time
float

Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (>= start_time).

It is invalid to specify both start_time and start_time_ns.

start_time_ns
int

The same as start_time, but expressed as integer nanoseconds.

It is invalid to specify both start_time and start_time_ns.

end_time
float

Unix timestamp (in seconds, with optional fraction) of the end of the time range to query. This is an exclusive boundary (< end_time).

It is invalid to specify both end_time and end_time_ns.

end_time_ns
int

The same as end_time, but expressed as integer nanoseconds.

It is invalid to specify both end_time and end_time_ns.

format
string
Default: "csv"
Enum: "json" "csv"

Determines the content type of the response.

limit
int

The maximum number of timestamps to return, across all pages of the response.

page_token
string <byte>

A token provided in the request to obtain the subsequent page of records in a dataset. The value is obtained from the X-Rune-Next-Page-Token response header field. See Pagination for details.

timestamp
string
Default: "unix"
Enum: "unix" "unixns" "iso"

Determines how timestamps are formatted in the response:

  • unix - Unix seconds, with up to 7 fractional digit precision.
  • unixns - Unix nanoseconds (integer).
  • iso - Formatted as RFC3339, including the timezone offset.

timezone
integer
Default: 0
Example: timezone=-28800

The timezone offset, in seconds, used to calculate string-based timestamp formats such as datetime and iso. For example, PST (UTC-0800) is represented as -28800. If omitted, the timezone is UTC.

timezone_name
string
Default: ""
Example: timezone_name=America/Los_Angeles

The name from the IANA timezone database used to calculate string-based timestamp formats such as datetime and iso. For example, America/Los_Angeles. While timezone uses a fixed offset, timezone_name will return the correct UTC offset for a given date/time in order to account for daylight saving time. If omitted, the timezone is UTC unless a timezone parameter is provided. If both a timezone and a timezone_name are provided the request will return an error.

translate_enums
string
Default: false
Enum: true false

When fetching a stream that includes an enum data type, this determines whether to translate enums into their string representation. By default, enums are returned as integer values.

All possible enum values for a given stream can be queried from the GraphQL API, as part of the stream metadata.

Responses

Response samples

Content type
time,acceleration,measurement_duration_ns
1648412619.2589417,-0.00386,20000000
1648412620.2600197,-0.00816,20000000
1648412621.2671237,-0.01420,20000000
1648412622.2715627,-0.00312,20000000
1648412623.2800017,-0.00118,20000000

Stream Availability

Returns a boolean timeseries, indicating whether stream data exists in a given window of time. The resolution parameter determines the granularity of this timeseries (i.e. the interval between each timestamp in the series).

The timeseries is aligned such that each timestamp (as Unix seconds) is evenly divisible by the resolution. If the requested start_time is not evenly divisible by the resolution, the query range is extended to the nearest preceding aligned time. If the end_time is not evenly divisible by the resolution, the query range is extended to the nearest following aligned time. This ensures that the returned timestamps are evenly spaced and that each one represents a duration of time equal to the resolution (rather than getting truncated).
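
The alignment rule can be computed directly; a small Python sketch, with arbitrary example values:

resolution = 3600            # within the allowed [300, 86400] range
start_time = 1648412619
end_time = 1648499019

aligned_start = (start_time // resolution) * resolution    # nearest preceding aligned time
aligned_end = -(-end_time // resolution) * resolution      # nearest following aligned time (ceiling)
# aligned_start == 1648411200, aligned_end == 1648501200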

path Parameters
stream_id
required
string
Example: a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571

Unique ID for a Stream.

query Parameters
start_time
required
float

Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (>= start_time).

end_time
required
float

Unix timestamp (in seconds, with optional fraction) of the end of the time range to query. This is an exclusive boundary (< end_time).

resolution
required
integer [ 300 .. 86400 ]

Interval between returned timestamps, in seconds.

format
string
Default: "csv"
Enum: "json" "csv"

Determines the content type of the response.

limit
int

The maximum number of timestamps to return, across all pages of the response.

page_token
string <byte>

A token provided in the request to obtain the subsequent page of records in a dataset. The value is obtained from the X-Rune-Next-Page-Token response header field. See Pagination for details.

partition_size
integer

If set, break up the result data for each stream into partitions of partition_size points. The last partition returned may contain fewer than partition_size points.

Only applicable for JSON-formatted responses (format=json). See response example.

timestamp
string
Default: "unix"
Enum: "unix" "unixns" "iso"

Determines how timestamps are formatted in the response:

  • unix - Unix seconds, with up to 7 fractional digit precision.
  • unixns - Unix nanoseconds (integer).
  • iso - Formatted as RFC3339, including the timezone offset.

timezone
integer
Default: 0
Example: timezone=-28800

The timezone offset, in seconds, used to calculate string-based timestamp formats such as datetime and iso. For example, PST (UTC-0800) is represented as -28800. If omitted, the timezone is UTC.

timezone_name
string
Default: ""
Example: timezone_name=America/Los_Angeles

The name from the IANA timezone database used to calculate string-based timestamp formats such as datetime and iso. For example, America/Los_Angeles. While timezone uses a fixed offset, timezone_name will return the correct UTC offset for a given date/time in order to account for daylight saving time. If omitted, the timezone is UTC unless a timezone parameter is provided. If both a timezone and a timezone_name are provided the request will return an error.

Responses

Response samples

Content type
application/json

{
  "approx_available_duration_s": 900,
  "cardinality": 5,
  "data": { }
}

Daily Aggregate

Returns an average aggregated day divided into intervals. This endpoint is useful for powering visualizations that highlight daily patterns based on time of day.

Daily aggregates are only supported for stream types with two dimensions (i.e. a timestamp and a numeric value).

To calculate a daily aggregate, all data points within n_days of start_time are considered. Data points are grouped into intervals by rounding the time-of-day portion of each timestamp down to the nearest multiple of resolution seconds. Then, the mean of the data points in each interval is calculated to arrive at an aggregate value.

If no data points appeared within an interval for any of the days being aggregated, a null is returned for the value.

Consider the following example data points across two days, imagining that we want to take a daily aggregate for two days with a resolution of six hours:

  1. 2022-06-01T07:26:00Z: 5.0 (in the second 6-hour window of day 1)
  2. 2022-06-01T14:39:00Z: 47.3 (in the third 6-hour window of day 1)
  3. 2022-06-02T08:01:00Z: 10.0 (in the second 6-hour window of day 2)
  4. 2022-06-02T20:42:00Z: 2.7 (in the fourth 6-hour window of day 2)

Note that there are no data points anywhere in the first 6-hour window of either day.

If we query from midnight UTC on June 1st (start_time=1654041600) with a 6-hour resolution (resolution=21600) across two days (n_days=2), we will receive a response with the following corresponding items in the arrays under the "offset" and "values" keys:

  1. "00:00": null (since no data points fell within this interval)
  2. "06:00": 7.5 (the mean of data points (1) and (3) above, which both fell in this interval)
  3. "12:00": 47.3 (the mean of the single data point (2) above that fell in this interval)
  4. "18:00": 2.7 (the mean of the single data point (4) above that fell in this interval)

The array under the "n_days_with_data" key also contains one item corresponding to each item in the "offset" array, giving the number of days in which any data points fell within that interval. Given the same example data points, the "n_days_with_data" array would contain the following counts (again listed with the corresponding offset):

  1. "00:00": 0 (since no data points fell in this interval)
  2. "06:00": 2 (since two data points, (1) and (3) above, fell in this interval)
  3. "12:00": 1 (since a single data point, (2) above, fell in this interval)
  4. "18:00": 1 (since a single data point, (4) above, fell in this interval)

The endpoint also returns a number of summary statistics about the data points under the top-level "summary" key. The meanings of these fields are described below.
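
As a sketch, the query from the walkthrough above could be issued like this in Python. The base URL, endpoint path, and auth header are placeholders, not documented values.

import requests

resp = requests.get(
    "https://example.invalid/v2/streams/<stream_id>/daily_aggregate",   # hypothetical path
    headers={"Authorization": "Bearer <access token>"},                 # placeholder; see the V1 auth docs
    params={
        "start_time": 1654041600,   # 2022-06-01T00:00:00Z; must be evenly divisible by resolution
        "resolution": 21600,        # 6-hour intervals; must evenly divide a 24 hour period
        "n_days": 2,
    },
)
resp.raise_for_status()
body = resp.json()
# The response contains parallel arrays under the "offset", "values", and
# "n_days_with_data" keys (as in the walkthrough above), plus a "summary" object.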

path Parameters
stream_id
required
string
Example: a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571

Unique ID for a Stream.

query Parameters
start_time
required
float

Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (data points considered in calculating averages will be >= start_time).

The start_time must be evenly divisible by resolution.

resolution
required
int [ 60 .. 86400 ]

Length of time (in seconds) for the window used to calculate intervals. Must evenly divide into a 24 hour period.

n_days
required
int [ 1 .. 14 ]

Number of days across which to compute daily aggregate. Each day is a 24 hour period beginning from the start_time.

Responses

Response samples

Content type
application/json
{
  "approx_available_duration_s": 900,
  "data": { },
  "cardinality": 6,
  "summary": { }
}

Aggregate Window

Downsamples data into windows of resolution seconds and applies the specified aggregate_function to the data points that fall within each window.

Downsampled data is returned as a "time" array, where each item is the first timestamp (inclusive) of a window, and a corresponding "aggregate_values" array, where each item is the aggregated value of all data points that fell within that window.

Corresponding values in the "duration_sum" array represent the sum (in nanoseconds) of the durations of all of the data points in the window. This can be used to determine the amount of data that was used to calculate the aggregate value (for example, so that the client can decide not to display data points from days with less than 6 hours of data).

The response also contains summary statistics of the values in "aggregate_values".
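
For example, the duration_sum filtering suggested above might look like this in Python. The parallel array names come from this documentation; the shape of the surrounding response object is assumed.

SIX_HOURS_NS = 6 * 60 * 60 * 10**9

def well_covered_windows(data):
    """Keep only windows with a non-null aggregate and at least 6 hours of underlying data."""
    return [
        (t, v)
        for t, v, d in zip(data["time"], data["aggregate_values"], data["duration_sum"])
        if v is not None and d >= SIX_HOURS_NS
    ]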

path Parameters
stream_id
required
string
Example: a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571

Unique ID for a Stream.

query Parameters
start_time
required
any

Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (data points considered will be >= start_time).

The value will be automatically truncated to be evenly divisible by resolution seconds.

end_time
required
float

Unix timestamp (in seconds, with optional fraction) of the end of the time range to query. This is an exclusive boundary (data points considered will be < end_time).

The value will be automatically padded to be evenly divisible by resolution seconds.

resolution
required
int [ 60 .. 604800 ]

Length of time (in seconds) for the window used to aggregate over. Must evenly divide into a 24 hour period.

aggregate_function
required
string
Enum: "sum" "mean"

Determines the aggregate function to apply to the values of data points within each window. Possible values are:

  • sum: The total of all values is calculated. Aggregate value will be null if there were no data points in the window.

  • mean: The mean of all values is calculated. Aggregate value will be null if there were no data points in the window.

timestamp
string
Default: "unix"
Enum: "unix" "unixns" "iso"

Determines how timestamps are formatted in the response:

  • unix - Unix seconds, with up to 7 fractional digit precision.
  • unixns - Unix nanoseconds (integer).
  • iso - Formatted as RFC3339, including the timezone offset.

timezone
integer
Default: 0
Example: timezone=-28800

The timezone offset, in seconds, used to calculate string-based timestamp formats such as datetime and iso. For example, PST (UTC-0800) is represented as -28800. If omitted, the timezone is UTC.

timezone_name
string
Default: ""
Example: timezone_name=America/Los_Angeles

The name from the IANA timezone database used to calculate string-based timestamp formats such as datetime and iso. For example, America/Los_Angeles. While timezone uses a fixed offset, timezone_name will return the correct UTC offset for a given date/time in order to account for daylight saving time. If omitted, the timezone is UTC unless a timezone parameter is provided. If both a timezone and a timezone_name are provided the request will return an error.

Responses

Response samples

Content type
application/json
{
  "cardinality": 5,
  "data": { },
  "summary": { }
}

Multi Stream

Fetch data for multiple streams by their stream IDs.

Batch Availability

Returns a boolean timeseries, indicating whether data exists in a given window of time for a batch of streams. The resolution parameter determines the granularity of this timeseries (i.e. the interval between each timestamp in the series). The batch_operation parameter determines whether data is considered available when data exists for all streams, or when data exists for any of the streams.

The timeseries is aligned such that each timestamp (as Unix seconds) is evenly divisible by the resolution. If the requested start_time is not evenly divisible by the resolution, the query range is extended to the nearest preceding aligned time. If the end_time is not evenly divisible by the resolution, the query range is extended to the nearest following aligned time. This ensures that the returned timestamps are evenly spaced and that each one represents a duration of time equal to the resolution (rather than getting truncated).
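
As a sketch, a batch availability query could be issued like this in Python; passing a list for stream_id makes the requests library repeat the parameter, as shown in the stream_id example below. The base URL, endpoint path, auth header, and time range values are placeholders, not documented values.

import requests

resp = requests.get(
    "https://example.invalid/v2/batch/availability",      # hypothetical base URL and path
    headers={"Authorization": "Bearer <access token>"},   # placeholder; see the V1 auth docs
    params={
        "stream_id": [
            "a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571",
            "37319fc03b749037aec18a1ef0a12a7f31bc300e22dee9928990b98c274cd528",
        ],
        "batch_operation": "any",   # availability = 1 when any requested stream has data
        "start_time": 1648411200,   # aligned to the resolution
        "end_time": 1648497600,
        "resolution": 3600,
        "format": "json",
    },
)
resp.raise_for_status()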

query Parameters
stream_id
required
Array of strings
Example: stream_id=a1379454321e63222f547d8f929a2451d0f295cfbd77fabf9ce8090e1c31c571&stream_id=37319fc03b749037aec18a1ef0a12a7f31bc300e22dee9928990b98c274cd528

Multiple unique IDs for streams.

batch_operation
required
string
Enum: "any" "all"

Determines how availability is calculated for the batch of streams. If batch_operation = all, availability values will equal 1 when data is available for all requested streams in the given interval. If batch_operation = any, availability values will equal 1 when data is available for any of the requested streams in the given interval.

start_time
required
float

Unix timestamp (in seconds, with optional fraction) of the start of the time range to query. This is an inclusive boundary (>= start_time).

end_time
required
float

Unix timestamp (in seconds, with optional fraction) of the end of the time range to query. This is an exclusive boundary (< end_time).

timezone_name
string
Default: ""
Example: timezone_name=America/Los_Angeles

The name from the IANA timezone database used to calculate string-based timestamp formats such as datetime and iso. For example, America/Los_Angeles. While timezone uses a fixed offset, timezone_name will return the correct UTC offset for a given date/time in order to account for daylight saving time. If omitted, the timezone is UTC unless a timezone parameter is provided. If both a timezone and a timezone_name are provided the request will return an error. If timezone_name is provided, aggregate queries with resolution equivalent to multiples of a day (86400 seconds) will group result windows by calendar day(s) in the provided timezone.

resolution
required
integer [ 300 .. 86400 ]

Interval between returned timestamps, in seconds.

format
string
Default: "csv"
Enum: "json" "csv"

Determines the content type of the response.

limit
int

The maximum number of timestamps to return, across all pages of the response.

page_token
string <byte>

A token provided in the request to obtain the subsequent page of records in a dataset. The value is obtained from the X-Rune-Next-Page-Token response header field. See Pagination for details.

partition_size
integer

If set, break up the result data for each stream into partitions of partition_size points. The last partition returned may contain fewer than partition_size points.

Only applicable for JSON-formatted responses (format=json). See response example.

timestamp
string
Default: "unix"
Enum: "unix" "unixns" "iso"

Determines how timestamps are formatted in the response:

  • unix - Unix seconds, with up to 7 fractional digit precision.
  • unixns - Unix nanoseconds (integer).
  • iso - Formatted as RFC3339, including the timezone offset.

timezone
integer
Default: 0
Example: timezone=-28800

The timezone offset, in seconds, used to calculate string-based timestamp formats such as datetime and iso. For example, PST (UTC-0800) is represented as -28800. If omitted, the timezone is UTC.

Responses

Response samples

Content type
application/json

{
  "approx_available_duration_s": 900,
  "cardinality": 5,
  "data": { }
}