Experimental

rerun.experimental

Experimental features for Rerun.

These features are not yet stable and may change in future releases without going through the normal deprecation cycle.

Lens = DeriveLens | MutateLens module-attribute

Union of all lens types.

class Chunk

A single chunk of data from a recording.

entity_path property

The entity path this chunk belongs to.

id property

The unique ID of this chunk.

is_empty property

Whether the chunk has zero rows.

is_static property

Whether the chunk contains only static data (no timelines).

num_columns property

The number of columns in this chunk.

num_rows property

The number of rows in this chunk.

timeline_names property

The names of all timelines in this chunk.

def apply_lenses(lenses)

Apply one or more lenses to this chunk, returning transformed chunks.

Each lens matches by input component. Columns not consumed by any matching lens are forwarded unchanged as a separate chunk.

If no lens matches the chunk (including when an empty list of lenses is passed), the original chunk is returned unchanged.

PARAMETER DESCRIPTION
lenses

One or more Lens objects.

TYPE: Sequence[Lens] | Lens

RETURNS DESCRIPTION
A list of Chunk objects.
def apply_selector(source, selector)

Apply a selector to a single component, returning a new chunk with the component transformed.

All other columns (timelines, other components) are preserved unchanged. The source component's existing descriptor is preserved.

For better performance, prefer MutateLens with apply_lenses which processes multiple transformations in a single pass.

PARAMETER DESCRIPTION
source

A ComponentDescriptor or component identifier string for the input column to transform.

TYPE: ComponentDescriptor | str

selector

A Selector or selector query string to apply to the component.

TYPE: Selector | str

RETURNS DESCRIPTION
A new Chunk with the component transformed.
RAISES DESCRIPTION
ValueError

If the source component is not found in the chunk or the selector fails to evaluate.

def format(*, width=240, redact=False, trim_metadata_keys=True)

Format this chunk as a human-readable table string.

PARAMETER DESCRIPTION
width

Fixed width for the table. Default: 240.

TYPE: int DEFAULT: 240

redact

If True, redact non-deterministic values (RowIds, ChunkIds, etc.) for stable snapshot testing. Default: False.

TYPE: bool DEFAULT: False

trim_metadata_keys

If True, trim the rerun: / sorbet: prefix from metadata keys. Default: True.

TYPE: bool DEFAULT: True

def from_columns(entity_path, indexes, columns) classmethod

Create a Chunk from columns, mirroring the rerun.send_columns API.

A fresh chunk ID and sequential row IDs are auto-generated.

PARAMETER DESCRIPTION
entity_path

The entity path for this chunk (e.g., "/camera/image").

TYPE: str

indexes

The time columns for this chunk. Each TimeColumnLike provides a timeline name and a PyArrow array of timestamps. You typically use TimeColumn here. Pass an empty iterable for static data.

TYPE: Iterable[TimeColumnLike]

columns

The component columns for this chunk. Each ComponentColumn provides a component descriptor and a PyArrow array of component data.

TYPE: Iterable[ComponentColumn]

RAISES DESCRIPTION
ValueError

If timeline and component column lengths don't match.

Example
import rerun as rr
from rerun.experimental import Chunk

chunk = Chunk.from_columns(
    "/robots/arm",
    indexes=[rr.TimeColumn("frame", sequence=[0, 1, 2])],
    columns=rr.Points3D.columns(positions=[[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
)
def from_dataframe(dataframe, *, index=AUTO_INDEX, entity_path=None) classmethod

Lazily turn an Arrow-backed dataframe into chunks.

Accepts a Table, a RecordBatch, a RecordBatchReader, or any object implementing the Arrow C stream interface (__arrow_c_stream__) — most notably a datafusion.DataFrame (an optional dependency).

Yields the chunks produced by applying Chunk.from_record_batch to each record batch in turn. See that method for the index and entity_path semantics.

RAISES DESCRIPTION
TypeError

If dataframe is not a pyarrow Table, a pyarrow RecordBatch, a pyarrow RecordBatchReader, or an Arrow-C-stream object (such as a datafusion.DataFrame).

ValueError
def from_record_batch(record_batch, *, index=AUTO_INDEX, entity_path=None) classmethod

Interpret an Arrow RecordBatch as Rerun chunk data.

Each column of the batch is classified as a row-id column, index (timeline) column, or a component column. Component columns are then grouped per entity path, and one chunk per entity path is emitted.

The rerun:* arrow metadata, if it exists, drives the kind of each input column, as well as the entity/archetype/component type for component columns.

If present, the row id column and chunk id metadata indicate that the batch represents a fully identified chunk, e.g. as produced by Chunk.to_record_batch. Both the row ids and chunk id are preserved under the following conditions:

  • both are present in the input batch
  • index is omitted
  • entity_path is omitted

If any of these conditions are not met, it means that either the batch is not fully identified, or that the chunk data is reinterpreted (e.g. entity path rewriting). In that case, fresh row ids and chunk id are generated and used instead of the input ones.

PARAMETER DESCRIPTION
record_batch

The Arrow record batch to interpret. Component columns may be either lists (one component batch per row) or plain arrays (wrapped as single-element lists automatically).

TYPE: RecordBatch

index

Determines which columns are index (timeline) columns. Each promoted column's time type is taken from its Arrow datatype: int64 → sequence, timestamp(ns) → timestamp, duration(ns) → duration.

  • Omitted (the default): derive the index columns from the batch's Rerun metadata. The batch is treated as temporal if it carries index metadata. A batch with no index metadata is ambiguous and raises an error — unless it is an already-identified chunk (it carries a row-id column and a chunk id), which round-trips as-is and may therefore be static. Pass index=None to force a static interpretation.
  • A column name, or list of column names: treat exactly these columns as timelines. The remaining (non-row-id) columns become components.
  • None: produce static chunks (no timeline). Any index metadata or promoted index column is then a contradiction and is rejected.

Note

Static chunks with multiple rows are legitimate in some cases, but only the last row is visible from typical latest-at queries. An info-level message is emitted when this happens — except for an already-identified chunk that is preserved as-is (see above), which is passed through without this check.

TYPE: str | list[str] | None | _AutoIndex DEFAULT: AUTO_INDEX

entity_path

Default entity path for component columns that do not otherwise specify one. Resolution order per component column is: its rerun:entity_path metadata, then the batch-level rerun:entity_path metadata, then the column-name convention (see Notes), then this argument, then the root entity (/).

TYPE: str | None DEFAULT: None
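
The resolution order for entity_path can be pictured as a first-match lookup. The following is a minimal plain-Python sketch of that order, not the library's implementation; the function name resolve_entity_path and its four arguments are hypothetical and stand for the sources listed above.

```python
from __future__ import annotations


def resolve_entity_path(
    column_metadata_path: str | None,
    batch_metadata_path: str | None,
    column_name_path: str | None,
    entity_path_arg: str | None,
) -> str:
    """Return the first available entity path, in the documented order."""
    candidates = (
        column_metadata_path,  # rerun:entity_path metadata on the column
        batch_metadata_path,   # rerun:entity_path metadata on the batch
        column_name_path,      # parsed from the column name, if it follows the convention
        entity_path_arg,       # the entity_path argument
    )
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return "/"  # the root entity is the last resort


print(resolve_entity_path(None, "/batch", None, "/arg"))  # /batch
print(resolve_entity_path(None, None, None, None))        # /
```

Note that the entity_path argument sits near the end of the chain: it only takes effect for columns that carry no entity path of their own.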

RETURNS DESCRIPTION
One chunk per distinct entity path described by the batch, in first-seen column order.
RAISES DESCRIPTION
ValueError

In any of the following cases:

  • index was omitted and the batch carries no index metadata (an ambiguous raw batch). Pass index=<column> for temporal data or index=None for static data.
  • index=None was given but the batch also carries index metadata or names an index column (contradiction).
  • index names a column that is not present in the batch.
  • The batch contains no component columns (there is nothing to log).
  • A column promoted to an index contains null values. Time columns must be dense; static data is expressed with index=None, not with null times.
  • An index column has an Arrow datatype that is not a supported time type.
  • The batch is a fully-identified chunk (it carries both a row-id column and a chunk id) but resolves to more than one entity path. An identified chunk is preserved as a single chunk; drop the chunk-id metadata and/or the row-id column to reinterpret it into one chunk per entity (with freshly-minted ids).
Notes

Column-name convention. When a component column carries no rerun:entity_path / rerun:component metadata, its entity path is read from the column name: if the name starts with / and contains a :, the first part of the column name is interpreted as the entity path and the rest as the component identifier. Example: /point:Points3D:positions and /metadata:foo.
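
As a rough illustration of the column-name convention, the sketch below splits a column name into an entity path and a component identifier. It is plain Python, not the library's parser; the helper name split_column_name is hypothetical, and splitting at the first ":" is an assumption that is consistent with the two examples given above.

```python
from __future__ import annotations


def split_column_name(name: str) -> tuple[str, str] | None:
    """Split "/entity:Component:field" into (entity_path, component_identifier).

    Returns None when the name does not follow the convention.
    Assumption: the split happens at the first ":".
    """
    if not name.startswith("/") or ":" not in name:
        return None
    entity_path, component = name.split(":", 1)
    return entity_path, component


print(split_column_name("/point:Points3D:positions"))  # ('/point', 'Points3D:positions')
print(split_column_name("/metadata:foo"))              # ('/metadata', 'foo')
print(split_column_name("positions"))                  # None
```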

Limitations/Future work

A batch that mixes static and temporal rows (i.e. where some index values are null) is rejected. Handling this case requires row-splitting and generating a mix of temporal and static chunks.

Recording-property columns (named property:…, mapping to the /__properties entity) are not recognized by the column-name convention and are not mapped back to that entity.

def to_record_batch()

Convert this chunk to an Arrow RecordBatch.

def with_entity_path(entity_path)

Return a copy of this chunk with a new entity path.

A fresh chunk ID is generated to avoid aliasing the original chunk in downstream caches and indices. Row IDs, timelines, and components are preserved as-is.

PARAMETER DESCRIPTION
entity_path

The new entity path for the returned chunk (e.g. "/left/camera/image").

TYPE: str

class ChunkStore

A fully-materialized, in-memory chunk store.

Build one from chunks via ChunkStore.from_chunks, or fully materialize an IndexedReader via reader.stream().collect(). For lazy, on-demand chunk loading, see LazyStore.

Use stream() to process chunks through the lazy pipeline, or write_rrd() to persist to disk.

def __len__()

Return the number of chunks in this store.

def from_chunks(chunks) staticmethod

Build a ChunkStore from a sequence of chunks.

def reader(index, *, contents=None, include_semantically_empty_columns=False, include_tombstone_columns=False, fill_latest_at=False, using_index_values=None, ctx=None)

Build a DataFusion DataFrame over this store.

The returned DataFrame is data-equivalent to the result of round-tripping the same chunks through write_rrd → rr.server.Server → dataset.reader(), modulo the rerun_segment_id column (absent here because a single ChunkStore has no segment concept).

PARAMETER DESCRIPTION
index

The index (timeline) column to use, or None for the static-only view.

TYPE: str | None

contents

Entity-path filter. A ContentFilter built with the fluent API, a single entity-path expression, a list of expressions, or None for everything. An empty list returns no rows.

TYPE: ContentFilter | str | list[str] | None DEFAULT: None

include_semantically_empty_columns

Whether to include columns that are semantically empty.

TYPE: bool DEFAULT: False

include_tombstone_columns

Whether to include tombstone columns.

TYPE: bool DEFAULT: False

fill_latest_at

Whether to fill null values with the latest valid data.

TYPE: bool DEFAULT: False

using_index_values

Index values at which to resample data.

When specified, this argument changes the way rows are returned. Instead of returning the rows that exist in the data, one row is returned per index value you provide. If the segment has no row at that index value, nulls are returned, or the latest prior value if fill_latest_at=True (which is typically what you want for resampling).

Don't use this argument for plain index slicing — use a DataFusion filter on the index column instead. For example:

from datafusion import col, lit

# All rows in a time window.
store.reader(index="real_time").filter(
    (col("real_time") >= lit(t0)) & (col("real_time") <= lit(t1))
)

TYPE: IndexValuesLike | None DEFAULT: None
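
The resampling behaviour of using_index_values, with and without fill_latest_at, can be pictured with a small stand-alone model. This is a plain-Python sketch under simplifying assumptions (a single column, integer index values, rows held in a dict); the function resample is hypothetical and does not call the Rerun or DataFusion APIs.

```python
from __future__ import annotations

from bisect import bisect_right


def resample(
    rows: dict[int, str],
    index_values: list[int],
    fill_latest_at: bool = False,
) -> list[str | None]:
    """Return one value per requested index value.

    `rows` maps an index value to the data logged at exactly that index.
    """
    known = sorted(rows)
    out = []
    for t in index_values:
        if t in rows:
            out.append(rows[t])  # a row exists at this index value
        elif fill_latest_at:
            pos = bisect_right(known, t)  # number of known index values <= t
            out.append(rows[known[pos - 1]] if pos > 0 else None)
        else:
            out.append(None)  # no row at this index value: null
    return out


rows = {10: "a", 20: "b"}
print(resample(rows, [5, 10, 15, 25]))                       # [None, 'a', None, None]
print(resample(rows, [5, 10, 15, 25], fill_latest_at=True))  # [None, 'a', 'a', 'b']
```

In both cases the output has exactly one entry per requested index value, which is the property that distinguishes resampling from filtering.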

ctx

DataFusion SessionContext to register the table into. When None, uses datafusion.SessionContext.global_ctx() — the process-wide default. Pass an explicit ctx for isolation or a custom SessionConfig.

TYPE: SessionContext | None DEFAULT: None

def schema()

The schema describing all columns in this store.

def stream()

Return a lazy stream over all chunks in this store.

def summary()

Compact, deterministic summary of every chunk in the store.

Each line describes one chunk:

{entity_path}  rows={n}  static={True|False}  timelines=[…]  cols=[…]

Useful for snapshot testing.

def write_rrd(path, *, application_id, recording_id)

Write all chunks to an RRD file.

The caller must provide application_id and recording_id explicitly.

class DeriveLens

A derive lens that creates new component/time columns from an input component.

Derive lenses extract fields from a component and produce new columns, optionally at a different entity and/or with new time columns.

Pass scatter=True to enable 1:N row mapping (exploding lists).

Example usage:

lens = (
    DeriveLens("Imu:accel")
    .to_component(rr.Scalars.descriptor_scalars(), Selector(".x"))
)

To write to an explicit target entity:

lens = (
    DeriveLens("Imu:accel", output_entity="/out/x")
    .to_component(rr.Scalars.descriptor_scalars(), Selector(".x"))
)
def __init__(input_component, *, output_entity=None, scatter=False)

Create a new derive lens.

PARAMETER DESCRIPTION
input_component

The component identifier to match (e.g. "Imu:accel").

TYPE: str

output_entity

Optional target entity path. When set, output is written to this entity instead of the input entity.

TYPE: str | None DEFAULT: None

scatter

When True, use 1:N row mapping (explode lists).

TYPE: bool DEFAULT: False

def to_component(component, selector, *, cast_to=None)

Add a component output column.

PARAMETER DESCRIPTION
component

A ComponentDescriptor or a component identifier string for the output column (e.g. "Scalars:scalars").

TYPE: ComponentDescriptor | str

selector

A Selector or selector query string to apply to the input column.

TYPE: Selector | str

cast_to

How to cast the produced column to match the target component. By default (None) the column is emitted as-is. Pass "auto" to cast it to the component's canonical Arrow datatype, or an explicit pyarrow DataType to cast it to that type. Casting errors if the conversion is unsupported.

TYPE: DataType | Literal['auto'] | None DEFAULT: None

RETURNS DESCRIPTION
A new DeriveLens with the component added.
def to_timeline(timeline_name, timeline_type, selector)

Add a time extraction column.

PARAMETER DESCRIPTION
timeline_name

Name of the timeline to create.

TYPE: str

timeline_type

Type of the timeline: "sequence", "duration_ns", or "timestamp_ns".

TYPE: Literal['sequence', 'duration_ns', 'timestamp_ns']

selector

A Selector or selector query string to extract time values (must produce Int64 arrays).

TYPE: Selector | str

RETURNS DESCRIPTION
A new DeriveLens with the time column added.

class IndexedReader

Bases: StreamingReader, Protocol

Protocol for readers backed by an index/manifest.

Extends StreamingReader: every IndexedReader also supports stream() -> LazyChunkStream for pure-streaming processing.

Indexed readers expose a LazyStore view over the source via store() — the manifest is read up-front; chunks load on demand. To fully materialize into a ChunkStore, call stream().collect().

def store()

Return a LazyStore view of this source.

def stream()

Return a lazy stream over all chunks from this source.

class LazyChunkStream

A lazy, composable pipeline over chunks.

Builder methods (filter, drop, split, map, flat_map, lenses, merge) consume the input stream(s) and return new stream(s). A consumed stream cannot be used again; attempting to do so raises a ValueError. This prevents the same stream from being wired into a pipeline twice by accident.

Terminal methods (to_chunks, __iter__, collect, write_rrd) do not consume the stream — they run the pipeline and leave the stream usable. Each call creates a fresh execution.
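
The difference between builder and terminal methods can be modelled with a toy class. The sketch below is plain Python over ordinary values, not LazyChunkStream itself; ToyStream and its to_list method are hypothetical names, and raising ValueError from a terminal call on a consumed stream is an assumption.

```python
class ToyStream:
    """Hypothetical stand-in for a lazy stream over plain Python values."""

    def __init__(self, make_iter):
        self._make_iter = make_iter  # zero-argument callable returning a fresh iterator
        self._consumed = False

    def _check_usable(self):
        if self._consumed:
            raise ValueError("stream has already been consumed")

    def map(self, fn):
        """Builder: consumes this stream and returns a new one."""
        self._check_usable()
        self._consumed = True
        source = self._make_iter
        return ToyStream(lambda: (fn(x) for x in source()))

    def to_list(self):
        """Terminal: runs the pipeline; the stream stays usable."""
        self._check_usable()
        return list(self._make_iter())


base = ToyStream(lambda: iter([1, 2, 3]))
doubled = base.map(lambda x: x * 2)

print(doubled.to_list())  # [2, 4, 6]
print(doubled.to_list())  # [2, 4, 6] again: each call is a fresh execution

try:
    base.map(lambda x: x + 1)  # base was consumed by the first map()
except ValueError as err:
    print("error:", err)
```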

def __iter__()

Iterate over chunks one at a time (triggers execution).

def collect(*, optimize=None)

Run the pipeline and materialize all chunks into a ChunkStore.

By default, only the single-pass compaction that happens naturally during chunk insertion is applied. Pass optimize=OptimizationProfile.LIVE or optimize=OptimizationProfile.OBJECT_STORE to run additional optimization (extra convergence passes, video GoP rebatching) tuned for the chosen target.

PARAMETER DESCRIPTION
optimize

If None (default), no extra optimization is performed beyond the single pass that happens on insert.

Otherwise, apply the given profile after insertion.

TYPE: OptimizationProfile | None DEFAULT: None

Examples:

Run with the object-store-tuned profile:

store = reader.stream().collect(optimize=OptimizationProfile.OBJECT_STORE)
def drop(*, content=None, has_timeline=None, is_static=None, components=None)

Drop the matching portion of each chunk; keep the rest. Consumes this stream.

Complement of filter(): what filter() would keep is discarded, what it would discard is kept.

PARAMETER DESCRIPTION
content

Entity path filter. Accepts a single expression, a list of expressions, or a ContentFilter object.

TYPE: ContentFilter | str | Sequence[str] | None DEFAULT: None

has_timeline

Only drop chunks that have a column for this timeline.

TYPE: str | None DEFAULT: None

is_static

If True, drop only static chunks. If False, drop only temporal chunks.

TYPE: bool | None DEFAULT: None

components

Drop the listed component columns. Accepts ComponentDescriptor objects or str component identifiers (e.g. "Points3D:positions"). A single value or a list are both accepted.

TYPE: ComponentDescriptor | str | Sequence[ComponentDescriptor | str] | None DEFAULT: None

def filter(*, content=None, has_timeline=None, is_static=None, components=None)

Keep the matching portion of each chunk; drop the rest. Consumes this stream.

All criteria are combined with AND. For chunk-level predicates (content, has_timeline, is_static) the chunk either passes or is dropped entirely. For components, the chunk is split by component columns: only matching component columns are kept (timelines and entity path are preserved). When a list is given, any column matching any of the listed components is kept (OR semantics). Chunks that contain none of the listed components are dropped entirely.

If a chunk fails any predicate, it is dropped entirely -- no component splitting occurs.
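
To make the two kinds of criteria concrete, here is a minimal plain-Python model of the filter semantics. It is a sketch, not the library's code: ToyChunk and filter_chunk are hypothetical, the content criterion is left out, and components are compared as plain identifier strings.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyChunk:
    """Hypothetical stand-in for a chunk: only the fields the predicates look at."""

    entity_path: str
    timelines: tuple[str, ...]
    components: tuple[str, ...]

    @property
    def is_static(self) -> bool:
        return not self.timelines


def filter_chunk(chunk, *, has_timeline=None, is_static=None, components=None):
    """Return the kept portion of `chunk`, or None if it is dropped entirely."""
    # Chunk-level predicates: all-or-nothing, combined with AND.
    if has_timeline is not None and has_timeline not in chunk.timelines:
        return None
    if is_static is not None and chunk.is_static != is_static:
        return None
    if components is None:
        return chunk
    # Component criterion: keep any column matching any listed component (OR).
    wanted = {components} if isinstance(components, str) else set(components)
    kept = tuple(c for c in chunk.components if c in wanted)
    if not kept:
        return None  # none of the listed components are present
    return replace(chunk, components=kept)


chunk = ToyChunk("/robot", ("frame",), ("Points3D:positions", "Points3D:colors"))
print(filter_chunk(chunk, components="Points3D:positions"))
print(filter_chunk(chunk, is_static=True, components="Points3D:positions"))  # None
```

The second call returns None even though the component is present, because the chunk fails the is_static predicate before any component splitting is attempted.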

PARAMETER DESCRIPTION
content

Entity path filter. Accepts a single expression, a list of expressions, or a ContentFilter object.

TYPE: ContentFilter | str | Sequence[str] | None DEFAULT: None

has_timeline

Only keep chunks that have a column for this timeline.

TYPE: str | None DEFAULT: None

is_static

If True, keep only static chunks. If False, keep only temporal chunks.

TYPE: bool | None DEFAULT: None

components

Keep only the listed component columns. Accepts ComponentDescriptor objects or str component identifiers (e.g. "Points3D:positions"). A single value or a list are both accepted.

TYPE: ComponentDescriptor | str | Sequence[ComponentDescriptor | str] | None DEFAULT: None

def flat_map(fn)

Apply a Python function to each chunk, producing zero or more output chunks. Consumes this stream.

Runs in Python (GIL-bound, sequential).

def from_iter(chunks) staticmethod

Wrap a Python iterable of Chunks into a LazyChunkStream.

Enables user-defined sources and the generator escape hatch.

def lenses(lenses, *, output_mode='drop_unmatched', content=None)

Apply lenses to transform chunk data. Consumes this stream.

Each lens matches chunks by entity path and input component, then transforms the data according to its output specifications.

PARAMETER DESCRIPTION
lenses

One or more Lens objects.

TYPE: Sequence[Lens] | Lens

output_mode

How to handle unmatched chunks:

  • "forward_all": forward both transformed and original data
  • "forward_unmatched": forward transformed if matched, otherwise original
  • "drop_unmatched": only forward transformed data (default)

TYPE: Literal['drop_unmatched', 'forward_unmatched', 'forward_all'] DEFAULT: 'drop_unmatched'
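
The three output modes can be compared side by side with a toy routing function. This is a plain-Python sketch, not the library's implementation: the function forwarded is hypothetical, it treats a whole chunk as the unit of matching, and the order in which transformed and original data are emitted for "forward_all" is an assumption.

```python
def forwarded(original, transformed, output_mode="drop_unmatched"):
    """Return what is forwarded downstream for one chunk.

    `transformed` is the list of lens outputs, empty when no lens matched.
    """
    matched = bool(transformed)
    if output_mode == "forward_all":
        return list(transformed) + [original]
    if output_mode == "forward_unmatched":
        return list(transformed) if matched else [original]
    if output_mode == "drop_unmatched":
        return list(transformed)
    raise ValueError(f"unknown output_mode: {output_mode!r}")


for mode in ("forward_all", "forward_unmatched", "drop_unmatched"):
    print(mode, forwarded("orig", ["derived"], mode), forwarded("orig", [], mode))
# forward_all ['derived', 'orig'] ['orig']
# forward_unmatched ['derived'] ['orig']
# drop_unmatched ['derived'] []
```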

content

Optional entity path filter. When set, lenses are applied only to chunks whose entity path matches; non-matching chunks pass through unchanged regardless of output_mode.

TYPE: ContentFilter | str | Sequence[str] | None DEFAULT: None

def map(fn)

Apply a Python function to each chunk, producing exactly one output chunk. Consumes this stream.

Runs in Python (GIL-bound, sequential). For transforms that may produce zero or many chunks, use flat_map instead.

def merge(*streams) staticmethod

Merge chunks from multiple streams into one. Consumes all input streams.

All inputs execute concurrently. Chunks are yielded as they become available. Within each input, chunk order is preserved. Across inputs, ordering is non-deterministic.

def split(*, content=None, has_timeline=None, is_static=None, components=None)

Split into (matching, non_matching). Consumes this stream.

Equivalent to (stream.filter(…), stream.drop(…)), but the upstream executes only once. merge(matching, non_matching) reconstructs the original stream in a semantically lossless way (component-wise chunk splitting is not undone).

Both branches share the same upstream -- it executes once. Both branches MUST be consumed for the pipeline to complete (dropping an unconsumed branch is fine and unblocks the other).

PARAMETER DESCRIPTION
content

Entity path filter. Accepts a single expression, a list of expressions, or a ContentFilter object.

TYPE: ContentFilter | str | Sequence[str] | None DEFAULT: None

has_timeline

Only match chunks that have a column for this timeline.

TYPE: str | None DEFAULT: None

is_static

If True, match only static chunks. If False, match only temporal chunks.

TYPE: bool | None DEFAULT: None

components

Match the listed component columns. Accepts ComponentDescriptor objects or str component identifiers (e.g. "Points3D:positions"). A single value or a list are both accepted.

TYPE: ComponentDescriptor | str | Sequence[ComponentDescriptor | str] | None DEFAULT: None

def to_chunks()

Run the pipeline and return all chunks as a list.

def write_rrd(path, *, application_id, recording_id)

Run the pipeline and write all chunks to an RRD file.

The caller must provide application_id and recording_id explicitly.

class LazyStore

Index-based, on-demand chunk store.

The manifest is held in memory (so schema(), summary(), and __len__ work without loading any chunks), but chunk data is loaded only when requested.

Example: lazy = RrdReader("recording.rrd").store()

Use stream() to process chunks through the lazy pipeline, or write_rrd() to persist to disk. To fully materialize into a ChunkStore, call lazy.stream().collect().

def __len__()

Return the number of chunks described by the manifest.

def schema()

The schema describing all columns in this store, derived from the manifest.

def stream()

Return a lazy stream over all chunks in this store.

def summary()

Compact, deterministic summary of every chunk in the store.

Built from the manifest; no chunk data is loaded. Each line describes one chunk:

{entity_path}  rows={n}  static={True|False}  timelines=[…]  cols=[…]

Useful for snapshot testing.

def write_rrd(path, *, application_id, recording_id)

Write all chunks to an RRD file.

The caller must provide application_id and recording_id explicitly.

class McapReader

Read chunks from an MCAP file.

path property

The file path of the MCAP file.

def __init__(path, *, timeline_type='timestamp', timestamp_offset_ns=None, decoders=None, include_topic_regex=None, exclude_topic_regex=None)

Construct a new MCAP reader.

PARAMETER DESCRIPTION
path

Path to the .mcap file to read.

TYPE: str | Path

timeline_type

Whether to interpret the MCAP log_time column as wall-clock timestamps ("timestamp") or as nanosecond durations ("duration").

TYPE: Literal['timestamp', 'duration'] DEFAULT: 'timestamp'

timestamp_offset_ns

Optional offset in nanoseconds to add to all TimestampNs time columns.

TYPE: int | None DEFAULT: None

decoders

Optional list of MCAP decoder identifiers to enable. If omitted, all available decoders are enabled. Use McapReader.available_decoders to enumerate them.

TYPE: Sequence[str] | None DEFAULT: None

include_topic_regex

Optional list of regex patterns. If provided, only topics matching at least one pattern are decoded. Patterns use RE2 syntax and are not implicitly anchored.

TYPE: Sequence[str] | None DEFAULT: None

exclude_topic_regex

Optional list of regex patterns. Topics matching any pattern are skipped. Applied after includes. Same syntax as include_topic_regex.

TYPE: Sequence[str] | None DEFAULT: None
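
The interaction of include_topic_regex and exclude_topic_regex can be sketched as follows. This is an illustration only: it uses Python's re module rather than RE2 (the two agree for simple patterns like the ones shown), the helper topic_is_decoded is hypothetical, and treating an empty list like an omitted argument is an assumption.

```python
import re


def topic_is_decoded(topic, include_topic_regex=None, exclude_topic_regex=None):
    """Return True if `topic` survives the include and exclude filters."""
    # Includes first: at least one pattern must match somewhere in the topic.
    # re.search is used because patterns are not implicitly anchored.
    if include_topic_regex and not any(
        re.search(p, topic) for p in include_topic_regex
    ):
        return False
    # Excludes are applied after includes: any match skips the topic.
    if exclude_topic_regex and any(
        re.search(p, topic) for p in exclude_topic_regex
    ):
        return False
    return True


topics = ["/camera/image", "/camera/info", "/imu"]
kept = [
    t
    for t in topics
    if topic_is_decoded(t, include_topic_regex=["camera"], exclude_topic_regex=["info$"])
]
print(kept)  # ['/camera/image']
```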

def available_decoders() staticmethod

Return the list of all supported MCAP decoder identifiers.

def stream()

Return a lazy stream over all chunks in the MCAP file.

class MetricsCollector

Accumulator yielded by query_metrics on __enter__.

Use last_query() / queries to read snapshots accumulated so far; both are non-destructive. On context-manager exit any remaining snapshots are drained into this collector and the scope is unbound from the ContextVar, so the collector is still readable after the scope ends.

queries property

Non-destructive snapshot of all queries captured so far.

def clear()

Drop all captured snapshots from both the Rust buffer and this collector.

def last_query()

Most recently captured query, or None if none yet.

class Mp4Reader

Read chunks from an MP4 file.

entity_path property

The entity path under which chunks are emitted.

path property

The file path of the MP4 file.

def __init__(path, *, mode='stream', chunk_by_gop=True, timeline_name='video', timeline_type='duration', allow_b_frames=False, entity_path=None)

Construct a new MP4 reader.

PARAMETER DESCRIPTION
path

Path to the .mp4 file to read.

TYPE: str | Path

mode

How to convert the mp4 into chunks.

  • "stream" (default): emit a static VideoStream(codec=…) chunk followed by per-GOP (or per-sample) VideoSample chunks. The mp4 must use a codec representable as VideoCodec (H264, H265, AV1, VP8, VP9). By default it must also not contain B-frames (DTS must equal PTS — see issue #10090); set allow_b_frames=True to opt in to B-frame inputs.
  • "asset": emit an AssetVideo blob chunk plus a VideoFrameReference index chunk, matching the behavior of rerun video.mp4.

TYPE: Literal['asset', 'stream'] DEFAULT: 'stream'

chunk_by_gop

Only meaningful when mode="stream". When True (default), each emitted Rerun chunk contains a keyframe plus all dependent samples up to (but not including) the next keyframe. When False, each sample becomes its own one-row Rerun chunk.

Passing chunk_by_gop=False together with mode="asset" raises ValueError.

TYPE: bool DEFAULT: True
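
The effect of chunk_by_gop on chunk boundaries can be illustrated with keyframe flags alone. The sketch below is plain Python and does not parse MP4 data; group_by_gop is a hypothetical helper, and placing any samples that precede the first keyframe into their own leading group is an assumption.

```python
def group_by_gop(is_keyframe):
    """Group sample indices so that each group starts at a keyframe.

    Assumption: samples before the first keyframe (if any) form their own leading group.
    """
    groups = []
    for i, key in enumerate(is_keyframe):
        if key or not groups:
            groups.append([i])  # start a new group at each keyframe
        else:
            groups[-1].append(i)  # dependent sample joins the current GOP
    return groups


flags = [True, False, False, True, False, True]
print(group_by_gop(flags))               # chunk_by_gop=True:  [[0, 1, 2], [3, 4], [5]]
print([[i] for i in range(len(flags))])  # chunk_by_gop=False: one sample per chunk
```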

timeline_name

Name of the timeline used for stream-mode samples and for the VideoFrameReference index chunk in asset mode. Defaults to "video".

TYPE: str DEFAULT: 'video'

timeline_type

How to interpret the timeline values. Applies to both modes (the stream-mode sample timeline and the asset-mode VideoFrameReference index timeline).

The emitted values are the mp4 PTS (nanoseconds since the start of the video) in both cases — only the declared Arrow type changes:

  • "duration" (default): the values are typed as a duration, the natural mp4 PTS interpretation.
  • "timestamp": the same PTS values, typed as nanoseconds since the Unix epoch. The reader does not shift them, so until you retag them — via a downstream .map(...) on the chunk stream with caller-supplied wall-clock times (e.g. from a trajectory file) — they render as timestamps near 1970.

TYPE: Literal['duration', 'timestamp'] DEFAULT: 'duration'

allow_b_frames

When False (default), mode="stream" rejects mp4s containing B-frames because the VideoStream archetype cannot yet model differing DTS/PTS (see issue #10090). Pass True when you intend to transcode the samples downstream and only need the reader to surface the raw sample bytes. The emitted time column is marked unsorted in that case.

TYPE: bool DEFAULT: False

entity_path

Entity path under which chunks are emitted. When None (default), the entity path is derived from the full file path, keeping the filename and extension (e.g. foo/video.mp4 becomes /foo/video.mp4), matching the behavior of rerun video.mp4.

TYPE: str | None DEFAULT: None

def stream()

Return a lazy stream over all chunks in the MP4 file.

class MutateLens

A mutate lens that modifies the input component in-place.

Mutate lenses apply a selector transformation to the input component, replacing it in the chunk. By default, new row IDs are generated. Pass keep_row_ids=True to preserve original row IDs.

Example usage:

lens = MutateLens("Imu:accel", Selector(".x"))
def __init__(input_component, selector, *, keep_row_ids=False)

Create a new mutate lens.

PARAMETER DESCRIPTION
input_component

The component identifier to modify in-place.

TYPE: str

selector

A Selector or selector query string to apply.

TYPE: Selector | str

keep_row_ids

When True, preserve the original row IDs.

TYPE: bool DEFAULT: False

class OptimizationProfile dataclass

Named optimization profile passed to LazyChunkStream.collect(optimize=...).

Two presets:

  • OptimizationProfile.LIVE: small chunks tuned for the live Viewer workflow.
  • OptimizationProfile.OBJECT_STORE: large chunks tuned for object-store-backed query and streaming (e.g. a catalog server).

The presets are fully concrete: every field has a value. Custom profiles built by calling OptimizationProfile(...) directly may pass None on the threshold fields to fall back to the SDK's internal default (OptimizationProfile.LIVE's thresholds).

LIVE class-attribute

Optimized for the live Viewer workflow: small chunks for low-latency rendering and fine-grained time-panel precision.

OBJECT_STORE class-attribute

Optimized for object-store-backed storage (e.g. a catalog server): larger chunks tuned for query throughput and streaming over the network.

extra_passes = 50 class-attribute instance-attribute

Number of extra convergence passes run after the initial insert.

fix_keyframe = False class-attribute instance-attribute

If True, any user-supplied VideoStream:is_keyframe data is dropped and re-derived from the encoded samples during video rebatching.

gop_batching = True class-attribute instance-attribute

If True (default), video stream chunks are rebatched to align with GoP (keyframe) boundaries after normal compaction.

GoP rebatching never splits a GoP across chunks, so streams with long keyframe intervals can produce chunks much larger than max_bytes.

max_bytes = None class-attribute instance-attribute

Chunk size threshold in bytes. None means use LIVE's default.

max_rows = None class-attribute instance-attribute

Maximum rows per sorted chunk. None means use LIVE's default.

max_rows_if_unsorted = None class-attribute instance-attribute

Maximum rows per unsorted chunk. None means use LIVE's default.

split_size_ratio = None class-attribute instance-attribute

If set, split chunks so no two archetype groups sharing a chunk differ in byte size by more than this factor. Values should be >= 1; at 1.0, every archetype is forced into its own chunk.

This keeps large columns (images, videos, blobs) out of the same chunk as small columns (scalars, transforms, text), so the viewer can fetch just the small columns without dragging along the large payload. Components belonging to the same archetype are always kept together.

A good starting value is 10.0. If None (default), no splitting is performed.

class ParquetReader

Read chunks from a Parquet file.

The reader turns raw parquet columns into grouped, time-indexed Chunks of struct/scalar components. To map those struct fields into Rerun archetypes (translation, rotation, scalars, …), apply lenses to the resulting .stream() — see DeriveLens:

Example
import rerun as rr
from rerun.experimental import ParquetReader, DeriveLens, Selector

# `path` is the location of the .parquet file to read.
store = (
    ParquetReader(path, index_columns=[("frame_index", "sequence")])
    .stream()
    .lenses(
        [DeriveLens("data").to_component(rr.Scalars.descriptor_scalars(), Selector(".x"))],
        content="/obs",
    )
    .collect()
)
path property

The file path of the Parquet file.

def __init__(path, *, entity_path_prefix=None, column_grouping='prefix', delimiter='_', prefixes=None, use_structs=True, static_columns=None, index_columns=None)

Load a parquet file with configurable column grouping.

PARAMETER DESCRIPTION
path

Path to the .parquet file.

TYPE: str | Path

entity_path_prefix

Optional prefix for all entity paths (e.g. "/world").

TYPE: str | None DEFAULT: None

column_grouping

How to group columns into chunks. "prefix" splits column names on delimiter and groups by the first segment. "individual" gives each column its own chunk. "explicit_prefixes" groups columns by the explicit prefix strings in prefixes.

TYPE: str DEFAULT: 'prefix'

delimiter

Character used to split column names when column_grouping="prefix".

TYPE: str DEFAULT: '_'

prefixes

Explicit prefix strings for grouping columns. Required when column_grouping="explicit_prefixes". Columns starting with a prefix are grouped together; the prefix is stripped from the component name. Prefixes are tried longest-first to avoid ambiguity.

TYPE: list[str] | None DEFAULT: None
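The two grouping strategies described above can be sketched in plain Python. This is only a model of the documented rules, not the reader's code: the helper names and column names are hypothetical, and dropping columns that match no explicit prefix is an assumption of the sketch, not documented behavior.

```python
from collections import defaultdict


# Hypothetical model of column_grouping="prefix": group by the first segment.
def group_by_prefix(columns, delimiter="_"):
    groups = defaultdict(list)
    for column in columns:
        first_segment = column.split(delimiter, 1)[0]
        groups[first_segment].append(column)
    return dict(groups)


# Hypothetical model of column_grouping="explicit_prefixes": prefixes are tried
# longest-first, and the matched prefix is stripped from the component name.
def group_by_explicit_prefixes(columns, prefixes):
    ordered = sorted(prefixes, key=len, reverse=True)
    groups = defaultdict(list)
    for column in columns:
        for prefix in ordered:
            if column.startswith(prefix):
                groups[prefix].append(column[len(prefix):])
                break
        # Assumption: unmatched columns are simply skipped in this sketch.
    return dict(groups)


by_prefix = group_by_prefix(["pose_x", "pose_y", "imu_ax"])
by_explicit = group_by_explicit_prefixes(["arm_left_x", "arm_x"], ["arm_", "arm_left_"])
print(by_prefix)
print(by_explicit)
```

Trying the longer prefix first is what keeps `arm_left_x` out of the `arm_` group.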

use_structs

When True (default) and column_grouping="prefix" or "explicit_prefixes", columns sharing a prefix are packed into a single Arrow Struct component. When False, each column becomes a separate component (the pre-struct layout). Ignored when column_grouping="individual".

TYPE: bool DEFAULT: True

static_columns

Column names whose values are constant across all rows. These are emitted once as timeless/static data. An error is raised if a listed column contains varying values.

TYPE: list[str] | None DEFAULT: None

index_columns

List of columns to use as timeline indices. Each entry is a tuple: (name, type) or (name, type, unit).

The type specifies the timeline kind:

  • "timestamp": time since epoch
  • "duration": elapsed time
  • "sequence": ordinal integer index

The unit describes what the raw integer values in the column represent (not a desired output unit). Rerun stores all timestamps in nanoseconds internally, so values are scaled accordingly. Supported: "ns" (default), "us", "ms", "s". Ignored for "sequence" type.

When omitted, a synthetic row_index sequence timeline is generated automatically (one entry per row).

TYPE: list[tuple[str, str] | tuple[str, str, str]] | None DEFAULT: None
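The unit handling described for index_columns amounts to scaling raw integers to nanoseconds. A minimal sketch of that arithmetic follows; the helper `to_nanos` and the table name are hypothetical and not part of the API.

```python
# Nanoseconds per supported unit, as listed for the `unit` tuple entry.
NANOS_PER_UNIT = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}


def to_nanos(raw_value, unit="ns"):
    """Scale a raw integer column value to nanoseconds (illustrative helper)."""
    return raw_value * NANOS_PER_UNIT[unit]


print(to_nanos(1_500, "ms"))  # a column holding milliseconds
print(to_nanos(42))           # the default unit is already nanoseconds
```

For example, an entry such as ("t", "timestamp", "ms") declares that the column `t` holds milliseconds since epoch, so a raw value of 1500 corresponds to 1,500,000,000 ns.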

def stream()

Return a lazy stream over all chunks in the Parquet file.

class QueryMetrics dataclass

One query's metrics, captured at the moment its last per-partition stream finished.

Mirrors the Rust-side re_datafusion::QuerySnapshot. The same numbers are produced via three transports: this dataclass (Python), DataFusion's EXPLAIN ANALYZE, and the PostHog analytics OTLP span. Field naming differs across the three:

  • Timing fields here are datetime.timedelta (total_duration, time_to_first_chunk, …). EXPLAIN ANALYZE uses DataFusion Time metrics, which print their own units. The OTLP analytics attributes keep an explicit _us suffix and carry integer microseconds (total_duration_us, time_to_first_chunk_us, …) because OTLP attribute values are scalar (i64 / f64 / bool / string) and can't carry a duration natively.
  • query_chunks_per_segment_mean is a float and does not appear in EXPLAIN ANALYZE, since DataFusion Count metrics are integer-only. The corresponding _min / _max integer fields are surfaced in all three transports.
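The relationship between the timedelta fields and the integer-microsecond attributes can be shown with the standard library alone. The helper `to_us` is hypothetical, and the duration value is made up for illustration.

```python
from datetime import timedelta


def to_us(duration: timedelta) -> int:
    """Convert a timedelta to whole microseconds (illustrative helper)."""
    return duration // timedelta(microseconds=1)


# Stand-in for a QueryMetrics.total_duration value.
total_duration = timedelta(milliseconds=12, microseconds=345)
total_duration_us = to_us(total_duration)
print(total_duration_us)
```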

Note: fetch_direct_max_attempt is the per-partition max summed across partitions (rather than the cross-partition true max), because DataFusion's MetricsSet::Count lacks a fetch_max aggregation. For single-partition queries the two are identical.
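The difference between the reported value and a true cross-partition maximum is easy to see numerically. The sketch below uses made-up attempt counts and hypothetical helper names.

```python
# Each inner list holds the attempt counts observed in one partition.
def reported_max_attempt(per_partition_attempts):
    """Per-partition max, summed across partitions (what the note describes)."""
    return sum(max(attempts) for attempts in per_partition_attempts)


def true_max_attempt(per_partition_attempts):
    """The maximum over all partitions, for comparison."""
    return max(max(attempts) for attempts in per_partition_attempts)


two_partitions = [[1, 3], [2, 2]]
one_partition = [[1, 3]]

print(reported_max_attempt(two_partitions), true_max_attempt(two_partitions))
print(reported_max_attempt(one_partition), true_max_attempt(one_partition))
```

With two partitions the reported value (3 + 2) overstates the true maximum of 3; with a single partition both agree.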

fetch_bytes property

Total bytes fetched across both gRPC and direct transports.

fetch_requests property

Total fetch requests across both gRPC and direct transports.

class RrdReader

Read chunks from an RRD file.

Use recordings() or blueprints() to discover what stores exist in the file, then stream() or store() to access a specific one. When no store is specified, the first recording store is used.

path property

The file path of the RRD file.

def blueprints()

List the blueprint entries in this RRD file.

def recordings()

List the recording entries in this RRD file.

def store(*, store=None)

Open a specific store as a LazyStore.

Reads the manifest immediately; chunk data is loaded on demand. Legacy RRDs without a footer/manifest are not supported here — use RrdReader(...).stream().collect() for those.

PARAMETER DESCRIPTION
store

Which store to load. If None, uses the first recording store.

TYPE: StoreEntry | None DEFAULT: None

RAISES DESCRIPTION
ValueError

If the specified store is not in this RRD file, or None was passed and the file contains no recording stores.

def stream(*, store=None)

Return a lazy stream over chunks from a store.

PARAMETER DESCRIPTION
store

Which store to stream. If None, uses the first recording store.

TYPE: StoreEntry | None DEFAULT: None

RAISES DESCRIPTION
ValueError

If the specified store is not in this RRD file, or None was passed and the file contains no recording stores.

class Selector

A jq-like query selector for Arrow arrays.

Selectors provide a path-based query language (inspired by jq) that operates on Arrow arrays in a columnar fashion.

Syntax overview:

  • .field — access a named field in a struct
  • [] — iterate over every element of a list
  • [N] — index into a list by position
  • ? — error suppression / optional operator
  • ! — assert non-null
  • | — pipe the output of one expression to another

Example usage:

selector = Selector(".location")
result = selector.execute(my_struct_array)

Selectors can also be piped into Python functions:

import pyarrow.compute as pc  # import explicitly; `pa.compute` is not always loaded by `import pyarrow`
selector = Selector(".values").pipe(lambda arr: pc.multiply(arr, 2))
result = selector.execute(my_struct_array)
def __init__(query)

Parse a selector from a query string.

PARAMETER DESCRIPTION
query

The selector query string (e.g. ".field", ".foo | .bar").

TYPE: str

def execute(source)

Execute this selector against a pyarrow array.

PARAMETER DESCRIPTION
source

The input Arrow array to query.

TYPE: Array

RETURNS DESCRIPTION
The result array, or None if the selector's error was suppressed.
def execute_per_row(source)

Execute this selector against each row of a pyarrow list array.

The output is guaranteed to have the same number of rows as the input.

PARAMETER DESCRIPTION
source

The input Arrow list array to query.

TYPE: Array

RETURNS DESCRIPTION
The result list array, or None if the selector's error was suppressed.
def pipe(func)

Pipe the output of this selector through a transformation function or another selector.

Returns a new selector; the original is not modified.

PARAMETER DESCRIPTION
func

A callable that accepts a pyarrow.Array and returns a pyarrow.Array or None, or another Selector to chain.

TYPE: Callable[[Array], Array | None] | Selector

RETURNS DESCRIPTION
A new Selector with the transformation applied.

class StoreEntry

Describes a store found in an RRD file.

application_id property

The application ID of the store.

kind property

Store kind: "recording" or "blueprint".

recording_id property

The recording ID of the store.

class StreamingReader

Bases: Protocol

Protocol for readers that produce a sequential stream of chunks.

All readers provide stream() -> LazyChunkStream. Readers for indexable formats will additionally satisfy IndexedReader, which adds store() -> LazyStore and load() -> ChunkStore.

def stream()

Return a lazy stream over all chunks from this source.

class ViewerClient

A connection to an instance of a Rerun viewer.

Use the connect classmethod to attach to an already-running viewer, or spawn to start a fresh one (e.g. in headless mode for CI screenshots).

Spawned-viewer teardown:

  • Explicit close always terminates the spawned viewer.
  • For an attached viewer (detach_process=False), exiting a with block or garbage-collecting the client also terminates the viewer.
  • A detached viewer keeps running through with exits and garbage collection. Only an explicit close() shuts it down.
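The teardown rules above can be condensed into a small decision function. This is a model of the documented behavior only; the function name and the event labels are hypothetical and do not exist in the API.

```python
# Hypothetical model of the spawned-viewer teardown rules.
# `event` is one of "close", "with_exit", or "gc".
def terminates_viewer(event, detached):
    if event == "close":
        # Explicit close always terminates the spawned viewer.
        return True
    if event in ("with_exit", "gc"):
        # Implicit teardown only affects attached viewers.
        return not detached
    raise ValueError(f"unknown event: {event!r}")


for detached in (False, True):
    for event in ("close", "with_exit", "gc"):
        print(detached, event, terminates_viewer(event, detached))
```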

Warning

This API is experimental and may change or be removed in future versions.

url property

The rerun+http://…/proxy URL of the viewer this client is connected to.

def __init__(url=_DEFAULT_URL, *, _pid=None, _kill_on_exit=False)

Low-level constructor.

Prefer ViewerClient.connect or ViewerClient.spawn.

PARAMETER DESCRIPTION
url

The URL to connect to. The scheme must be one of rerun://, rerun+http://, or rerun+https://, and the pathname must be /proxy — the same form accepted by rerun.connect_grpc. Defaults to rerun+http://127.0.0.1:9876/proxy.

TYPE: str DEFAULT: _DEFAULT_URL

_pid

Internal — set by spawn() to the pid of the launched viewer so that close() can terminate it.

TYPE: int | None DEFAULT: None

_kill_on_exit

Internal — set by spawn() to indicate that implicit teardown (__exit__, __del__) should call close(). See the class docstring for the full teardown rules.

TYPE: bool DEFAULT: False

def close()

Close the client, terminating the spawned viewer.

Emits a UserWarning and is a no-op if there is no spawned viewer to terminate (either the client never spawned one, or it has already been closed). Safe to call multiple times — only the first call has an effect.

def connect(url=_DEFAULT_URL) classmethod

Connect to an already-running viewer.

PARAMETER DESCRIPTION
url

The URL to connect to. The scheme must be one of rerun://, rerun+http://, or rerun+https://, and the pathname must be /proxy — the same form accepted by rerun.connect_grpc. Defaults to rerun+http://127.0.0.1:9876/proxy.

TYPE: str DEFAULT: _DEFAULT_URL
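The URL constraints stated above (allowed schemes and the /proxy path) can be checked ahead of time with the standard library. The helper below is a hypothetical pre-flight check, not part of ViewerClient, and the client may apply further validation of its own.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"rerun", "rerun+http", "rerun+https"}


def looks_like_viewer_url(url):
    """Return True if the scheme and path match the documented form (illustrative)."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and parsed.path == "/proxy"


print(looks_like_viewer_url("rerun+http://127.0.0.1:9876/proxy"))
print(looks_like_viewer_url("http://127.0.0.1:9876/proxy"))
print(looks_like_viewer_url("rerun+http://127.0.0.1:9876/"))
```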

def save_screenshot(file_path, view_id=None)

Save a screenshot to a file.

Warning

This API is experimental and may change or be removed in future versions.

PARAMETER DESCRIPTION
file_path

The path where the screenshot will be saved.

Important

This path is relative to the viewer's filesystem, not the client's. If your viewer runs on a different machine, the screenshot will be saved there.

TYPE: str

view_id

Optional view ID to screenshot. If None, screenshots the entire viewer.

TYPE: str | UUID | None DEFAULT: None

def send_table(name, table)

Send a table to the viewer.

A table is represented as a dataframe defined by an Arrow record batch.

PARAMETER DESCRIPTION
name

The table name.

Note

The table name serves as an identifier. If you send a table with the same name twice, the second table will replace the first one.

TYPE: str

table

The table data to send: an Arrow RecordBatch, a list of RecordBatch objects, or a DataFrame.

TYPE: RecordBatch | list[RecordBatch] | DataFrame

def spawn(*, headless=False, port=9876, memory_limit='75%', server_memory_limit='1GiB', hide_welcome_screen=False, detach_process=None, executable_name='rerun', executable_path=None) classmethod

Spawn a fresh viewer process and connect to it.

PARAMETER DESCRIPTION
headless

Run the spawned viewer in headless mode (no OS window). The viewer still listens for gRPC connections, so the SDK can keep logging data and request screenshots via save_screenshot.

A working graphics stack must be present — either a real GPU/driver or a software rasterizer like Mesa's lavapipe. In a bare CI container with no Vulkan adapter, the viewer panics on startup with "No graphics adapter was found".

TYPE: bool DEFAULT: False

port

The port to listen on.

TYPE: int DEFAULT: 9876

memory_limit

An upper limit on how much memory the Rerun Viewer should use. When this limit is reached, Rerun will drop the oldest data. Example: 16GB or 50% (of system total).

TYPE: str DEFAULT: '75%'

server_memory_limit

An upper limit on how much memory the gRPC server running in the same process as the Rerun Viewer should use. When this limit is reached, Rerun will drop the oldest data. Example: 16GB or 50% (of system total).

Defaults to 1GiB.

TYPE: str DEFAULT: '1GiB'

hide_welcome_screen

Hide the normal Rerun welcome screen.

TYPE: bool DEFAULT: False

detach_process

Detach the spawned viewer from this Python process.

A detached viewer survives unexpected parent termination (e.g. crashes or terminal hang-up), with block exits, and garbage collection — to take it down you must call close explicitly. An attached viewer is killed by all of those.

Defaults to True for a regular GUI viewer and False when headless=True, since a leftover invisible viewer is rarely what you want.

TYPE: bool | None DEFAULT: None

executable_name

Specifies the name of the Rerun executable. You can omit the .exe suffix on Windows.

Defaults to rerun.

TYPE: str DEFAULT: 'rerun'

executable_path

Enforce a specific executable to use instead of searching through PATH for executable_name.

Unspecified by default.

TYPE: str | None DEFAULT: None

def query_metrics()

Capture DataFusion query metrics for every query that runs inside the with block.

Yields a MetricsCollector; read .last_query() or .queries mid-scope or after the scope exits.

The scope is bound to the current contextvars.Context: every re_datafusion query built from dataset.reader(…) while this scope is open contributes a QueryMetrics record. Nested query_metrics() scopes each see queries built inside them. Queries built in another thread or asyncio task that did not inherit this context (e.g. a raw threading.Thread rather than one started via contextvars.copy_context()) are not captured.

The collectors are bound to a query at reader() time, so a df built inside the with block whose .collect() runs after __exit__ still flows to the collector; a df built outside but executed inside does not.
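The context-inheritance rule can be demonstrated with the standard library alone. In the sketch below, `active_scope` is a hypothetical stand-in for whatever the scope stores in the context. It also assumes a standard CPython build in which a raw thread starts from an empty context; builds configured so that threads inherit the caller's context behave differently.

```python
import contextvars
import threading

# Hypothetical stand-in for the state an open scope binds to the current context.
active_scope = contextvars.ContextVar("active_scope", default=None)


def record_scope(results, key):
    # Record which scope, if any, is visible from the calling thread.
    results[key] = active_scope.get()


results = {}
active_scope.set("scope-1")

# A raw thread does not see the value set in the parent's context.
raw = threading.Thread(target=record_scope, args=(results, "raw"))

# Running the target inside a copy of the current context carries the value over.
ctx = contextvars.copy_context()
copied = threading.Thread(target=ctx.run, args=(record_scope, results, "copied"))

for thread in (raw, copied):
    thread.start()
for thread in (raw, copied):
    thread.join()

print(results)
```

Only the thread started through the copied context observes the scope, which mirrors why queries built in a non-inheriting thread are not captured.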

Examples:

import rerun as rr
from rerun.experimental import query_metrics

client = rr.catalog.CatalogClient("rerun://…")
dataset = client.get_dataset(name="…")

with query_metrics() as m:
    df = dataset.reader(index="time_1").limit(100)
    df.collect()
    print(m.last_query())

def send_chunks(chunks, *, recording=None)

Send chunks to a recording stream. Blocks until every chunk has been queued.

Note

For LazyChunkStream and LazyStore inputs, this call triggers execution and/or loading, and blocks until that work has finished.

PARAMETER DESCRIPTION
chunks

One of:

  • A single Chunk.
  • A LazyChunkStream — consume the stream and forward all chunks to the recording stream.
  • A LazyStore — send all chunks to the recording stream. This triggers loading all chunks from the source.
  • A ChunkStore — send all chunks to the recording stream (fast since all chunks are already loaded).
  • Any iterable of Chunk objects.

Source store identity (application_id, recording_id) is not preserved: chunks adopt the destination recording's identity.

TYPE: Chunk | LazyChunkStream | LazyStore | ChunkStore | Iterable[Chunk]

recording

Recording stream to send into. Defaults to the current active recording.

TYPE: RecordingStream | None DEFAULT: None