pub struct Chunk {
pub(crate) id: ChunkId,
pub(crate) entity_path: EntityPath,
pub(crate) heap_size_bytes: AtomicU64,
pub(crate) is_sorted: bool,
pub(crate) row_ids: StructArray,
pub(crate) timelines: HashMap<Timeline, TimeColumn, BuildHasherDefault<NoHashHasher<Timeline>>>,
pub(crate) components: ChunkComponents,
}
Dense arrow-based storage of N rows of multi-component, multi-temporal data for a specific entity.
This is our core data structure for logging, storing, querying, and transporting data.
The chunk as a whole is always ascendingly sorted by RowId before it gets manipulated in any way.
Its time columns might or might not be ascendingly sorted, depending on how the data was logged.
This is the in-memory representation of a chunk, optimized for efficient manipulation of the data within. For transport, see crate::TransportChunk instead.
Fields
id: ChunkId
entity_path: EntityPath
heap_size_bytes: AtomicU64
is_sorted: bool
row_ids: StructArray
timelines: HashMap<Timeline, TimeColumn, BuildHasherDefault<NoHashHasher<Timeline>>>
components: ChunkComponents
Implementations
impl Chunk
pub fn builder(entity_path: EntityPath) -> ChunkBuilder
Initializes a new ChunkBuilder.
pub fn builder_with_id(id: ChunkId, entity_path: EntityPath) -> ChunkBuilder
Initializes a new ChunkBuilder.
The final Chunk will have the specified id.
impl Chunk
pub fn get_first_component(
    &self,
    component_name: &ComponentName,
) -> Option<&GenericListArray<i32>>
Returns any list-array that matches the given ComponentName.
This is undefined behavior if there is more than one component with that name.
impl Chunk
pub fn are_similar(lhs: &Chunk, rhs: &Chunk) -> bool
Returns true if two Chunks are similar, although not byte-for-byte equal.
In particular, this ignores chunk and row IDs, as well as temporal timestamps.
Useful for tests.
pub fn are_equal(&self, other: &Chunk) -> bool
impl Chunk
pub fn clone_as(&self, id: ChunkId, first_row_id: RowId) -> Chunk
Clones the chunk and assigns new IDs to the resulting chunk and its rows.
first_row_id will become the RowId of the first row in the duplicated chunk. Each subsequent RowId will be monotonically increasing.
pub fn into_static(self) -> Chunk
Clones the chunk into a new chunk without any time data.
pub fn zeroed(self) -> Chunk
Clones the chunk into a new chunk where all RowIds are RowId::ZERO.
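For illustration, here is a minimal sketch of how these cloning variants might be combined. It assumes the relevant types are in scope and that ChunkId::new and RowId::new are available for minting fresh IDs (neither is documented on this page).

// Duplicate a chunk under a fresh identity: a new ChunkId, plus new RowIds
// starting at `first_row_id` and increasing monotonically from there.
fn duplicate_with_fresh_ids(chunk: &Chunk) -> Chunk {
    chunk.clone_as(ChunkId::new(), RowId::new())
}

// Normalize a chunk for comparison in tests: drop all time data, then zero out the RowIds.
fn normalized_for_tests(chunk: Chunk) -> Chunk {
    chunk.into_static().zeroed()
}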
pub fn time_range_per_component(
    &self,
) -> HashMap<Timeline, HashMap<ComponentName, HashMap<ComponentDescriptor, ResolvedTimeRange, BuildHasherDefault<NoHashHasher<ComponentDescriptor>>>, BuildHasherDefault<NoHashHasher<ComponentName>>>, BuildHasherDefault<NoHashHasher<Timeline>>>
Computes the time range covered by each individual component column on each timeline.
This is different from the time range covered by the Chunk as a whole because component columns are potentially sparse.
This is crucial for indexing and queries to work properly.
pub fn num_events_cumulative(&self) -> u64
The cumulative number of events in this chunk.
I.e. how many component batches (“cells”) were logged in total?
pub fn num_events_cumulative_per_unique_time(
    &self,
    timeline: &Timeline,
) -> Vec<(TimeInt, u64)>
The cumulative number of events in this chunk for each unique timestamp.
I.e. how many component batches (“cells”) were logged in total at each timestamp?
Keep in mind that a timestamp can appear multiple times in a Chunk.
This method will do a sum accumulation to account for these cases (i.e. every timestamp in the returned vector is guaranteed to be unique).
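A sketch of how the per-timestamp counts might be consumed, assuming a Timeline value is already at hand and the relevant types are in scope:

// Print how many component batches ("cells") were logged at each unique timestamp
// on the given timeline. Duplicate timestamps within the chunk have already been
// summed together by `num_events_cumulative_per_unique_time`.
fn print_event_histogram(chunk: &Chunk, timeline: &Timeline) {
    for (time, num_events) in chunk.num_events_cumulative_per_unique_time(timeline) {
        println!("{time:?}: {num_events} events");
    }
    println!("total: {} events", chunk.num_events_cumulative());
}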
pub fn num_events_for_component(
    &self,
    component_name: ComponentName,
) -> Option<u64>
The number of events in this chunk for the specified component.
I.e. how many component batches (“cells”) were logged in total for this component?
pub fn row_id_range_per_component(
    &self,
) -> HashMap<ComponentName, HashMap<ComponentDescriptor, (RowId, RowId), BuildHasherDefault<NoHashHasher<ComponentDescriptor>>>, BuildHasherDefault<NoHashHasher<ComponentName>>>
Computes the RowId range covered by each individual component column.
This is different from the RowId range covered by the Chunk as a whole because component columns are potentially sparse.
This is crucial for indexing and queries to work properly.
impl Chunk
pub fn new(
    id: ChunkId,
    entity_path: EntityPath,
    is_sorted: Option<bool>,
    row_ids: StructArray,
    timelines: HashMap<Timeline, TimeColumn, BuildHasherDefault<NoHashHasher<Timeline>>>,
    components: ChunkComponents,
) -> Result<Chunk, ChunkError>
Creates a new Chunk.
This will fail if the passed-in data is malformed in any way – see Self::sanity_check for details.
If you know for sure whether the data is already appropriately sorted, specify is_sorted. When left unspecified (None), it will be computed in O(n) time.
For a row-oriented constructor, see Self::builder.
pub fn from_native_row_ids(
    id: ChunkId,
    entity_path: EntityPath,
    is_sorted: Option<bool>,
    row_ids: &[RowId],
    timelines: HashMap<Timeline, TimeColumn, BuildHasherDefault<NoHashHasher<Timeline>>>,
    components: ChunkComponents,
) -> Result<Chunk, ChunkError>
Creates a new Chunk.
This will fail if the passed-in data is malformed in any way – see Self::sanity_check for details.
If you know for sure whether the data is already appropriately sorted, specify is_sorted. When left unspecified (None), it will be computed in O(n) time.
For a row-oriented constructor, see Self::builder.
pub fn from_auto_row_ids(
    id: ChunkId,
    entity_path: EntityPath,
    timelines: HashMap<Timeline, TimeColumn, BuildHasherDefault<NoHashHasher<Timeline>>>,
    components: ChunkComponents,
) -> Result<Chunk, ChunkError>
Creates a new Chunk.
This will fail if the passed-in data is malformed in any way – see Self::sanity_check for details.
The data is assumed to be sorted in RowId order. Sequential RowIds will be generated for each row in the chunk.
pub fn new_static(
    id: ChunkId,
    entity_path: EntityPath,
    is_sorted: Option<bool>,
    row_ids: StructArray,
    components: ChunkComponents,
) -> Result<Chunk, ChunkError>
Simple helper for Self::new for static data.
For a row-oriented constructor, see Self::builder.
pub fn empty(id: ChunkId, entity_path: EntityPath) -> Chunk
pub fn add_component(
    &mut self,
    component_desc: ComponentDescriptor,
    list_array: GenericListArray<i32>,
) -> Result<(), ChunkError>
Unconditionally inserts an ArrowListArray as a component column.
Removes and replaces the column if it already exists.
This will fail if the end result is malformed in any way – see Self::sanity_check.
pub fn add_timeline(
    &mut self,
    chunk_timeline: TimeColumn,
) -> Result<(), ChunkError>
Unconditionally inserts a TimeColumn.
Removes and replaces the column if it already exists.
This will fail if the end result is malformed in any way – see Self::sanity_check.
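A sketch of inserting both kinds of columns into an existing chunk. It assumes a ComponentDescriptor, a list array, and a TimeColumn have already been built elsewhere; both calls re-validate the chunk and fail if the result would be malformed.

fn insert_columns(
    chunk: &mut Chunk,
    component_desc: ComponentDescriptor,
    list_array: GenericListArray<i32>,
    time_column: TimeColumn,
) -> Result<(), ChunkError> {
    // Replaces any existing column with the same component descriptor.
    chunk.add_component(component_desc, list_array)?;
    // Likewise replaces any existing column for the same timeline.
    chunk.add_timeline(time_column)?;
    Ok(())
}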
impl Chunk
pub fn id(&self) -> ChunkId
pub fn entity_path(&self) -> &EntityPath
pub fn num_columns(&self) -> usize
How many columns in total? Includes control, time, and component columns.
pub fn num_controls(&self) -> usize
pub fn num_timelines(&self) -> usize
pub fn num_components(&self) -> usize
pub fn num_rows(&self) -> usize
pub fn is_empty(&self) -> bool
pub fn row_ids_array(&self) -> &StructArray
pub fn row_ids_raw(&self) -> (&PrimitiveArray<UInt64Type>, &PrimitiveArray<UInt64Type>)
Returns the RowIds in their rawest form: a tuple of (times, counters) arrays.
pub fn row_ids(&self) -> impl Iterator<Item = RowId>
All the RowIds in this chunk.
This could be in any order if this chunk is unsorted.
pub fn component_row_ids(
    &self,
    component_name: &ComponentName,
) -> impl Iterator<Item = RowId>
Returns an iterator over the RowIds of a Chunk, for a given component.
This is different from Self::row_ids: it will only yield RowIds for rows at which there is data for the specified component_name.
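A sketch contrasting the two iterators: Self::row_ids yields every row, while Self::component_row_ids only yields rows that actually hold data for the given component (assuming a ComponentName is in scope).

// Count all rows versus rows that have non-null data for the given component.
fn count_rows_with_component(chunk: &Chunk, component_name: &ComponentName) -> (usize, usize) {
    let total_rows = chunk.row_ids().count();
    let rows_with_data = chunk.component_row_ids(component_name).count();
    (total_rows, rows_with_data)
}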
pub fn row_id_range(&self) -> Option<(RowId, RowId)>
pub fn is_static(&self) -> bool
pub fn timelines(&self) -> &HashMap<Timeline, TimeColumn, BuildHasherDefault<NoHashHasher<Timeline>>>
pub fn component_names(&self) -> impl Iterator<Item = ComponentName>
pub fn component_descriptors(&self) -> impl Iterator<Item = ComponentDescriptor>
pub fn components(&self) -> &ChunkComponents
pub fn timepoint_max(&self) -> TimePoint
Computes the maximum value for each and every timeline present across this entire chunk, and returns the corresponding TimePoint.
impl Chunk
pub fn sanity_check(&self) -> Result<(), ChunkError>
Returns an error if the Chunk’s invariants are not upheld.
Costly checks are only run in debug builds.
impl Chunk
pub fn component_batch_raw(
    &self,
    component_name: &ComponentName,
    row_index: usize,
) -> Option<Result<Arc<dyn Array>, ChunkError>>
Returns the raw data for the specified component.
Returns an error if the row index is out of bounds.
pub fn component_batch<C>(
    &self,
    row_index: usize,
) -> Option<Result<Vec<C>, ChunkError>>
where
    C: Component,
Returns the deserialized data for the specified component.
Returns an error if the data cannot be deserialized, or if the row index is out of bounds.
pub fn component_instance_raw(
    &self,
    component_name: &ComponentName,
    row_index: usize,
    instance_index: usize,
) -> Option<Result<Arc<dyn Array>, ChunkError>>
Returns the raw data for the specified component at the given instance index.
Returns an error if either the row index or the instance index is out of bounds.
pub fn component_instance<C>(
    &self,
    row_index: usize,
    instance_index: usize,
) -> Option<Result<C, ChunkError>>
where
    C: Component,
Returns the component data of the specified instance.
Returns an error if the data cannot be deserialized, or if either the row index or the instance index is out of bounds.
pub fn component_mono_raw(
    &self,
    component_name: &ComponentName,
    row_index: usize,
) -> Option<Result<Arc<dyn Array>, ChunkError>>
Returns the raw data for the specified component, assuming a mono-batch.
Returns an error if the row index is out of bounds, or if the underlying batch is not of unit length.
pub fn component_mono<C>(
    &self,
    row_index: usize,
) -> Option<Result<C, ChunkError>>
where
    C: Component,
Returns the deserialized data for the specified component, assuming a mono-batch.
Returns an error if the data cannot be deserialized, if the row index is out of bounds, or if the underlying batch is not of unit length.
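These accessors return Option<Result<…>>: None when the component column is absent, Err when deserialization or a bounds/length check fails. A minimal sketch of unwrapping that shape for an arbitrary component type C (the helper names are illustrative only):

// Fetch the whole batch at `row_index`, treating an absent component column as empty.
fn batch_or_default<C: Component>(chunk: &Chunk, row_index: usize) -> Result<Vec<C>, ChunkError> {
    match chunk.component_batch::<C>(row_index) {
        Some(result) => result,  // column present; may still fail to deserialize
        None => Ok(Vec::new()),  // no column for `C` in this chunk
    }
}

// Fetch a single value, assuming the underlying batch has unit length;
// returns None if the component is absent or if any check fails.
fn mono_or_none<C: Component>(chunk: &Chunk, row_index: usize) -> Option<C> {
    chunk.component_mono::<C>(row_index).and_then(Result::ok)
}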
impl Chunk
pub fn to_unit(self: &Arc<Chunk>) -> Option<UnitChunkShared>
Turns the chunk into a UnitChunkShared, if possible.
pub fn into_unit(self) -> Option<UnitChunkShared>
Turns the chunk into a UnitChunkShared, if possible.
impl Chunk
pub fn iter_indices(
    &self,
    timeline: &Timeline,
) -> impl Iterator<Item = (TimeInt, RowId)>
Returns an iterator over the indices ((TimeInt, RowId)) of a Chunk, for a given timeline.
If the chunk is static, timeline will be ignored.
pub fn iter_component_indices(
    &self,
    timeline: &Timeline,
    component_name: &ComponentName,
) -> impl Iterator<Item = (TimeInt, RowId)>
Returns an iterator over the indices ((TimeInt, RowId)) of a Chunk, for a given timeline and component.
If the chunk is static, timeline will be ignored.
This is different from Self::iter_indices in that it will only yield indices for rows at which there is data for the specified component_name.
See also Self::iter_indices.
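A sketch of walking the (TimeInt, RowId) index pairs for a single component on one timeline; for a static chunk the timeline argument is ignored, as noted above.

// Collect the index pairs for rows that actually carry data for `component_name`.
fn collect_component_indices(
    chunk: &Chunk,
    timeline: &Timeline,
    component_name: &ComponentName,
) -> Vec<(TimeInt, RowId)> {
    chunk.iter_component_indices(timeline, component_name).collect()
}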
pub fn iter_timepoints(&self) -> impl Iterator<Item = TimePoint>
pub fn iter_component_timepoints(
    &self,
    component_name: &ComponentName,
) -> impl Iterator<Item = TimePoint>
Returns an iterator over the TimePoints of a Chunk, for a given component.
This is different from Self::iter_timepoints in that it will only yield timepoints for rows at which there is data for the specified component_name.
See also Self::iter_timepoints.
pub fn iter_component_offsets(
    &self,
    component_name: &ComponentName,
) -> impl Iterator<Item = (usize, usize)>
Returns an iterator over the offsets ((offset, len)) of a Chunk, for a given component.
I.e. each (offset, len) pair describes the position of a component batch in the underlying arrow array of values.
pub fn iter_slices<'a, S>(
    &'a self,
    component_name: ComponentName,
) -> impl Iterator<Item = <S as ChunkComponentSlicer>::Item<'a>> + 'a
where
    S: 'a + ChunkComponentSlicer,
Returns an iterator over all the sliced component batches in a Chunk’s column, for a given component.
The generic S parameter decides the type of data returned. It is very permissive.
See ChunkComponentSlicer for all the available implementations.
This is a very fast path: the entire column will be downcasted at once, and then every component batch will be a slice reference into that global slice.
See also Self::iter_slices_from_struct_field.
pub fn iter_slices_from_struct_field<'a, S>(
    &'a self,
    component_name: ComponentName,
    field_name: &'a str,
) -> impl Iterator<Item = <S as ChunkComponentSlicer>::Item<'a>> + 'a
where
    S: 'a + ChunkComponentSlicer,
Returns an iterator over all the sliced component batches in a Chunk’s column, for a specific struct field of a given component.
The target component must be a StructArray.
The generic S parameter decides the type of data returned. It is very permissive.
See ChunkComponentSlicer for all the available implementations.
This is a very fast path: the entire column will be downcasted at once, and then every component batch will be a slice reference into that global slice.
See also Self::iter_slices.
impl Chunk
pub fn iter_indices_owned(
    self: Arc<Chunk>,
    timeline: &Timeline,
) -> impl Iterator<Item = (TimeInt, RowId)>
Returns an iterator over the indices ((TimeInt, RowId)) of a Chunk, for a given timeline.
If the chunk is static, timeline will be ignored.
The returned iterator outlives self, thus it can be passed around freely.
The tradeoff is that self must be an Arc.
See also Self::iter_indices.
impl Chunk
pub fn iter_component<C>(
    &self,
) -> ChunkComponentIter<C, impl Iterator<Item = (usize, usize)>>
where
    C: Component,
Returns an iterator over the deserialized batches of a Chunk, for a given component.
This is a dedicated fast path: the entire column will be downcasted and deserialized at once, and then every component batch will be a slice reference into that global slice. Use this when working with complex arrow datatypes and performance matters (e.g. ranging through enum types across many timestamps).
TODO(#5305): Note that, while this is much faster than deserializing each row individually, this still uses the old codegen’d deserialization path, which does some very unidiomatic Arrow things, and is therefore very slow at the moment. Avoid this on performance critical paths.
impl Chunk
pub fn latest_at(
    &self,
    query: &LatestAtQuery,
    component_name: ComponentName,
) -> Chunk
Runs a LatestAtQuery filter on a Chunk.
This behaves as a row-based filter: the result is a new Chunk that is vertically sliced to only contain the row relevant for the specified query.
The resulting Chunk is guaranteed to contain all the same columns as the queried chunk: there is no horizontal slicing going on.
An empty Chunk (i.e. 0 rows, but N columns) is returned if the query yields nothing.
Because the resulting chunk doesn’t discard any column information, you can find extra relevant information by inspecting the data, for example timestamps on other timelines.
See Self::timeline_sliced and Self::component_sliced if you do want to filter out this extra data.
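A sketch of applying a latest-at filter, assuming the LatestAtQuery has been constructed elsewhere; the result keeps every column of the original chunk but contains at most one row.

// Run the query and convert the "empty chunk" case into None.
fn latest_row_for(chunk: &Chunk, query: &LatestAtQuery, component_name: ComponentName) -> Option<Chunk> {
    let result = chunk.latest_at(query, component_name);
    if result.is_empty() {
        None // 0 rows (but still N columns): the query matched nothing
    } else {
        Some(result)
    }
}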
impl Chunk
pub fn concatenated(&self, rhs: &Chunk) -> Result<Chunk, ChunkError>
Concatenates two Chunks into a new one.
The order of the arguments matters: self’s contents will precede rhs’s contents in the returned Chunk.
This will return an error if the chunks are not concatenable.
pub fn overlaps_on_row_id(&self, rhs: &Chunk) -> bool
Returns true if self and rhs overlap on their RowId range.
pub fn overlaps_on_time(&self, rhs: &Chunk) -> bool
Returns true if self and rhs overlap on any of their time range(s).
This does not imply that they share the same exact set of timelines.
pub fn same_entity_paths(&self, rhs: &Chunk) -> bool
Returns true if both chunks share the same entity path.
pub fn same_timelines(&self, rhs: &Chunk) -> bool
Returns true if both chunks contain the same set of timelines.
pub fn same_datatypes(&self, rhs: &Chunk) -> bool
Returns true if both chunks share the same datatypes for the components that they have in common.
pub fn same_descriptors(&self, rhs: &Chunk) -> bool
Returns true if both chunks share the same descriptors for the components that they have in common.
pub fn concatenable(&self, rhs: &Chunk) -> bool
Returns true if two chunks are concatenable (see the sketch following the list below).
To be concatenable, two chunks must:
- Share the same entity path.
- Share the same exact set of timelines.
- Use the same datatypes for the components they have in common.
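A minimal sketch of guarded concatenation, checking compatibility before attempting the merge:

// Only attempt concatenation when the chunks are known to be compatible
// (same entity path, same timelines, same datatypes for shared components).
fn try_concat(lhs: &Chunk, rhs: &Chunk) -> Option<Chunk> {
    if lhs.concatenable(rhs) {
        // `lhs` rows precede `rhs` rows in the result.
        lhs.concatenated(rhs).ok()
    } else {
        None
    }
}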
pub fn split_indicators(&mut self) -> Option<Chunk>
Moves all indicator components from self into a new, dedicated chunk.
The new chunk contains only the first index from each index column, and all the indicators, packed in a single row.
Beware: self might be left with no component columns at all after this operation.
This greatly reduces the overhead of indicators, both in the row-oriented and column-oriented APIs. See https://github.com/rerun-io/rerun/issues/8768 for further rationale.
impl Chunk
pub fn patched_for_blueprint_021_compat(&self) -> Chunk
A temporary migration kernel for blueprint data.
Deals with all the space-view terminology breaking changes (SpaceView -> View, space_view -> view, etc.).
impl Chunk
pub fn range(&self, query: &RangeQuery, component_name: ComponentName) -> Chunk
Runs a RangeQuery filter on a Chunk.
This behaves as a row-based filter: the result is a new Chunk that is vertically sliced, sorted, and filtered in order to only contain the row(s) relevant for the specified query.
The resulting Chunk is guaranteed to contain all the same columns as the queried chunk: there is no horizontal slicing going on.
An empty Chunk (i.e. 0 rows, but N columns) is returned if the query yields nothing.
Because the resulting chunk doesn’t discard any column information, you can find extra relevant information by inspecting the data, for example timestamps on other timelines.
See Self::timeline_sliced and Self::component_sliced if you do want to filter out this extra data.
impl Chunk
pub fn is_sorted(&self) -> bool
Is the chunk currently ascendingly sorted by crate::RowId?
This is O(1) (cached).
See also Self::is_sorted_uncached.
pub fn is_time_sorted(&self) -> bool
Is the chunk ascendingly sorted by time, for all of its timelines?
This is O(1) (cached).
pub fn is_timeline_sorted(&self, timeline: &Timeline) -> bool
Is the chunk ascendingly sorted by time, for a specific timeline?
This is O(1) (cached).
See also Self::is_timeline_sorted_uncached.
pub fn sort_if_unsorted(&mut self)
Sort the chunk, if needed.
The underlying arrow data will be copied and shuffled in memory in order to make it contiguous.
pub fn sorted_by_timeline_if_unsorted(&self, timeline: &Timeline) -> Chunk
Returns a new Chunk that is sorted by (<timeline>, RowId).
The underlying arrow data will be copied and shuffled in memory in order to make it contiguous.
This is a no-op if the underlying timeline is already sorted appropriately (happy path).
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
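A sketch of the two sorting paths: in-place RowId sorting versus producing a time-sorted copy for a specific timeline. Note that the copy keeps the old ChunkId, as warned above.

// Make sure the chunk is RowId-sorted; cheap if it already is (the flag is cached).
fn ensure_row_id_sorted(chunk: &mut Chunk) {
    if !chunk.is_sorted() {
        chunk.sort_if_unsorted();
    }
}

// Get a copy sorted by (<timeline>, RowId); effectively a no-op if already sorted on it.
fn time_sorted_copy(chunk: &Chunk, timeline: &Timeline) -> Chunk {
    chunk.sorted_by_timeline_if_unsorted(timeline)
}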
pub fn shuffle_random(&mut self, seed: u64)
Randomly shuffles the chunk using the given seed.
The underlying arrow data will be copied and shuffled in memory in order to make it contiguous.
impl Chunk
pub fn cell(
    &self,
    row_id: RowId,
    component_desc: &ComponentDescriptor,
) -> Option<Arc<dyn Array>>
Returns the cell corresponding to the specified RowId for a given ComponentDescriptor.
This is O(log(n)) if self.is_sorted(), and O(n) otherwise.
Reminder: duplicated RowIds result in undefined behavior.
pub fn row_sliced(&self, index: usize, len: usize) -> Chunk
Slices the Chunk vertically.
The result is a new Chunk with the same columns and (potentially) fewer rows.
This cannot fail nor panic: index and len will be capped so that they cannot run out of bounds.
This can result in an empty Chunk being returned if the slice is completely OOB.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
pub fn timeline_sliced(&self, timeline: Timeline) -> Chunk
Slices the Chunk horizontally by keeping only the selected timeline.
The result is a new Chunk with the same rows and (at most) one timeline column.
All non-timeline columns will be kept as-is.
If timeline is not found within the Chunk, the end result will be the same as the current chunk but without any timeline column.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
pub fn component_sliced(&self, component_name: ComponentName) -> Chunk
Slices the Chunk horizontally by keeping only the selected component_name.
The result is a new Chunk with the same rows and (at most) one component column.
All non-component columns will be kept as-is.
If component_name is not found within the Chunk, the end result will be the same as the current chunk but without any component column.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
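A sketch of chaining vertical and horizontal slicing; the row window of 100 is an arbitrary illustrative value, and every step returns a new Chunk that still carries the original ChunkId.

// Narrow a chunk down to a row window, a single timeline column, and a single component column.
fn narrow(chunk: &Chunk, timeline: Timeline, component_name: ComponentName) -> Chunk {
    chunk
        .row_sliced(0, 100)               // at most the first 100 rows (capped, never panics)
        .timeline_sliced(timeline)        // keep only this timeline column
        .component_sliced(component_name) // keep only this component column
}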
pub fn timelines_sliced(
    &self,
    timelines_to_keep: &HashSet<Timeline, BuildHasherDefault<NoHashHasher<Timeline>>>,
) -> Chunk
Slices the Chunk horizontally by keeping only the selected timelines.
The result is a new Chunk with the same rows and (at most) the selected timeline columns.
All non-timeline columns will be kept as-is.
If none of the selected timelines exist in the Chunk, the end result will be the same as the current chunk but without any timeline column.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
pub fn components_sliced(
    &self,
    component_names: &HashSet<ComponentName, BuildHasherDefault<NoHashHasher<ComponentName>>>,
) -> Chunk
Slices the Chunk horizontally by keeping only the selected component_names.
The result is a new Chunk with the same rows and (at most) the selected component columns.
All non-component columns will be kept as-is.
If none of the component_names exist in the Chunk, the end result will be the same as the current chunk but without any component column.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
pub fn densified(&self, component_name_pov: ComponentName) -> Chunk
Densifies the Chunk vertically based on the component_name_pov column.
Densifying here means dropping all rows where the associated value in the component_name_pov column is null.
The result is a new Chunk where the component_name_pov column is guaranteed to be dense.
If component_name_pov doesn’t exist in this Chunk, or if it is already dense, this method is a no-op.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
pub fn emptied(&self) -> Chunk
Empties the Chunk vertically.
The result is a new Chunk with the same columns but zero rows.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
pub fn components_removed(self) -> Chunk
Removes all component columns from the Chunk.
The result is a new Chunk with the same number of rows and the same index columns, but no components.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
pub fn deduped_latest_on_index(&self, index: &Timeline) -> Chunk
Removes duplicate rows from sections of consecutive identical indices.
- If the Chunk is sorted on that index, the remaining values in the index column will be unique.
- If the Chunk has been densified on a specific column, the resulting chunk will effectively contain the latest value of that column for each given index value.
If this is a temporal chunk and the given index isn’t present in it, this method is a no-op.
This does not obey RowId-ordering semantics (or any other kind of semantics, for that matter) – it merely respects how the chunk is currently laid out: no more, no less.
Sort the chunk according to the semantics you’re looking for before calling this method (see the sketch below).
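A sketch of the usual pattern: sort on the index first, so that the per-index deduplication actually keeps the latest value for each index value.

// For each unique time value on `timeline`, keep only the last row.
fn latest_per_time(chunk: &Chunk, timeline: &Timeline) -> Chunk {
    chunk
        .sorted_by_timeline_if_unsorted(timeline)
        // Consecutive rows with identical index values collapse down to the last one.
        .deduped_latest_on_index(timeline)
}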
pub fn filtered(&self, filter: &BooleanArray) -> Option<Chunk>
Applies a filter kernel to the Chunk as a whole.
Returns None if the length of the filter does not match the number of rows in the chunk.
In release builds, filters are allowed to have null entries (they will be interpreted as false).
In debug builds, null entries will panic.
Note: a filter kernel copies the data in order to make the resulting arrays contiguous in memory.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
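A sketch of building a boolean mask and filtering with it. The arrow::array::BooleanArray import is an assumption about which arrow crate is in use; one boolean per row is required, otherwise None is returned.

use arrow::array::BooleanArray; // assumed to be the arrow-rs BooleanArray this crate uses

// Keep only the rows marked `true`. The mask must have exactly `num_rows` entries.
fn keep_rows(chunk: &Chunk, keep: Vec<bool>) -> Option<Chunk> {
    let mask = BooleanArray::from(keep);
    chunk.filtered(&mask)
}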
pub fn taken(&self, indices: &PrimitiveArray<Int32Type>) -> Chunk
Applies a take kernel to the Chunk as a whole.
In release builds, indices are allowed to have null entries (they will be taken as nulls).
In debug builds, null entries will panic.
Note: a take kernel copies the data in order to make the resulting arrays contiguous in memory.
Takes care of up- and down-casting the data back and forth on behalf of the caller.
WARNING: the returned chunk has the same old crate::ChunkId! Change it with Self::with_id.
impl Chunk
pub fn to_record_batch(&self) -> Result<RecordBatch, ChunkError>
Prepare the Chunk for transport.
It is probably a good idea to sort the chunk first.
pub fn to_transport(&self) -> Result<TransportChunk, ChunkError>
Prepare the Chunk for transport.
It is probably a good idea to sort the chunk first.
pub fn from_record_batch(batch: RecordBatch) -> Result<Chunk, ChunkError>
pub fn from_transport(transport: &TransportChunk) -> Result<Chunk, ChunkError>
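A sketch of a transport round-trip via an Arrow RecordBatch. Sorting first follows the recommendation above, and are_similar (rather than bitwise equality) is the appropriate comparison after a round-trip.

// Serialize to a RecordBatch and back, then check the result still looks like the original.
fn roundtrip(mut chunk: Chunk) -> Result<(), ChunkError> {
    chunk.sort_if_unsorted();
    let batch = chunk.to_record_batch()?;
    let restored = Chunk::from_record_batch(batch)?;
    debug_assert!(Chunk::are_similar(&chunk, &restored));
    Ok(())
}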
impl Chunk
pub fn from_arrow_msg(msg: &ArrowMsg) -> Result<Chunk, ChunkError>
pub fn to_arrow_msg(&self) -> Result<ArrowMsg, ChunkError>
Trait Implementations
impl PartialEq for Chunk
impl SizeBytes for Chunk
fn heap_size_bytes(&self) -> u64
Returns the amount of memory self uses on the heap, in bytes.
fn total_size_bytes(&self) -> u64
Returns the total size of self in bytes, accounting for both stack and heap space.
fn stack_size_bytes(&self) -> u64
Returns the size of self on the stack, in bytes.
Auto Trait Implementations
impl !Freeze for Chunk
impl !RefUnwindSafe for Chunk
impl Send for Chunk
impl Sync for Chunk
impl Unpin for Chunk
impl !UnwindSafe for Chunk
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
impl<T> CheckedAs for T
impl<Src, Dst> CheckedCastFrom<Src> for Dst where Src: CheckedCast<Dst>
impl<T> CloneToUninit for T where T: Clone
impl<T> Conv for T
impl<T> Downcast for T where T: Any
impl<T> DowncastSync for T
impl<T> Instrument for T
impl<T> IntoEither for T
impl<T> IntoRequest<T> for T
impl<Src, Dst> LosslessTryInto<Dst> for Src where Dst: LosslessTryFrom<Src>
impl<Src, Dst> LossyInto<Dst> for Src where Dst: LossyFrom<Src>
impl<T> OverflowingAs for T
impl<Src, Dst> OverflowingCastFrom<Src> for Dst where Src: OverflowingCast<Dst>
impl<T> Pipe for T where T: ?Sized
impl<T> Pointable for T
impl<T> SaturatingAs for T
impl<Src, Dst> SaturatingCastFrom<Src> for Dst where Src: SaturatingCast<Dst>
impl<T> Tap for T