pub struct RunArray<R>where
R: RunEndIndexType,{
data_type: DataType,
run_ends: RunEndBuffer<<R as ArrowPrimitiveType>::Native>,
values: Arc<dyn Array>,
}
Expand description
An array of run-end encoded values
This encoding is variation on run-length encoding (RLE) and is good for representing data containing same values repeated consecutively.
RunArray
contains run_ends
array and values
array of same length.
The run_ends
array stores the indexes at which the run ends. The values
array
stores the value of each run. Below example illustrates how a logical array is represented in
RunArray
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┐
┌─────────────────┐ ┌─────────┐ ┌─────────────────┐
│ │ A │ │ 2 │ │ │ A │
├─────────────────┤ ├─────────┤ ├─────────────────┤
│ │ D │ │ 3 │ │ │ A │ run length of 'A' = runs_ends[0] - 0 = 2
├─────────────────┤ ├─────────┤ ├─────────────────┤
│ │ B │ │ 6 │ │ │ D │ run length of 'D' = run_ends[1] - run_ends[0] = 1
└─────────────────┘ └─────────┘ ├─────────────────┤
│ values run_ends │ │ B │
├─────────────────┤
└ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┘ │ B │
├─────────────────┤
RunArray │ B │ run length of 'B' = run_ends[2] - run_ends[1] = 3
length = 3 └─────────────────┘
Logical array
Contents
Fields§
§data_type: DataType
§run_ends: RunEndBuffer<<R as ArrowPrimitiveType>::Native>
§values: Arc<dyn Array>
Implementations§
§impl<R> RunArray<R>where
R: RunEndIndexType,
impl<R> RunArray<R>where
R: RunEndIndexType,
pub fn logical_len(run_ends: &PrimitiveArray<R>) -> usize
pub fn logical_len(run_ends: &PrimitiveArray<R>) -> usize
Calculates the logical length of the array encoded by the given run_ends array.
pub fn try_new(
run_ends: &PrimitiveArray<R>,
values: &dyn Array,
) -> Result<RunArray<R>, ArrowError>
pub fn try_new( run_ends: &PrimitiveArray<R>, values: &dyn Array, ) -> Result<RunArray<R>, ArrowError>
Attempts to create RunArray using given run_ends (index where a run ends) and the values (value of the run). Returns an error if the given data is not compatible with RunEndEncoded specification.
pub fn run_ends(&self) -> &RunEndBuffer<<R as ArrowPrimitiveType>::Native>
pub fn run_ends(&self) -> &RunEndBuffer<<R as ArrowPrimitiveType>::Native>
Returns a reference to RunEndBuffer
pub fn values(&self) -> &Arc<dyn Array>
pub fn values(&self) -> &Arc<dyn Array>
Returns a reference to values array
Note: any slicing of this RunArray
array is not applied to the returned array
and must be handled separately
pub fn get_start_physical_index(&self) -> usize
pub fn get_start_physical_index(&self) -> usize
Returns the physical index at which the array slice starts.
pub fn get_end_physical_index(&self) -> usize
pub fn get_end_physical_index(&self) -> usize
Returns the physical index at which the array slice ends.
pub fn downcast<V>(&self) -> Option<TypedRunArray<'_, R, V>>where
V: 'static,
pub fn downcast<V>(&self) -> Option<TypedRunArray<'_, R, V>>where
V: 'static,
Downcast this RunArray
to a TypedRunArray
use arrow_array::{Array, ArrayAccessor, RunArray, StringArray, types::Int32Type};
let orig = [Some("a"), Some("b"), None];
let run_array = RunArray::<Int32Type>::from_iter(orig);
let typed = run_array.downcast::<StringArray>().unwrap();
assert_eq!(typed.value(0), "a");
assert_eq!(typed.value(1), "b");
assert!(typed.values().is_null(2));
pub fn get_physical_index(&self, logical_index: usize) -> usize
pub fn get_physical_index(&self, logical_index: usize) -> usize
Returns index to the physical array for the given index to the logical array.
This function adjusts the input logical index based on ArrayData::offset
Performs a binary search on the run_ends array for the input index.
The result is arbitrary if logical_index >= self.len()
pub fn get_physical_indices<I>(
&self,
logical_indices: &[I],
) -> Result<Vec<usize>, ArrowError>where
I: ArrowNativeType,
pub fn get_physical_indices<I>(
&self,
logical_indices: &[I],
) -> Result<Vec<usize>, ArrowError>where
I: ArrowNativeType,
Returns the physical indices of the input logical indices. Returns error if any of the logical
index cannot be converted to physical index. The logical indices are sorted and iterated along
with run_ends array to find matching physical index. The approach used here was chosen over
finding physical index for each logical index using binary search using the function
get_physical_index
. Running benchmarks on both approaches showed that the approach used here
scaled well for larger inputs.
See https://github.com/apache/arrow-rs/pull/3622#issuecomment-1407753727 for more details.
Trait Implementations§
§impl<T> Array for RunArray<T>where
T: RunEndIndexType,
impl<T> Array for RunArray<T>where
T: RunEndIndexType,
§fn slice(&self, offset: usize, length: usize) -> Arc<dyn Array>
fn slice(&self, offset: usize, length: usize) -> Arc<dyn Array>
§fn shrink_to_fit(&mut self)
fn shrink_to_fit(&mut self)
§fn offset(&self) -> usize
fn offset(&self) -> usize
0
. Read more§fn nulls(&self) -> Option<&NullBuffer>
fn nulls(&self) -> Option<&NullBuffer>
§fn logical_nulls(&self) -> Option<NullBuffer>
fn logical_nulls(&self) -> Option<NullBuffer>
NullBuffer
that represents the logical
null values of this array, if any. Read more§fn is_nullable(&self) -> bool
fn is_nullable(&self) -> bool
false
if the array is guaranteed to not contain any logical nulls Read more§fn get_buffer_memory_size(&self) -> usize
fn get_buffer_memory_size(&self) -> usize
§fn get_array_memory_size(&self) -> usize
fn get_array_memory_size(&self) -> usize
get_buffer_memory_size()
and
includes the overhead of the data structures that contain the pointers to the various buffers.§fn null_count(&self) -> usize
fn null_count(&self) -> usize
§fn logical_null_count(&self) -> usize
fn logical_null_count(&self) -> usize
§impl<R> Clone for RunArray<R>where
R: RunEndIndexType,
impl<R> Clone for RunArray<R>where
R: RunEndIndexType,
§impl<R> Debug for RunArray<R>where
R: RunEndIndexType,
impl<R> Debug for RunArray<R>where
R: RunEndIndexType,
§impl<R> From<ArrayData> for RunArray<R>where
R: RunEndIndexType,
impl<R> From<ArrayData> for RunArray<R>where
R: RunEndIndexType,
§impl<R> From<RunArray<R>> for ArrayDatawhere
R: RunEndIndexType,
impl<R> From<RunArray<R>> for ArrayDatawhere
R: RunEndIndexType,
§impl<'a, T> FromIterator<&'a str> for RunArray<T>where
T: RunEndIndexType,
impl<'a, T> FromIterator<&'a str> for RunArray<T>where
T: RunEndIndexType,
Constructs a RunArray
from an iterator of strings.
§Example:
use arrow_array::{RunArray, PrimitiveArray, StringArray, types::Int16Type};
let test = vec!["a", "a", "b", "c"];
let array: RunArray<Int16Type> = test.into_iter().collect();
assert_eq!(
"RunArray {run_ends: [2, 3, 4], values: StringArray\n[\n \"a\",\n \"b\",\n \"c\",\n]}\n",
format!("{:?}", array)
);
§impl<'a, T> FromIterator<Option<&'a str>> for RunArray<T>where
T: RunEndIndexType,
impl<'a, T> FromIterator<Option<&'a str>> for RunArray<T>where
T: RunEndIndexType,
Constructs a RunArray
from an iterator of optional strings.
§Example:
use arrow_array::{RunArray, PrimitiveArray, StringArray, types::Int16Type};
let test = vec!["a", "a", "b", "c", "c"];
let array: RunArray<Int16Type> = test
.iter()
.map(|&x| if x == "b" { None } else { Some(x) })
.collect();
assert_eq!(
"RunArray {run_ends: [2, 3, 5], values: StringArray\n[\n \"a\",\n null,\n \"c\",\n]}\n",
format!("{:?}", array)
);
Auto Trait Implementations§
impl<R> Freeze for RunArray<R>
impl<R> !RefUnwindSafe for RunArray<R>
impl<R> Send for RunArray<R>
impl<R> Sync for RunArray<R>
impl<R> Unpin for RunArray<R>
impl<R> !UnwindSafe for RunArray<R>
Blanket Implementations§
source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
source§impl<T> CheckedAs for T
impl<T> CheckedAs for T
source§fn checked_as<Dst>(self) -> Option<Dst>where
T: CheckedCast<Dst>,
fn checked_as<Dst>(self) -> Option<Dst>where
T: CheckedCast<Dst>,
source§impl<Src, Dst> CheckedCastFrom<Src> for Dstwhere
Src: CheckedCast<Dst>,
impl<Src, Dst> CheckedCastFrom<Src> for Dstwhere
Src: CheckedCast<Dst>,
source§fn checked_cast_from(src: Src) -> Option<Dst>
fn checked_cast_from(src: Src) -> Option<Dst>
source§impl<T> CloneToUninit for Twhere
T: Clone,
impl<T> CloneToUninit for Twhere
T: Clone,
source§default unsafe fn clone_to_uninit(&self, dst: *mut T)
default unsafe fn clone_to_uninit(&self, dst: *mut T)
clone_to_uninit
)§impl<T> Conv for T
impl<T> Conv for T
§impl<T> Downcast for Twhere
T: Any,
impl<T> Downcast for Twhere
T: Any,
§fn into_any(self: Box<T>) -> Box<dyn Any>
fn into_any(self: Box<T>) -> Box<dyn Any>
Box<dyn Trait>
(where Trait: Downcast
) to Box<dyn Any>
. Box<dyn Any>
can
then be further downcast
into Box<ConcreteType>
where ConcreteType
implements Trait
.§fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>
fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>
Rc<Trait>
(where Trait: Downcast
) to Rc<Any>
. Rc<Any>
can then be
further downcast
into Rc<ConcreteType>
where ConcreteType
implements Trait
.§fn as_any(&self) -> &(dyn Any + 'static)
fn as_any(&self) -> &(dyn Any + 'static)
&Trait
(where Trait: Downcast
) to &Any
. This is needed since Rust cannot
generate &Any
’s vtable from &Trait
’s.§fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
&mut Trait
(where Trait: Downcast
) to &Any
. This is needed since Rust cannot
generate &mut Any
’s vtable from &mut Trait
’s.§impl<T> DowncastSync for T
impl<T> DowncastSync for T
§impl<T> Instrument for T
impl<T> Instrument for T
§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
source§impl<T> IntoEither for T
impl<T> IntoEither for T
source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self
into a Left
variant of Either<Self, Self>
if into_left
is true
.
Converts self
into a Right
variant of Either<Self, Self>
otherwise. Read moresource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self
into a Left
variant of Either<Self, Self>
if into_left(&self)
returns true
.
Converts self
into a Right
variant of Either<Self, Self>
otherwise. Read moresource§impl<T> IntoRequest<T> for T
impl<T> IntoRequest<T> for T
source§fn into_request(self) -> Request<T>
fn into_request(self) -> Request<T>
T
in a tonic::Request
source§impl<Src, Dst> LosslessTryInto<Dst> for Srcwhere
Dst: LosslessTryFrom<Src>,
impl<Src, Dst> LosslessTryInto<Dst> for Srcwhere
Dst: LosslessTryFrom<Src>,
source§fn lossless_try_into(self) -> Option<Dst>
fn lossless_try_into(self) -> Option<Dst>
source§impl<Src, Dst> LossyInto<Dst> for Srcwhere
Dst: LossyFrom<Src>,
impl<Src, Dst> LossyInto<Dst> for Srcwhere
Dst: LossyFrom<Src>,
source§fn lossy_into(self) -> Dst
fn lossy_into(self) -> Dst
source§impl<T> OverflowingAs for T
impl<T> OverflowingAs for T
source§fn overflowing_as<Dst>(self) -> (Dst, bool)where
T: OverflowingCast<Dst>,
fn overflowing_as<Dst>(self) -> (Dst, bool)where
T: OverflowingCast<Dst>,
source§impl<Src, Dst> OverflowingCastFrom<Src> for Dstwhere
Src: OverflowingCast<Dst>,
impl<Src, Dst> OverflowingCastFrom<Src> for Dstwhere
Src: OverflowingCast<Dst>,
source§fn overflowing_cast_from(src: Src) -> (Dst, bool)
fn overflowing_cast_from(src: Src) -> (Dst, bool)
§impl<T> Pipe for Twhere
T: ?Sized,
impl<T> Pipe for Twhere
T: ?Sized,
§fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> Rwhere
Self: Sized,
fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> Rwhere
Self: Sized,
§fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> Rwhere
R: 'a,
fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> Rwhere
R: 'a,
self
and passes that borrow into the pipe function. Read more§fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> Rwhere
R: 'a,
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> Rwhere
R: 'a,
self
and passes that borrow into the pipe function. Read more§fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
§fn pipe_borrow_mut<'a, B, R>(
&'a mut self,
func: impl FnOnce(&'a mut B) -> R,
) -> R
fn pipe_borrow_mut<'a, B, R>( &'a mut self, func: impl FnOnce(&'a mut B) -> R, ) -> R
§fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
self
, then passes self.as_ref()
into the pipe function.§fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
self
, then passes self.as_mut()
into the pipe
function.§fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
self
, then passes self.deref()
into the pipe function.§impl<T> Pointable for T
impl<T> Pointable for T
source§impl<T> SaturatingAs for T
impl<T> SaturatingAs for T
source§fn saturating_as<Dst>(self) -> Dstwhere
T: SaturatingCast<Dst>,
fn saturating_as<Dst>(self) -> Dstwhere
T: SaturatingCast<Dst>,
source§impl<Src, Dst> SaturatingCastFrom<Src> for Dstwhere
Src: SaturatingCast<Dst>,
impl<Src, Dst> SaturatingCastFrom<Src> for Dstwhere
Src: SaturatingCast<Dst>,
source§fn saturating_cast_from(src: Src) -> Dst
fn saturating_cast_from(src: Src) -> Dst
§impl<T> Tap for T
impl<T> Tap for T
§fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
Borrow<B>
of a value. Read more§fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
BorrowMut<B>
of a value. Read more§fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
AsRef<R>
view of a value. Read more§fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
AsMut<R>
view of a value. Read more§fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
Deref::Target
of a value. Read more§fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
Deref::Target
of a value. Read more§fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
.tap()
only in debug builds, and is erased in release builds.§fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
.tap_mut()
only in debug builds, and is erased in release
builds.§fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
.tap_borrow()
only in debug builds, and is erased in release
builds.§fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
.tap_borrow_mut()
only in debug builds, and is erased in release
builds.§fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
.tap_ref()
only in debug builds, and is erased in release
builds.§fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
.tap_ref_mut()
only in debug builds, and is erased in release
builds.§fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
.tap_deref()
only in debug builds, and is erased in release
builds.