Crate ipc
Support for the Arrow IPC Format
The Arrow IPC format defines how to read and write RecordBatches to/from
a file or stream of bytes. This format can be used to serialize and deserialize
data to files and over the network.
There are two variants of the IPC format:
- IPC Streaming Format: Supports streaming data sources, implemented by StreamReader and StreamWriter.
- IPC File Format: Supports random access, implemented by FileReader and FileWriter.
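The practical difference between the two variants is the access pattern. The sketch below is a toy model using only the standard library, not the Arrow IPC wire format and not this crate's API: each "batch" is an opaque payload framed as a 4-byte little-endian length followed by its bytes, and a plain list of frame offsets stands in for the footer that makes random access possible in the file variant. All function names here are hypothetical.

```rust
/// Toy model only: NOT the Arrow IPC wire format.
/// Append one length-prefixed frame and return the offset at which it starts.
fn write_frame(out: &mut Vec<u8>, payload: &[u8]) -> usize {
    let start = out.len();
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
    start
}

/// Read the frame that starts at `offset`.
/// Returns the payload and the offset of the next frame, or None if truncated.
fn read_frame(buf: &[u8], offset: usize) -> Option<(&[u8], usize)> {
    let header = buf.get(offset..offset.checked_add(4)?)?;
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let start = offset + 4;
    let end = start.checked_add(len)?;
    Some((buf.get(start..end)?, end))
}

/// Streaming-style access: frames can only be visited front to back.
fn read_all_sequential(buf: &[u8]) -> Vec<&[u8]> {
    let mut frames = Vec::new();
    let mut offset = 0;
    while offset < buf.len() {
        match read_frame(buf, offset) {
            Some((payload, next)) => {
                frames.push(payload);
                offset = next;
            }
            None => break, // truncated input: stop at the last complete frame
        }
    }
    frames
}

/// File-style access: an index of frame offsets (standing in for a footer)
/// allows jumping straight to frame `i` without scanning earlier frames.
fn read_nth_random<'a>(buf: &'a [u8], index: &[usize], i: usize) -> Option<&'a [u8]> {
    let offset = *index.get(i)?;
    read_frame(buf, offset).map(|(payload, _)| payload)
}
```

A streaming consumer needs only `read_all_sequential`, whereas a reader that wants the third batch without touching the first two needs the offset index, which is why the file variant carries extra metadata.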
Modules§
- Utilities for converting between IPC types and native Arrow types
- Generated code
- Arrow IPC File and Stream Readers
- Arrow IPC File and Stream Writers
Structs§
- Opaque binary data
- Logically the same as Binary, but the internal representation uses a view struct that contains the string length and either the string’s entire data inline (for small strings) or an inlined prefix, an index of another buffer, and an offset pointing to a slice in that buffer (for non-small strings).
- Optional compression for the memory buffers constituting IPC message bodies. Intended for use with RecordBatch but could be used for other message types
- Provided for forward compatibility in case we need to support different strategies for compressing the IPC message body (like whole-body compression rather than buffer-level) in the future
- Date is either a 32-bit or 64-bit signed integer type representing an elapsed time since UNIX epoch (1970-01-01), stored in either of two units:
- Exact decimal value represented as an integer value in two’s complement. Currently only 128-bit (16-byte) and 256-bit (32-byte) integers are used. The representation uses the endianness indicated in the Schema.
- For sending dictionary encoding information. Any Field can be dictionary-encoded, but in this case none of its children may be dictionary-encoded. There is one vector / column per dictionary, but that vector / column may be spread across multiple dictionary batches by using the isDelta flag
- Represents Arrow Features that might not have full support within implementations. This is intended to be used in two scenarios:
- Same as Binary, but with 64-bit offsets, allowing extremely large data values to be represented.
- Same as List, but with 64-bit offsets, allowing extremely large data values to be represented.
- Same as ListView, but with 64-bit offsets and sizes, allowing extremely large data values to be represented.
- Same as Utf8, but with 64-bit offsets, allowing extremely large data values to be represented.
- Represents the same logical types that List can, but contains offsets and sizes allowing for writes in any order and sharing of child values among list values.
- A Map is a logical nested type that is represented as
- These are stored in the flatbuffer in the Type union below
- A data header describing the shared memory layout of a “record” or “row” batch. Some systems call this a “row batch” internally and others a “record batch”.
- Contains two child arrays, run_ends and values. The run_ends child array must be a 16/32/64-bit integer array which encodes the indices at which the run with the value in each corresponding index in the values child array ends. Like list/struct types, the value array can be of any type.
- Compressed Sparse format, that is matrix-specific.
- Compressed Sparse Fiber (CSF) sparse tensor index.
- A Struct_ in the flatbuffer metadata is the same as an Arrow Struct (according to the physical memory layout). We used Struct_ here as Struct is a reserved word in Flatbuffers
- Time is either a 32-bit or 64-bit signed integer type representing an elapsed time since midnight, stored in either of four units: seconds, milliseconds, microseconds or nanoseconds.
- Timestamp is a 64-bit signed integer representing an elapsed time since a fixed epoch, stored in either of four units: seconds, milliseconds, microseconds or nanoseconds, and is optionally annotated with a timezone.
- A union is a complex type with children in Field. By default, ids in the type vector refer to the offsets in the children. Optionally, typeIds provides an indirection between the child offset and the type id for each child: typeIds[offset] is the id used in the type vector.
- Unicode with UTF-8 encoding
- Logically the same as Utf8, but the internal representation uses a view struct that contains the string length and either the string’s entire data inline (for small strings) or an inlined prefix, an index of another buffer, and an offset pointing to a slice in that buffer (for non-small strings).
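The view representation used by the two "view" entries in the list above can be sketched in plain Rust. This is an illustrative model, not this crate's types: the `View` enum and both functions are hypothetical names, and the 12-byte inline limit and 4-byte prefix are taken from the Arrow columnar format's 16-byte view layout rather than from this page.

```rust
/// Hypothetical decoded form of one view; names are illustrative only.
#[derive(Debug, PartialEq)]
enum View {
    /// Short value: the bytes live inside the view itself (zero padded).
    Inline { len: usize, data: [u8; 12] },
    /// Long value: a short prefix plus a pointer into a separate data buffer.
    Reference { len: usize, prefix: [u8; 4], buffer_index: usize, offset: usize },
}

/// Assumed inline limit, following the 16-byte view layout of the Arrow format.
const INLINE_LIMIT: usize = 12;

/// Build a view for `value`. For long values the caller states where the
/// bytes are stored: which buffer, and at which offset inside it.
fn make_view(value: &[u8], buffer_index: usize, offset: usize) -> View {
    if value.len() <= INLINE_LIMIT {
        let mut data = [0u8; 12];
        data[..value.len()].copy_from_slice(value);
        View::Inline { len: value.len(), data }
    } else {
        let mut prefix = [0u8; 4];
        prefix.copy_from_slice(&value[..4]);
        View::Reference { len: value.len(), prefix, buffer_index, offset }
    }
}

/// Resolve a view back to its bytes, with bounds checks on the buffer lookup.
fn resolve<'a>(view: &'a View, buffers: &'a [Vec<u8>]) -> Option<&'a [u8]> {
    match view {
        View::Inline { len, data } => data.get(..*len),
        View::Reference { len, buffer_index, offset, .. } => {
            let end = offset.checked_add(*len)?;
            buffers.get(*buffer_index)?.get(*offset..end)
        }
    }
}
```

Keeping a prefix inside the view means many comparisons can be decided without following the reference into the data buffer, which is a large part of the layout's appeal.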
Enums§
Constants§
- ENUM_MAX_BODY_COMPRESSION_METHOD Deprecated
- ENUM_MAX_COMPRESSION_TYPE Deprecated
- ENUM_MAX_DATE_UNIT Deprecated
- ENUM_MAX_DICTIONARY_KIND Deprecated
- ENUM_MAX_ENDIANNESS Deprecated
- ENUM_MAX_FEATURE Deprecated
- ENUM_MAX_INTERVAL_UNIT Deprecated
- ENUM_MAX_MESSAGE_HEADER Deprecated
- ENUM_MAX_METADATA_VERSION Deprecated
- ENUM_MAX_PRECISION Deprecated
- ENUM_MAX_SPARSE_MATRIX_COMPRESSED_AXIS Deprecated
- ENUM_MAX_SPARSE_TENSOR_INDEX Deprecated
- ENUM_MAX_TIME_UNIT Deprecated
- ENUM_MAX_TYPE Deprecated
- ENUM_MAX_UNION_MODE Deprecated
- ENUM_MIN_BODY_COMPRESSION_METHOD Deprecated
- ENUM_MIN_COMPRESSION_TYPE Deprecated
- ENUM_MIN_DATE_UNIT Deprecated
- ENUM_MIN_DICTIONARY_KIND Deprecated
- ENUM_MIN_ENDIANNESS Deprecated
- ENUM_MIN_FEATURE Deprecated
- ENUM_MIN_INTERVAL_UNIT Deprecated
- ENUM_MIN_MESSAGE_HEADER Deprecated
- ENUM_MIN_METADATA_VERSION Deprecated
- ENUM_MIN_PRECISION Deprecated
- ENUM_MIN_SPARSE_MATRIX_COMPRESSED_AXIS Deprecated
- ENUM_MIN_SPARSE_TENSOR_INDEX Deprecated
- ENUM_MIN_TIME_UNIT Deprecated
- ENUM_MIN_TYPE Deprecated
- ENUM_MIN_UNION_MODE Deprecated
- ENUM_VALUES_BODY_COMPRESSION_METHOD Deprecated
- ENUM_VALUES_COMPRESSION_TYPE Deprecated
- ENUM_VALUES_DATE_UNIT Deprecated
- ENUM_VALUES_DICTIONARY_KIND Deprecated
- ENUM_VALUES_ENDIANNESS Deprecated
- ENUM_VALUES_FEATURE Deprecated
- ENUM_VALUES_INTERVAL_UNIT Deprecated
- ENUM_VALUES_MESSAGE_HEADER Deprecated
- ENUM_VALUES_METADATA_VERSION Deprecated
- ENUM_VALUES_PRECISION Deprecated
- ENUM_VALUES_SPARSE_TENSOR_INDEX Deprecated
- ENUM_VALUES_TIME_UNIT Deprecated
- ENUM_VALUES_TYPE Deprecated
- ENUM_VALUES_UNION_MODE Deprecated
Functions§
- Verifies that a buffer of bytes contains a Footer and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_footer_unchecked.
- Assumes, without verification, that a buffer of bytes contains a Footer and returns it.
- Verifies, with the given options, that a buffer of bytes contains a Footer and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_footer_unchecked.
- Verifies that a buffer of bytes contains a Message and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_message_unchecked.
- Assumes, without verification, that a buffer of bytes contains a Message and returns it.
- Verifies, with the given options, that a buffer of bytes contains a Message and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_message_unchecked.
- Verifies that a buffer of bytes contains a Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_schema_unchecked.
- Assumes, without verification, that a buffer of bytes contains a Schema and returns it.
- Verifies, with the given options, that a buffer of bytes contains a Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_schema_unchecked.
- Verifies that a buffer of bytes contains a SparseTensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_sparse_tensor_unchecked.
- Assumes, without verification, that a buffer of bytes contains a SparseTensor and returns it.
- Verifies, with the given options, that a buffer of bytes contains a SparseTensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_sparse_tensor_unchecked.
- Verifies that a buffer of bytes contains a Tensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_tensor_unchecked.
- Assumes, without verification, that a buffer of bytes contains a Tensor and returns it.
- Verifies, with the given options, that a buffer of bytes contains a Tensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_tensor_unchecked.
- Verifies that a buffer of bytes contains a size prefixed Footer and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_footer_unchecked.
- Assumes, without verification, that a buffer of bytes contains a size prefixed Footer and returns it.
- Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed Footer and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_footer_unchecked.
- Verifies that a buffer of bytes contains a size prefixed Message and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_message_unchecked.
- Assumes, without verification, that a buffer of bytes contains a size prefixed Message and returns it.
- Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed Message and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_message_unchecked.
- Verifies that a buffer of bytes contains a size prefixed Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_schema_unchecked.
- Assumes, without verification, that a buffer of bytes contains a size prefixed Schema and returns it.
- Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed Schema and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_schema_unchecked.
- Verifies that a buffer of bytes contains a size prefixed SparseTensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_sparse_tensor_unchecked.
- Assumes, without verification, that a buffer of bytes contains a size prefixed SparseTensor and returns it.
- Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed SparseTensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_sparse_tensor_unchecked.
- Verifies that a buffer of bytes contains a size prefixed Tensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use size_prefixed_root_as_tensor_unchecked.
- Assumes, without verification, that a buffer of bytes contains a size prefixed Tensor and returns it.
- Verifies, with the given verifier options, that a buffer of bytes contains a size prefixed Tensor and returns it. Note that verification is still experimental and may not catch every error, or be maximally performant. For the previous, unchecked, behavior use root_as_tensor_unchecked.
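The "size prefixed" functions above expect the buffer to begin with a length field ahead of the FlatBuffers root. As a rough illustration of what that prefix is, and of why checking it before use matters, here is a std-only sketch. It assumes the usual FlatBuffers convention of a 4-byte little-endian size that counts the bytes following it; both function names are hypothetical and neither is part of this crate or of the flatbuffers crate.

```rust
/// Prepend a 4-byte little-endian size to a payload.
/// (Hypothetical helper; assumes the payload fits in a u32.)
fn add_size_prefix(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Split a size-prefixed buffer into its payload, checking bounds first.
/// Returns None when the prefix is missing or claims more bytes than exist,
/// which is the kind of malformed input a verifying reader should reject.
fn strip_size_prefix(buf: &[u8]) -> Option<&[u8]> {
    let prefix = buf.get(..4)?;
    let size = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
    let end = 4usize.checked_add(size)?;
    buf.get(4..end)
}
```

An unchecked reader would skip these bounds tests and trust the prefix, which is faster but only appropriate for buffers from a trusted source.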