StreamWriter

Struct StreamWriter 

pub struct StreamWriter<W> {
    writer: W,
    write_options: IpcWriteOptions,
    finished: bool,
    dictionary_tracker: DictionaryTracker,
    data_gen: IpcDataGenerator,
    compression_context: CompressionContext,
}

Arrow Stream Writer

Writes Arrow RecordBatches to bytes using the IPC Streaming Format.

§Example - Basic usage

use arrow_array::record_batch;
use arrow_ipc::writer::StreamWriter;

let batch = record_batch!(("a", Int32, [1, 2, 3])).unwrap();
// in-memory destination for the IPC stream; any `Write` implementation works
let mut stream = vec![];
// create a new writer, the schema must be known in advance
let mut writer = StreamWriter::try_new(&mut stream, &batch.schema()).unwrap();
// write each batch to the underlying stream
writer.write(&batch).unwrap();
// When all batches are written, call finish to flush all buffers
writer.finish().unwrap();

§Example - Efficient delta dictionaries


let schema = Arc::new(Schema::new(vec![Field::new(
   "col1",
   DataType::Dictionary(Box::from(DataType::Int32), Box::from(DataType::Utf8)),
   true,
)]));

let mut builder = StringDictionaryBuilder::<arrow_array::types::Int32Type>::new();

// `finish_preserve_values` will keep the dictionary values along with their
// key assignments so that they can be re-used in the next batch.
builder.append("a").unwrap();
builder.append("b").unwrap();
let array1 = builder.finish_preserve_values();
let batch1 = RecordBatch::try_new(schema.clone(), vec![Arc::new(array1) as ArrayRef]).unwrap();

// In this batch, 'a' will have the same dictionary key as 'a' in the previous batch,
// and 'd' will take the next available key.
builder.append("a").unwrap();
builder.append("d").unwrap();
let array2 = builder.finish_preserve_values();
let batch2 = RecordBatch::try_new(schema.clone(), vec![Arc::new(array2) as ArrayRef]).unwrap();

let mut stream = vec![];
// You must set `.with_dictionary_handling(DictionaryHandling::Delta)` to
// enable delta dictionaries in the writer
let options = IpcWriteOptions::default().with_dictionary_handling(DictionaryHandling::Delta);
let mut writer = StreamWriter::try_new_with_options(&mut stream, &schema, options).unwrap();

// When writing the first batch, a dictionary message with 'a' and 'b' will be written
// prior to the record batch.
writer.write(&batch1).unwrap();
// With the second batch only a delta dictionary with 'd' will be written
// prior to the record batch. This is only possible with `finish_preserve_values`.
// Without it, 'a' and 'd' in this batch would have different keys than the
// first batch and so we'd have to send a replacement dictionary with new keys
// for both.
writer.write(&batch2).unwrap();
writer.finish().unwrap();
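
The reasoning in the comments above can be modelled without any Arrow types. The following is a minimal sketch, not the writer's actual `DictionaryTracker` logic: `DictionaryUpdate` and `plan_update` are hypothetical names introduced for illustration, and the dictionary is simplified to a slice of strings indexed by key. A delta is only possible when the previously sent values form a prefix of the new values, which is what `finish_preserve_values` guarantees.

```rust
/// What has to be sent before a record batch, given the dictionary values the
/// reader already holds. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum DictionaryUpdate {
    /// The reader's dictionary is already up to date.
    Unchanged,
    /// Existing keys kept their meaning; only the appended values are sent.
    Delta(Vec<String>),
    /// At least one existing key changed meaning; resend every value.
    Replacement(Vec<String>),
}

/// Compares the values sent with the previous batch against the values of the
/// current batch. The position of a value in the slice is its dictionary key.
fn plan_update(previous: &[&str], current: &[&str]) -> DictionaryUpdate {
    if current == previous {
        DictionaryUpdate::Unchanged
    } else if current.len() > previous.len() && current.starts_with(previous) {
        // Keys 0..previous.len() are unchanged, so only the tail is new.
        let added = current[previous.len()..]
            .iter()
            .map(|value| value.to_string())
            .collect();
        DictionaryUpdate::Delta(added)
    } else {
        // Some key now maps to a different value than before.
        DictionaryUpdate::Replacement(current.iter().map(|value| value.to_string()).collect())
    }
}
```

With preserved values, the second batch's dictionary is `["a", "b", "d"]` and `plan_update(&["a", "b"], &["a", "b", "d"])` yields a delta containing only `"d"`. With a dictionary rebuilt from scratch, `["a", "d"]`, key 1 changes from `"b"` to `"d"` and a replacement is required.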

Fields§

§writer: W
§write_options: IpcWriteOptions
§finished: bool
§dictionary_tracker: DictionaryTracker
§data_gen: IpcDataGenerator
§compression_context: CompressionContext

Implementations§

§

impl<W> StreamWriter<BufWriter<W>>
where W: Write,

pub fn try_new_buffered( writer: W, schema: &Schema, ) -> Result<StreamWriter<BufWriter<W>>, ArrowError>

Try to create a new stream writer with the writer wrapped in a BufWriter.

See StreamWriter::try_new for an unbuffered version.

§

impl<W> StreamWriter<W>
where W: Write,

pub fn try_new( writer: W, schema: &Schema, ) -> Result<StreamWriter<W>, ArrowError>

Try to create a new writer, with the schema written as part of the header.

Note that there is no internal buffering. See also StreamWriter::try_new_buffered.

§Errors

An ‘Err’ may be returned if writing the header to the writer fails.
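
To illustrate why the lack of internal buffering matters, here is a minimal std-only sketch. It does not use `StreamWriter` itself: `CountingSink` and `write_calls` are hypothetical names, and the 8-byte chunks merely stand in for the many small writes an unbuffered writer may issue. It counts how many `write` calls reach the underlying sink with and without a `BufWriter` in between.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that records how many `write` calls reach it.
struct CountingSink {
    calls: usize,
    bytes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.calls += 1;
        self.bytes += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `chunks` 8-byte chunks and returns `(write calls, total bytes)` as
/// seen by the sink, either directly or through a 1 KiB `BufWriter`.
fn write_calls(buffered: bool, chunks: usize) -> io::Result<(usize, usize)> {
    let sink = CountingSink { calls: 0, bytes: 0 };
    if buffered {
        let mut writer = BufWriter::with_capacity(1024, sink);
        for _ in 0..chunks {
            writer.write_all(&[0u8; 8])?;
        }
        // Nothing reaches the sink until the buffer fills up or is flushed.
        writer.flush()?;
        Ok((writer.get_ref().calls, writer.get_ref().bytes))
    } else {
        let mut writer = sink;
        for _ in 0..chunks {
            writer.write_all(&[0u8; 8])?;
        }
        Ok((writer.calls, writer.bytes))
    }
}
```

In this sketch, 100 chunks reach the sink as 100 separate calls when written directly, and as a single 800-byte call when buffered. When each call to the sink is expensive (a file or a socket, for example), that difference is the reason to prefer `StreamWriter::try_new_buffered`.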

pub fn try_new_with_options( writer: W, schema: &Schema, write_options: IpcWriteOptions, ) -> Result<StreamWriter<W>, ArrowError>

Try to create a new writer with IpcWriteOptions.

§Errors

An ‘Err’ may be returned if writing the header to the writer fails.

pub fn write(&mut self, batch: &RecordBatch) -> Result<(), ArrowError>

Write a record batch to the stream

pub fn finish(&mut self) -> Result<(), ArrowError>

Writes the end-of-stream marker (continuation bytes followed by a zero length) and marks the stream as done.

pub fn get_ref(&self) -> &W

Gets a reference to the underlying writer.

pub fn get_mut(&mut self) -> &mut W

Gets a mutable reference to the underlying writer.

It is inadvisable to directly write to the underlying writer.

pub fn flush(&mut self) -> Result<(), ArrowError>

Flush the underlying writer.

Both the BufWriter and the underlying writer are flushed.

pub fn into_inner(self) -> Result<W, ArrowError>

Unwraps the underlying writer.

The writer is flushed and the StreamWriter is finished before returning.

§Errors

An ‘Err’ may be returned if an error occurs while finishing the StreamWriter or while flushing the writer.

§Example

// The result we expect from an empty schema
let expected = vec![
    255, 255, 255, 255,  48,   0,   0,   0,
     16,   0,   0,   0,   0,   0,  10,   0,
     12,   0,  10,   0,   9,   0,   4,   0,
     10,   0,   0,   0,  16,   0,   0,   0,
      0,   1,   4,   0,   8,   0,   8,   0,
      0,   0,   4,   0,   8,   0,   0,   0,
      4,   0,   0,   0,   0,   0,   0,   0,
    255, 255, 255, 255,   0,   0,   0,   0
];

let schema = Schema::empty();
let buffer: Vec<u8> = Vec::new();
let options = IpcWriteOptions::try_new(8, false, MetadataVersion::V5)?;
let stream_writer = StreamWriter::try_new_with_options(buffer, &schema, options)?;

assert_eq!(stream_writer.into_inner()?, expected);
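
The expected bytes above follow the framing of the IPC streaming format: each message starts with the continuation marker `0xFFFFFFFF` and a little-endian 32-bit metadata length, and the stream ends with the marker followed by a zero length. The following is a minimal std-only sketch of walking that framing. `metadata_lengths` is a hypothetical helper, and it assumes that every message has an empty body (true for a schema-only stream like the one above) and that the stream uses continuation markers; a real reader must decode the metadata to find the body length.

```rust
/// Marker that precedes every encapsulated message in the stream.
const CONTINUATION: [u8; 4] = [0xFF, 0xFF, 0xFF, 0xFF];

/// Returns the metadata length of each message up to the end-of-stream
/// marker, or `None` if the framing is malformed or the marker is missing.
///
/// Assumption: no message carries a body, so the next message starts right
/// after the metadata of the previous one.
fn metadata_lengths(stream: &[u8]) -> Option<Vec<usize>> {
    let mut lengths = Vec::new();
    let mut pos = 0usize;
    loop {
        // 4 bytes of continuation marker + 4 bytes of little-endian length.
        let header = stream.get(pos..pos + 8)?;
        if header[..4] != CONTINUATION[..] {
            return None;
        }
        let len = i32::from_le_bytes([header[4], header[5], header[6], header[7]]);
        pos += 8;
        if len == 0 {
            // Marker followed by a zero length: end of stream.
            return Some(lengths);
        }
        if len < 0 {
            return None;
        }
        let len = len as usize;
        // Make sure the announced metadata is actually present.
        stream.get(pos..pos + len)?;
        lengths.push(len);
        pos += len;
    }
}
```

Applied to the 64 bytes of `expected`, this yields a single 48-byte message (the schema) followed by the end-of-stream marker written when the writer is finished.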

Trait Implementations§

§

impl<W> RecordBatchWriter for StreamWriter<W>
where W: Write,

§

fn write(&mut self, batch: &RecordBatch) -> Result<(), ArrowError>

Write a single batch to the writer.
§

fn close(self) -> Result<(), ArrowError>

Write footer or termination data, then mark the writer as done.

Auto Trait Implementations§

§

impl<W> Freeze for StreamWriter<W>
where W: Freeze,

§

impl<W> RefUnwindSafe for StreamWriter<W>
where W: RefUnwindSafe,

§

impl<W> Send for StreamWriter<W>
where W: Send,

§

impl<W> Sync for StreamWriter<W>
where W: Sync,

§

impl<W> Unpin for StreamWriter<W>
where W: Unpin,

§

impl<W> UnwindSafe for StreamWriter<W>
where W: UnwindSafe,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

§

impl<T> Instrument for T

§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
§

impl<T> PolicyExt for T
where T: ?Sized,

§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.
§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

§

fn vzip(self) -> V

§

impl<T> WithSubscriber for T

§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.
§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

§

impl<T> ErasedDestructor for T
where T: 'static,