Stream API

This page groups the row-by-row streaming APIs for T-TOON and T-JSON. The reader, writer, and codec sections show the Python API; the schema sections also list the JavaScript and Rust equivalents.

All streaming APIs require a StreamSchema.

Shared format conventions:

  • T-TOON stream: [*]{fields}:
  • T-JSON stream: top-level array of objects
  • Object path: row values as language-native objects
  • Arrow path: row batches as Arrow-native batches
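
To make the two stream shapes concrete, the sketch below renders a T-TOON stream header of the form [*]{fields}: and parses a T-JSON stream with the standard-library json module. The helper ttoon_stream_header is hypothetical and not part of ttoon, and it assumes field names are joined with commas. Only the header shape and the top-level-array convention come from the list above.

```python
import json

def ttoon_stream_header(fields):
    """Hypothetical helper: render a T-TOON stream header line, [*]{fields}:"""
    # Assumption: field names are comma-separated inside the braces.
    return "[*]{" + ",".join(fields) + "}:"

header = ttoon_stream_header(["name", "score"])

# A T-JSON stream is a top-level JSON array whose elements are objects (rows).
tjson_text = '[{"name": "Alice", "score": 95}, {"name": "Bob", "score": 88}]'
rows = json.loads(tjson_text)

# On the object path, each row is a language-native object (a dict in Python).
for row in rows:
    print(row["name"], row["score"])
```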

Package: ttoon

Reader Factories

| Function | Returns | Format | Path |
| --- | --- | --- | --- |
| stream_read(source, *, schema, mode=None, codecs=None) | StreamReader | T-TOON | Object |
| stream_read_tjson(source, *, schema, mode=None, codecs=None) | TjsonStreamReader | T-JSON | Object |
| stream_read_arrow(source, *, schema, batch_size=1024, mode=None) | ArrowStreamReader | T-TOON | Arrow |
| stream_read_arrow_tjson(source, *, schema, batch_size=1024, mode=None) | TjsonArrowStreamReader | T-JSON | Arrow |

All readers are Python iterators:

for row in reader:
    print(row)

For T-JSON streaming readers, mode does not relax JSON value syntax. It only controls how schema-unknown fields are handled: compat discards them, while strict rejects them.
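
The compat/strict distinction can be illustrated with a small stand-in that uses only the standard library. This is a sketch of the described behavior, not ttoon's reader: the function name iter_tjson_rows, its known_fields parameter, and the ValueError raised in strict mode are all assumptions made for illustration.

```python
import json

def iter_tjson_rows(text, known_fields, mode):
    """Illustrative stand-in (not ttoon's reader): yield rows from a T-JSON array."""
    for row in json.loads(text):
        unknown = [key for key in row if key not in known_fields]
        if unknown and mode == "strict":
            # Assumption: the real reader's exception type may differ.
            raise ValueError(f"unknown fields: {unknown}")
        # compat: drop fields the schema does not declare.
        yield {key: value for key, value in row.items() if key in known_fields}

text = '[{"name": "Alice", "score": 95, "extra": true}]'
compat_rows = list(iter_tjson_rows(text, {"name", "score"}, mode="compat"))
```

With mode="strict", the same input raises in this sketch instead of yielding a trimmed row. In both modes the JSON text itself must still be valid JSON.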

Writer Factories

| Function | Returns | Format | Path |
| --- | --- | --- | --- |
| stream_writer(sink, *, schema, delimiter=",", binary_format=None, codecs=None) | StreamWriter | T-TOON | Object |
| stream_writer_tjson(sink, *, schema, binary_format=None, codecs=None) | TjsonStreamWriter | T-JSON | Object |
| stream_writer_arrow(sink, *, schema, delimiter=",", binary_format=None) | ArrowStreamWriter | T-TOON | Arrow |
| stream_writer_arrow_tjson(sink, *, schema, binary_format=None) | TjsonArrowStreamWriter | T-JSON | Arrow |

All writers support context managers:

with stream_writer(sink, schema=schema) as writer:
    writer.write({"name": "Alice", "score": 95})
result = writer.result

Writer Methods and Result

| Class | Write Method | Notes |
| --- | --- | --- |
| StreamWriter | write(row: Mapping) | Object rows |
| TjsonStreamWriter | write(row: Mapping) | Object rows |
| ArrowStreamWriter | write_batch(batch) | Arrow RecordBatch |
| TjsonArrowStreamWriter | write_batch(batch) | Arrow RecordBatch |

StreamResult:

| Attribute | Type | Description |
| --- | --- | --- |
| rows_emitted | int | Number of rows written |
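
The writer surface described above (context manager, write(), and a result carrying rows_emitted) can be mimicked with a small stand-in for experimentation. SketchWriter and SketchResult are hypothetical classes written for this example. They are not ttoon types and they write to a plain list rather than encoding T-TOON or T-JSON.

```python
from dataclasses import dataclass

@dataclass
class SketchResult:
    rows_emitted: int = 0  # mirrors the documented StreamResult attribute

class SketchWriter:
    """Hypothetical stand-in that mirrors the documented writer surface."""

    def __init__(self, sink):
        self._sink = sink
        self.result = SketchResult()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions raised inside the block

    def write(self, row):
        # Copy the mapping so later mutation by the caller does not affect the sink.
        self._sink.append(dict(row))
        self.result.rows_emitted += 1

sink = []
with SketchWriter(sink) as writer:
    writer.write({"name": "Alice", "score": 95})
    writer.write({"name": "Bob", "score": 88})
result = writer.result
print(result.rows_emitted)
```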

Codec Scope

use(codecs) -> None

Registers global codecs for Python object-path streaming APIs.

Codecs affect:

  • stream_read() / stream_writer()
  • stream_read_tjson() / stream_writer_tjson()

They do not affect batch loads(), batch to_tjson(), Arrow-path streaming, or direct transcode.

Stream Schema

StreamSchema defines the field names and types for streaming operations. All streaming readers and writers require a schema.

Construction

from ttoon import StreamSchema, types

# From dict
schema = StreamSchema({
    "name": types.string,
    "score": types.int,
    "amount": types.decimal(10, 2),
})

# From list of tuples (preserves insertion order)
schema = StreamSchema([
    ("name", types.string),
    ("score", types.int),
])

Types Namespace

| Type | Python | JavaScript | Rust |
| --- | --- | --- | --- |
| String | types.string | types.string | ScalarType::String |
| Int | types.int | types.int | ScalarType::Int |
| Float | types.float | types.float | ScalarType::Float |
| Bool | types.bool | types.bool | ScalarType::Bool |
| Date | types.date | types.date | ScalarType::Date |
| Time | types.time | types.time | ScalarType::Time |
| DateTime (tz) | types.datetime | types.datetime | ScalarType::DateTime { has_tz: true } |
| DateTime (naive) | types.datetime_naive | types.datetimeNaive | ScalarType::DateTime { has_tz: false } |
| UUID | types.uuid | types.uuid | ScalarType::Uuid |
| Binary | types.binary | types.binary | ScalarType::Binary |
| Decimal(p, s) | types.decimal(p, s) | types.decimal(p, s) | ScalarType::decimal(p, s) or ScalarType::Decimal { precision, scale } |

Rust also exposes convenience constructors ScalarType::datetime() and ScalarType::datetime_naive().

Nullable Fields

All type specs support .nullable() to allow null values in the column:

schema = StreamSchema({
    "name": types.string,                 # NOT NULL
    "nickname": types.string.nullable(),  # nullable
})

Schema Access

schema["name"]   # returns a field spec built from ttoon.types
len(schema)      # number of fields
list(schema)     # field names
schema.export()  # serializable form

Validation Rules

All three language surfaces enforce the same conceptual rules:

  • schemas must contain at least one field
  • field names must be strings
  • duplicate field names are rejected
  • field types must come from the language-specific typed schema surface

Error surface by language:

  • Python: invalid names/types raise TypeError; duplicates/empty schemas raise ValueError
  • JavaScript: invalid names/types raise TypeError; duplicates/empty schemas raise Error
  • Rust: StreamSchema::try_new() returns Result; StreamSchema::new() panics on invalid input
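
The four rules and the Python error mapping can be sketched as a standalone check. This is an illustration of the rules as stated above, not ttoon's implementation: validate_schema_fields and the allowed_types set are hypothetical, and plain strings stand in for real type specs.

```python
def validate_schema_fields(fields, allowed_types):
    """Illustrative check of the four schema rules (not ttoon's implementation)."""
    # Accept either a dict or a list of (name, type_spec) tuples.
    pairs = list(fields.items()) if isinstance(fields, dict) else list(fields)
    if not pairs:
        raise ValueError("schema must contain at least one field")
    seen = set()
    for name, type_spec in pairs:
        if not isinstance(name, str):
            raise TypeError(f"field name must be a string, got {type(name).__name__}")
        if type_spec not in allowed_types:
            raise TypeError(f"unsupported type spec for field {name!r}")
        if name in seen:
            raise ValueError(f"duplicate field name: {name!r}")
        seen.add(name)
    return [name for name, _ in pairs]

ALLOWED = {"string", "int"}  # placeholder for the real typed schema surface
names = validate_schema_fields([("name", "string"), ("score", "int")], ALLOWED)
```

Duplicates can only reach such a check through the list-of-tuples form, because a Python dict literal already collapses repeated keys.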

Decimal Constraints

decimal(precision, scale) is forwarded to the Rust backend. Effective backend limits are:

  • precision must be between 1 and 76
  • scale must fit in a Rust i8 (-128 to 127)
  • Arrow conversion uses Decimal128 for precision <= 38, otherwise Decimal256

Out-of-range values may be accepted by the Python/JS wrapper constructors but will fail once the schema is validated or converted in Rust.
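
Because the wrapper constructors may not catch out-of-range values, a pre-flight check on the caller's side can surface the problem earlier. The sketch below encodes the limits listed above. check_decimal is a hypothetical helper, and the ValueError it raises is an assumption rather than the error the Rust backend reports.

```python
def check_decimal(precision, scale):
    """Illustrative pre-flight check of the documented backend limits."""
    if not 1 <= precision <= 76:
        raise ValueError("precision must be between 1 and 76")
    if not -128 <= scale <= 127:  # value range of a Rust i8
        raise ValueError("scale must fit in a Rust i8 (-128 to 127)")
    # Arrow conversion picks the narrower type when the precision allows it.
    return "Decimal128" if precision <= 38 else "Decimal256"

arrow_type = check_decimal(10, 2)
wide_type = check_decimal(50, 4)
```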

Arrow Schema Conversion (Rust)

// StreamSchema -> Arrow Schema
let arrow_schema = schema.to_arrow_schema()?;

// Arrow Schema -> StreamSchema
let stream_schema = StreamSchema::from_arrow_schema(&arrow_schema)?;