Information on Parquet Features
Taming Floating-Point Statistics in Apache Parquet: IEEE 754 Total Order and NaN Counts
Friday, May 29, 2026 in features
Column statistics are the secret to Apache Parquet’s blazing-fast performance. By storing compact summaries (such as min, max, and null counts) for row groups, column chunks, and pages, readers can easily skip irrelevant data that doesn’t …
Variant Type in Apache Parquet for Semi-Structured Data
Friday, February 27, 2026 in features
The Apache Parquet community is excited to announce the addition of the Variant type, a feature that brings native support for semi-structured data to Parquet and significantly improves efficiency over formats such as JSON. This …
Native Geospatial Types in Apache Parquet
Friday, February 13, 2026 in features
Geospatial data has become a core input for modern analytics across logistics, climate science, urban planning, mobility, and location intelligence. Yet for a long time, spatial data lived outside the mainstream analytics ecosystem. In primarily …