Implementation status

This page summarizes the features supported by different Parquet implementations.

Note: If you find out of date information, please help us improve the accuracy of this page by opening an issue or submitting a pull request.

Legend

The value in each box means:

  • ✅: supported
  • ❌: not supported
  • (R/W): partial reader/writer only support
  • (blank): no data

Implementations:

Physical types

Physical types are defined by the enum Type in parquet.thrift

Data typearrowparquet-javaarrow-goarrow-rscudfhyparquetduckdb
BOOLEAN
INT32
INT64
INT96 (1)(R)(R)
FLOAT
DOUBLE
BYTE_ARRAY
FIXED_LEN_BYTE_ARRAY
  • (1) This type is deprecated, but as of 2024 it’s common in currently produced parquet files

Logical types

Logical types are defined by the union LogicalType in parquet.thrift and described in LogicalTypes.md

Data typearrowparquet-javaarrow-goarrow-rscudfhyparquetduckdb
STRING
ENUM✅ (1)
UUID✅ (1)
8, 16, 32, 64 bit signed and unsigned INT
DECIMAL (INT32)
DECIMAL (INT64)
DECIMAL (BYTE_ARRAY)(R)
DECIMAL (FIXED_LEN_BYTE_ARRAY)
FLOAT16✅ (1)
DATE
TIME (INT32)
TIME (INT64)
TIMESTAMP (INT64)
INTERVAL✅ (1)
JSON✅ (1)✅ (1)
BSON✅ (1)✅ (1)
VARIANT
GEOMETRY
GEOGRAPHY
LIST(R)
MAP(R)
UNKNOWN (always null)
  • (1) Only supported to use its annotated physical type

Encodings

Encodings are defined by the enum Encoding in parquet.thrift and described in Encodings.md

Encodingarrowparquet-javaarrow-goarrow-rscudfhyparquetduckdb
PLAIN
PLAIN_DICTIONARY(R)
RLE_DICTIONARY
RLE
BIT_PACKED (deprecated)❌ (1)(R)(R)
DELTA_BINARY_PACKED(R)
DELTA_LENGTH_BYTE_ARRAY(R)
DELTA_BYTE_ARRAY(R)
BYTE_STREAM_SPLIT(R)
  • (1) Partial read support, but only in the case of level data with a bitwidth of 0

Compressions

Compressions are defined by the enum CompressionCodec in parquet.thrift and described in Compression.md

Compressionarrowparquet-javaarrow-goarrow-rscudfhyparquetduckdb
UNCOMPRESSED
BROTLI(R)(R)
GZIP(R)(R)
LZ4 (deprecated)(R)
LZ4_RAW(R)
LZO
SNAPPY
ZSTD(R)

Other format level features

Featurearrowparquet-javaarrow-goarrow-rscudfhyparquetduckdb
xxHash-based bloom filters(R)(R)
Bloom filter length (1)(R)(R)
Statistics min_value, max_value
Page index(R)(R)
Page CRC32 checksum(R)
Modular encryption✅ (*)
Size statistics (2)(R)(R)
Data Page V2 (3)
  • (*) Partial support

High level data APIs for Parquet feature usage

Featurearrowparquet-javaarrow-goarrow-rscudfhyparquetduckdb
External column data (1)(W)
Row group “Sorting column” metadata (2)(W)(R)
Row group pruning using statistics✅ (*)
Row group pruning using bloom filter✅ (*)
Reading select columns only
Page pruning using statistics✅ (*)
  • (1) In parquet.thrift: ColumnChunk->file_path

  • (2) In parquet.thrift: RowGroup->sorting_columns

  • (*) Partial Support