Implementation status

This page summarizes the features supported by different Parquet implementations.

Note: This is a work in progress and we would welcome help expanding its scope.

Legend

The value in each box means:

  • ✅: supported
  • ❌: not supported
  • (blank) no data

Implementations:

Physical types

Data typeC++JavaGoRust
BOOLEAN
INT32
INT64
INT96 (1)
FLOAT
DOUBLE
BYTE_ARRAY
FIXED_LEN_BYTE_ARRAY
  • (1) This type is deprecated, but as of 2024 it’s common in currently produced parquet files

Logical types

Data typeC++JavaGoRust
STRING
ENUM
UUID
8, 16, 32, 64 bit signed and unsigned INT
DECIMAL (INT32)
DECIMAL (INT64)
DECIMAL (BYTE_ARRAY)
DECIMAL (FIXED_LEN_BYTE_ARRAY)
DATE
TIME (INT32)
TIME (INT64)
TIMESTAMP (INT64)
INTERVAL
JSON
BSON
LIST
MAP
UNKNOWN (always null)
FLOAT16

Encodings

EncodingC++JavaGoRust
PLAIN
PLAIN_DICTIONARY
RLE_DICTIONARY
RLE
BIT_PACKED (deprecated)
DELTA_BINARY_PACKED
DELTA_LENGTH_BYTE_ARRAY
DELTA_BYTE_ARRAY
BYTE_STREAM_SPLIT

Compressions

CompressionC++JavaGoRust
UNCOMPRESSED
BROTLI
GZIP
LZ4 (deprecated)
LZ4_RAW
LZO
SNAPPY
ZSTD

Other format level features

C++JavaGoRust
xxxHash-based bloom filters
Bloom filter length (1)
Statistics min_value, max_value
Page index
Page CRC32 checksum
Modular encryption
Size statistics (2)
  • (1) In parquet.thrift: ColumnMetaData->bloom_filter_length

  • (2) In parquet.thrift: ColumnMetaData->size_statistics

High level data APIs for Parquet feature usage

FormatC++JavaGoRust
External column data (1)
Row group “Sorting column” metadata (2)
Row group pruning using statistics
Reading select columns only
Page pruning using statistics
Page pruning using bloom filter
  • (1) In parquet.thrift: ColumnChunk->file_path

  • (2) In parquet.thrift: RowGroup->sorting_columns