Modules

The parquet-format project contains format specifications and Thrift definitions of metadata required to properly read Parquet files.

The parquet-mr project contains multiple sub-modules, which implement the core components of reading and writing a nested, column-oriented data stream, map this core onto the parquet format, and provide Hadoop Input/Output Formats, Pig loaders, and other Java-based utilities for interacting with Parquet.

The parquet-cpp project is a C++ library to read-write Parquet files. It is part of the Apache Arrow C++ implementation, with bindings to Python, R, Ruby and C/GLib.

The parquet-rs project is a Rust library to read-write Parquet files.

The parquet-compatibility project (deprecated) contains compatibility tests that can be used to verify that implementations in different languages can read and write each other’s files. As of January 2022 compatibility tests only exist up to version 1.2.0.