Apache Parquet files are a popular columnar storage format used by data scientists and anyone using the Hadoop ecosystem. It was developed to be very efficient in terms of compression and encoding. Check out their documentation if you want to know all the details about how Parquet files work.
You can read and write Parquet files with Python using the pya…
Keep reading with a 7-day free trial
Subscribe to The Python Papers to keep reading this post and get 7 days of free access to the full post archives.