
File

CSV vs. Apache Parquet

| Attribute | CSV | Apache Parquet |
| --- | --- | --- |
| Name | CSV | Apache Parquet |
| Description | Comma-Separated Values (CSV) is a plain-text file format that uses commas to separate values. | Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. |
| Source code | | https://github.com/apache/parquet-format |
| Website | https://www.rfc-editor.org/rfc/rfc4180.html | https://parquet.apache.org/ |
| Language support | Java, Scala, C++, Python, R, PHP, Go | Java, Scala, C++, Python, R, PHP |
| License | N/A | Apache License 2.0 |
| Year created | N/A | 2013 |
| Company | | Twitter, Cloudera |
| Use cases | | Write once read many, Analytics, Efficient storage, Column-based queries |
| Is human readable | yes | no |
| Orientation | row | column |
| Has type system | no | yes |
| Has nested structure support | no | yes |
| Has native compression | no | yes |
| Has encoding support | no | yes |
| Has constraint support | no | no |
| Has ACID support | no | no |
| Has metadata | no | yes |
| Has encryption support | no | yes |
| Data processing framework support | Apache Beam, Apache Drill, Apache Flink, Apache Gobblin, Apache Hive, Apache NiFi, Apache Pig, Apache Spark | Apache Beam, Apache Drill, Apache Flink, Apache Spark |
| Analytics query support | Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt | Apache Hive, Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt |
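
The "Has type system" and "Orientation" attributes above can be illustrated with a minimal sketch that uses only the Python standard library. The sample data and the helper names `read_rows` and `to_columns` are hypothetical and chosen for illustration. The column-oriented dictionary only mimics the idea of storing values per column; it is not Parquet's actual on-disk layout, which requires a dedicated library to read or write.

```python
import csv
import io

# Hypothetical sample data; the column names are illustrative only.
CSV_TEXT = "id,city,temp_c\n1,Oslo,4.5\n2,Cairo,31.0\n3,Lima,19.2\n"


def read_rows(text):
    """Parse CSV text into a list of row dicts; every value stays a string."""
    return list(csv.DictReader(io.StringIO(text)))


def to_columns(rows):
    """Pivot row dicts into one list per column (a column-oriented layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


rows = read_rows(CSV_TEXT)
columns = to_columns(rows)

# CSV carries no type information, so the caller has to cast explicitly.
temps = [float(value) for value in columns["temp_c"]]

print(rows[0])            # one complete record (row orientation)
print(columns["temp_c"])  # one complete column, still as strings
print(max(temps))         # a column-based query after manual casting
```

In the row layout, answering a question about a single column means touching every record. In the column layout, the values for that column are already grouped together, which is the property that column-based queries rely on.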