File


Attribute	Apache Avro	Apache Hudi	Apache Iceberg	Apache ORC	Apache Parquet	CSV	Delta Lake
Name	Apache Avro	Apache Hudi	Apache Iceberg	Apache ORC	Apache Parquet	CSV	Delta Lake
Description	Apache Avro is the leading serialization format for record data, and a first choice for streaming data pipelines.	Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Utilises data stored in either Parquet or ORC.	Iceberg is a high-performance format for huge analytic tables. Utilises data stored in Parquet, Avro, or ORC.	ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads.	Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval.	Comma-Separated Values (CSV) is a text file format that uses commas to separate values in plain text.	Delta Lake is an open-source storage framework that enables building a Lakehouse architecture.
License	Apache License 2.0	Apache License 2.0	Apache License 2.0	Apache License 2.0	Apache License 2.0	N/A	Apache License 2.0
Source code	https://github.com/apache/avro	https://github.com/apache/hudi	https://github.com/apache/iceberg	https://github.com/apache/orc	https://github.com/apache/parquet-format		https://github.com/delta-io/delta
Website	https://avro.apache.org/	https://hudi.apache.org/	https://iceberg.apache.org/	https://orc.apache.org/	https://parquet.apache.org/	https://www.rfc-editor.org/rfc/rfc4180.html	https://delta.io/
Year created	2009	2016	2017	2013	2013	N/A	2019
Company	Apache	Uber	Netflix	Hortonworks, Facebook	Twitter, Cloudera		Databricks
Language support	Java, C++, C#, C, Python, JavaScript, Perl, Ruby, PHP, Rust			Java, Scala, C++, Python	Java, Scala, C++, Python, R, PHP	Java, Scala, C++, Python, R, PHP, Go	Scala, Java, Python, Rust
Use cases	Stream processing, Analytics, Efficient data exchange	Incremental data processing, Data upserts, Change Data Capture (CDC), ACID transactions	Write once read many, Analytics, Efficient storage, ACID transactions	Write once read many, Analytics, Efficient storage, ACID transactions	Write once read many, Analytics, Efficient storage, Column based queries		Write once read many, Analytics, Efficient storage, ACID transactions
Is human readable	no	no	no	no	no	yes	no
Orientation	row	column or row	column or row	column	column	row	column
Has type system	yes	yes	yes	yes	yes	no	yes
Has nested structure support	yes	yes	yes	yes	yes	no	yes
Has native compression	yes	yes	yes	yes	yes	no	yes
Has encoding support	yes	yes	yes	yes	yes	no	yes
Has constraint support	no	yes	no	no	no	no	yes
Has ACID support	no	yes	yes	no	no	no	yes
Has metadata	yes	yes	yes	yes	yes	no	yes
Has encryption support	no	maybe	maybe	yes	yes	no	maybe
Data processing framework support	Apache Flink, Apache Gobblin, Apache NiFi, Apache Pig, Apache Spark	Apache Spark, Apache Flink	Apache Drill, Apache Flink, Apache Gobblin, Apache Pig, Apache Spark	Apache Flink, Apache Gobblin, Apache Hadoop, Apache NiFi, Apache Pig, Apache Spark	Apache Beam, Apache Drill, Apache Flink, Apache Spark	Apache Beam, Apache Drill, Apache Flink, Apache Gobblin, Apache Hive, Apache NiFi, Apache Pig, Apache Spark	Apache Drill, Apache Flink, Apache Spark
Analytics query support	Apache Impala, Apache Druid, Apache Hive, Apache Pinot, AWS Athena, BigQuery, ClickHouse, Firebolt	Apache Hive, Apache Impala, AWS Athena, BigQuery, ClickHouse, Presto, Trino	Apache Impala, Apache Druid, Apache Hive, AWS Athena, BigQuery, ClickHouse, Dremio, DuckDB, Presto, Trino	Apache Impala, Apache Druid, Apache Hive, Apache Pinot, AWS Athena, BigQuery, ClickHouse, Firebolt, Presto, Trino	Apache Hive, Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt	Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt	Apache Hive, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, Presto, Trino
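
The "Has type system" row marks CSV as the only format in the table without types. The following is a minimal sketch of what that can mean in practice, using only Python's standard-library csv module. The column names (id, score, active) and the sample values are made up for illustration and do not come from the table.

```python
import csv
import io


def csv_roundtrip(header, rows):
    """Write rows to an in-memory CSV file and read them back as dicts."""
    # newline="" is what the csv module documentation recommends for file objects.
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)

    # Rewind and parse the text we just wrote.
    buf.seek(0)
    return [dict(record) for record in csv.DictReader(buf)]


# Hypothetical sample data: an int, a float and a bool go in.
records = csv_roundtrip(["id", "score", "active"], [(1, 2.5, True)])

# Every value comes back as a plain string, because CSV stores no type information.
print(records[0])
```

Any typing has to be reapplied by the reader (for example with int() or float()), whereas the other formats in the table carry a schema alongside the data.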