| Attribute | Apache Parquet | Delta Lake |
| --- | --- | --- |
| Name | Apache Parquet | Delta Lake |
| Description | Apache Parquet is an open-source, column-oriented data file format designed for efficient data storage and retrieval. | Delta Lake is an open-source storage framework that enables building a Lakehouse architecture. |
| License | Apache License 2.0 | Apache License 2.0 |
| Source code | https://github.com/apache/parquet-format | https://github.com/delta-io/delta |
| Website | https://parquet.apache.org/ | https://delta.io/ |
| Year created | 2013 | 2019 |
| Company | Twitter, Cloudera | Databricks |
| Language support | Java, Scala, C++, Python, R, PHP | Scala, Java, Python, Rust |
| Use cases | Write once read many, Analytics, Efficient storage, Column-based queries | Write once read many, Analytics, Efficient storage, ACID transactions |
| Is human readable | no | no |
| Orientation | column | column |
| Has type system | yes | yes |
| Has nested structure support | yes | yes |
| Has native compression | yes | yes |
| Has encoding support | yes | yes |
| Has constraint support | no | yes |
| Has ACID support | no | yes |
| Has metadata | yes | yes |
| Has encryption support | yes | maybe |
| Data processing framework support | Apache Beam, Apache Drill, Apache Flink, Apache Spark | Apache Drill, Apache Flink, Apache Spark |
| Analytics query support | Apache Hive, Apache Impala, Apache Druid, Apache Pinot, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, DuckDB, Firebolt | Apache Hive, AWS Athena, Azure Synapse, BigQuery, ClickHouse, Dremio, Presto, Trino |
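
The comparison above lists both formats as column-oriented, with encoding and compression support. As a rough illustration of what those three attributes mean together, here is a toy sketch using only the Python standard library. It is not the Parquet or Delta Lake on-disk format and uses neither project's API; the sample records and the helper names (`to_columns`, `dictionary_encode`, `dictionary_decode`) are hypothetical and exist only for this example.

```python
import json
import zlib

# Hypothetical sample records, invented for this illustration.
rows = [
    {"id": i, "country": "US" if i % 3 else "DE", "amount": i * 2}
    for i in range(1000)
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    names = list(records[0])
    return {name: [record[name] for record in records] for name in names}


def dictionary_encode(values):
    """Replace each value with an index into a sorted list of distinct values."""
    dictionary = sorted(set(values))
    index = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in values]


def dictionary_decode(dictionary, codes):
    """Invert dictionary_encode."""
    return [dictionary[code] for code in codes]


columns = to_columns(rows)

# A column-based query reads one column and never touches the others.
total_amount = sum(columns["amount"])

# Encoding: a low-cardinality column becomes small integer codes.
dictionary, codes = dictionary_encode(columns["country"])

# Compression: serialize each layout and compress it with zlib.
row_bytes = zlib.compress(json.dumps(rows).encode())
column_bytes = zlib.compress(json.dumps(columns).encode())

print("row layout, compressed bytes:   ", len(row_bytes))
print("column layout, compressed bytes:", len(column_bytes))
```

Real columnar formats apply encodings and compression per column chunk and store the schema and statistics as metadata next to the data, so the sizes printed here only hint at the general effect of grouping similar values together.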