| Attribute | Apache Hudi | Apache Avro |
| --- | --- | --- |
| Name | Apache Hudi | Apache Avro |
| Description | Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. It uses data stored in either Parquet or ORC. | Apache Avro is the leading serialization format for record data, and first choice for streaming data pipelines. |
| Source code | https://github.com/apache/hudi | https://github.com/apache/avro |
| Website | https://hudi.apache.org/ | https://avro.apache.org/ |
| License | Apache License 2.0 | Apache License 2.0 |
| Year created | 2016 | 2009 |
| Company | Uber | Apache |
| Use cases | Incremental data processing, data upserts, Change Data Capture (CDC), ACID transactions | Stream processing, analytics, efficient data exchange |
| Language support | | Java, C++, C#, C, Python, JavaScript, Perl, Ruby, PHP, Rust |
| Is human readable | no | no |
| Orientation | column or row | row |
| Has type system | yes | yes |
| Has nested structure support | yes | yes |
| Has native compression | yes | yes |
| Has encoding support | yes | yes |
| Has constraint support | yes | no |
| Has ACID support | yes | no |
| Has metadata | yes | yes |
| Has encryption support | maybe | no |
| Data processing framework support | Apache Spark, Apache Flink | Apache Flink, Apache Gobblin, Apache NiFi, Apache Pig, Apache Spark |
| Analytics query support | Apache Hive, Apache Impala, AWS Athena, BigQuery, ClickHouse, Presto, Trino | Apache Impala, Apache Druid, Apache Hive, Apache Pinot, AWS Athena, BigQuery, ClickHouse, Firebolt |
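
The Orientation attribute above lists Avro as row-oriented and Hudi as column- or row-oriented. As a rough illustration of what that distinction means, the following sketch lays out the same records both ways in plain Python. The sample records, field names, and helper functions are hypothetical and invented for this example; it does not use or imitate either project's actual file format or API.

```python
# Hypothetical sample records, invented purely for illustration.
FIELDS = ("id", "city", "fare")
records = [
    {"id": 1, "city": "Oslo", "fare": 12.5},
    {"id": 2, "city": "Lima", "fare": 8.0},
    {"id": 3, "city": "Pune", "fare": 15.25},
]


def to_row_layout(rows):
    """Row orientation: all fields of one record are stored together."""
    return [tuple(row[field] for field in FIELDS) for row in rows]


def to_column_layout(rows):
    """Column orientation: all values of one field are stored together."""
    return {field: [row[field] for row in rows] for field in FIELDS}


row_layout = to_row_layout(records)
column_layout = to_column_layout(records)

# Fetching one complete record is a single lookup in the row layout,
# which suits record-at-a-time workloads such as streaming.
second_record = row_layout[1]

# Aggregating one field reads a single list in the column layout,
# which suits analytical scans over a few columns.
total_fare = sum(column_layout["fare"])

print(second_record)
print(total_fare)
```

In this toy model, the row layout keeps each record intact for writing and reading one record at a time, while the column layout lets a query read only the fields it needs.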