Apache Avro vs Apache Hudi

| Attribute | Apache Avro | Apache Hudi |
| --- | --- | --- |
| Name | Apache Avro | Apache Hudi |
| Description | Apache Avro is the leading serialization format for record data, and first choice for streaming data pipelines. | Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. It uses data stored in either Parquet or ORC. |
| License | Apache License 2.0 | Apache License 2.0 |
| Source code | https://github.com/apache/avro | https://github.com/apache/hudi |
| Website | https://avro.apache.org/ | https://hudi.apache.org/ |
| Year created | 2009 | 2016 |
| Company | Apache | Uber |
| Language support | Java, C++, C#, C, Python, JavaScript, Perl, Ruby, PHP, Rust | |
| Use cases | Stream processing, Analytics, Efficient data exchange | Incremental data processing, Data upserts, Change Data Capture (CDC), ACID transactions |
| Is human readable | no | no |
| Orientation | row | column or row |
| Has type system | yes | yes |
| Has nested structure support | yes | yes |
| Has native compression | yes | yes |
| Has encoding support | yes | yes |
| Has constraint support | no | yes |
| Has ACID support | no | yes |
| Has metadata | yes | yes |
| Has encryption support | no | maybe |
| Data processing framework support | Apache Flink, Apache Gobblin, Apache NiFi, Apache Pig, Apache Spark | Apache Spark, Apache Flink |
| Analytics query support | Apache Impala, Apache Druid, Apache Hive, Apache Pinot, AWS Athena, BigQuery, ClickHouse, Firebolt | Apache Hive, Apache Impala, AWS Athena, BigQuery, ClickHouse, Presto, Trino |
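
The Orientation attribute describes how records are laid out in storage. As a rough illustration, the sketch below arranges the same records row-wise and column-wise using plain Python structures. It does not use the Avro or Hudi APIs, and the records, field names, and helper functions are hypothetical.

```python
# Illustrative only: plain Python structures, not the Avro or Hudi APIs.
# The records and field names below are hypothetical.
records = [
    {"id": 1, "city": "Oslo", "temp_c": 4.5},
    {"id": 2, "city": "Lima", "temp_c": 19.0},
    {"id": 3, "city": "Pune", "temp_c": 27.5},
]


def to_row_layout(rows):
    """Row orientation: all fields of one record are kept together."""
    return [tuple(r.values()) for r in rows]


def to_column_layout(rows):
    """Column orientation: all values of one field are kept together."""
    return {name: [r[name] for r in rows] for name in rows[0]}


row_layout = to_row_layout(records)
column_layout = to_column_layout(records)

# Reading a whole record touches one entry of the row layout.
print(row_layout[0])

# Aggregating one field touches one list of the column layout.
temps = column_layout["temp_c"]
mean_temp = sum(temps) / len(temps)
print(column_layout["city"], mean_temp)
```

Under these assumptions, the row layout suits workloads that write or read whole records at a time, such as the stream processing use case listed for Avro, while the column layout suits analytic scans over a few fields.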