Databases

AWS and IBM Netezza come out in support of Iceberg in table format face-off

Join Snowflake, Cloudera, Google as Apache format fans

Tue 1 Aug 2023 // 08:36 UTC

Cloud giant AWS has picked table format Apache Iceberg to extend the reach of its Redshift data warehouse to data lakes, in a move replicated by IBM's Netezza last week.

AWS revealed that it was previewing support for Iceberg, which emerged from Netflix in the late 2010s, to allow users to employ Redshift to run analytics queries on Apache Iceberg tables in external data lakes.

"You can now use Amazon Redshift to query your Apache Iceberg tables in AWS Glue Data Catalog while other users or applications can safely conduct data manipulation on your tables using ACID-compliant services like Amazon EMR, Amazon Athena, and AWS Glue," it said.

The fine print introduced some caveats, though. "New Iceberg tables only – Queries on partitioned tables which were converted from Apache Parquet tables to Apache Iceberg tables and include partition columns in the query are not supported," it said in an accompanying user guide.

AWS later clarified how the system could be used to query data outside its cloud platform.

"Amazon Redshift provides transactional consistency for querying Apache Iceberg tables from data lakes in AWS (including Amazon S3). To run analytics on external data sources (including Google BigQuery or Google Cloud Storage), AWS customers can use Amazon Athena's prebuilt data source connectors," the company told The Register.

It said that pricing would be based on Redshift Spectrum or Redshift Serverless usage.

Another fillip for Iceberg comes from IBM's Netezza, that almost forgotten data warehouse originally based on PostgreSQL. We last heard from Netezza when IBM, which bought it in 2010, finally moved the system to the cloud.

IBM software engineer Mike DeRoy blogged this week that users can employ IBM's lakehouse technology watsonx.data to create tables in the Apache Iceberg table format, "allowing any compatible engine to access the data and preventing you from being locked in to any specific engine."

"IBM is bringing first class lakehouse integration into the Netezza engine, allowing you to query Iceberg tables from both the watsonx.data platform, as well as other datalake platforms," he said.

Who's sitting at which table?

Although hardly a Betamax vs VHS standards face-off, the big-hitting vendors seem divided over which table format to back as they pursue the vision of bringing analytics engines to the data, wherever the data is. Snowflake, Cloudera, Google and now AWS and Netezza have gone with Iceberg. But Microsoft, SAP and Databricks have picked Delta Lake, the table format Databricks created, with the open source project managed by the Linux Foundation.

Each vendor has justified its approach by saying its chosen format reflects what customers are demanding most. They have also said they would support a range of formats, including Apache Hudi, in the fullness of time.

Which leaves Oracle. Earlier this month Big Red said it was extending its MySQL HeatWave to query data held in object storage. It means its own object storage, of course. Oracle did say, though, that it intends to support open table formats, starting with Iceberg and Delta Lake, in the future. ®
