Qlik has announced new capabilities in Qlik Open Lakehouse that bring streaming ingestion and real-time transformations to its managed, Apache Iceberg–based lakehouse.
With these additions, teams can ingest high-volume events from Apache Kafka, Amazon Kinesis, and Amazon Simple Storage Service (Amazon S3) into governed Iceberg tables and apply transformations as the data lands.
Data quality, lineage, cataloguing, and the Qlik Trust Score are applied automatically. The data is immediately available to analytics, applications, and ML teams, with ingestion and transformation offloaded to cost-effective compute.
Qlik also expanded its Iceberg ecosystem integrations to simplify hybrid lakehouse and warehouse designs. The additions are support for Snowflake Open Catalog, enhanced Apache Spark compatibility, and zero-copy mirroring to Databricks and Amazon Redshift, alongside the already available mirroring to Snowflake.
“The next phase of AI is operational,” says Drew Clarke, executive vice-president: product and technology at Qlik. “It runs on fresh, governed data, not nightly batches.
“By adding streaming ingestion and on-the-fly transformations to Open Lakehouse, teams get access to an open and trusted enterprise data foundation in their own cloud, built on Iceberg and integrated with the engines they already use. It shortens time to action and turns AI from pilots into performance.”
New features include:
- Streaming ingestion to Apache Iceberg from Apache Kafka, Amazon Kinesis, and Amazon S3, without consuming data warehouse compute
- Streaming transformations including cleansing, filtering, normalisation, flattening, and more as data arrives
- Automatic Iceberg optimisation for compaction and metadata maintenance to sustain performance at scale
- Comprehensive data quality and governance, including data lineage, catalogue services, and Qlik Trust Score applied to data in Iceberg tables and to mirrored datasets
- Integration with Snowflake Open Catalog in addition to AWS Glue
- Zero-copy mirroring to Databricks and Amazon Redshift, reflecting Iceberg data without duplicating it
- Enhanced Apache Spark compatibility for seamless access to up-to-date Iceberg tables
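To make the streaming transformations in the list above more concrete, here is a minimal, generic Python sketch of filtering, flattening, cleansing, and normalising events one at a time as they arrive. It does not use or represent Qlik's API. The function names (`flatten`, `transform`) and the field names (`event_id`, `user`, `country`) are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Iterable, Iterator


def flatten(record: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dicts into one level, e.g. {"a": {"b": 1}} -> {"a_b": 1}."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def transform(events: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Apply per-event transformations lazily, one event at a time."""
    for event in events:
        # Filtering: drop events that lack the (hypothetical) event_id key.
        if event.get("event_id") is None:
            continue
        # Flattening: turn nested objects into top-level columns.
        row = flatten(event)
        # Cleansing: trim stray whitespace from string values.
        row = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        # Normalisation: store country codes in upper case (illustrative rule).
        if isinstance(row.get("user_country"), str):
            row["user_country"] = row["user_country"].upper()
        yield row


if __name__ == "__main__":
    sample = [
        {"event_id": 1, "user": {"country": " za ", "name": "Ann "}},
        {"user": {"country": "us"}},  # no event_id, so it is filtered out
    ]
    for row in transform(sample):
        print(row)
```

Because `transform` is a generator, each event is processed as it is read rather than after a full batch has been collected, which is the general idea behind transforming data as it lands.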
Real-time work stalls when data is late or duplicated. Qlik Open Lakehouse now unifies real-time ingest, on-the-fly transforms, optimisation, and governance so data is usable on arrival. It connects to hundreds of sources and adds real-time pipelines without depending on warehouse compute for ingestion or streaming transformations.
Data is written once to Apache Iceberg in the customer’s account and is queryable by a range of engines, including Snowflake, Amazon Athena, Amazon SageMaker Studio, Apache Spark, Trino, Presto and more, with data quality, lineage, and Trust Score built in.
Open Lakehouse manages Apache Iceberg tables on Amazon S3 inside the customer’s environment. Streaming pipelines write events and apply transforms as data flows, with automatic compaction and metadata updates to sustain performance.
Governance in Qlik Talend Cloud enforces data quality and lineage and keeps the catalogue current. Zero-copy mirroring makes the same Iceberg datasets available in Snowflake, Databricks, and Amazon Redshift without extra copies.