
Distributed Rules Engine & Exactly-once processing with Apache Kafka!

September 20, 2017 6:00 PM

Introducing a business Rules Engine for Big Data processing

Nitin Motgi, Cask

Business rules are statements that describe the business policies or procedures used to process data. Rules engines, or inference engines, execute business rules in a runtime production environment and have become commonplace in many IT applications. The world of big data is the exception: it has lacked a horizontally scalable, lightweight, inference-based business rules engine.

In this session, you will learn about Cask’s new business Rules Engine. Built on top of CDAP, it is a sophisticated if-then-else statement interpreter that runs natively on big data systems such as Spark, Hadoop, Amazon EMR, Azure HDInsight and GCE. It provides an alternative computational model for transforming your data while empowering business users to specify and manage transformations and policy enforcement.

In his talk, Nitin Motgi, Cask co-founder and CTO, demonstrates this new distributed rules engine and explains how business users in big data environments can make decisions on their data, enforce policies, and be an integral part of the data ingestion and ETL process. He also shows how business users can write, manage, deploy, execute and monitor business data transformations and policy enforcement.
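To make the idea of an if-then-else rule interpreter concrete, here is a minimal sketch in Python. It is not Cask's rule syntax or the CDAP API; the `Rule` class, the `apply_rules` function, and the two sample rules are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    """A hypothetical rule: a condition ("if") plus an action ("then")."""
    name: str
    when: Callable[[Record], bool]
    then: Callable[[Record], Record]


def apply_rules(rules: List[Rule], record: Record) -> Record:
    """Fire every rule whose condition holds, in order, and return a new record."""
    out = dict(record)  # copy so the caller's record is left untouched
    for rule in rules:
        if rule.when(out):
            out = rule.then(out)
    return out


# Illustrative rules only: one policy check and one data transformation.
RULES = [
    Rule("flag-large-order",
         when=lambda r: r.get("amount", 0) > 10_000,
         then=lambda r: {**r, "review": True}),
    Rule("normalize-country",
         when=lambda r: r.get("country") == "USA",
         then=lambda r: {**r, "country": "US"}),
]

if __name__ == "__main__":
    print(apply_rules(RULES, {"amount": 25_000, "country": "USA"}))
```

Keeping conditions and actions as data rather than hard-coded branches is what lets non-developers add or change rules without touching the pipeline code. A distributed engine would apply the same rule set to each record in parallel.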

Building Stream Processing Applications with Apache Kafka

Matthias Sax, Confluent

Kafka 0.11 added a new feature called “exactly-once guarantees”. In this talk, we will explain what “exactly-once” means in the context of Kafka and data stream processing, and how it affects application development. The talk will go into some detail about the two building blocks of exactly-once, the new idempotent producer and transactions, and how both can be used to simplify application code. For example, you no longer need complex deduplication code in your input path, because you can rely on Kafka to deduplicate messages when the data is produced by an upstream application. Transactions can be used to write multiple messages to different topics and/or partitions and commit all writes atomically (or abort all writes, so that none will be read by a downstream consumer in read-committed mode). Transactions thus allow for applications with strong consistency guarantees, such as those in the financial sector (e.g., to transfer money, either send both a withdrawal and a deposit message or neither). Finally, we will talk about Kafka’s Streams API, which makes exactly-once stream processing as simple as it can get.
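The withdrawal-and-deposit example can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the talk: the topic names `withdrawals` and `deposits`, the `transfer` function, and the `InMemoryProducer` stand-in are hypothetical. The method names `begin_transaction`, `produce`, `commit_transaction`, and `abort_transaction` mirror those of the confluent-kafka Python client. With a real client, the producer would be configured with a `transactional.id` and `init_transactions()` would be called once before the first transaction.

```python
class InMemoryProducer:
    """Stand-in that mimics the transactional method names; not a Kafka client."""

    def __init__(self):
        self.committed = []   # what a read-committed consumer would see
        self._pending = None  # writes of the open transaction, if any

    def begin_transaction(self):
        self._pending = []

    def produce(self, topic, key=None, value=None):
        self._pending.append((topic, key, value))

    def commit_transaction(self):
        self.committed.extend(self._pending)
        self._pending = None

    def abort_transaction(self):
        self._pending = None  # discard everything written in this transaction


def transfer(producer, src, dst, amount):
    """Write a withdrawal and a deposit atomically: both become visible, or neither."""
    producer.begin_transaction()
    try:
        producer.produce("withdrawals", key=src, value=str(amount))
        producer.produce("deposits", key=dst, value=str(amount))
        producer.commit_transaction()
    except Exception:
        # Simplified error handling: abort so that no partial transfer is exposed.
        producer.abort_transaction()
        raise


if __name__ == "__main__":
    p = InMemoryProducer()
    transfer(p, "alice", "bob", 100)
    print(p.committed)
```

A production application would distinguish retriable from fatal errors rather than aborting on every exception. The overall shape (begin, write to several topics, then commit or abort) stays the same.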