Introduction to Apache Beam (FKA Google's Dataflow)
Jean-Baptiste Onofré Talend
Apache Beam (formerly the Google Cloud Dataflow SDK) is a unified model and a set of language-specific SDKs for defining and executing data processing workflows. Pipelines designed with Beam simplify the mechanics of large-scale batch and streaming data processing, and they can run on a number of runtimes such as Apache Flink, Apache Spark, and Google Cloud Dataflow (a cloud service). This talk introduces the Beam programming model and shows how to use it to design pipelines that carry data in PCollections and process it by applying PTransforms. You will see how the same code is “translated” for a target runtime by a dedicated runner. You will also get an overview of the current roadmap and its interesting new features.
Harnessing the power of unstructured data using Haven OnDemand
Phong Vu HPE
HPE Haven OnDemand is a platform for building data-rich applications and analytics using text analysis, speech recognition, image analysis, indexing and search APIs. Simply put, developers and businesses use the Haven OnDemand APIs to add advanced capabilities such as natural language processing, machine learning, and predictive analytics to their applications.
This talk will focus on the Haven OnDemand platform’s capabilities for human information analytics and for building advanced unstructured text indexes.
Introduction to EsgynDB, based on Apache Trafodion
Rao Kakarlamudi Esgyn
This talk introduces EsgynDB, the Big Data database based on Apache Trafodion that revolutionizes the way you manage Big Data and Hadoop. With EsgynDB, you can run your transactional and enterprise operational reporting workloads on Hadoop and avoid being locked into expensive, proprietary database vendors.
By consolidating your workloads onto the same platform, you can derive business insight faster and cheaper than ever before. You can adopt EsgynDB to enable a Big Data strategy that simplifies and modernizes your operational data management, as illustrated by the following use cases from early adopters:
- Gain real-time views and analytics on security data collected from IT infrastructure, firewalls, and web traffic worldwide
- Monitor a transit fleet to optimize and maintain efficiency in real time, and perform historical reporting for future planning
- Offload historic data from expensive transactional systems to lower costs and differentiate customer experience by enriching transactional data with other data sources
- Transform traditional back office services to deliver service capabilities over the Internet