Next Meetup

November 21st, 2019 6:00pm Google Office, 1170 Bordeaux Dr · Sunnyvale, CA

The Big Data Application Meetup is for the community focusing on Big Data technologies that solve real world problems. Reach out to the organizers if you are interested in speaking at or hosting the next meetup.

  • Introduction to Dataplex, an intelligent data fabric to power analytics at scale

    Prajakta Damle Google

  • How Major League Baseball is using Data Fusion to Modernize Data Movement

    Charles Nixon Major League Baseball

  • General introduction and background on Data Lake Management

    Chris Crosbie Google

  • Modernizing Apache Hive Metastore for the next decade

    Feng Lu Google

  • How Delta Lake Addresses Data Lake Challenges

    Ajay Singh Databricks

  • CDAP: Where Do We Go From Here?

    Chai Pydimukkala Google

  • Featured Speaker: LiveRamp

    Sagar Batchu LiveRamp

  • Fine grained root cause and impact analysis with CDAP Lineage

    Yuki Jung Google

  • Accelerating workloads and bursting remote data with Google Dataproc using Alluxio

    Dipti B. & Roderick Y. Alluxio

  • Collecting IoT device Data and Triggering Alerts in Realtime using CDAP

    Vinisha Shah Google

  • Kubernetes Operator

    Kevin Xu PingCAP

  • Augmented OLAP for Big Data

    Daniel Gu Kyligence

  • Maintaining full data lineage and governance across billions of data partitions

    Steven Parkes

  • Moving to the Cloud: Data Migration and Change Data Capture (CDC) with CDAP

    Tony Hajdari Google

  • The Linux Foundation ONAP project enables 5G & Edge Computing using CDAP

    Amar Kapadia Aarna Networks

  • Centralized Metadata for Multi Cloud Data Pipelines

    Rohit Sinha Google

  • Build Apps Without Pipelines: Shortest Path from Complex Data to Live Apps

    Anirudh Ramanathan Rocket

  • Modernizing Big Data Infrastructure with Docker and Kubernetes

    Ravikumar A. & Tushar D. Robin Systems

  • CDAP 5 - Elevating to the Cloud

    Terence Yim Google

  • Apache Kylin - Extreme OLAP engine for big data

    Daniel Gu Kyligence

  • When Rotten Tomatoes isn’t enough: Analyzing Twitter Movie Reviews using DataStax Enterprise

    Amanda Moran DataStax

  • Machine Learning for the Masses

    Albert Shau Cask

  • How TiDB helps Scale World's Largest Bikesharing Platform

    Kevin Xu PingCAP

  • Building a Self-Service Data Lake on Google Cloud Platform

    Ali Anwar Cask

  • Apache Spark and Apache Ignite: Where Fast Data Meets the IoT

    Denis Magda GridGain

  • Scalable Clusters on Demand

    Bogdan K. & Gustavo T. Opendoor

  • Foundations for securing Big Data Applications

    Yaojie Feng Cask

  • Multi-tenant and Geo-replication Messaging with Apache Pulsar

    Matteo M. & Sijie G. Streamlio

  • Introducing a business Rules Engine for Big Data processing

    Nitin Motgi Cask

  • Building Stream Processing Applications with Apache Kafka

    Matthias Sax Confluent

  • Self-Service Data Integration using Apache Spark

    Edwin Elia Cask

  • Making Big Data Go Faster

    Morgan Littlewood Kodiak Data

  • Data Pipelines in Kubernetes

    Sean Suchter Pepperdata

  • EDW Optimization with Hadoop and CDAP

    Sagar Kapare Cask

  • Future proof, portable batch and streaming pipelines using Apache Beam

    Malo Denielou Google

  • Turning a data pond into a data lake with Apache NiFi

    Gene Peters Telligent Data

  • Building an ECA Rules Engine for IoT with CDAP

    Bhooshan Mogal Cask

  • Demonstrating the Benefits of Hyper-Acceleration for both Batch and Streaming Spark Processing

    Roop Ganguly BigStream

  • Improving Application and Cluster Performance for Big Data Stacks

    Kunal Agarwal Unravel Data

  • Cask Market - Big Data's App Store

    Albert Shau Cask

  • Scaling tribal knowledge at Airbnb

    Chris J. & John . Airbnb

  • Agile Data Science: Full-Stack Analytics App Dev OR Building an Aviation Data Explorer

    Russell Jurney

  • Peeling the onion: How data abstraction helps building big data applications

    Andreas Neumann Cask

  • Big Data and Analytics in the Cloud

    Ryan Lippert Cloudera

  • Designing Modern Data Pipelines with Apache Kafka

    Gwen Shapira Confluent

  • Who Moved my Data? - Why tracking changes and sources of data is critical to your data lake success

    Russ Savage Cask

  • One size doesn’t fit all: making a case for Federated Data Science using Ampool

    Nitin R. & Suhas . Ampool

  • Analyze Ad impressions at speed of thought using Spark 2.0 and SnappyData

    Jags Ramnarayan SnappyData

  • Building Large Scale Applications on Apache Hadoop YARN with Apache Twill

    Poorna Chandra Cask

  • Introduction to large-scale Machine Learning with Apache Flink

    Theodore Vasiloudis SICS

  • Ambry: LinkedIn's Scalable Geo-distributed Object Store

    Sivabalan Narayanan LinkedIn

  • Building Data pipelines with Cask Hydrator

    Gokul Gunasekaran Cask

  • PXF: A Unified access framework for distributed data systems on HDFS

    Shivram Mani Pivotal

  • Practical TensorFlow

    Illia Polosukhin Google

  • Introducing Pachyderm

    Joe Doliner Pachyderm

  • Leveraging Big Data at TubeMogul to convert Events --> Insights --> Actions

    Murtaza D. & John T. TubeMogul

  • Introduction to Apache Beam (FKA Google's Dataflow)

    Jean-Baptiste Onofré Talend

  • Harnessing the power of unstructured data using Haven OnDemand

    Phong Vu HPE

  • Introduction to EsgynDB, based on Apache Trafodion

    Rao Kakarlamudi Esgyn

  • Simplifying big data analytics with Apache Kudu

    Mike Percy Cloudera

  • Apache Phoenix: OLTP in Hadoop

    James Taylor

  • SQL-on-Everything with Apache Drill

    Julien Le Dem Dremio

  • Logging infrastructure for Microservices using StreamSets Data Collector

    Virag Kothari StreamSets

  • Introducing Apache Apex (Incubating)

    Thomas Weise DataTorrent

  • When-To-Post on Social Networks

    Zhisheng L. & Prantik B. Lithium

  • Fluentd and Docker Integration

    John Hammink Treasure Data

  • Turbocharging CDAP Applications With Ampool

    Milind Bhandarkar Ampool

  • Tips for Building a Data Science Platform

    David Chaiken Altiscale

  • Realtime Cube Updates with Kylin/Kafka Integration

    Seshu Adunuthula eBay

  • High Volume Streaming Analytics with CDAP

    Jia-long Wu Lotame

  • Introducing Athena stream processing platform

    Yuanchi Ning Uber

  • Apache Kafka: Leveraging Real-time Data at Scale

    Neha Narkhede Confluent

  • NRT Event Processing with Guaranteed Delivery of HTTP

    Poorna Chandra Cask

  • Kite: Helping Hadoop Projects Work Together

    Ryan Blue Cloudera