CDAP 5 - Elevating to the Cloud
Terence Yim Google
CDAP is an open source data application platform for building and operating data analytics applications. In its latest 5.0 release, CDAP added support for running your application anywhere, on premises as well as in the cloud. With this new capability, you are free to leverage the full computational power of the cloud to run your application anytime, anywhere. In this talk, we will walk you through the design and showcase the flexibility this new feature provides, with real-world use cases.
Terence Yim is a software engineer with Google Cloud, working on the open source Big Data Platform CDAP (https://cdap.io) and responsible for designing and building real-time processing systems on Hadoop. Prior to Google, Terence worked at both LinkedIn and Yahoo!, building high-performance, large-scale distributed systems.
Apache Kylin - Extreme OLAP engine for big data
Daniel Gu Kyligence
Apache Kylin is an extreme distributed OLAP engine for big data, which provides sub-second latency for analytics queries and web-scale high concurrency. In recent months, Apache Kylin released v2.4.0, which brings many notable features, such as a JDBC data source, the cube planner, and enhanced streaming ingestion. We have also seen many excellent solutions from the community, such as Superset integration. In this session, Daniel Gu will outline the design ideas and technical basics of these features and solutions, and present the benefits with real use cases.
Daniel Gu, Kyligence’s Senior Director of Solutions, focuses on big data technology and the development of Apache Kylin’s ecosystem. He has several years of experience in data platform design and implementation at large companies such as eBay and PayPal, as well as at startups such as Machine Zone.
When Rotten Tomatoes isn’t enough: Analyzing Twitter Movie Reviews using DataStax Enterprise
Amanda Moran DataStax
Getting real-time insights is essential in this fast-paced world – like finding a good movie to catch this weekend. In this talk, we’ll use sentiment analysis on Twitter data about the latest movie titles to answer that age-old question: “Is that movie any good?” We’ll show how we built the solution using Apache Cassandra, Apache Spark and DataStax Enterprise Analytics. This is a great talk to attend if you are new to the big data space, want to learn more about Cassandra and Spark, or just want to see a demo of DataStax’s latest product.
Amanda Moran is a Developer Advocate for DataStax. Her passion is bridging the gap between customers and engineering! Amanda graduated from Santa Clara University in 2012 with a Master’s in Computer Science, and she also holds a Bachelor of Science in Biology from the University of Washington. She is based in the Bay Area and has worked for HP, Lockheed Martin, Teradata, and Esgyn, an Apache Trafodion startup. Amanda is an Apache Committer and a member of the PMC for Apache Trafodion. She has worked on customer POCs, executive demos, AWS deployments, Python coding, data science workshops, conferences, Linux/Hadoop administration, and scripting: a little bit of everything! In her spare time, she loves running, hanging out with her dog, and finding reasons to go to Disneyland.