Get started with the amazing Apache Spark parallel computing framework – this course is designed especially for Java Developers.
If you’re new to Data Science and want to find out how massive datasets are processed in parallel, then the Java API for Spark is a great way to get started, fast.
All of the fundamentals you need to understand the main operations you can perform in Spark Core, SparkSQL and DataFrames are covered in detail, with easy-to-follow examples. You’ll be able to follow along with all of the examples, and run them on your own local development computer.
Included with the course is a module covering SparkML, an exciting addition to Spark that allows you to apply Machine Learning models to your Big Data! No mathematical experience is necessary!
And finally, there’s a full 3-hour module covering Spark Streaming, where you will get hands-on experience of integrating Spark with Apache Kafka to handle real-time big data streams. We use both the DStream and the Structured Streaming APIs.
Optionally, if you have an AWS account, you’ll see how to deploy your work to a live EMR (Elastic MapReduce) hardware cluster. If you’re not familiar with AWS you can skip this video, but it’s still worthwhile to watch it without following along with the coding.
Specification: Apache Spark for Java Developers