Introduction to Apache Spark

FREE

Anthony D. Joseph 102 Big Data Edx

(1 customer review)

8.1/10 (Our Score)

Product is rated as #53 in category Big Data

Spark is rapidly becoming the compute engine of choice for big data. Spark programs are more concise and often run 10–100 times faster than Hadoop MapReduce jobs. As companies realize this, Spark developers are becoming increasingly valued.

This statistics and data analysis course will teach you the basics of working with Spark and will provide you with the necessary foundation for diving deeper into Spark. You’ll learn about Spark’s architecture and programming model, including commonly used APIs. After completing this course, you’ll be able to write and debug basic Spark applications. This course will also explain how to use Spark’s web user interface (UI), how to recognize common coding errors, and how to proactively prevent errors. The focus of this course will be Spark Core and Spark SQL.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises will use PySpark (the Python API for Spark), but previous experience with Spark or distributed computing is NOT required. Students should take the Python mini-quiz before the course and, if they need to learn Python or refresh their Python knowledge, the Python mini-course.

Instructor Details

Anthony D. Joseph, Professor in Electrical Engineering and Computer Science at the University of California, Berkeley

Courses: 2

Anthony D. Joseph is a Professor in Electrical Engineering and Computer Science at UC Berkeley. He received his B.S., S.M., and Ph.D. degrees in Computer Science from MIT. He joined the UC Berkeley faculty in 1998, where he is developing adaptive techniques for cloud computing, network and computer security, and security defenses for machine learning-based decision systems. He also co-leads the DETERLab testbed, a secure, scalable testbed for conducting cybersecurity research, and he is a Technical Advisor at Databricks.

Specification: Introduction to Apache Spark

Duration	15 hours
Year	2021
Level	Beginner
Certificate	Yes
Quizzes	Yes

1 review for Introduction to Apache Spark

2.0 out of 5

Caio Taniguchi – June 22, 2016
More of a paid tutorial than an actual course. It’s ok to say that the Spark approach is better than the alternatives, but it’s just too much.
Other than that, the actual contents of the lectures are ok (although shallow), but quizzes are terrible and utterly irrelevant. They are there to make sure you paid attention to the videos and nothing else.
Setting up the development environment is overly complicated and probably the most challenging aspect of the class, which is frustrating. I didn’t reach the point of getting hands-on with Spark, since the notebook labs are very verbose and have everything except code.
For now, I’ll stick to tutorials on the web and the documentation; they seem more promising than this course.