This course is for data scientists (or aspiring data scientists) who want to get PRACTICAL training in PySpark (Python for Apache Spark) using REAL WORLD datasets and APPLICABLE coding knowledge that you’ll use every day as a data scientist! By enrolling in this course, you’ll gain access to over 100 lectures, hundreds of example problems and quizzes, and over 100,000 lines of code!
I’m going to provide the essentials you need to know to be an expert in PySpark by the end of this course, which I’ve designed based on my EXTENSIVE experience consulting as a data scientist for clients like the IRS, the US Department of Labor and United States Veterans Affairs.
I’ve structured the lectures and coding exercises for real world application, so you can understand how PySpark is actually used on the job. We are also going to dive into custom functions that I wrote MYSELF to get you up and running in the MLlib API fast and make building your first machine learning models a breeze! We will also touch on MLflow, which will help us manage and track our model training and evaluation process in a custom user interface, a skill that will make you even more competitive on the job market!
Specification: PySpark Essentials for Data Scientists (Big Data + Python)
60 reviews for PySpark Essentials for Data Scientists (Big Data + Python)
$94.99 $11.99
Ozzy –
I absolutely LOVE the custom functions you added in the bonus lectures! They were so helpful. And generally, I love your approach to programming. It’s so elegant and graceful. The concepts you cover are exactly what I needed on my job. My team just got started using PySpark and I was really struggling to get started with other courses. But your course gave me some super practical functions and code I could implement right away. THANK YOU for making this course. I can see you poured your heart and soul into it.
Arianne Draco –
This course was definitely a great match for what I needed to learn PySpark. The instructor is definitely very knowledgeable about the material and her teaching style is definitely engaging. The first time I haven’t fallen asleep in a technical lecture. Ha ha Hahaha. I also really like the coding approach. It is very elegant and easy to follow.
Kathryn Jones –
The instructor is very well spoken. You can tell she’s been doing data science stuff for a while AND that she’s a consultant. She seems to know the coding like the back of her hand and comes up with really creative ways to tackle problems. The pace could be a bit faster for my personal liking but to each their own. I just speed up the video.
Shayla Baker –
This course is top notch! I LOVE the examples the instructor gives and the way she talks is very engaging. Her coding methods are also super inventive and easy to follow.
Rahid Alfasy –
So far this course is AMAZING! The instructor’s voice is easy on the ears and explains very complicated subjects in an easy to digest fashion. She (yes SHE!) seems to really know her stuff and her coding is extremely elegant. I’m learning a ton!
Sarah Jones –
I like the way the instructor designed this course. The data frames section is very intuitive and seems to hit all the high points that a real data scientist would use. The bonus lectures are also super cool, especially the MLflow one. I had never seen that used before but will definitely use it at my job.
Hanah Granger –
This course is solid. The instructor knows her stuff in and out and has put together really awesome projects with cool datasets. Her approaches to solving the HW assignments and projects are also really creative and unique. My only gripe is that she doesn’t cover spark streaming.
Adam Walker –
The instructor is very easy to listen to. I also like the design of the course. It’s much more practical than other courses that just go through each algorithm individually and don’t show how to compare them. Actual data scientists usually compare several algorithms against each other and that is exactly how the instructor approaches the ML portion of the course. You can tell she (YES SHE! also great) has a lot of experience and designed the course very thoughtfully. Hand clap for this one! I hope she adds more lectures in the future about using MLflow… that part was awesome!
Tucker Hagan –
This course is definitely solid. Love the examples. And the project exercises are super down to earth.
Weber Allen –
Pretty solid course. The example and project are super practical as advertised. Code is creative. Datasets are engaging.
Anthony Garcia –
This course is REALLY well put together. The concept review lectures are really easy to understand and the coding is inventive and she explains it really well. I’m using PySpark on my job already and have learned A TON with this course that I am able to use for everyday use. The review notebooks have been SUPER useful because I can just search them for what I need. It’s like my own personal google for PySpark!
Zeynep Arslan –
This course is definitely top notch. The instructor gives A TON of examples and very good explanations.
Marcos Gravley –
This instructor definitely knows her stuff. She is super well spoken and seems to have put a lot of thought into the content of the course. It’s obvious that she has actually had a real job in this field because the examples are super practical which I haven’t seen in many other PySpark courses. The review notebooks she provides are also super useful. I’ve used the dataframes one several times already at my job, as well as the function that corrects for skewness.
Ozledim Demir –
This course definitely helped as I was first starting out with PySpark at my job. I was asked to pick up this skill since our data was getting too large to work with and I had to learn quickly as we had a new feature being released that required it. So I can say this course gave me resources to do that FAST since all the examples were so practical, especially the machine learning stuff.
Alicia Delgado –
Solid course design. I’ve been a Python programmer as a data scientist for about 6 years now and was looking to pick up some big data analytics skills, so I decided to try this course out among others. This one provides the most realistic examples and approaches to Machine Learning tasks. The bonus lectures are especially helpful as they cover some more advanced topics. I also appreciated the transitioning from Python to PySpark lecture… that was helpful for a Python veteran.
Chen Zhang –
The homework assignments are good. Challenging enough but not too hard or too easy. I like the projects too. They are super creative and the instructor’s solutions are helpful. I also really like the functions she gives and how she explains when and where to apply UDFs.
Tyron Williams –
Great examples. The projects are fun and the solutions the instructor provides are very creative. Overall it’s a very good course with tons of material that you can apply right away to your job. Super practical.
Shayla Brown –
Super detailed course. Great for getting up and running in PySpark pretty quickly. The instructor really does cover all the essentials it seems.
Riku Takahashi –
Solid course. Great examples. Entertaining projects. The instructor is engaging.
Eric Hanger –
The instructor is engaging and thoughtful with the presentation of the material. I liked the project exercises and the homework assignments. All in all worth the time.
Husain Albaya –
Really well put together. I like the structure of the course. Much more realistic than other courses as far as application on the job goes. You can tell this lady has a lot of experience.
Vihaan Agarwal –
Awesome course. The homework assignments and lecture were engaging and the projects were fun yet challenging.
Josh Handam –
Definitely a top notch course. Very well put together.
Valdez Akharam –
This course is better than most. The homework assignments and projects are way more realistic to a real job setting and the slides in the lectures have graphics instead of just text which makes it way easier to pay attention. This is the first PySpark course that I haven’t fallen asleep to 🙂
Emily Hargrave –
I had a good experience with this course. The instructor was responsive to my questions and covered all the essentials that I need for my job. I copied and pasted a lot of the code from this into my own scripts at work, which saved me a lot of time.
George Shepherd –
The concept review lectures are solid and the coding is easy to follow and covers all the essentials. Most instructors I’ve found don’t get into details like skewness or outlier transformations but this one does. I found that very helpful.
Eric Makita –
Straight to the point, I am so glad to take that training
Franco Lema –
Excellent course
Muhammad Khan –
This course was very well put together. I liked the concept review lectures and the homework assignments. The ML portion was solid and I LOVED the MLflow stuff! Highly recommend this course.
Reyansh Laghari –
This course is pretty well designed. The structure is different than most because it’s more practical. For example the ML section is divided up into supervised and unsupervised ML applications, which makes it easier to think about which algorithms to consider for your project. I really like that aspect.
Aarav Khatri –
I really enjoyed this course. The instructor is knowledgeable and designed the course well. The coding was easy to follow and the homework assignments and project were engaging.
Elizabeth Regaldo –
Great course. Good projects and homework assignments. Solid design.
Liam Hurley –
This may be the best course I’ve ever taken on Udemy… well, ever. It is very well constructed and you can tell the instructor put a lot of thought into the design of the course. I’m a teacher myself so I know good when I see it. This course is QUALITY.
Wang Chen –
This course is designed well. I like the structure of the course. It is easy to follow and the teacher speaks clearly. The coding part is also very easy to follow and covers many advanced topics.
Kamil Mysiak –
The installing of pyspark was lacking. Needed to stackoverflow the crap out of my error before I had it installed.
Amina Alowatobi –
This course was exactly what I needed! Not overly simplified or unnecessarily complex. The projects were fun and super realistic to a job setting. The homework assignments were also just challenging enough.
Reginald Albaso –
Great course! Well designed and thoughtful. I like the HW assignments and projects too.
Frank Jamaza –
Great course! Installation was tough, but once I got past those error messages the rest of the course was very easy to follow and the example projects were lots of fun!
Glenda Andersson –
This course was very well done. Def exceeded my expectations. Not like other courses with boring examples and basic coding. The coding in the course is super realistic to a real job setting.
Alexander Kalabalikmi –
Highly recommend this course!
Amir Habiba –
My experience was top notch! I like the layout of the course and the instructor put in some good challenges along the way. I highly recommend this course to anyone thinking to take it.
Avitab Ayan Sarmah –
I want to add a skill to my CV.
Yug Reddy –
I liked the lay out of this course. And the fact that the instructor is a SHE is great too! We need more women in data science!!!!!
Chang Huang –
Great course! Loved the instructor’s approach to teaching. She is engaging and thoughtful in her responses to questions.
Otávio Augusto Mota Guerra –
For those looking for a complete course in PySpark, this is the course! Covers the most fundamental parts of the API. I highly recommend it for beginners.
Jianping Deng –
The quality of the course is good, but teacher can you improve the quality of your voice input, too much noise, respect!
Edison –
Solid course. The projects are fun and the lectures are super fun to listen to. You def won’t get bored.
Lijo –
Very nice content. But for a beginner, it was a little bit hard to learn the MLlib part. Anyway, great course.
Claudio Roberto Díaz Mandujano –
So far I feel that it was a good choice.
Suyog Shimpi –
Great explanation and content to practice more.
Amod Krishna –
Great start before diving deep into the technology! Keep it up…
Sharath kumar –
Pros: Best PySpark course on Udemy. Covered most of the advanced topics. Cons: The weird bomb-blast sounds before each video starts are very irritating.
Ritesh Tripathi –
This is just a brilliant course. I have skipped the machine learning part for the time being and am concentrating more on the data manipulation part. Thanks to the UDEMY algo for suggesting this course.
Chao John –
Great course, it not only covers useful concepts and functionalities, but also uses a wide range of real life datasets with different issues to handle instead of simple clean toy datasets.