Getting and Cleaning Data

FREE

8.7/10 (Our Score)
Product is rated as #86 in category Data Science

Before you can work with data, you have to get some. This course covers the basic ways that data can be obtained: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speeds up downstream data analysis tasks. The course also covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data, and the basics needed for collecting, cleaning, and sharing data. The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for lifelong learning, to foster independent and original research, and to bring the benefits of discovery to the world.
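
As a rough illustration of what “tidy” means in the description above, here is a minimal sketch in plain Python. It is not taken from the course, which reviewers say uses R. The table, the column names, and the `to_tidy` helper are all hypothetical. The sketch reshapes a "wide" table (one column per visit) so that each row holds exactly one observation.

```python
# Hypothetical wide-format data: one row per patient, one column per visit.
wide_rows = [
    {"patient": "A", "visit_1": 5.1, "visit_2": 4.8},
    {"patient": "B", "visit_1": 6.0, "visit_2": 5.7},
]


def to_tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows so that each output row is a single observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            tidy.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return tidy


# Illustrative names only: "visit" and "score" are assumptions for this example.
tidy_rows = to_tidy(wide_rows, "patient", "visit", "score")
for tidy_row in tidy_rows:
    print(tidy_row)
```

In the reshaped form every variable (patient, visit, score) has its own column, which is what makes later filtering, grouping, and summarizing straightforward.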

Instructor Details

Jeff Leek is an Assistant Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health and co-editor of the Simply Statistics Blog. He received his Ph.D. in Biostatistics from the University of Washington and is recognized for his contributions to genomic data analysis and statistical methods for personalized medicine. His data analyses have helped us understand the molecular mechanisms behind brain development, stem cell self-renewal, and the immune response to major blunt force trauma. His work has appeared in the top scientific and medical journals Nature, Proceedings of the National Academy of Sciences, Genome Biology, and PLoS Medicine. He created Data Analysis as a component of the year-long statistical methods core sequence for Biostatistics students at Johns Hopkins. The course has won a teaching excellence award, voted on by the students at Johns Hopkins, every year Dr. Leek has taught the course.

Specification: Getting and Cleaning Data

Duration: 15 hours
Year: 2015
Certificate: Yes
Quizzes: Yes

53 reviews for Getting and Cleaning Data

3.8 out of 5
31
8
6
5
3
  1. Gustavo d P P

    Very good! I learned to use dplyr and now my life has become much easier!!!

  2. Pablo d A S

    Very well thought out and more advanced than previous one.

  3. Johnnery A

    The instructors should explain the topics in more detail. I think there is a significant gap between the course topics and the projects.

  4. Luu T S

    Great !! Thank you

  5. Heidi P

    Challenging!

  6. Gustavo D S F

    excellent!

  7. Ratanaporn

    Good job

  8. Nishant R S

    Very deeply explained and very accurately organised.

  9. Pitak P

    Good

  10. Eduardo S B

    In my opinion the structure of the course is not the best. I mainly dislike the fact that some libraries, packages, etc. (e.g. MySQL) are not trivial to install. Still I learnt quite a lot, so I wouldn’t say it’s bad.

  11. Thaer Z

    I am done with this course. Every week is the same thing. The lectures are a long list of references to other references. The quiz questions cannot be answered without spending hours troubleshooting RStudio or searching the forum for help and hints to find out why the loaded packages or functions are not found. The quiz recommends loading packages that don’t work or have dependencies that are no longer valid. I wanted to take this specialization to learn new data analysis techniques. If I wanted to spend my time searching the internet for answers, I could do that without paying monthly fees. Good luck, everyone. I am done. I will try a different course or field of interest.

  12. Daniel P

    The course is good. I like the videos and the assignment. There is a certain redundancy of information. Much of the “new” information was already elaborated in the previous courses of the same specialization. Additionally, the grading system is based on other students whose knowledge may not go beyond the course scope, and submitting an innovative solution can mean not passing the course.

  13. Aki T

    This course was excellent and fundamental in order to even start a data analysis. It sets the foundation for how to read and treat the data, which is as the instructor mentioned, often overlooked. Thank you very much for taking the time to break the cleaning process into each comprehensive pieces.

  14. Amit S

    Very good content

  15. Edward A S M

    More examples of code running in parallel with the course work will be helpful.

  16. Jose C

    Great course!

  17. Graciano P

    Great course for getting and cleaning data using R.

  18. Fabien N

    The course was good but a bit too versatile for such hard quizzes.

  19. Vivek G

    Challenging course for working professionals from time perspective but very educative and useful. I learned a lot.

  20. Christian B

    No idea what they want for the project, and the discussion forum is clogged with people asking for peer reviews. The previous courses at least provided you with an understanding of what the final product should be. In this case it’s “make tidy data”, but with no idea of how that data should look.

  21. Sean S

    Challenging but rewarding! Great community.

  22. Patrick B

    Challenging assignment. Interesting material

  23. Dev P

    Good introduction to getting and cleaning data and very useful learning about the principles of tidy data. Jeff Leek isn’t as good a tutor as Roger Peng and it was a bit frustrating following along at times as no hyperlinks are available for the data. The lessons are just recycled content from Jeff’s lectures. The course project was a good challenge!

  24. Erin A

    This is my third course completed in the Data Science Specialization offered by Johns Hopkins. In all three, I feel the lectures, quizzes, and swirl exercises are easily accessible, and then the final project makes me feel like I am seeing R for the first time. One review of the course made a brilliant suggestion: go through the videos as quickly as you can, and then look at what will be asked of you in the final project. Then, go back through the videos and quizzes with a different set of eyes. I feel like there is just so much to learn with R that sometimes you need a lens to help you focus on a subset of things that you absolutely will need, while getting a “taste” for all that R has to offer. Overall, I am enjoying the courses, but the final projects are indeed a different kind of challenge.

  25. Victor A d S P

    Great course, some background needed. If you are taking the Data Science specialization program, then this is a great catch.

  26. Kunal P

    This was one of the best classes. I recommend more side reading material on data. SWIRL has a reading link, but the link is not provided anywhere else on the board. Also, it would be beneficial if the links could be made clickable in the lecture slides. Thanks.

  27. Abdullah M

    Great content. Would have been better with more resources in addition to videos.

  28. rahul g

    A very interesting and unique course, for the kind of things it helps to learn are often ignored. It brings the breadth of the theme forward.

  29. James L J J

    The value of the course is in the course projects. I found that working through them really accelerated my understanding.

  30. Ricardo L

    The most important information I learned was the tidy data standard. Very useful and clear. It will make the analysis process easier.

  31. Mathew K

    Pros: I learned a ton about cleaning data, the challenges involved, and how to tackle new problems. The quizzes and projects throw you into the deep end, asking you to import some data set and report some features of it, and you often need to figure out what package to use and how to work with it on your own. Cons: The videos in this course are basically useless. You get superficial coverage of how to use some package without much explanation of what each part does, and basically all of the examples are broken, because the data have been updated or the site has changed or no longer exists. The instructors very annoyingly bat away any responsibility in the forum by saying it would be too expensive to fix anything. Too expensive? This isn’t a Michael Bay movie, this is a guy talking over a PowerPoint.

  32. Udom A

    Good

  33. Manav S

    good

  34. Prasad R B

    very nice

  35. Bill J

    In weeks two and three, the course presents a list of data formats and how to read them into R. I would have preferred a better description of why tidy data sets are considered tidy, one that included some side-by-side comparisons and the downstream effects of untidy data. This would help me weigh the effort of tidying the data, and the risk of introducing errors while doing so, against the benefit.

  36. Sujeet S

    Too tough

  37. Jehan H

    The course has useful information.

  38. Amanyiraho R

    Best data cleaning technique.

  39. Zhou c

    Well-organized course!

  40. Sreemoyee M

    The challenging quizzes and assignments are a main reason why this course is so great! This course truly ensures that you truly understand and implement the videos.

  41. Adam M

    The information in the lectures is very stale, which makes it extremely frustrating to learn from.

  42. Nachiketas N

    Very useful and hands-on.

  43. Willie C

    Not a great course. The lecture videos were dull and not very informative, and did not do a good job of preparing you for the quizzes at the end of each week. The lecture videos mentioned and linked to a number of external resources, but you couldn’t click on the links through the videos, so that wasn’t useful. The forums were much more helpful than the lecture videos when it came to teaching you what you needed to know. I understand why a course like this is essential to the Data Science specialization, but I feel like this content could’ve been covered in a much more engaging and instructive manner.

  44. Mohamed D

    Very easy to follow.

  45. uttam K

    Good course

  46. Liam C

    Weeks 1 and 2 are completely worthless. They’re cursory 5-10 minute introductions to topics that show you HOW to start to do something, but don’t explain any commands or what is going on; it’s just instructions to follow. This leaves you completely unprepared to do any actual work. Then you get the assignments and you basically have to go learn everything independently. The course info is useless. I skipped these. When I want to do the type of work they cover, I’ll watch some tutorials and read documentation to actually learn it. They need to focus on one or two topics (e.g. APIs, MySQL) and actually teach you the basics of them. The lecture videos even use weird syntax without explanation (e.g. using instead of <-. Using par(), etc.). Like the other courses in this specialization, you'll spend almost all of your time learning independently, and not using any of the materials provided. The discussion board is sometimes useful, but you can see how little work is done to improve the course there, as people point out errors and issues which are still outstanding months or years later.

  47. Whitchurch S R M

    This was an awesome course. I really liked the final project, especially creating a codebook as well as tidying up the data. I feel I went too much in-depth into creating the codebook as well as the README file, but in hindsight it was totally worth it. My advice to future learners: push yourselves to the limit when doing the final project. You will definitely learn much, much more by putting 110% into these hard projects.

  48. Yanying J

    It’s useful and practical! In particular, the swirl exercises offered not only the method, but also a clear example of data cleaning.

  49. Neha P

    Good experience with excellent knowledge

  50. Viktor K

    There is a big gap between class material and practical exercises.

  51. Miguel C

    I really enjoyed this course and learned a lot. I feel a lot more comfortable with looking for and reading data. I learned how to clean data and get it ready for further analysis. I think the course project was particularly good for completely understanding the process of tidying data and all the aspects it involves, such as writing a code book and a README file to accompany it. Furthermore, I believe I further developed my R programming skills, by learning how to code new things, or things I already knew but in a more efficient way, using new packages and techniques. Moreover, I found Professor Jeffrey Leek quite engaging and very easy to understand, and I had complete confidence in his knowledge of this subject. However, I believe the course is slightly outdated. I was often disheartened and frustrated by not being able to replicate what was being done in the lecture videos. For example, there were many links that did not work anymore, and sometimes information that simply wasn’t correct anymore. I found the discussion forums and many mentors’ responses to be very helpful. I think this can easily be fixed by writing up an errata or updating the lecture videos.

  52. Apurva G

    It’s extremely difficult to install the packages. Most of the time the videos are not clear about which packages to install. There should be a pre-read with links and instructions on which packages are needed to work on this course. Extremely frustrating, considering that a majority of the time is wasted just trying to figure out how to install packages. If you are serious about the success of course takers, you have to make it easier to understand, and the instructions have to be clear.

  53. Rose G

    Everything was new for me in this course, so I loved learning so much

