Mining Massive Datasets
The course is based on the text Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, and Jeff Ullman, who by coincidence are also the instructors for the course. The book is published by Cambridge Univ. Press, but by arrangement with the publisher, you can download a free copy here. The material in this online course closely matches the content of the Stanford course CS246. The major topics covered include: MapReduce systems and algorithms, locality-sensitive hashing, algorithms for data streams, PageRank and Web-link analysis, frequent itemset analysis, clustering, computational advertising, recommendation systems, social-network graphs, dimensionality reduction, and machine-learning algorithms.
5 reviews for Mining Massive Datasets
This course has interesting content but is somewhat lacking in pedagogy.
The course has a lot of good content, notably from J. Ullman, but the sessions are very long and the pedagogy is not optimal.
The course is a huge time investment, with dense content throughout its 7 weeks or so. If you can get past this, it will be very rewarding, but not everyone has that kind of time available.
The course would probably be better off cut into smaller chunks or offered as a self-paced course.
Also, the fact that the course doesn’t offer a verified certificate will make you think twice before investing so much time in it.
Excellent course by the authors, covering the content of the book of the same name http://www.amazon.com/gp/product/1107077230. It is the MOOC version of http://cs246.stanford.edu. Many useful topics in large-scale data processing algorithms are covered, including MapReduce, PageRank, network and graph analysis, and streaming algorithms, just to mention a few. The level is advanced undergrad or postgrad, with some chapters covering topics from research papers published within the last decade.
Pacing is faster than most other MOOCs (I estimate about 2x the workload of a typical MOOC). But the material is very useful and rewarding. Exercises are comprehensive and the forums are very useful for checking your understanding.
I found the lectures to be of medium difficulty for a postgrad student, and I would expect them to be rather hard for an undergrad.
The content is delivered at two paces: the lectures of Prof. Ullman are hard to follow, as he moves quickly through many of the course's concepts and does not use enough examples or explain them in enough detail. Jure, on the other hand, uses a lot of examples and is easy to follow even for an undergrad.
Overall, it is a time-consuming course; expect to need around 6-8 hours per week. In the end, you do learn quite a lot, and it is a good course to take. I am in favor of the instructors’ choice of offering it as it is taught at Stanford.
Something that could help the course is to split the content into 10 weeks instead of 7 and add mandatory programming exercises. They help a lot in learning the material and remembering it for a long time.
Aliaksandr Bely –
Very interesting course that covers a lot of topics. It is rather difficult and takes a lot of time (the lectures alone usually take around 3 hours/week, and it’s hard to watch them faster than 1.25x). The only disappointment for me was the lectures taught by Prof. Ullman: it was very hard to follow his monotonous reading. The other two lecturers have strong accents but were much more lively and understandable.
Kristina Šekrst –
I loved this course, and I’m recommending it to everyone. It’s hugely time-consuming; however, there are two tracks: the basic one and the advanced one. Since the basic one was tough for me, I’m looking forward to taking this course again and trying out the advanced one, since I had no time for it in the first run of this course. I’m glad to see it had more runs; it surely deserves them. All the instructors were great, and I loved the way they explained difficult concepts with various analogies and illustrations. The forum discussions were great, but the course lacks programming assignments for trying these approaches in practice, and it could perhaps be a bit longer. Great, great job!