At the end of the course, you will be able to:

* Retrieve data from example database and big data management systems
* Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications
* Identify when a big data problem needs data integration
* Execute simple big data integration and processing on Hadoop and Spark platforms

This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.

Hardware Requirements: (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free.

How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high speed internet connection because you will be downloading files up …
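The hardware requirements listed above can also be checked with a short script. The following is a minimal sketch and not part of the course materials: the function names `total_ram_bytes` and `check_hardware` are hypothetical, the thresholds are copied from the requirements above, RAM detection relies on `os.sysconf` and therefore reports `None` on Windows, and the 64-bit test only reflects the Python interpreter build. It does not check VT-x or AMD-V support.

```python
import os
import shutil
import sys

# Decimal gigabyte; an "8 GB" machine usually reports slightly less than 8 GiB.
GB = 10 ** 9


def total_ram_bytes():
    """Return physical RAM in bytes, or None if it cannot be determined."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on Windows; the names may be unknown elsewhere.
        return None


def check_hardware(path=".", min_cores=4, min_ram_gb=8, min_disk_gb=20):
    """Compare this machine against the listed minimums (hypothetical helper).

    Each value is True or False; "ram" is None when RAM cannot be detected.
    """
    cores = os.cpu_count() or 0          # logical cores, not physical ones
    ram = total_ram_bytes()
    free = shutil.disk_usage(path).free  # free space on the drive holding `path`
    return {
        "cores": cores >= min_cores,
        "ram": None if ram is None else ram >= min_ram_gb * GB,
        "disk": free >= min_disk_gb * GB,
        "64bit": sys.maxsize > 2 ** 32,  # interpreter build, not the CPU itself
    }


if __name__ == "__main__":
    for name, ok in check_hardware().items():
        print(f"{name}: {'unknown' if ok is None else 'ok' if ok else 'below minimum'}")
```

Run it from the drive where the virtual machine will be stored, since free disk space is measured for the current directory by default.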
Instructor Details
Courses : 6
Specification: Big Data Integration and Processing
57 reviews for Big Data Integration and Processing
Price | Free |
---|---|
Provider | |
Duration | 19 hours |
Year | 2016 |
Level | Beginner |
Language | English |
Certificate | Yes |
Quizzes | No |
Rozina S –
It would be really helpful if there were a full-time teaching assistant whom we could contact directly with queries, since questions on the forum often go unanswered.
dstart –
Very good introductory course. It makes you want to continue to learn about Spark and MongoDB.
Ranjan K G –
Good course to start learning the basics of MongoDB and Spark.
KAY A –
It is very difficult when the environments don’t work. This course has been very difficult to navigate
Andres H –
That’s an excellent course. I’ve learned a lot, not just about the platform basics but also about how to perform basic operations in Mongo. The tests and the practical exercises were also well planned to ensure that you know what you are doing.
Swapnil D –
The material is presented well.
Guilherme D C T –
The final project is a bit tough but worth it. If you manage to finish it you’ll have a new understanding of Spark RDDs and DataFrames.
Joaquim P –
I think that this course doesn’t provide substantial value to the student. It’s basically a series of theoretical videos with irrelevant exercises that the student doesn’t even have to think about. It’s only about copy and paste until the last assignment. Until then, it’s just a waste of time. Obviously it will be a good course for those people who only want the certificate and to pass the course with no effort at all, but it provides no value. On top of this, there is no technical support and I have struggled a lot in order to make everything work properly. I also suggest that Coursera give some guidance on the last assignment; a lot of people are lost.
Brian S –
The practice activity files are outdated, and much of the installation of the downloaded tools requires manual fixing. There is no support at all from the course publishers.
Luis A R –
Excellent course, very good material.
Laurent C S –
While the teachers are excellent and the course enjoyable, the instructions are simply not working (especially week 6). Just check the forums; the instructions are badly outdated. The Internet has changed. Many scripts must be debugged to add parameters such as skipping the SSL check and to add classpaths, and without this the test in week 6 is too hard without guessing and retrying. It is a shame to sell courses that no longer work. The whole course needs a serious refresh to get it working with a recent version. Remember, it is based on outdated software from 2014 or 2015. Please update! Thanks
Emilio M –
very good course
Pasawat S –
very useful
Estefania L –
A fundamental and theoretical course for beginners. This course taught about the big data pipeline and the different tools involved in the data analysis process.
shruthi r –
The hands-on dataset installation had lots of problems, and Spark and MongoDB hardly worked even after multiple installations. I tried many ways to get them to work, but there was no benefit.
Rohit G –
Environment setup instructions for final assignment do not work as expected and need to be updated to avoid time spent in troubleshooting.
Liliana d C C M –
Very good, I learned a lot, especially in the course project.
DHANRAJ N –
This course is very interactive and practical.
Rahul P –
The last exercise was worth all the effort. A huge thumbs up for the entire faculty involved with the course !!
Shekh A –
Awesome
Wania K –
I found this quite beneficial, as it provides all the relevant knowledge required to understand Big Data Integration and Processing. Thanks, Coursera, for providing such a platform to everyone.
Suraj J W –
A BIG THANKS for all your help in teaching others about this new technology.
JAMES F –
Good info, just a lot of info to digest.
Diogo M –
Some problems occur in the VM, and VirtualBox does not work well. I use VMware because I had problems running VirtualBox on my PC. The course needs an update.
Shubhradeep B –
The Anaconda file does not work when the bigdata3 file is extracted. Very bad experience doing the project. Localhost 8889 is not opening. I don’t know why.
Misty S –
This course is very good, but it is not for beginners, because one needs some background in traditional databases and some math skills. Otherwise the course content is good, and the assignment really makes the concepts understandable and clear.
ASHUTOSH S –
awesome
Reinaldo L N –
I had lots of problems with PostgreSQL and could not run the hands-on for it.
Ivan M H C –
So, in general, the course provides you with significant knowledge about big data integration processing, however there were simple exercises that could be done faster if there were no problems executing the commands. This problem leads students to quit the course. I request the staff correct those errors in order to increase the approval rate.
Gary V –
The course was informative, but some of the scripts in the exercises were wrong. For example, wget has to be changed to curl. I spent countless hours just figuring out what was wrong. There is no support to ask about any issues with the lab except for the Coursera community. The community is supportive; it’s just that they don’t give the best answers at times.
Berschauer A –
The learning data is corrupted. Nothing works.
Ahmed R –
The content was up to date, but the practice exercises are limited to the Cloudera platform and are too old. They need to be updated with more use cases and more exercises. Thanks Coursera 🙂
Sun W –
Content-wise it is okay. The hands-on material is not properly prepared. The software installation has so many errors, even though the course is using a standard VM. This shows the provider paid little attention to preparing the course material. Very disappointing attitude. Another error I spotted: the mongoexport instructions are wrong. Using the instructions, there is no way to extract just the tweets column. I really doubt the instructors have ever tested what they have provided to the students, lol. I don’t know how to express my disappointment with this course. The quiz instructions are very unclear as well. If the instructors had ever tried their own instructions, they would have discovered these naive mistakes. Please be responsible when selling knowledge.
Carlos M –
It was a good overview; however, I feel that there are not enough examples of streaming processes. An example of how to integrate a relational source with a NoSQL one would also be valuable for the learners.
KANWALJEET S L –
Awesome. I was able to learn integration with various big data tools.
Mohamad K –
The teaching is about quality and not quantity. It’s a great course; I have learned about a lot of tech that is used nowadays in real life.
Gnana P B –
It is a good course. Some of the instructions do not work when followed; the course instructions and supplied resources require an update. Thank you.
Johan A P O –
The last week was a disaster in terms of providing the necessary educational resources. I found it extremely hard to finish the assignment because I couldn’t work out the knowledge required to do it. I think you must make sure students are prepared for the functions that you will ask them to use at the end. It was tremendously underwhelming to find such interesting tasks and then be unable to see any clear path to perform even the first actions. I had to research a lot outside the platform and dig up old replies in the forum just to get hints about what I had to do to find the answers you were requesting. If you consider what you explained to be sufficient, you’re applying an unfair filter to students. If you didn’t mean that, please adjust this whole module to focus on:

* PySpark syntax
* clear use cases in data retrieval and analysis
* the syntax of each function that you will request later

Or just change the last module to match what you’ve taught. Thanks. Despite these struggles, I was able to learn.
Chandrakanth B –
I was able to learn how big data integration and processing works.
Anatolie P –
No support at all, and the software versions are old. The control data in the quiz does not correspond to the real data.
Sagar –
Mainly I have learned the big data structure and the technologies which are used to control the flow of the data. The practical explanation was really good. This course also gave a basic idea of the machine learning algorithms used in big data processing, such as classification, clustering, etc. I really enjoyed the learning journey. :)
Ripunjay K –
Good course to enhance knowledge. A data sheet for the code should be given separately.
Tariq A –
The course exceeded my expectations in many regards, especially in the depth of information supplied and the access to the instructor for feedback on work in progress. In a very non-threatening environment, I learned key principles of design that I can implement immediately.
Esra K –
The assignments could be more complex, but for a beginner they are enough to understand the framework. More coding practice should be offered, and it needs to be repeated so that the syntax is remembered.
Nicholas S –
I learned quite a bit about Big Data problems and the various technologies, especially Spark, that you can use for those problems.
Ferran G F –
Low score because professor team/staff seemed to completely ignore discussion forums. A lot of participants have had problems running shell scripts and other setup instructions that are necessary to perform some tasks, and their posts have been ignored.
Dev A S –
Good course. But we couldn’t relate the theoretical videos to the hands-on exercises.
Abhishek G –
Amazing course to learn the fundamentals and get hands-on experience with MongoDB and PySpark. The course is a little bit challenging due to some errors in the guidelines for setting up the working environment and in the solutions to the final quiz. I would have given 5 stars if the issues discussed on the forums had been answered. Overall, a great learning experience.
Basil C –
The last 2 assignments were really challenging. In hindsight, they provided a holistic view of the use of MongoDB and Spark in ingesting, processing and transforming data. They required real perseverance in researching and trying out “theories”. It was enriching and rewarding, but not for the fainthearted.
mostafa r m –
…
aleksei a –
good
Simon B C –
This course was the best of the big data courses so far. I liked the hands-on work and that there were not too many hints. Let people figure it out by themselves.
Michael L –
I especially enjoyed the hands-on exercise of week 6 and, all in all, the lectures. They give a good overview of various data integration tools. However, I think the virtual machine and some of the documentation around it need an update. If you do not finish the exercises in one sweep, it is often not obvious how to restore the original settings. I think I spent almost as much time trying to get the environment on my virtual machine running as on actually doing the exercise. I know that this might even reflect the life of a data scientist, but some checker scripts that test whether Hadoop is running properly, environment variables are set correctly, the right version of Java is in the path, and so on would be really helpful.
piaoyang –
There is little instruction for the final task (or for the other tasks). I was also confused by the comments in the Jupyter notebook for a long time. You have to google many things to complete the task.
Marianna K –
It is a great hands-on course. The problems were in setting up and configuring a variety of software for completing the projects.
Naman A –
It was fun learning.
Stephen B –
Too many different software packages, not enough depth, and no support. Good high level overview.