Complete list of courses for a quick start in Data Science

In our Data Science group, the question came up of which courses a person could take to get started with Data Science. Some time ago I applied to an investment program where I had to present a list of courses for which I needed funding. When I created that list, I was aiming for a quick introduction to all the important aspects of the industry, followed by a deeper dive into the details. The approach is as follows: start with the tool set and concepts, move on to the mathematics and statistics background, and finish with advanced techniques. I have since extended the list with some additional courses, and it is now a kind of to-do list for me for the coming year. Below you can find the list with my comments on each course. Please take into account that I had not taken all of them at the time of writing this article. The courses I have taken, with some insights, can be found here.

Courses

Please read the annotation below the list.

| Skills | Title | Teacher | Comments | Length | Price |
|---|---|---|---|---|---|
| Data Engineer: Python, NumPy, pandas, Matplotlib | DAT208x: Introduction to Python for Data Science | Microsoft | This course has everything you need at the beginning. It covers NumPy, pandas, visualization with Matplotlib, and the Python language. The materials are very good and well presented. It is done in collaboration with DataCamp, and you can try new things hands-on in their IPython shell. | 1 month | $100 |
| Data Engineer: Azure Machine Learning, Predictive Models, Azure Data Factory | DAT228x: Developing Big Data Solutions with Azure Machine Learning | Microsoft | During the course you’ll build a pipeline to get data, train a model, and update the model with Azure services. I liked this course because it explains in detail what you need to do to build the pipeline, which services you need, and why. | 1 month | $100 |
| Data Engineer: Hadoop, HBase, Storm, Spark, Azure HDInsight, Hive and Pig | Microsoft Azure HDInsight Big Data Analyst (X Series) | Microsoft | The series consists of three courses and covers a very current set of technologies. I took part of the first course in the series and found that it explains the tools in detail, enough to understand what is happening under the hood. | 3 months | $300 |
| Data Engineer: HDFS, MapReduce and Spark RDD, Hive, Spark SQL, DataFrames and GraphFrames, Python | Big Data for Data Engineers Specialization | Yandex | The specialization consists of 5 courses. They were made in collaboration between Yandex, Odnoklassniki, and MIPT. All the teachers have solid expertise in the big data industry. According to some reviews, the lectures have well-structured materials and a clear way of storytelling. | 6 months | $300 |
| Data Engineer: Supervised, Unsupervised, Reinforcement, and Deep Learning | Machine Learning Engineer | Kaggle | The course consists of 6 lessons, each one month long. Each lesson relates to one of the main concepts of Data Science, so the student touches on all aspects of the industry. | 6 months | $1200 |
| Data Engineer: Python, Jupyter, pandas, NumPy, Matplotlib | DSE200x: Python for Data Science | The University of California, San Diego | This course is part of the Data Science MicroMasters program. It covers important packages and techniques for working with data in Python and also introduces you to ML concepts. | 2 months | $350 |
| Data Engineer: Python and algorithms | 6.00.1x: Introduction to Computer Science and Programming Using Python | MITx | This is a course for beginners. You’ll learn Python, algorithms, and data structures. | 2 months | $50 |
| Data Scientist: R, ggplot2, swirl, etc. | Data Science Specialization | Johns Hopkins University | The specialization focuses heavily on using the R language and on data exploration techniques. Some reviews say its storytelling is too abstract. | 10 months | $500 |
| Data Scientist: Python with packages for plotting and data analysis | Applied Data Science with Python Specialization | University of Michigan | The specialization consists of 5 courses. It gives you the chance to learn packages and techniques that are standard in the industry. Together the courses cover statistics, machine learning, visualization, text analysis, and social network analysis, using Python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx. | 5 months | $250 |
| Data Scientist: Supervised and unsupervised learning, bias/variance theory, neural networks | Machine Learning | Stanford University, Andrew Ng | The course is taught by Andrew Ng, who is well known in Data Science. It covers supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks), unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning), and bias/variance theory. This is simply a very good course to get started with Data Science. | 3 months | $79 |
| Data Scientist: Neural networks, Convolutional networks, RNNs, LSTM, Adam, Dropout, BatchNorm, Xavier/He initialization | Deep Learning Specialization | deeplearning.ai, Andrew Ng | Andrew Ng and his colleagues created this specialization as a deeper introduction to Deep Learning and Neural Networks. Its five courses take you through theory and practice using examples in Python and TensorFlow. | 5 months | $250 |
| Data Scientist: NLP, reinforcement learning, deep learning, Bayesian methods | Advanced Machine Learning Specialization | Higher School of Economics | This specialization is built in collaboration with the Yandex School of Data Analysis. Its seven courses take you through Kaggle challenges, text analysis, computer vision, and more. | 7 months | $350 |
| Data Scientist: Neural networks, calculus, Python, Octave | Neural Networks for Machine Learning | University of Toronto, Geoffrey Hinton | I have partially finished this course. It’s not easy to grasp the ideas without a good background in math, but it’s possible if you study the related areas of math in parallel. Your overall pace will be slower, but it is worth it. The course covers very important basics, though it may not be the most up to date. It uses Octave, which feels a bit weird to me personally. There is a review here on this course. Still, I would strongly recommend it because it brings clarity to your picture of neural networks. | 4 months | $50 |
| Data Scientist: Python, probability, distributions, Monte Carlo simulations | 6.00.2x: Introduction to Computational Thinking and Data Science | MITx | This is purely for people who need to upgrade their skills in computation and data exploration. It’s good that Python is used in the labs. According to reviews, it presents information for beginners in a good, clear way. It is worth taking after the other MITx course, 6.00.1x. | 3 months | $50 |

Annotation

The MOOC platforms are edX, Coursera, and Udacity.

Skills

In general, Data Science as an industry consists of two specializations: theoretical and hardware. The theoretical specialization is about analyzing data, finding insights, building features, and so on. A specialist in this area should have a good math background and might not know programming very well. The hardware specialization is about constructing a pipeline to gather raw data, preparing it for further analysis, and implementing models in real applications. Such specialists may know something about data analysis, but it is not their major skill. As a result, the list contains both Data Scientists and Data Engineers. I tried to sort all the courses according to the area they belong to more.

Length & Price

Lengths and prices are approximate. Some of the courses mentioned are available for free, and only the certificate costs money. Free access varies from platform to platform and from course to course. For example, Coursera may put assessments behind a paywall while the materials remain available. On edX, only the certificate is paid.

Additional Courses

There are some other programs that may be of interest, but I’ve heard very little about them so far. I’m just putting them here so that I don’t forget to check them later.

  • Master of Computer Science in Data Science. A very expensive program from the University of Illinois. On completing it you’ll get a master’s degree in Data Science.
  • Learning from Data. Free lectures and materials from the California Institute of Technology, taught by Professor Yaser Abu-Mostafa. I found good feedback about it, but didn’t include it in the list since the course was closed on edX at the time of writing this article.
  • CS231n: Convolutional Neural Networks for Visual Recognition. This is a highly recommended course from Stanford University. I’m not sure at the moment how to take it online. I found feedback saying that the course requires good hardware and that you may need to create cloud VMs that satisfy the requirements.
  • CS224n: Natural Language Processing with Deep Learning. Same situation as above.
  • MSc in Statistical Science. It’s provided by the University of Oxford. It is a twelve-month full-time taught master’s degree running from October to September each academic year. The MSc has a particular focus on modern computationally-intensive theory and methods.
  • MPhil in Machine Learning, Speech and Language Technology. It’s provided by the University of Cambridge. It is a twelve-month full-time MPhil programme offered by the Computational and Biological Learning Group, the Speech Group, and the Computer Vision and Robotics Group in the Cambridge University Department of Engineering, with a unique, joint emphasis on both machine learning and on speech and language technology.

Books

Here I put some books found during the creation of the list. This paragraph may be moved to another article later.



