Data Science self-assessment topic list

Running a specialized study group requires understanding how much the group already knows about the subject it is interested in. This knowledge makes it possible to build a relevant learning road map. I’m now building an assessment to get this understanding for a group I’m about to run. The purpose of the group is to get familiar with Data Science and all the important aspects of the discipline. Before launching the group, the assessment should give a picture of the group profile. It should not be very detailed, because the goal is to find the average level of the group. At the same time, the assessment should touch all the important topics of the discipline. Here I summarize my findings and outline the topics for the assessment.

Discipline Key Notions

Mathematics is at the heart of Data Science. It shows up in every part of the discipline, so a good education in this area is a key factor. I found one book that explains a detailed assessment: Doing Data Science: Straight Talk from the Frontline. In the book, the authors suggest the following profile structure:

  1. Computer science
  2. Math
  3. Statistics
  4. Machine learning
  5. Domain expertise
  6. Communication and presentation skills
  7. Data visualization

There is another assessment provided on edX: Data Science Readiness Assessment. It covers such aspects as calculus, linear algebra and programming. Also this and this explain other important aspects. This article has quite good coverage of algebra for machine learning. Many other materials can be found on the internet. What they all have in common is algebra, statistics, programming and Data Science terminology. Bearing this in mind, I tried to identify the essential topics for building an initial profile.

Only Key Factors

The initial assessment should cover only topics that show a person’s awareness of the subject. For example, there is no sense in including basic math topics: if a person knows integrals, he or she surely also knows the abridged multiplication formulas. I identified the following main topics for the assessment.

  • Math
    • Multiplication of matrices
    • Inverse matrix
    • Derivatives of functions
  • Statistics
    • Measure of center
      • Mean
      • Median
      • Mode
    • Measure of spread
      • Variance
      • Standard Deviation
      • Covariance
      • Correlation
    • Measure of error
      • Mean Absolute Error
      • Root Mean Squared Error
      • Relative Absolute Error
      • Relative Squared Error
      • The Coefficient of Determination
      • Confusion Matrix
  • Machine Learning Concepts
    • Types
      • Classification
      • Regression
      • Clustering
    • Model training concepts
      • Supervised/Unsupervised learning
      • Reinforcement learning
    • Model evaluation
      • Overfitting
      • Training and Testing data
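
To make the "Measure of error" items concrete, here is a minimal sketch of how they relate to each other. The function name `error_measures` and the sample numbers are made up for illustration, and the formulas follow the common textbook definitions, where the relative errors compare the model against always predicting the mean of the actual values.

```python
import math


def error_measures(actual, predicted):
    """Return common regression error measures (hypothetical helper)."""
    n = len(actual)
    mean_actual = sum(actual) / n

    # Per-item errors of the model.
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    sq_err = [(a - p) ** 2 for a, p in zip(actual, predicted)]

    # Errors of the baseline that always predicts the mean.
    abs_base = sum(abs(a - mean_actual) for a in actual)
    sq_base = sum((a - mean_actual) ** 2 for a in actual)

    rse = sum(sq_err) / sq_base
    return {
        "mae": sum(abs_err) / n,              # Mean Absolute Error
        "rmse": math.sqrt(sum(sq_err) / n),   # Root Mean Squared Error
        "rae": sum(abs_err) / abs_base,       # Relative Absolute Error
        "rse": rse,                           # Relative Squared Error
        "r2": 1 - rse,                        # Coefficient of Determination
    }


# Made-up sample data, for illustration only.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]
measures = error_measures(actual, predicted)
print(measures)
```

With these definitions, the Coefficient of Determination is simply one minus the Relative Squared Error, which is why the two items sit next to each other in the list.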

No Cloud Services

I didn’t include “Cloud services” or anything else related to infrastructure, because questions in this area would depend on your technology targets and don’t show a person’s readiness from a Data Science standpoint. Overall, the assessment is not meant to give a full and detailed overview, and it can surely be improved a lot. My approach is to write questions about the aspects a person should be aware of, at a level I consider good enough to be ready for the Data Science area.

No Programming Languages

I didn’t include any programming language in the assessment because a language is just a tool. The fact is that any statistical function can be implemented in any language: Python, R, C++, Java, C#, etc. Python seems to be the most popular programming language for Data Science solutions because it is simple and effective and has a lot of tools. But knowing a language doesn’t really show a person’s awareness of Data Science.
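
As a small illustration of the point that the statistics are plain arithmetic and not tied to a language, here is a sketch of a few measures of center and spread written without any libraries beyond `math`. The function names and the sample data are invented for this example, and the population convention (dividing by n) is an assumption; the sample versions divide by n - 1 instead.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def median(xs):
    ordered = sorted(xs)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


def covariance(xs, ys):
    # Population covariance (divides by n, an assumption of this sketch).
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def variance(xs):
    # Variance is the covariance of a variable with itself.
    return covariance(xs, xs)


def correlation(xs, ys):
    # Pearson correlation: covariance scaled by both standard deviations.
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))


# Made-up sample data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]

print(mean(hours), median(hours), variance(hours))
print(covariance(hours, scores), correlation(hours, scores))
```

The same few lines translate almost word for word into R, Java or C#, which is why the assessment asks about the concepts and not about a particular language.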

Wrapping Up

Building the assessment around these topics will take some time, and I’ll publish an example of it soon. It will certainly contain algebra exercises and some multiple-choice questions. Meanwhile, I need to verify the topic list and add or remove some items.
