reading-notes

Data Science Primer

Machine Learning

Machine learning is not about algorithms. Machine learning is the practice of teaching computers how to learn patterns from data, often for making decisions or predictions.

Machine learning means learning from experience, that is, from data. Giving a computer direct, step-by-step commands is explicit programming, not machine learning.

Key Terminology:

Model: a set of patterns learned from data.
Algorithm: a specific ML process used to train a model.
Training data: the dataset from which the algorithm learns the model.
Test data: a new dataset for reliably evaluating model performance.
Features: variables (columns) in the dataset used to train the model.
Target variable: the specific variable you're trying to predict.
Observations: data points (rows) in the dataset.
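The terms above can be mapped onto a few lines of code. This is only a minimal sketch, not something from the original notes: the dataset, the column names `hours_studied` and `exam_score`, and the helpers `fit_line` and `predict` are all hypothetical, and least squares stands in for "an algorithm" purely for illustration.

```python
# Hypothetical toy dataset: each tuple is one observation (row).
# First value: hours_studied (the feature). Second: exam_score (the target variable).
observations = [
    (1.0, 52.0),
    (2.0, 54.0),
    (3.0, 56.0),
    (4.0, 58.0),
    (5.0, 60.0),
    (6.0, 62.0),
]

# Training data is what the algorithm learns from;
# test data is held back to evaluate the learned model.
train, test = observations[:4], observations[4:]


def fit_line(rows):
    """Algorithm: ordinary least squares with a single feature."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    variance = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept  # Model: the pattern learned from the data


def predict(model, x):
    """Apply the learned model to a new feature value."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(train)

# Evaluate on observations the algorithm never saw during training.
test_errors = [abs(predict(model, x) - y) for x, y in test]
mean_abs_error = sum(test_errors) / len(test_errors)
```

Because the toy data lies exactly on a straight line, the model learned from the four training rows also fits the two test rows; real data is rarely this tidy.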

The 3 Elements of Great Machine Learning:

Human guidance
Clean, relevant data
Avoiding overfitting
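The third element, overfitting, can be illustrated with a small comparison. The sketch below is an assumption-laden example rather than anything from the original notes: the data is made up (a true relationship of y = 2x with alternating noise on the training targets), and `fit_line`, `memorizer_predict`, and `mean_abs_error` are hypothetical helpers. It contrasts a model that memorizes the training rows with a simple straight-line model.

```python
# Hypothetical data: the underlying relationship is y = 2x.
# Training targets carry alternating noise of +0.5 / -0.5; test targets do not.
train = [(1, 2.5), (2, 3.5), (3, 6.5), (4, 7.5), (5, 10.5), (6, 11.5)]
test = [(1.4, 2.8), (2.6, 5.2), (3.4, 6.8), (4.6, 9.2), (5.4, 10.8)]


def fit_line(rows):
    """Simple model: least-squares straight line through the training rows."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in rows) / sum(
        (x - mean_x) ** 2 for x, _ in rows
    )
    return slope, mean_y - slope * mean_x


def line_predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def memorizer_predict(rows, x):
    """Overfit model: return the target of the closest training point."""
    nearest = min(rows, key=lambda row: abs(row[0] - x))
    return nearest[1]


def mean_abs_error(predict, rows):
    """Average absolute difference between predictions and targets."""
    return sum(abs(predict(x) - y) for x, y in rows) / len(rows)


line = fit_line(train)
line_train_error = mean_abs_error(lambda x: line_predict(line, x), train)
line_test_error = mean_abs_error(lambda x: line_predict(line, x), test)

memorizer_train_error = mean_abs_error(lambda x: memorizer_predict(train, x), train)
memorizer_test_error = mean_abs_error(lambda x: memorizer_predict(train, x), test)
```

On this data the memorizing model has a training error of zero but a larger test error than the straight line, because it has learned the noise along with the pattern. That gap between training and test performance is the usual symptom of overfitting.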

Better data beats fancier algorithms. It is therefore important to clean and organize the data before using it.

Exploratory Analysis

The goal of exploratory analysis is to study the data and make sure that it makes sense and is appropriate for the model.

Correlation is a value between -1 and 1 that represents how closely two features move in unison:

Positive correlation means that as one feature increases, the other increases. E.g. a child's age and her height.
Negative correlation means that as one feature increases, the other decreases. E.g. hours spent studying and number of parties attended.
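To make the two cases above concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The notes do not say which correlation measure is meant, so Pearson is an assumption; the function name `pearson_correlation` and all the numbers are hypothetical, chosen only to mirror the two examples.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a feature is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical values: age in years and height in centimetres.
ages = [2, 4, 6, 8, 10]
heights_cm = [86, 102, 115, 128, 138]

# Hypothetical values: weekly study hours and parties attended.
study_hours = [1, 2, 3, 4, 5]
parties_attended = [9, 7, 6, 3, 1]

age_height_r = pearson_correlation(ages, heights_cm)               # close to +1
study_party_r = pearson_correlation(study_hours, parties_attended)  # close to -1
```

A value near +1 or -1 indicates that the two features move together almost linearly, while a value near 0 indicates little linear relationship.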