Notes: Fundamentals of Machine Learning

February 22, 2025

I am upskilling on AI and machine learning on Datacamp, a learning platform for data, analytics, and AI skills. These notes help me to understand and remember what I’m learning.

Machine Learning

A set of tools for making inferences and predictions from data

Reinforcement Learning Models

Supervised Learning Models

  • Training data contains labeled Target Variables and as many observations with relevant Features as possible
  • Workflow: extract features; split the data into training and test datasets; train the model; evaluate the model
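To make the workflow concrete for myself, here is a minimal sketch in plain Python. The dataset (hours studied vs. pass/fail), the 80/20 split, and the one-threshold "model" are all assumptions for illustration, not anything from the course.

```python
import random

# Hypothetical toy dataset: feature = hours studied, target = 1 (pass) or 0 (fail).
data = [(hours, 1 if hours >= 5 else 0) for hours in range(10)] * 3

# Split the data into training and test datasets (80% / 20%).
random.seed(0)
random.shuffle(data)
cut = int(len(data) * 0.8)
train, test = data[:cut], data[cut:]

def accuracy(threshold, rows):
    """Evaluate: fraction of rows where 'x >= threshold' matches the label."""
    return sum((x >= threshold) == bool(y) for x, y in rows) / len(rows)

def fit_threshold(rows):
    """Train: pick the threshold that best separates the training labels."""
    candidates = sorted({x for x, _ in rows})
    return max(candidates, key=lambda t: accuracy(t, rows))

threshold = fit_threshold(train)
train_accuracy = accuracy(threshold, train)
test_accuracy = accuracy(threshold, test)
print(threshold, train_accuracy, test_accuracy)
```

The point of holding back the test rows is that the model never sees them during training, so the test accuracy is an honest estimate of how it handles new data.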
Classification models

Assigns a category to each observation

  • Support vector machine - linear classifier
  • Support vector machine - polynomial classifier: used when data is not linearly separable. Uses a “Polynomial Kernel” trick that implicitly computes dot products in a higher-dimensional space, without explicitly performing the transformation: $$ K(x, y) = (\gamma \cdot x^T y + r)^d $$ where x and y are input vectors; γ is a scaling parameter; r is a constant term; d is the degree of the polynomial.
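A small check of the kernel trick, as a sketch: for 2-D inputs with γ = 1, r = 1, d = 2 (values chosen by me for illustration), the kernel gives the same number as a dot product in an explicit 6-dimensional feature space. The function names are my own.

```python
def poly_kernel(x, y, gamma=1.0, r=1.0, d=2):
    """Polynomial kernel K(x, y) = (gamma * x.y + r) ** d."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + r) ** d

def explicit_map(x):
    """Explicit feature map matching the kernel for 2-D input, gamma=1, r=1, d=2."""
    x1, x2 = x
    s = 2 ** 0.5
    return [1.0, s * x1, s * x2, x1 * x1, x2 * x2, s * x1 * x2]

x, y = [1.0, 2.0], [3.0, 4.0]

# Kernel route: works directly on the original 2-D vectors.
k_implicit = poly_kernel(x, y)

# Explicit route: transform both vectors to 6-D, then take the dot product.
k_explicit = sum(a * b for a, b in zip(explicit_map(x), explicit_map(y)))

print(k_implicit, k_explicit)
```

Both routes give 144 (up to floating-point rounding), but the kernel never builds the 6-D vectors, which is what makes it cheap for higher degrees and dimensions.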
Regression models

Predicts a continuous value
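As an illustration, a minimal least-squares line fit in plain Python. Simple linear regression is just one example of a regression model, and the data points here are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predicted value for an unseen x = 5
print(slope, intercept, prediction)
```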

Unsupervised Learning Models

  • Training data does not contain a labeled target variable, only features. Useful for clustering and anomaly detection. Use cases:

Clustering

  • K Means: specify number of clusters
  • Density-based spatial clustering of applications with noise (DBSCAN): specify what counts as a cluster (how close points must be and how many are needed)
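A bare-bones K-means sketch to show what "specify number of clusters" means in practice. Initialising the centroids with the first k points is my simplification to keep the run reproducible; real implementations usually pick starting points more carefully.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=10):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = list(points[:k])  # naive initialisation (assumption for this sketch)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster's points; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical 2-D points forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```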

Association

find relationships between observations

Anomaly detection

find outliers, which is tough in higher dimensions
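One simple way to flag outliers in a single feature is a z-score rule, sketched below. The threshold of 2 standard deviations and the readings are assumptions; this is only one of many anomaly detection approaches, and it does not address the higher-dimensional case.

```python
import statistics

def find_outliers(values, z_threshold=2.0):
    """Return values more than z_threshold standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > z_threshold]

# Hypothetical sensor readings with one obvious anomaly.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_outliers(readings)
print(outliers)
```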

Performance Improvement

Dimensionality reduction

reduce features, removing features that are correlated or irrelevant
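A sketch of the "remove correlated features" idea: keep a feature only if it is not strongly correlated with one already kept. The feature names, values, and the 0.95 cutoff are made up for illustration.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def drop_correlated(features, threshold=0.95):
    """Keep each feature unless it correlates above the threshold with a kept one."""
    kept = {}
    for name, column in features.items():
        if all(abs(pearson(column, other)) < threshold for other in kept.values()):
            kept[name] = column
    return kept

# Hypothetical features: height in cm and in inches carry the same information.
features = {
    "height_cm": [150, 160, 170, 180],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "age": [30, 22, 41, 25],
}
reduced = drop_correlated(features)
print(list(reduced))
```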

Hyperparameter tuning

Various SVM algorithm options: kernel (linear, poly); C; degree; gamma; shrinking; coef0; tol; …
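The simplest tuning strategy is a grid search: try every combination of options and keep the best-scoring one. The sketch below shows only the search loop. The `validation_score` function is a hypothetical stand-in for "train an SVM with these options and score it on held-out data", and the grid values are my own picks.

```python
from itertools import product

# Candidate values for a few of the options listed above (values are assumptions).
param_grid = {
    "kernel": ["linear", "poly"],
    "C": [0.1, 1.0, 10.0],
    "degree": [2, 3],
}

def validation_score(params):
    """Hypothetical stand-in: a real version would train a model and score it."""
    score = 0.8 if params["kernel"] == "poly" else 0.7
    score -= abs(params["C"] - 1.0) * 0.01
    score -= (params["degree"] - 2) * 0.02
    return score

def grid_search(grid, score_fn):
    """Score every combination in the grid; return the best one and the number tried."""
    names = list(grid)
    combos = [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]
    return max(combos, key=score_fn), len(combos)

best_params, n_tried = grid_search(param_grid, validation_score)
print(best_params, n_tried)
```

The number of combinations grows multiplicatively with each option added, which is why tuning many hyperparameters at once gets expensive.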

Ensemble methods

Use multiple models
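A minimal sketch of one ensemble approach, majority voting: several weak models each make a prediction and the most common answer wins. The three rule-based "models" and the spam example are invented for illustration.

```python
from collections import Counter

def majority_vote(models, x):
    """Ask every model for a prediction and return the most common answer."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]

# Three hypothetical weak classifiers, each looking at a different feature.
models = [
    lambda m: "spam" if m["links"] > 3 else "ham",
    lambda m: "spam" if m["caps_ratio"] > 0.5 else "ham",
    lambda m: "spam" if m["length"] < 20 else "ham",
]

message = {"links": 5, "caps_ratio": 0.2, "length": 10}
prediction = majority_vote(models, message)
print(prediction)
```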


Deep Learning

Uses neural networks; a subset of Machine Learning that requires vast amounts of data

Computer Vision

Thanks, deepfakes come in here.

Natural Language Processing (NLP)

  • Bag of words: counts how often each word (or n-gram) appears; doesn’t consider similar meanings
  • Word embeddings: numeric vector representations in which words with similar meanings are grouped close together
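A quick bag-of-words sketch in plain Python, counting single words and 2-grams. Splitting on whitespace and lowercasing is a simplification I chose; real tokenizers handle punctuation and more.

```python
from collections import Counter

def bag_of_words(text, n=1):
    """Count n-grams (runs of n consecutive words) in the text."""
    tokens = text.lower().split()
    ngrams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(ngrams)

sentence = "The cat sat on the mat"
unigrams = bag_of_words(sentence)
bigrams = bag_of_words(sentence, n=2)
print(unigrams)
print(bigrams)
```

Note that "cat" and "kitten" would be counted as unrelated entries here, which is the limitation that word embeddings address.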