# Math-ML Course [Online]

*Machine Learning* goes hand in hand with learning the mathematical fundamentals that make it possible to run optimized **Machine Learning** algorithms. The DaSCI Institute has designed the five modules of this course to be followed online.

#### Teaching Methodology

The course distributes the mathematical concepts associated with Machine Learning across five modules. Each module is self-contained, so the contents can be followed independently. Each module consists of 4-6 short videos totalling approximately 90 minutes. Additional material in PDF supports the concepts of the course, together with recommended readings and practical proposals with associated exercises.

Each module has an academic load of approximately 0.6 ECTS. Participants can contact the person responsible for the module to clarify any doubts.

#### Academic Contents

#### Module 1 – Norms and Regularization Techniques for Machine Learning

- Introduction and motivation
  - Motivation: the concept of overfitting
  - Regularization as a solution to overfitting
- Vector norms
  - Motivation for the use of vector norms
  - Definition of a vector norm
  - Most used norms: the Lp family
  - Representation of vector norms as the unit sphere
  - Extension: other norms
- ML regularization techniques
  - Concept of regularization in ML
  - Most common regularizations and their relationship with vector norms
  - Comparison between regularizations
- Practical example in Python. Objective: to graphically illustrate the concepts taught in the course
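As a minimal sketch of the Lp family listed above (not taken from the course material), the snippet below computes the L1, L2 and L-infinity norms of a small vector. The function name `lp_norm` and the sample vector are illustrative assumptions.

```python
import numpy as np

def lp_norm(x, p):
    """Return the Lp norm of x for p >= 1; p = np.inf gives the max norm."""
    x = np.asarray(x, dtype=float)
    if p == np.inf:
        # Limit case of the family: largest absolute component.
        return float(np.max(np.abs(x)))
    if p < 1:
        # For p < 1 the triangle inequality fails, so it is not a norm.
        raise ValueError("p must be >= 1 for a valid norm")
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

v = [3.0, -4.0]  # hypothetical example vector
l1 = lp_norm(v, 1)        # sum of absolute values
l2 = lp_norm(v, 2)        # Euclidean length
linf = lp_norm(v, np.inf) # largest absolute component
```

In regularization, the L1 norm of the weight vector is the penalty used by lasso and the squared L2 norm is the penalty used by ridge regression, which is one way the norms above relate to the regularizations in this module.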

#### Module 2 – Linear Algebra and Dimensionality Reduction

- Introduction
  - Representation of data sets
  - Vector spaces
  - Curse of dimensionality
- Fundamentals of Linear Algebra
  - Matrix operations
  - Linear transformations
  - Linear and affine subspaces
- Principal Component Analysis
  - Least squares problem
  - Solution
  - High-dimensional PCA calculation
- Dimensionality reduction methods
  - Linear techniques (PCA, Factor Analysis, Linear Discriminant Analysis)
  - Non-linear techniques, manifold learning (KPCA, MDS, Isomap, LLE)
  - Neural models: autoencoders
  - Probabilistic neural models
- Practical use of autoencoders
  - General structure
  - Regularized autoencoders for manifold learning
  - Convolutional autoencoders for noise reduction
  - Generative autoencoders and instance generation

#### Module 3 – Probability, Distributions and Probabilistic Models

- Random variables and vectors
  - Probability space
  - Distribution of a random variable
  - Random vector
  - Independence of variables
- Expectation, Variance and Estimators
  - Expectation and variance
  - Estimators
  - Correlation
- Marginal and conditional distributions
  - Marginal distribution
  - Conditional distribution
  - Law of total probability
  - Bayes’ theorem
- Common distributions in ML
  - Discrete distributions
  - Continuous distributions
  - Multivariate distributions
- Model parameter estimation
  - Parameter estimation
  - Maximum likelihood estimation
  - Maximum a posteriori estimation
  - Bayesian estimation
- Introduction to Bayesian networks
  - Fundamentals of Bayesian networks
  - Network construction
  - Parameter estimation
  - Naive Bayes

#### Module 4 – Convex/Non-Convex Optimization and Optimization Heuristics

- Introduction
  - Optimization problem definition
  - Types of optimization problems
  - NP complexity class: definition and relation to optimization problems
- Constraints on optimization problems
  - Intuition behind optimization with constraints
  - Problem with dual constraints. Primal-dual problem, Lagrangian. Weak duality of the problem. Calculating the hyperplane for SVM
  - Interior and exterior penalty methods
  - Constraint satisfaction problem
- Convex/Non-Convex Optimization Problems
  - Definition of convex set and convex function. Importance of convex problems
  - Characterization of convexity and convex functions. Relationship to optimization problems with constraints
  - Examples of Machine Learning problems
  - Definitions and characteristics of non-convexity
  - Treatment of non-convex problems
- Gradient Descent
  - A description of the intuitive idea of the algorithm
  - The mathematical formulation of gradient descent. Explanation of parameters. Practical example
  - Analysis of gradient descent
  - Gradient descent variants
- Metaheuristics for Machine Learning
  - Introduction to metaheuristics
  - Metaheuristics for feature selection
  - Metaheuristics for hyperparameter tuning
  - Metaheuristics for instance selection
  - Metaheuristics as optimization algorithms
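To give a flavour of the gradient descent topic in this module, here is a minimal sketch of the basic update rule, x ← x − η ∇f(x), applied to a simple convex quadratic. It is not taken from the course material; the function name `gradient_descent`, the step size, the number of steps and the objective are all illustrative assumptions.

```python
import numpy as np

def gradient_descent(grad, x0, learning_rate=0.1, n_steps=100):
    """Run plain (fixed step size) gradient descent and return the final iterate."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        # Move against the gradient, scaled by the learning rate.
        x = x - learning_rate * grad(x)
    return x

# Hypothetical convex objective f(x) = ||x - target||^2, whose gradient is 2 (x - target).
target = np.array([1.0, -2.0])
x_min = gradient_descent(lambda x: 2.0 * (x - target), x0=[0.0, 0.0])
```

For this objective each step shrinks the distance to `target` by a constant factor (0.8 with the step size assumed here), so the iterate converges to the unique minimizer. On non-convex problems the same rule may stop at a local minimum or saddle point.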

#### Module 5 – Statistical Hypothesis Testing, Validation and Comparison of ML Models

- Introduction
  - Null hypothesis significance testing (NHST) and model validation
  - Case study
- Parametric tests
  - t-test
  - ANOVA test
  - Confidence intervals and curves
  - Hotelling's T² test
- Non-Parametric tests
  - Checking the preconditions
  - Two-sample comparison
  - Convergence study
  - Non-parametric tests for multiple measures
  - Multiple measures comparison
  - Post-hoc procedures
- Bayesian tests
  - Criticism of NHST
  - Bayesian t-test
  - Bayesian sign test and Bayesian signed-rank test
  - Imprecise Dirichlet process
  - Bayesian Friedman test
  - Bayesian multiple measures test