3990-002: Mathematics of Machine Learning (Fall 2023)
Machine learning is the study of algorithms (i.e. gradient descent) that learn functions (i.e. deep networks) from experience (i.e. data). Behind this simple statement, is a lot of mathematical scaffolding: statistics for handling data, optimization for understanding learning algorithms, and linear algebra to create expressive models.
However, the typical computer science degree typical requires only a basic understanding of these mathematical concepts. This means that taking an advanced machine learning course may require taking multiple courses across graduate statistics and mathematics just to get up to speed. It will also get you used to the mathematical lingo of machine learning. If you’ve ever tried reading an ML paper and found it difficult to follow the concepts and equations, this might be the course for you.
To better prepare undergraduates for machine learning coursework and research, this course aims to provide the missing background required to be able understand mathematical concepts commonly used in machine learning. This course will be based on the Mathematics for Machine Learning textbook, which covers the mathematical foundations of machine learning as well as examples of how machine learning algorithms that use these foundations.
Instructor: Eric Wong (exwong@cis)
Class: Monday and Wednesday, 3:30PM-4:59PM
Website: https://www.cis.upenn.edu/~exwong/moml/
Registration: To register, you need to sign up both on courses.upenn.edu and also submit the questionaire on the CIS waitlist before I can add you to the course.
Prerequisites: We will assume you’ve taken the minimum mathematics requirements of the Penn CS degree. That is:
- CIS1600 (CS foundations)
- CIS2610 or equivalent (discrete probability)
- MATH1400/MATH1410 or equivalent (vector calculus)
- MATH2400 or equivalent (linear algebra)
If you haven’t yet taken the course CS prerequisites, you may be able to get by going over the review topics in chapters 2, 5, and 6 of the course textbook.
Structure: We will build upon these foundations and cover a more in depth study suited for machine learning problems. Each focus area will be structured in three parts as (1) review of prior material, (2) new ML fundamentals, and (3) an ML example. The review will quickly go over concepts that were already covered in a previous course. The ML fundamentals will introduce the advanced concepts for machine learning. The example will show you how these fundamentals are used in practice. These focus areas are:
- Probability & statistics. Review: probability spaces and discrete probability. Fundamentals: continuous probability. Example: generalization bounds.
- Linear & functional analysis. Review: linear algebra. Fundamentals: function spaces. Example: representer theorems.
- Calculus & optimization. Review: Multivariate calculus. Fundamentals: optimization. Example: convergence rates.
We will accompany these topics with several examples demonstrating how these core techniques are used to prove fundamental results about machine learning algorithms.
Grading: There will be approximately 10 homeworks (estimated weekly) totaling 50% of your grade. There will also be 3 midterms at 15% each, one per focus area. 5% for participation.
A template for your homework solutions can be found here. Homeworks are due a week after they are assigned.
Schedule
Tentative schedule.
Date | Topic | Notes | |
---|---|---|---|
August 30 | Overview | (1.1, extra notes) | |
September 4 | Labor day (no class) | ||
Probability & statistics | (probability lecture notes) | ||
September 6 | Review | Discrete + Continuous Probability Reading: Chapters 6.1, 6.2 Homework: Problems 6.1, 6.4, 6.11 (due September 13) |
|
September 11 | Review | Discrete + Continuous Probability Reading: Chapters 6.3, 6.4 |
|
September 13 | Fundamentals | Mean and Variance, Gaussian distribution Reading: Chapters 6.4, 6.5 Homework: Problems 6.5, 6.7, 6.9 (due September 20) |
|
September 18 | Fundamentals | Exponential Distributions and Conjugacy Reading: 6.6 |
|
September 20 | Fundamentals | Concentration inequalities (Markov, Chebyshev, WLLN) (concentration lecture notes) Homework: Problems 6.3, 6.12abd, MGF/Chernoff (due September 27) |
|
September 25 | Example | Generalization bounds | |
September 27 | Example | Generalization bounds | |
October 2 | Midterm 1 | ||
Linear & functional analysis | |||
October 4 | Review | Linear algebra (2.2,2.4) (linear algebra lecture notes) Homework: 2.1, 2.3, 2.9 (due October 11) |
|
October 9 | Review | Linear algebra (2.5,2.6) | |
October 11 | Fundamentals | Change of Basis (2.7) Homework: 2.10, 2.16, 2.19 (due October 18) |
|
October 16 | Fundamentals | Inner product spaces and Orthogonality (3.1-3.8) | |
October 18 | Fundamentals | Decompositions (4.1, 4.2, 4.4) Homework: 3.1, 3.7, 4.11 (due October 25) |
|
October 23 | No Class | ||
October 25 | Example | Functional analysis, Hilbert spaces, Kernels (12.4) (representer lecture notes) |
|
October 30 | Example | Representer theorems | |
November 1 | Midterm 2 | ||
Calculus & optimization | |||
November 6 | Review | Multivariate calculus (5.1-5.4) (calculus notes) |
|
November 8 | Review | Multivariate calculus (5.5-5.7) Homework: 5.4, 5.5, 5.9 (due November 15) |
|
November 13 | Fundamentals | Multivariate Taylor Series (5.8-5.9) | |
November 15 | Fundamentals | Gradient Descent (7.1)) (continuous optimization notes) Homework: 7.4, 7.5, 7.8, 7.9 (due November 29) |
|
November 20 | Fundamentals | Constrained and Convex Optimization (7.2-7.3) | |
November 22 | Friday class schedule (no class) | ||
November 27 | Fundamentals | Conjugates & Taylor’s Theorem (SGD convergence notes) |
|
November 29 | Example | Convergence analysis | |
December 4 | Example | Convergence analysis | |
December 6 | Midterm 3 | ||
December 11 | No class | ||
December 21 | Term ends |