Statistical Modelling course
Third-year course of the B.Sc. in Artificial Intelligence (Universities of Milano-Bicocca, Milano and Pavia).
Academic year 2025/2026, 1st semester.
Communications
- Please sign up for the course on the Kiro platform to receive communications via email.
Contacts
Calendar
(may be subject to change)
Course material
Lecture notes
The course material is largely based on the book “Modello lineare - Teoria e Applicazioni con R” (Grigoletto M., Pauli F., Ventura L., 2017), and on the course material kindly provided by Prof. Nicola Sartori and Prof. Bernardo Nipoti.
30 Sep - Class 1
Review of probability and statistics.
Probability review (1) - Probability review (2)
Statistics review (1) - Statistics review (2)
02 Oct - Class 2
Introduction: role of the variables; phases of the analysis; types of models.
Simple linear model via ordinary least squares: definition, estimate, interpretation of the parameters, descriptive and inferential properties.
Introduction
Simple linear model via OLS - Properties
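As a rough illustration of the ordinary least squares estimates covered in this class, the sketch below fits a simple linear model in plain Python. The data are invented for the example, and the names `ols_simple`, `b1` (intercept) and `b2` (slope) are assumptions, not notation taken from the lecture notes.

```python
from statistics import mean

def ols_simple(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares of x, both centred at the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b1, b2 = ols_simple(x, y)
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]

# Descriptive properties of OLS: residuals sum to zero and are orthogonal to x
sum_res = sum(residuals)
sum_res_x = sum(ri * xi for ri, xi in zip(residuals, x))
```

Up to floating-point error, `sum_res` and `sum_res_x` are both zero, which is one way to check the descriptive properties of the fit numerically.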
07 Oct - Class 3
Simple Gaussian linear model: definition, estimation via likelihood.
Exact distribution of the maximum likelihood estimators.
Inference about the regression coefficients (confidence intervals, tests). Inference about the mean (prediction).
Simple Gaussian LM - Inference about beta - Prediction
09 Oct - Class 4
Partition of the sum of squares. Coefficient of determination R².
Test about the overall model; equivalence with the test about the significance of β₂.
Decomposition of the sum of squares - Coefficient R²
Test about the overall model - Equivalence with the test about β₂
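The following sketch checks the two facts from this class numerically on invented data: the total sum of squares splits into a regression part and a residual part, and in the simple linear model the F statistic for the overall model equals the square of the t statistic for the slope. All variable names (`sst`, `ssr`, `sse`, `f_stat`, `t_stat`) are assumptions chosen for the example.

```python
from math import sqrt
from statistics import mean

# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# OLS estimates of intercept (b1) and slope (b2)
x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b1 = y_bar - b2 * x_bar
fitted = [b1 + b2 * xi for xi in x]

# Partition of the sum of squares: total = regression + residual
sst = sum((yi - y_bar) ** 2 for yi in y)
ssr = sum((fi - y_bar) ** 2 for fi in fitted)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Coefficient of determination
r2 = 1 - sse / sst

# F statistic for the overall model (1 and n - 2 degrees of freedom)
s2 = sse / (n - 2)            # unbiased estimate of the error variance
f_stat = (ssr / 1) / s2

# t statistic for the slope
t_stat = b2 / sqrt(s2 / sxx)
```

On these data `sst` equals `ssr + sse` and `f_stat` equals `t_stat ** 2`, both up to rounding error.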
14 Oct - Class 5 *
Exercises on the simple Gaussian linear model.
Exercises simple LM
Solutions exercise 1
16 Oct - Class 6
Analysis of the residuals (descriptive properties, distribution, types of residuals).
Diagnostics (residuals vs. fitted, ECDF, normal Q-Q plot).
Analysis of the residuals - Diagnostics
21 Oct - Class 7 *
Exercises on the simple Gaussian linear model (continuation of Class 5).
Solutions exercise 2 - Example in R
Solutions exercises 3 and 4
23 Oct - Class 8
Multiple Gaussian linear model: specification, assumptions, estimation.
Model specification and assumptions - Estimation
28 Oct - Class 9
Geometric interpretation of the linear model.
Distribution of the estimators in the multiple Gaussian linear model.
Geometric interpretation
Distribution of the estimators
30 Oct - Class 10
Gauss-Markov theorem.
Inference in the multiple Gaussian linear model: test about an individual coefficient, test about a subset of coefficients.
Geometric interpretation of the test for comparing nested models.
Special cases: test about βj, test about the overall significance, equivalence with the test about R².
Gauss-Markov theorem
Test about βj - Test about a subset of β
Geometric interpretation - Special cases
04 Nov - Class 11 *
Adjusted R².
Exercises on the multiple linear model.
Exercises multiple LM
Solutions exercise 1 - Solutions exercise 2
06 Nov - Class 12
Cuckoo exercise: test about the equality of the means of two Gaussian populations with equal variances.
Cuckoo exercise via a simple Gaussian linear model.
General formulation of the analysis of variance (ANOVA) - part 1.
Cuckoo exercise via t test - Cuckoo exercise via LM
Cuckoo exercise in R - Data
ANOVA
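The equivalence between the two approaches to the two-sample problem can be sketched as follows. The numbers are invented and are not the cuckoo data used in class; `group_a`, `group_b` and the 0/1 coding of the group indicator are assumptions made for the example.

```python
from math import sqrt
from statistics import mean

# Made-up measurements for two groups (NOT the course's cuckoo data)
group_a = [21.2, 22.0, 22.5, 23.1, 21.8]
group_b = [23.0, 23.9, 24.4, 22.8, 24.1, 23.6]
n_a, n_b = len(group_a), len(group_b)

# 1) Two-sample t test with pooled variance (equal variances assumed)
m_a, m_b = mean(group_a), mean(group_b)
ss_a = sum((v - m_a) ** 2 for v in group_a)
ss_b = sum((v - m_b) ** 2 for v in group_b)
s2_pooled = (ss_a + ss_b) / (n_a + n_b - 2)
t_pooled = (m_b - m_a) / sqrt(s2_pooled * (1 / n_a + 1 / n_b))

# 2) Simple linear model with a dummy regressor: x = 0 for group A, 1 for group B
x = [0.0] * n_a + [1.0] * n_b
y = group_a + group_b
n = len(y)
x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)
t_lm = slope / sqrt(s2 / sxx)
```

With this coding the intercept is the mean of group A, the slope is the difference between the two group means, and `t_lm` coincides with `t_pooled`.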
11 Nov - Class 13
One-way ANOVA (part 2), decomposition of the sum of squares with grouped data.
Two-way ANOVA with and without interaction.
Analysis of covariance (ANCOVA).
ANOVA part 2 - Two-way ANOVA
ANCOVA
13 Nov - Class 14 *
Exercises on the multiple linear model (part 2).
Exercises
Solutions exercise 3
18 Nov - Class 15
Introduction to generalized linear models.
Poisson regression: assumptions, interpretation of the parameters, estimation, inference (test about an individual coefficient, test for comparing nested models, test about the overall significance, goodness of fit).
Introduction to GLMs
Poisson regression (1) - Poisson regression (2)
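As a minimal sketch of maximum likelihood estimation in Poisson regression with a log link and a single covariate, the code below runs Newton-Raphson iterations by hand. The function name `fit_poisson`, the fixed number of iterations and the data are all assumptions made for illustration; the course material may present the estimation differently (for example as iteratively reweighted least squares).

```python
import math

def fit_poisson(x, y, n_iter=25):
    """Newton-Raphson for the model log E[Y] = b1 + b2 * x; returns (b1, b2)."""
    # Start from the intercept-only fit: exp(b1) equal to the sample mean of y
    b1, b2 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b1 + b2 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood)
        u1 = sum(yi - mi for yi, mi in zip(y, mu))
        u2 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric)
        i11 = sum(mu)
        i12 = sum(xi * mi for xi, mi in zip(x, mu))
        i22 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i11 * i22 - i12 * i12
        # Newton step: add (information matrix)^(-1) times the score
        b1 += (i22 * u1 - i12 * u2) / det
        b2 += (i11 * u2 - i12 * u1) / det
    return b1, b2

# Made-up count data, for illustration only
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 2, 2, 4, 6, 9]

b1, b2 = fit_poisson(x, y)
mu_hat = [math.exp(b1 + b2 * xi) for xi in x]

# At the maximum likelihood estimate the score equations are (numerically) zero
score_1 = sum(yi - mi for yi, mi in zip(y, mu_hat))
score_2 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu_hat))

# exp(b2) is the multiplicative change in the expected count per unit increase in x
rate_ratio = math.exp(b2)
```

Because the log link is canonical for the Poisson distribution, the observed and expected information coincide, so this Newton-Raphson scheme is the same as Fisher scoring.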
20 Nov - Class 16
Introduction to GLMs for binary data.
Logistic regression: assumptions, interpretation of the parameters, estimation, inference (test about an individual coefficient, test for comparing nested models, test about the overall significance, deviance).
Probit regression and interpretation as a threshold model.
Models for binary data
Logistic regression (1) - Logistic regression (2)
Probit model
25 Nov - Class 17 *
Exercises on the multiple linear model (part 3).
Solutions exercise 1 - Solutions exercise 2
27 Nov - Class 18 *
Exercises on generalized linear models.
Exercises
Solutions exercise 1 - Solutions exercise 2 - Solutions exercise 3
04 Dec - Class 19 *
Exercises on generalized linear models (part 2).
Exercises
Solutions exercise 1 - Solutions exercise 2
Past Exams
Exam practice - Exam 00
25 Jan 2024 - Exam 01
22 Feb 2024 - Exam 02
27 Jun 2024 - Exam 03
23 Jul 2024 - Exam 04
03 Sep 2024 - Exam 05
24 Sep 2024 - Exam 06
Suggested books
Fox, J., 2015. Applied regression analysis and generalized linear models. Sage Publications.
Abraham, B. and Ledolter, J., 2006. Introduction to regression modeling. Duxbury Press. (pdf)
Recordings
(only the classes I remember to record)
Folder