Statistical Modelling course


Third-year course of the B.Sc. in Artificial Intelligence (Universities of Milano-Bicocca, Milano and Pavia).
Academic year 2025/2026, 1st semester.


Communications
  • Please sign up for the course on the Kiro platform to receive communications via email.

Contacts

laura.dangelo@unimib.it


Calendar

(may be subject to change)



Course material
Lecture notes

The course material is largely based on the book “Modello lineare - Teoria e Applicazioni con R” (Grigoletto M., Pauli F., Ventura L., 2017), and on the course material kindly provided by Prof. Nicola Sartori and Prof. Bernardo Nipoti.

30 Sep - Class 1
Review of probability and statistics.
Probability review (1) - Probability review (2)
Statistics review (1) - Statistics review (2)

02 Oct - Class 2
Introduction: role of the variables; phases of the analysis; types of models.
Simple linear model via ordinary least squares: definition, estimate, interpretation of the parameters, descriptive and inferential properties.
Introduction
Simple linear model via OLS - Properties
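
As a small illustration of the ordinary least squares estimate covered in this class, the sketch below computes the intercept and slope of a simple linear model in closed form. It is not taken from the lecture notes: the function name `ols_simple` and the data are made up for the example, and only the Python standard library is assumed.

```python
from statistics import fmean


def ols_simple(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    x_bar, y_bar = fmean(x), fmean(y)
    # Sum of cross-products and sum of squares of x around their means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b1, b2 = ols_simple(x, y)  # b1: intercept, b2: slope
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]

print(f"intercept = {b1:.4f}, slope = {b2:.4f}")
print(f"sum of residuals = {sum(residuals):.2e}")
```

The residuals sum to zero (up to rounding) and are uncorrelated with the regressor, two of the descriptive properties of the OLS fit when an intercept is included.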

07 Oct - Class 3
Simple Gaussian linear model: definition, estimation via likelihood.
Exact distribution of the maximum likelihood estimators.
Inference about the regression coefficients (confidence intervals, tests). Inference about the mean (prediction).
Simple Gaussian LM - Inference about beta - Prediction

09 Oct - Class 4
Partition of the sum of squares. Coefficient of determination R².
Test about the overall model; equivalence with the test about the significance of β₂.
Decomposition of the sum of squares - Coefficient R²
Test about the overall model - Equivalence with the test about β₂
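
The two results of this class can be checked numerically. The sketch below, on invented data and using only the Python standard library, verifies that the total sum of squares splits into a regression part and a residual part, and that in the simple linear model the F statistic for the overall model equals the square of the t statistic for the slope. Variable names are choices made for this example, not notation from the lecture notes.

```python
from statistics import fmean

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# OLS fit of the simple linear model (b1: intercept, b2: slope).
x_bar, y_bar = fmean(x), fmean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b2 = sxy / sxx
b1 = y_bar - b2 * x_bar
fitted = [b1 + b2 * xi for xi in x]

# Partition of the sum of squares.
ss_total = sum((yi - y_bar) ** 2 for yi in y)
ss_regression = sum((fi - y_bar) ** 2 for fi in fitted)       # explained part
ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained part

r_squared = ss_regression / ss_total

# Unbiased estimate of the error variance, with n - 2 degrees of freedom.
s2 = ss_residual / (n - 2)

t_stat = b2 / (s2 / sxx) ** 0.5          # test about the slope
f_stat = (ss_regression / 1) / s2        # test about the overall model

print(f"R^2 = {r_squared:.4f}")
print(f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```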

14 Oct - Class 5 *
Exercises on the simple Gaussian linear model.
Exercises simple LM
Solutions exercise 1

16 Oct - Class 6
Analysis of the residuals (descriptive properties, distribution, types of residuals).
Diagnostics (residuals vs. fitted, ECDF, normal Q-Q plot).
Analysis of the residuals - Diagnostics

21 Oct - Class 7 *
Exercises on the simple Gaussian linear model (continuation of Class 5).
Solutions exercise 2 - Example in R
Solutions exercises 3 and 4

23 Oct - Class 8
Multiple Gaussian linear model: specification, assumptions, estimation.
Model specification and assumptions - Estimation

28 Oct - Class 9
Geometric interpretation of the linear model.
Distribution of the estimators in the multiple Gaussian linear model.
Geometric interpretation
Distribution of the estimators

30 Oct - Class 10
Gauss-Markov theorem.
Inference in the multiple Gaussian linear model: test about an individual coefficient, test about a subset of coefficients.
Geometric interpretation of the test for comparing nested models.
Special cases: test about βⱼ, test about the overall significance, equivalence with the test about R².
Gauss-Markov theorem
Test about βⱼ - Test about a subset of β
Geometric interpretation - Special cases

04 Nov - Class 11 *
Adjusted R².
Exercises on the multiple linear model.
Exercises multiple LM
Solutions exercise 1 - Solutions exercise 2

06 Nov - Class 12
Cuckoo exercise: test about the equality of the means of two Gaussian populations with equal variances.
Cuckoo exercise via a simple Gaussian linear model.
General formulation of the analysis of variance (ANOVA) - part 1.
Cuckoo exercise via t test - Cuckoo exercise via LM
Cuckoo exercise in R - Data
ANOVA
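
The equivalence behind the two versions of the cuckoo exercise can be sketched as follows: the two-sample t statistic with pooled variance coincides with the t statistic for the slope of a simple linear model whose regressor is a 0/1 group indicator. The data below are invented and are not the cuckoo data; the function names `pooled_t` and `slope_t` are hypothetical, and only the Python standard library is assumed.

```python
from statistics import fmean

# Toy measurements for two groups, invented for illustration only.
group_a = [21.0, 22.5, 23.1, 22.0]
group_b = [24.2, 23.8, 25.0, 24.5, 23.9]


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances in the two groups."""
    na, nb = len(a), len(b)
    ma, mb = fmean(a), fmean(b)
    within_ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    s2 = within_ss / (na + nb - 2)  # pooled variance estimate
    return (mb - ma) / (s2 * (1 / na + 1 / nb)) ** 0.5


def slope_t(x, y):
    """t statistic for the slope of a simple linear model fitted by OLS."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)
    return slope / (s2 / sxx) ** 0.5


# Indicator variable: 0 for group A, 1 for group B.
x = [0.0] * len(group_a) + [1.0] * len(group_b)
y = group_a + group_b

t_two_sample = pooled_t(group_a, group_b)
t_regression = slope_t(x, y)

print(f"two-sample t = {t_two_sample:.6f}")
print(f"regression t = {t_regression:.6f}")
```

With the indicator coding, the estimated slope is the difference between the two group means and the residual sum of squares is the within-group sum of squares, which is why the two statistics agree.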

11 Nov - Class 13
One-way ANOVA (part 2), decomposition of the sum of squares with grouped data.
Two-way ANOVA with and without interaction.
Analysis of covariance (ANCOVA).
ANOVA part 2 - Two-way ANOVA
ANCOVA

13 Nov - Class 14 *
Exercises on the multiple linear model (part 2).
Exercises
Solutions exercise 3

18 Nov - Class 15
Introduction to generalized linear models.
Poisson regression: assumptions, interpretation of the parameters, estimation, inference (test about an individual coefficient, test for comparing nested models, test about the overall significance, goodness of fit).
Introduction to GLMs
Poisson regression (1) - Poisson regression (2)

20 Nov - Class 16
Introduction to GLMs for binary data.
Logistic regression: assumptions, interpretation of the parameters, estimation, inference (test about an individual coefficient, test for comparing nested models, test about the overall significance, deviance).
Probit regression and interpretation as a threshold model.
Models for binary data
Logistic regression (1) - Logistic regression (2)
Probit model

25 Nov - Class 17 *
Exercises on the multiple linear model (part 3).
Solutions exercise 1 - Solutions exercise 2

27 Nov - Class 18 *
Exercises on generalized linear models.
Exercises
Solutions exercise 1 - Solutions exercise 2 - Solutions exercise 3

04 Dec - Class 19 *
Exercises on generalized linear models (part 2).
Exercises
Solutions exercise 1 - Solutions exercise 2


Past Exams

Exam practice - Exam 00
25 Jan 2024 - Exam 01
22 Feb 2024 - Exam 02
27 Jun 2024 - Exam 03
23 Jul 2024 - Exam 04
03 Sep 2024 - Exam 05
24 Sep 2024 - Exam 06

Suggested books

Fox, J., 2015. Applied regression analysis and generalized linear models. Sage Publications.

Abraham, B. and Ledolter, J., 2006. Introduction to regression modeling. Duxbury Press. (pdf)

Recordings

(the ones I don’t forget to do)
Folder