Professor: Filipe A. N. Verri

Email: filipe.verri@gp.ita.br

Updates

July 26, 2025: Classes are confirmed to start on August 4, 2025, in room 209 at ICT Unifesp.

Course Program

Brief history of data science. Fundamental data concepts. Methodologies for data science projects. Structured data, database normalization, and tidy data. Data handling operators and their properties. Learning from data and principles of statistical learning theory. Data preprocessing tasks. Evaluation and validation of data science products.

Course Information

Important: Only graduate students are permitted to enroll in this course.

Prerequisites

Goals

To provide the theoretical foundation and practical concepts needed to develop an end-to-end data science project for an inductive task.

Teaching Methodology

Lectures in a regular classroom, using a whiteboard, slide presentations, coding examples, books, and scientific papers. Supplementary teaching materials will be available on this page. The case study will be developed during home study hours, including programming and scientific paper writing.

Assessment

Grading Components

Final Grade Calculation

Final grades will be calculated as:

√((T₁ + T₂ + T₃)/3 × L)
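
As an illustration only, the calculation can be sketched in Python. The symbols are not defined in this section, so the sketch assumes that T₁, T₂, and T₃ are the three written tests listed in the schedule and that L is a separate graded component on the same scale. The function name `final_grade` is hypothetical.

```python
import math


def final_grade(t1: float, t2: float, t3: float, l: float) -> float:
    """Geometric mean of the average test grade and L.

    Assumption: t1..t3 are the written test grades and l is the
    remaining graded component, all on the same scale.
    """
    test_mean = (t1 + t2 + t3) / 3
    return math.sqrt(test_mean * l)


print(final_grade(8.0, 8.0, 8.0, 8.0))  # 8.0
print(final_grade(10, 10, 10, 0))       # 0.0
```

Because the formula is a geometric mean, a zero in either the test average or L yields a final grade of zero, regardless of the other component.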

Case Study Project

Ideally, 3 groups will be formed. Each group will be responsible for a case study. Students must choose a real-world problem and develop a data science project, including:

The results must be presented in a 30-minute presentation. Extra points will be awarded to groups that write a scientific paper about the case study. The trained models must be incorporated into a data science product, such as a web application, a mobile application, or a web service.

Bibliography

Any required extra material will be made available on this page.

Schedule

1st Quarter

Week  Topics
1     Chapter 1: A brief history of data science. Review: Mathematical foundations
2     Written test (60 min) and Chapter 2: Fundamental concepts
3     Chapter 3: Data science project
4-5   Chapter 4: Structured data
6-7   Chapter 5: Data handling
8     Written test (60 min) and Project discussions

2nd Quarter

Week  Topics
1     Chapter 6: Learning from data
2     Chapter 7: Data preprocessing
3     Chapter 8: Solution validation
4     Project discussions
5     Written test (60 min) and Project discussions
6-7   Project discussions
8     Presentations

Presentation Details

At most, 3 case studies will be presented per day, with 30 minutes for each presentation and 20 minutes for questions.

A break of 1 week will be observed between the 1st and 2nd quarters.