Statistics for Data Science

Objectives

This course covers techniques of multivariate statistical analysis and time series analysis. Given a data set and a particular goal, students should be able to choose the appropriate methodology and critically assess the results obtained. They should also know the advantages, limitations and conditions of use of the data analysis methods presented in the course.

General characterization

Code

200178

Credits

7.5

Responsible teacher

Jorge Morais Mendes

Hours

Weekly - Available soon

Total - Available soon

Teaching language

Portuguese. If there are Erasmus students, classes will be taught in English.

Prerequisites

Statistics and linear algebra (recommended)

Bibliography

  • Everitt, B. and Hothorn, T. (2011). An Introduction to Applied Multivariate Analysis with R, Springer;
  • Johnson, R.A. and Wichern, D.W. (2007). Applied Multivariate Statistical Analysis, 6th edition, Pearson Prentice Hall;
  • Sharma, S. (1996). Applied Multivariate Techniques, John Wiley & Sons;
  • Kutner, M.H., Nachtsheim, C.J., Neter, J. and Li, W. (2004). Applied Linear Statistical Models, 5th edition, McGraw-Hill.

Teaching method

The course combines theoretical and practical classes. Classes focus on solving problems and exercises.

Evaluation method

  • (60%) Final exam (1st or 2nd round dates)
  • (40%) Project

Remarks: A minimum grade of 9.5 points is required in the final exam.

Subject matter

1. Introduction to Multivariate Statistical Data Analysis

2. Fundamentals of data manipulation - introducing the R software

3. Multivariate normal distribution

4. Graphical representation of multivariate data

5. Principal components analysis

6. (Exploratory) Factor Analysis

7. Cluster analysis

8. Linear regression methods

9. Time series analysis

Programs

Programs where the course is taught: