Applied Multivariate Data Analysis

Objectives

Knowledge and understanding of the main techniques of Multivariate Descriptive Analysis. Presentation of several applications of these techniques, in which we will carry out univariate, bivariate and multivariate statistical analyses of data with quantitative, categorical, ordinal or mixed variables.

Use of SAS software (JMP Pro and Enterprise Guide).

General characterization

Code

200186

Credits

7.5

Responsible teacher

Hours

Weekly - Available soon

Total - Available soon

Teaching language

Portuguese. If there are Erasmus students, classes will be taught in English.

Prerequisites

Basic background in univariate and bivariate descriptive statistical analysis and in linear algebra.

Bibliography

Teaching method

Emphasis will be placed on presenting in detail the statistical approach underlying the multivariate methodologies applied, so that students understand the origin of the results they will analyse. Strong emphasis will also be placed on the interpretation of outputs, linking them to the more theoretical aspects of the methodologies and to the context from which the data come. Work assignment 1 focuses on this link between theory and practice in the context of learning PCA.

Great importance will also be given to the practical application of the multivariate methodologies under study, using statistical software (JMP Pro and/or SAS Enterprise Guide) on both academic and real-world datasets to generate outputs and interpret them. Work assignment 2 focuses more on applying statistical software to the analysis of real-world data.

To further enrich the students' experience, help them better understand the multivariate statistical methods and prepare them to apply these methods in a real context, several published papers will be made available. To support students throughout the course, a set of Frequently Asked Questions and of solved exercises and examples will also be made available.

Finally, both teachers are always available to answer questions and will schedule extra support sessions whenever students need them.

Evaluation method

The evaluation consists of two data analysis work assignments and a final exam. The first assignment evaluates the degree of knowledge and understanding of Principal Component Analysis (PCA), namely the concepts and definitions inherent to this method and, in particular, a set of indicators useful for data interpretation. The main goal of the second assignment is the presentation of a multivariate statistical analysis of real data, in which students are expected to apply one or more adequate techniques to the data using software (JMP Pro and/or SAS Enterprise Guide). Assignments may be done individually or in groups of no more than three people.

The final exam will be taken in person and will have the following structure:

PCA - 3 questions, one of them on the interpretation of outputs.

FCA - 2 questions, one of them on the interpretation of outputs.

MFCA - 2 questions.

Cluster Analysis - 1 question.

Work assignments 1 and 2 are mandatory and always count towards the final grade: the first assignment has a weight of 0.2, the second assignment 0.4 and the final exam 0.4. A minimum classification of eight points in each assignment and in the exam is required for approval.
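As an illustration only, the weighting rule can be written as a short Python sketch. The function name `final_grade`, the constant names and the example scores are hypothetical. Scores are assumed to be on the same scale as the eight-point minimum (presumably 0 to 20, which this page does not state), and the overall passing grade is not given here, so the sketch only reports the weighted grade and whether every component reaches the minimum.

```python
# Weights and minimum taken from the evaluation rule; names are illustrative.
WEIGHTS = {"work1": 0.2, "work2": 0.4, "exam": 0.4}
MIN_SCORE = 8.0  # minimum classification required in each component


def final_grade(work1, work2, exam):
    """Return (weighted grade, True if every component meets the minimum)."""
    scores = {"work1": work1, "work2": work2, "exam": exam}
    grade = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    meets_minimums = all(score >= MIN_SCORE for score in scores.values())
    return grade, meets_minimums


# Hypothetical scores, for illustration only.
grade, meets_minimums = final_grade(14, 16, 12)
print(round(grade, 2), meets_minimums)
```

With these example scores the weighted grade is 0.2 × 14 + 0.4 × 16 + 0.4 × 12 = 14, and all three components are above the minimum.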

Subject matter

1-Initial examination of univariate, bivariate and multivariate data.

2-Characterization of a p-dimensional cluster of individuals: centroid and total inertia.

3-Principal Component Analysis (PCA): scope of the method, geometric analysis and indicators for interpretation. Applications.

4-Factorial Correspondence Analysis as a particular case of PCA with a chi-square distance. Applications.

5-Generalization: Multiple Factorial Correspondence Analysis. Applications.

6-Data Clustering Methods: agglomerative hierarchical methods and center-based clustering algorithms covering different types of variables.

7-Use of SAS (JMP and Enterprise Guide) for the statistical treatment of multivariate data.
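To give a concrete idea of topic 2, the centroid and total inertia of a set of individuals can be computed as in the minimal sketch below. It is written in Python rather than in the SAS tools used in the course, and it assumes equal weights of 1/n for all individuals and the usual Euclidean distance. The function names and the small dataset are hypothetical.

```python
def centroid(points, weights=None):
    """Weighted mean point of n individuals described by p variables."""
    n = len(points)
    if weights is None:
        weights = [1.0 / n] * n  # assumption: equal weights summing to 1
    p = len(points[0])
    return [sum(w * x[j] for w, x in zip(weights, points)) for j in range(p)]


def total_inertia(points, weights=None):
    """Weighted sum of squared Euclidean distances to the centroid."""
    n = len(points)
    if weights is None:
        weights = [1.0 / n] * n
    g = centroid(points, weights)
    return sum(
        w * sum((xj - gj) ** 2 for xj, gj in zip(x, g))
        for w, x in zip(weights, points)
    )


# Three hypothetical individuals measured on two variables.
data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
g = centroid(data)
inertia = total_inertia(data)
print(g, inertia)
```

Under these assumptions the total inertia equals the sum of the variances of the variables (computed with divisor n), which is the quantity that PCA decomposes along its principal axes.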