Data Analysis

Objectives

This course covers techniques of multivariate statistical analysis. Given a data set and a particular goal, students should be able to choose the appropriate methodology and critically assess the results obtained.
They should also know the advantages, limitations and conditions of use of the various data analysis methods presented in the course.

General characterization

Code

200001

Credits

7.5

Responsible teacher

Paulo Jorge Mota de Pinho Gomes

Hours

Weekly - Available soon

Total - Available soon

Teaching language

Portuguese. If Erasmus students are enrolled, classes will be taught in English.

Prerequisites

  1. Not applicable.

Bibliography

Branco, João - Uma Introdução à Análise de Clusters, Sociedade Portuguesa de Estatística, 2004; Bry, X. - Analyses Factorielles Multiples, Economica, Poche/Techniques quantitatives, 1996; Escofier, B., Pagès, J. - Analyses Factorielles Simples et Multiples, Dunod, 1990; Gomes, Paulo - Análise de Dados, ISEGI, 1993; Lebart, L., Morineau, A., Warwick, K. - Multivariate D

Teaching method

  1. The Data Analysis course is presented with PowerPoint slides, adopting a heuristic, theoretical-practical approach in which students are invited to understand to what extent multivariate statistical techniques can produce innovative reports on the behaviour of individuals or on the variables under study and their interrelations. The focus is on data analysis applications in very different fields of knowledge, while taking into account the statistical limits of the methods described when treating more complex data matrices.
  2. Some hands-on sessions will be held in a computer room, working with various real multivariate data sets and applying SAS software (SAS Enterprise Guide).
  3. Additionally, students are invited to ask specific and general questions in each session and by e-mail, feeding a FAQ system that supports their learning.

Evaluation method

  1. The evaluation consists of two data analysis assignments and a final exam.
  2. The first assignment will be set in the middle of the course and will assess the degree of knowledge and understanding of Principal Component Analysis (PCA) and its adaptation to the analysis of contingency tables (Correspondence Factor Analysis, CFA), namely the concepts and definitions inherent to these methods and, in particular, the set of indicators useful for data interpretation.
  3. The second assignment will be set two weeks before the end of the course. Its main goal is the presentation of a multivariate statistical analysis of real data, in which students are expected to apply one or more appropriate techniques to the data using SAS software (SAS Enterprise Guide).
  4. Assignments can be done individually or in previously organized groups of no more than three people.
  5. The first assignment has weight 0.2, the second has weight 0.4 and the final exam has weight 0.4.
  6. Minimum grade for any assignment or the final exam: eight points (40%).
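
The weighting in items 5 and 6 can be illustrated with a short calculation. The sketch below is only an illustration: it assumes a 0-20 grading scale (implied by eight points being 40%) and assumes that the final grade is the weighted sum of the three graded components. The function name `final_grade` and the sample marks are hypothetical. How a component below the minimum is handled is not stated here, so the code merely reports whether the minimum is met.

```python
# Weights from the evaluation rules: first component, second component, final exam.
WEIGHTS = (0.2, 0.4, 0.4)
MINIMUM = 8.0  # eight points, i.e. 40% of an assumed 0-20 scale


def final_grade(first, second, exam):
    """Return (weighted grade, True if every component meets the minimum)."""
    marks = (first, second, exam)
    grade = sum(w * m for w, m in zip(WEIGHTS, marks))
    meets_minimum = all(m >= MINIMUM for m in marks)
    return grade, meets_minimum


# Hypothetical marks on a 0-20 scale.
grade, ok = final_grade(14, 12, 10)
print(round(grade, 2), ok)
```

With these hypothetical marks the weighted sum is 0.2 x 14 + 0.4 x 12 + 0.4 x 10, and all three components are above the eight-point minimum.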

Subject matter

  1. Introduction
  2. Descriptive Principal Component Analysis (PCA): scope of the method; a geometric interpretation of PCA; fitting data points in p-dimensional space; indicators for data interpretation; extension: three-way PCA; applications.
  3. Correspondence factor analysis as a particular case of PCA with a chi-squared distance; generalization: Multiple Correspondence Analysis; applications.
  4. Clustering methods: non-hierarchical methods; hierarchical methods; applications.
  5. Use of SAS software (SAS Enterprise Guide) for the statistical treatment of real multivariate data.
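
Item 3 refers to a chi-squared distance. As a minimal sketch, assuming the usual definition in correspondence analysis (squared differences between row profiles, each weighted by the inverse of the column's share of the grand total), the distance between two rows of a contingency table could be computed as follows. The function name `chi2_distance` and the small table are hypothetical, every row and column is assumed to contain at least one nonzero count, and the sketch uses plain Python rather than the SAS software used in the course.

```python
from math import sqrt


def chi2_distance(table, i, k):
    """Chi-squared distance between the profiles of rows i and k.

    table is a list of rows of non-negative counts (a contingency table).
    """
    total = sum(sum(row) for row in table)
    # Column masses: share of the grand total falling in each column.
    col_mass = [sum(row[j] for row in table) / total
                for j in range(len(table[0]))]
    # Row profiles: each row divided by its own total.
    profile_i = [x / sum(table[i]) for x in table[i]]
    profile_k = [x / sum(table[k]) for x in table[k]]
    return sqrt(sum((a - b) ** 2 / m
                    for a, b, m in zip(profile_i, profile_k, col_mass)))


# Hypothetical 3 x 3 contingency table (rows: groups, columns: categories).
counts = [
    [10, 20, 30],
    [20, 40, 60],  # same profile as the first row
    [30, 20, 10],
]
print(chi2_distance(counts, 0, 1))  # rows with identical profiles
print(chi2_distance(counts, 0, 2))  # rows with different profiles
```

Rows 0 and 1 have proportional counts, so their profiles coincide and the distance between them is zero even though their totals differ. This is the sense in which the analysis compares profiles rather than raw frequencies.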