Applied Multivariate Data Analysis
Objectives
Knowledge and understanding of the main techniques of Multivariate Descriptive Statistical Analysis (MDSA).
Presentation of numerous applications of these techniques, developing univariate, bivariate and multivariate statistical analyses of data with quantitative variables, qualitative variables, or both.
Use of SAS software (JMP Pro and Enterprise Guide).
General characterization
Code
200186
Credits
7.5
Responsible teacher
Leonor Bacelar Valente da Costa Nicolau
Hours
Weekly - Available soon
Total - Available soon
Teaching language
Portuguese. If there are Erasmus students, classes will be taught in English.
Prerequisites
Bibliography
Teaching method
Emphasis will be placed on presenting in detail the statistical approach underlying the multivariate methodologies applied, so that students understand the origin of the results they will analyse. Strong emphasis will also be placed on the interpretation of results, linking them both to the more theoretical aspects of the methodologies and to the context from which the data come. Work assignment 1 will focus on this link between theory and practice.
High importance will also be given to the practical application of the multivariate methodologies under study, using statistical software (JMP Pro and SAS Enterprise Guide) on both academic and real-world datasets to generate outputs for all of those methodologies. Work assignment 2 will focus on students' ability to apply statistical software to the analysis of real-world data and to interpret the results adequately.
To further enrich students' experience, help them understand the methodologies, and prepare them to apply them in a real context, several published papers that use the methodologies under study will be made available. To further support students' study, a set of Frequently Asked Questions (continuously updated with new questions from students), together with solved exercises and examples, will also be made available.
Finally, both teachers are always available to answer questions, both by scheduling extra support sessions whenever students need them and through an AMA WhatsApp group where teachers and students are free to communicate 24/7.
Evaluation method
Evaluation
The evaluation procedure comprises two data analysis work assignments and a final exam. The first assignment evaluates the degree of knowledge and understanding of Principal Component Analysis (PCA), namely the concepts and definitions inherent to the method and, in particular, a set of indicators useful for interpreting data. The main goal of the second assignment is the presentation of a multivariate statistical analysis of real data, in which students are expected to apply one or more adequate techniques to the data using software (JMP Pro and/or Enterprise Guide). Assignments may be done individually or in organized groups of no more than three people.
The final exam will be taken online on Moodle, with a total duration of 120 minutes and the browser lockdown option activated during the exam. Questions will be presented sequentially, one per page, with no possibility of going back to change an answer once it is submitted. Each question will be selected randomly from several themed groups of possible questions, with identical difficulty levels within each group.
The structure of the exam will be the following:
PCA: 3 multiple-answer questions, graded 2.0 each, and an open-ended question to interpret outputs, graded 5.0.
FCA: 2 multiple-answer questions, graded 1.5 each, and an open-ended question to interpret outputs, graded 3.0.
MFCA: 2 multiple-answer questions, graded 1.5 each.
Work assignments 1 and 2 always count towards the final grade: the first assignment has weight 0.2, the second assignment weight 0.4, and the final exam weight 0.4. A minimum grade of eight points (40%) in each assignment and in the final exam is required for approval.
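As an illustration of the arithmetic above, the following minimal sketch adds up the exam points and combines the three components with the stated weights. It assumes a 0 to 20 grading scale (inferred from "eight points (40%)"); the names `EXAM_STRUCTURE`, `exam_max_points` and `final_grade` are hypothetical, and no overall pass mark is modelled because none is stated here.

```python
# Exam structure as listed above: (number of questions, points per question).
EXAM_STRUCTURE = {
    "PCA": {"multiple_answer": (3, 2.0), "open_ended": (1, 5.0)},
    "FCA": {"multiple_answer": (2, 1.5), "open_ended": (1, 3.0)},
    "MFCA": {"multiple_answer": (2, 1.5)},
}

# Weights and per-component minimum as stated in the evaluation rules.
WEIGHTS = {"work1": 0.2, "work2": 0.4, "exam": 0.4}
MINIMUM = 8.0  # assumption: grades are on a 0-20 scale, so 8 points = 40%


def exam_max_points(structure=EXAM_STRUCTURE):
    """Total points available in the exam."""
    return sum(
        count * points
        for block in structure.values()
        for count, points in block.values()
    )


def final_grade(work1, work2, exam):
    """Return (weighted grade, whether every component meets the minimum)."""
    grades = {"work1": work1, "work2": work2, "exam": exam}
    weighted = round(sum(WEIGHTS[k] * grades[k] for k in WEIGHTS), 2)
    meets_minimums = all(g >= MINIMUM for g in grades.values())
    return weighted, meets_minimums


if __name__ == "__main__":
    print(exam_max_points())        # 11 (PCA) + 6 (FCA) + 3 (MFCA) points
    print(final_grade(14, 16, 12))  # all components above the minimum
    print(final_grade(18, 18, 7.5)) # exam below the 8-point minimum
```

Under these assumptions, a student with 18 in both assignments but 7.5 in the exam would have a weighted grade of 13.8 and still fail the minimum-grade condition.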
Subject matter
1. Basics of Multivariate Descriptive Statistical Analysis (MDSA).
2. Initial examination of multivariate data.
3. Characterization of the variable space and the individual space.
4. Characterization of the p-dimensional cluster of individuals: centroid of the cluster and total inertia.
5. Principal Component Analysis: scope of the method; a geometric analysis and indicators for interpretation. Applications.
6. Factorial Correspondence Analysis as a particular case of PCA with a chi-square distance. Applications.
7. Generalization: Multiple Factorial Correspondence Analysis. Applications.
8. Data clustering methods: agglomerative hierarchical methods and center-based clustering algorithms. Applications.
9. Use of SAS (JMP and Enterprise Guide) for the statistical treatment of multivariate data.
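To make topic 4 concrete, here is a minimal sketch of the centroid and total inertia of a cloud of individuals. It is written in plain Python rather than the SAS tools used in the course, and it assumes equal weights 1/n for all individuals and the ordinary Euclidean distance; the function names and the small dataset are hypothetical.

```python
from statistics import pvariance


def centroid(points):
    """Mean of each variable over all individuals (rows)."""
    n = len(points)
    p = len(points[0])
    return [sum(row[j] for row in points) / n for j in range(p)]


def total_inertia(points):
    """Average squared Euclidean distance of the individuals to the centroid.

    Assumption: every individual has the same weight 1/n.
    """
    g = centroid(points)
    n = len(points)
    return sum(
        sum((x - gj) ** 2 for x, gj in zip(row, g)) for row in points
    ) / n


if __name__ == "__main__":
    # Hypothetical data: 3 individuals described by p = 2 quantitative variables.
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(centroid(data))
    print(total_inertia(data))
    # With equal weights, total inertia equals the sum of the variables'
    # (population) variances.
    columns = list(zip(*data))
    print(sum(pvariance(col) for col in columns))
```

Under these assumptions the total inertia coincides with the sum of the population variances of the variables, which is the quantity that Principal Component Analysis decomposes along its axes.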
Programs
Programs where the course is taught:
- Specialization in Information Analysis and Management
- Specialization in Risk Analysis and Management
- Specialization in Business Intelligence
- Specialization in Information Systems
- Specialization in Digital Transformation
- Specialization in Business Intelligence - Working Hours
- Specialization in Information Systems - Working Hours
- PostGraduate in Information Analysis and Management
- PostGraduate in Risk Analysis and Management
- PostGraduate in Business Intelligence
- PostGraduate in Smart Cities
- PostGraduate in Information Management and Business Intelligence in Healthcare
- PostGraduate in Information Systems Management
- PostGraduate in Enterprise Information Systems
- PostGraduate in Digital Transformation