Data Management and Statistical Analysis


In clinical research, guaranteeing the quality of data is of the utmost importance. Before any data analysis, the databases used to collect the data must therefore be well designed. Software that validates data as they are recorded has made it easier to obtain accurate data. The first aim of "Data Management and Statistical Analysis" is to promote the acquisition of knowledge about the design of a relational database and about the construction of the interface that researchers will use. Microsoft Office Access has been chosen for this task. The second aim is data analysis: after being taught the basic concepts of Biostatistics, students learn how to use SPSS (Statistical Package for the Social Sciences).

Learning outcomes:
The training should develop several abilities. By the end of this course, the student should be able to:
a) Relational Databases:
- know what a database management system is
- build the logical data model
- build the physical data model through the preparation of tables and creation of relationships between them
- build forms
- build queries
b) Statistics:
- enter data in SPSS
- carry out an exploratory analysis of the data
- identify the statistical model that best fits the data
- implement the model using SPSS.

General characterization





Responsible teacher

Profª Doutora Ana Luisa Papoila


Teaching language

Bibliography

- Bland, M. (2000). An introduction to medical statistics. Third edition. Oxford University Press.
- Daniel, W.W. (2005). Biostatistics: A foundation for analysis in the health sciences. Eighth edition. John Wiley & Sons.
- Pestana, M.H. and Gageiro, J.N. (2005). Análise de dados para ciências sociais: A complementaridade do SPSS. Edições Sílabo, Lisboa.
- Carvalho, V., Azevedo, A. and Abreu, A. Software Obrigatório: Microsoft Access 2007. Centro Atlântico.

Teaching method

The theoretical and practical syllabus of the training unit is structured in order of increasing complexity. It begins with the study of tools for simple tasks, such as describing samples through exploratory analysis, and ends with more difficult tasks, such as the corresponding inferential analysis, where more complex statistical techniques are required.

Teaching and learning are based on the integration of:
a) Theoretical teaching: topics are presented by a teacher, and students are expected to participate;
b) Practical teaching: solving practical exercises using MS Access and SPSS.

Students and teachers interact either in the classroom or by e-mail. Classes last at most 120 minutes and take place in a classroom with one computer per student. The number of students may not exceed 20 (a limit imposed by the number of available computers).

Evaluation method

Evaluation is continuous, based on following up the student's knowledge and abilities, class attendance and frequency of participation in classes. In addition, there will be a formal written examination.

Subject matter

Introduction to databases. Characteristics of a database management system. Building the logical model. Building the physical model. Creating forms for data input. Building queries.
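
As an illustration of how a logical model becomes a physical model with related tables, validation rules and a query, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than Microsoft Access, which is the tool used in the course. The `patient` and `visit` tables, their columns and the allowed value ranges are hypothetical and chosen only for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this setting is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE patient (
            patient_id INTEGER PRIMARY KEY,
            sex        TEXT NOT NULL CHECK (sex IN ('F', 'M')),
            birth_year INTEGER NOT NULL CHECK (birth_year BETWEEN 1900 AND 2100)
        );
        CREATE TABLE visit (
            visit_id    INTEGER PRIMARY KEY,
            patient_id  INTEGER NOT NULL REFERENCES patient(patient_id),
            systolic_bp INTEGER CHECK (systolic_bp BETWEEN 50 AND 300)
        );
    """)
    conn.executemany("INSERT INTO patient VALUES (?, ?, ?)",
                     [(1, "F", 1950), (2, "M", 1962)])
    conn.executemany("INSERT INTO visit VALUES (?, ?, ?)",
                     [(1, 1, 120), (2, 1, 130), (3, 2, 145)])
    conn.commit()
    return conn


def mean_bp_by_patient(conn):
    """Query joining both tables: mean systolic pressure per patient."""
    return conn.execute("""
        SELECT p.patient_id, AVG(v.systolic_bp)
        FROM patient AS p
        JOIN visit AS v ON v.patient_id = p.patient_id
        GROUP BY p.patient_id
        ORDER BY p.patient_id
    """).fetchall()


def insert_is_rejected(conn, sql, params):
    """Return True if the database refuses the row (validation rule or relationship broken)."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False
```

Calling `insert_is_rejected(conn, "INSERT INTO visit VALUES (?, ?, ?)", (4, 99, 120))` on the database returned by `build_demo_db()` gives `True`, because no patient with identifier 99 exists. The relationship between the tables thus acts as a data-quality check at entry time.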

Summarizing data. Presenting data. Statistical inference: estimation (sampling distributions, point estimation and confidence intervals) and significance tests, namely one-sample tests, tests for two independent samples (z test, t tests and Mann-Whitney test) and tests for paired samples (paired t-test and Wilcoxon test). Analysis of cross-tabulations (chi-squared test for association and Fisher's exact test) and McNemar's test for matched samples. Correlation coefficients (Pearson, Spearman and Kendall). Simple linear regression.
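
To make the analysis of cross-tabulations concrete, the sketch below computes the chi-squared test for association and the two-sided Fisher's exact test for a 2x2 table, using only the Python standard library. It is an illustration, not the SPSS procedure taught in the course. The function names are hypothetical, the chi-squared statistic is the uncorrected Pearson statistic (no continuity correction), and the two-sided Fisher p-value follows the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shortcut formula, equivalent to summing (observed - expected)^2 / expected
    # over the four cells of the table.
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))
```

For the table [[3, 1], [1, 3]], `fisher_exact_2x2(3, 1, 1, 3)` gives 34/70, approximately 0.486. With cell counts this small, the exact test is generally preferred to the chi-squared approximation.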

