Applied Statistics
Objectives
 Give the students a broader view of the Linear Model (Simple and Multiple Regression, Analysis of Variance, Analysis of Covariance), with emphasis on the related inferential procedures, namely the test of fit of the models and the tests for individual parameters or groups of parameters.
 Introduce the Linear Model as a tool that generalizes some of the inferential techniques the students met in previous courses of their curriculum, such as the usual t tests for two samples (independent or paired).
 Use software to implement the models studied and to carry out the proposed tests.
General characterization
Code
3121
Credits
6.0
Responsible teacher
Miguel dos Santos Fonseca
Hours
Weekly  4
Total  56
Teaching language
Portuguese
Prerequisites
The following are prerequisites for good performance in, and understanding of, the topics of this course:
 a good knowledge of the material covered in the courses Statistics and Probability I and II;
 a reasonably good knowledge of matrix algebra;
 a willingness on the part of the students to relate the topics taught in this course to topics from previous courses in their undergraduate programme, namely estimation and hypothesis testing.
Bibliography
Coelho, C. A. (1998). Análise de Regressão.
Draper, N. R. and Smith, H. (1998). Applied Regression Analysis. 3rd ed., Wiley-Interscience, J. Wiley & Sons, New York.
Weisberg, S. (1985). Applied Linear Regression. 2nd ed., J. Wiley & Sons, New York.
Scheffé, H. (1959). The Analysis of Variance. J. Wiley & Sons.
Myers, R. H. (1986). Classical and Modern Regression with Applications. Duxbury Press, Boston.
Sen, A. and Srivastava, M. (1990). Regression Analysis: Theory, Methods and Applications. Springer, New York.
Seber, G. A. F. (1977). Linear Regression Analysis. J. Wiley & Sons, New York. [Chap. 3-8, 12]
Montgomery, D. C. and Peck, E. A. (1982). Introduction to Linear Regression Analysis. J. Wiley & Sons, New York. [Chap. 1-9]
Dagnelie, P. (1981). Principes d’Experimentation. Les Presses Agronomiques de Gembloux, Gembloux, Belgium. [Chap. 1-12]
Montgomery, D. (2012). Design and Analysis of Experiments. John Wiley & Sons.
Teaching method
 Lectures (theoretical classes) (2h per week)
 Labs (2h per week), using software suited to implementing the models and tests studied
(In practice, all classes follow an essentially theoretical-practical format in which the theoretical foundations, the examples and the problems are integrated; the labs are used to solve some of the proposed problems.)
Evaluation method
The evaluation consists of two components:
Individual assignment (50% of grade)  TBA
Individual assignment (50% of grade)  TBA
Subject matter
 Brief review of some fundamental tests and their implementation using suitable software
 Parametric tests for the expected value and variance of a normally distributed random variable
 Goodness-of-fit tests for discrete and continuous distributions: the chi-square, Kolmogorov-Smirnov, and Shapiro-Wilk tests
 Nonparametric tests: the sign test and the Wilcoxon-Mann-Whitney test
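As an illustration of one of the reviewed tests, the following is a minimal sketch of a two-sided exact sign test for a hypothesized median. The function name `sign_test`, the data and the value `median0` are hypothetical and chosen only for the example.

```python
from math import comb

def sign_test(sample, median0):
    """Two-sided exact sign test of H0: median == median0."""
    # Observations equal to median0 carry no sign information and are dropped.
    diffs = [x - median0 for x in sample if x != median0]
    n = len(diffs)
    n_pos = sum(d > 0 for d in diffs)
    # Under H0 the number of positive signs is Binomial(n, 1/2).
    k = min(n_pos, n - n_pos)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return n_pos, n, p_value

data = [5.1, 4.8, 6.2, 5.9, 6.4, 5.7, 6.1, 4.9, 6.3, 6.0]  # illustrative values
n_pos, n, p_value = sign_test(data, median0=5.0)
print(n_pos, n, p_value)
```

The test uses only the signs of the deviations from the hypothesized median, so it needs no normality assumption.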
 The Linear Model

Possible formulations of the linear model. The conditional expected value as a modeling tool for random variables. Linear, generalized linear and nonlinear models.

The assumptions concerning the linear model. Homoscedasticity. Distribution of the error.

Continuous and categorical explanatory variables. The models of Regression Analysis, Analysis of Variance and Analysis of Covariance.

The space generated by the explanatory variables. Projections onto this space and its subspaces. Models and submodels.
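The geometric view can be sketched numerically. The example below assumes a full-column-rank design matrix and simulated data (all names are illustrative): it builds the orthogonal projectors onto the column space of a model and of a submodel, and the fitted values are the projection of the response.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = rng.normal(size=n)

X_full = np.column_stack([np.ones(n), x1, x2])  # model
X_sub = np.column_stack([np.ones(n), x1])       # submodel (subspace of the model space)

def projector(X):
    # Orthogonal projector onto the column space of X (X assumed full column rank).
    return X @ np.linalg.solve(X.T @ X, X.T)

P_full = projector(X_full)
P_sub = projector(X_sub)

fitted = P_full @ y        # projection of y onto the model space
residual = y - fitted      # orthogonal to every column of X_full

print(np.allclose(P_full @ P_full, P_full))   # projectors are idempotent
print(np.allclose(X_full.T @ residual, 0.0))  # residual is orthogonal to the model space
print(np.allclose(P_full @ P_sub, P_sub))     # submodel space lies inside the model space
```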


The Linear Regression

Estimation in the Linear Regression model. Estimation of parameters and linear functions of parameters. The Least Squares method. Brief reference to the equivalence between the Maximum Likelihood method and the Least Squares method for linear models.
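A minimal sketch of Least Squares estimation for a simple regression, assuming a small illustrative data set. It solves the normal equations directly and compares the result with NumPy's least-squares solver.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # illustrative values

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

# Normal equations: (X'X) beta = X'y.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Same estimate from a numerically more stable solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_normal)  # [intercept, slope]
```

Under normally distributed errors with constant variance, maximizing the likelihood in the regression coefficients amounts to minimizing the same sum of squared residuals, which is why both methods give the same coefficient estimates.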

Distributions for the parameter estimators. Confidence intervals.

Estimating the error variance.
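The error variance estimate, the standard errors and the resulting confidence intervals can be sketched as follows. The data are illustrative, and the critical value 3.182 is the tabulated 0.975 quantile of Student's t with 3 degrees of freedom, hard-coded here to keep the example free of extra dependencies.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # illustrative values
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
residuals = y - X @ beta_hat

# Unbiased estimate of the error variance: SSE / (n - p).
s2 = residuals @ residuals / (n - p)

# Estimated standard errors: square roots of the diagonal of s2 * (X'X)^{-1}.
std_err = np.sqrt(s2 * np.diag(XtX_inv))

t_crit = 3.182  # tabulated 0.975 quantile of Student's t with n - p = 3 df
ci_low = beta_hat - t_crit * std_err
ci_high = beta_hat + t_crit * std_err
print(s2, std_err, ci_low, ci_high)
```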

Inference in the Linear Regression model.

The table of model analysis and the test of fit of the model. Sums of Squares associated with the Model, Error and Total and corresponding degrees of freedom. The independence between the Sums of Squares for the model and the error. Partitioning the model sum of squares.
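The decomposition behind the table of model analysis can be sketched as below, on illustrative data. The Total sum of squares splits into Model and Error parts, and the F statistic for the test of fit is the ratio of the corresponding mean squares.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.8, 4.1, 5.9, 8.2, 9.7, 12.3])  # illustrative values
n = len(y)
X = np.column_stack([np.ones(n), x])
p = X.shape[1]

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
fitted = X @ beta_hat

ss_total = np.sum((y - y.mean()) ** 2)       # n - 1 degrees of freedom
ss_model = np.sum((fitted - y.mean()) ** 2)  # p - 1 degrees of freedom
ss_error = np.sum((y - fitted) ** 2)         # n - p degrees of freedom

# Test of fit: compare with an F distribution with (p - 1, n - p) degrees of freedom.
f_stat = (ss_model / (p - 1)) / (ss_error / (n - p))
print(ss_total, ss_model, ss_error, f_stat)
```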

Tests for the parameters. The test for a single parameter. The test for a linear combination of parameters. Conditional tests.

Tests for sets of parameters. Testing between models and their submodels (tests between nested models). The ‘partial’ F test.
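A sketch of the ‘partial’ F test between a model and a nested submodel, on simulated data (the coefficients and seed are arbitrary choices for the example). When the submodel drops a single variable, the partial F statistic equals the square of the t statistic for that parameter, which the last lines check.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(scale=0.8, size=n)

def fit(X, y):
    # Returns the estimates, (X'X)^{-1} and the error sum of squares.
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    return beta, XtX_inv, float(resid @ resid)

X_full = np.column_stack([np.ones(n), x1, x2])
X_sub = np.column_stack([np.ones(n), x1])  # submodel: x2 removed

beta, XtX_inv, sse_full = fit(X_full, y)
_, _, sse_sub = fit(X_sub, y)

df_full = n - X_full.shape[1]
q = X_full.shape[1] - X_sub.shape[1]  # number of parameters being tested

# Partial F: extra sum of squares per tested parameter over the full-model mean square error.
f_partial = ((sse_sub - sse_full) / q) / (sse_full / df_full)

# t statistic for the coefficient of x2 in the full model.
s2 = sse_full / df_full
t_x2 = beta[2] / np.sqrt(s2 * XtX_inv[2, 2])

print(f_partial, t_x2 ** 2)
```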


Residual analysis and a test for outliers. Diagnostics for ‘influential points’.
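Some common diagnostics can be sketched as follows, on constructed data in which the last observation is deliberately placed far out in x and off the line. The choice of leverage, studentized residuals and Cook's distance is an assumption about which diagnostics are meant; the course may use others.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 20], dtype=float)
noise = np.array([0.1, -0.2, 0.15, -0.1, 0.05, -0.15, 0.2, -0.05, 0.1, -8.0])
y = 1.0 + 2.0 * x + noise  # last point: extreme in x and far from the line
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix
h = np.diag(H)                        # leverages
e = y - H @ y                         # residuals
s2 = e @ e / (n - p)

# Internally studentized residuals.
r_int = e / np.sqrt(s2 * (1 - h))

# Externally studentized residuals; each follows a t distribution with
# n - p - 1 degrees of freedom under the model, which gives an outlier test.
t_ext = r_int * np.sqrt((n - p - 1) / (n - p - r_int ** 2))

# Cook's distance: combines residual size and leverage.
cooks_d = r_int ** 2 * h / (p * (1 - h))

print(np.argmax(h), np.argmax(cooks_d))
```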

Variable transformations: i) transformations to linearize the relation between the response variable and the explanatory variable(s); ii) transformations of the response variable to stabilize the variance.
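A small sketch of a linearizing transformation, case i): if the response follows y = a·exp(b·x), then log(y) is linear in x and the parameters can be recovered by ordinary Least Squares. The parameter values are hypothetical and the data are noise-free for clarity.

```python
import numpy as np

x = np.linspace(0.0, 4.0, 9)
a_true, b_true = 2.0, 0.7        # hypothetical parameters
y = a_true * np.exp(b_true * x)  # noise-free exponential relation

# log(y) = log(a) + b * x is linear in x.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

a_hat, b_hat = np.exp(coef[0]), coef[1]
print(a_hat, b_hat)
```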

Matrix approach to the Linear Model and the Linear Regression model: representation of the model; matrix expressions for the vector of parameter estimators, the variances and covariances of the parameter estimators, the residuals and the ‘estimated values’ of the response variable; confidence intervals (and tests) for the expected value of an ‘estimated value’ of the response variable; matrix expressions for the Sums of Squares; the number of degrees of freedom associated with each sum of squares as the rank of the matrix that defines the associated quadratic form. Multiple comparisons and simultaneous confidence intervals: the Working-Hotelling confidence band and the Scheffé contrast method.
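The link between degrees of freedom and matrix ranks can be checked numerically. The sketch below uses simulated data and writes each sum of squares as a quadratic form y'Ay, where H is the hat matrix and J is the n-by-n matrix with all entries 1/n; the tolerance passed to the rank routine is an arbitrary choice for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 12, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
p = X.shape[1]
y = rng.normal(size=n)

I = np.eye(n)
H = X @ np.linalg.inv(X.T @ X) @ X.T  # projector onto the model space
J = np.ones((n, n)) / n               # projector onto the constant vector

# Matrices of the quadratic forms y'Ay for each sum of squares.
forms = {"model": H - J, "error": I - H, "total": I - J}

ranks = {name: int(np.linalg.matrix_rank(A, tol=1e-8)) for name, A in forms.items()}
sums = {name: float(y @ A @ y) for name, A in forms.items()}

print(ranks)  # p - 1, n - p and n - 1 respectively
print(sums)
```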

Analysis of some problems that may arise when building a Multiple Linear Regression model, such as collinearity among the predictor variables.
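One common way to detect collinearity is the variance inflation factor (VIF); this is offered as an illustration, not necessarily the diagnostic used in the course. The sketch regresses each predictor on the others and computes 1 / (1 - R²). The helper `vif` and the simulated predictors are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of X (no intercept column)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress predictor j on an intercept and all other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(3)
n = 50
x1, x2, x4 = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
x3 = x1 + x2 + 0.05 * rng.normal(size=n)  # nearly a linear combination of x1 and x2

vifs = vif(np.column_stack([x1, x2, x3, x4]))
print(vifs)
```

Large values for x1, x2 and x3 signal that their coefficients cannot be estimated precisely together, while the unrelated predictor x4 stays close to 1.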

Using Multiple Linear Regression models to carry out tests comparing Simple or Multiple Linear Regression models.

The ‘lack-of-fit’ test. The ‘lack-of-fit’ test as a test between two nested models.
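A sketch of the ‘lack-of-fit’ test with replicated observations, on illustrative data that are roughly quadratic in x. The straight line is the submodel; the model with one free mean per distinct x value is the larger model, whose error sum of squares is the pure error.

```python
import numpy as np

x = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=float)
y = np.array([1.1, 0.9, 3.8, 4.2, 9.1, 8.8, 15.9, 16.3, 25.2, 24.7])  # illustrative
n = len(y)

# Submodel: straight line.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
sse_line = np.sum((y - X @ beta) ** 2)

# Larger model: one mean per distinct x value (pure error).
levels = np.unique(x)
m = len(levels)
group_means = np.array([y[x == v].mean() for v in levels])
ss_pure = sum(np.sum((y[x == v] - y[x == v].mean()) ** 2) for v in levels)

# Lack-of-fit sum of squares and F statistic with (m - 2, n - m) degrees of freedom.
ss_lof = sse_line - ss_pure
f_lof = (ss_lof / (m - 2)) / (ss_pure / (n - m))
print(ss_pure, ss_lof, f_lof)
```

A large F value, as in this constructed example, indicates that the straight line does not describe the group means adequately.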

Expeditious methods for finding a submodel that fits the data: the Backward, Forward and Stepwise methods.
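A minimal sketch of the Backward method, assuming a fixed F-to-remove threshold of 4.0 (roughly the 5% critical value of F with 1 and many degrees of freedom). The threshold, the function names and the simulated data are all assumptions made for illustration; statistical software typically offers other stopping criteria.

```python
import numpy as np

def sse(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def backward_eliminate(X, y, f_to_remove=4.0):
    """Return indices of the columns of X kept; the intercept is always kept."""
    n = len(y)
    ones = np.ones((n, 1))
    active = list(range(X.shape[1]))
    while active:
        X_cur = np.hstack([ones, X[:, active]])
        sse_cur = sse(X_cur, y)
        df_err = n - X_cur.shape[1]
        # Partial F statistic for dropping each remaining variable in turn.
        f_stats = []
        for j in active:
            keep = [k for k in active if k != j]
            X_red = np.hstack([ones, X[:, keep]])
            f_stats.append((sse(X_red, y) - sse_cur) / (sse_cur / df_err))
        weakest = int(np.argmin(f_stats))
        if f_stats[weakest] >= f_to_remove:
            break  # every remaining variable is worth keeping
        active.pop(weakest)
    return active

rng = np.random.default_rng(4)
n = 60
X = rng.normal(size=(n, 4))
# Only columns 0 and 2 enter the simulated response.
y = 1.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=n)

selected = backward_eliminate(X, y)
print(selected)
```

The Forward method works in the opposite direction, adding at each step the variable with the largest partial F, and the Stepwise method alternates both kinds of step.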


The Analysis of Variance

Implementing the test for the equality of the means of two Normal populations with assumed equal variances, based on two independent samples, through the use of a Linear Regression model. Generalization to several populations. The one-way Analysis of Variance model (completely randomized, fixed effects). Advantages of the Linear Model approach relative to the classical approach. Multiple comparisons of means. Tests and confidence intervals for linear combinations of parameters or population means. The Scheffé contrast method.
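The equivalence between the classical two-sample t test and a regression on an indicator variable can be sketched as below, using illustrative samples. The slope of the indicator estimates the difference in means, and its t statistic coincides with the pooled-variance t statistic.

```python
import numpy as np

g1 = np.array([5.1, 4.9, 5.6, 5.8, 5.2])       # illustrative sample 1
g2 = np.array([6.0, 6.4, 5.9, 6.8, 6.1, 6.3])  # illustrative sample 2
n1, n2 = len(g1), len(g2)

# Classical pooled-variance t statistic.
sp2 = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
t_classic = (g2.mean() - g1.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

# Same test as a regression on an indicator of membership in sample 2.
y = np.concatenate([g1, g2])
d = np.concatenate([np.zeros(n1), np.ones(n2)])
X = np.column_stack([np.ones(n1 + n2), d])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta
s2 = resid @ resid / (n1 + n2 - 2)
t_reg = beta[1] / np.sqrt(s2 * XtX_inv[1, 1])

print(t_classic, t_reg)
```

With more than two populations, one indicator per additional group gives the one-way Analysis of Variance model in regression form.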

The Regression model corresponding to the factorial design with two, three or more factors with fixed effects.

Implementing the test for the mean of the differences based on paired samples through the use of a Linear Regression model. Generalization to several samples. The randomized block design, with a fixed effects factor.
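The paired-samples case can be sketched in the same spirit, with illustrative measurements. Each subject acts as a block, and the t statistic for the treatment indicator in a regression with block indicators coincides with the classical paired t statistic.

```python
import numpy as np

before = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.5])  # illustrative values
after = np.array([12.9, 11.8, 13.1, 13.6, 12.4, 13.2])
n = len(before)

# Classical paired t statistic on the differences.
d = after - before
t_paired = d.mean() / (d.std(ddof=1) / np.sqrt(n))

# Regression form: treatment indicator plus n - 1 subject (block) indicators.
y = np.concatenate([before, after])
treat = np.concatenate([np.zeros(n), np.ones(n)])
subject = np.tile(np.arange(n), 2)
blocks = (subject[:, None] == np.arange(1, n)[None, :]).astype(float)  # subject 0 is the baseline
X = np.column_stack([np.ones(2 * n), treat, blocks])

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta
s2 = resid @ resid / (2 * n - X.shape[1])
t_reg = beta[1] / np.sqrt(s2 * XtX_inv[1, 1])

print(t_paired, t_reg)
```

With more than two treatments per block, additional treatment indicators give the randomized block design.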
