# Linear Models in Statistics

## Objectives

• Give students a broader view of the Linear Model (Simple and Multiple Regression, Analysis of Variance, Analysis of Covariance), with emphasis on the related inferential procedures, namely tests of model fit and tests on individual parameters or groups of parameters.
• Introduce the Linear Model as a tool that generalizes some of the inferential techniques covered in earlier courses of the curriculum, such as the usual t tests for two samples (independent or paired).
• Use software to implement the models studied and to carry out the proposed tests.
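
As an illustration of the second objective, the following minimal Python sketch (standard library only) compares the classical pooled-variance two-sample t statistic with the t statistic for the slope in a simple linear regression on a 0/1 group indicator. The data and the function names `pooled_t` and `slope_t` are hypothetical, chosen only for this example; the course does not prescribe a particular software.

```python
from math import sqrt

def pooled_t(a, b):
    """Classical two-sample t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((v - ma) ** 2 for v in a)
    ssb = sum((v - mb) ** 2 for v in b)
    s2 = (ssa + ssb) / (na + nb - 2)          # pooled variance estimate
    return (mb - ma) / sqrt(s2 * (1 / na + 1 / nb))

def slope_t(x, y):
    """t statistic for the slope in the model y = b0 + b1 * x + error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx                            # least squares slope
    b0 = my - b1 * mx                         # least squares intercept
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                        # error variance estimate
    return b1 / sqrt(s2 / sxx)

# Made-up data for two independent samples (illustration only).
group_a = [4.1, 5.0, 4.6, 5.3, 4.8]
group_b = [5.9, 6.4, 5.2, 6.1, 6.8, 5.7]

# Stack the samples and code group membership as a 0/1 indicator.
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

t_classical = pooled_t(group_a, group_b)
t_regression = slope_t(x, y)
print(t_classical, t_regression)
```

The two statistics coincide because, with a 0/1 indicator, the least squares slope is the difference between the group means and the residual sum of squares is the pooled within-group sum of squares.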

## General characterization

12909

6.0

Weekly - 4

Total - 56

Portuguese

### Prerequisites

The following are prerequisites for good performance in, and understanding of, the topics in this course:

• a good knowledge of the material covered in the courses Statistics and Probability I and II;
• a reasonably good knowledge of matrix algebra;
• a willingness on the part of students to relate the topics taught in this course to topics from earlier undergraduate courses, namely estimation and hypothesis testing.

### Bibliography


Coelho, C. A. (1998). Análise de Regressão.

Draper, N. R. and Smith, H. (1998). Applied Regression Analysis. 3rd ed., Wiley-Interscience, J. Wiley & Sons, New York.

Weisberg, S. (1985). Applied Linear Regression. 2nd ed., J. Wiley & Sons, New York.

Scheffé, H. (1959). The Analysis of Variance. J. Wiley & Sons.

Myers, R. H. (1986). Classical and Modern Regression with Applications. Duxbury Press, Boston.

Sen, A. and Srivastava, M. (1990). Regression Analysis - Theory, Methods and Applications. Springer, New York.

Seber, G. A. F. (1977). Linear Regression Analysis. J. Wiley & Sons, New York. [Chap. 3-8, 12]

Montgomery, D. C. and Peck, E. A. (1982). Introduction to Linear Regression Analysis. J. Wiley & Sons, New York. [Chap. 1-9]

Dagnelie, P. (1981). Principes d’Experimentation. Les Presses Agronomiques de Gembloux, Gembloux, Belgium. [Chap. 1-12]

Montgomery, D. (2012). Design and Analysis of Experiments. John Wiley & Sons.

### Teaching method

Classes are theoretical/practical, with oral presentation of concepts, methodologies, and examples, complemented by problem solving. Specific student difficulties will be addressed during classes or in individual sessions scheduled with the professor. Students must attend at least two thirds of the classes in order to be evaluated.

### Evaluation method

The evaluation consists of two components:

Individual assignment (50% of grade) - T1

Individual assignment (50% of grade) - T2

## Subject matter

1. Brief review of some fundamental tests and their implementation in suitable software
• Parametric tests for the expected value and variance of a normally distributed random variable
• Goodness-of-fit tests for discrete and continuous distributions: the chi-square, Kolmogorov-Smirnov, and Shapiro-Wilk tests
• Non-parametric tests: the sign test and the Wilcoxon-Mann-Whitney test
2. The Linear Model
• Possible formulations of the linear model. The conditional expected value as a modeling tool for random variables. Linear, generalized linear and non-linear models.
• The assumptions of the linear model. Homoscedasticity. Distribution of the error.
• Continuous and categorical explanatory variables. The models of Regression Analysis, Analysis of Variance and Analysis of Covariance.
• The space generated by the explanatory variables. Projections onto this space and its subspaces. Models and submodels.
3. The Linear Regression
• Estimation in the Linear Regression model. Estimation of parameters and linear functions of parameters. The Least Squares method. Brief reference to the equivalence between the Maximum Likelihood method and the Least Squares method for linear models.
• Distributions for the parameter estimators. Confidence intervals.
• Estimating the error variance.
• Inference in the Linear Regression model.
• The table of model analysis and the test of fit of the model. Sums of Squares associated with the Model, Error and Total and corresponding degrees of freedom. The independence between the Sums of Squares for the model and the error. Partitioning the model sum of squares.
• Tests for the parameters. The test for a single parameter. The test for a linear combination of parameters. Conditional tests.
• Tests for sets of parameters. Testing between models and their submodels (tests between nested models). The ‘partial’ F test.
• Residual analysis and a test for outliers. Diagnostics for ‘influential points’.
• Variable transformations: i) transformations to linearize the relation between the response variable and the explanatory variable(s); ii) transformations of the response variable to stabilize the variance.
• Matrix approach to the Linear Model and the Linear Regression model: representation of the model, matrix expressions for the vector of parameter estimators, variances and covariances of the parameter estimators, residuals and ‘estimated values’ of the response variable; confidence intervals (and tests) for the expected value of an ‘estimated value’ of the response variable; matrix expressions of the Sums of Squares; the number of degrees of freedom associated with each sum of squares as the rank of the matrices that define each of the associated quadratic forms. Multiple comparisons and simultaneous confidence intervals. The Working-Hotelling confidence band and the Scheffé contrast method.
• Analysis of some problems that may arise when building a Multiple Linear Regression model, such as collinearity among the predictor variables.
• Using Multiple Linear Regression models to test among Simple or Multiple Linear Regression models.
• The ‘lack-of-fit’ test. The ‘lack-of-fit’ test as a test between two nested models.
• Expeditious methods to find a submodel that fits the data: the Backward, Forward and Stepwise methods.
4. The Analysis of Variance
• Implementing the test of equality of the means of two Normal populations with assumed equal variances, based on two independent samples, through the use of a Linear Regression model. Generalization to several populations. The one-way Analysis of Variance model (completely randomized, fixed effects). Advantages of the Linear Model approach relative to the classical approach. Multiple comparisons of means. Tests and confidence intervals for linear combinations of parameters or population means. The Scheffé contrast method.
• The Regression model corresponding to the factorial design with two, three or more factors with fixed effects.
• The implementation of the test on the mean of differences based on paired samples through the use of a Linear Regression model. Generalization to several samples. The randomized block design, with a fixed effects factor.
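
To make the link between topics 3 and 4 concrete, the following minimal Python sketch (standard library only) fits two nested models by least squares through the normal equations and computes the ‘partial’ F statistic. It then checks that, for a one-way layout coded with 0/1 indicator columns, this statistic matches the classical Analysis of Variance F statistic. The data and the helper names `solve`, `sse`, and `partial_f` are assumptions made for this illustration, not part of the course material.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def sse(X, y):
    """Residual sum of squares of the least squares fit of y on X."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)                    # normal equations X'X b = X'y
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def partial_f(X_full, X_sub, y):
    """F statistic for testing a submodel against the full model."""
    n, p_full, p_sub = len(y), len(X_full[0]), len(X_sub[0])
    sse_full, sse_sub = sse(X_full, y), sse(X_sub, y)
    return ((sse_sub - sse_full) / (p_full - p_sub)) / (sse_full / (n - p_full))

# Made-up data for three groups (illustration only).
groups = [[4.1, 5.0, 4.6, 5.3], [5.9, 6.4, 5.2, 6.1], [7.0, 6.6, 7.4, 6.9]]
y = [v for g in groups for v in g]

# Full model: intercept plus indicators for groups 2 and 3.
X_full = [[1.0, float(i == 1), float(i == 2)]
          for i, g in enumerate(groups) for _ in g]
# Submodel: intercept only (all group means equal).
X_sub = [[1.0] for _ in y]

f_regression = partial_f(X_full, X_sub, y)

# Classical one-way ANOVA F statistic for comparison.
n, k = len(y), len(groups)
grand = sum(y) / n
means = [sum(g) / len(g) for g in groups]
ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f_regression, f_anova)
```

Solving the normal equations directly keeps the sketch dependency-free; for ill-conditioned designs, such as those with strongly collinear predictors, a QR-based least squares routine from a numerical library would be the more robust choice.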

## Programs

Programs where the course is taught: