A Guide to REML in GenStat® by Roger Payne, Sue Welham and Simon Harding.

GenStat Release 14 was developed by VSN International Ltd, in collaboration with practising statisticians at Rothamsted and other organisations in Britain, Australia and New Zealand.

Main authors: R.W. Payne, S.A. Harding, D.A. Murray, D.M. Soutar, D.B. Baird, A.I. Glaser, S.J. Welham, A.R. Gilmour, R. Thompson, R. Webster.

Other contributors: A.E. Ainsley, N.G. Alvey, C.F. Banfield, R.I. Baxter, K.E. Bicknell, I.C. Channing, B.R. Cullis, P.G.N. Digby, A.N. Donev, M.F. Franklin, J.C. Gower, T.J. Hastie, S.K. Haywood, A.F. Kane, A. Kobilinsky, W.J. Krzanowski, P.W. Lane, S.D. Langton, P.J. Laycock, P.K. Leech, J.H. Maindonald, G.W. Morgan, J.A. Nelder, A. Papritz, H.D. Patterson, D.L. Robinson, G.J.S. Ross, P.J. Rowley, H.R. Simpson, R.J. Tibshirani, A.D. Todd, G. Tunnicliffe Wilson, L.G. Underhill, P.J. Verrier, R.W.M. Wedderburn, R.P. White and G.N. Wilkinson.

Published by: VSN International, 5 The Waterhouse, Waterhouse Street, Hemel Hempstead, Hertfordshire HP1 1ES, UK
E-mail: info@genstat.co.uk
Website: http://www.genstat.co.uk/

GenStat is a registered trademark of VSN International. All rights reserved.
© 2011 VSN International

Contents

Introduction
1 Linear mixed models
  1.1 Split-plot design
  1.2 Commands for REML analysis
  1.3 Practical
  1.4 Means plots
  1.5 Practical
  1.6 Predictions
  1.7 Practical
  1.8 A non-orthogonal design
  1.9 Practical
  1.10 Residual plots
  1.11 Practical
2 Meta analysis with REML
  2.1 Example: a series of fungicide trials
  2.2 Commands for meta analysis
  2.3 Practical
3 Spatial analysis
  3.1 Traditional blocking
  3.2 Correlation modelling
  3.3 The VSTRUCTURE directive
  3.4 Practical
  3.5 The variogram
  3.6 Practical
4 Repeated measurements
  4.1 Correlation models over time
  4.2 Practical
  4.3 Random coefficient regression
  4.4 Practical
Index

Introduction

The REML algorithm provides several important types of analysis that are useful in a wide range of application areas including biology, medicine, industry and finance. In biology the models it fits are usually known as linear mixed models, but in some application areas (e.g. education) they may be called multi-level models. GenStat's REML facilities are powerful and comprehensive, but nevertheless very straightforward and easy to use. This book is designed to introduce you to these techniques, and give you the knowledge and confidence to use them correctly and effectively. It has been written to provide the notes for VSN's course on the use of REML in GenStat, but it can be used equally well as a self-learning tool.

One of the key features of REML is that it can analyse data that involve more than one source of error variation. In this respect it is similar to the GenStat ANOVA algorithm, and the similarities and differences between the two methods are explored in detail in Chapter 1. An important advantage of REML over ANOVA is that it can analyse unbalanced designs. It also has a powerful prediction algorithm that extends the ideas in GenStat's regression prediction algorithm to cover random as well as fixed effects.
Chapter 2 covers the use of REML for meta analysis, showing how you can do a simultaneous analysis of several disparate data sets to obtain combined estimates for the treatments of interest.

A further advantage of REML is explored in Chapter 3, where we show how it can model spatial correlations between observations in two dimensions. These methods have proved very successful, for example in the analysis of field experiments to assess new plant varieties. The designs often contain too many varieties for the conventional blocking techniques (e.g. the use of randomized-block designs) to be effective. So instead, for example, auto-regressive models are fitted to the spatial correlations across the field.

Chapter 4 examines the use of correlation modelling in the analysis of repeated measurements. Here the correlation is in a single dimension, namely time, and REML provides a powerful alternative to conventional methods such as repeated-measures ANOVA or the analysis of contrasts over time.

The book works through a series of straightforward examples, with frequent practicals to allow you to try the methods for yourself. The examples work mainly through the menus of GenStat for Windows, so there is no need for prior knowledge of the GenStat command language. However, we do assume that you will be familiar with ordinary analysis of variance. (If not, we recommend that you work through Chapters 1-5 of the Guide to ANOVA and Design in GenStat.)

1 Linear mixed models

The REML algorithm is designed to analyse linear mixed models (also known as multi-level models). The word mixed here indicates that the model contains fixed terms, like treatments, as well as random terms, like rows and columns of a field experiment or aspects such as litters in animal experiments. The important feature of REML is that it can handle several random terms (in addition to the usual residual term).
The GenStat ANOVA algorithm can also handle several random terms, and we start by comparing the analyses from ANOVA with those from REML. In this chapter you will learn
• how to use the Linear Mixed Models (and Analysis of Variance *) menus
• what output is given by a GenStat REML analysis, and how it compares to GenStat ANOVA
• how to assess treatment terms by Wald and F statistics
• how to plot means
• how to form predictions
• how to plot (and assess) residuals
• the commands VCOMPONENTS, REML, VDISPLAY, VGRAPH, VPREDICT and VPLOT *
Note: the topics marked * are optional.

1.1 Split-plot design

Field layout of the experiment, four subplots per row (V1-V3 denote the varieties, N0-N3 the nitrogen levels):

V3 N3  V3 N2  V3 N2  V3 N3
V3 N1  V3 N0  V3 N0  V3 N1
V1 N0  V1 N1  V2 N0  V2 N2
V1 N3  V1 N2  V2 N3  V2 N1
V2 N0  V2 N1  V1 N1  V1 N2
V2 N2  V2 N3  V1 N3  V1 N0
V3 N2  V3 N0  V2 N3  V2 N0
V3 N1  V3 N3  V2 N2  V2 N1
V1 N3  V1 N0  V1 N2  V1 N3
V1 N1  V1 N2  V1 N0  V1 N1
V2 N1  V2 N0  V3 N2  V3 N3
V2 N2  V2 N3  V3 N1  V3 N0
V2 N1  V2 N2  V1 N2  V1 N0
V2 N3  V2 N0  V1 N3  V1 N1
V3 N3  V3 N1  V2 N3  V2 N2
V3 N2  V3 N0  V2 N0  V2 N1
V1 N0  V1 N3  V3 N0  V3 N1
V1 N1  V1 N2  V3 N2  V3 N3

The design used most often to illustrate the need for several random (or error) terms in ANOVA is the split-plot design. In the split-plot design shown here, the treatments are three varieties of oats (Victory, Golden rain and Marvellous) and four levels of nitrogen (0, 0.2, 0.4 and 0.6 cwt). As it is feasible to work with smaller plots for fertiliser than for varieties, each of the six blocks was initially split into three whole-plots and then each whole-plot was split into four subplots. The varieties were allocated (at random) to the whole-plots within each block, and the nitrogen levels (at random) to the subplots within each whole-plot. In a randomized-block design, we have a hierarchical structure with blocks and then plots within blocks.

Figure 1.1

Figure 1.2

The data files for the examples and exercises used in this Guide can be accessed using the Example Data Sets menu (Figure 1.2).
Click on File on the menu bar, and select the Open Example Data Sets option, as shown in Figure 1.1. In the menu, it is convenient to "filter" by the topic A Guide to REML using the drop-down list box in the upper part of the menu. The menu will then list only the files used in this Guide. The data for the split-plot experiment are in the file Oats.gsh.

The model describes the yield $y_{ijk}$ from block i, whole-plot j, subplot k by the equation

$$y_{ijk} = \mu + v_r + a_s + va_{rs} + b_i + w_{ij} + \varepsilon_{ijk}$$

where the fixed part of the model consists of
$\mu$, the overall constant (grand mean),
$v_r$, the main effect of variety r (where r is the variety assigned to unit ijk),
$a_s$, the main effect of nitrogen application at level s (where s is the nitrogen level assigned to subplot ijk), and
$va_{rs}$, their interaction.

The random model terms are
$b_i$, the effect of block i,
$w_{ij}$, the effect of whole-plot j within block i, and
$\varepsilon_{ijk}$, the random error (i.e. residual) for unit ijk (which here is the same as the subplot effect, since the subplots are the smallest units of the experiment).

The model can be written in matrix notation as

$$y = \sum_i X_i \beta_i + \sum_i Z_i u_i + \varepsilon$$

where
$y$ is the vector of data values,
$\beta_i$ is the vector of fixed effects for treatment term i with design matrix $X_i$,
$u_i$ is the vector of random effects for random term i with design matrix $Z_i$.
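To make the structure of the split-plot model concrete, the following Python sketch (not GenStat code, and not part of the course material) first randomizes a design in the way described in Section 1.1 and then simulates yields from the model equation above. The function names, the effect sizes, the standard deviations and the assumption of normally distributed random terms are all hypothetical choices made for illustration; none of them is taken from Oats.gsh.

```python
import random

VARIETIES = ["Victory", "Golden rain", "Marvellous"]
NITROGEN = [0.0, 0.2, 0.4, 0.6]  # cwt, as listed in the text
N_BLOCKS = 6


def randomize_split_plot(rng):
    """Return one tuple per subplot: (block, whole_plot, subplot, variety, nitrogen)."""
    units = []
    for block in range(1, N_BLOCKS + 1):
        # Varieties are allocated at random to the whole-plots within each block.
        whole_plot_varieties = rng.sample(VARIETIES, len(VARIETIES))
        for whole_plot, variety in enumerate(whole_plot_varieties, start=1):
            # Nitrogen levels are allocated at random to the subplots
            # within each whole-plot.
            subplot_levels = rng.sample(NITROGEN, len(NITROGEN))
            for subplot, nitrogen in enumerate(subplot_levels, start=1):
                units.append((block, whole_plot, subplot, variety, nitrogen))
    return units


def simulate_yields(units, rng, mu=100.0, sd_block=10.0,
                    sd_whole_plot=5.0, sd_residual=3.0):
    """Simulate y_ijk = mu + v_r + a_s + va_rs + b_i + w_ij + e_ijk.

    Every numeric value here is a made-up illustration (assumption),
    and the random terms are assumed to be independent and normal.
    """
    v = {"Victory": -5.0, "Golden rain": 0.0, "Marvellous": 5.0}  # hypothetical
    a = {level: 50.0 * level for level in NITROGEN}               # hypothetical
    va = {(r, s): 0.0 for r in VARIETIES for s in NITROGEN}       # no interaction
    b = {}  # one random effect per block
    w = {}  # one random effect per whole-plot within block
    yields = []
    for block, whole_plot, _subplot, variety, nitrogen in units:
        if block not in b:
            b[block] = rng.gauss(0.0, sd_block)
        if (block, whole_plot) not in w:
            w[(block, whole_plot)] = rng.gauss(0.0, sd_whole_plot)
        e = rng.gauss(0.0, sd_residual)  # residual, one per subplot
        yields.append(mu + v[variety] + a[nitrogen] + va[(variety, nitrogen)]
                      + b[block] + w[(block, whole_plot)] + e)
    return yields


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed so the sketch is reproducible
    units = randomize_split_plot(rng)
    yields = simulate_yields(units, rng)
    for unit, y in list(zip(units, yields))[:4]:
        print(unit, round(y, 1))
```

In this sketch, subplots in the same whole-plot share both a block effect and a whole-plot effect, while subplots in different whole-plots of the same block share only the block effect. That sharing is what gives the design more than one source of error variation.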