User’s guide to climatol

An R contributed package for homogenization of climatological series (and functions for drawing wind-rose and Walter&Lieth diagrams)

Version 2.2, distributed under the GPL license, version 2 or newer

By José A. Guijarro
State Meteorological Agency (AEMET), Balearic Islands Office, Spain
http://www.climatol.eu/index.html

January, 2014

User’s guide to climatol by José A. Guijarro is licensed under a Creative Commons Attribution-NoDerivatives 3.0 Unported License. Exceptions: Translations to any language other than English or Spanish are also freely allowed.

Foreword

The “Climatol” R contributed package is mostly devoted to the problem of homogenizing climatological series, that is to say, removing the perturbations produced by changes in the conditions of observation or in the nearby environment, so that the series reflect (as far as possible) only the climatic variations.

The R standard documentation of the package provides descriptions of the functions and their parameters, and users should refer to it whenever needed. This guide, on the other hand, has been written as a complement. It focuses on explaining the methodology underlying the algorithms of the package, how to call its functions, and how to interpret and use their results.

This guide is structured in two parts: a Quick start (in the following few pages) for those anxious to begin homogenizing their data, and an Extended guide where the different aspects of the package are treated more thoroughly.

Most examples of this guide can be reproduced with data files contained in climatol-dat.zip, downloadable from http://www.climatol.eu/climatol-dat.zip, which contains real series from a Mediterranean area, although the names and coordinates of the stations are fictitious.
Acknowledgements

This package has greatly benefited from fruitful discussions in the frame of COST Action ES0601, entitled Advances in homogenisation methods of climate series: an integrated approach (HOME). My acknowledgments to all the participants, and to the European Science Foundation for promoting and funding these enriching meetings. I must also acknowledge the Spanish State Meteorological Agency (AEMET) for its continuous support of my participation in this Action.

Quick start

First we need to prepare the input data in two plain text files with adequate formats.

One of them must provide the coordinates and names of the stations, with a line of the form

    X Y Z CODE NAME

for each station, where the coordinates X and Y may be in km (from, e.g., a UTM projection) or in geographical degrees (longitude and latitude, in this order) with their fractional part in decimals (not in the degrees, minutes and seconds form). The other parameters are the elevation above sea level Z in m [1], an identification CODE of the station, and the NAME of the station itself, which must be enclosed in quotes if it contains more than one word. (It is advisable to put all names between quotes to avoid errors.) The name of this file must be VAR_FIRSTY-LASTY.est, where VAR is an abbreviation of the climatic variable being analyzed, and FIRSTY and LASTY are the first and last years of the studied period.

The data must be arranged in another single file containing station data blocks in the same order as they appear in the station file. The file base name will be that of the station file, using the extension dat.

Example: Suppose you are going to homogenize monthly average minimum temperatures from 1956 to 2005, and you choose Tmin as a short name for that variable.
The stations file would be Tmin_1956-2005.est, and could begin, as in the accompanying example data, with:

    27.0 53.9 456 S03 "La Perla"
    31.8 26.5 123 S08 "El Palmeral"
    49.2 30.0 154 S11 "Miraflores"
    43.4 29.6 156 S13 "Torremar"
    ... (etc)

And the data file should be named Tmin_1956-2005.dat, and its first lines could be:

    NA NA NA NA NA NA NA NA NA NA NA NA
    -0.4 1.8 5.5 6.5 15.1 17.4 16.7 16.4 12.2 6.0 2.6 2.3
    1.5 4.0 6.5 8.7 12.4 12.1 20.3 NA 14.7 11.0 3.2 0.5
    ... (etc)

This would be the data for the first station of your network [2], in chronological order: January to December of 1956 in the first line, the same for 1957 in the second line, 1958 in the third line, etc. In this example, data from 1956 and August 1958 are missing, and are replaced by NA (Not Available), which is the standard missing data code in R (though others may be used). When all the data from the first station are listed, data from the second station follow, and so on until all station data are reported.

It is important to note that every station must report data for every month of the study period (1956-2005 in our example), hence the need to include missing codes to fill any missing data. For convenience, 12 values (a whole year) have been placed in each line, but this is not compulsory; data may be placed in a free space-separated format with any number (even variable) of data items in each line, because they will be read sequentially.

[1] The altitude term was changed by elevation in this guide in February 2016 following McVicar TR and Körner C (2013): On the use of elevation, altitude, and height in the ecological and climatological literature; Oecologia, 171:335-337.

[2] Actually, these are not the first lines of our example data, which do not have any missing data in the first three lines. That is why they have been replaced by these others in the text, in order to illustrate how to proceed when missing data are present, which is the usual case.
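The file layout described above can be produced directly from R. The following is only a minimal sketch in base R, not part of the package: the temporary directory, the object names (workdir, stations, series) and the random placeholder values are assumptions made for illustration, and only two of the example stations are written. It builds the common base name, writes one X Y Z CODE NAME line per station with the name quoted, and writes the data station after station, 12 values per line, with NA for missing months.

```r
# Illustrative sketch: write the two input files in the layout described above.
# All object names and the random placeholder data are hypothetical.
workdir <- tempfile("climatol_demo_")   # temporary directory, to avoid overwriting real files
dir.create(workdir)

varcli <- "Tmin"
first_year <- 1956
last_year <- 2005
base <- sprintf("%s_%d-%d", varcli, first_year, last_year)   # "Tmin_1956-2005"

# Station file: X Y Z CODE NAME, one line per station, NAME (column 5) in quotes
stations <- data.frame(
  X = c(27.0, 31.8), Y = c(53.9, 26.5), Z = c(456, 123),
  CODE = c("S03", "S08"), NAME = c("La Perla", "El Palmeral"),
  stringsAsFactors = FALSE
)
est_file <- file.path(workdir, paste0(base, ".est"))
write.table(stations, est_file, row.names = FALSE, col.names = FALSE, quote = 5)

# Data file: one column per station, one row per month of the whole period
n_values <- (last_year - first_year + 1) * 12
set.seed(1)
series <- matrix(round(rnorm(n_values * nrow(stations), mean = 10, sd = 5), 1),
                 nrow = n_values)        # placeholder values, not real observations
series[1:12, 1] <- NA                    # all of 1956 missing at the first station
series[(1958 - first_year) * 12 + 8, 1] <- NA   # August 1958 missing

dat_file <- file.path(workdir, paste0(base, ".dat"))
# write() emits the matrix column by column, so each station's block is contiguous
write(series, dat_file, ncolumns = 12)

cat(readLines(dat_file, n = 3), sep = "\n")   # first three years of the first station
```

Because the data are read sequentially, the choice of 12 values per line is only for readability. Real use would of course replace the placeholder matrix with actual observations.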
(Important note: no month may be simultaneously void of data in all the stations of the file, since this would result in an abnormal process termination.)

All you have to do to homogenize your data is to start R in your working directory (where your data and station files are located), load the homogenization functions, either with the command library(climatol) if you made a regular installation of the package, or with source("depurdat.R") if you have this file [3] in your working directory, and issue the automatic homogenization command, which in our example would be:

    homogen("Tmin", 1956, 2005, deg=FALSE)

This command accepts other optional parameters, the most important being the following:

nm
    Number of data per year in each station (12 by default: monthly values. Set to nm=1 if you are analyzing annual data, nm=4 for seasonal data, etc).

deg
    Set to FALSE if coordinates are in km (the distance unit used internally in the package), or left at its default TRUE value if they are in geographical degrees.

std
    Type of normalization. By default (std=3), data will be normalized using both the mean and the standard deviation, but if your variable has a natural zero (e.g., precipitation), std=2 can be preferable (data will be normalized just as ratios to the mean values). Another option is std=1, which only applies differences to the mean values. (See the comment under the next parameter.)

rtrans
    Root transformation to apply to the data: 2 for square root, 3 for cubic root, etc (fractional numbers are allowed). Useful if your variable distribution is far from normal, as with wind speeds or precipitations from arid regions. If a near normal distribution is achieved, full normalization (std=3) can be a better option than ratios to the mean.

na.strings
    Character string to be treated as a missing value. It defaults to the R standard "NA", but can be set to any other string, e.g.: na.strings="-999.0".
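To make the std and rtrans options more concrete, here is a small base R sketch of what the three normalization types and the root transformation mean for a single series. It only illustrates the descriptions above: the helper names normalize_series and root_transform are hypothetical, they are not functions of the package, and the package's internal computation may differ.

```r
# Illustrative helpers (hypothetical names, not part of climatol).

# std = 1: differences to the mean; std = 2: ratios to the mean;
# std = 3: full normalization with mean and standard deviation.
normalize_series <- function(x, std = 3) {
  m <- mean(x, na.rm = TRUE)
  switch(as.character(std),
    "1" = x - m,
    "2" = x / m,
    "3" = (x - m) / sd(x, na.rm = TRUE),
    stop("std must be 1, 2 or 3")
  )
}

# rtrans = 2: square root; rtrans = 3: cubic root; fractional values allowed.
root_transform <- function(x, rtrans = 1) x^(1 / rtrans)

# Made-up, strongly skewed values with one missing datum
precip <- c(0, 8, 27, 64, NA, 125)

cubic <- root_transform(precip, rtrans = 3)   # compresses the long right tail
ratios <- normalize_series(precip, std = 2)   # ratios to the mean: zero stays zero
full <- normalize_series(cubic, std = 3)      # full normalization after the transform

print(round(cubic, 3))
print(round(ratios, 3))
print(round(full, 3))
```

The sketch shows why ratios suit variables with a natural zero (a zero value stays zero), and why a root transformation followed by full normalization can be reasonable once the distribution is closer to normal.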
Another example, to homogenize seasonal precipitations (four data per year) for the period 1961-2005, with station coordinates in geographical degrees, applying a cubic root transformation to the data (no example file provided):

    homogen("SsPrp", 1961, 2005, nm=4, rtrans=3)

[3] The file depurdat.R holds the homogenization functions of the package.

The command of the first example would generate the following files (in the same working directory):

Tmin_1956-2005.esh
    Station file after the homogenization. It has the same structure as the input file Tmin_1956-2005.est, but with additional columns (see the extended guide) and, probably, additional lines (when the process detects an abrupt shift in the mean, the series will be split, creating a new one with the same coordinates and adding an incremental number to the name and code of the station).

Tmin_1956-2005.dah
    Homogenized data file with missing data filled, analogous to the input data file Tmin_1956-2005.dat.

Tmin_1956-2005.txt
    Log file of the process, with all messages issued to the screen (including the final summaries).

Tmin_1956-2005.pdf
    File with a (potentially long) collection of diagnostic graphics generated during the process.

The log and graphic files may suggest re-running the process with different parametrizations (see the extended guide for an explanation), while the homogenized data files may be post-processed with the function dahstat. For example, if we want a listing of normal values for the period 1971-2000 from the above homogenized temperatures, we can get it in a file named Tmin_1971-2000.med with the command:

    dahstat("Tmin", 1956, 2005, 1971, 2000)

As you can see, the parameters are the name of the variable, the first and last years of the study