CRUNCEP data set





The dataset is a combination of two existing datasets: the CRU TS 3.1 monthly climatology at 0.5°x0.5°, covering the period 1901 to 2012, and the NCEP reanalysis at 2.5°x2.5° with a 6-hour time step, beginning in 1948 and available in near real time.


The rationale for this combined dataset rests on several considerations:

1/ Temporal coverage: the aim is a homogeneous dataset that covers the whole 20th century up to the present.

2/ The CRU climatology offers good spatial resolution, but only monthly mean fields are available, which is too coarse a temporal resolution for modelling purposes. The NCEP reanalysis, on the other hand, has a temporal resolution of 6 hours, which is compatible with models, but its spatial resolution is low and the precipitation of such reanalyses is known to be less reliable than the CRU data, which are based on station data.


Method for generating the data:


Between 1948 and 2012 the data are based on the CRU climatology; NCEP is used only to generate the diurnal and daily variability. The NCEP fields are first bilinearly interpolated to the 0.5°x0.5° resolution of CRU, for all fields except precipitation: because precipitation is highly heterogeneous in space and time, linear interpolation would smooth it too much. CRU provides a cloudiness that is converted to incoming solar radiation, based on a calculation of clear-sky incoming solar radiation as a function of the date and the latitude of each pixel, based on the method of . Likewise, relative humidity is converted to specific humidity as a function of temperature and surface pressure.
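The text does not say which formula is used for the humidity conversion. As an illustration only, the sketch below uses one common choice: the Bolton (1980) approximation for saturation vapour pressure, followed by the standard vapour pressure to specific humidity relation. The function name, the formula choice and the input units (percent, K, Pa) are assumptions, not the CRUNCEP implementation.

```python
import math


def specific_humidity(rh_percent, temp_k, pressure_pa):
    """Convert relative humidity to specific humidity (kg/kg).

    Assumed inputs (not stated in the source): relative humidity in percent,
    air temperature in kelvin, surface pressure in pascal.
    """
    t_c = temp_k - 273.15
    # Saturation vapour pressure in Pa (Bolton 1980 approximation; assumed choice).
    e_sat = 611.2 * math.exp(17.67 * t_c / (t_c + 243.5))
    # Actual vapour pressure from relative humidity.
    e = rh_percent / 100.0 * e_sat
    # 0.622 is the ratio of the molar masses of water vapour and dry air.
    return 0.622 * e / (pressure_pa - 0.378 * e)


# Saturated air at 20 degC and standard sea-level pressure.
q = specific_humidity(100.0, 293.15, 101325.0)
print(f"{q * 1000:.1f} g/kg")
```

The result depends on temperature through the saturation vapour pressure and on surface pressure through the denominator, which matches the two dependencies named above.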

We calculate the monthly average M of each NCEP field. The 6h value is then calculated as C6h=C*m/M, except for temperature, where C6h=C+(m-M). Here C6h is the calculated 6h value, C is the CRU monthly value, m is the NCEP 6h value interpolated to the CRU grid and M is the monthly average of m.
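The two rescaling rules can be sketched for a single pixel and a single month as follows. The function name and arguments are hypothetical, and the handling of a zero NCEP monthly mean (spreading the CRU value evenly) is an assumption that the text does not address.

```python
def rescale_6h(cru_monthly, ncep_6h, additive=False):
    """Apply C6h = C*m/M, or C6h = C+(m-M) when additive is True.

    cru_monthly: CRU monthly value C for one pixel and one month.
    ncep_6h: NCEP 6h values m for the same pixel and month.
    additive: True for temperature, False for the other fields.
    """
    m_mean = sum(ncep_6h) / len(ncep_6h)  # monthly NCEP average M
    if additive:
        return [cru_monthly + (m - m_mean) for m in ncep_6h]
    if m_mean == 0:
        # Assumption: with no NCEP signal, spread the CRU value evenly.
        return [float(cru_monthly)] * len(ncep_6h)
    return [cru_monthly * m / m_mean for m in ncep_6h]


# Temperature: the NCEP anomalies are added to the CRU monthly mean.
temperature = rescale_6h(283.0, [280.0, 290.0, 285.0, 285.0], additive=True)

# Other fields: the NCEP values are scaled so that their mean equals C.
scaled = rescale_6h(3.0, [0.0, 2.0, 4.0, 2.0])
```

In both cases the monthly mean of the result equals the CRU value C, while the sub-monthly variations come from NCEP.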

Before 1948

The procedure is the same as for the period from 1948 onward, except that the variability is taken from the 1948 data and the same variability is then applied to every year.

From version 5, the year used to define the variability of each year before 1948 is no longer fixed to 1948 but is picked at random from the period 1948 to 1968.
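A minimal sketch of the version 5 year selection is given below. The function name, the uniform draw, the inclusive end points and the use of a seeded generator are all assumptions; the text only states that the year is picked at random between 1948 and 1968.

```python
import random


def variability_year(year, rng, first=1948, last=1968):
    """Return the NCEP year that supplies the variability for `year`.

    From 1948 onward the year itself is used. Before 1948 a year is drawn
    uniformly from [first, last], both ends included (assumed behaviour).
    """
    if year >= first:
        return year
    return rng.randint(first, last)


rng = random.Random(0)  # a fixed seed keeps the mapping reproducible
mapping = {year: variability_year(year, rng) for year in range(1901, 1948)}
```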


Only rainfall, cloudiness, relative humidity and temperature are available from the CRU data. For the other fields (pressure, longwave radiation, wind speed) we directly use the NCEP field interpolated to the 0.5°x0.5° grid. Before 1948 we take the values from 1948 (hence there is no interannual variability for these fields).


New precipitation product from version 4:

Several recent studies using CRUNCEP showed that the number of rainy days in CRUNCEP is very high compared to the number of rainy days given in the CRU database. This is clearly related to the fact that CRUNCEP precipitation is rescaled to the CRU monthly mean while the intra-monthly variability is driven by NCEP. Compared to the CRU statistics, which are based on in-situ data, the NCEP precipitation is smoother because of its low resolution (2.5°x2.5°) and probably also because climate models tend to underestimate large storm events.

This is a problem for several models that are very sensitive to how precipitation is spread over time. I therefore tried to develop a new product that keeps the variability of NCEP but limits the number of rain days to what is given by CRU.

The algorithm I used is as follows:

For each pixel and each month, I rank the 6h precipitation values from the highest to the lowest.

This allows me to define a threshold precipitation level that fits the number of rain days given by CRU.

Then, if the 6h precipitation is less than the threshold, precipitation is set to 0. The residual precipitation is added to the first following time step where the threshold is reached. Hence the monthly amount of precipitation is conserved.

This yields precipitation that satisfies the double constraint of the monthly total precipitation and the number of rainy days given by CRU, while following the precipitation distribution given by NCEP.
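The steps above can be sketched for one pixel and one month as follows. This is an illustration under stated assumptions, not the code used to produce the dataset. The function name and arguments are hypothetical. A day is assumed to count as rainy when at least one of its 6h values reaches the threshold. Residual precipitation left at the end of the month, with no later time step to receive it, is assumed to be added to the last rainy time step.

```python
def limit_wet_days(precip_6h, n_wet_days, steps_per_day=4):
    """Limit the number of rainy days while conserving the monthly total.

    precip_6h: 6h precipitation values for one pixel and one month.
    n_wet_days: number of rainy days given by CRU for that month.
    """
    positive = sorted({p for p in precip_6h if p > 0}, reverse=True)
    if not positive:
        return list(precip_6h)  # dry month, nothing to redistribute

    # Walk down the ranked values and keep the lowest threshold for which
    # the number of rainy days does not exceed the CRU value.
    threshold = positive[0]
    for candidate in positive:
        days = {i // steps_per_day
                for i, p in enumerate(precip_6h) if p >= candidate}
        if len(days) > n_wet_days:
            break
        threshold = candidate

    out = [0.0] * len(precip_6h)
    carry = 0.0   # precipitation removed from steps below the threshold
    last_wet = 0
    for i, p in enumerate(precip_6h):
        if p >= threshold:
            out[i] = p + carry  # first following step reaching the threshold
            carry = 0.0
            last_wet = i
        else:
            carry += p
    # Assumption: leftover residual goes to the last rainy step.
    out[last_wet] += carry
    return out


# Three days of 6h values, limited to two rainy days.
month = [0.1, 0.0, 2.0, 0.3, 0.2, 0.1, 0.0, 0.0, 0.0, 5.0, 0.4, 0.0]
limited = limit_wet_days(month, n_wet_days=2)
```

In this example the light rain of the second day is moved to the next time step that reaches the threshold, so the second day becomes dry and the total over the three days is unchanged.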

Files description:


The file format is NetCDF. There is one file for each climate variable. There are 180 rows and 360 columns. Table 1 summarizes the file names, variable names and units.





Filename suffix, Variable name (the other column headers and the table rows are missing; the only surviving entries are "summed over the 6 hours" and "m s-1")


Table 1: Filenames and variables


Land Use Change File:


File name is, variable name is VEGETATION_TYPES and it has 4 dimensions: longitude, latitude, PFT and time. There is one map per year, covering the period 1901 to 2008.

For each year, the map gives the fraction of each plant functional type (PFT) used in ORCHIDEE.

These PFTs are:

  1. Bare soil

  2. Tropical Broadleaved Evergreen

  3. Tropical Broadleaved Raingreen

  4. Temperate Needleleaved Evergreen

  5. Temperate Broadleaved Evergreen

  6. Temperate Broadleaved Summergreen

  7. Boreal Needleleaved Evergreen

  8. Boreal Broadleaved Summergreen

  9. Boreal Needleleaved Summergreen

  10. C3 grassland

  11. C4 grassland

  12. C3 crops

  13. C4 crops

The map was obtained as described in Davin et al. (2007).


Soil map:


The soil map was obtained from the Harmonized World Soil Database (HWSD).


History of CRUNCEP:



Version 1 (2008): based on CRU TS 2.1 (1901-2002)

Version 2: new calculation of radiation from cloud cover

Version 3 (2010): based on CRU TS 3.1 (1901-2007). Improved extrapolation for 2008-2009

Version 3.1 (2011): based on CRU TS 3.2 (1901-2009)

Version 4 (2013): based on CRU TS 3.21 (1901-2012), optional new rainfall product in rain_wetdays.tar. Directory: v4_1901_2012

Version 5 (2014): based on CRU TS 3.22 (1901-2013), new rainfall based on wet days, random variability prior to 1948. Directory: v5_1901_2013