Climate datasets are available on the Climate Data Store (CDS) at https://cds.climate.copernicus.eu/datasets
To access and download climate data from the CDS, a user must:
1) Create an ECMWF account: click on the Log-in / register button (top right icon on the CDS website)
2) Install the CDS API and create a .cdsapirc file in their home directory (usually /home/username/.cdsapirc on Linux, /Users/username/.cdsapirc on macOS, or %USERPROFILE%\.cdsapirc, e.g. C:\Users\Username\.cdsapirc, on Windows). The .cdsapirc file contains two lines: the URL of the CDS API and a personal access token (key) used to access the data. Detailed instructions and examples are available at https://cds.climate.copernicus.eu/how-to-api
3) Send a request using the CDS API in Python to retrieve NetCDF (or GRIB) files
This example focuses on Python in a Jupyter notebook. R packages that use the same API are also available; details are at https://bluegreen-labs.github.io/ecmwfr/
####################################################################################
# Install required packages automatically if not already installed on the system
####################################################################################
import sys
!{sys.executable} -m pip install numpy
!{sys.executable} -m pip install matplotlib
!{sys.executable} -m pip install pandas
!{sys.executable} -m pip install cdsapi
!{sys.executable} -m pip install netCDF4
!{sys.executable} -m pip install cartopy
!{sys.executable} -m pip install xarray
# Note: datetime, calendar, math and os are part of the Python standard library and do not need to be installed
Please paste your URL and KEY below to acquire access to the Climate Data Store.
To open an account or obtain a key, please follow the instructions at https://cds.climate.copernicus.eu/how-to-api
Please note that the following cell overwrites any existing .cdsapirc file (and key) if you run it outside this container!
import os
url = "https://cds.climate.copernicus.eu/api"
key = "<PERSONAL-ACCESS-TOKEN>"
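# Write the two-line configuration file to the home directory (this overwrites an existing file)
with open(os.path.expanduser("~/.cdsapirc"), "w") as f:
    f.write(f"url: {url}\n")
    f.write(f"key: {key}\n")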
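As a quick sanity check, the sketch below reads the configuration file back and confirms that it holds the two expected entries. The helper names parse_cdsapirc and looks_valid are hypothetical (they are not part of the cdsapi package), and the parsing assumes the simple two-line "name: value" layout described above.

```python
import os

def parse_cdsapirc(text):
    """Parse 'name: value' lines into a dictionary (hypothetical helper)."""
    config = {}
    for line in text.splitlines():
        # Split on the first colon only, because the URL itself contains colons
        name, sep, value = line.partition(":")
        if sep:
            config[name.strip()] = value.strip()
    return config

def looks_valid(config):
    """True if a URL is present and the key placeholder has been replaced."""
    has_url = config.get("url", "").startswith("https://")
    has_key = config.get("key", "") not in ("", "<PERSONAL-ACCESS-TOKEN>")
    return has_url and has_key

rc_path = os.path.expanduser("~/.cdsapirc")
if os.path.exists(rc_path):
    with open(rc_path) as f:
        print("Configuration looks valid:", looks_valid(parse_cdsapirc(f.read())))
else:
    print("No .cdsapirc file found in the home directory")
```

This only checks the layout of the file; whether the key is actually accepted is only known once a request is sent to the CDS.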
##############################
# Import required packages
##############################
import cdsapi
import netCDF4 as nc
import numpy as np
import pandas as pd
import math
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from cartopy.util import add_cyclic_point
import cartopy.mpl.ticker as cticker
import xarray as xr
from calendar import monthrange
from datetime import datetime, timedelta
The user guide and further information about the CDS are available at https://cds.climate.copernicus.eu/user-guide
For the first example, we will use the CDS website to copy and paste the request directly:
1) In the search bar type "ERA5 daily"
2) Select "ERA5 post-processed daily statistics on single levels from 1940 to present" [https://cds.climate.copernicus.eu/datasets/derived-era5-single-levels-daily-statistics?tab=overview]
3) Three tabs are available: (i) "Overview" provides details about the dataset, (ii) "Download" lets you select variables and download the data, and (iii) "Documentation" provides links to related scientific publications and technical reports
4) Click on the Download tab
5) Select "2m temperature", year "2025", months "July" and "August", "Day -> Select all" and "Frequency: 1-hourly"
6) Under "Terms of use", accept the licence agreement (this only needs to be done once per dataset)
7) The data can then be downloaded directly in a browser by pressing the "Submit Form" button, but we will use the API in the following example
8) Under "API request", click on "Show API request code". This piece of code needs to be copied into the cell below. Note that we already imported the cdsapi package earlier ("import cdsapi") and that we modified the output file name and directory (last line of the code below)
# Define output directory and file name
outdir = "Data"
outfile = "example1_ERA5.nc" # Name of the output netcdf file
# Create output directory if it does not exist
if not os.path.exists(outdir):
    os.mkdir(outdir)
outfile = outdir + "/" + outfile
#############################################################################
# Copy and paste request from website below (1st example is provided below)
#############################################################################
dataset = "derived-era5-single-levels-daily-statistics"
request = {
"product_type": "reanalysis",
"variable": ["2m_temperature"],
"year": "2025",
"month": ["07", "08"],
"day": [
"01", "02", "03",
"04", "05", "06",
"07", "08", "09",
"10", "11", "12",
"13", "14", "15",
"16", "17", "18",
"19", "20", "21",
"22", "23", "24",
"25", "26", "27",
"28", "29", "30",
"31"
],
"daily_statistic": "daily_mean",
"time_zone": "utc+00:00",
"frequency": "1_hourly"
}
client = cdsapi.Client()
client.retrieve(dataset, request).download(outfile) # we added an output file name that will be saved in outdir
We have now downloaded global daily temperature data based on the ERA5 dataset for July-August 2025. The output file (Data/example1_ERA5.nc) is in netcdf format (*.nc) and contains gridded temperature data and the associated metadata. We will first read the netcdf file directly into Python and print some information about the variables, dimensions and attributes
#######################################################################################
# Read climate data file (previously defined as outfile) and print basic information
#######################################################################################
ds = nc.Dataset(outfile)
# Print some information about the netcdf file
print(ds)
# The previous command provides information about the file format, the variables, their dimensions and metadata
# Temperature is a 3D variable t2m(valid_time, latitude, longitude)
# Time has 62 points valid_time(62) for July-August 2025; latitude(721) and longitude(1440) have 721 and 1440 data points
# Global attributes (metadata) can also be accessed as a dictionary
print(ds.__dict__)
# More information about the temperature variable
print(ds['t2m']) # Temperature is in Kelvin and stored as a 3D array (ntime=62, nlat=721, nlon=1440)
print(ds['latitude']) # lat in degrees
print(ds['longitude']) # Lon in degrees
print(ds['valid_time']) # Time
# We can also print the latitude and longitude numerical values below
# standard resolution from our former API request is 0.25deg for both lat and Lon
print(ds['latitude'][:])
print(ds['longitude'][:])
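The dimension sizes printed above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the helper regular_grid_size is a hypothetical name, and it assumes a global regular grid that includes both poles and does not repeat the 0/360 meridian.

```python
import calendar

def regular_grid_size(resolution_deg):
    """Number of latitude and longitude points on a global regular grid."""
    nlat = round(180 / resolution_deg) + 1  # 90N to 90S, both poles included
    nlon = round(360 / resolution_deg)      # full circle, meridian not repeated
    return nlat, nlon

nlat, nlon = regular_grid_size(0.25)
# Number of daily values for July and August 2025
ntime = sum(calendar.monthrange(2025, month)[1] for month in (7, 8))
print(ntime, nlat, nlon)  # prints: 62 721 1440
```

If the numbers reported for the downloaded file differ from these, the request probably used a different period, area or grid.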
##########################################################################
# Plot a map - averaging over the time dimension
##########################################################################
# xarray can also be used to read nc files and print file info
dset = xr.open_dataset(outfile)
print(dset)
# Calculate average on the time dimension
ds_mean=dset.mean(dim='valid_time')
ds_mean = ds_mean -273.15
# Make the figure larger
fig = plt.figure(figsize=(11,8.5))
# Set the axes using the specified map projection
ax=plt.axes(projection=ccrs.PlateCarree())
levels = np.linspace(-15, 35, 26)
# Make a filled contour plot
cs=ax.contourf(dset['longitude'], dset['latitude'], ds_mean['t2m'],
transform = ccrs.PlateCarree(),cmap='coolwarm',extend='both',levels = levels)
# Add coastlines
ax.coastlines()
# Define the xticks for longitude
ax.set_xticks(np.arange(-180,181,60), crs=ccrs.PlateCarree())
lon_formatter = cticker.LongitudeFormatter()
ax.xaxis.set_major_formatter(lon_formatter)
# Define the yticks for latitude
ax.set_yticks(np.arange(-90,91,30), crs=ccrs.PlateCarree())
lat_formatter = cticker.LatitudeFormatter()
ax.yaxis.set_major_formatter(lat_formatter)
# Add colorbar
cbar = plt.colorbar(cs,shrink=0.7,orientation='horizontal',label='Surface Air Temperature (C)')
##########################################################################
# Plot time series for a selected location (Larnaca)
##########################################################################
latsel = 34.9
lonsel = 33.6 # Larnaca
# Extract a dataset closest to specified point
dsloc = dset.sel(longitude=lonsel, latitude=latsel, method='nearest')
# select a variable to plot
dsloc['t2m'].plot()
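To see what a nearest-neighbour selection does, the sketch below repeats it by hand on plain Python lists. It is an illustration of the idea, not of how xarray implements it; nearest_value is a hypothetical helper, and the longitude axis is assumed to run from 0 to 359.75 degrees.

```python
def nearest_value(values, target):
    """Return the grid value closest to the target coordinate."""
    return min(values, key=lambda v: abs(v - target))

# 0.25 degree global grid: latitude from 90 to -90, longitude from 0 to 359.75 (assumed)
latitudes = [90 - 0.25 * i for i in range(721)]
longitudes = [0.25 * i for i in range(1440)]

lat_near = nearest_value(latitudes, 34.9)   # requested latitude for Larnaca
lon_near = nearest_value(longitudes, 33.6)  # requested longitude for Larnaca
print(lat_near, lon_near)  # prints: 35.0 33.5
```

The time series plotted for Larnaca therefore belongs to the grid cell centred on 35.0N, 33.5E, about 0.1 degree away from the requested point in each direction.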
The next example uses a generic function to retrieve ERA5 rainfall and temperature data from the CDS for Cyprus; the monthly files it downloads are later merged into a single file. Note that several variables and options are available when sending an API request to the CDS. More details on ERA5 daily data can be found at https://confluence.ecmwf.int/display/CKB/ERA5+family+post-processed+daily+statistics+documentation
def download_era5_data(varlist, latmin, latmax, lonmin, lonmax, year_start, year_end, gridres, outdir):
    # Create the output directory (where the climate data are stored) if it does not exist
    os.makedirs(outdir, exist_ok=True)
    years = [str(yr) for yr in range(year_start, year_end + 1)]
    yearnow = datetime.now().year
    monthnow = datetime.now().month
    ########################################################
    dataset = "derived-era5-single-levels-daily-statistics"
    for var in varlist:
        for yr in years:
            # For the current year, only request the months that are already complete
            if yr == str(yearnow):
                nmonth = monthnow - 1
            else:
                nmonth = 12
            months = [f"{mn:02}" for mn in range(1, nmonth + 1)]
            for mn in months:
                # One output file per variable and per month
                outfile = var + "_1d_" + yr + "_" + mn + "_ERA5.nc"
                outfile = outdir + "/" + outfile
                # Number of days in the month (accounts for leap years)
                ndays = monthrange(int(yr), int(mn))[1]
                days = [f"{dy:02}" for dy in range(1, ndays + 1)]
                request = {
                    'product_type': ['reanalysis'],
                    'variable': var,
                    'year': yr,
                    'month': mn,
                    'day': days,
                    'data_format': 'netcdf',
                    'grid': gridres,
                    'area': [latmax, lonmin, latmin, lonmax],  # North, West, South, East
                    "daily_statistic": "daily_mean",
                    "time_zone": "utc+00:00",
                    "frequency": "1_hourly"
                }
                # Skip files that have already been downloaded
                if not os.path.isfile(outfile):
                    client = cdsapi.Client()
                    client.retrieve(dataset, request, outfile)
########################################################
# Next block calls the function we defined earlier
#########################################################
#varlist = ["2m_temperature", "total_precipitation"] # variables (temperature and rainfall)
varlist = ["total_precipitation"] # variables (rainfall only)
# Define domain (Cyprus)
latmin = 34.5
latmax = 35.75
lonmin = 32.25
lonmax = 34.75
# Define start year for ERA5 data - test for 2024
year_start = 2024
year_end = 2024
# Spatial resolution of ERA5 data 0.25, 0.5 or 1 deg res
gridres = "0.25/0.25"
# Call the former function to automate the data download process
download_era5_data(varlist, latmin, latmax, lonmin, lonmax, year_start, year_end, gridres, outdir)
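Because the download function writes one file per variable and month, it can be useful to list the files that should exist and check whether any are missing before moving on. The sketch below is a hypothetical helper (expected_files is not part of cdsapi) that mirrors the file naming and the "complete months only" rule of the function above; the today argument is an assumption added so the example is reproducible.

```python
import os
from datetime import date

def expected_files(var, year_start, year_end, outdir, today=None):
    """List the monthly files the download function is expected to create."""
    today = today or date.today()
    files = []
    for year in range(year_start, year_end + 1):
        # Same rule as the download function: only complete months of the current year
        nmonth = today.month - 1 if year == today.year else 12
        for month in range(1, nmonth + 1):
            files.append(f"{outdir}/{var}_1d_{year}_{month:02}_ERA5.nc")
    return files

files = expected_files("total_precipitation", 2024, 2024, "Data", today=date(2025, 9, 1))
missing = [f for f in files if not os.path.isfile(f)]
print(len(files), files[0])  # prints: 12 Data/total_precipitation_1d_2024_01_ERA5.nc
print("Missing files:", len(missing))
```

Re-running the download function after an interrupted session only fetches the files that are still missing, since existing files are skipped.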
We will now download temperature data for Europe based on the CERRA data https://doi.org/10.24381/cds.622a565a
In the Download tab, select the following:
Variable - 2m temperature
Level Type - Surface
Data Type - Reanalysis
Product type - Analysis
Year 2024
Month August
Day Select all
Time Select all
Data Format netcdf
Accept the terms of use
Then you just need to copy and paste the API request (example below)
# Define output directory and file name
outfile = "example2_CERRA.nc" # Name of the output netcdf file
# Create output directory if it does not exist
if not os.path.exists(outdir):
    os.mkdir(outdir)
outfile = outdir + "/" + outfile
#############################################################################
# Copy and paste request from website below (2nd example is provided below)
#############################################################################
dataset = "reanalysis-cerra-single-levels"
request = {
"variable": ["2m_temperature"],
"level_type": "surface_or_atmosphere",
"data_type": ["reanalysis"],
"product_type": "analysis",
"year": ["2024"],
"month": ["08"],
"day": [
"01", "02", "03",
"04", "05", "06",
"07", "08", "09",
"10", "11", "12",
"13", "14", "15",
"16", "17", "18",
"19", "20", "21",
"22", "23", "24",
"25", "26", "27",
"28", "29", "30",
"31"
],
"time": [
"00:00", "03:00", "06:00",
"09:00", "12:00", "15:00",
"18:00", "21:00"
],
"data_format": "netcdf"
}
client = cdsapi.Client()
client.retrieve(dataset, request).download(outfile)
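The day and time lists in a request do not have to be typed out or pasted; they can be generated. The sketch below shows one way to do this with the standard library. The helper names request_days and request_times are hypothetical, and the 3-hour step simply reproduces the times selected in the request above.

```python
import calendar

def request_days(year, month):
    """Two-digit day strings for every day of the given month."""
    ndays = calendar.monthrange(year, month)[1]
    return [f"{day:02}" for day in range(1, ndays + 1)]

def request_times(step_hours=3):
    """Time strings in HH:00 format, one every step_hours hours."""
    return [f"{hour:02}:00" for hour in range(0, 24, step_hours)]

days = request_days(2024, 8)
times = request_times()
# Number of time steps expected in the August 2024 file
print(len(days) * len(times))  # prints: 248
```

These lists can be assigned to the "day" and "time" entries of the request dictionary in place of the hand-written ones.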
ncdump and ncgen are useful tools to inspect the content of a NetCDF file. Note that these tools run on macOS (installed with brew) or Linux-based systems; instructions for installing Ubuntu on a Windows machine, along with the other software, are provided at the end. We will use a Linux console to showcase ncdump and CDO functionalities.
On the Jupyter notebook in Cyprus, you can open a Linux console (File -> New -> Console). Otherwise, you will have to install Ubuntu on Windows using the information provided at the end.
We will now use the example files we downloaded earlier
First you need to be in the Data directory that should be in the current directory:
cd Data (change directory) in a console
ls -la (list files in the current directory)
Then you can print information about the header of one netcdf file by typing:
ncdump -h example1_ERA5.nc
You will see all variable names, their associated dimensions and attributes
We can also print the values of a specific variable, for example try:
ncdump -v latitude example1_ERA5.nc
This command will print the latitude values in the file (from 90N to 90S in 0.25deg increments)
To check the version of the Netcdf file type:
ncdump -k example1_ERA5.nc
This command should return netCDF-4, the version of the netcdf file
https://www.youtube.com/watch?v=ggp6pEHllgU
A tutorial is also available at https://ncar.github.io/CESM-Tutorial/notebooks/resources/netcdf.html
The Climate Data Operators (CDO) software is a powerful tool to manipulate and process climate data and NetCDF files (interpolation, computation, file concatenation, time averaging and statistics).
CDO is based on operators and is usually called by typing
cdo operatorname infile.nc outfile.nc
First we will concatenate daily files, the generic syntax is:
cdo mergetime list_of_files*.nc outfile.nc
in the Data directory type:
cdo mergetime total_precipitation_1d_* precip_2024.nc
This command will concatenate all data into a single file
You can now check that the files have been merged by typing
ncdump -h precip_2024.nc
The time dimension is now 366 (2024 is a leap year), so the monthly rainfall files have been concatenated into a single record for 2024. You can then usually remove the monthly files and work only with the merged 2024 file
We can calculate monthly means using the monmean operator, the generic syntax is:
cdo monmean infile.nc outfile.nc
For our example, in the Data directory type:
cdo monmean precip_2024.nc precip_2024_monmean.nc
Similarly, if you now type 'ncdump -h precip_2024_monmean.nc', the time dimension is 12 (monthly data for 2024)
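The idea behind a monthly mean can be illustrated in a few lines of Python: group the daily values by calendar month and average each group. This is a sketch of the concept on a made-up series, not a description of how CDO implements the operator; monthly_means is a hypothetical helper.

```python
from collections import defaultdict
from datetime import date, timedelta

def monthly_means(dates, values):
    """Average daily values by (year, month)."""
    groups = defaultdict(list)
    for d, v in zip(dates, values):
        groups[(d.year, d.month)].append(v)
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}

# Synthetic daily series for 2024 (made-up values: each day holds its month number)
dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(366)]
values = [float(d.month) for d in dates]

means = monthly_means(dates, values)
print(len(dates), len(means))  # prints: 366 12
```

The 366 daily values collapse to 12 monthly values, which matches the change in the time dimension reported by ncdump.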
We will now use the sellonlatbox operator to spatially subset global data (example1_ERA5.nc). The generic syntax (with no spaces after the commas) is:
cdo sellonlatbox,lonmin,lonmax,latmin,latmax infile.nc outfile.nc
For Europe, type in the Data directory:
cdo sellonlatbox,-15.,30.,25.,60. example1_ERA5.nc example1_ERA5_Europe.nc
We have now subset the European region from the global data in example1_ERA5_Europe.nc.
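The sketch below illustrates the logic of a longitude-latitude box selection when the box uses negative longitudes. It is a simplified illustration rather than CDO's algorithm: to_pm180 and select_box are hypothetical helpers, and the global longitude axis is assumed to run from 0 to 359.75 degrees, so it has to be shifted to the -180 to 180 convention before comparing.

```python
def to_pm180(lon):
    """Convert a longitude from the 0..360 convention to -180..180."""
    return ((lon + 180.0) % 360.0) - 180.0

def select_box(lons, lats, lonmin, lonmax, latmin, latmax):
    """Keep the grid coordinates that fall inside the box (edges included)."""
    sel_lons = sorted(to_pm180(lon) for lon in lons if lonmin <= to_pm180(lon) <= lonmax)
    sel_lats = [lat for lat in lats if latmin <= lat <= latmax]
    return sel_lons, sel_lats

# 0.25 degree global grid (longitude 0 to 359.75 is an assumption)
longitudes = [0.25 * i for i in range(1440)]
latitudes = [90 - 0.25 * i for i in range(721)]

sel_lons, sel_lats = select_box(longitudes, latitudes, -15.0, 30.0, 25.0, 60.0)
print(len(sel_lons), len(sel_lats))  # prints: 181 141
```

Running 'ncdump -h example1_ERA5_Europe.nc' shows the dimensions CDO actually produced, which can be compared with these counts.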
If you have installed ncview (see instructions at the end for installation) you can rapidly visualize the data using
ncview example1_ERA5_Europe.nc &
You can also use CDO for spatial interpolation. We will now interpolate the CERRA data (2D lat-lon grid) onto the ERA5 grid (regular 0.25deg grid) using bilinear interpolation. The generic syntax is:
cdo remapbil,targetgridfile.nc infile.nc infile_interp.nc
For our example, in the data directory type:
cdo remapbil,example1_ERA5_Europe.nc example2_CERRA.nc example2_CERRA_interp.nc
The CERRA data has been interpolated onto the ERA5 grid for Europe
You can check the new file dimensions by typing
ncdump -h example2_CERRA_interp.nc
or use ncview to visualize the interpolated data:
ncview example2_CERRA_interp.nc &
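For readers unfamiliar with bilinear interpolation, the sketch below applies the textbook formula inside a single rectangular cell: interpolate along x on the two edges, then along y between the results. It is only meant to convey the idea; remapping from a grid with 2D latitude and longitude arrays, as CDO does for CERRA, involves more steps. The function name and the corner values are made up for illustration.

```python
def bilinear(x, y, x0, x1, y0, y1, f00, f10, f01, f11):
    """Bilinear interpolation inside the rectangle [x0, x1] x [y0, y1].

    f00 is the value at (x0, y0), f10 at (x1, y0), f01 at (x0, y1), f11 at (x1, y1).
    """
    tx = (x - x0) / (x1 - x0)  # fractional position along x
    ty = (y - y0) / (y1 - y0)  # fractional position along y
    bottom = f00 * (1 - tx) + f10 * tx
    top = f01 * (1 - tx) + f11 * tx
    return bottom * (1 - ty) + top * ty

# Value at the centre of a unit cell: the average of the four corners
print(bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0))  # prints: 2.5
```

At a corner the function returns the corner value itself, and at the cell centre it returns the mean of the four corners, which is a handy check when validating interpolated fields.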
# CDO is a powerful tool to process large numbers of climate data files. It can be combined with bash scripts to automate
# file processing, or called from Python or R using system commands (os.system() in Python and system(command, options) in R)
# For example, we can run ncdump directly from this notebook using a system command:
command = "ncdump -h ./Data/example2_CERRA.nc" # command as a string
os.system(command) # runs the command from Python; this requires ncdump to be installed (Ubuntu or macOS)
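An alternative to os.system is the subprocess module from the standard library, which returns the command output as a Python string and makes it easy to handle the case where the tool is not installed. The sketch below is one possible approach; ncdump_header is a hypothetical helper name.

```python
import shutil
import subprocess

def ncdump_header(path):
    """Return the ncdump header of a NetCDF file as text, or None if ncdump is not installed."""
    if shutil.which("ncdump") is None:
        return None
    result = subprocess.run(["ncdump", "-h", path], capture_output=True, text=True)
    # On failure (for example a missing file), return the error message instead
    return result.stdout if result.returncode == 0 else result.stderr

header = ncdump_header("./Data/example2_CERRA.nc")
print(header if header is not None else "ncdump was not found on this system")
```

Passing the command as a list of arguments avoids problems with spaces in file names, and the same pattern can be used for cdo commands.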
The following shows how to concatenate the files that were downloaded from the CDS for Cyprus and do a few plots
We will now use a terminal to access the local Linux system and use CDO from the command line (details on installing useful tools on Windows are provided below).
Open PowerShell or Terminal as an Administrator
Click the Start button and search for "Terminal" or "PowerShell".
Right-click the top result and select Run as administrator.
Confirm any User Account Control (UAC) prompts.
Run the Installation Command
In the administrator terminal, type the following command and press Enter:
wsl --install
This command will perform several actions:
Enable the necessary Windows optional features.
Download and install the latest WSL kernel.
Set WSL2 as the default environment.
Download and install the default Linux distribution (Ubuntu) from the Microsoft Store.
Restart Your Computer
After the command finishes, you'll be prompted to restart your computer to complete the installation.
Set Up Your Linux Distribution
Once your computer restarts, a Linux terminal will automatically open to begin the setup process.
You will be asked to create a Unix username and password.
Enter your desired username and then create and confirm a password for it.
After Installation
Your Linux distribution is now ready to use. You can access it by opening the Start menu and searching for the name of your distribution, such as "Ubuntu".
For a full list of available Linux distributions, open an administrator terminal and run wsl --list --online.
To install a different distribution, use the command wsl --install -d
Step 1: Install NetCDF Install the NetCDF tools in a linux terminal:
sudo apt install netcdf-bin
Step 2: Verify the Installation Check if the installation was successful by running: ncdump -h
Step 1: Open the WSL Terminal Launch your WSL terminal (e.g., Ubuntu).
Step 2: Update Packages Run the following command to update your package list:
sudo apt update && sudo apt upgrade
Step 3: Install CDO by running:
sudo apt install cdo
Step 4: Verify the Installation Check the installed version of CDO:
cdo -V
For Ubuntu:
sudo apt-get install ncview
For macOS:
brew install ncview
Then, to visualize the data, you can type in a terminal:
ncview file.nc &
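Once everything is installed, a short Python check can confirm that the command-line tools used in this tutorial are available on the system path. This is a minimal sketch using the standard library; find_tools is a hypothetical helper name.

```python
import shutil

def find_tools(names=("ncdump", "ncgen", "cdo", "ncview")):
    """Map each tool name to its location on the PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}

for name, location in find_tools().items():
    print(f"{name}: {location or 'not found'}")
```

When this runs in a notebook on Windows, it reports the tools visible to Windows itself; tools installed inside WSL are found only when Python also runs inside WSL.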