Work on netcdf files using Python
In python, it is very convenient to use the Xarray module. If you use conda, you can install it as follows:
conda install xarray
Open netcdf files and display their content
your script to open a netcdf file should look like this:
import xarray as xr
# Open netcdf file:
file_in = 'Ocean.nc'
nc1 = xr.open_dataset(file_in)
print(nc1)
If several files are used with the same grid but different dimension names (e.g. ‘z’ and ‘deptht’), it can be convenient to rename dimensions:
if ( "depth" in ds.dims ):
nc1=nc1.rename({'depth':'z'})
If a dimension has only one value and is not used, you can remove it directly, e.g. for dimension ‘z’:
nc1 = nc1.squeeze('z')
If some variables are not well defined (e.g. conflict with dimension names), you can drop them:
nc1 = xr.open_dataset(file_in,drop_variables={'x','y'})
If your netcdf does not follow the CF conventions, you need to open the file as follows:
nc1 = xr.open_dataset(file_in,decode_cf=False)
You can also set decode_coords=False
or decode_times=False
if you don’t want to interpret coordinates or the time axis but still want to interpret other attributes (e.g. _FillValue).
To open multiple files as a single dataset, e.g.:
files1 = ['seaice_conc_monthly_20050'+month.astype('str')+'_v02r00.nc' for month in np.arange(1,10)]
files2 = ['seaice_conc_monthly_2005'+month.astype('str')+'_v02r00.nc' for month in np.arange(10,13)]
files = files1+files2
nc1 = xr.open_mfdataset(files)
Indexing and basic operations on variables
To get the ‘longitude’ dimension:
print(nc1.longitude.size)
mx=nc1.longitude.size
To get the shape or attributes of a variable named ‘toce’:
print(nc1.toce.shape)
print(nc1.toce.attrs.get('long_name'))
To get the values of a variable (for computational reasons, prefer calculations on xarray data arrays rather than extracting the values):
toce=nc1.toce.values
sst=nc1.toce.values[:,0,:,:]
To run simple operations, e.g. on multi-dimensional ‘toce’ variable:
tID=nc1.toce.get_axis_num('time') # gives integer corresponding to the 'time' axis
mean_sst=nc1.toce.mean(axis=tID) # equivalent to nc1.toce.mean(axis=0)
meanT=nc1.toce.mean()
minT=nc1.toce.min(skipna=True)
heat_content=(nc1.toce*nc1.e3t).sum('z')
To select specific dimension values or slices, e.g. for a variable var of dimensions (time, ygrid, xgrid):
var0=nc1.var.isel(time=0)
var1=nc1.var.isel(xgrid=slice(0,10),ygrid=(5,15))
var2=nc1.var.sel(time="2005-02-01")
var3=nc1.var.sel(xgrid=slice(-5.e6,-1.e6),ygrid=slice(0,1.e6))
Note that isel
selects the index, while sel
finds the nearest neighbor. To interpolate at a given coordinate value:
new_depth=np.array([0., 100., 200., 500.])
var4=nc1.var.interp(z=new_depth)
If variables ‘longitude’ and ‘latitude’ are defined on the (ygrid, xgrid) grid, i.e. are not the coordinates, you can select a specific area as follows:
var5=nc1.var.where((nc1.latitude<-60)&(nc1.latitude>-80)&(nc1.longitude<-90)&(nc1.longitude>-130))
This is also a powerful way to replace values:
var6=nc1.var.where( (nc1.var<1.), 99.) # replaces values >=1. with 99.
Writing netcdf files
If you just want to modify a netcdf file:
# read file
file_in = 'Ocean.nc'
nc1 = xr.open_dataset(file_in)
# modify a variable, e.g.:
nc1.sst = nc1.sst + 273.15
# save to a new netcdf file
file_out = 'Ocean_new.nc'
nc1.to_netcdf(file_out)
If you want to create a netcdf file from scratch, you need to first define the xarray dataset:
# define variables and coordinates (float32 for simple precision):
ds = xr.Dataset(
{
"thetao": (["time", "depth", "latitude", "longitude"], np.float32(THETAO_var)),
"zos": (["time", "latitude", "longitude"], np.float32(ZOS_var)),
"deptho": (["latitude", "longitude"], np.float32(DEPTHO_var))
},
coords={
"longitude":np.float32(lon),
"latitude": np.float32(lat),
"depth": np.float32(dep),
"time": time
},
)
# attributes/encoding for coordinate time:
ds.time.encoding['units'] = 'days since 1900-01-01'
ds.time.encoding['_FillValue'] = None
ds.time.attrs['standard_name'] = 'time'
# attributes for variable thetao:
ds.thetao.attrs['_FillValue'] = 1.e20
ds.thetao.attrs['units'] = 'degC'
ds.thetao.attrs['long_name'] = 'Sea Water Potential Temperature'
# global attribute:
ds.attrs['history'] = 'created using this script'
# write file with UNLIMITED time:
ds.to_netcdf(file_miso3d,unlimited_dims="time")
Further Reading
A list of useful tools and tutorials is provided on the MEOM groups’s page.