UGRID and LFRic: Polygon mesh intersections¶
I'm aiming to investigate whether it is possible to use the existing python tools to compute intersections of cells definied on the Met Office's LFRic cubesphere. A (possibly out of date but still enlightening) presentation about the LFRic datamodel can also be found here.
I'll start in this notebook with baby steps by using a c12 mesh (6 faces, 12x12 on each face), which is just 864 faces. Naturally, this is several orders of magnitude fewer than the expected mesh of c1024, which is ~6,300,000 faces, but you have to start somewhere, right?
For reference, the LFRic mesh is a cubed-sphere with quadrilateral, approximately equal area, cells.
from IPython.display import Image
Image(filename="cubesphere_ortho.png")
The go-to tool for geometry based predicates in Python is Shapely, which I'll combine with Cartopy to add a quasi-spherical geometry treatment. I'll not yet move on to real spherical treatment until I know the limitations of these tools for this data.
My ultimate aim is to develop a familiarity with UGRID data so that I can design an Iris based data-model that can be used for easily interacting with the LFRic output.
Here goes...
To start with, I've been given a NetCDF file that uses the UGRID conventions for mesh definition:
!ncdump -h cubedsphere.12x12.fixed.nc
Whilst I'm doing this work to ultimately develop a data-model for Iris, interacting with a NetCDF file directly - without any data-model or interpretation - is more easily achieved with XArray.
There seems to be some confusion about what Iris and XArray are. My very simplistic take on the two:
- Use XArray if you want to interact with a Python object that looks just like a NetCDF file
- Use Iris if you want to interact with a Python object that abstracts the file into standard conventions (CF-conventions)
The key difference being that XArray is very familiar to those comfortable with NetCDF but the lack of standards-based abstraction makes it is easy to write code for one file that does not work with another (for example, when a variable name changes).
There are definitely pros and cons to both, and I hope that one-day we will be able to bridge the gap between the two. For now though, it is fairly easy to convert from one to the other using XArray's xarray.DataArray.to_iris
function.
import xarray as xr
ds = xr.open_dataset('cubedsphere.12x12.fixed.nc')
ds
import numpy as np
xs, ys = np.rad2deg(ds['Mesh2_node_x']), np.rad2deg(ds['Mesh2_node_y'])
Now, let's get those faces as geometries:
import shapely.geometry as sgeom
def lfric_geoms(ds):
faces = ds['Mesh2_face_nodes']
offset = faces.attrs.get('start_index', 0)
if offset:
faces -= offset
faces.attrs['start_index'] = 0
for face in faces:
geom = sgeom.Polygon([(xs[i], ys[i]) for i in face])
yield geom
mesh_geoms = list(lfric_geoms(ds))
len(mesh_geoms)
So for those 864 faces, how long is it taking us to construct them?
%timeit list(lfric_geoms(ds))
Let's dive straight in and use cartopy to get a land geometry. There are so many ways of doing this (looking at you GeoPandas), but here is one I wrote a few years ago that still holds up:
import cartopy.io.shapereader as shpreader
shpfilename = shpreader.natural_earth(
resolution='110m', category='cultural', name='admin_0_countries')
countries = list(shpreader.Reader(shpfilename).records())
[Australia] = [country for country in countries
if country.attributes['NAME'] == 'Australia']
AU_geom = Australia.geometry
So I intend to check the intersection of this Australia geometry with the mesh. Whenever you do repeated intersection calculations with shapely, you should consider "preparing" it to factorise out some of the initialisation
from shapely.prepared import prep
AU_prep = prep(AU_geom)
%timeit [AU_prep.intersects(cell) for cell in mesh_geoms]
AU_geoms = [cell for cell in mesh_geoms
if AU_prep.intersects(cell)
# Pick out only small-ish cells for now. More detail later on...
if cell.area < 2000]
len(AU_geoms)
So let's take a look at these matching geometries... cartopy is the go-to tool here.
I'll also get hold of matplotlib and use the interactive backend so that I can interact with the figure while developing in the notebook.
import cartopy.crs as ccrs
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rcParams['figure.figsize'] = (4, 4)
%matplotlib notebook
This little utility function is a god-send. I should probably get it into cartopy...
def set_extent_from_geom(ax, geom):
x0, y0, x1, y1 = geom.bounds
ax.set_extent([x0, x1, y0, y1], ccrs.Geodetic())
OK, all ducks in a row, now plot the mesh cells that we found to intersect with Australia.
plt.figure()
pc_180 = ccrs.PlateCarree(central_longitude=180)
ax = plt.axes(projection=pc_180)
ax.add_geometries(AU_geoms, ccrs.PlateCarree(),
facecolor='blue', edgecolor='k', alpha=0.3)
set_extent_from_geom(ax, AU_geom.buffer(30))
ax.coastlines();
This is a good start. Let's do this on a global scale and produce a land-sea mask.
from shapely.ops import unary_union
shpfilename = shpreader.natural_earth(resolution='110m',
category='physical',
name='land')
land = unary_union(list(shpreader.Reader(shpfilename).geometries()))
# There is a problem with this geometry, so the old "buffer it by 0"
# trick should fix it up for us.
land = land.buffer(0)
land
So how many of these mesh cells intersect with land?
land_prep = prep(land)
land_cells = [cell for cell in mesh_geoms
if land_prep.intersects(cell)]
print(len(land_cells))
This seems a little low. Let's take a look:
plt.figure()
ax = plt.axes(projection=ccrs.PlateCarree())
plt.title('Land geometries')
ax.add_geometries(land_cells, ccrs.PlateCarree(),
facecolor='blue', edgecolor='k', alpha=0.3)
ax.coastlines();
The problem is that some geometries don't back project correctly to a projected space such as plate carree. Until we have spherical geometries, this is always going to be painfull.
Let's leave the painfull cells out for now..
def drop_large(geoms, area):
"""Generator of geometries that have an area below the given value."""
for geom in geoms:
if geom.area < area:
yield geom
plt.figure()
ax = plt.axes(projection=ccrs.PlateCarree())
plt.title('Land geometries with an area <1000 $degrees^2$ (in PlateCarree)')
ax.add_geometries(drop_large(land_cells, 1000),
ccrs.PlateCarree(),
facecolor='blue', edgecolor='k', alpha=0.3)
ax.coastlines();
Clearly we have had a bit of sucess here, but we must be suffering from a value range of [0,360] and [-180,180]. Our land geometry lies within [-180, 180], but our mesh doesn't:
def multipoly_geom_xlimit(multipoly):
if not multipoly.geoms:
raise ValueError('Not a multipolygon')
xmin = min([min(geom.exterior.coords)[0] for geom in multipoly.geoms])
xmax = max([max(geom.exterior.coords)[0] for geom in multipoly.geoms])
return xmin, xmax
print('Land min: {}; max: {}'.format(*multipoly_geom_xlimit(land)))
print('Cells min: {}; max: {}'.format(float(xs.min()), float(xs.max())))
In projected space these ranges are critical, and require a reprojection of one or the other geometry. Let's re-project the land geometry, as there are fewer of those. In truth though, I usually try to normalise to [-180, 180].
The trick is to convert the geometry into a PlateCarree with a central longitude at the antimeridian, and then to add 180 to the values to get us back to longitude and latitudes.
pc_180 = ccrs.PlateCarree(central_longitude=180)
# Instead of the geometry in lon/lat, let's project it to lon+180/lat.
land_180 = pc_180.project_geometry(land, ccrs.PlateCarree())
# Same trick as before - try to make this a valid geometry.
land_180 = land_180.buffer(0)
# Finally, translate the projected lon+180/lat geometry back to lon/lat.
import shapely.affinity
land_180 = shapely.affinity.translate(land_180, xoff=180)
land_180
Let's double check this geometry is as expected:
print('Land_180 min: {}; max: {}'.format(*multipoly_geom_xlimit(land_180)))
So using the original mesh cells in the range [0, 360] and our projected land in the range [0, 360], let's compute the intersections!
land_180_prep = prep(land_180)
land_180_geoms = [cell for cell in mesh_geoms
if land_180_prep.intersects(cell)]
print(len(land_180_geoms))
Great! We have more matching cells than we did before!
def draw_mesh(ax, geoms):
# Note: Writing functions that take the axes as an argument is a
# recommended paragdim. It allows easy subplot creation ;)
ax.coastlines()
ax.add_geometries(
geoms,
ccrs.PlateCarree(),
facecolor='blue', edgecolor='k', alpha=0.2)
plt.figure()
ax = plt.axes(projection=ccrs.PlateCarree())
plt.title('Land geometries with an area <1000 $degrees^2$ (in PlateCarree)')
draw_mesh(ax, drop_large(land_180_geoms, 1000));
Nope, that wasn't it. Must be an issue with the faces on the western portion of the projection.
Turns out, cartopy doesn't handle the PlateCarree geometries that are greater than 180. This does surprise me, which itself is a surprise given I wrote cartopy. For now, let's just move those cells we've identified back by 360 degrees if they are too far over.
def fixup_wraparound_geoms(geoms):
# If a cell is over 180, move it back 360.
for geom in geoms:
if max(geom.exterior.coords)[0] > 180:
yield shapely.affinity.translate(geom, xoff=-360)
else:
yield geom
plt.figure()
ax = plt.axes(projection=ccrs.PlateCarree())
plt.title('Land cells with an area $<1000 degrees^2$ in PlateCarree')
draw_mesh(ax,
fixup_wraparound_geoms(drop_large(land_180_geoms, 1000)));
More progress. Notice that we are identifying corner cells here, so no worries there!
Now let's look at the antimeridian issue with the stripe through Europe and Africa missing.
I guess at the dateline there are geometries that have a wraparound which has impacted their geometry orientation (that was probably why we were getting those really wide cells in our first plot, and why filtering large area geometries has been an effective way of removing them). This is probably a bug with the data, as it does state that the geometry should be CCW (on the sphere).
Observation: all UGRIDs are implicitly spherical if they are global and nodes are shared between geometries that cross the antimeridian.
This is getting a bit manual. Instead of playing whack-a-mole with these mesh geometries, let's try projecting the mesh the cells into plate carree from a geodetic coordinate system using cartopy.
That will make some of the cells have many more than 4 nodes, but it will be much more representative of their shape in PlateCarree. It seems highly unlikely that this approach will scale beyond the c12 mesh though...
def project_from_geodetic(geoms):
pc = ccrs.PlateCarree()
geod = ccrs.Geodetic()
for geom in geoms:
yield pc.project_geometry(geom, geod)
mesh_geoms_prj = list(project_from_geodetic(lfric_geoms(ds)))
How long did that take? Remember that the original lfric_geoms
call was ~3s:
%timeit mesh_geoms_prj = list(project_from_geodetic(lfric_geoms(ds)))
~4.5s is far better than I thought it would be, if I'm honest.
OK, let's use these nicely projected geometries and plot them all:
def draw_mesh(ax, geoms):
# Note: Writing functions that take the axes as an argument is a
# recommended paragdim. It allows easy subplot creation ;)
ax.coastlines()
ax.add_geometries(
geoms,
ccrs.PlateCarree(),
facecolor='blue', edgecolor='k', alpha=0.2)
plt.figure()
draw_mesh(plt.axes(projection=ccrs.PlateCarree()),
mesh_geoms_prj)
plt.title('Land cells with an area $<1000 degrees^2$ in PlateCarree');
Wowzers, that seems to have done the trick nicely. Let's look at this in a few different ways:
plt.figure()
draw_mesh(plt.axes(projection=ccrs.Orthographic(-30, 30)),
mesh_geoms_prj);
plt.figure(figsize=[10, 5])
ax1 = plt.subplot(1, 2, 1, projection=ccrs.NorthPolarStereo())
ax2 = plt.subplot(1, 2, 2, projection=ccrs.SouthPolarStereo())
for ax in [ax1, ax2]:
draw_mesh(ax, mesh_geoms_prj)
lim = 1e7
ax.set_extent([-lim, lim, -lim, lim], ccrs.SouthPolarStereo())
And finally, do these nicely projected mesh cells actually function as we expect when it comes to land intersection?
land_geoms = [cell for cell in mesh_geoms_prj
if land_prep.intersects(cell)]
print(len(land_geoms))
plt.figure()
draw_mesh(plt.axes(projection=ccrs.PlateCarree()),
land_geoms)
plt.title('All land geoms in PlateCarree');
Success! The indices of these cells can be back computed fairly easily:
land_geoms_ids = [id(geom) for geom in land_geoms]
land_mask = np.array([id(geom) in land_geoms_ids for geom in mesh_geoms_prj])
print('n-faces: {}, n-land-faces: {}'.format(land_mask.size, land_mask.sum()))
ds['face_lats'][land_mask]
What do the land cells look like on a South polar stereo plot?
plt.figure()
draw_mesh(plt.axes(projection=ccrs.SouthPolarStereo()),
land_geoms)
plt.title('All land geoms in South Polar Stereographic')
lim = 1.6e7
plt.gca().set_extent([-lim, lim, -lim, lim], ccrs.SouthPolarStereo());
Looks rather splendid. Remember, the containment testing happened in PlateCarree space (poles, wraparounds and all), yet we seem to have got away with this working even for Antarctica.
Conclusion¶
The approach of projecting the spherical/geodetic geometries into PlateCarree with cartopy works.
I do not expect this approach to scale to millions of cells (as is the case for c1024 - a 1024x1024x6 mesh of faces), but I will put metrics on that in a follow-up notebook. The scaling issue comes from the fact that Shapely is operating at the python level, and creating that many objects of any kind is going to be slow. This is doubly true if we are having to project those shapely geometries through cartopy (another Python layer).
What's more, it seems unlikely that we really want to visualise the results in this way for a c1024 mesh. There are more efficient rasterisation techniques (e.g. AGG and datashader) that will work very effectively and provide equivalent fidelity to a figure (but still not render all ~6,200,00 faces in the figure itself).
The wraparound and poles are going to consistently prove to be problematic unless we can handle truly spherical geometries. Cartopy gets us close, but there is no good non-spherical representation of geometries that can be done with GEOS/Shapely.
Is S2 the solution to this perhaps? Can that be made to scale? Does ESMF have a trick up its sleve for containment testing on the sphere?