Hurricane Ike Maximum Water Levels¶
Compute maximum water levels during Hurricane Ike on a 9 million node triangular mesh ADCIRC storm surge model. Visualize the results using HoloViz TriMesh rendering with Datashader.
[1]:
import xarray as xr
import numpy as np
import pandas as pd
import fsspec
Start a dask cluster to crunch the data¶
[2]:
from dask.distributed import Client
[3]:
from dask_gateway import Gateway
gateway = Gateway()
cluster = gateway.new_cluster()
[4]:
#from dask_kubernetes import KubeCluster
#cluster = KubeCluster()
[5]:
cluster
[6]:
cluster.adapt(minimum=4, maximum=20);
[7]:
client = Client(cluster)
Read the data using the cloud-friendly zarr data format¶
[8]:
ds = xr.open_zarr(fsspec.get_mapper('s3://pangeo-data-uswest2/esip/adcirc/ike', anon=False, requester_pays=True))
[9]:
#ds = xr.open_zarr(fsspec.get_mapper('gcs://pangeo-data/rsignell/adcirc_test01'))
[10]:
ds['zeta']
[10]:
<xarray.DataArray 'zeta' (time: 720, node: 9228245)> dask.array<zarr, shape=(720, 9228245), dtype=float64, chunksize=(10, 141973), chunktype=numpy.ndarray> Coordinates: * time (time) datetime64[ns] 2008-09-05T12:00:00 ... 2008-09-10T11:50:00 x (node) float64 dask.array<chunksize=(141973,), meta=np.ndarray> y (node) float64 dask.array<chunksize=(141973,), meta=np.ndarray> Dimensions without coordinates: node Attributes: location: node long_name: water surface elevation above geoid mesh: adcirc_mesh standard_name: sea_surface_height_above_geoid units: m
xarray.DataArray
'zeta'
- time: 720
- node: 9228245
- dask.array<chunksize=(10, 141973), meta=np.ndarray>
Array Chunk Bytes 53.15 GB 11.36 MB Shape (720, 9228245) (10, 141973) Count 4681 Tasks 4680 Chunks Type float64 numpy.ndarray - time(time)datetime64[ns]2008-09-05T12:00:00 ... 2008-09-...
- base_date :
- 2008-09-01 11:50:00
- long_name :
- model time
- standard_name :
- time
array(['2008-09-05T12:00:00.000000000', '2008-09-05T12:10:00.000000000', '2008-09-05T12:20:00.000000000', ..., '2008-09-10T11:30:00.000000000', '2008-09-10T11:40:00.000000000', '2008-09-10T11:50:00.000000000'], dtype='datetime64[ns]')
- x(node)float64dask.array<chunksize=(141973,), meta=np.ndarray>
- long_name :
- longitude
- positive :
- east
- standard_name :
- longitude
- units :
- degrees_east
Array Chunk Bytes 73.83 MB 1.14 MB Shape (9228245,) (141973,) Count 66 Tasks 65 Chunks Type float64 numpy.ndarray - y(node)float64dask.array<chunksize=(141973,), meta=np.ndarray>
- long_name :
- latitude
- positive :
- north
- standard_name :
- latitude
- units :
- degrees_north
Array Chunk Bytes 73.83 MB 1.14 MB Shape (9228245,) (141973,) Count 66 Tasks 65 Chunks Type float64 numpy.ndarray
- location :
- node
- long_name :
- water surface elevation above geoid
- mesh :
- adcirc_mesh
- standard_name :
- sea_surface_height_above_geoid
- units :
- m
How many GB of sea surface height data do we have?
[11]:
ds['zeta'].nbytes/1.e9
[11]:
53.1546912
Take the maximum over the time dimension and persist the data on the workers to use later. This is the computationally intensive step.
[12]:
max_var = ds['zeta'].max(dim='time').persist()
Visualize data on mesh using HoloViz.org tools¶
[13]:
import numpy as np
import datashader as dshade
import holoviews as hv
import geoviews as gv
import cartopy.crs as ccrs
import hvplot.xarray
import holoviews.operation.datashader as dshade
dshade.datashade.precompute = True
hv.extension('bokeh')
/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/core/element.py:74: SyntaxWarning: "is" with a literal. Did you mean "=="?
if key is ():
/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/core/layout.py:225: SyntaxWarning: "is" with a literal. Did you mean "=="?
if key is ():
/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/element/tabular.py:60: SyntaxWarning: "is" with a literal. Did you mean "=="?
if heading is ():
[14]:
v = np.vstack((ds['x'], ds['y'], max_var)).T
verts = pd.DataFrame(v, columns=['x','y','vmax'])
[15]:
points = gv.operation.project_points(gv.Points(verts, vdims=['vmax']))
[16]:
tris = pd.DataFrame(ds['element'].values.astype('int')-1, columns=['v0','v1','v2'])
[17]:
tiles = gv.tile_sources.OSM
[18]:
value = 'max water level'
label = '{} (m)'.format(value)
trimesh = gv.TriMesh((tris, points), label=label)
mesh = dshade.rasterize(trimesh).opts(
cmap='rainbow', colorbar=True, width=600, height=400)
[19]:
tiles * mesh
[19]:
Extract a time series at a specified lon, lat location¶
Because Xarray does not yet understand that x
and y
are coordinate variables on this triangular mesh, we create our own simple function to find the closest point. If we had a lot of these, we could use a more fancy tree algorithm.
[20]:
# find the indices of the points in (x,y) closest to the points in (xi,yi)
def nearxy(x,y,xi,yi):
ind = np.ones(len(xi),dtype=int)
for i in range(len(xi)):
dist = np.sqrt((x-xi[i])**2+(y-yi[i])**2)
ind[i] = dist.argmin()
return ind
[21]:
#just offshore of Galveston
lat = 29.2329856
lon = -95.1535041
[22]:
ind = nearxy(ds['x'].values,ds['y'].values,[lon], [lat])
[23]:
ds['zeta'][:,ind].hvplot(x='time', grid=True)
[23]: