Hurricane Ike Maximum Water Levels¶

Compute maximum water levels during Hurricane Ike on a 9 million node triangular mesh ADCIRC storm surge model. Visualize the results using HoloViz TriMesh rendering with Datashader.

[1]:

import xarray as xr
import numpy as np
import pandas as pd
import fsspec

Start a dask cluster to crunch the data¶

[2]:

from dask.distributed import Client

[3]:

from dask_gateway import Gateway
gateway = Gateway()
cluster = gateway.new_cluster()

[4]:

#from dask_kubernetes import KubeCluster
#cluster = KubeCluster()

[5]:

cluster

[6]:

cluster.adapt(minimum=4, maximum=20);

[7]:

client = Client(cluster)

Read the data using the cloud-friendly zarr data format¶

[8]:

ds = xr.open_zarr(fsspec.get_mapper('s3://pangeo-data-uswest2/esip/adcirc/ike', anon=False, requester_pays=True))

[9]:

#ds = xr.open_zarr(fsspec.get_mapper('gcs://pangeo-data/rsignell/adcirc_test01'))

[10]:

ds['zeta']

[10]:

<xarray.DataArray 'zeta' (time: 720, node: 9228245)>
dask.array<zarr, shape=(720, 9228245), dtype=float64, chunksize=(10, 141973), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2008-09-05T12:00:00 ... 2008-09-10T11:50:00
    x        (node) float64 dask.array<chunksize=(141973,), meta=np.ndarray>
    y        (node) float64 dask.array<chunksize=(141973,), meta=np.ndarray>
Dimensions without coordinates: node
Attributes:
    location:       node
    long_name:      water surface elevation above geoid
    mesh:           adcirc_mesh
    standard_name:  sea_surface_height_above_geoid
    units:          m

xarray.DataArray

'zeta'

time: 720
node: 9228245

dask.array<chunksize=(10, 141973), meta=np.ndarray>

	Array	Chunk
Bytes	53.15 GB	11.36 MB
Shape	(720, 9228245)	(10, 141973)
Count	4681 Tasks	4680 Chunks
Type	float64	numpy.ndarray

Coordinates: (3)

time

(time)

datetime64[ns]

2008-09-05T12:00:00 ... 2008-09-...

base_date :: 2008-09-01 11:50:00
long_name :: model time
standard_name :: time

array(['2008-09-05T12:00:00.000000000', '2008-09-05T12:10:00.000000000',
       '2008-09-05T12:20:00.000000000', ..., '2008-09-10T11:30:00.000000000',
       '2008-09-10T11:40:00.000000000', '2008-09-10T11:50:00.000000000'],
      dtype='datetime64[ns]')

(node)

float64

dask.array<chunksize=(141973,), meta=np.ndarray>

long_name :: longitude
positive :: east
standard_name :: longitude
units :: degrees_east

	Array	Chunk
Bytes	73.83 MB	1.14 MB
Shape	(9228245,)	(141973,)
Count	66 Tasks	65 Chunks
Type	float64	numpy.ndarray

(node)

float64

dask.array<chunksize=(141973,), meta=np.ndarray>

long_name :: latitude
positive :: north
standard_name :: latitude
units :: degrees_north

	Array	Chunk
Bytes	73.83 MB	1.14 MB
Shape	(9228245,)	(141973,)
Count	66 Tasks	65 Chunks
Type	float64	numpy.ndarray

Attributes: (5)
location :
node
long_name :
water surface elevation above geoid
mesh :
adcirc_mesh
standard_name :
sea_surface_height_above_geoid
units :
m

How many GB of sea surface height data do we have?

[11]:

ds['zeta'].nbytes/1.e9

[11]:

53.1546912

Take the maximum over the time dimension and persist the data on the workers to use later. This is the computationally intensive step.

[12]:

max_var = ds['zeta'].max(dim='time').persist()

Visualize data on mesh using HoloViz.org tools¶

[13]:

import numpy as np
import datashader as dshade
import holoviews as hv
import geoviews as gv
import cartopy.crs as ccrs
import hvplot.xarray
import holoviews.operation.datashader as dshade

dshade.datashade.precompute = True
hv.extension('bokeh')

/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/core/element.py:74: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if key is ():
/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/core/layout.py:225: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if key is ():
/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/element/tabular.py:60: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if heading is ():

[14]:

v = np.vstack((ds['x'], ds['y'], max_var)).T
verts = pd.DataFrame(v, columns=['x','y','vmax'])

[15]:

points = gv.operation.project_points(gv.Points(verts, vdims=['vmax']))

[16]:

tris = pd.DataFrame(ds['element'].values.astype('int')-1, columns=['v0','v1','v2'])

[17]:

tiles = gv.tile_sources.OSM

[18]:

value = 'max water level'
label = '{} (m)'.format(value)
trimesh = gv.TriMesh((tris, points), label=label)
mesh = dshade.rasterize(trimesh).opts(
              cmap='rainbow', colorbar=True, width=600, height=400)

[19]:

tiles * mesh

[19]:

Extract a time series at a specified lon, lat location¶

Because Xarray does not yet understand that x and y are coordinate variables on this triangular mesh, we create our own simple function to find the closest point. If we had a lot of these, we could use a more fancy tree algorithm.

[20]:

# find the indices of the points in (x,y) closest to the points in (xi,yi)
def nearxy(x,y,xi,yi):
    ind = np.ones(len(xi),dtype=int)
    for i in range(len(xi)):
        dist = np.sqrt((x-xi[i])**2+(y-yi[i])**2)
        ind[i] = dist.argmin()
    return ind

[21]:

#just offshore of Galveston
lat = 29.2329856
lon = -95.1535041

[22]:

ind = nearxy(ds['x'].values,ds['y'].values,[lon], [lat])

[23]:

ds['zeta'][:,ind].hvplot(x='time', grid=True)

[23]: