Using ICESat-2 ATL15 (Gridded Arctic Land Ice Height) to investigate ice-surface height anomalies#


Key learning outcomes:#

  • How to gather data from disparate sources.

  • What a Coordinate Reference System (CRS) is and why it matters.

  • How to use geometries including Points and Polygons to define an area of interest and subset data.

  • The basics of how the icepyx library simplifies obtaining and interacting with ICESat-2 data.

  • How Xarray can simplify the import of multi-dimensional data.

  • How to open, plot, and explore gridded raster data.

Computing environment#

We will set up our computing environment with library imports and utility functions.

Tip: If you need to import a library that is not pre-installed, use %pip install <package-name> alone within a Jupyter notebook cell to install it for this instance of CryoCloud (the pip installation will not persist between logins).

# Needed for pandas read_excel
%pip install openpyxl
%matplotlib widget

# Import libraries
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import earthaccess
import geopandas as gpd
import h5py
import hvplot.xarray
import icepyx as ipx
from IPython.display import Image, display
import matplotlib.colors as colors
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from mpl_toolkits.axes_grid1 import make_axes_locatable
import numpy as np
import pandas as pd
import os
from pyproj import CRS, Transformer
import rioxarray
import s3fs
import xarray as xr

# define utility function
def ll2ps(lon, lat):
    """
    Transform coordinates from geodetic coordinates (lon, lat)
    to Greenland (epsg:3413) coordinates (x, y)
    x, y = ll2ps(lon, lat)
    Inputs
    * lon, lat in decimal degrees (lon: W is negative; lat: S is negative)
    Outputs
    * x, y in [m]
    """
    crs_ll = CRS("EPSG:4326")
    crs_xy = CRS("EPSG:3413")
    ll_to_xy = Transformer.from_crs(crs_ll, crs_xy, always_xy = True)
    x, y = ll_to_xy.transform(lon, lat)
    return x, y
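
As a quick sanity check of ll2ps (the call below is purely illustrative): EPSG:3413 has its central meridian at 45°W, so a point on that meridian should map to x ≈ 0.

# Sanity-check ll2ps: 45 W is the EPSG:3413 central meridian, so x should be ~0 m
x, y = ll2ps(-45.0, 70.0)
print(f'x = {x:.1f} m, y = {y:.1f} m')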

Greenland subglacial lakes#

There are two classes of subglacial lakes: stable and active. Stable lakes have maintained a steady volume over the observational record, whereas active lakes episodically drain and fill. We can observe active subglacial lakes using ice-surface deformation time series.

In this tutorial, we will focus on Greenland where active subglacial lakes have been inferred from surface height changes.

Northern hemisphere subglacial lakes compiled in a global inventory by Livingstone and others (2022):

N. hemisphere subglacial lakes

We can investigate active subglacial lakes using ice surface height anomalies as the overlying ice deforms when active lakes drain and fill episodically. These videos from NASA’s Science Visualization Studio illustrate how we observe active lakes filling and draining through time.

Locating the Greenland subglacial lakes#

Consulting the supplementary data of the Livingstone and others (2022) inventory, we can construct a GeoPandas GeoDataFrame to investigate Greenland’s subglacial lakes.

Opening non-cloud-hosted data uploaded to your CryoCloud Jupyter hub#

If we are working with a dataset that is not cloud-hosted and not available via a URL, we can upload it to CryoCloud.

First, decide where the uploaded data will live on your CryoCloud hub. You could create a data directory in your CryoCloud base directory, or put the data into your working directory (whichever file-management approach you prefer).

Second, download the paper’s supplementary data by clicking here.

Third, upload the supplementary data spreadsheet into your new data directory folder or your working directory. The supplementary data spreadsheet (.xlsx) is already uploaded to our working directory.

# Read in spreadsheet using pandas read_excel
use_cols = ['Name / Location', 'Lat. oN', 'Lon. oE', 'Lake Type', 'References']
import_rows = np.arange(0,65)
# You may add a path/to/file if you'd like to try reading the spreadsheet from where you saved it in a data directory
df = pd.read_excel('43017_2021_246_MOESM1_ESM.xlsx', 'Greenland', usecols=use_cols, skiprows = lambda x: x not in import_rows)

# View pandas dataset head (first 5 rows)
df.head()
Name / Location Lat. oN Lon. oE Lake Type References
0 NaN 78.000000 -48.000000 NaN Ekholm et al. (1998)
1 L2 78.005022 -68.393971 Stable Palmer et al. (2013)
2 L1 77.969500 -68.440875 Stable Palmer et al. (2013)
3 Flade Isblink ice cap 81.160000 -16.580000 Active Willis et al. (2015)
4 Inuppaat Quuat 67.611136 -48.709000 Active Howat et al. (2015); Palmer et al. (2015)

However, this data has a direct download URL, so we can read the data directly into our notebook, skipping the download and upload steps:

Open data directly via URL#

# Read in spreadsheet using pandas read_excel
url = 'https://static-content.springer.com/esm/art%3A10.1038%2Fs43017-021-00246-9/MediaObjects/43017_2021_246_MOESM1_ESM.xlsx'
use_cols = ['Name / Location', 'Lat. oN', 'Lon. oE', 'Lake Type', 'References']
import_rows = np.arange(0,65)
df = pd.read_excel(url, sheet_name='Greenland', usecols=use_cols, skiprows = lambda x: x not in import_rows)

# View pandas dataset head (first 5 rows)
df.head()
Name / Location Lat. oN Lon. oE Lake Type References
0 NaN 78.000000 -48.000000 NaN Ekholm et al. (1998)
1 L2 78.005022 -68.393971 Stable Palmer et al. (2013)
2 L1 77.969500 -68.440875 Stable Palmer et al. (2013)
3 Flade Isblink ice cap 81.160000 -16.580000 Active Willis et al. (2015)
4 Inuppaat Quuat 67.611136 -48.709000 Active Howat et al. (2015); Palmer et al. (2015)

This is looking good, but we can make it even better by storing the data in a GeoPandas GeoDataFrame, which offers additional functionality beyond pandas: you can add a geometry column of Shapely objects that makes geospatial data processing and visualization easier. Here we use Shapely Points:

# Create GeoPandas GeoDataFrame from Pandas DataFrame
gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df['Lon. oE'], df['Lat. oN']))

# Set the Coordinate Reference System (CRS) of the geodataframe
if gdf.crs is None: 
    # set CRS WGS84 in lon, lat
    gdf.set_crs('epsg:4326', inplace=True)
    
# Display GeoDataFrame
gdf
Name / Location Lat. oN Lon. oE Lake Type References geometry
0 NaN 78.000000 -48.000000 NaN Ekholm et al. (1998) POINT (-48.00000 78.00000)
1 L2 78.005022 -68.393971 Stable Palmer et al. (2013) POINT (-68.39397 78.00502)
2 L1 77.969500 -68.440875 Stable Palmer et al. (2013) POINT (-68.44088 77.96950)
3 Flade Isblink ice cap 81.160000 -16.580000 Active Willis et al. (2015) POINT (-16.58000 81.16000)
4 Inuppaat Quuat 67.611136 -48.709000 Active Howat et al. (2015); Palmer et al. (2015) POINT (-48.70900 67.61114)
... ... ... ... ... ... ...
59 Sermeq Kujalleq (Jakobshavn) 69.109843 -42.055159 Stable Bowling et al. (2019) POINT (-42.05516 69.10984)
60 Sermeq Kujalleq (Jakobshavn) 69.108047 -41.954370 Stable Bowling et al. (2019) POINT (-41.95437 69.10805)
61 Isunguata Sermia 1 67.180000 -50.188000 Active Livingstone et al. (2019) POINT (-50.18800 67.18000)
62 Isunguata Sermia 2 67.178000 -50.149000 Active Livingstone et al. (2019) POINT (-50.14900 67.17800)
63 Isunguata Sermia 3 67.180000 -50.128000 Active Livingstone et al. (2019) POINT (-50.12800 67.18000)

64 rows × 6 columns

Let’s look at Greenland’s active subglacial lake inventory by filtering on the Lake Type column:

# Let's look at Greenland's active subglacial lake inventory
gdf[gdf['Lake Type'] == 'Active']
Name / Location Lat. oN Lon. oE Lake Type References geometry
3 Flade Isblink ice cap 81.160000 -16.580000 Active Willis et al. (2015) POINT (-16.58000 81.16000)
4 Inuppaat Quuat 67.611136 -48.709000 Active Howat et al. (2015); Palmer et al. (2015) POINT (-48.70900 67.61114)
5 Sioqqap Sermia, [SS1] 63.541856 -48.450597 Active Bowling et al. (2019) POINT (-48.45060 63.54186)
6 Sioqqap Sermia, [SS2] 63.260248 -48.206633 Active Bowling et al. (2019) POINT (-48.20663 63.26025)
61 Isunguata Sermia 1 67.180000 -50.188000 Active Livingstone et al. (2019) POINT (-50.18800 67.18000)
62 Isunguata Sermia 2 67.178000 -50.149000 Active Livingstone et al. (2019) POINT (-50.14900 67.17800)
63 Isunguata Sermia 3 67.180000 -50.128000 Active Livingstone et al. (2019) POINT (-50.12800 67.18000)

Now let’s plot the lake locations to ensure our data read-in went as expected. But first, we need to make sure all projections for the data we want to show are the same.

What is a CRS/EPSG?#

  • A Coordinate Reference System (CRS) tells you how the Earth’s 3-D surface is projected onto a 2-D map. Below are examples of various projections of the contiguous USA from an excellent Data Carpentry tutorial on the CRS topic:

Coordinate Reference System comparison

  • A particular CRS can be referenced by its EPSG code (e.g., epsg:4326, as we used in the GeoPandas GeoDataFrame example above). The EPSG Geodetic Parameter Dataset is a structured registry of CRSs and coordinate transformations. It was originally compiled by the now-defunct European Petroleum Survey Group (EPSG) and remains one common way of referencing a specific CRS.

  • Most map projections make land areas look proportionally larger or smaller than they actually are. Below is an intuitive visualization of this from Earth Lab, another excellent site of tutorials on geospatial topics:

How projections distort a human head

  • Because of this distortion, we select a projection that is best suited for the geographic region we are studying to minimize distortions.

  • Previously we used epsg:4326, a geographic coordinate system that uses the World Geodetic System 1984 (WGS84) ellipsoid as its reference datum. A datum consists of an ellipsoid relative to which the latitude and longitude of points are defined, with a geoid defining the surface at zero height (visual from ESA’s navipedia):

What is a datum

  • A geographic coordinate system is more than just the ellipsoid: it adds a coordinate system and units of measurement. epsg:4326 uses the familiar longitude and latitude as its coordinates, with degrees as the units. We selected epsg:4326 because longitude and latitude are the coordinates given in the dataset.

  • There are numerous formats used to document a CRS; three common ones are proj.4, EPSG, and Well-known Text (WKT) (see the sketch after this list).

  • Useful websites to look up CRS strings are spatialreference.org and epsg.io. You can use the search on these sites to find an EPSG code.

  • Often you have data in one format or projection (CRS) and you need to transform it to a more regionally accurate CRS, or match it to another dataset’s CRS so you can plot them together. Let’s transform our geopandas data to the NSIDC Sea Ice Polar Stereographic North projection (epsg:3413) to easily visualize our lakes on a projection that minimizes distortion over our area of interest (Greenland).
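
As an aside, pyproj (imported above) can display a single CRS in each of the formats just mentioned; a minimal sketch:

# Inspect EPSG:3413 in the three common CRS formats via pyproj
crs = CRS('EPSG:3413')
print(crs.to_epsg())                  # EPSG code: 3413
print(crs.to_proj4())                 # proj.4 string
print(crs.to_wkt(pretty=True)[:200])  # start of the Well-known Text representation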

# Create a figure with two subplots: one with Plate Carree projection (EPSG:4326), one with Stereographic projection (EPSG:3413)
fig, axs = plt.subplots(1, 2, figsize=(6, 3), subplot_kw={'projection': ccrs.PlateCarree()})

# Define Greenland's extent (rough bounding box around Greenland)
greenland_extent = [-75, -10, 55, 85]  # [west, east, south, north]

# Plot Greenland in EPSG:4326
axs[0].set_extent(greenland_extent)
axs[0].add_feature(cfeature.LAND)
axs[0].add_feature(cfeature.COASTLINE)
axs[0].add_feature(cfeature.BORDERS)
axs[0].set_title("EPSG:4326 (WGS84) Projection")

# Now plot Greenland in EPSG:3413 (Polar Stereographic)
axs[1] = plt.subplot(122, projection=ccrs.Stereographic(central_latitude=90, central_longitude=-45))
axs[1].set_extent(greenland_extent, crs=ccrs.PlateCarree())
axs[1].add_feature(cfeature.LAND)
axs[1].add_feature(cfeature.COASTLINE)
axs[1].add_feature(cfeature.BORDERS)
axs[1].set_title("EPSG:3413 (Polar Stereographic Projection)")

plt.suptitle('Greenland Boundaries in Two Projections')
plt.show()

Using the Shapely Polygon geometry to create a search radius and subset data#

  • The Shapely points used before are useful to tell us the centroid of the lakes.

  • However, we’d like to be able to create a search radius around that point to search for and subset data.

  • Some datasets are massive, so using Shapely polygons as a search radius around your region of interest will speed up your computation.
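
One detail worth flagging before the next cell: GeoPandas interprets buffer distances in the units of the active CRS, which is why we convert to epsg:3413 (metres) before buffering by 10000 m. A quick check, using the gdf created above:

# buffer() distances use CRS units: degrees in EPSG:4326 vs metres in EPSG:3413
print(gdf.crs.axis_info[0].unit_name)                      # degree
print(gdf.to_crs('EPSG:3413').crs.axis_info[0].unit_name)  # metre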

# Create a GeoSeries from our GeoDataFrame, converted to the Arctic polar stereographic projection,
# with a 10-km-radius buffered polygon around each lake point
gs = gdf.to_crs('EPSG:3413').buffer(10000)

# Create a copy of the original GeoPandas GeoDataFrame so that we don't alter the original
gdf_polys = gdf.copy(deep=True)

# Create new GeoDataFrame to store the polygons
gdf_polys = gpd.GeoDataFrame(gdf_polys, 
                             # We use the GeoDataFrame (gdf) copy, gdf_polys, and replace its geometry with the GeoSeries of polygons
                             geometry=gs, 
                             # We must specify the crs, so we use the one used to create the GeoSeries of polygons
                             crs='epsg:3413'
                             # But then we switch the crs back to good ol' epsg:4326 lon, lat
                             ).to_crs('EPSG:4326')

# Look at active lakes to ensure it worked as expected
gdf_polys[gdf_polys['Lake Type'] == 'Active']
Name / Location Lat. oN Lon. oE Lake Type References geometry
3 Flade Isblink ice cap 81.160000 -16.580000 Active Willis et al. (2015) POLYGON ((-16.06723 81.11390, -16.09787 81.106...
4 Inuppaat Quuat 67.611136 -48.709000 Active Howat et al. (2015); Palmer et al. (2015) POLYGON ((-48.47640 67.61446, -48.47613 67.605...
5 Sioqqap Sermia, [SS1] 63.541856 -48.450597 Active Bowling et al. (2019) POLYGON ((-48.25472 63.54482, -48.25457 63.536...
6 Sioqqap Sermia, [SS2] 63.260248 -48.206633 Active Bowling et al. (2019) POLYGON ((-48.01287 63.26285, -48.01281 63.254...
61 Isunguata Sermia 1 67.180000 -50.188000 Active Livingstone et al. (2019) POLYGON ((-49.96016 67.18562, -49.95932 67.176...
62 Isunguata Sermia 2 67.178000 -50.149000 Active Livingstone et al. (2019) POLYGON ((-49.92117 67.18356, -49.92035 67.174...
63 Isunguata Sermia 3 67.180000 -50.128000 Active Livingstone et al. (2019) POLYGON ((-49.90015 67.18552, -49.89933 67.176...

Now that we know our active lake locations, let’s grab data to study the filling and draining of the lakes. We will use ICESat-2 surface elevations to investigate.

What is ICESat-2?#

ICESat-2 (Ice, Cloud, and land Elevation Satellite 2), part of NASA’s Earth Observing System, is a satellite mission for measuring ice sheet elevation and sea ice thickness, as well as land topography, vegetation characteristics, and clouds. It does so using an altimeter, an instrument that measures the altitude of an object above a fixed level (the datum we talked about earlier). This is typically achieved by measuring the time it takes a lidar or radar pulse, released by the satellite-based altimeter, to travel to a surface, reflect, and return to be recorded by an onboard instrument. ICESat-2 uses three pairs of laser beams and counts the reflected photons with its detector.
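
To make the time-to-range idea concrete, here is a minimal, purely illustrative calculation (the travel time below is a made-up value, not an ATLAS measurement):

# Illustrative altimetry calculation: range from a photon's two-way travel time
c = 299_792_458       # speed of light [m/s]
t_two_way = 3.336e-3  # hypothetical round-trip travel time [s] from ~500 km altitude
range_m = c * t_two_way / 2
print(f'range to surface: {range_m / 1e3:.1f} km')  # ~500 km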

ICESat-2 laser configuration (from Smith and others, 2019):

ICESat-2 laser configuration

What is ATL14/15?#

ATL15 is one of the various ICESat-2 data products. ATL15 provides various resolutions (1 km, 10 km, 20 km, and 40 km) of gridded raster data of height change at 3-month intervals, allowing for visualization of height-change patterns and calculation of integrated active subglacial lake volume change (Smith and others, 2022).

ATL14 is an accompanying high-resolution (100 m) digital elevation model (DEM) that provides spatially continuous gridded data of ice sheet surface height.

Learn more about the ICESat-2 ATL14/15 Gridded Antarctic and Arctic Land Ice Height Change data products here.

Streaming cloud-hosted data from NASA Earth Data Cloud#

We will be working with cloud-hosted data files. This guide explains how to find and access Earthdata cloud-hosted data. Here is a complete list of Earthdata cloud-hosted data products currently available from NSIDC.

Using icepyx to simplify searching for ICESat-2 data#

icepyx is a community and Python software library that simplifies the process of searching (querying), accessing (via download or in the cloud), and working with (including subsetting, visualizing, and reading in) ICESat-2 data products. A series of examples introduce users to its functionality, and the icepyx community always welcomes new members, feature requests, bug reports, etc.

To search for data spatially, icepyx accepts shapefiles, kml files, and geopackage files, as well as bounding boxes and polygons, as input bounding regions. We have two options for supplying this information: (1) save one of the lakes’ geometries as a geopackage or (2) extract the exterior coordinates and supply them directly. You may run either of the next two cells; then we’ll use the spatial information to query ICESat-2 data for that region.

# (1) save one of the lakes' geometries as a geopackage
# Export to geopackage to subset ICESat-2 data
gdf_polys[gdf_polys['Name / Location'] == 'Flade Isblink ice cap'].to_file(os.getcwd() + '/Flade_Isblink_poly.gpkg')
spatial_extent = 'Flade_Isblink_poly.gpkg'
# (2) use the shapely package to supply the polygon coordinates directly as a list
from shapely.geometry import mapping
spatial_extent = list(mapping(gdf_polys[gdf_polys['Name / Location'] == 'Flade Isblink ice cap'].geometry.iloc[0])["coordinates"][0])
# Specifying the necessary icepyx parameters
short_name = 'ATL15'  # The data product we would like to query
date_range = ['2018-09-15','2023-03-02']
# Setup the Query object
region = ipx.Query(short_name, spatial_extent, date_range)

We can visualize our spatial extent on an interactive map with background imagery.

# Visualize area of interest
region.visualize_spatial_extent()
# Let's find out some information about the available data granules (files)
region.avail_granules()
{'Number of available granules': 4,
 'Average size of granules (MB)': 226.04341173171997,
 'Total size of all granules (MB)': 904.1736469268799}
# Let's see the granule IDs and cloud access urls
gran_ids = region.avail_granules(ids=True, cloud=True)
gran_ids
[['ATL15_GL_0321_40km_004_01.nc',
  'ATL15_GL_0321_20km_004_01.nc',
  'ATL15_GL_0321_10km_004_01.nc',
  'ATL15_GL_0321_01km_004_01.nc'],
 ['s3://nsidc-cumulus-prod-protected/ATLAS/ATL15/004/ATL15_GL_0321_40km_004_01.nc',
  's3://nsidc-cumulus-prod-protected/ATLAS/ATL15/004/ATL15_GL_0321_20km_004_01.nc',
  's3://nsidc-cumulus-prod-protected/ATLAS/ATL15/004/ATL15_GL_0321_10km_004_01.nc',
  's3://nsidc-cumulus-prod-protected/ATLAS/ATL15/004/ATL15_GL_0321_01km_004_01.nc']]
# Let's grab the s3 URL of the highest resolution available product
s3url = gran_ids[1][3]
s3url
's3://nsidc-cumulus-prod-protected/ATLAS/ATL15/004/ATL15_GL_0321_01km_004_01.nc'

You can manually find s3 URLs for cloud-hosted data from NASA Earthdata.

Learn more about finding cloud-hosted data in the NASA Earthdata Cloud here.

The next step (accessing data in the cloud) requires a NASA Earthdata user account. You can register for a free account here. We provide two options for reading in your data: (1) by setting up an s3 file system and using Xarray directly; or (2) by using icepyx (which uses Xarray under the hood). Currently, the read time is similar with both methods. The h5coro library will soon be available to help speed up this process.

The file-system method requires that you complete a login step. icepyx will automatically ask for your credentials when you perform a task that needs them. If you do not have them stored as environment variables or in a .netrc file, you will be prompted to enter them.
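
If you would rather log in non-interactively, earthaccess can also read credentials from environment variables; a sketch with placeholder values:

# Optional: store Earthdata credentials in environment variables (placeholders shown)
# so that earthaccess.login() succeeds without prompting
os.environ['EARTHDATA_USERNAME'] = 'your_username'  # placeholder
os.environ['EARTHDATA_PASSWORD'] = 'your_password'  # placeholder
# auth = earthaccess.login(strategy='environment')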

# (1) authenticate
auth = earthaccess.login()
Enter your Earthdata Login username:  icepyx_devteam
Enter your Earthdata password:  ········
# (1) set up our s3 file system using our credentials
s3 = earthaccess.get_s3fs_session(daac='NSIDC')
# (1) Open s3url data file and store in Xarray Dataset
# This cell takes 10s of secs to load
with s3.open(s3url,'rb') as f:
    ATL15_dh = xr.open_dataset(f, group='delta_h').load()

# View Xarray Dataset
ATL15_dh
# (2) create a Read object; you'll be asked to authenticate at this step if you haven't already
reader = ipx.Read(s3url)
# (2) see what variables are available
reader.vars.avail()
['delta_h/Polar_Stereographic',
 'delta_h/time',
 'delta_h/x',
 'delta_h/y',
 'dhdt_lag1/dhdt/Bands',
 'dhdt_lag1/dhdt_sigma/Bands',
 'dhdt_lag1/ice_area/Bands',
 'dhdt_lag1/Polar_Stereographic',
 'dhdt_lag1/time',
 'dhdt_lag1/x',
 'dhdt_lag1/y',
 'dhdt_lag12/dhdt/Bands',
 'dhdt_lag12/dhdt_sigma/Bands',
 'dhdt_lag12/ice_area/Bands',
 'dhdt_lag12/Polar_Stereographic',
 'dhdt_lag12/time',
 'dhdt_lag12/x',
 'dhdt_lag12/y',
 'dhdt_lag16/dhdt/Bands',
 'dhdt_lag16/dhdt_sigma/Bands',
 'dhdt_lag16/ice_area/Bands',
 'dhdt_lag16/Polar_Stereographic',
 'dhdt_lag16/time',
 'dhdt_lag16/x',
 'dhdt_lag16/y',
 'dhdt_lag20/dhdt/Bands',
 'dhdt_lag20/dhdt_sigma/Bands',
 'dhdt_lag20/ice_area/Bands',
 'dhdt_lag20/Polar_Stereographic',
 'dhdt_lag20/time',
 'dhdt_lag20/x',
 'dhdt_lag20/y',
 'dhdt_lag4/dhdt/Bands',
 'dhdt_lag4/dhdt_sigma/Bands',
 'dhdt_lag4/ice_area/Bands',
 'dhdt_lag4/Polar_Stereographic',
 'dhdt_lag4/time',
 'dhdt_lag4/x',
 'dhdt_lag4/y',
 'dhdt_lag8/dhdt/Bands',
 'dhdt_lag8/dhdt_sigma/Bands',
 'dhdt_lag8/ice_area/Bands',
 'dhdt_lag8/Polar_Stereographic',
 'dhdt_lag8/time',
 'dhdt_lag8/x',
 'dhdt_lag8/y',
 'orbit_info/bounding_polygon_dim1',
 'orbit_info/bounding_polygon_lat1',
 'orbit_info/bounding_polygon_lon1',
 'quality_assessment/phony_dim_1',
 'quality_assessment/qa_granule_fail_reason',
 'quality_assessment/qa_granule_pass_fail',
 'tile_stats/N_bias',
 'tile_stats/N_data',
 'tile_stats/Polar_Stereographic',
 'tile_stats/RMS_bias',
 'tile_stats/RMS_d2z0dx2',
 'tile_stats/RMS_d2zdt2',
 'tile_stats/RMS_d2zdx2dt',
 'tile_stats/RMS_data',
 'tile_stats/sigma_tt',
 'tile_stats/sigma_xx0',
 'tile_stats/sigma_xxt',
 'tile_stats/x',
 'tile_stats/y']
# (2) Indicate which variables you'd like to read in.
# More information on managing ICESat-2 variables is available in the icepyx documentation and examples.
reader.vars.append(keyword_list=["delta_h"])
# view the variables that will be loaded into memory in your DataSet
reader.vars.wanted
{'Polar_Stereographic': ['delta_h/Polar_Stereographic'],
 'time': ['delta_h/time'],
 'x': ['delta_h/x'],
 'y': ['delta_h/y']}
# (2) load your data into memory
# if you are asked if you want to proceed, enter 'y' and press return/enter
# Depending on your hub settings, the warning letting you know this operation will take a moment may or may not show up
ATL15_dh = reader.load()
ATL15_dh
<xarray.Dataset>
Dimensions:              (x: 1541, y: 2741, time: 21)
Coordinates:
  * x                    (x) float64 -6.7e+05 -6.69e+05 ... 8.69e+05 8.7e+05
  * y                    (y) float64 -3.35e+06 -3.349e+06 ... -6.11e+05 -6.1e+05
  * time                 (time) datetime64[ns] 2019-01-01T06:00:00 ... 2024-0...
Data variables:
    Polar_Stereographic  int8 ...
    ice_area             (time, y, x) float32 ...
    delta_h              (time, y, x) float32 ...
    delta_h_sigma        (time, y, x) float32 ...
    data_count           (time, y, x) float32 ...
    misfit_rms           (time, y, x) float32 ...
    misfit_scaled_rms    (time, y, x) float32 ...
Attributes:
    description:  delta_h group includes variables describing height differen...

We can acquaint ourselves with this dataset in a few ways:

  • The data product’s overview page (Smith and others, 2022) to get the very basics such as geographic coverage, CRS, and what the data product tells us (quarterly height changes).

  • The Xarray Dataset read-in metadata: clicking on the written document icon of each data variable will expand metadata including the variable’s dimensions, datatype, etc. (You can also inspect this metadata programmatically, as sketched after this list.)

  • The data product’s data dictionary (Smith and others, 2021) to do a deep dive on what individual variables tell us.

  • The data product’s Algorithm Theoretical Basis Document.
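
If you’d rather inspect this metadata programmatically than click through the rendered Dataset, the same information is exposed as Xarray attributes; for example:

# Inspect variable metadata without the interactive HTML view
print(ATL15_dh['delta_h'].dims)   # dimensions, e.g. ('time', 'y', 'x')
print(ATL15_dh['delta_h'].attrs)  # units, long_name, description, ...
print(ATL15_dh.attrs['description'])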

We’ll be plotting the delta_h data variable in this tutorial; here’s what we can learn about it from these sources:

  • ATL14/15’s overview page: this is likely the ‘quarterly height changes’ described, but let’s dive deeper to be sure

  • ATL14/15’s Xarray Dataset embedded metadata tells us a couple of things: delta_h = height change at 1 km resolution (the resolution selected earlier), relative to the datum (Jan 1, 2020) surface

  • ATL14/15’s data dictionary: delta_h = quarterly height change at 40 km

OK, since the data are relative to a datum, we have two options:

  1. Difference individual time slices to subtract out the datum, like so:

    \((\mathrm{time}_0 - \mathrm{datum}) - (\mathrm{time}_1 - \mathrm{datum}) = \mathrm{time}_0 - \mathrm{datum} - \mathrm{time}_1 + \mathrm{datum} = \mathrm{time}_0 - \mathrm{time}_1\)

  2. Subtract out the datum directly. The datum is the high-resolution DEM surface provided by the accompanying ATL14 dataset.

In this tutorial we’ll use the first method, and we’ll use some exploratory data analysis to illustrate it.

# Let's make a simple plot of the first minus the zeroth time slices
fig, ax = plt.subplots(figsize=(4,5), tight_layout=True)
dhdt = ATL15_dh['delta_h'][1,:,:] - ATL15_dh['delta_h'][0,:,:]
cb = ax.imshow(dhdt, origin='lower', norm=colors.CenteredNorm(), cmap='coolwarm_r', aspect='auto')
ax.set_xlabel('x (grid index)'); ax.set_ylabel('y (grid index)')  # imshow plots array indices, not lon/lat
plt.colorbar(cb, fraction=0.02, label='height change [m]')
plt.show()

Hmmm…doesn’t look like much change over this quarter. Why? Check out the bounds of the colorbar: we’ve got some pretty extreme values (the colorbar is defaulting to ±10 m!) that appear to lie along the margin. It’s making more sense now. We can change the bounds of the colorbar to see more of the smaller-scale change in the continental interior.

# Let's calculate some basic stats to determine appropriate colorbar bounds
print(dhdt.min().values)
print(dhdt.max().values)
-5.8791504
6.1810913

We can use a TwoSlopeNorm to achieve different mapping for positive and negative values while still keeping the center at zero:

# We can make that same plot with more representative colorbar bounds
fig, ax = plt.subplots(figsize=(4,5), tight_layout=True)
dhdt = ATL15_dh['delta_h'][1,:,:] - ATL15_dh['delta_h'][0,:,:]
divnorm = colors.TwoSlopeNorm(vmin=dhdt.min(), vcenter=0, vmax=dhdt.max())
cb = ax.imshow(dhdt, origin='lower', norm=divnorm, cmap='coolwarm_r', 
               aspect='auto'
               )

ax.set_xlabel('x (grid index)'); ax.set_ylabel('y (grid index)')
plt.colorbar(cb, fraction=0.02, label='height change [m]')
plt.show()

Now the colorbar bounds are more representative, but we still have the issue that extreme values along the margin are swamping any signals we might see in the continental interior.

# Let's use the Xarray DataArray quantile method to find the 1st and 99th percentiles of the data
dhdt.quantile([0.01,0.99])
<xarray.DataArray 'delta_h' (quantile: 2)>
array([-0.28271484,  2.78289673])
Coordinates:
  * quantile  (quantile) float64 0.01 0.99

We can use these quantiles as the colorbar bounds so that we see the data variability, with the most extreme values clipped at the ends of the colorbar. We’ll adjust the caps of the colorbar (using extend) to show that there are data values beyond the bounds.

# Let's make the same plot but using the quantiles as the colorbar bounds
fig, ax = plt.subplots(figsize=(4,5), tight_layout=True)
dhdt = ATL15_dh['delta_h'][1,:,:] - ATL15_dh['delta_h'][0,:,:]
divnorm = colors.TwoSlopeNorm(vmin=dhdt.quantile(0.01), vcenter=0, vmax=dhdt.quantile(0.99))
cb = ax.imshow(dhdt, origin='lower', norm=divnorm, cmap='coolwarm_r', 
               aspect='auto'
               )
ax.set_xlabel('x (grid index)'); ax.set_ylabel('y (grid index)')
plt.colorbar(cb, fraction=0.02, extend='both', label='height change [m]', ticks=[-0.25,0,1,2])
plt.tight_layout()
plt.show()
# Let's make the same plot for all the available time slices, and let's turn it into a function so that we can reuse this code for a smaller subset of data
def plot_icesat2_atl15(xmin, xmax, ymin, ymax, dataset):
    # subset data using bounding box in epsg:3413 x,y
    mask_x = (dataset.x >= xmin) & (dataset.x <= xmax)
    mask_y = (dataset.y >= ymin) & (dataset.y <= ymax)
    ds_sub = dataset.where(mask_x & mask_y, drop=True)
    
    # Create empty lists to store data
    vmins_maxs = []

    # Find the 1st/99th percentiles of each inter-time-slice difference and store them
    for idx in range(len(ds_sub['time'].values)-1): 
        dhdt = ds_sub['delta_h'][idx+1,:,:] - ds_sub['delta_h'][idx,:,:]
        vmins_maxs += [dhdt.quantile(0.01)]
        vmins_maxs += [dhdt.quantile(0.99)]

    # Center the colormap on zero if the value range spans zero; otherwise use the midpoint
    if (min(vmins_maxs) < 0) & (max(vmins_maxs) > 0):
        vcenter = 0
    else: 
        vcenter = (min(vmins_maxs) + max(vmins_maxs)) / 2
    divnorm = colors.TwoSlopeNorm(vmin=min(vmins_maxs), vcenter=vcenter, vmax=max(vmins_maxs))

    # create fig, ax
    fig, axs = plt.subplots(7,2, sharex=True, sharey=True, figsize=(10,10))

    idx = 0
    for ax in axs.ravel():   
        ax.set_aspect('equal')
        dhdt = ds_sub['delta_h'][idx+1,:,:] - ds_sub['delta_h'][idx,:,:]
        cb = ax.imshow(dhdt, origin='lower', norm=divnorm, cmap='coolwarm_r', 
                       extent=[xmin[0], xmax[0], ymin[0], ymax[0]])
        # Change polar stereographic m to km
        km_scale = 1e3
        ticks_x = ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/km_scale))
        ax.xaxis.set_major_formatter(ticks_x)
        ticks_y = ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/km_scale))
        ax.yaxis.set_major_formatter(ticks_y)
        # Create common axes labels
        fig.supxlabel('easting (km)'); fig.supylabel('northing (km)')
        # Increment the idx
        idx = idx + 1
     
    fig.colorbar(cb, extend='both', ax=axs.ravel().tolist(), label='height change [m]')
    plt.show()

Let’s zoom into an individual active lake to see more detail. First let’s remind ourselves of the Greenland active subglacial lakes by filtering on lake type:

gdf_polys_active = gdf_polys[gdf_polys['Lake Type'] == 'Active']
gdf_polys_active
Name / Location Lat. oN Lon. oE Lake Type References geometry
3 Flade Isblink ice cap 81.160000 -16.580000 Active Willis et al. (2015) POLYGON ((-16.06723 81.11390, -16.09787 81.106...
4 Inuppaat Quuat 67.611136 -48.709000 Active Howat et al. (2015); Palmer et al. (2015) POLYGON ((-48.47640 67.61446, -48.47613 67.605...
5 Sioqqap Sermia, [SS1] 63.541856 -48.450597 Active Bowling et al. (2019) POLYGON ((-48.25472 63.54482, -48.25457 63.536...
6 Sioqqap Sermia, [SS2] 63.260248 -48.206633 Active Bowling et al. (2019) POLYGON ((-48.01287 63.26285, -48.01281 63.254...
61 Isunguata Sermia 1 67.180000 -50.188000 Active Livingstone et al. (2019) POLYGON ((-49.96016 67.18562, -49.95932 67.176...
62 Isunguata Sermia 2 67.178000 -50.149000 Active Livingstone et al. (2019) POLYGON ((-49.92117 67.18356, -49.92035 67.174...
63 Isunguata Sermia 3 67.180000 -50.128000 Active Livingstone et al. (2019) POLYGON ((-49.90015 67.18552, -49.89933 67.176...
# We can call the bounds of the geometry of the Shapely Polygon we created earlier
gdf_sub = gdf_polys_active[gdf_polys_active['Name / Location'] == 'Inuppaat Quuat']

Now using these geometry bounds we can plot the area immediately around the active subglacial lake.

# If the figure doesn't pop up, it is because the holoviews plot was run immediately after and interfered with 
# matplotlib. Just rerun this cell and it will be fixed
%matplotlib widget

# Assigning the min, max lon, lat coords of the lower-left and upper-right corners of our bounding box 
# around the lake's buffer polygon
lon_min = gdf_sub.geometry.bounds.minx
lon_max = gdf_sub.geometry.bounds.maxx
lat_min = gdf_sub.geometry.bounds.miny
lat_max = gdf_sub.geometry.bounds.maxy

# Now we re-project the lon, lat into x, y coords in the 3413 CRS, which ATL15 uses
xmin, ymin = ll2ps(lon_min, lat_min)
xmax, ymax = ll2ps(lon_max, lat_max)

# Use our plotting function to plot the data based on min, max x, y bounds
plot_icesat2_atl15(xmin, xmax, ymin, ymax, ATL15_dh)
# Let's explore plotting the data interactively using Xarray and Holoviews
hvplot.extension('matplotlib')
divnorm = colors.TwoSlopeNorm(vmin=-0.25, vcenter=0, vmax=1.25)
ATL15_dh['delta_h'].hvplot(groupby='time', cmap='coolwarm_r', norm=divnorm, invert=True, 
                           width=(ATL15_dh['x'].max()-ATL15_dh['x'].min())/3e3, height=(ATL15_dh['y'].max()-ATL15_dh['y'].min())/3e3, 
                           widget_type='scrubber', widget_location='bottom')
# clean up environment by deleting intermediary files
os.remove('Flade_Isblink_poly.gpkg')

Summary#

Congratulations! You’ve completed the tutorial. In this tutorial you have gained the skills to:

  • Transform between Coordinate Reference Systems.

  • Open data into Pandas DataFrames, GeoPandas GeoDataFrames, and Xarray Datasets.

  • Use Shapely geometries to define an area of interest and subset data.

  • Use icepyx to streamline data access.

References#

Livingstone, S.J., Li, Y., Rutishauser, A. et al. Subglacial lakes and their changing role in a warming climate. Nat Rev Earth Environ 3, 106–124 (2022). doi:10.1038/s43017-021-00246-9

Smith, B., Fricker, H. A., Holschuh, N., Gardner, A. S., Adusumilli, S., Brunt, K. M., et al. (2019). Land ice height-retrieval algorithm for NASA’s ICESat-2 photon-counting laser altimeter. Remote Sensing of Environment, 233, 111352. doi:10.1016/j.rse.2019.111352

Smith, B., T. Sutterley, S. Dickinson, B. P. Jelley, D. Felikson, T. A. Neumann, H. A. Fricker, A. Gardner, L. Padman, T. Markus, N. Kurtz, S. Bhardwaj, D. Hancock, and J. Lee. (2022). ATLAS/ICESat-2 L3B Gridded Antarctic and Arctic Land Ice Height Change, Version 2 [Data Set]. Boulder, Colorado USA. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://doi.org/10.5067/ATLAS/ATL15.002. Date Accessed 2023-03-16.

Smith, B., T. Sutterley, S. Dickinson, B. P. Jelley, D. Felikson, T. A. Neumann, H. A. Fricker, A. Gardner, L. Padman, T. Markus, N. Kurtz, S. Bhardwaj, D. Hancock, and J. Lee. “ATL15 Data Dictionary (V01).” National Snow and Ice Data Center (NSIDC), 2021-11-29. https://nsidc.org/data/documentation/atl15-data-dictionary-v01. Date Accessed 2023-03-16.