SWOT High Resolution data search and access
Overview¶
What you’ll do (quick preview)
- Sign in with Earthdata Login
- Search for SWOT_L2_HR_Raster_100m_D
- Open results with earthaccess.open(...) and inspect a few variables
- Visualize multiple SWOT rasters across time
Introduction to SWOT¶
In this tutorial, we will open and visualize data from NASA/CNES’s Surface Water and Ocean Topography (SWOT) mission in the cloud. In the next tutorial, we’ll compare SWOT elevations to NASA’s Ice, Cloud, and Land Elevation Satellite-2 (ICESat-2) over the Bach Ice Shelf (Antarctic Peninsula).

[Figure: SWOT. Credit: PO.DAAC Cookbook]
We use the SWOT High Resolution (HR) Level-2 Water Mask Raster Image Data Product, Version D (SWOT_L2_HR_Raster_D), which includes improved processing for ice-shelf environments. See PO.DAAC’s SWOT mission and data overview for product details, processing notes, and documentation.
Libraries needed to get started¶
%matplotlib widget

# For searching and accessing NASA data
import earthaccess

# For reading data, analysis and plotting
import xarray as xr
import hvplot.xarray

# For accessing the time dimension from filenames
from datetime import datetime
import re

import pprint  # For nice printing of Python objects
Earthdata Login¶
An Earthdata Login account is required to access (and in many cases stream) NASA data. If you don’t have one yet, register at https://urs.earthdata.nasa.gov. We use the earthaccess library to authenticate.
Login requires your Earthdata Login username and password. The login method will automatically search for these credentials as environment variables or in a .netrc file; if neither is available, it will prompt you to enter your username and password. We use the prompt strategy here.
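As an aside, if you prefer the environment-variable strategy, here is a minimal sketch (assuming earthaccess’s documented EARTHDATA_USERNAME and EARTHDATA_PASSWORD variables; the values below are placeholders):
import os

# Sketch only: set credentials in the environment before logging in
os.environ["EARTHDATA_USERNAME"] = "your_username"  # placeholder
os.environ["EARTHDATA_PASSWORD"] = "your_password"  # placeholder
auth = earthaccess.login(strategy="environment")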
Saving your credentials
A .netrc file is a text file located in your home directory that contains login information for remote machines. If you don’t have a .netrc file, login will create one for you if you use persist=True.
earthaccess.login(strategy='interactive', persist=True)
BUT make sure you do not commit your .netrc to your GitHub repo. This is easy to do accidentally via git add -A and would be a major security risk.
auth = earthaccess.login()
# Sanity check so you know that your credentials worked.
assert auth.authenticated, "Earthdata Login failed; please retry."
Search for SWOT cloud-native collections¶
earthaccess leverages the Common Metadata Repository (CMR) API to search for collections and granules. Earthdata Search also uses the CMR API.
We can use the search_datasets method to search for SWOT collections by setting keyword="SWOT".
Advanced search options
The argument passed to keyword can be any string and can include the wildcard characters ? or *. To see a full list of search parameters, type earthaccess.search_datasets?. Using ? after a Python object displays the docstring for that object.
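For instance, a hedged sketch of a wildcard search (the pattern below is illustrative, not a specific product name):
# Illustrative wildcard pattern: match collections whose keywords
# contain "SWOT" followed by "Raster"
raster_query = earthaccess.search_datasets(keyword="SWOT*Raster*", cloud_hosted=True)
print(f"{len(raster_query)} datasets found.")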
A count of the number of data collections (datasets) found is given.
query = earthaccess.search_datasets(
    keyword="SWOT",
    cloud_hosted=True,
    version='D'
)
print(f'{len(query)} datasets found.')
35 datasets found.
We can get a summary of each dataset, which includes links for where to find lengthier descriptions of the data. We look at the first five in the query here.
for collection in query[:5]:
    pprint.pprint(collection.summary(), sort_dicts=True, indent=4)
    print('')  # Add a space between collections for readability
{ 'cloud-info': { 'Region': 'us-west-2',
'S3BucketAndObjectPrefixNames': [ 'podaac-swot-ops-cumulus-protected/SWOT_L2_LR_SSH_D/',
'podaac-swot-ops-cumulus-public/SWOT_L2_LR_SSH_D/'],
'S3CredentialsAPIDocumentationURL': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentialsREADME',
'S3CredentialsAPIEndpoint': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentials'},
'concept-id': 'C3233945000-POCLOUD',
'file-type': "[{'FormatType': 'Native', 'Format': 'netCDF-4'}]",
'get-data': [ 'https://cmr.earthdata.nasa.gov/virtual-directory/collections/C3233945000-POCLOUD',
'https://search.earthdata.nasa.gov/search/granules?p=C3233945000-POCLOUD'],
'short-name': 'SWOT_L2_LR_SSH_D',
'version': 'D'}
{ 'cloud-info': { 'Region': 'us-west-2',
'S3BucketAndObjectPrefixNames': [ 'podaac-swot-ops-cumulus-protected/SWOT_L2_HR_RiverSP_D/',
'podaac-swot-ops-cumulus-public/SWOT_L2_HR_RiverSP_D/'],
'S3CredentialsAPIDocumentationURL': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentialsREADME',
'S3CredentialsAPIEndpoint': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentials'},
'concept-id': 'C3233944997-POCLOUD',
'file-type': "[{'FormatType': 'Native', 'Format': 'Shapefile'}]",
'get-data': [ 'https://cmr.earthdata.nasa.gov/virtual-directory/collections/C3233944997-POCLOUD',
'https://search.earthdata.nasa.gov/search/granules?p=C3233944997-POCLOUD'],
'short-name': 'SWOT_L2_HR_RiverSP_D',
'version': 'D'}
{ 'cloud-info': { 'Region': 'us-west-2',
'S3BucketAndObjectPrefixNames': [ 'podaac-swot-ops-cumulus-protected/SWOT_L2_HR_LakeAvg_D/',
'podaac-swot-ops-cumulus-public/SWOT_L2_HR_LakeAvg_D/'],
'S3CredentialsAPIDocumentationURL': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentialsREADME',
'S3CredentialsAPIEndpoint': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentials'},
'concept-id': 'C3233944980-POCLOUD',
'file-type': "[{'FormatType': 'Native', 'Format': 'Shapefile'}]",
'get-data': [ 'https://cmr.earthdata.nasa.gov/virtual-directory/collections/C3233944980-POCLOUD',
'https://search.earthdata.nasa.gov/search/granules?p=C3233944980-POCLOUD'],
'short-name': 'SWOT_L2_HR_LakeAvg_D',
'version': 'D'}
{ 'cloud-info': { 'Region': 'us-west-2',
'S3BucketAndObjectPrefixNames': [ 'podaac-swot-ops-cumulus-protected/SWOT_L2_HR_LakeSP_D/',
'podaac-swot-ops-cumulus-public/SWOT_L2_HR_LakeSP_D/'],
'S3CredentialsAPIDocumentationURL': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentialsREADME',
'S3CredentialsAPIEndpoint': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentials'},
'concept-id': 'C3233944983-POCLOUD',
'file-type': "[{'FormatType': 'Native', 'Format': 'Shapefile'}]",
'get-data': [ 'https://cmr.earthdata.nasa.gov/virtual-directory/collections/C3233944983-POCLOUD',
'https://search.earthdata.nasa.gov/search/granules?p=C3233944983-POCLOUD'],
'short-name': 'SWOT_L2_HR_LakeSP_D',
'version': 'D'}
{ 'cloud-info': { 'Region': 'us-west-2',
'S3BucketAndObjectPrefixNames': [ 'podaac-swot-ops-cumulus-protected/SWOT_L2_HR_LakeSP_obs_D/',
'podaac-swot-ops-cumulus-public/SWOT_L2_HR_LakeSP_obs_D/'],
'S3CredentialsAPIDocumentationURL': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentialsREADME',
'S3CredentialsAPIEndpoint': 'https://archive.swot.podaac.earthdata.nasa.gov/s3credentials'},
'concept-id': 'C3233942286-POCLOUD',
'file-type': "[{'FormatType': 'Native', 'Format': 'Shapefile'}]",
'get-data': [ 'https://cmr.earthdata.nasa.gov/virtual-directory/collections/C2799438239-POCLOUD',
'https://search.earthdata.nasa.gov/search/granules?p=C2799438239-POCLOUD'],
'short-name': 'SWOT_L2_HR_LakeSP_obs_D',
'version': 'D'}
For each collection, summary returns a subset of fields from the collection metadata and Unified Metadata Model (UMM) entry.
- concept-id is a unique identifier for the collection, composed of an alphanumeric code and the provider ID for the DAAC.
- file-type gives information about the file format of the collection files.
- get-data is a collection of URLs that can be used to access data, dataset landing pages, and tools.
- short-name is the name of the dataset that appears on the dataset landing page. For SWOT, ShortNames are generally how different products are referred to.
- version is the version of each collection.
For cloud-hosted data, there is additional information about the location of the S3 bucket that holds the data and where to get credentials to access the S3 buckets. In general, you don’t need to worry about this information because earthaccess handles S3 credentials for you. Nevertheless, it may be useful for troubleshooting.
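If you ever do need the temporary credentials yourself (for example, to configure s3fs or boto3 directly), earthaccess can fetch them. A minimal sketch, assuming the PODAAC provider name used by earthaccess:
# Fetch temporary S3 credentials for PO.DAAC (they expire after one hour)
creds = auth.get_s3_credentials(daac="PODAAC")
# creds is a dict including accessKeyId, secretAccessKey, and sessionToken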
For these SWOT search results, each concept-id ends in POCLOUD, which means the data are located in the PO.DAAC cloud.
For SWOT, short-name refers to the following products. The D at the end denotes Version D; you can replace it with 2.0 to get Version C (see the example after the table).
| ShortName | Product Description (with linked tutorials if available) |
|---|---|
| SWOT_L2_HR_Raster_D | SWOT Level 2 Water Mask Raster Image |
| SWOT_L2_HR_Raster_100m_D | 100 m spatial resolution |
| SWOT_L2_HR_Raster_250m_D | 250 m spatial resolution |
| SWOT_L2_LR_SSH_D | SWOT Level 2 KaRIn Low Rate Sea Surface Height |
| SWOT_L2_LR_SSH_BASIC_D | Contains a limited set of variables, aimed at the general user |
| SWOT_L2_LR_SSH_EXPERT_D | Contains all related variables, intended for expert users |
| SWOT_L2_LR_SSH_UNSMOOTH_D | Includes all related variables, on the finer-resolution “native” grid, with minimal smoothing applied |
| SWOT_L2_LR_SSH_WINDWAVE_D | Wind and wave height data |
| *Some others* | |
| SWOT_L2_HR_PIXC_D | SWOT Level 2 Water Mask Pixel Cloud |
| SWOT_L1B_HR_SLC_D | SWOT Level 1B High-Rate Single-look Complex |
| SWOT_L2_HR_RiverSP_D | SWOT Level 2 River Single-Pass Vector |
| SWOT_L2_HR_LakeSP_D | SWOT Level 2 Lake Single-Pass Vector |
| SWOT_L2_NALT_GDR_2.0 | SWOT Level 2 Nadir Altimeter Geophysical Data Record with Waveforms |
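For example, the two version strings for the 100 m raster product (the Version C name is the one used in the search below):
short_name_d = "SWOT_L2_HR_Raster_100m_D"    # Version D
short_name_c = "SWOT_L2_HR_Raster_100m_2.0"  # Version C, used in the search below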
If you want to see just those short-names so you can paste them into the earthaccess data access calls below, you can use this method:
for collection in query[:5]:
    pprint.pprint(collection.summary()['short-name'], sort_dicts=True, indent=4)
'SWOT_L2_LR_SSH_D'
'SWOT_L2_HR_RiverSP_D'
'SWOT_L2_HR_LakeAvg_D'
'SWOT_L2_HR_LakeSP_D'
'SWOT_L2_HR_LakeSP_obs_D'
Search SWOT data using spatial and temporal filters¶
Once you have identified the dataset you want to work with, you can use the search_data method to search a dataset with spatial and temporal filters. Since we are using the SWOT HR Raster 100 m product for this tutorial, we’ll search for those rasters over the Bach Ice Shelf in Antarctica, between May 1 and June 30, 2025.
Either concept-id or short-name can be used to search for granules from a particular dataset. If you use short-name, you also need to set version. If you use concept-id, that is all that is required, because a concept-id is unique.
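For instance, a minimal sketch of a concept-id search, using the SWOT_L2_LR_SSH_D concept-id from the summaries above:
# Search by concept-id alone; no short_name or version is needed
ssh_results = earthaccess.search_data(
    concept_id="C3233945000-POCLOUD",
    temporal=("2025-05-01", "2025-06-30"),
)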
The temporal range is identified with standard date strings. The latitude-longitude corners of a bounding box are specified as lower left, upper right. Polygons, points, and shapefiles can also be specified.
This will display the number of granules that match our search.
# Search for SWOT rasters over the Bach Ice Shelf
latmin, latmax = -72.5, -71.5
lonmin, lonmax = -73.4, -70.5
sbox = (lonmin, latmin, lonmax, latmax)
results = earthaccess.search_data(
    short_name="SWOT_L2_HR_Raster_100m_2.0",
    temporal=("2025-05-01", "2025-06-30"),
    bounding_box=sbox
)
print(f'{len(results)} total')
4 total
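The same region could be passed as a polygon instead of a bounding box; a hedged sketch using earthaccess’s polygon parameter (a closed, counter-clockwise ring of (lon, lat) tuples):
# Same search, expressed as a closed polygon ring of (lon, lat) tuples
poly = [(lonmin, latmin), (lonmax, latmin), (lonmax, latmax),
        (lonmin, latmax), (lonmin, latmin)]
poly_results = earthaccess.search_data(
    short_name="SWOT_L2_HR_Raster_100m_2.0",
    temporal=("2025-05-01", "2025-06-30"),
    polygon=poly,
)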
We’ll get metadata for these 4 granules and display it. The rendered metadata shows a download link, granule size, and two images of the data.
[display(r) for r in results]
Open, load and display data stored on S3¶
Direct access to data in an S3 bucket is a two-step process. First, the files are opened using the open method. This step creates a Python file-like object that is then used to load the data in the second step.
Authentication is required for this step. The auth object created at the start of the notebook provides Earthdata Login authentication and AWS credentials behind the scenes. These credentials expire after one hour, so the login call must have been run within that window before these next steps.
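If more than an hour has passed, simply log in again to refresh the credentials:
# Re-run login to refresh the temporary AWS credentials if needed
auth = earthaccess.login()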
rasters = earthaccess.open(results)
After opening the files, we can read them in one at a time. In this example, data are loaded into an xarray.Dataset. The data could instead be read into numpy arrays or a pandas.DataFrame, but each granule would then have to be read with a package that understands HDF5 granules, such as h5py. xarray does all of this under the hood in a single line.
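For comparison, a minimal sketch of the h5py route (assuming the wse variable sits at the root of the raster file, as in the xarray view below):
import h5py

# Read one variable from the first opened file into a numpy array;
# h5py accepts the file-like objects returned by earthaccess.open
with h5py.File(rasters[0], "r") as f:
    wse_array = f["wse"][:]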
d1 = xr.open_dataset(rasters[0])
d1
We can open just that one file, but if we want to work with a long time series, it is more likely that we want all 4 granules in one xarray.Dataset. We can do this in one command, xarray.open_mfdataset, but in order to concatenate the datasets along a new time dimension, we use the preprocess hook built into xarray. We must first build a function that extracts the acquisition time from each file name and adds it as an extra dimension for each SWOT pass we have collected.
results[0].data_links(access='direct')
['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_HR_Raster_2.0/SWOT_L2_HR_Raster_100m_UTM18C_N_x_x_x_032_115_011F_20250502T023949_20250502T023955_PIC2_01.nc']
# Preprocess helper to add a time coordinate from the filename.
# Looks for YYYYMMDDTHHMMSS anywhere in the source path.
_TIME_RE = re.compile(r"(\d{8}T\d{6})")

def add_time_from_source(ds: xr.Dataset) -> xr.Dataset:
    src = str(ds.encoding.get("source", ""))  # xarray keeps the source path here
    m = _TIME_RE.search(src)
    if m:
        ts = datetime.strptime(m.group(1), "%Y%m%dT%H%M%S")
        # Attach as a proper dimension so open_mfdataset can concatenate
        ds = ds.expand_dims(time=[ts])
    # Fallback: leave the dataset unmodified if no timestamp can be found
    return ds
Then we can run xarray.open_mfdataset with that preprocessing function included. This only lazily loads the data, meaning we can operate on the data and metadata without the values actually being read into memory until we need them. ds is only about 1 GB right now, but if we ran ds.compute() to read in all of the variables, ds would be ~25 GB and could exhaust our memory.
# Open as a multi-file dataset concatenated by time (~30 s runtime)
ds = xr.open_mfdataset(
    rasters,
    engine="h5netcdf",  # recommended for streamed HDF5/NetCDF via fsspec
    preprocess=add_time_from_source,
    combine="nested",   # the time dimension is added during preprocess
    concat_dim="time",
    decode_cf=True,
)
ds
Notice that under dimensions, in addition to x and y, we now have time, and it shows 4 time steps.
ds.time.values
array(['2025-05-02T02:39:49.000000000', '2025-05-02T02:39:53.000000000',
'2025-05-02T02:40:14.000000000', '2025-05-02T22:25:10.000000000'],
dtype='datetime64[ns]')
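As an aside, rather than calling ds.compute() on the full dataset, a safer pattern is to subset first and load only what’s needed; a minimal sketch using the wse variable:
# Load only the water surface elevation at the first time step into memory
wse_first = ds.wse.isel(time=0).compute()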
Now we can plot all of these time steps with a really nice visualization package called hvplot. It gives a little time widget on the right that lets you step from one time step to the next. We are plotting the wse variable, but any other variable name can be swapped in. The toolbar at the upper right of the image lets you pan, zoom, wheel-zoom, save the image, reset the view, and toggle hover.
timeplot = ds.wse.hvplot.image(y='y', x='x')
timeplot.opts(width=700, height=500, colorbar=True)