Datasets

CloudDrift provides convenience functions to access real-world ragged-array datasets.

>>> from clouddrift.datasets import gdp1h
>>> ds = gdp1h()
>>> ds
    <xarray.Dataset>
    Dimensions:                (traj: 17324, obs: 165754333)
    Coordinates:
        ids                    (obs) int64 ...
        lat                    (obs) float32 ...
        lon                    (obs) float32 ...
        time                   (obs) datetime64[ns] ...
    Dimensions without coordinates: traj, obs
    Data variables: (12/55)
        BuoyTypeManufacturer   (traj) |S20 ...
        BuoyTypeSensorArray    (traj) |S20 ...
        CurrentProgram         (traj) float64 ...
        DeployingCountry       (traj) |S20 ...
        DeployingShip          (traj) |S20 ...
        DeploymentComments     (traj) |S20 ...
        ...                     ...
        sst1                   (obs) float64 ...
        sst2                   (obs) float64 ...
        typebuoy               (traj) |S10 ...
        typedeath              (traj) int8 ...
        ve                     (obs) float32 ...
        vn                     (obs) float32 ...
    Attributes: (12/16)
        Conventions:       CF-1.6
        acknowledgement:   Elipot, Shane; Sykulski, Adam; Lumpkin, Rick; Centurio...
        contributor_name:  NOAA Global Drifter Program
        contributor_role:  Data Acquisition Center
        date_created:      2022-12-09T06:02:29.684949
        doi:               10.25921/x46c-3620
        ...                ...
        processing_level:  Level 2 QC by GDP drifter DAC
        publisher_email:   aoml.dftr@noaa.gov
        publisher_name:    GDP Drifter DAC
        publisher_url:     https://www.aoml.noaa.gov/phod/gdp
        summary:           Global Drifter Program hourly data
        title:             Global Drifter Program hourly drifting buoy collection
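The `traj` and `obs` dimensions in the output above reflect the ragged-array layout: per-trajectory variables hold one value per drifter, while per-observation variables hold all trajectories concatenated end to end. The following is a minimal sketch of how such a flat layout can be split back into individual trajectories. It uses plain Python lists rather than the dataset itself; the helper name `split_ragged`, the `rowsize` list of per-trajectory observation counts, and the toy values are assumptions made for illustration, not part of the CloudDrift API.

```python
from itertools import accumulate


def split_ragged(values, rowsize):
    """Split a flat per-observation sequence into one list per trajectory.

    `rowsize` (hypothetical here) gives the number of observations
    belonging to each trajectory, in order.
    """
    if sum(rowsize) != len(values):
        raise ValueError("rowsize must sum to the number of observations")
    # Cumulative sums give the end index of each trajectory.
    ends = list(accumulate(rowsize))
    # Each trajectory starts where the previous one ended.
    starts = [0] + ends[:-1]
    return [list(values[s:e]) for s, e in zip(starts, ends)]


# Toy data: 3 trajectories with 2, 3, and 1 observations (6 in total).
rowsize = [2, 3, 1]
lat = [10.0, 10.1, -5.0, -5.2, -5.4, 42.0]

trajectories = split_ragged(lat, rowsize)
```

Storing only the flat values plus the per-trajectory counts avoids padding short trajectories to the length of the longest one, which is the main reason to prefer a ragged layout for data of this shape.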

Currently available datasets are GDP, Spotters, ANDRO, GLAD, MOSAiC, Subsurface Floats, and YoMaHa’07.

The GDP and Spotters datasets are accessed lazily, so data is downloaded only when specific array values are referenced. The ANDRO, GLAD, MOSAiC, Subsurface Floats, and YoMaHa’07 datasets are downloaded in their entirety the first time the corresponding function is called and are stored locally for later use.
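The download-once behavior described for the second group can be pictured with a small caching sketch. This is not CloudDrift's implementation: `fake_download`, `get_dataset_file`, the cache directory, and the file name are all hypothetical, and the download is simulated by writing placeholder bytes so the example runs without network access.

```python
import tempfile
from pathlib import Path

download_count = 0  # counts how often the simulated download runs


def fake_download(path: Path) -> None:
    """Hypothetical stand-in for a network download; writes placeholder bytes."""
    global download_count
    download_count += 1
    path.write_bytes(b"placeholder dataset contents")


def get_dataset_file(cache_dir: Path, name: str) -> Path:
    """Return the local path of a dataset file, downloading it only if missing."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / name
    if not path.exists():
        # First call: nothing is cached yet, so fetch the whole file.
        fake_download(path)
    # Later calls: the cached copy is reused and no download happens.
    return path


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"
    first = get_dataset_file(cache, "example.nc")
    second = get_dataset_file(cache, "example.nc")
    same_path = first == second
```

After both calls, `download_count` is 1: the second call finds the file already on disk. Lazy access, by contrast, defers the transfer itself until array values are actually read.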