clouddrift.adapters.gdp

This module provides functions and metadata to convert the Global Drifter Program (GDP) data to a clouddrift.RaggedArray instance. The functions defined in this module are common to both hourly (clouddrift.adapters.gdp1h) and six-hourly (clouddrift.adapters.gdp6h) GDP modules.

Functions

cast_float64_variables_to_float32(ds[, ...])

Cast all float64 variables except variables_to_skip to float32.

cut_str(value, max_length)

Cut a string to a specific length and return it as a numpy chararray.

decode_date(t)

The date format is specified as 'seconds since 1970-01-01 00:00:00' but the missing values are stored as -1e+34 which is not supported by the default parsing mechanism in xarray.

drogue_presence(lost_time, time)

Create drogue status from the drogue lost time and the trajectory time.

fetch_netcdf(url, file)

Download and save the file from the given url, if not already downloaded.

fill_values(var[, default])

Change fill values (-1e+34, inf, -inf) in var array to the value specified by default.

get_gdp_metadata()

Download and parse GDP metadata and return it as a Pandas DataFrame.

order_by_date(df, idx)

From the previously sorted DataFrame of directory files, return the unique set of drifter IDs sorted by their start date (the date of the first quality-controlled data point).

parse_directory_file(filename)

Read a GDP directory file that contains metadata of drifter releases.

rowsize(index, **kwargs)

str_to_float(value[, default])

Convert a string to float, while returning the value of default if the string is not convertible to a float, or if it's a NaN.

clouddrift.adapters.gdp.cast_float64_variables_to_float32(ds: Dataset, variables_to_skip: list[str] = ['time', 'lat', 'lon']) -> Dataset

Cast all float64 variables except variables_to_skip to float32. Extra precision from float64 is not needed and takes up memory and disk space.

Parameters

ds : xr.Dataset

Dataset to modify

variables_to_skip : list[str]

List of variables to skip; default is ["time", "lat", "lon"].

Returns

ds : xr.Dataset

Modified dataset

clouddrift.adapters.gdp.cut_str(value: str, max_length: int) -> chararray

Cut a string to a specific length and return it as a numpy chararray.

Parameters

value : str

String to cut

max_length : int

Length of the output

Returns

out : np.chararray

String with max_length characters

clouddrift.adapters.gdp.decode_date(t)

The date format is specified as 'seconds since 1970-01-01 00:00:00' but the missing values are stored as -1e+34, which is not supported by the default parsing mechanism in xarray.

This function replaces the missing values with NaN and returns a datetime instance.

Parameters

t : array

Array of time values

Returns

out : datetime

Datetime instance with the missing values replaced by NaN
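The missing-value handling described above can be sketched in pure Python. The name decode_date_sketch is hypothetical, and using None as the missing marker (rather than NaN inside an array) is an assumption made so the example needs only the standard library; it is not the clouddrift implementation.

```python
import math
from datetime import datetime, timedelta, timezone

MISSING = -1e34  # sentinel for missing times, as stated in the description
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_date_sketch(seconds):
    """Convert seconds since 1970-01-01 to datetimes, mapping missing values to None."""
    decoded = []
    for s in seconds:
        if math.isnan(s) or math.isclose(s, MISSING):
            # Assumption: None stands in for the NaN/NaT marker of the real function.
            decoded.append(None)
        else:
            decoded.append(EPOCH + timedelta(seconds=s))
    return decoded


dates = decode_date_sketch([0.0, 86400.0, -1e34])
```

Here the first two entries decode to 1970-01-01 and 1970-01-02 (UTC), and the sentinel becomes the missing marker instead of an out-of-range date.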

clouddrift.adapters.gdp.drogue_presence(lost_time, time) -> ndarray

Create drogue status from the drogue lost time and the trajectory time.

Parameters

lost_time

Timestamp of the drogue loss (or NaT)

time

Observation time

Returns

out : bool

True if drogued, False otherwise
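A minimal sketch of how such a status could be derived, using only the standard library. The name drogue_presence_sketch is hypothetical; treating None as the equivalent of NaT and using a strict "earlier than" comparison are assumptions, not confirmed details of the clouddrift implementation.

```python
from datetime import datetime
from typing import List, Optional, Sequence


def drogue_presence_sketch(
    lost_time: Optional[datetime], time: Sequence[datetime]
) -> List[bool]:
    """Return one flag per observation: True while the drogue is assumed attached."""
    if lost_time is None:
        # Assumption: no recorded loss means the drogue stayed on for every observation.
        return [True] * len(time)
    # Observations strictly before the loss time count as drogued.
    return [t < lost_time for t in time]


times = [datetime(2020, 1, day) for day in (1, 2, 3, 4)]
status = drogue_presence_sketch(datetime(2020, 1, 3), times)
always_on = drogue_presence_sketch(None, times)
```

With a loss on 2020-01-03, the first two observations are flagged as drogued and the last two are not.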

clouddrift.adapters.gdp.fetch_netcdf(url: str, file: str)

Download and save the file from the given url, if not already downloaded.

Parameters

url : str

URL from which to download the file.

file : str

Name of the file to save.
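The "download only if not already present" pattern can be sketched as follows. The helper name fetch_if_missing and its boolean return value are hypothetical, and the sketch uses urllib from the standard library, which may differ from what clouddrift uses internally. A local file:// URL is used so the example runs without network access.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def fetch_if_missing(url: str, file: str) -> bool:
    """Download url to file unless file exists; return True if a download happened."""
    if os.path.exists(file):
        return False
    urllib.request.urlretrieve(url, file)
    return True


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.nc"
    source.write_bytes(b"placeholder bytes, not a real NetCDF file")
    target = str(Path(tmp) / "copy.nc")

    first = fetch_if_missing(source.as_uri(), target)   # file absent: downloaded
    second = fetch_if_missing(source.as_uri(), target)  # file present: skipped
    copied = Path(target).read_bytes()
```

Checking for the target file first makes repeated calls cheap, which matters when the same dataset is requested many times.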

clouddrift.adapters.gdp.fill_values(var, default=nan)

Change fill values (-1e+34, inf, -inf) in var array to the value specified by default.

Parameters

var : array

Array to fill

default : float

Default value to use for fill values
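The replacement rule can be illustrated with a small pure-Python sketch. The name fill_values_sketch is hypothetical, and it works on a plain list and returns a new list; whether the real function modifies its input array in place is not stated here, so that is left as an open assumption.

```python
import math

FILL = -1e34  # fill value named in the description


def fill_values_sketch(values, default=math.nan):
    """Return a copy of values with -1e34, inf and -inf replaced by default."""
    cleaned = []
    for v in values:
        if math.isinf(v) or math.isclose(v, FILL):
            cleaned.append(default)
        else:
            cleaned.append(v)
    return cleaned


cleaned = fill_values_sketch([1.5, -1e34, math.inf, -math.inf, 0.0], default=-999.0)
```

Ordinary values, including zero, pass through unchanged; only the three fill markers are replaced.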

clouddrift.adapters.gdp.get_gdp_metadata() -> DataFrame

Download and parse GDP metadata and return it as a Pandas DataFrame.

Returns

df : pd.DataFrame

Sorted list of drifters as a pandas DataFrame.

clouddrift.adapters.gdp.order_by_date(df: DataFrame, idx: list[int]) -> list[int]

From the previously sorted DataFrame of directory files, return the unique set of drifter IDs sorted by their start date (the date of the first quality-controlled data point).

Parameters

df : pd.DataFrame

Previously sorted DataFrame of directory files

idx : list

List of drifters to include in the ragged array

Returns

idx : list

Unique set of drifter IDs sorted by their start date.
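The ordering can be sketched without pandas. In this illustration the directory-file table is replaced by a list of (drifter_id, start_date) tuples; that layout and the name order_by_date_sketch are hypothetical and chosen only to keep the example self-contained.

```python
from datetime import date


def order_by_date_sketch(records, idx):
    """Return the IDs in idx, de-duplicated and ordered by start date.

    records: iterable of (drifter_id, start_date) tuples (hypothetical layout).
    """
    wanted = set(idx)
    ordered = []
    for drifter_id, _start in sorted(records, key=lambda rec: rec[1]):
        if drifter_id in wanted and drifter_id not in ordered:
            ordered.append(drifter_id)
    return ordered


records = [
    (101, date(2005, 3, 1)),
    (102, date(2001, 7, 15)),
    (103, date(2010, 1, 9)),
    (102, date(2001, 7, 15)),  # duplicate entry for the same drifter
]
all_ids = order_by_date_sketch(records, [103, 101, 102])
subset = order_by_date_sketch(records, [103, 102])
```

The order of idx itself does not matter: the result follows the start dates, and drifters not listed in idx are dropped.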

clouddrift.adapters.gdp.parse_directory_file(filename: str) -> DataFrame

Read a GDP directory file that contains metadata of drifter releases.

Parameters

filename : str

Name of the directory file to parse.

Returns

df : pd.DataFrame

List of drifters from a single directory file as a pandas DataFrame.

clouddrift.adapters.gdp.str_to_float(value: str, default: float = nan) -> float

Convert a string to float, while returning the value of default if the string is not convertible to a float, or if it's a NaN.

Parameters

value : str

String to convert to float

default : float

Default value to return if the string is not convertible to float

Returns

out : float

Float value of the string, or default if the string is not convertible to float.
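The fallback behavior can be sketched in a few lines of standard-library Python. The name str_to_float_sketch is hypothetical, and this is an illustration of the documented behavior rather than the library's own code.

```python
import math


def str_to_float_sketch(value: str, default: float = math.nan) -> float:
    """Parse value as a float; fall back to default on failure or on NaN."""
    try:
        result = float(value)
    except ValueError:
        # Not a numeric string at all.
        return default
    # A string such as "nan" parses successfully but is still treated as missing.
    return default if math.isnan(result) else result


parsed = str_to_float_sketch("3.5")
bad_text = str_to_float_sketch("n/a", default=-1.0)
nan_text = str_to_float_sketch("nan", default=0.0)
```

Treating a parsed NaN the same as an unparseable string gives callers a single sentinel to check for.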