Output Dictionary Key Format

You have a lot of control over how the keys of the output dictionary appear. For demonstration purposes, consider the following search.

from intake_esgf import ESGFCatalog
cat = ESGFCatalog().search(
    experiment_id="historical",
    variant_label="r1i1p1f1",
    frequency="mon",
    source_id=["CESM2", "CanESM5"],
    variable_id=["tas", "gpp"],
)
print(cat)
Summary information for 4 results:
activity_drs                                                 [CMIP]
grid_label                                                     [gn]
datetime_start    [1850-01-15T12:00:00Z, 1850-01-16T12:00:00Z, 1...
mip_era                                                     [CMIP6]
table_id                                               [Amon, Lmon]
institution_id                                        [NCAR, CCCma]
member_id                                                [r1i1p1f1]
source_id                                          [CESM2, CanESM5]
variable_id                                              [tas, gpp]
datetime_stop          [2014-12-15T12:00:00Z, 2014-12-16T12:00:00Z]
experiment_id                                          [historical]
project                                                     [CMIP6]
dtype: object

By default, we build keys out of only those facets whose values differ among the entries in the output dictionary. Since all the datasets here share the same activity, experiment, variant label, and grid label, these facets are left out of the output dictionary keys.

ds = cat.to_dataset_dict()
for key in ds.keys():
    print(key)
1850-01-16T12:00:00Z.Amon.CCCma.CanESM5.tas.2014-12-16T12:00:00Z
1850-01-16T12:00:00Z.Lmon.CCCma.CanESM5.gpp.2014-12-16T12:00:00Z
1850-01-15T11:44:59Z.Lmon.NCAR.CESM2.gpp.2014-12-15T12:00:00Z
1850-01-15T12:00:00Z.Amon.NCAR.CESM2.tas.2014-12-15T12:00:00Z
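The selection rule described above can be illustrated with a small standalone sketch. This is not the intake-esgf implementation: `minimal_keys_sketch` is a hypothetical helper, and the records are a hand-copied subset of the facets of the four results, included only to show why constant facets drop out of the keys.

```python
# Illustrative sketch only; intake-esgf's own key construction may differ.
# FACETS/ROWS are a hand-copied subset of the facets of the four results above.
FACETS = [
    "activity_drs", "datetime_start", "table_id", "institution_id",
    "source_id", "variable_id", "datetime_stop", "experiment_id",
]
ROWS = [
    ("CMIP", "1850-01-15T12:00:00Z", "Amon", "NCAR", "CESM2", "tas",
     "2014-12-15T12:00:00Z", "historical"),
    ("CMIP", "1850-01-15T11:44:59Z", "Lmon", "NCAR", "CESM2", "gpp",
     "2014-12-15T12:00:00Z", "historical"),
    ("CMIP", "1850-01-16T12:00:00Z", "Amon", "CCCma", "CanESM5", "tas",
     "2014-12-16T12:00:00Z", "historical"),
    ("CMIP", "1850-01-16T12:00:00Z", "Lmon", "CCCma", "CanESM5", "gpp",
     "2014-12-16T12:00:00Z", "historical"),
]
records = [dict(zip(FACETS, row)) for row in ROWS]


def minimal_keys_sketch(records, separator="."):
    """Join only the facets whose values differ between the records."""
    facets = list(records[0])
    # A facet is kept when more than one distinct value occurs across records.
    varying = [f for f in facets if len({r[f] for r in records}) > 1]
    return [separator.join(r[f] for f in varying) for r in records]


for key in minimal_keys_sketch(records):
    print(key)
```

In this sketch `activity_drs` and `experiment_id` take a single value and are dropped, while the table, institution, source, variable, and both timestamps vary and are kept.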

Ignoring Some Facets

On inspection, however, you will notice that the institution and table are not needed to tell the datasets apart. Because their values differ, they were included in the keys by default. You can specify that certain facets be ignored in the output dictionary keys.

ds = cat.to_dataset_dict(ignore_facets=["institution_id", "table_id"])
for key in ds.keys():
    print(key)
1850-01-15T12:00:00Z.CESM2.tas.2014-12-15T12:00:00Z
1850-01-15T11:44:59Z.CESM2.gpp.2014-12-15T12:00:00Z
1850-01-16T12:00:00Z.CanESM5.gpp.2014-12-16T12:00:00Z
1850-01-16T12:00:00Z.CanESM5.tas.2014-12-16T12:00:00Z

Use All Facets

You may decide that you do not like our attempt to provide simpler keys, in which case you may use the full set of facets.

ds = cat.to_dataset_dict(minimal_keys=False)
for key in ds.keys():
    print(key)
CMIP.gn.1850-01-15T12:00:00Z.CMIP6.Amon.NCAR.r1i1p1f1.CESM2.tas.2014-12-15T12:00:00Z.historical.CMIP6
CMIP.gn.1850-01-15T11:44:59Z.CMIP6.Lmon.NCAR.r1i1p1f1.CESM2.gpp.2014-12-15T12:00:00Z.historical.CMIP6
CMIP.gn.1850-01-16T12:00:00Z.CMIP6.Amon.CCCma.r1i1p1f1.CanESM5.tas.2014-12-16T12:00:00Z.historical.CMIP6
CMIP.gn.1850-01-16T12:00:00Z.CMIP6.Lmon.CCCma.r1i1p1f1.CanESM5.gpp.2014-12-16T12:00:00Z.historical.CMIP6
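If you need the individual facet values back from a full key, one option is to split it on the separator. The sketch below assumes that the facets appear in the key in the same order as in the summary printed by `print(cat)` above, which matches the keys shown here but is not something we guarantee in general. `FACET_ORDER` and `split_key_sketch` are hypothetical names, not part of intake-esgf.

```python
# Facet order copied from the catalog summary above. Treating it as the order
# of values in the key is an assumption based on the keys shown in this section.
FACET_ORDER = [
    "activity_drs", "grid_label", "datetime_start", "mip_era",
    "table_id", "institution_id", "member_id", "source_id",
    "variable_id", "datetime_stop", "experiment_id", "project",
]


def split_key_sketch(key, separator="."):
    """Map a full key back to a {facet: value} dictionary."""
    values = key.split(separator)
    if len(values) != len(FACET_ORDER):
        # Happens for minimal keys, or if a facet value contains the separator.
        raise ValueError(f"expected {len(FACET_ORDER)} values, got {len(values)}")
    return dict(zip(FACET_ORDER, values))


facets = split_key_sketch(
    "CMIP.gn.1850-01-15T12:00:00Z.CMIP6.Amon.NCAR.r1i1p1f1."
    "CESM2.tas.2014-12-15T12:00:00Z.historical.CMIP6"
)
print(facets["source_id"], facets["variable_id"])  # CESM2 tas
```

This only works when no facet value contains the separator character itself.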

Change the Separator

You may also use a different separator. By default we use the . symbol, but you may choose any character. This can be useful if, for example, you wish to pass the output dictionary to the DataTree constructor from xarray-datatree, which expects path-like keys.

ds = cat.to_dataset_dict(minimal_keys=False, separator="/")
for key in ds.keys():
    print(key)
CMIP/gn/1850-01-15T12:00:00Z/CMIP6/Amon/NCAR/r1i1p1f1/CESM2/tas/2014-12-15T12:00:00Z/historical/CMIP6
CMIP/gn/1850-01-15T11:44:59Z/CMIP6/Lmon/NCAR/r1i1p1f1/CESM2/gpp/2014-12-15T12:00:00Z/historical/CMIP6
CMIP/gn/1850-01-16T12:00:00Z/CMIP6/Lmon/CCCma/r1i1p1f1/CanESM5/gpp/2014-12-16T12:00:00Z/historical/CMIP6
CMIP/gn/1850-01-16T12:00:00Z/CMIP6/Amon/CCCma/r1i1p1f1/CanESM5/tas/2014-12-16T12:00:00Z/historical/CMIP6
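To see why a `/` separator suits hierarchical containers, the sketch below turns path-like keys into a nested dictionary using only the standard library. The short keys are made up for brevity and `nest_keys_sketch` is a hypothetical helper; it only illustrates the idea and does not build a DataTree.

```python
def nest_keys_sketch(keys, separator="/"):
    """Build a nested dict whose levels follow the parts of each key."""
    tree = {}
    for key in keys:
        node = tree
        for part in key.split(separator):
            # Descend into the child for this part, creating it if needed.
            node = node.setdefault(part, {})
    return tree


# Shortened, made-up keys standing in for the full keys printed above.
example_keys = ["CESM2/tas", "CESM2/gpp", "CanESM5/tas", "CanESM5/gpp"]
tree = nest_keys_sketch(example_keys)
print(sorted(tree))           # ['CESM2', 'CanESM5']
print(sorted(tree["CESM2"]))  # ['gpp', 'tas']
```

A dictionary with path-like keys of this kind is the sort of input that `DataTree.from_dict` is meant to accept. Check the documentation for the version you have installed for the exact signature.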