hipscat.inspection.almanac

Module Contents

Classes

Almanac

Single instance of an almanac, and available catalogs within namespaces

class Almanac(include_default_dir=True, dirs=None, storage_options: Dict[Any, Any] | None = None)

Single instance of an almanac, and available catalogs within namespaces

Parameters:
include_default_dir:

Include the directory indicated by the HIPSCAT_ALMANAC_DIR environment variable. See AlmanacInfo.get_default_dir.

dirs:

Additional directories or files to search for almanac files. Several input types are supported, each with different behavior:

  • str - a single directory (or file)

  • list[str] - multiple directories (or files)

  • dict[str, str] / dict[str, list[str]] - namespace dictionary. For each key in the dictionary, all almanac entries found under it are placed in that key's namespace. This is useful when names collide, e.g. between multiple surveys or user-provided catalogs.
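
The three accepted shapes of dirs can all be reduced to one namespace-to-paths mapping. The following is a minimal sketch of that idea, not hipscat's implementation: the helper name normalize_dirs is hypothetical, and using the empty string to mean "no namespace" is an assumption that mirrors the namespace='' default of _add_files_to_namespace.

```python
from typing import Dict, List, Union

DirsInput = Union[None, str, List[str], Dict[str, Union[str, List[str]]]]


def normalize_dirs(dirs: DirsInput) -> Dict[str, List[str]]:
    """Hypothetical helper: map each namespace to a list of paths.

    The empty-string key stands for "no namespace" (an assumption).
    """
    if dirs is None:
        return {}
    if isinstance(dirs, str):
        # A single directory (or file), with no namespace.
        return {"": [dirs]}
    if isinstance(dirs, dict):
        # Namespace dictionary: each value is one path or a list of paths.
        return {
            namespace: [paths] if isinstance(paths, str) else list(paths)
            for namespace, paths in dirs.items()
        }
    # Any other iterable is treated as several un-namespaced paths.
    return {"": list(dirs)}
```

With this shape, a caller can loop over the mapping once and scan each path under its namespace, regardless of which input type the user supplied.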

_init_files(include_default_dir=True, dirs=None)

Create a list of all the almanac files we want to add to this instance.

Each almanac file corresponds to a single catalog.

Parameters:
  • include_default_dir – include the directory indicated by the HIPSCAT_ALMANAC_DIR environment variable. See AlmanacInfo.get_default_dir

  • dirs – additional directories to look for almanac files in

_add_files_to_namespace(directory, namespace='')

Get almanac files within a directory or list of directories.

Parameters:
  • directory – directory to scan

  • namespace – if provided, files in this directory will be in their own namespace in the almanac

_init_catalog_objects()

Create (unlinked) almanac info objects for all the files found in the previous steps.

Initialize the links between almanac catalogs.

For each type of link (e.g. primary or join), look for the catalog in the almanac, using whatever text we have. If found, add the object to the almanac info as a pointer. Additionally, add the reference to the linked object, so catalogs know about each other from either side.
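
The two-sided linking described above can be pictured with a small stand-in. This is a sketch under stated assumptions, not the library's code: InfoSketch, its field names, and link_primaries are all hypothetical, only the "primary" link type is shown, and names are looked up directly rather than through _get_linked_catalog.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InfoSketch:
    """Hypothetical stand-in for an almanac info object."""

    catalog_name: str
    primary: Optional[str] = None  # text naming the primary catalog, if any
    primary_link: Optional["InfoSketch"] = None  # forward pointer, set when linked
    associations: List["InfoSketch"] = field(default_factory=list)  # back-references


def link_primaries(entries: Dict[str, InfoSketch]) -> None:
    """Resolve each entry's primary text and record the link on both sides."""
    for info in entries.values():
        target = entries.get(info.primary) if info.primary else None
        if target is None:
            continue  # no link text, or the named catalog is not in the almanac
        info.primary_link = target        # the receiving catalog points at its primary
        target.associations.append(info)  # the primary records who links to it
```

After linking, either catalog can be used as a starting point to reach the other, which is the "from either side" property the description refers to.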

_get_linked_catalog(linked_text, namespace) → hipscat.inspection.almanac_info.AlmanacInfo | None

Find a catalog to be used for linking catalogs within the almanac.

For example, an association table has a primary catalog and a join catalog. The association catalog is “receiving” a link to the primary catalog's info and a link to the join catalog's info.

Parameters:
  • linked_text

    Text provided for the linked catalog. This can take several forms:

    • empty or None (returns None)

    • short name of a catalog

    • namespaced name of a catalog

    • full path to a catalog base directory

    • path to a catalog base directory, with environment variables

  • namespace – the namespace of the catalog receiving the link. This is used to resolve the linked_text argument, so if you’re relying on namespaces, the receiving and linked catalogs should be in the same namespace

Returns:

almanac info for the linked catalog, if found
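
One way to resolve the different forms of linked_text is to try them in turn. The sketch below is illustrative only: the function name, the known_names and dir_to_name arguments, the ":" separator between namespace and catalog name, and the order of the lookups are all assumptions, not details confirmed by this page. It returns a catalog name rather than an AlmanacInfo object to stay self-contained.

```python
import os
from typing import Dict, Optional, Set


def resolve_linked_text(
    linked_text: Optional[str],
    namespace: str,
    known_names: Set[str],
    dir_to_name: Dict[str, str],
) -> Optional[str]:
    """Hypothetical resolver: return the almanac name for linked_text, or None."""
    if not linked_text:
        return None  # empty or None
    # Short name, resolved inside the receiving catalog's namespace.
    if namespace:
        namespaced = f"{namespace}:{linked_text}"  # ":" separator is an assumption
        if namespaced in known_names:
            return namespaced
    # Short name without a namespace, or a name that is already namespaced.
    if linked_text in known_names:
        return linked_text
    # Path to a catalog base directory, after expanding environment variables.
    expanded = os.path.expandvars(linked_text)
    return dir_to_name.get(expanded)
```

Because the namespace of the receiving catalog is tried first, a short name only resolves to a namespaced catalog when both catalogs share that namespace.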

catalogs(include_deprecated=False, types: List[str] | None = None)

Get names of catalogs in the almanac, matching the provided conditions.

Catalogs must meet all of the provided criteria to be returned (i.e. the criteria are ANDed together).

Parameters:
  • include_deprecated – also include catalogs that have any text in their deprecated field.

  • types – include ONLY catalogs within the list of provided types.
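
The AND semantics can be shown with a small stand-alone filter. This is a sketch, not the method's actual body: EntrySketch, its catalog_type and deprecated fields, and select_catalogs are hypothetical names, and treating an empty types list the same as None is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EntrySketch:
    """Hypothetical almanac entry with only the fields the filter needs."""

    catalog_type: str
    deprecated: str = ""  # any non-empty text marks the catalog as deprecated


def select_catalogs(
    entries: Dict[str, EntrySketch],
    include_deprecated: bool = False,
    types: Optional[List[str]] = None,
) -> List[str]:
    """Return the names of entries that pass every active criterion."""
    selected = []
    for name, entry in entries.items():
        if not include_deprecated and entry.deprecated:
            continue  # fails the deprecation criterion
        # An empty or missing list disables the type filter (assumption).
        if types and entry.catalog_type not in types:
            continue  # fails the type criterion
        selected.append(name)
    return selected
```

A catalog is dropped as soon as it fails any one criterion, so passing both arguments narrows the result rather than widening it.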

get_almanac_info(catalog_name: str) → hipscat.inspection.almanac_info.AlmanacInfo

Fetch the almanac info for a single catalog.

get_catalog(catalog_name: str) → hipscat.catalog.dataset.dataset.Dataset

Fetch the fully-populated hipscat metadata for the catalog name.

This will load the catalog_info.json and other relevant metadata files from disk.