hipscat.io.write_metadata

Utility functions for writing metadata files

Module Contents

Classes

HipscatEncoder

Special json encoder for types commonly encountered with hipscat.

Functions

write_json_file(metadata_dictionary, file_pointer[, ...])

Convert metadata_dictionary to a json string and print to file.

write_catalog_info(catalog_base_dir, dataset_info[, ...])

Write a catalog_info.json file with catalog metadata

write_provenance_info(catalog_base_dir, dataset_info, ...)

Write a provenance_info.json file with all assorted catalog creation metadata

write_partition_info(catalog_base_dir, ...[, ...])

Write all partition data to CSV file.

write_parquet_metadata(catalog_path[, storage_options])

Generate parquet metadata, using the already-partitioned parquet files

write_fits_map(catalog_path, histogram[, storage_options])

Write the object spatial distribution information to a healpix FITS file.

class HipscatEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: json.JSONEncoder

Special json encoder for types commonly encountered with hipscat.

NB: This will only be used by JSON encoding when encountering a type that is unhandled by the default encoder.

default(o)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return super().default(o)

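As an illustration of the pattern described above, here is a minimal sketch of an encoder subclass whose default method is consulted only for types the stock encoder cannot handle. The class name SketchEncoder and the types it converts (paths and sets) are assumptions chosen for the example; this is not the actual HipscatEncoder implementation.

```python
import json
from pathlib import PurePath, PurePosixPath


class SketchEncoder(json.JSONEncoder):
    """Hypothetical encoder; not the actual HipscatEncoder implementation."""

    def default(self, o):
        # Only reached for objects the stock encoder cannot serialize.
        if isinstance(o, PurePath):
            return str(o)
        if isinstance(o, (set, frozenset)):
            return sorted(o)
        # Anything else: defer to the base class, which raises TypeError.
        return super().default(o)


# Made-up metadata; strings and ints never reach default().
metadata = {
    "catalog_name": "demo",
    "base_dir": PurePosixPath("catalogs") / "demo",
    "orders": {2, 0, 1},
}
encoded = json.dumps(metadata, cls=SketchEncoder)
```

Passing the class through the cls argument of json.dumps is what routes unhandled values to default; values the base encoder already understands are serialized without it.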
write_json_file(metadata_dictionary: dict, file_pointer: hipscat.io.file_io.FilePointer, storage_options: Dict[Any, Any] | None = None)

Serialize metadata_dictionary to a JSON string and write it to the file at file_pointer.

Parameters:
  • metadata_dictionary (dictionary) – a dictionary of key-value pairs

  • file_pointer (str) – destination for the json file

  • storage_options – dictionary that contains abstract filesystem credentials
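The serialize-then-write step can be approximated with the standard library alone. The sketch below is a rough local-filesystem stand-in, not the library's code: write_json_sketch is a hypothetical helper, it ignores storage_options entirely, and the default=str fallback is a simplification used in place of a custom encoder.

```python
import json
import tempfile
from pathlib import Path


def write_json_sketch(metadata_dictionary, file_path):
    """Hypothetical stand-in: serialize the dict, then write it to file_path."""
    # default=str is a crude fallback for unhandled types, used here only
    # so the sketch runs without a custom encoder class.
    text = json.dumps(metadata_dictionary, indent=4, default=str)
    Path(file_path).write_text(text + "\n", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "catalog_info.json"
    write_json_sketch({"catalog_name": "demo", "total_rows": 131}, target)
    # Read the file back to confirm the round trip.
    round_tripped = json.loads(target.read_text(encoding="utf-8"))
```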

write_catalog_info(catalog_base_dir, dataset_info, storage_options: Dict[Any, Any] | None = None)

Write a catalog_info.json file with catalog metadata

Parameters:
  • catalog_base_dir (str) – base directory for catalog, where file will be written

  • dataset_info (BaseCatalogInfo) – the catalog metadata to write

  • storage_options – dictionary that contains abstract filesystem credentials

write_provenance_info(catalog_base_dir: hipscat.io.file_io.FilePointer, dataset_info, tool_args: dict, storage_options: Dict[Any, Any] | None = None)

Write a provenance_info.json file with all assorted catalog creation metadata

Parameters:
  • catalog_base_dir (str) – base directory for catalog, where file will be written

  • dataset_info (BaseCatalogInfo)

  • tool_args (dict) – dictionary of additional arguments provided by the tool creating this catalog.

  • storage_options – dictionary that contains abstract filesystem credentials

write_partition_info(catalog_base_dir: hipscat.io.file_io.FilePointer, destination_healpix_pixel_map: dict, storage_options: Dict[Any, Any] | None = None)

Write all partition data to CSV file.

Parameters:
  • catalog_base_dir (str) – base directory for catalog, where file will be written

  • destination_healpix_pixel_map (dict) –

    dictionary that maps the HealpixPixel to a tuple of origin pixel information:

    • 0 - the total number of rows found in this destination pixel

    • 1 - the set of indexes in histogram for the pixels at the original healpix order

  • storage_options – dictionary that contains abstract filesystem credentials
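To make the shape of destination_healpix_pixel_map concrete, the following sketch builds a small map and writes one CSV row per destination pixel. Everything specific here is an assumption for illustration: PixelSketch is a hypothetical stand-in for HealpixPixel, the row counts are made up, and the column names are not necessarily the ones hipscat writes.

```python
import csv
import io
from collections import namedtuple

# Hypothetical stand-in for hipscat's HealpixPixel; field names are assumed.
PixelSketch = namedtuple("PixelSketch", ["order", "pixel"])

# destination pixel -> (total rows, set of origin-pixel indexes in the histogram).
# The origin indexes below assume the histogram was built at healpix order 1.
destination_healpix_pixel_map = {
    PixelSketch(0, 11): (131, {44, 45, 46}),
    PixelSketch(1, 12): (42, {12}),
}

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
# Column names are illustrative, not necessarily those hipscat writes.
writer.writerow(["order", "pixel", "num_rows"])
for pixel, (num_rows, _origin_indexes) in sorted(destination_healpix_pixel_map.items()):
    # Element 0 of the tuple is the row count; element 1 is unused in this sketch.
    writer.writerow([pixel.order, pixel.pixel, num_rows])

csv_text = buffer.getvalue()
```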

write_parquet_metadata(catalog_path, storage_options: Dict[Any, Any] | None = None)

Generate parquet metadata, using the already-partitioned parquet files for this catalog

Parameters:
  • catalog_path (str) – base path for the catalog

  • storage_options – dictionary that contains abstract filesystem credentials

write_fits_map(catalog_path, histogram: numpy.ndarray, storage_options: Dict[Any, Any] | None = None)

Write the object spatial distribution information to a healpix FITS file.

Parameters:
  • catalog_path (str) – base path for the catalog

  • histogram (np.ndarray) – one-dimensional numpy array of long integers, where the value at each index is the number of objects found in the healpix pixel with that index.

  • storage_options – dictionary that contains abstract filesystem credentials
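A histogram of the shape described above can be sketched with numpy. The healpix order and the per-object pixel indexes below are made-up illustration data, and the sketch stops short of writing the FITS file; it only shows the array the histogram parameter expects.

```python
import numpy as np

healpix_order = 1
# A HEALPix map at order k has 12 * 4**k pixels.
num_pixels = 12 * 4**healpix_order

# Hypothetical pixel index for each catalog object (made-up data).
object_pixels = np.array([44, 44, 45, 12, 44, 46])

# Count objects per pixel; minlength pads empty pixels with zeros.
histogram = np.bincount(object_pixels, minlength=num_pixels).astype(np.int64)
```

Each index of histogram is a pixel number at the chosen order, so pixels containing no objects still occupy a slot with a count of zero.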