hipscat.io.paths

Methods for creating partitioned data paths.

Module Contents

Functions

pixel_directory(...)

Create path pointer for a pixel directory. This will not create the directory.

get_healpix_from_path(...)

Find the pixel_order and pixel_number from a path string of the form Norder=<pixel_order>/Dir=<directory number>/Npix=<pixel_number>.parquet.

pixel_catalog_files(...)

Create a list of path pointers for pixel catalog files. This will not create the directory or files.

pixel_catalog_file(...)

Create path pointer for a pixel catalog file. This will not create the directory or file.

create_hive_directory_name(base_dir, ...)

Create path pointer for a directory with hive partitioning naming.

create_hive_parquet_file_name(base_dir, ...)

Create path pointer for a single parquet with hive partitioning naming.

get_catalog_info_pointer(...)

Get file pointer to catalog_info.json metadata file

get_partition_info_pointer(...)

Get file pointer to partition_info.csv metadata file

get_provenance_pointer(...)

Get file pointer to provenance_info.json metadata file

get_common_metadata_pointer(...)

Get file pointer to _common_metadata parquet metadata file

get_parquet_metadata_pointer(...)

Get file pointer to _metadata parquet metadata file

get_point_map_file_pointer(...)

Get file pointer to point_map.fits FITS image file.

get_partition_join_info_pointer(...)

Get file pointer to partition_join_info.csv association metadata file

Attributes

ORDER_DIRECTORY_PREFIX

DIR_DIRECTORY_PREFIX

PIXEL_DIRECTORY_PREFIX

JOIN_ORDER_DIRECTORY_PREFIX

JOIN_DIR_DIRECTORY_PREFIX

JOIN_PIXEL_DIRECTORY_PREFIX

CATALOG_INFO_FILENAME

PARTITION_INFO_FILENAME

PARTITION_JOIN_INFO_FILENAME

PROVENANCE_INFO_FILENAME

PARQUET_METADATA_FILENAME

PARQUET_COMMON_METADATA_FILENAME

POINT_MAP_FILENAME

ORDER_DIRECTORY_PREFIX = 'Norder'
DIR_DIRECTORY_PREFIX = 'Dir'
PIXEL_DIRECTORY_PREFIX = 'Npix'
JOIN_ORDER_DIRECTORY_PREFIX = 'join_Norder'
JOIN_DIR_DIRECTORY_PREFIX = 'join_Dir'
JOIN_PIXEL_DIRECTORY_PREFIX = 'join_Npix'
CATALOG_INFO_FILENAME = 'catalog_info.json'
PARTITION_INFO_FILENAME = 'partition_info.csv'
PARTITION_JOIN_INFO_FILENAME = 'partition_join_info.csv'
PROVENANCE_INFO_FILENAME = 'provenance_info.json'
PARQUET_METADATA_FILENAME = '_metadata'
PARQUET_COMMON_METADATA_FILENAME = '_common_metadata'
POINT_MAP_FILENAME = 'point_map.fits'

pixel_directory(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer, pixel_order: int, pixel_number: int | None = None, directory_number: int | None = None) -> hipscat.io.file_io.file_pointer.FilePointer

Create path pointer for a pixel directory. This will not create the directory.

One of pixel_number or directory_number is required. The directory name will take the HiPS standard form of:

<catalog_base_dir>/Norder=<pixel_order>/Dir=<directory number>

Where the directory number is calculated using integer division as:

(pixel_number // 10000) * 10000

Parameters:
  • catalog_base_dir (FilePointer) – base directory of the catalog (includes catalog name)

  • pixel_order (int) – the healpix order of the pixel

  • directory_number (int) – the directory number; if omitted, it is calculated from pixel_number

  • pixel_number (int) – the healpix pixel number

Returns:

FilePointer to the pixel directory
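
The directory-number arithmetic described above can be illustrated with a small standalone sketch. It does not call hipscat: sketch_pixel_directory is a hypothetical helper that uses plain strings in place of FilePointer and mirrors only the naming rule documented here, and the catalog path is made up for the example.

```python
import posixpath


def sketch_pixel_directory(catalog_base_dir, pixel_order, pixel_number=None, directory_number=None):
    """Hypothetical stand-in that mirrors the documented naming rule only."""
    if directory_number is None:
        if pixel_number is None:
            raise ValueError("One of pixel_number or directory_number is required")
        # Integer division groups pixels into blocks of 10,000 per directory.
        directory_number = (pixel_number // 10000) * 10000
    return posixpath.join(
        catalog_base_dir, f"Norder={pixel_order}", f"Dir={directory_number}"
    )


# Pixel 12345 falls into the directory block that starts at 10000.
example_dir = sketch_pixel_directory("/data/my_catalog", 5, pixel_number=12345)
print(example_dir)  # /data/my_catalog/Norder=5/Dir=10000
```

Passing directory_number directly skips the calculation, which matches the "one of pixel_number or directory_number" requirement stated above.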

get_healpix_from_path(path: str) -> hipscat.pixel_math.healpix_pixel.HealpixPixel

Find the pixel_order and pixel_number from a string like the following:

Norder=<pixel_order>/Dir=<directory number>/Npix=<pixel_number>.parquet

NB: This expects the format generated by the pixel_catalog_file function.

Parameters:

path (str) – path to parse

Returns:

Constructed HealpixPixel object representing the pixel in the path. INVALID_PIXEL if the path doesn’t match the expected pattern for any reason.
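
As a rough illustration of the parsing described above, the following standalone sketch extracts the order and pixel number with a regular expression. The pattern and the helper name sketch_healpix_from_path are assumptions made for this example, not the library's implementation; the sketch returns a plain tuple (or None) where the real function returns a HealpixPixel (or INVALID_PIXEL).

```python
import re

# Assumed pattern for illustration; the library's actual expression may differ.
PATH_PATTERN = re.compile(r"Norder=(\d+)/Dir=\d+/Npix=(\d+)\.parquet$")


def sketch_healpix_from_path(path):
    """Return (pixel_order, pixel_number), or None when the path does not match."""
    match = PATH_PATTERN.search(path)
    if match is None:
        # The documented function returns INVALID_PIXEL in this case.
        return None
    return int(match.group(1)), int(match.group(2))


parsed = sketch_healpix_from_path("/data/my_catalog/Norder=5/Dir=10000/Npix=12345.parquet")
print(parsed)  # (5, 12345)
```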

pixel_catalog_files(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer, pixels: List[hipscat.pixel_math.healpix_pixel.HealpixPixel], storage_options: Dict | None = None) -> List[hipscat.io.file_io.file_pointer.FilePointer]

Create a list of path pointers for pixel catalog files. This will not create the directory or files.

The catalog file names will take the HiPS standard form of:

<catalog_base_dir>/Norder=<pixel_order>/Dir=<directory number>/Npix=<pixel_number>.parquet

Where the directory number is calculated using integer division as:

(pixel_number // 10000) * 10000

Parameters:
  • catalog_base_dir (FilePointer) – base directory of the catalog (includes catalog name)

  • pixels (List[HealpixPixel]) – the healpix pixels to create pointers to

  • storage_options (dict) – the storage options for the file system to target when generating the paths

Returns (List[FilePointer]):

A list of paths to the pixels, in the same order as the input pixel list.

pixel_catalog_file(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer, pixel_order: int, pixel_number: int) -> hipscat.io.file_io.file_pointer.FilePointer

Create path pointer for a pixel catalog file. This will not create the directory or file.

The catalog file name will take the HiPS standard form of:

<catalog_base_dir>/Norder=<pixel_order>/Dir=<directory number>/Npix=<pixel_number>.parquet

Where the directory number is calculated using integer division as:

(pixel_number // 10000) * 10000

Parameters:
  • catalog_base_dir (FilePointer) – base directory of the catalog (includes catalog name)

  • pixel_order (int) – the healpix order of the pixel

  • pixel_number (int) – the healpix pixel

Returns:

FilePointer for the catalog file name
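
The file naming used by pixel_catalog_file and pixel_catalog_files can be sketched without hipscat as follows. Both helpers are hypothetical stand-ins: they use plain strings instead of FilePointer and (order, pixel) tuples instead of HealpixPixel objects, and they ignore storage_options entirely.

```python
import posixpath


def sketch_pixel_catalog_file(catalog_base_dir, pixel_order, pixel_number):
    """Hypothetical stand-in: builds the documented file name as a plain string."""
    directory_number = (pixel_number // 10000) * 10000
    return posixpath.join(
        catalog_base_dir,
        f"Norder={pixel_order}",
        f"Dir={directory_number}",
        f"Npix={pixel_number}.parquet",
    )


def sketch_pixel_catalog_files(catalog_base_dir, pixels):
    """List form: one path per (order, pixel) pair, in the same order as the input."""
    return [
        sketch_pixel_catalog_file(catalog_base_dir, order, pixel)
        for order, pixel in pixels
    ]


single = sketch_pixel_catalog_file("/data/my_catalog", 5, 12345)
several = sketch_pixel_catalog_files("/data/my_catalog", [(0, 11), (1, 44)])
print(single)  # /data/my_catalog/Norder=5/Dir=10000/Npix=12345.parquet
```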

create_hive_directory_name(base_dir, partition_token_names, partition_token_values)

Create path pointer for a directory with hive partitioning naming. This will not create the directory.

The directory name will have the form of:

<base_dir>/<name_1>=<value_1>/.../<name_n>=<value_n>

Parameters:
  • base_dir (FilePointer) – base directory of the catalog (includes catalog name)

  • partition_token_names (list[string]) – list of partition name parts.

  • partition_token_values (list[string]) – list of partition values that correspond to the token name parts.
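
The hive-style directory form above pairs each token name with its value. A minimal standalone sketch follows; sketch_hive_directory_name is a hypothetical helper operating on plain strings, and the length check is an assumption of the sketch rather than documented behavior.

```python
import posixpath


def sketch_hive_directory_name(base_dir, partition_token_names, partition_token_values):
    """Hypothetical stand-in: join <name>=<value> segments under base_dir."""
    # Assumed guard: every token name needs exactly one value.
    if len(partition_token_names) != len(partition_token_values):
        raise ValueError("token names and values must have the same length")
    segments = [
        f"{name}={value}"
        for name, value in zip(partition_token_names, partition_token_values)
    ]
    return posixpath.join(base_dir, *segments)


hive_dir = sketch_hive_directory_name("/data/my_catalog", ["Norder", "Dir"], [5, 10000])
print(hive_dir)  # /data/my_catalog/Norder=5/Dir=10000
```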

create_hive_parquet_file_name(base_dir, partition_token_names, partition_token_values)

Create path pointer for a single parquet with hive partitioning naming.

The file name will have the form of:

<base_dir>/<name_1>=<value_1>/.../<name_n>=<value_n>.parquet

Parameters:
  • base_dir (FilePointer) – base directory of the catalog (includes catalog name)

  • partition_token_names (list[string]) – list of partition name parts.

  • partition_token_values (list[string]) – list of partition values that correspond to the token name parts.
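
For the parquet file form, the only difference from the directory form is the .parquet suffix on the final segment. The sketch below shows this with the join_ prefixes listed in the module attributes; sketch_hive_parquet_file_name is a hypothetical helper on plain strings, not the library function.

```python
import posixpath


def sketch_hive_parquet_file_name(base_dir, partition_token_names, partition_token_values):
    """Hypothetical stand-in: hive-style path ending in a single .parquet file."""
    segments = [
        f"{name}={value}"
        for name, value in zip(partition_token_names, partition_token_values)
    ]
    # The suffix attaches to the last <name>=<value> segment, as in the form above.
    return posixpath.join(base_dir, *segments) + ".parquet"


join_file = sketch_hive_parquet_file_name(
    "/data/my_association",
    ["join_Norder", "join_Dir", "join_Npix"],
    [1, 0, 44],
)
print(join_file)  # /data/my_association/join_Norder=1/join_Dir=0/join_Npix=44.parquet
```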

get_catalog_info_pointer(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer) -> hipscat.io.file_io.file_pointer.FilePointer

Get file pointer to catalog_info.json metadata file

Parameters:

catalog_base_dir – pointer to base catalog directory

Returns:

File Pointer to the catalog’s catalog_info.json file

get_partition_info_pointer(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer) -> hipscat.io.file_io.file_pointer.FilePointer

Get file pointer to partition_info.csv metadata file

Parameters:

catalog_base_dir – pointer to base catalog directory

Returns:

File Pointer to the catalog’s partition_info.csv file

get_provenance_pointer(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer) -> hipscat.io.file_io.file_pointer.FilePointer

Get file pointer to provenance_info.json metadata file

Parameters:

catalog_base_dir – pointer to base catalog directory

Returns:

File Pointer to the catalog’s provenance_info.json file

get_common_metadata_pointer(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer) -> hipscat.io.file_io.file_pointer.FilePointer

Get file pointer to _common_metadata parquet metadata file

Parameters:

catalog_base_dir – pointer to base catalog directory

Returns:

File Pointer to the catalog’s _common_metadata file

get_parquet_metadata_pointer(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer) -> hipscat.io.file_io.file_pointer.FilePointer

Get file pointer to _metadata parquet metadata file

Parameters:

catalog_base_dir – pointer to base catalog directory

Returns:

File Pointer to the catalog’s _metadata file

get_point_map_file_pointer(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer) -> hipscat.io.file_io.file_pointer.FilePointer

Get file pointer to point_map.fits FITS image file.

Parameters:

catalog_base_dir – pointer to base catalog directory

Returns:

File Pointer to the catalog’s point_map.fits FITS image file.

get_partition_join_info_pointer(catalog_base_dir: hipscat.io.file_io.file_pointer.FilePointer) -> hipscat.io.file_io.file_pointer.FilePointer

Get file pointer to partition_join_info.csv association metadata file

Parameters:

catalog_base_dir – pointer to base catalog directory

Returns:

File Pointer to the catalog’s partition_join_info.csv association metadata file
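
The metadata pointer functions above all follow the same shape: take the catalog base directory and point at one well-known file name. The standalone sketch below illustrates that shape using the file names from the module attributes. It assumes each file sits directly under the catalog base directory, which this page does not state explicitly, and sketch_metadata_pointer and its kind keys are hypothetical names invented for the example.

```python
import posixpath

# File names copied from the module attributes documented on this page.
METADATA_FILENAMES = {
    "catalog_info": "catalog_info.json",
    "partition_info": "partition_info.csv",
    "partition_join_info": "partition_join_info.csv",
    "provenance_info": "provenance_info.json",
    "parquet_metadata": "_metadata",
    "parquet_common_metadata": "_common_metadata",
    "point_map": "point_map.fits",
}


def sketch_metadata_pointer(catalog_base_dir, kind):
    """Hypothetical stand-in: join the base directory with a known file name.

    Assumes the metadata file lives directly under catalog_base_dir.
    """
    return posixpath.join(catalog_base_dir, METADATA_FILENAMES[kind])


info_path = sketch_metadata_pointer("/data/my_catalog", "catalog_info")
print(info_path)  # /data/my_catalog/catalog_info.json
```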