Orchestration
A common architecture for moving data is defined in `orchestration/transfer_controller.py`.
bl7012
Info goes here
bl733
Info goes here
bl832
Beamline 8.3.2 is the Hard X-ray Micro-tomography instrument at the Advanced Light Source.
config.py
The `Config832` class configures and initializes the components needed for data transfer and orchestration workflows at ALS beamline 832.
- Initialization: The `Config832` class retrieves configuration data using the `transfer.get_config()` function, which is used to set up the necessary endpoints and applications for data transfer.
- Endpoints and Applications: It constructs a set of endpoints using `transfer.build_endpoints()` and applications with `transfer.build_apps()`. These endpoints represent different storage locations for data, both on local and remote systems (e.g., `spot832`, `data832`, `nersc832`).
- Transfer Client: A `TransferClient` instance is created using `transfer.init_transfer_client()`, with a specified application (`als_transfer`). This client handles the file transfer operations between endpoints.
- Storage Locations: Multiple data storage endpoints are stored as attributes, representing raw data, scratch space, and other storage locations on different systems (e.g., `data832_raw`, `nersc832_alsdev_scratch`).
- Additional Configurations: The configuration dictionary (`config`) also contains other values, such as `scicat` (for metadata) and `ghcr_images832` (for container images), which are likewise stored as attributes.
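The initialization pattern described above can be sketched as follows. This is a simplified, self-contained illustration: `get_config`, `build_endpoints`, `build_apps`, and `init_transfer_client` are stand-in stubs (the real versions live in the `transfer` module), and the example values are placeholders.

```python
# Simplified sketch of the Config832 initialization pattern.
# The helper functions below are stand-in stubs for illustration only.

def get_config():
    # In the real code this loads the orchestration configuration.
    return {
        "endpoints": {
            "data832_raw": "/global/data832/raw",
            "nersc832_alsdev_scratch": "/pscratch/alsdev",
        },
        "apps": {"als_transfer": {"client_id": "example-client-id"}},
        "scicat": {"url": "https://scicat.example.org"},
    }

def build_endpoints(config):
    return dict(config["endpoints"])

def build_apps(config):
    return dict(config["apps"])

def init_transfer_client(app):
    # Stand-in for creating a Globus transfer client from app credentials.
    return ("TransferClient", app["client_id"])

class Config832:
    def __init__(self):
        config = get_config()
        self.endpoints = build_endpoints(config)
        self.apps = build_apps(config)
        self.tc = init_transfer_client(self.apps["als_transfer"])
        # Frequently used storage locations are exposed as attributes.
        self.data832_raw = self.endpoints["data832_raw"]
        self.nersc832_alsdev_scratch = self.endpoints["nersc832_alsdev_scratch"]
        self.scicat = config["scicat"]
```

Downstream flows can then pass a single `Config832` instance around instead of re-reading configuration in every task.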
dispatcher.py
This script is designed to automate and orchestrate the decision-making process for the BL832 beamline, ensuring that the appropriate workflows are triggered based on predefined settings.
Key Components:
- `FlowParameterMapper` class: Maps the required parameters for each flow based on a predefined set. The `get_flow_parameters` method filters a dictionary of available parameters and returns only those relevant to the specified flow.
- `DecisionFlowInputModel`: A Pydantic model that validates the input parameters for the decision flow, including the file path, export control status, and configuration dictionary. It ensures that the inputs are in the correct format before they are used in the decision-making process.
- `setup_decision_settings` task: Defines which flows should be executed (e.g., ALCF, NERSC, and 832 beamline file management). It logs the configuration and saves the decision settings as a JSON block for later use by other flows.
- `run_specific_flow` task: An asynchronous task that runs a specific flow based on the provided flow name and parameters. It is executed dynamically and uses the Prefect deployment system to trigger flows as needed.
- `dispatcher` flow: The main flow, which coordinates the execution of the tasks above:
  - Validates input parameters using `DecisionFlowInputModel`.
  - Loads decision settings from a previously saved JSON block.
  - Runs the `new_832_file_flow/new_file_832` flow synchronously first, if required.
  - Runs the ALCF and NERSC flows asynchronously if they are enabled in the decision settings.

This flow ensures that the correct sub-flows are executed in the correct order, using dynamic parameters for each flow.
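The parameter-filtering behavior of `FlowParameterMapper.get_flow_parameters` can be sketched like this. The per-flow parameter lists below are hypothetical placeholders, not the actual mapping defined in dispatcher.py:

```python
class FlowParameterMapper:
    # Hypothetical per-flow parameter lists, for illustration only;
    # the real mapping is defined in dispatcher.py.
    FLOW_PARAMETERS = {
        "alcf_recon_flow": ["file_path", "config"],
        "nersc_recon_flow": ["file_path", "config"],
        "new_file_832": ["file_path", "is_export_control", "send_to_nersc"],
    }

    @classmethod
    def get_flow_parameters(cls, flow_name, available_params):
        # Keep only the parameters that the named flow actually needs.
        required = cls.FLOW_PARAMETERS[flow_name]
        return {k: v for k, v in available_params.items() if k in required}
```

This keeps each sub-flow's call signature decoupled from the full set of parameters the dispatcher collects.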
move.py
This script defines a set of tasks and flows for transferring files between different endpoints and systems, processing data, and managing file deletions. It leverages Globus for file transfers and Prefect for orchestration.
Key Components:
- `transfer_spot_to_data` task: Transfers a file from the `spot832` endpoint to the `data832` endpoint using the Globus `TransferClient`. It ensures that the file paths are correctly formatted and calls the `start_transfer` function to perform the transfer.
- `transfer_data_to_nersc` task: Similar to the previous task, this one transfers data from `data832` to the `nersc832` endpoint. It also handles file path formatting and logs transfer details.
- `process_new_832_file` flow: The main flow for processing new files in the system. It performs multiple steps:
  - Transfers the file from `spot832` to `data832` using `transfer_spot_to_data`.
  - Transfers the file from `data832` to NERSC, if export control is not enabled and `send_to_nersc` is true, using `transfer_data_to_nersc`.
  - Ingests the file into SciCat using the `ingest_dataset` function; if SciCat ingestion fails, it logs an error.
  - Schedules file deletion tasks for both `spot832` and `data832` using the `schedule_prefect_flow` function, ensuring files are deleted after a set number of days as configured in `bl832-settings`.
- `test_transfers_832` flow: Used for testing transfers between `spot832`, `data832`, and NERSC. It:
  - Generates a new unique file name.
  - Transfers the file from `spot832` to `data832`, and then from `data832` to NERSC.
  - Logs the success of each transfer and checks that the process works as expected.
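The export-control gating in `process_new_832_file` boils down to a small predicate. This is a sketch of the described logic, not the actual code:

```python
def should_send_to_nersc(is_export_controlled: bool, send_to_nersc: bool) -> bool:
    # Data is transferred to NERSC only when export control is NOT in effect
    # AND the operator has enabled NERSC transfers.
    return (not is_export_controlled) and send_to_nersc
```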
Configuration:
- `API_KEY`: Retrieved from the environment variable `API_KEY`; used for authorization with Globus and other services.
- `TOMO_INGESTOR_MODULE`: A reference to the module used for ingesting datasets into SciCat.
- `Config832`: The configuration for the beamline 832 environment, which includes the necessary endpoints for file transfers and other operations.
File Deletion Scheduling:
- After transferring the files, the flow schedules the deletion of files from `spot832` and `data832` after a predefined number of days, using Prefect's flow scheduling capabilities.
Error Handling:
- Throughout the script, errors are logged in case of failure during file transfers, SciCat ingestion, or scheduling tasks.
Use Case:
This script is primarily used for automating the movement of files from `spot832` to `data832`, sending data to NERSC, ingesting the data into SciCat, and managing file cleanup on both `spot832` and `data832` systems. It allows flexible configuration based on export control settings and NERSC transfer preferences.
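Computing the deferred deletion time works along these lines (illustrative only; the real flow delegates scheduling to Prefect via `schedule_prefect_flow`, with the delay taken from `bl832-settings`):

```python
from datetime import datetime, timedelta, timezone

def deletion_time(delay_days: int) -> datetime:
    # Files are pruned this many days after the transfer completes.
    return datetime.now(timezone.utc) + timedelta(days=delay_days)
```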
job_controller.py
This script defines an abstract class and a factory function for managing tomography reconstruction and multi-resolution dataset building on different High-Performance Computing (HPC) systems. It uses the `Config832` class for configuration and handles different HPC environments such as ALCF and NERSC. The abstraction allows for easy expansion to support additional systems, such as OLCF, in the future.
Key Components:
- `TomographyHPCController` class (abstract base class): Defines the interface for tomography HPC controllers, with methods for performing tomography reconstruction and generating multi-resolution datasets.
  - Methods:
    - `reconstruct(file_path: str) -> bool`: Performs tomography reconstruction for a given file. Returns `True` if successful, `False` otherwise.
    - `build_multi_resolution(file_path: str) -> bool`: Generates a multi-resolution version of the reconstructed tomography data for a given file. Returns `True` if successful, `False` otherwise.
- `HPC` enum: Represents the different HPC environments, with members:
  - `ALCF`: Argonne Leadership Computing Facility
  - `NERSC`: National Energy Research Scientific Computing Center
  - `OLCF`: Oak Ridge Leadership Computing Facility (currently not implemented)
- `get_controller` function: A factory function that returns an appropriate `TomographyHPCController` subclass instance based on the specified HPC environment (`ALCF`, `NERSC`, or `OLCF`).
  - Parameters:
    - `hpc_type`: An enum value identifying the HPC environment (e.g., `ALCF`, `NERSC`).
    - `config`: A `Config832` object containing configuration data.
  - Returns: An instance of the corresponding `TomographyHPCController` subclass.
  - Raises: A `ValueError` if the `hpc_type` is invalid or unsupported, or if no config object is provided.
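The abstract-base-class plus factory structure described above can be sketched as follows. The subclass bodies here are stubs standing in for the real ALCF/NERSC implementations:

```python
from abc import ABC, abstractmethod
from enum import Enum

class TomographyHPCController(ABC):
    """Interface every site-specific controller must implement."""

    @abstractmethod
    def reconstruct(self, file_path: str) -> bool: ...

    @abstractmethod
    def build_multi_resolution(self, file_path: str) -> bool: ...

class HPC(Enum):
    ALCF = "ALCF"
    NERSC = "NERSC"
    OLCF = "OLCF"

class ALCFTomographyHPCController(TomographyHPCController):
    # Stub subclass; the real class submits Globus Compute jobs.
    def __init__(self, config):
        self.config = config

    def reconstruct(self, file_path: str) -> bool:
        return True

    def build_multi_resolution(self, file_path: str) -> bool:
        return True

def get_controller(hpc_type, config):
    # Factory: pick the controller subclass for the requested facility.
    if config is None:
        raise ValueError("a config object is required")
    if hpc_type == HPC.ALCF:
        return ALCFTomographyHPCController(config)
    raise ValueError(f"unsupported HPC environment: {hpc_type}")
```

Adding OLCF support later only means writing one more subclass and one more branch in the factory; callers keep using the same two-method interface.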
alcf.py
This script is responsible for performing tomography reconstruction and multi-resolution data processing on ALCF using Globus Compute. It orchestrates file transfers, reconstructs tomography data, builds multi-resolution datasets, and then transfers the results back to the `data832` endpoint.
Key Components:
- `ALCFTomographyHPCController` class: Implements the `TomographyHPCController` abstract class for the ALCF environment, enabling tomography reconstruction and multi-resolution dataset creation using Globus Compute.
  - Methods:
    - `reconstruct(file_path: str) -> bool`: Runs the tomography reconstruction by submitting a Globus Compute job to ALCF.
    - `build_multi_resolution(file_path: str) -> bool`: Converts TIFF files to Zarr format using Globus Compute.
    - `_reconstruct_wrapper(...)`: A static method that wraps the reconstruction process, running the `globus_reconstruction.py` script.
    - `_build_multi_resolution_wrapper(...)`: A static method that wraps the TIFF-to-Zarr conversion, running the `tiff_to_zarr.py` script.
    - `_wait_for_globus_compute_future(future: Future, task_name: str, check_interval: int = 20) -> bool`: Waits for a Globus Compute task to complete, checking its status at regular intervals.
- `schedule_prune_task` task: Schedules the deletion of files from a specified location, allowing processed files to be deleted from ALCF, NERSC, or data832.
  - Parameters:
    - `path`: The path to the folder containing the files to delete.
    - `location`: The server location where the files are stored.
    - `schedule_days`: The delay before deletion, in days.
  - Returns: `True` if the task was scheduled successfully, `False` otherwise.
- `schedule_pruning` task: Schedules multiple file pruning operations based on configuration settings. It takes paths for various scratch and raw data locations, including ALCF, NERSC, and data832.
  - Parameters:
    - `alcf_raw_path`, `alcf_scratch_path_tiff`, `alcf_scratch_path_zarr`, etc.: Paths for the file locations to be pruned.
    - `one_minute`: A flag for testing purposes, scheduling pruning after one minute.
    - `config`: Configuration object for the flow.
  - Returns: `True` if all tasks were scheduled successfully, `False` otherwise.
- `alcf_recon_flow` flow: The main flow for processing and transferring files between `data832` and ALCF. It orchestrates the following steps:
  - Transfer data from `data832` to ALCF.
  - Run tomography reconstruction on ALCF using Globus Compute.
  - Run TIFF-to-Zarr conversion on ALCF using Globus Compute.
  - Transfer the reconstructed data (both TIFF and Zarr) back to `data832`.
  - Schedule pruning tasks to delete files from ALCF and data832 after the defined period.

  Parameters:
  - `file_path`: The file to be processed, typically in `.h5` format.
  - `config`: Configuration object containing the necessary endpoints for data transfers.

  Returns: `True` if the flow completed successfully, `False` otherwise.
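The polling behavior of `_wait_for_globus_compute_future` can be sketched with a plain `concurrent.futures.Future` (a simplified stand-in; the real method also inspects Globus Compute task states and logs through Prefect):

```python
import time
from concurrent.futures import Future

def wait_for_globus_compute_future(future: Future, task_name: str,
                                   check_interval: int = 20) -> bool:
    # Poll until the remote task finishes, then report success or failure.
    while not future.done():
        print(f"{task_name}: still running, checking again in {check_interval}s")
        time.sleep(check_interval)
    try:
        future.result()  # re-raises any exception from the remote task
        return True
    except Exception as err:
        print(f"{task_name} failed: {err}")
        return False
```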
nersc.py
This script manages tomography reconstruction and multi-resolution data processing on NERSC using the SFAPI client. It submits jobs for reconstruction and multi-resolution processing, transfers data between NERSC and `data832`, and schedules pruning tasks for file cleanup.
Key Components:
- `NERSCTomographyHPCController` class: Implements the `TomographyHPCController` abstract class for the NERSC environment, enabling tomography reconstruction and multi-resolution dataset creation.
  - Methods:
    - `create_sfapi_client() -> Client`: Creates and returns a NERSC client instance using the provided credentials for accessing the NERSC SFAPI.
    - `reconstruct(file_path: str) -> bool`: Starts the tomography reconstruction at NERSC by submitting a job script to the Perlmutter machine via SFAPI.
    - `build_multi_resolution(file_path: str) -> bool`: Converts TIFF files to Zarr format on NERSC by submitting a job script to Perlmutter.
- `schedule_pruning` task: Schedules the deletion of files from various locations (data832, NERSC) after a specified duration.
  - Parameters:
    - `raw_file_path`, `tiff_file_path`, `zarr_file_path`: Paths to the raw, TIFF, and Zarr files.
  - Uses configuration values to determine where and when to delete the files.
  - Returns: `True` if all prune tasks were scheduled successfully, `False` otherwise.
- `nersc_recon_flow` flow: The main flow for performing tomography reconstruction and multi-resolution processing at NERSC. It performs the following steps:
  - Run tomography reconstruction using NERSC's SFAPI.
  - Run multi-resolution processing to convert TIFF files to Zarr format.
  - Transfer the reconstructed TIFF and Zarr files from NERSC to `data832`.
  - Schedule pruning tasks to delete processed files from NERSC and data832 after a defined period.

  Parameters:
  - `file_path`: The path to the file to be processed.
  - `config`: Configuration object containing the necessary endpoints for data transfers.

  Returns: `True` if both reconstruction and multi-resolution tasks were successful, `False` otherwise.
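Submitting a reconstruction via SFAPI amounts to rendering a batch script and handing it to the client. A hypothetical script builder might look like this (the actual SLURM directives, account, container images, and commands in nersc.py differ):

```python
def build_recon_job_script(file_path: str, account: str = "example-account") -> str:
    # Hypothetical SLURM batch script for a Perlmutter reconstruction job;
    # the real directives and commands are defined in nersc.py.
    return f"""#!/bin/bash
#SBATCH --qos=regular
#SBATCH --constraint=cpu
#SBATCH --account={account}
#SBATCH --job-name=tomo_recon
srun python globus_reconstruction.py {file_path}
"""
```

The controller would pass the rendered script to the SFAPI client, then poll the returned job until it completes before reporting success or failure.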
olcf.py
To be implemented ...