loaders.notebook
load_notebook_prebronze
def load_notebook_prebronze(table_config: TableConfig,
config_manager: ConfigManager)
Executes a pre-processing notebook defined in the configuration before loading the bronze layer. The function checks for the existence of a notebook, handles parameters, and executes it with a specified timeout. If an error is encountered during execution, an exception is raised.
Arguments:
table_config
TableConfig - Configuration object containing information about the source table, notebook path, filter criteria, parameters, and other properties related to the pre-processing step.
config_manager
ConfigManager - Configuration manager object used to provide global configurations such as the KeyVault reference.
Returns:
dict
- A dictionary containing the initial file type, notebook path, and the output of the executed notebook (if available). If no notebook is defined, it returns a default dictionary indicating that no notebook execution occurred.
Raises:
Exception
- Raised when the notebook execution encounters an error, and its exit value indicates failure.
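The control flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the module's implementation: `SketchTableConfig`, its field names, the dictionary keys, the injected `run_notebook` callable, and the convention that a failing exit value starts with "ERROR" are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class SketchTableConfig:
    """Hypothetical stand-in for TableConfig; real field names may differ."""
    prebronze_notebook: Optional[str] = None
    notebook_timeout: int = 600
    notebook_parameters: Dict[str, Any] = field(default_factory=dict)


def run_prebronze_sketch(
    table_config: SketchTableConfig,
    run_notebook: Callable[[str, int, Dict[str, Any]], str],
) -> Dict[str, Any]:
    """Illustrative control flow only."""
    # Default dictionary returned when no notebook is configured (keys assumed).
    result: Dict[str, Any] = {"filetype": None, "notebook": None, "output": None}

    if not table_config.prebronze_notebook:
        return result

    result["notebook"] = table_config.prebronze_notebook

    # The runner is injected so the sketch does not depend on any platform API.
    exit_value = run_notebook(
        table_config.prebronze_notebook,
        table_config.notebook_timeout,
        table_config.notebook_parameters,
    )

    # Assumed convention: an exit value starting with "ERROR" signals failure.
    if exit_value.startswith("ERROR"):
        raise Exception(
            f"Notebook {table_config.prebronze_notebook} failed: {exit_value}"
        )

    result["output"] = exit_value
    return result
```

Injecting the runner keeps the sketch testable without a notebook environment; the real function presumably calls the platform's notebook API directly.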
load_notebook_postbronze
def load_notebook_postbronze(table_config: TableConfig,
config_manager: ConfigManager)
Runs a post-bronze processing notebook if configured in the provided table configuration.
This function executes a notebook after the bronze data layer has been loaded. It checks the configuration for a post-bronze notebook, prepares the necessary parameters, and executes the notebook. If no notebook is defined, it logs that fact and returns default metadata. If the notebook execution fails, the error is logged and raised as an exception.
Arguments:
table_config
TableConfig - The table configuration containing settings for the notebook execution including the notebook name, parameters, and timeout values.
config_manager
ConfigManager - A configuration manager that provides access to additional configuration details, such as the key vault information.
Returns:
dict
- A dictionary containing metadata about the notebook execution, including the output or a message indicating no notebook was defined.
Raises:
Exception
- If the notebook execution finishes with an error status.
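The parameter preparation step mentioned above might look something like the sketch below. The helper name `build_notebook_params`, the `key_vault` parameter key, and the choice to convert every value to a string are assumptions made for illustration; the source only states that parameters come from the table configuration and that the configuration manager supplies key vault information.

```python
from typing import Any, Dict, Optional


def build_notebook_params(
    table_params: Optional[Dict[str, Any]],
    key_vault: Optional[str],
) -> Dict[str, str]:
    """Merge table-level parameters with a global key vault reference.

    Hypothetical helper: values are stringified on the assumption that the
    notebook runner expects string arguments.
    """
    params = {str(key): str(value) for key, value in (table_params or {}).items()}

    # Only add the key vault entry when one is configured.
    if key_vault:
        params["key_vault"] = key_vault

    return params
```

For instance, `build_notebook_params({"batch_size": 500}, "kv-example")` yields a flat string-to-string mapping that includes both the table parameter and the key vault name.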
load_notebook_presilver
def load_notebook_presilver(table_config: TableConfig,
config_manager: ConfigManager)
Runs a pre-silver layer notebook based on the provided configuration and parameters.
The function initializes and runs a specified notebook (if defined) before loading data into the silver layer, using properties from the given table_config and additional parameters from the config_manager. If the notebook is not defined or its execution encounters errors, appropriate warnings or exceptions are raised.
Arguments:
table_config
TableConfig - The configuration for the table, containing notebook-specific settings, silver table information, and parameters for execution.
config_manager
ConfigManager - The configuration manager responsible for accessing additional configuration settings like key vault resources.
Returns:
dict
- A dictionary containing the initial filetype value, notebook status information, and any output from the executed notebook.
Raises:
Exception
- If the notebook execution returns an error status.
load_notebook_postsilver
def load_notebook_postsilver(table_config: TableConfig,
config_manager: ConfigManager)
Executes a specified notebook after the silver layer data processing step. It retrieves notebook configuration and executes it if provided. Additionally, parameters for notebook execution are prepared based on the table configuration and configuration manager.
Arguments:
table_config
TableConfig - Configuration containing details about the silver layer and the notebook to execute post-processing.
config_manager
ConfigManager - Manager containing additional system-wide configurations such as key vault information.
Returns:
dict
- A dictionary containing metadata about the notebook execution, including its filetype, name, and the output (if any).
Raises:
Exception
- If the notebook execution fails and returns an error message.
load_notebook_bronze
def load_notebook_bronze(object_info: ObjectInfo, table_config: TableConfig,
config_manager: ConfigManager) -> None
Executes a specific notebook at the Bronze data processing layer within a data pipeline. The function handles logging, validates the presence of the notebook configuration, and ensures the notebook is executed with the defined timeout period. If the notebook path is not configured, it logs a warning and terminates execution without raising an exception.
Arguments:
object_info
ObjectInfo - Contains metadata related to the object being processed, including its file path.
table_config
TableConfig - Holds the configuration details and parameters for table processing, including the notebook specific to the processing layer and timeout values.
config_manager
ConfigManager - Provides configuration management, such as retrieving and validating settings for the pipeline.
Returns:
None
- The function does not return a value but performs operations such as notebook execution and logging.
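The warn-and-return behavior for a missing notebook path can be illustrated as below. This is a sketch under assumptions: the function name, the flattened arguments (instead of `ObjectInfo` and `TableConfig` objects), the `file_path` parameter key, and the injected `run_notebook` callable are all hypothetical.

```python
import logging
from typing import Callable, Dict, Optional

logger = logging.getLogger("loaders.notebook.sketch")


def run_bronze_sketch(
    notebook_path: Optional[str],
    file_path: str,
    timeout: int,
    run_notebook: Callable[[str, int, Dict[str, str]], str],
) -> None:
    """Illustrative only: skip quietly when no bronze notebook is configured."""
    if not notebook_path:
        # A missing notebook is a warning, not an error, so nothing is raised.
        logger.warning("No bronze notebook configured for %s; skipping.", file_path)
        return None

    logger.info("Running bronze notebook %s for %s", notebook_path, file_path)
    run_notebook(notebook_path, timeout, {"file_path": file_path})
    return None
```

Because the function returns None in both branches, a caller cannot tell from the return value whether the notebook ran; the log output is the only signal.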
load_notebook_silver
def load_notebook_silver(table_config: TableConfig,
config_manager: ConfigManager)
Loads and executes a notebook for the silver processing layer based on the given table configuration and configuration manager. This function retrieves the notebook name and other relevant parameters from the table configuration and runs the notebook using a specified timeout. If the notebook is not defined, it logs a warning and returns an initial data structure without execution.
Arguments:
table_config
TableConfig - An object containing the configuration for the table, including the notebook name and timeout settings for the silver processing layer.
config_manager
ConfigManager - Configuration manager object for managing and retrieving additional global or contextual configurations.
Returns:
dict
- A dictionary containing metadata about the executed or non-executed notebook, such as file type, file path, notebook details, and execution output.
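Taken together, the six functions in this module suggest a fixed stage order around the bronze and silver loads. The sketch below shows one way a caller might sequence them. The `run_stages` helper and the stage labels are hypothetical, and the ordering is inferred from the function descriptions rather than stated by the source.

```python
from typing import Any, Callable, Dict, List, Tuple

# Assumed ordering, inferred from the descriptions of each function.
STAGE_ORDER: List[str] = [
    "prebronze",
    "bronze",
    "postbronze",
    "presilver",
    "silver",
    "postsilver",
]


def run_stages(stages: List[Tuple[str, Callable[[], Any]]]) -> Dict[str, Any]:
    """Run each stage in order and collect its return value by name.

    An exception raised by any stage propagates immediately, so later
    stages do not run after a failed notebook.
    """
    results: Dict[str, Any] = {}
    for name, stage in stages:
        results[name] = stage()
    return results
```

In practice each callable would wrap one of the `load_notebook_*` functions with its arguments already bound, for example via `functools.partial`.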