load_data_silver

run

def run(tablefile: str, config_manager: ConfigManager)

Runs the silver extraction and transformation process for a single table, using the configuration held by the ConfigManager. The process loads, processes, and merges the table's data within the silver lakehouse layer.

The function configures logging, checks that both the layer and the table are active, and invokes the pre-silver and post-silver workflows if they are defined. The silver load itself is performed either by the corresponding notebooks or by direct data operations.

Arguments:

  • tablefile str - Path to the YAML file containing table configuration details.
  • config_manager ConfigManager - An instance of ConfigManager, pre-initialized with application configuration, connection, and lakehouse details.

Returns:

  • Optional[str] - An error message containing the name of the failed table and the exception details, or None if the operation finishes successfully.

Raises:

  • Exception - If ConfigManager is not initialized before invoking this function.
  • Exception - If the bronze lakehouse configuration is not found.
  • Exception - If the stop_at_error setting is enabled and an exception occurs in processing.
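
The return contract above (None on success, an error string on failure) lends itself to a simple driver loop. The following is a minimal sketch under that assumption only: `run_all` and `fake_run` are hypothetical helpers that are not part of this module, and `fake_run` stands in for the real `run` so the snippet is self-contained.

```python
from typing import Callable, Iterable, List, Optional


def run_all(
    tablefiles: Iterable[str],
    run_fn: Callable[[str, object], Optional[str]],
    config_manager: object,
) -> List[str]:
    """Call run_fn once per table file and collect the returned error messages."""
    errors: List[str] = []
    for tablefile in tablefiles:
        # Documented contract: None means success, a string describes the failure.
        result = run_fn(tablefile, config_manager)
        if result is not None:
            errors.append(result)
    return errors


def fake_run(tablefile: str, config_manager: object) -> Optional[str]:
    """Hypothetical stand-in for run(); fails for any file ending in 'bad.yaml'."""
    if tablefile.endswith("bad.yaml"):
        return f"{tablefile}: simulated failure"
    return None


if __name__ == "__main__":
    failures = run_all(["tables/a.yaml", "tables/bad.yaml"], fake_run, None)
    print(failures)
```

In real use, `run_fn` would be the `run` function documented here and `config_manager` an initialized ConfigManager. Note that when stop_at_error is enabled, `run` raises instead of returning a message, so the loop would end at the first failure.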

dataframeloader

def dataframeloader(data_frame: DataFrame, table_config: TableConfig,
load_config: LoadConfig, config_manager: ConfigManager)

Loads data into a silver layer table in a lakehouse environment.

This function loads the data in a given DataFrame into the table identified by the table configuration, within the silver layer of the lakehouse. It uses the provided configuration to establish connections, manage runtime settings, and log progress during the operation. It validates the required configuration and raises an exception if any of it is missing or invalid.

Arguments:

  • data_frame DataFrame - The input pandas DataFrame containing data to load.
  • table_config TableConfig - Configuration object specifying table details and related connection configurations.
  • load_config LoadConfig - Configuration object holding load-specific settings, including the layer and runtime options.
  • config_manager ConfigManager - Centralized configuration management object used to retrieve connection settings and maintain runtime parameters.

Returns:

  • str - The result of the load operation: either a success indicator or an error message.

Raises:

  • Exception - If load_config is None, if ConfigManager is not properly initialized, or if no silver lakehouse configuration is found.
  • Exception - If there is an issue while performing the load operation and config_manager.stop_at_error is set to True.
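
The interaction between the returned string and stop_at_error can be illustrated with a small sketch. This is not the module's implementation: `FakeConfigManager`, `guarded_load`, and the exact result strings are assumptions made for illustration, and only the stop_at_error flag is taken from the documentation above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeConfigManager:
    """Hypothetical stand-in exposing only the stop_at_error flag."""
    stop_at_error: bool = False


def guarded_load(
    load_fn: Callable[[], None],
    table_name: str,
    config_manager: FakeConfigManager,
) -> str:
    """Run load_fn; re-raise on failure if stop_at_error is set, else report it."""
    try:
        load_fn()
    except Exception as exc:
        if config_manager.stop_at_error:
            # Fail fast: propagate the original exception to the caller.
            raise
        # Otherwise return the failure as a message (format is assumed).
        return f"Error loading {table_name}: {exc}"
    return "success"


def failing_load() -> None:
    """Simulated load step that always fails."""
    raise ValueError("simulated write failure")


if __name__ == "__main__":
    print(guarded_load(lambda: None, "customers", FakeConfigManager()))
    print(guarded_load(failing_load, "customers", FakeConfigManager()))
```

With stop_at_error left at False, a batch of tables can continue past a failing table and report the errors at the end; setting it to True turns the first failure into an exception.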