load_data_silver
run
def run(tablefile: str, config_manager: ConfigManager)
Executes the silver extraction and transformation process for a table, utilizing the configuration information provided in the ConfigManager. This involves loading, processing, and merging table data within the silver lakehouse layer.
The function ensures the proper configuration of logging, verifies layer activity and table activation status, and invokes pre-silver and post-silver workflows if defined. It uses corresponding notebooks or direct data operations for the silver loading process.
Arguments:
tablefile
str - Path to the YAML file containing table configuration details.
config_manager
ConfigManager - An instance of ConfigManager, pre-initialized with application configuration, connection, and lakehouse details.
Returns:
Optional[str] - An error message containing the name of the failed table and the exception details, or None if the operation finishes successfully.
Raises:
Exception - If ConfigManager is not initialized before invoking this function.
Exception - If the bronze lakehouse configuration is not found.
Exception - If the stop_at_error setting is enabled and an exception occurs in processing.
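The return contract described above (None on success, an error string on failure) lends itself to a simple collecting loop on the caller's side. The following is a minimal sketch under stated assumptions, not part of the module: run_all and fake_run are hypothetical names, and fake_run only imitates the documented return values so the example runs without a lakehouse or a ConfigManager.

```python
from typing import Callable, Iterable, List, Optional


def run_all(
    tablefiles: Iterable[str],
    run_fn: Callable[[str], Optional[str]],
) -> List[str]:
    """Call run_fn for each table file and collect the error messages it returns."""
    errors: List[str] = []
    for tablefile in tablefiles:
        # Documented contract: None means success, a string describes the failure.
        message = run_fn(tablefile)
        if message is not None:
            errors.append(message)
    return errors


def fake_run(tablefile: str) -> Optional[str]:
    """Hypothetical stand-in for run(tablefile, config_manager)."""
    if tablefile.endswith("orders.yaml"):
        return f"{tablefile}: simulated failure"
    return None


# The file paths below are illustrative only.
errors = run_all(["tables/customers.yaml", "tables/orders.yaml"], fake_run)
print(errors)
```

In real use, run_fn would presumably wrap the actual function, for example lambda f: run(f, config_manager). When stop_at_error is enabled, the documentation states that an exception is raised instead of an error string being returned, so a loop like this would stop at the first failing table unless the caller catches the exception.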
dataframeloader
def dataframeloader(data_frame: DataFrame, table_config: TableConfig,
load_config: LoadConfig, config_manager: ConfigManager)
Loads data into a silver layer table in a lakehouse environment.
This function facilitates loading data from a given DataFrame into a table specified by a table configuration within a silver layer of the lakehouse architecture. It uses the provided configuration details to establish connections, manage runtime settings, and log relevant information during the operation. It validates critical configurations and raises appropriate exceptions in case of missing or invalid details.
Arguments:
data_frame
DataFrame - The input pandas DataFrame containing data to load.
table_config
TableConfig - Configuration object specifying table details and related connection configurations.
load_config
LoadConfig - Configuration object holding load-specific settings, including the layer and runtime options.
config_manager
ConfigManager - Centralized configuration management object used to retrieve connection settings and maintain runtime parameters.
Returns:
str - A string indicating the result of the load operation, either as success or an error message.
Raises:
Exception - If load_config is None, if ConfigManager is not properly initialized, or if no silver lakehouse configuration is found.
Exception - If there is an issue while performing the load operation and config_manager.stop_at_error is set to True.
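The error behaviour documented for dataframeloader (validation failures always raise, load failures raise only when stop_at_error is True, otherwise a result string is returned) can be illustrated with a small sketch. This is not the module's implementation: SketchConfigManager, guarded_load, and the literal result strings "success" and "error: ..." are assumptions made for illustration, and only the stop_at_error attribute name comes from the documentation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SketchConfigManager:
    """Hypothetical stand-in; only stop_at_error is taken from the documentation."""
    stop_at_error: bool = False


def guarded_load(
    load_fn: Callable[[], None],
    load_config: Optional[object],
    config_manager: SketchConfigManager,
) -> str:
    """Validate inputs, run load_fn, and report the outcome as a string."""
    # A missing load configuration is always an error, regardless of stop_at_error.
    if load_config is None:
        raise Exception("load_config is None")
    try:
        load_fn()
    except Exception as exc:
        if config_manager.stop_at_error:
            # Propagate the failure so the caller's pipeline halts.
            raise
        # Otherwise report the failure as a string (format is an assumption).
        return f"error: {exc}"
    return "success"


def failing_load() -> None:
    raise ValueError("boom")


print(guarded_load(lambda: None, object(), SketchConfigManager()))
print(guarded_load(failing_load, object(), SketchConfigManager()))
```

With stop_at_error left at False, a failed load is reported through the return value and the caller can continue with the next table; setting it to True turns the same failure into an exception.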