Bronze

Overview

Pull raw data from configured sources (CSV/JSON/XML/Parquet/Excel) into bronze Delta tables. When keephistory: true is set in the per-table YAML, bronze also writes versioned change records to a Bronze.his history table, which is where row history lives. Both the Fabric item generator and the runtime loader consume the same per-table YAML.

What gets generated

| Stage | Component | Output |
| --- | --- | --- |
| Build | EasyFabric Generator (EFG) | GenerateFabricObjects → Fabric lakehouse tables (bronze) |
| Runtime | EasyFabric Runtime (EFR) | easyfabric.load_data_bronze.run, easyfabric.load_data_bronze.dataframeloader |

Example YAML

Dataplatform/DP/Objects/AdvWorks/products.yaml

  Table:
    Connection: adv-advworks
    SourceTable: products.csv
    DataPlatformObjectname: products
    PreBronzeNotebook:
      Notebook: Pull_Github
      Param001: products.csv
      Param002: products.csv
    KeepHistory: true
    Columns:
      - SourceColumn: ProductID
        SourceDataType: int
        IsPrimaryKey: true
      - SourceColumn: ProductName
        SourceDataType: varchar(255)
      - SourceColumn: Category
        SourceDataType: varchar(255)
      - SourceColumn: ListPrice
        SourceDataType: decimal(28,5)

Schema reference

Required fields are marked with *.

Fabric Object

| Name | Type | Description |
| --- | --- | --- |
| Connection * | String | Connection to use for this object (as defined in connection) |
| Fields * | List<FabricAttribute> | Fields from the source object |
| SourceTable * | String | Name of the object in the source |

Optional fields (6):

| Name | Type | Description |
| --- | --- | --- |
| DataPlatformObjectName | String | To override the source name that is used in the dataplatform |
| Description | String | Description of the source |
| IsActive | true/false | Set to active for generating this object (default=true) |
| KeepHistory | true/false | Set to true if history is required in the silver layer (default=true). Primary key required. |
| Prefix | String | Set a prefix (default = '') |
| SourceSchema | String | Source schema in the source (if applicable) |

Fabric Attribute

| Name | Type | Description |
| --- | --- | --- |
| SourceColumn * | String | Name of the attribute in the source system |
| SourceDataType * | String | Data type in the source (use the source data type; it is converted to dataplatform types automatically) |

Optional fields (5):

| Name | Type | Description |
| --- | --- | --- |
| Classification | ClassificationType | Classification of the field: sensitive, restricted, internal, or public (default = public) |
| IsActive | true/false | Attribute is active (default=true) |
| IsNullable | true/false | Attribute can be null (default = true) |
| IsPrimaryKey | true/false | Attribute is part of the primary key (default = false) |
| Name | String | Overrides the name used in the dataplatform (default is the SourceColumn name) |
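
The required-field and primary-key rules above can be checked before a run. The following is a minimal sketch, assuming the YAML has already been parsed into a plain dict shaped like the example file (a Table mapping with a Columns list, using the key casing of the example). `validate_table` is a hypothetical helper for illustration and is not part of EasyFabric.

```python
from typing import Any, Dict, List

# Required keys, taken from the schema tables; "Columns" follows the example YAML.
REQUIRED_TABLE_KEYS = ("Connection", "SourceTable", "Columns")
REQUIRED_COLUMN_KEYS = ("SourceColumn", "SourceDataType")


def validate_table(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed per-table config (hypothetical helper)."""
    table = config.get("Table")
    if not isinstance(table, dict):
        return ["missing top-level 'Table' mapping"]

    problems: List[str] = []
    for key in REQUIRED_TABLE_KEYS:
        if not table.get(key):
            problems.append(f"Table.{key} is required")

    columns = table.get("Columns") or []
    for index, column in enumerate(columns):
        for key in REQUIRED_COLUMN_KEYS:
            if not column.get(key):
                problems.append(f"Columns[{index}].{key} is required")

    # The schema table lists KeepHistory as default=true and "Primary key required".
    keep_history = table.get("KeepHistory", True)
    has_primary_key = any(c.get("IsPrimaryKey", False) for c in columns)
    if keep_history and not has_primary_key:
        problems.append("KeepHistory is true but no column has IsPrimaryKey: true")
    return problems


products = {
    "Table": {
        "Connection": "adv-advworks",
        "SourceTable": "products.csv",
        "KeepHistory": True,
        "Columns": [
            {"SourceColumn": "ProductID", "SourceDataType": "int", "IsPrimaryKey": True},
            {"SourceColumn": "ProductName", "SourceDataType": "varchar(255)"},
        ],
    }
}
print(validate_table(products))  # prints []
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a file at once.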

Runtime-only fields

These fields are read by the EFR at runtime and have no EFG counterpart.

  • filetype
  • skipifsourceunchanged
  • bronzeloadskip
  • layers

EasyFabric Runtime

load_data_bronze.run

def run(tablefile: str, config_manager: ConfigManager = None) -> str

Runs the bronze loader for a single table configuration: pulls files from the source, processes them, and loads them into the bronze layer.

Workflow:

  1. Validates table configuration and layer settings
  2. Computes stable Bronze folder path for file tracker
  3. Loads previous file snapshot from tracker (if exists)
  4. Optionally performs Azure Blob pre-check for unchanged sources
  5. Executes pre-bronze notebook (if configured)
  6. Pulls files from source system
  7. Validates file freshness (Validation 1)
  8. Checks for file changes using tracker snapshot (if skipifsourceunchanged enabled)
  9. Truncates Bronze table and loads file data by type
  10. Executes mid-bronze notebook and reloads data (if configured)
  11. Loads history records and validates correlation (Validation 2)
  12. Saves file snapshot for next run
  13. Executes post-bronze notebook (if configured)
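
Steps 3, 8 and 12 describe snapshot-based change detection. The sketch below shows one way such a tracker could work, assuming the snapshot is a JSON file that maps each file name to its size and SHA-256 hash. The actual tracker format used by the runtime is not documented here, and `take_snapshot`, `load_snapshot`, `save_snapshot` and `source_unchanged` are hypothetical names.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional

Snapshot = Dict[str, Dict[str, object]]


def take_snapshot(folder: Path) -> Snapshot:
    """Record size and SHA-256 for every file directly inside folder."""
    snapshot: Snapshot = {}
    for path in sorted(folder.iterdir()):
        if path.is_file():
            data = path.read_bytes()
            snapshot[path.name] = {
                "size": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            }
    return snapshot


def load_snapshot(tracker_file: Path) -> Optional[Snapshot]:
    """Return the previous snapshot, or None on the first run (step 3)."""
    if not tracker_file.exists():
        return None
    return json.loads(tracker_file.read_text(encoding="utf-8"))


def save_snapshot(tracker_file: Path, snapshot: Snapshot) -> None:
    """Persist the snapshot for the next run (step 12)."""
    tracker_file.write_text(json.dumps(snapshot, indent=2), encoding="utf-8")


def source_unchanged(previous: Optional[Snapshot], current: Snapshot) -> bool:
    """True only when a previous snapshot exists and matches exactly (step 8)."""
    return previous is not None and previous == current


with tempfile.TemporaryDirectory() as tmp:
    landing = Path(tmp) / "landing"
    landing.mkdir()
    tracker = Path(tmp) / "tracker.json"  # kept outside the landing folder
    source = landing / "products.csv"
    source.write_text("ProductID,ProductName\n1,Bike\n", encoding="utf-8")

    # First run: no snapshot yet, so the load proceeds.
    current = take_snapshot(landing)
    first_run_skips = source_unchanged(load_snapshot(tracker), current)
    save_snapshot(tracker, current)

    # Second run: nothing changed, so the load can be skipped.
    second_run_skips = source_unchanged(load_snapshot(tracker), take_snapshot(landing))

    # Third run: the file content changed, so the load proceeds again.
    source.write_text("ProductID,ProductName\n1,Bike\n2,Helmet\n", encoding="utf-8")
    third_run_skips = source_unchanged(load_snapshot(tracker), take_snapshot(landing))

print(first_run_skips, second_run_skips, third_run_skips)  # prints False True False
```

Hashing content rather than comparing timestamps avoids a reload when a file is re-delivered unchanged, at the cost of reading every file once per run.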

Arguments:

  • tablefile str - Path to the YAML file representing a table's configuration.
  • config_manager ConfigManager - An instance of ConfigManager used for accessing the application's configuration settings. Defaults to global config if not provided.

Returns:

  • str - A message indicating the outcome, such as file count, skip reason, or error details. Returns None if table is inactive or skipped.

Raises:

  • Exception - If ConfigManager is not initialized, table config is invalid, or filetype is unsupported.
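
Because run can return a message, return None for an inactive or skipped table, or raise, a caller that loops over many table files has to handle all three outcomes. The sketch below illustrates that pattern under stated assumptions: `load_tables` is a hypothetical wrapper, and `fake_run` is a stand-in used only so the example runs without EasyFabric installed. In a real notebook the callable passed in would be easyfabric.load_data_bronze.run.

```python
from typing import Callable, Dict, Iterable, Optional


def load_tables(
    tablefiles: Iterable[str],
    run_fn: Callable[[str], Optional[str]],
) -> Dict[str, str]:
    """Call run_fn per table file and summarise each outcome (hypothetical wrapper)."""
    results: Dict[str, str] = {}
    for tablefile in tablefiles:
        try:
            message = run_fn(tablefile)
        except Exception as exc:  # run is documented to raise a plain Exception
            results[tablefile] = f"failed: {exc}"
            continue
        # None means the table was inactive or skipped.
        results[tablefile] = "skipped" if message is None else f"ok: {message}"
    return results


def fake_run(tablefile: str) -> Optional[str]:
    """Stand-in for the real loader; behaviour is invented for the demo."""
    if tablefile.endswith("inactive.yaml"):
        return None
    if tablefile.endswith("broken.yaml"):
        raise Exception("filetype is unsupported")
    return "1 file(s) loaded"


summary = load_tables(
    ["AdvWorks/products.yaml", "AdvWorks/inactive.yaml", "AdvWorks/broken.yaml"],
    fake_run,
)
for tablefile, outcome in summary.items():
    print(tablefile, "->", outcome)
```

Catching the exception per table keeps one bad configuration from stopping the rest of the batch; whether that is desirable depends on the pipeline.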

Configuration Options:

  • skipifsourceunchanged (bool) - Enable skip-if-unchanged detection
  • bronzeloadskip (bool) - Skip entire bronze load for this table
  • keephistory (bool) - Maintain history records and validation
  • prebronzenotebook (str) - Path to notebook to run before loading
  • midbronzenotebook (str) - Path to notebook to run between load and history
  • postbronzenotebook (str) - Path to notebook to run after loading
  • sourceorder (bool) - Sort files by sourceorder instead of name
  • bronzefolder (str) - Override Bronze folder from connection
  • max_file_age_hours (int) - Maximum file age for freshness validation (per connection)
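
The freshness check behind max_file_age_hours (Validation 1 in the workflow) can be pictured as follows. This is only a sketch that assumes freshness is judged by file modification time; the runtime may use a different timestamp. `stale_files` is a hypothetical helper.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List, Optional


def stale_files(folder: Path, max_file_age_hours: int, now: Optional[float] = None) -> List[str]:
    """Return names of files whose modification time is older than the limit."""
    now = time.time() if now is None else now
    max_age_seconds = max_file_age_hours * 3600
    return sorted(
        path.name
        for path in folder.iterdir()
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds
    )


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    fresh = folder / "fresh.csv"
    old = folder / "old.csv"
    fresh.write_text("id\n1\n", encoding="utf-8")
    old.write_text("id\n2\n", encoding="utf-8")

    # Backdate one file by 48 hours to simulate a stale delivery.
    two_days_ago = time.time() - 48 * 3600
    os.utime(old, (two_days_ago, two_days_ago))

    stale = stale_files(folder, max_file_age_hours=24)

print(stale)  # prints ['old.csv']
```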

load_data_bronze.dataframeloader

def dataframeloader(data_frame: DataFrame, load_config: LoadConfig,
                    table_config: TableConfig, config_manager: ConfigManager = None)

Loads a DataFrame into a specified data platform table using the provided configuration and manager.

The function sets up logging, checks that the required parameters are initialized, and applies layer-specific settings (for example, for the bronze layer). Exceptions are logged, and the configuration determines whether processing stops when an error occurs.

Arguments:

  • data_frame DataFrame - The data to be loaded into the specified table.
  • load_config LoadConfig - Contains configuration for the loading process, including destination table.
  • table_config TableConfig - Holds table-specific settings, e.g., table name identifiers and layers.
  • config_manager ConfigManager - Manages and validates application-level configurations.

LoadConfig fields

Runtime parameter bag: construct it in code and pass it to the loader. Every field has a default, but dataframeloader raises if destination_table is missing.

| Field | Type | Description |
| --- | --- | --- |
| _layer | str | Operational layer associated with the configuration. Defaults to "not set". |
| dry_run | bool | Indicates if the process should be executed in dry-run mode. Defaults to True. |
| auto_null_column | bool | Determines if null values should be automatically managed for columns. Defaults to True. |
| load_type | LoadType | Specifies the type of load operation. Defaults to LoadType.FULL. |
| stop_at_error | bool | Specifies whether the process should stop when an error occurs. Defaults to True. |
| business_key_check | bool | Indicates if business keys should be validated during the load. Defaults to True. |
| log_row_count | bool | Determines if row counts should be logged during the process. Defaults to False. |
| key_violation_action | str | Action to be taken when key violations occur. Defaults to "raise". |
| destination_schema | str | Schema of the destination table. Defaults to "dbo". |
| destination_table | Optional[str] | Name of the destination table. Defaults to None. |
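
To make the defaults in the field list concrete, the sketch below mirrors them in a stand-in dataclass. It is not the real LoadConfig class: `LoadConfigSketch`, the `LoadType` enum value, and `require_destination` are assumptions for illustration, and only the field names and defaults come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LoadType(Enum):
    # Only FULL is named in the field list; other members are not documented here.
    FULL = "full"


@dataclass
class LoadConfigSketch:
    """Stand-in that mirrors the documented LoadConfig fields and defaults."""
    _layer: str = "not set"
    dry_run: bool = True
    auto_null_column: bool = True
    load_type: LoadType = LoadType.FULL
    stop_at_error: bool = True
    business_key_check: bool = True
    log_row_count: bool = False
    key_violation_action: str = "raise"
    destination_schema: str = "dbo"
    destination_table: Optional[str] = None


def require_destination(config: LoadConfigSketch) -> str:
    """Return schema.table, raising when the destination table is missing."""
    if not config.destination_table:
        raise Exception("destination_table is missing from LoadConfig")
    return f"{config.destination_schema}.{config.destination_table}"


config = LoadConfigSketch(destination_table="products", dry_run=False)
target = require_destination(config)
print(target)  # prints dbo.products
```

Note that dry_run defaults to True in the field list, so a config built purely from defaults would presumably not write data; the example sets it to False explicitly.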

Returns:

  • str - Message indicating the result of the DataFrame loading process, including the target table name and error details if applicable.

Raises:

  • Exception - If the destination table name is missing from LoadConfig.
  • Exception - If the ConfigManager is not properly initialized.

Related types:

  • easyfabric.data.TableConfig
  • easyfabric.data.Column
  • easyfabric.data.Connection