easyfabric.load_meta_data

clear_objects_cache

def clear_objects_cache() -> None

Clear the in-memory metadata cache.

Removes all cached TableConfig lists so subsequent calls to get_objects_by_folder will re-read from disk.
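
The caching behaviour described above can be pictured with a minimal sketch. It assumes the cache is a module-level dictionary keyed by folder path and guarded by a lock. The names `_cache`, `_cache_lock`, `load_cached` and `clear_cache` are hypothetical and are not taken from easyfabric.

```python
import threading
from typing import Callable

# Hypothetical module-level cache: folder path -> list of loaded configs.
_cache: dict[str, list] = {}
_cache_lock = threading.Lock()


def load_cached(folder: str, loader: Callable[[str], list]) -> list:
    """Return cached results for folder, calling loader only on a cache miss."""
    with _cache_lock:
        if folder not in _cache:
            _cache[folder] = loader(folder)
        return _cache[folder]


def clear_cache() -> None:
    """Drop every cached entry so the next load_cached call reloads."""
    with _cache_lock:
        _cache.clear()
```

In this sketch, clearing the cache is the only way to pick up YAML files that changed on disk after the first load.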

get_object_by_file

def get_object_by_file(
tablefile: str,
config_manager: ConfigManager = None) -> TableConfig | None

Load a single TableConfig from a YAML file path.

Delegates to get_objects_by_folder and returns the first result.

Arguments:

  • tablefile - Path to the YAML table configuration file.
  • config_manager - Optional ConfigManager instance; initialised automatically if not provided.

Returns:

The loaded TableConfig, or None if the file could not be loaded.

get_config_manager

def get_config_manager(config_filepath: str = None) -> ConfigManager

Load or return the global ConfigManager instance.

Arguments:

  • config_filepath - Optional path to a config.yaml file. Uses the default location when not specified.

Returns:

The initialised ConfigManager singleton.

get_objects_by_folder

def get_objects_by_folder(
table_folder: str,
config_manager: ConfigManager = None,
except_folders: list[str] = None,
except_files: list[str] = None) -> list[TableConfig] | None

Loads metadata configurations from a specified folder or file and returns a list of TableConfig objects. If a single file is specified, metadata will be loaded only from that file. Otherwise, the function traverses the specified folder and loads metadata from all YAML files.

Arguments:

  • table_folder (str) - The path to the folder or file containing YAML metadata files. If it starts with "Files/", it will be resolved to the default lakehouse path.
  • config_manager (ConfigManager) - An instance of ConfigManager used to initialize and validate configurations.
  • except_folders (list[str]) - List of folders to exclude from the metadata loading.
  • except_files (list[str]) - List of files to exclude from the metadata loading.

Returns:

  • List[TableConfig] - A list of TableConfig dataclass instances created from the loaded YAML metadata.

Raises:

  • Exception - If config_manager is not initialized, or if an error occurs during metadata loading while config_manager.stop_at_error is True.
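
The folder traversal and exclusion rules can be sketched as follows. This is an illustration only, not the easyfabric implementation: the helper name `collect_yaml_files` is hypothetical, exclusions are assumed to match bare folder and file names, and both `.yaml` and `.yml` are assumed to count as YAML.

```python
import os


def collect_yaml_files(path, except_folders=None, except_files=None):
    """Return sorted YAML file paths under path, honouring the exclusions."""
    except_folders = set(except_folders or [])
    except_files = set(except_files or [])
    yaml_exts = (".yaml", ".yml")

    # A single file is returned as-is when it has a YAML extension.
    if os.path.isfile(path):
        return [path] if path.lower().endswith(yaml_exts) else []

    found = []
    for root, dirs, files in os.walk(path):
        # Prune excluded folders in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if d not in except_folders]
        for name in files:
            if name.lower().endswith(yaml_exts) and name not in except_files:
                found.append(os.path.join(root, name))
    return sorted(found)
```

Each returned path would then be parsed into a TableConfig. That step is omitted here because it depends on the library's own dataclasses.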

get_tableconfig_by_inputfile

def get_tableconfig_by_inputfile(
file_path: str,
config_manager: ConfigManager = None) -> Optional[TableConfig]

Retrieves the best-matching TableConfig for an input file path. The function extracts the folder portion of the path and returns the TableConfig whose bronzefolder is the longest prefix of that folder.

Arguments:

  • file_path (str) - The path to the input file.
  • config_manager (ConfigManager) - An instance of ConfigManager used to load configurations.

Returns:

  • Optional[TableConfig] - The matching TableConfig if found, otherwise None.

Raises:

  • Exception - If ConfigManager is not initialized.
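
Longest-prefix matching can be sketched as below. The helper `best_prefix_match` is hypothetical and works on plain folder strings instead of TableConfig objects. It also assumes that a prefix must end on a path-segment boundary, which the source does not state.

```python
import posixpath
from typing import Optional


def best_prefix_match(file_path: str, bronze_folders: list[str]) -> Optional[str]:
    """Return the longest folder in bronze_folders that prefixes file_path's folder."""
    folder = posixpath.dirname(file_path).rstrip("/")
    best = None
    for candidate in bronze_folders:
        prefix = candidate.rstrip("/")
        # Match whole path segments only, so "Files/raw/sales"
        # does not match "Files/raw/sales_eu".
        if folder == prefix or folder.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return best
```

With candidates `Files/raw` and `Files/raw/sales`, an input file under `Files/raw/sales/2024/` resolves to `Files/raw/sales`, because that is the longer of the two matching prefixes.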

get_object_trigger

def get_object_trigger(
lakehousepath: str,
config_manager: ConfigManager = None) -> list[TableConfig] | None

Find TableConfigs whose connections are triggerable for a lakehouse path.

Scans the Connections folder for YAML connection definitions, filters to those with istriggerable=True, and returns matching TableConfig objects.

Arguments:

  • lakehousepath - The lakehouse file path to match against triggers.
  • config_manager - Optional ConfigManager instance; initialised automatically if not provided.

Returns:

A list of matching TableConfigs, or None on error.

get_model_by_name

def get_model_by_name(model_name: str) -> Model | None

Load a model configuration by its name.

Looks up the model in the ConfigManager's model list and loads the corresponding YAML file.

Arguments:

  • model_name - The name of the model as defined in config.yaml.

Returns:

The loaded Model instance, or None if loading fails and stop_at_error is False.

Raises:

  • Exception - If the model name is not found or ConfigManager is not initialised.

get_model_by_file

def get_model_by_file(model_file: str,
config_manager: ConfigManager = None) -> Model | None

Loads a Model object from a YAML file. Ensures that the configuration manager is initialized and sets up logging for the operation.

If the input file path starts with 'Files/', it is converted to an absolute path in the lakehouse directory. The function then checks that the file is a YAML file and parses its content to create a Model instance.

Arguments:

  • model_file - The path to the metadata YAML file. It can be relative or absolute. If it starts with 'Files/', it is converted to the absolute path in the lakehouse directory.
  • config_manager - An instance of ConfigManager that is responsible for initialization and managing configurations.

Returns:

  • Model - The Model object created from the provided YAML file.

Raises:

  • Exception - If the configuration manager is not initialized.
  • Exception - If the provided file is not a YAML file.
  • Exception - If the stop-at-error policy is enabled in the configuration manager and an error occurs during the process.
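
The path resolution and extension check can be sketched as follows. The helper `resolve_model_path` and the constant `LAKEHOUSE_ROOT` are hypothetical: the mount point `/lakehouse/default` is an assumption about the runtime, and the sketch raises ValueError where the library documents a generic Exception.

```python
import os

# Assumed mount point of the default lakehouse; not taken from easyfabric.
LAKEHOUSE_ROOT = "/lakehouse/default"


def resolve_model_path(model_file: str, lakehouse_root: str = LAKEHOUSE_ROOT) -> str:
    """Resolve a 'Files/' path against the lakehouse root and require a YAML extension."""
    if model_file.startswith("Files/"):
        model_file = f"{lakehouse_root.rstrip('/')}/{model_file}"
    ext = os.path.splitext(model_file)[1].lower()
    if ext not in (".yaml", ".yml"):
        raise ValueError(f"Not a YAML file: {model_file}")
    return model_file
```

Paths that do not start with 'Files/' pass through unchanged in this sketch, so both relative and absolute paths remain usable.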

get_table_config_by_name

def get_table_config_by_name(
table_name: str,
config_manager: ConfigManager = None) -> Optional[TableConfig]

Retrieves the TableConfig for the given object name by searching in Files/Objects.

Arguments:

  • table_name (str) - The name of the table (dataplatformobjectname).
  • config_manager (ConfigManager) - An instance of ConfigManager.

Returns:

  • Optional[TableConfig] - The matching TableConfig if found, otherwise None.

param_clean

def param_clean(value: Any) -> Any

A single helper for cleaning parameter values passed to a notebook.

Call it in the child notebook on each widget value:

    folder_path = param_clean(dbutils.widgets.get("folder_path"))
    layers = param_clean(dbutils.widgets.get("layers"))

It works as intended when the parent notebook serialised every parameter with json.dumps().
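
The source does not show how the value is cleaned. A minimal sketch, assuming the helper simply reverses the parent's json.dumps() call, could look like this. The name `decode_param` is hypothetical and the body is not the library's implementation.

```python
import json
from typing import Any


def decode_param(value: Any) -> Any:
    """Decode a JSON-encoded widget value; return anything else unchanged."""
    if not isinstance(value, str):
        return value
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        # Not valid JSON, so treat it as a plain string.
        return value
```

One consequence of this approach is that a bare string such as "123" decodes to the integer 123, which is why the parent is expected to encode every parameter with json.dumps() and not only the structured ones.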