Object config (table)
An object is the definition of a table that will be created inside Fabric, based on the settings in the YAML file. Some settings can also be supplied on the connection. In general, the setting at the lowest (most specific) level is used: if a setting is defined on both the connection and the object, the object's value is used for that particular object.
Object
Name | Description |
---|---|
Connection | Name of the connection to use for this object |
SourceTable | Name used to retrieve the source. How it is interpreted depends on the connection. |
SourceFilter (optional) | Filter that the connection can apply to the source. Its behavior depends on the connection. |
DataplatformObjectname | Name to be used in fabric for this object |
KeepHistory | Keep history for this object. Requires at least one column marked as primary key. |
NotebookPreBronze | Run this notebook before loading the bronze layer |
NotebookPreBronzeParams | Optional parameters that are sent to the NotebookPreBronze notebook |
NotebookPostBronze | Run this notebook after loading bronze* |
NotebookPostBronzeParams | Optional parameters that are sent to the NotebookPostBronze notebook |
NotebookPreSilver | Run this notebook before loading silver* |
NotebookPreSilverParams | Optional parameters that are sent to the NotebookPreSilver notebook |
NotebookPostSilver | Run this notebook after loading silver* |
NotebookPostSilverParams | Optional parameters that are sent to the NotebookPostSilver notebook |
Some settings can be set on either the connection or the object. See here for those options.
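
As a hedged illustration of where the notebook hooks sit, the fragment below adds a pre-bronze and a post-silver notebook to an object. The notebook names are hypothetical. It is also an assumption that the hooks are written under Table next to the other object settings and that they take a notebook name as their value. The parameter settings are left out because their format is not described here.

```yaml
Table:
  Connection: exa-example
  SourceTable: example.csv
  DataPlatformObjectname: example
  # Hypothetical notebook names; assumed to be referenced by name.
  NotebookPreBronze: nb_prepare_example
  NotebookPostSilver: nb_validate_example
```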
Column
Name | Description |
---|---|
SourceColumn | Name of the column in the source data |
SourceDataType | Data type of the column in the source |
IsPrimaryKey | Marks the column as a primary key column (default=false) |
IsNullable | Whether the column may contain nulls (not yet enforced as a check in Fabric) |
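
The column settings can be combined as in the following sketch. The column names are made up, and it is an assumption that more than one column may be marked as primary key to form a composite key.

```yaml
Columns:
  - SourceColumn: OrderID        # hypothetical column
    SourceDataType: int
    IsPrimaryKey: true
    IsNullable: false
  - SourceColumn: OrderLine      # hypothetical column; a second key column is an assumption
    SourceDataType: int
    IsPrimaryKey: true
    IsNullable: false
  - SourceColumn: Remark         # IsPrimaryKey omitted, so it defaults to false
    SourceDataType: varchar(250)
    IsNullable: true
```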
Basic: Example of table with one primary key field and one column

    Table:
      Connection: exa-example
      SourceTable: example.csv
      DataPlatformObjectname: example
      KeepHistory: true
    Columns:
      - SourceColumn: ExampleID
        SourceDataType: int
        IsPrimaryKey: true
      - SourceColumn: ExampleText
        SourceDataType: varchar(250)

Basic: Example of table with one primary key field and one column with the source in the Fabric Files section of the Bronze layer
Imagine a third party that drops files into a certain folder in the Files section of the Fabric Bronze layer. Those files need to be processed into the bronze and silver layers, based on the settings in the YAML file(s). In this case, a Parquet file named exampledata.parquet is dropped into the bronze folder exa/exampledata, and the file is always overwritten.

    Table:
      Connection: exa-fabricfiles
      SourceTable: exampledata.parquet
      BronzeFolder: exa/exampledata
      DataPlatformObjectname: example
      KeepHistory: true
    Columns:
      - SourceColumn: ExampleID
        SourceDataType: int
        IsPrimaryKey: true
      - SourceColumn: ExampleText
        SourceDataType: varchar(250)

The connection file:

    ConnectionName: exa-fabricfiles
    ConnectionPrefix: fbf
    ConnectionType: fabricfiles
    BronzeFolder: Example
    FileType: parquet
    FileExtension: parquet

When EasyFabric runs the bronze layer for this object, it first reads the settings in the object config. There it finds that the connection to use is exa-fabricfiles, so that connection's settings are loaded as well. Because BronzeFolder is defined in both places, the BronzeFolder from the object is used, in this example "exa/exampledata". If the files for all objects are in the same folder, the folder can be defined just once, on the connection.
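
To illustrate the last point, here is a sketch of an object that leaves out BronzeFolder, so that the value from the exa-fabricfiles connection ("Example") would apply. The file and object names are hypothetical, and the indentation under Table is an assumption.

```yaml
Table:
  Connection: exa-fabricfiles      # BronzeFolder is taken from this connection
  SourceTable: otherdata.parquet   # hypothetical file in the connection's folder
  DataPlatformObjectname: otherdata
  # No BronzeFolder here; the Columns section is omitted for brevity.
```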
The object YAML configuration above results in three items: two tables in the bronze layer and one table in the silver layer. The first bronze table is a delta table into which the source is loaded. The second bronze table is the history table, which has an extra column called SYSTEMSTATETIMESTAMP. Read more about SYSTEMSTATETIMESTAMP.
A table is also created in the silver layer; it has the same columns as the history table of the bronze layer.
- SourceColumn: This is the name of the column in the source data.
- SourceDataType: This is the data type of the column. It defines how the data is stored and processed within the model. For a comprehensive list of data types, visit all data types.