Skip to main content

Partitioning

Tables can become quite big in a Tabular model, when it is getting too big, it is possible to split the table into multiple partitions. A table always has at least one partition. Partitioning just means splitting the table in multiple parts. From the users perspective it still looks like one big table. There are multiple possibilities to set up partitioning.

In EasyTabular it is possible to setup a couple of partition types, which supports a broad range of common use-cases. The following options can be provided:

ParametersDescription
GroupNameGroupname of the partion (if not provided, the name of partitionsouce is used)
PartitionSourceThe source table that is queried on the datasource (required)
PartitionSchemaThe source schema that is used on the datasource (defaults to dbo when not provided)
PartitionFilterThe filter that is set on the source, only relevant for partitiontype = Filter
PartitionStartThe period where to start
PartitionEndThe number of partitions to create (Positive whole number)
PartitionSizeThe size of every partition, default = 1. For a partition per 2 years, set to 2, for 3 years, set to 3 and so on. Positive whole number expected
PartitionTypeThe filter type that is applied
PartitionFilterColumnThe column that is used for the filter in the wherestatement. It is possible to apply extra statements like YEAR(datecolumn) for example
HasPartitionLeftOverA partition is created for all te data outside the scope that is defined. Only data that is > than the last partition is added. This can be turned off (default = true)
PartitionIsHistoryThe number of oldest date partitions to mark as history, so the loading mechanism can skip them except on a first or full load (default = 0). See History partitions

Partition Types

TypeDescription
AllNo filter is applied on the given partitionsource
FilterPartition filter is applied on the given partitionsource
yyyyFilter is applied on partition filter column and the year month in (ex. 2021 for 2021) is applied on the column as a filter
yyyyMMFilter is applied on partition filter column and the year month in (ex. 202111 for november 2021) is applied on the column as a filter
yyyyMMddFilter is applied on partition filter column and the year month and day (ex. 20211101 for the 1st of november 2021) is applied on the column as a filter

All

The chosen partition type needs specific settings. The easiest is 'All', which just requires a partition source (a table or other queryable object in the data source). The generator will just add a partition with one simple query without a where statement.

- Table:
Name: Finance
Partitions:
- PartitionSchema: dm
PartitionSource: F_FinanceData

Filter

With the type 'filter' it is possible to set a hard filter, that will be added to the query. A couple of examples:

- Table:
Name: Finance
Partitions:
- PartitionSchema: dm
PartitionSource: F_FinanceData
PartitionFilter: Year(TransactionDate) = 2024

yyyy or yyyyMM or yyyyMMdd

With one of those types, it is possible to create a couple of partitions, based on parameters. In this case the example is based on yyyy (Year values). To know how many partitions should be created, the following information is needed:

ParametersDescription
PartitionStartThe period where to start
PartitionEndThe number of partitions to create (Positive whole number)
PartitionSizeThe size of every partition, default = 1. For a partition per 2 years, set to 2, for 3 years, set to 3 and so on. Positive whole number expected
PartitionTypeThe filter type that is applied
PartitionFilterColumnThe column that is used for the filter in the wherestatement. It is possible to apply extra statements like YEAR(datecolumn) for example
HasPartitionLeftOverA partition is created for all te data outside the scope that is defined. Only data that is > than the last partition is added. This can be turned off (default = true)
PartitionIsHistoryThe number of oldest date partitions to mark as history, so the loading mechanism can skip them except on a first or full load (default = 0). See History partitions

This will create 7 partitions, based on year. 6 partitions with a filter on a specific year from 2020 till 2026. And a 'leftover' partition for everything greater than 2026.

- Table:
Name: Finance
Partitions:
- PartitionSchema: dm
PartitionSource: F_FinanceData
PartitionStart: 2020
PartitionEnd: 6
PartitionType: yyyy
HasPartitionLeftOver: true
PartitionFilterColumn: YEAR(TransactionDate)

Another example based on per 2-months and no left over partition. This will create 6 partitions for the year 2020 and 6 partitions for the year 2021:

- Table:
Name: Finance
Partitions:
- PartitionSchema: dm
PartitionSource: F_FinanceData
PartitionStart: 202001
PartitionSize: 2
PartitionEnd: 12
PartitionType: yyyyMM
HasPartitionLeftOver: false
PartitionFilterColumn: CONCAT(YEAR(TransactionDate), RIGHT(CONCAT('0', MONTH(TransactionDate)),2))

It is possible to create multiple types of partitions per table, so mixing different types is no problem.

Source resolution

Point PartitionSource at the generated Gold object on the SQL analytics endpoint. The generator resolves it to a three-part name qualified with the Gold lakehouse — [<gold-lakehouse>].[<PartitionSchema>].[<partitionsource>] — lowercasing the object name and defaulting the schema to dbo when PartitionSchema is omitted. For example, PartitionSchema: dbo / PartitionSource: f_bezetting resolves to [Gold].[dbo].[f_bezetting].

The generated fact view exposes a single object that consumers query as a whole. Only Filter partitions push a WHERE clause into that view; the date-based types (yyyy, yyyyMM, yyyyMMdd) are loaded as WHERE-filtered partitions by the Tabular model rather than by the view.

History partitions

Older partitions usually hold closed data that no longer changes, so reprocessing them on every refresh is wasted work. PartitionIsHistory lets you mark the oldest partitions as history so the loading mechanism can skip them on a normal refresh and only load them on a first or full load.

Every generated partition is annotated with a LoadCategory annotation that the loading mechanism reads:

  • The oldest PartitionIsHistory date partitions are annotated History.
  • All remaining date partitions, and the HasPartitionLeftOver partition, are annotated Incremental. The leftover partition is never history, since it holds the most recent (and future) data.

For example, with PartitionEnd: 7 and PartitionIsHistory: 5, the five oldest year partitions are History and the two newest plus the leftover are Incremental.

- Table:
Name: Finance
Partitions:
- PartitionSchema: dm
PartitionSource: F_FinanceData
PartitionStart: 2019
PartitionEnd: 7
PartitionType: yyyy
HasPartitionLeftOver: true
PartitionFilterColumn: YEAR(TransactionDate)
PartitionIsHistory: 5

PartitionIsHistory applies to the date-based types (yyyy, yyyyMM, yyyyMMdd). It cannot be negative and cannot exceed the number of generated partitions. For an All or Filter partition, use the IsHistory boolean to mark that single partition as history instead.