When you start using Power BI, managing your data is usually pretty easy. Your datasets are small, and refreshing them doesn’t take much time – great. But as you add more data, things can get complicated. What if your larger datasets slow down your refresh times for areas where you don’t have the luxury to wait? These waiting times can quickly turn into a source of frustration. Luckily, Power BI may just have the solution for you: incremental refresh.
In this article, we’ll look at incremental refresh in Power BI. You’ll first learn what it is and why it’s useful. Then, we’ll show you how it works and what you need to set it up in Power BI Desktop.
1. What is Incremental Refresh?
Incremental refresh changes how Power BI retrieves data. Instead of importing the entire dataset with each refresh, it only fetches a specified portion. This could for example be new data added to your database or data that has changed since the last refresh.
Here’s how it works: You first set up a refresh policy. Power BI then keeps an archive of your historical data. Each time you refresh, it keeps this archive and only updates the part you want to refresh. This process is efficient and often much faster because you don’t need to import as much data to refresh your dataset.
The most important benefits of incremental refresh are:
- Faster Refreshes: Since only a small part of the data is updated, refreshes are generally faster.
- More Reliable Refreshes: Shorter refresh times reduce the risk of connectivity issues.
- Reduced Load on Data Source: Only a fraction of the data needs to be sent over the network, which reduces the load on your data source.
- Lower Memory Usage: Incremental refresh uses fewer resources in your Power BI tenant, allowing more resources for other tasks.
Hopefully this convinced you of the value this feature brings. So how does this work in Power BI desktop?
2. How Does Incremental Refresh Work?
Incremental refresh works by dividing your data into parts called partitions. Power BI automatically creates these partitions at the moment you set up an incremental refresh policy. These partitions are important because they allow the refresh process to update each part independently. Instead of refreshing the entire dataset, Power BI can update only the partitions with new or changed data.
So how are partitions formed? The way partitions are formed depends on which parts of the data you want to refresh. For example, if you set your incremental refresh policy to update data from the past month, Power BI will create partitions for each month. During a refresh, it will only update the data in the partition for the current month, leaving the previous months’ data untouched unless specified otherwise.
3. Requirements for Setting Up Incremental Refresh
To use incremental refresh in Power BI, you need to meet a few requirements:
- Supported Plans: Incremental refresh is supported for Power BI Premium, Premium per user, Power BI Pro, and Power BI Embedded models. In other words, your model refresh should happen in the service. It doesn’t work on Power BI Desktop.
- Query Folding: Your data sources should support query folding so the incremental refresh policy can request just the required data. This generally works for relational data sources like SQL databases and Azure Synapse, but other data sources that support query folding also work. Without query folding, the refresh would import all data, which is what incremental refresh tries to prevent.
- Date Column: Your dataset should include a date or datetime column. The incremental refresh policy uses RangeStart and RangeEnd parameters to filter this column. If your column has dates formatted as integers (e.g., 20241231 for December 31, 2024), you can convert the RangeStart and RangeEnd parameter values to match this format.
- Time Limits: When refreshing large datasets, the refresh duration should be within two hours for Pro models or five hours for Premium capacity due to built-in time limits. That goes to say that this also counts for incremental refreshes.
With these requirements in mind, we’re now ready to learn how to set it up in Power BI desktop.
4. Setting Up Incremental Refresh in Power BI Desktop
Let’s see how we can apply this on a practical example. For this example, we will use a dataset containing ‘NYC Yellow Taxi Trip Data’ that is stored on a SQL Server. Here’s a screenshot that profiles a selection of columns from this dataset.
The most important things to note in this image are that:
- this table contains over 45 million rows of data.
- dates range from January 2001 up to June 2024.
For this quantity of data it makes sense to consider incremental refresh. Let’s find out how we can configure it.
Setting up incremental refresh in Power BI Desktop involves several steps:
- Defining parameters: Create parameters in Power Query for the range of data you want to refresh.
- Filtering Data using Parameters: Configure your query to filter your date columns based on these parameters.
- Enabling incremental refresh: Use the Power BI Desktop settings to enable incremental refresh on your dataset.
- Publishing your dataset: Publish the dataset to the Power BI service.
Keywords: Using parameters, download file, tabular editor, refresh full load
4.1. Defining Parameters
To enable incremental refresh, you need to add two parameters to your queries: RangeStart
and RangeEnd
. Both should be of the DateTime type. These parameters help Power BI adjust its filters for the incremental refresh policy.
When setting up these parameters, the specific DateTime values you use don’t matter. The incremental refresh policy will automatically adjust these values based on the settings we’ll cover in section 4.3. Just make sure to set up the parameters and include them in your queries.
To create RangeStart and RangeEnd parameters, you can open the Power Query editor and follow these steps:
- Navigate to the Manage Parameters option.
- Select New Parameter.
- Name the first parameter
RangeStart
. - Set the data type to Date/Time.
- Insert a DateTime value in the current value box.
- Repeat the process to create the
RangeEnd
parameter.
The process is shown in the below image:
With these parameters in place, we can now incorporate them into our query.
4.2. Filtering Data using Parameters
For incremental refresh to work, your table needs a date or datetime column. Let’s see how to create an expression that uses the RangeStart and RangeEnd parameters to filter the date column.
If your date column is an integer, you can create an expression to convert the RangeStart
and RangeEnd
parameters to match this format. Just make sure this transformation supports query folding, so the filtering happens at the data source, not in Power BI.
The easiest way to filter a column with your data parameters is to:
- Find the column to filter.
- Select the dropdown and choose Date/Time Filters.
- Select the Custom Filter option.
In the Filter Rows menu that appears you can now select:
is after or equal to
for the RangeStart parameter.is before
for the RangeEnd parameter.
That looks as follows:
After you press OK, you end up with the following expression:
Table.SelectRows(
YellowCabDataset,
each [Pickup_DateTime] >= RangeStart
and [Pickup_DateTime] < RangeEnd
)
By setting up the query this way, the incremental refresh mechanism can load only a subset of your data.
However, for incremental refresh to work, all steps leading to the filtering with the RangeStart and RangeEnd parameters must support query folding. This is important because if the filtering step does not fold, Power Query would have to import the entire dataset and then filter rows after. This defeats the purpose of incremental refresh. With query folding, the data source only sends data for the relevant periods.
With the parameters in place, we can now look at how you can configure an incremental refresh policy.
4.3. Enabling Incremental Refresh
You can enable incremental refresh at the front-end of Power BI. First, load your data into Power BI and find your table in the ‘Data’ pane.
To enable incremental refresh:
- Right-click the table where you want to configure incremental refresh.
- In the menu that appears, click Incremental Refresh.
You will then see the Incremental Refresh and Real-time Data screen. Let’s explore what we can do there.
In the above image, we configured the following:
- Select Table: Choose the table to configure incremental refresh for. In this example, ‘Yellow Taxi Trip Data’.
- Set import and refresh ranges: Move the slider to the right to access the incremental refresh settings.
- Archive data starting: Specify which data to archive. Archived data is stored but won’t be refreshed. For instance, if today is July 31, 2024, when you set the archive to 4 years it shows: “Data imported from 01/01/2020 to 31/12/2023 (inclusive).” Data before 2020 and after 2023 is not archived.
- Incrementally refresh data: Specify the periods to refresh incrementally. For example, if you set 1 year before the refresh date, it shows: “Data will be incrementally refreshed from 01/01/2024 to 31/12/2024 (inclusive).”
Make sure to setup the RangeStart
and RangeEnd
parameters. If missing, the menu will display a warning: “Before you can set up incremental refresh on this table, you need to set up parameters”.
After setting this up, we are ready to publish the dataset.
After publishing a model with incremental refresh, you won’t be able to download the .pbix file again.
4.4. Publishing Your Dataset
With the parameters set up in Power Query and the Incremental Refresh policy configured, you can now publish your dataset to the Power BI service. Choose a workspace, press Publish, and you’re good to go.
Keep in mind that once published, you can’t download your dataset from the Power BI service anymore. The reason is that your dataset may now refresh with such large volumes that the Power BI service can’t offer a download. Because of this, the option to download a copy of your report and data as a .pbix file is disabled.
If you try to download your file, you will see this message: “This option isn’t available because the data model has incremental refresh enabled.” Here’s what it looks like:
4.5. Refreshing Your Dataset
After publishing your model, you can perform a refresh. This should be a manual refresh initially. The first refresh often takes a long time because partitions are created, history is loaded, and necessary model calculations are performed.
Subsequent refreshes are much faster. Even though the incremental refresh process combines data from separate partitions, it is usually quicker than importing the entire dataset.
Our dataset contains 45.7 million rows of data. To demonstrate the differences in refresh times, we ran three scenarios:
- Full Refresh: Refreshes the entire dataset with over 45 million rows.
- Incremental Refresh: Refreshes the current year and archives the last four years.
- Incremental Refresh: Refreshes the last three months and archives the last four years.
We refreshed the dataset three times for each scenario and averaged the refresh duration. Here are the results:
Dataset | Refreshed Rows | Refresh Time |
1. Full refresh | 45.718.639 | 00:06:55 |
2. Incremental Refresh (initial refresh) | 00:05:51 | |
2. Incremental Refresh (subsequent refreshes) | 16.752.276 | 00:03:34 |
3. Incremental Refresh (initial refresh) | 00:06:12 | |
3. Incremental Refresh (subsequent refreshes) | 3.690.448 | 00:01:42 |
As you can see, the initial refreshes when setting up an incremental refresh policy can take some time. However, once partitions are created, subsequent refreshes are much quicker. For example, refreshing the last three months (scenario 3) goes from 06:55 minutes to only 01:42 minutes. That’s a speed increase of more than 75%. This not only improves your refresh times but also improves the stability of the refresh and reduces the resource usage of your data source.
5. Conclusion
In this article, you learned how to set up incremental refresh in Power BI. We covered the concept, its benefits, and the steps required to configure the necessary parameters and enable the refresh policy.
While the initial setup may take some time, the benefits of faster refresh times and improved stability make it worthwhile for large datasets. With incremental refresh, you can keep your data up-to-date without overwhelming your system.
Just remember, once you upload your data model with an incremental refresh policy, you won’t be able to download the .pbix file anymore. Be sure to create a backup before proceeding.
6. Frequently Asked Questions
Incremental Refresh: When you setup an incremental refresh policy, Power BI splits your data into partitions. Based on the configured policy, it will only refresh parts of your dataset.
Scheduled Refresh: Scheduled refresh determines how often and at which times your dataset retrieves its data. If you have an incremental refresh policy, it will start incrementally refreshing your data during the scheduled times.
These two concepts are separate but work well together.
Yes, you can. To do this, incorporate the RangeStart and RangeEnd parameters in all the desired tables. After that, you can set up an incremental refresh policy for each table, just as you would for a single table.
We recommend keeping a local copy of your .pbix file. If you need to make changes in the future, simply open your local version and make the changes. When you upload your updated .pbix, the Power BI service will reprocess the partitions. Subsequent refreshes will continue to work incrementally.
Unfortunately, incremental refresh only works with query folding. When data sources don’t fold, Power Query imports all data before filtering down to the required subset. Since all data needs to be imported, there is no benefit in configuring incremental refresh.
It depends. When configuring incremental refresh, you can specify how much data to keep. If the archiving date range is before your oldest data, it keeps everything. However, you can also decide to keep only a portion of the historical data. In that case, the incremental refresh removes old data over time.
Power BI uses the RangeStart and RangeEnd parameters to configure incremental refresh. These parameters define which columns should be filtered according to the incremental refresh policy. Since your model may have many date or datetime fields, these parameters allow you to specify which columns are relevant for incremental refresh.