Search code examples
powerbipowerquery

Data Refresh Time Estimate for Incremental Refresh without Query Folding


I wanted to share some insights, assuming we move forward with the incremental refresh for the current month:

Without incremental refresh, the entire dataset (20 million records) can take up to 50 minutes to refresh. With query folding and incremental refresh (focusing on 2 million records for the current month), the refresh time is significantly reduced to 5–8 minutes.

Given that we are considering implementing incremental refresh without query folding for the current month, could you please let me know approximately how much time the refresh will take under these conditions?


Solution

  • There is no way to estimate how much time the refresh will take.

    You can make hypothesis based on your case.

    Based on the documentation :

    If you see this warning and want to verify the necessary query folding is occurring, use the Power Query Diagnostics feature, or trace queries by using a tool supported by the data source, such as SQL Profiler. If query folding isn't occurring, verify the filter logic is included in the query being passed to the data source. If not, it's likely the query includes a transformation that prevents folding.

    Before configuring your incremental refresh solution, be sure to thoroughly read and understand Query folding guidance in Power BI Desktop and Power Query query folding. These articles can help you determine whether your data source and queries support query folding.

    enter image description here

    When query folding is enabled, the data source handles a portion of the work, such as filtering, grouping, or aggregating data before Power BI receives it.

    This offloads much of the processing to the database and will reduce the amount of data that Power BI needs to handle.

    In the case where you have both query folding and incremental refresh, the time can be reduced and it's because the data source is only sending a filtered subset of the data (for the current month).

    Without query folding, Power BI must pull the entire dataset before it can apply any filtering or transformation locally. Even though you're focusing on the current month, Power BI will not filter the data until it's loaded.

    Since Power BI still retrieves the full dataset (as query folding is not available), the time to load the data will be closer to the full refresh time.

    However, because you are applying incremental refresh and focusing on just 2 million records, there could be a slight reduction in time since you are reducing the scope of processing after data is loaded.