Power BI Storage modes

Power BI lets you connect to many different data sources, including relational databases, NoSQL databases, and files, to source data for consumption in Power BI. From that data, you can create additional data (calculated columns, metrics, transformed data, etc.), build data models, and create reports and dashboards.

Power BI offers several storage modes that determine how data is retrieved, stored, and processed: Import, DirectQuery, Live Connection, and Dual. Import, DirectQuery, and Dual are set at the table level for each table in the Power BI data model, while Live Connection applies to the connection as a whole. I will now describe each mode.

Import

With the Import storage mode, Power BI imports and caches the data from the sources. Once the data import is complete, the data in Power BI remains the same until it is refreshed by the Power BI refresh process for that dataset.

This storage mode supports the widest range of Power BI features for data modeling and analysis. For example, Import mode is required for two popular Power BI features, Quick Insights and Q&A. It also almost always delivers the best performance. However, it is not necessarily the best option in every scenario. Because the data is imported, the file can grow large and can take considerable time to load. For relatively static, low-volume data, though, it is generally the preferred choice.

Queries submitted to an imported dataset will return data from the cached data only.
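
To make the caching behavior concrete, here is a minimal Python sketch. It is only a toy model of the idea, not how Power BI is implemented: the in-memory SQLite database stands in for a source system, and the ImportedTable class and its method names are made up for this illustration.

```python
import sqlite3

# Hypothetical source system: an in-memory SQLite database standing in for
# whatever database the model imports from.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
source.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10.0), (2, 20.0)])


class ImportedTable:
    """Toy model of an Import-mode table: a local snapshot of the source."""

    def __init__(self, conn, query):
        self.conn = conn
        self.query = query
        self.rows = []
        self.refresh()  # the initial import

    def refresh(self):
        # Re-read the source and replace the cached snapshot.
        self.rows = self.conn.execute(self.query).fetchall()

    def total(self):
        # Answered from the cached rows only; the source is not contacted.
        return sum(amount for _, amount in self.rows)


sales = ImportedTable(source, "SELECT id, amount FROM sales")
total_after_import = sales.total()

# The source changes after the import...
source.execute("INSERT INTO sales VALUES (3, 30.0)")
total_before_refresh = sales.total()  # still computed from the old snapshot

# ...and the change only becomes visible after a refresh.
sales.refresh()
total_after_refresh = sales.total()
```

In this sketch, the row added to the source after the import does not affect the total until refresh() is called, which mirrors how an imported dataset stays unchanged between refreshes.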

DirectQuery

With the DirectQuery storage mode, no data is cached in Power BI; only the metadata of the source tables, columns, data types, and relationships is cached. Instead, the data is queried directly on the source database whenever a Power BI query needs it, such as when a user runs a Power BI report that uses the data.

Since the data is not imported, if all the tables in the data model use DirectQuery, the Power BI file size will be very small compared to a model with imported data.
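
As a rough illustration of the DirectQuery idea, the Python sketch below keeps only metadata locally and sends every query to the source. It is a simplified toy, not Power BI's actual query engine: the SQLite database plays the role of the source, and the DirectQueryTable class and its queries_sent counter are hypothetical names used only here.

```python
import sqlite3

# Hypothetical source system standing in for a relational database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 5.0), (2, 7.5)])


class DirectQueryTable:
    """Toy model of a DirectQuery table: metadata is cached, rows are not."""

    def __init__(self, conn, table_name):
        self.conn = conn
        self.table_name = table_name
        # Cache only the column names by running a query that returns no rows.
        cursor = conn.execute(f"SELECT * FROM {table_name} LIMIT 0")
        self.columns = [description[0] for description in cursor.description]
        self.queries_sent = 0

    def total(self, column):
        if column not in self.columns:
            raise KeyError(column)
        # Every call is translated into a query against the source.
        self.queries_sent += 1
        sql = f"SELECT SUM({column}) FROM {self.table_name}"
        return self.conn.execute(sql).fetchone()[0]


orders = DirectQueryTable(source, "orders")
first_total = orders.total("amount")

# A change in the source is visible on the very next query, with no refresh.
source.execute("INSERT INTO orders VALUES (3, 2.5)")
second_total = orders.total("amount")
```

Because no rows are stored locally, the second total already reflects the new row. The trade-off is that each report interaction costs a round trip to the source, which is why performance depends on the source system.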

Live Connection

The Live Connection storage mode is a special case of the DirectQuery mode. It is only available when sourcing from Power BI Service datasets or Analysis Services data models. This mode has limitations: data modeling is limited to creating measures, so you cannot apply transformations to the data or define relationships within it. You can also have only one data source in your data model.

Dual

With the Dual storage mode, a table may act as either an Import table or a DirectQuery table, depending on the storage mode of the other tables included in the query. For example, suppose a Date table is related to two transaction tables. The first must reflect the current data in the source and is therefore set to DirectQuery mode. The second has fewer than 100,000 rows and is set to Import mode. By setting the Date table to Dual storage mode, Power BI uses DirectQuery when a query involves the Date table and the first transaction table, and uses Import mode when a query involves the Date table and the second transaction table.
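
The decision described above can be sketched in a few lines of Python. This is a simplified toy model, not Power BI's actual logic: the StorageMode enum, the resolve_query_mode function, and the table names are all hypothetical. The sketch also assumes that a query touching only Dual tables is answered from the cache, and it does not model the extra rules that apply when Import and DirectQuery tables are combined in one query.

```python
from enum import Enum
from typing import Mapping, Sequence


class StorageMode(Enum):
    IMPORT = "Import"
    DIRECT_QUERY = "DirectQuery"
    DUAL = "Dual"


def resolve_query_mode(
    model: Mapping[str, StorageMode], tables_in_query: Sequence[str]
) -> StorageMode:
    """Return how a query over the given tables is answered in this toy model.

    If any table in the query is DirectQuery, Dual tables behave as
    DirectQuery; otherwise they behave as Import (served from the cache).
    """
    modes = [model[name] for name in tables_in_query]
    if any(mode is StorageMode.DIRECT_QUERY for mode in modes):
        return StorageMode.DIRECT_QUERY
    return StorageMode.IMPORT


# Hypothetical model matching the example: one Date table, two fact tables.
model = {
    "Date": StorageMode.DUAL,
    "LiveTransactions": StorageMode.DIRECT_QUERY,
    "SmallTransactions": StorageMode.IMPORT,
}

mode_with_live = resolve_query_mode(model, ["Date", "LiveTransactions"])
mode_with_small = resolve_query_mode(model, ["Date", "SmallTransactions"])
mode_date_only = resolve_query_mode(model, ["Date"])
```

Here the Date table follows whichever transaction table it is queried with, which is the reason for marking shared dimension tables as Dual.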

The following summarizes the Power BI data storage modes, grouped by mode:

Import:

- Data is imported and cached in Power BI
- Preferred for static, relatively small datasets
- All Power BI functionality is available, including DAX, Calculated Tables, Q&A, and Quick Insights
- Can connect to Analysis Services, but Live Connection is preferred
- Can have unlimited data sources
- Typically provides the best performance

DirectQuery:

- Data is queried on the source when needed
- Use for large datasets and when data changes in the source need to be reflected immediately
- Features such as Q&A, Quick Insights, Calculated Tables, and many DAX queries are not supported
- Limited data transformation functionality
- Parent-child functionality not supported
- For relational databases
- Not supported for Analysis Services
- Performance depends greatly on the data source

Live Connection:

- A special case of DirectQuery
- Used for connecting to multi-dimensional data sources, such as Analysis Services
- Can be used only with Power BI datasets and Analysis Services
- Can have only one data source
- No data transformation available
- Q&A and Quick Insights not available
- Can create measures

Dual:

- A combination of Import and DirectQuery
- Power BI will choose the appropriate option based on the storage mode of the tables involved in the query
- Can improve performance

Summary of Power BI storage modes

Note: the content in this post is relevant for the PL-300 Analyzing Data with Microsoft Power BI certification exam.

Thanks for reading! I hope you found this information useful.

Good luck on your analytics journey!
