Privacy Levels in Dataflows: Click to Continue? (Or Not!)


If you’ve built a Power BI dataflow that combines data from multiple sources, you’ve most likely been stopped by a prompt asking whether you want to “continue” because there is a risk that data could be revealed from one source to another.

Screenshot of prompt:
The evaluation was canceled because combining data from multiple sources may reveal data from one source to another. Click Continue if the possibility of revealing data is okay. 

[Continue button]

The prompt’s wording makes it sound like you must choose “continue” in order to be able to use dataflows to output data derived from more than one data source—but is continuing truly mandatory?

Dataflow's options dialog showing privacy setting "Allow combining data from multiple sources. This could expose sensitive or confidential data to an unauthorized person" checked.

The seeming necessity of enabling this option is reinforced by how the corresponding setting appears in the dataflow’s Options dialog. Clicking “continue” in the above prompt sets this checkbox. Its wording implies that it must be checked for Power Query to be able to combine data from multiple sources: if you don’t check it, you won’t be allowed to combine data from more than one source. Or so it (incorrectly) seems.

Thankfully, in most cases, you do not need to enable this option in order to combine data from multiple sources.

The option in question does not enable combining data between sources; rather, it disables the data protection firewall. The dataflow’s “click continue” prompt does not appear because Power Query’s ability to combine sources needs to be switched on. Rather, the prompt is a disguised alert that the data protection firewall does not have enough information to determine how data may be combined between sources: specifically, privacy levels have not been configured for one or more of the sources.

When Power BI Desktop encounters this same situation, it prompts you to set the missing privacy levels. Surprisingly, dataflows take a very different approach: instead of asking you for the needed but missing information, they prompt you to disable privacy level enforcement altogether (albeit without explicitly telling you that this is what you are about to do).
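To make the contrast concrete, here is a minimal Python sketch of the behavior described above. It is only a toy model of what the article observes, not Power Query's actual implementation: the `Source` class, the `describe_prompt` function, and the source names `SalesDb` and `RatesApi` are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class PrivacyLevel(Enum):
    PRIVATE = "Private"
    ORGANIZATIONAL = "Organizational"
    PUBLIC = "Public"


@dataclass
class Source:
    name: str                              # hypothetical connection name
    privacy_level: Optional[PrivacyLevel]  # None means "not configured yet"


def sources_missing_levels(sources: List[Source]) -> List[str]:
    """Return the names of sources that have no privacy level configured."""
    return [s.name for s in sources if s.privacy_level is None]


def describe_prompt(sources: List[Source], host: str) -> str:
    """Describe how each host reacts to missing levels, per the article."""
    missing = sources_missing_levels(sources)
    if not missing:
        return "no prompt: the firewall has what it needs and stays enabled"
    if host == "desktop":
        return "asks for privacy levels of: " + ", ".join(missing)
    if host == "dataflow":
        return "offers 'Continue', which disables the firewall"
    raise ValueError(f"unknown host: {host}")


sources = [
    Source("SalesDb", PrivacyLevel.ORGANIZATIONAL),
    Source("RatesApi", None),  # privacy level never set
]
print(describe_prompt(sources, "desktop"))
print(describe_prompt(sources, "dataflow"))

# Supplying the missing level removes the reason for the prompt.
sources[1].privacy_level = PrivacyLevel.PUBLIC
print(describe_prompt(sources, "dataflow"))
```

The point of the sketch is that both hosts react to the same underlying condition (a source with no privacy level); they differ only in what they offer as the way out.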

To resolve the dataflow’s prompt without clicking “continue” (and thereby disabling the firewall), simply set a privacy level for each data source used in the dataflow. To do this, select the settings “gear” icon near the top right of the toolbar, choose Manage connections and gateways, find the relevant connections, and update the settings for each to reflect the appropriate privacy level.

Once all involved sources have their privacy levels set, refresh the query results preview. The prompt should disappear, leaving your dataflow working with the data protection firewall still enabled.

I hope that Microsoft will improve the user experience around these settings. As disabling the data protection firewall is necessary only under limited circumstances and brings with it security ramifications, it seems the UI should not encourage disabling it. Instead, the default in situations like these should be to prompt for the missing privacy level information so that the firewall can stay enabled and do its job.
