What is the context, or scope, of this dialog’s settings?
Since there’s more than one category of context in play here, there’s more than one answer. The most obvious contextual category is probably data sets. Looking at settings from the data set scope perspective, “.\sql2017express” is scoped to that entire database server while data set “c:\” covers the entire C: drive.
Data set isn’t the only classification of context that applies to this dialog. Another is at the environment level. How broadly do each of the settings in the above dialog apply? If you open another report, will the settings you set here apply there? What if another user opens this report—how many (if any) of these settings will carry over to them?
Data set and environment are both groupings of context at the technical level. However, technical scopes aren’t the only kind of context present. There’s also the context the user assumes when they encounter the dialog, which can very easily be incorrect (not necessarily because of a fault with the user).
The ramifications of getting privacy levels wrong are significant. It’s very important to correctly configure these levels. Understanding the various contexts applicable to each setting one key to getting these settings right. Let’s take a few minutes to articulate these contextual groupings so we have a better understanding of how these settings apply.
While we’re at it, we’ll bump into a data source that never shows up in this list—but which we should be cognizant of when thinking about privacy levels.
Context: Data Source
As already mentioned, this context category is probably the most obvious. By several of the data sources in the list, there is dropdown where you can configure the scope for that particular setting—such as the entire database server or just the particular instance on the server; the entire drive, just a subfolder on the drive or just the particular file being accessed. By default, this dialog sets these scopes to the largest possible (e.g. for the database, the entire database server; for the drive, the entire c:\, etc.).
Current Workbook, an Excel-only data source, deserves a special mention. Its privacy level setting applies to data read out of the current workbook by mashups in the current workbook (that is, when
Excel.CurrentWorkbook() is used). It does not apply to data read out of the current workbook by external mashups (mashups in other files). For example, if the current file is SomeFile.xlsx, and a mashup in OtherFile.xlsx pulls data from SomeFile.xlsx, the Current Workbook privacy level in SomeFile.xlsx will not be applied.
Ignore Privacy Levels isn’t a data source so doesn’t fit into this grouping of scope.
Inline Data (the hidden case). Before we leave this context grouping, did you know that a particular kind of data source will never show in privacy levels settings? Power Query allows you to inline data, storing it directly in a mashup. For example, you can hardcode a table using
#table(…); similarly, in Power BI, you can use the GUI “Insert Data” feature to define a table that is directly stored in mashup code. Data directly stored in the mashup doesn’t get its own privacy level and so doesn’t get an entry in the privacy levels list. If inline data needs a privacy level applied to it, you’ll need to move it somewhere else because the concept of privacy levels isn’t directly applied to inline data.
This classification of context can be summed up with the question: How broadly do the data privacy level settings apply in the reporting environment? Will the settings you choose apply across your reporting environment? What about if another user opens one of your reports or spreadsheets?
Ignore Privacy Levels. The descriptive text accompanying this checkbox makes it sound like it applies to the current file. That’s party correct: It applies to the current file but only for the current user.
If you were the last one to save a file that you open, you’ll see this checkbox in the state you left it. If anyone else opens the same file, this box will automatically reset to unchecked, meaning that privacy levels will be enforced (unless their tool’s global “ignore privacy levels” setting is set). On a per-file basis, the only way to disable privacy levels is for the user that wants them disabled to turn them off; you can’t disable them for another user.
Current Workbook (Excel specific). Also applies to the current file for the current user (like Ignore privacy levels).
All other data source-specific settings, (in this case, “.\sql2017express” and “c:\”) apply to the current user for the given tool (e.g. Microsoft Power BI or Microsoft Excel) across files.
If you have several Power BI reports using the same data source, those reports will all use the same privacy level. Similarly, if you have several Excel spreadsheets using the same data source, those spreadsheets will all use the same privacy level. However, data source privacy levels are not shared between Power BI and Excel. They also are not shared across users—so if you send a report or spreadsheet to someone else, that user’s execution of its mashups will occur using the privacy levels they have configured, not the ones you have configured.
If you open a file from another user, none of the privacy settings configured by that user affect how you experience the file.
If you’re trying to figure out why a report is mysteriously running slow for another user but not for you, or slow for you but not another user, it’s possibly privacy settings are responsible—but there’s no way in Query Editor to check whether their settings are different from your settings.
In a diagnostic situation, you can view the last saved Ignore Privacy Levels and Current Workbook settings even when you were not the last saving user by using Data Mashup Explorer. (Unfortunately, the user’s data source privacy settings do not travel in the file so cannot be viewed using this tool.)
Context: Current Situation
When a user encounters the dialog above, how broadly do they expect its settings to apply? To put it another way, when the user encounters this dialog, mentally what context are they imposing on it or assuming about it?
A user will be promoted to set privacy levels the first time they execute a mashup that combines between data sources which do not have privacy levels configured. Chances are, when the user sets these levels, they are choosing values that make sense in the context of the file or report they are currently working on. If they’re combining a city + state/province + ZIP/postal code lookup table from the database and a CSV of publicly-available weather information from somewhere on their C:, chances are they are going to set both data source’s levels to “public” because in the context of the current situation the data they are consuming is all public data.
This seems like a logical course of action. The user is working in the context of a file and the dialog makes no mention of its settings being applied more broadly than that file (in fact, it even mentions “this file” at one point in its text).
The problem is, next time that user uses the same tool (e.g. Power BI or Excel) to access any data in any database on that database server or saved anywhere on that drive, privacy level public will still be used. This time, the context might be a mashup reading highly confidential data from the database or drive but that makes no difference: the privacy level was previously set and the user won’t be prompted to change it or confirm it. Chances are, since they won’t be prompted about privacy levels, updating those levels won’t cross their mind. The privacy settings they previously selected in what to them was a different context will still apply and could result in data being leaked.
The user has a lot going against them as far as getting privacy settings correct goes. What privacy levels are and what they do are not intuitively obvious or easily observable (a topic beyond the scope of this article). Even if the user understands the concept, to correctly configure the related settings, they need to understand the scope (or context) of those settings and remember to update their data source-level settings, as appropriate, when how they use those sources changes in their personal reporting environment.
Switching these settings to all be scoped to the file level would simplify this. Imagine that data source privacy level, Current Workbook and Ignore Privacy Levels settings were all scoped to and saved into the current file (report or spreadsheet) and applied to any user who opened that file. This change would remove most of the context-related complexity involved with getting privacy levels configured correctly. It would be simple to explain that “these settings apply just to this file.” Data source-specific settings could then be tailored to the data processed by just that file instead of needing to factor in all uses the user might make of that source across their entire personal reporting world. The possibility for different users to unconsciously use different data source privacy levels would be eliminated.
On the other hand, making this change would necessitate taking the time to configure data source privacy levels for each file, in contrast to the current approach where you configure those settings just once per source (while this has its downsides, the current approach does save time).
What do you think? Would changing Microsoft Power BI Desktop’s and Microsoft Excel’s Power Query privacy settings to per-file across all users be a good idea?