Category Archives: Data

Power Query M Primer (Part 4): Variables & Identifiers

, , , ,

Up until now, we’ve used variables without talking much about their specifics. Since we’re operating under the assumption that you have at least a little experience with another programming or scripting language, how we’ve used variables has probably made sense.

However, there might be differences—perhaps even significant differences—between how variables work in other languages you’ve used with and how they behave in the Power Query M language. Let’s spend some time exploring variables and related concepts to clear up any lurking confusion and position you to take full advantage of what M offers.

We’ll start with a brief recap of the main unit where we define variables: the let expression. Then, we’ll talk about how variables (and other things) are identified. Related to identifiers is scope, so we’ll cover that, too. Next time, we’ll expand  our understanding of how variables work by learning about M’s paradigm.
Continue reading

Power Query M Primer (part 3):
Functions: Function Values, Passing, Returning, Defining Inline, Recursion

, , , ,

Not only can you define and invoke functions (as we covered in part 2), you can also pass them around. The ability to pass a function around without invoking it gives lots of flexibility.

Sounds complex? Yes, in words, but not necessarily in practice. If you’ve touched the Power Query M language, you’ve probably already passed functions around—just perhaps without realizing about it.

Then there’s the mysterious each. It shows up a lot in code generated by the query editor. What does it actually mean or do? Turns out, it’s a handy shortcut that can simplify code you write.

We have ground to cover. Let’s get going!
Continue reading

Power Query M Primer (part 2):
Functions: Defining

, , , ,

If you read part 1 in this series, you may have picked up a theme about expressions that produce values: A simple statement, like 1, is an expression that produces a value. let is also an expression that produces a value.

Guess what? A function is an expression that ultimately produces a value. Unlike the expressions we looked at last time, a function only produces this value when it’s invoked.

Usually, parameters are passed to a function when it’s invoked. The function can reference these inputs as it computes what it will return. Continue reading

Lightning Talk: SELECT Doesn’t Always Return the Expected Number of Columns

, ,

Last night, I had the privilege of sharing a lightning talk at the OKC SQL Server User Group (OKCSQL). This 17-minute presentation touches on four scenarios where the way columns are returned by SELECT might not match what you expect.

When you write SELECT statements, you probably have expectations like:

  • SELECT * returns all columns in the referenced object(s).
  • Each column is returned in a separate column.
  • Each column returned in visible in the result set.

However—at least in the world of Microsoft SQL Server/Transact SQL (T-SQL)—these assumptions aren’t always true.


Continue reading

Power Query M Primer (part 1):
Introduction, Simple Expressions & let

, , ,

Sooner or later, you may find yourself working directly with the Power Query M formula language. Tools like Microsoft Excel’s Get & Transform Data and Microsoft Power BI provide point-and-click interfaces allowing you to build data transformation sequences which behind-the-scenes are implemented in Power Query M. However, these query editors can’t do everything you might need. Sometimes direct editing and authoring of M is required.

Search the Internet and you’ll find many examples showing how to use this language to solve one problem or another but little is out there describing the syntax and paradigm of the language itself. A limited knowledge of these details may be insignificant when simply copying and pasting samples and editing variable and column names. Move beyond this to weaving and writing your own solutions directly in Power Query M and a solid understanding of the language’s syntax, rules and capabilities becomes most helpful.
Continue reading

Python for Non-Data Analytics?

, , ,

Microsoft SQL Server 2017’s support for Python had me curious. Would this integration make it practical to use Python in non-data analytic scenarios where a general-purpose programming language—like Python—is more suitable than set-based T-SQL?

The answer to this question hinges around the integration’s design. Presumably, the new Python integration is designed to work well for data analysis, as this seems to be the prime impetus motivating it. An integration that excels in one scenario, like the assemble data -> process & analyze -> return results flow common in data analytics, may or may not be a good fit for other use cases.

To help us evaluate when and where SQL Server’s Python support may be helpful for non-data analytics applications, let’s compare it—or, more specifically, let’s compare the underlying external script execution environment that powers it—with two integrations that have been included with SQL Server for some time: Common Language Runtime (CLR) and xp_cmdshell.
Continue reading

Security Concern: Hidden Resultset Columns

,

The resultset grid shows all columns sent by the server, right? Not always! Under certain circumstances, Microsoft® SQL Server® transmits columns that weren’t referenced in the SELECT statement and that don’t show up in tools like SSMS.

How many columns does the below view return? Four seems like the obvious answer but it’s only sometimes correct. This view sometimes returns four columns—and other times returns nine!

CREATE VIEW dbo.EmployeeSummary AS
SELECT
  e.FirstName,
  e.LastName,
  dl1.MonthName + ' ' + dl1.DayString AS Birthday,
  dl2.YearString AS EmployedSince
FROM Employee e
  JOIN UserAccount u ON e.UserName = u.UserName
  JOIN EmploymentHistory h ON e.SSN = h.SSN
  JOIN DateLookup dl1 ON h.HireDate = dl1.Date
  JOIN DateLookup dl2 ON u.Birthdate = dl2.Date;

The extra five columns (shaded in yellow, below) are returned as hidden columns. They won’t appear in most query tools, yet the data they contain still crosses the wire and can be accessed programmatically on the client-side.

Resultset Continue reading

Why You Can’t Directly Bulk Load Values
or See Them in Traces

,

Bulk loading seems shrouded in mystery. You can’t directly bulk insert values using a query. However, you can bulk load using a tool like BCP, an API like SqlBulkCopy or via a query that tells SQL Server to read the rows out of a file and insert them (BULK INSERT or OPENROWSET(BULK…)). Then, if you use one of these means to do a bulk load and watch server activity using a Profiler trace or extended events, you’ll see an insert query but no row data. What’s going on?! Continue reading

T-SQL on the Wire: What takes place between client and server?

,

The answer to this question might shed light on some Microsoft® SQL Server® mysteries, might answer—or raise—security concerns and might help you administer, develop and debug better.

In this presentation, learn about Tabular Data Stream (TDS)—the protocol used for client-to-SQL Server interactions. In addition to a  high-level understanding of how TDS works, you’ll (hopefully!) come away with practical applications of this knowledge that should benefit you as a database administrator or developer.