Bulk loading seems shrouded in mystery. You can’t directly bulk insert values using a query. However, you can bulk load using a tool like BCP, an API like SqlBulkCopy or via a query that tells SQL Server to read the rows out of a file and insert them (
BULK INSERT or
OPENROWSET(BULK…)). Then, if you use one of these means to do a bulk load and watch server activity using a Profiler trace or extended events, you’ll see an insert query but no row data. What’s going on?!
These puzzles can be resolved by learning a little about Tabular Data Stream (TDS)—the protocol clients use to communicate with Microsoft® SQL Server®.
How a Client Bulk Loads
At the TDS level, bulk loading starts with the client sending a SQL Batch message containing an
INSERT BULK T-SQL statement. This special insert statement identifies destination table and columns—just like an ordinary insert—as well the relevant bulk insert options to use, like check constraints, fire triggers or keep nulls. However, it does not include row values. Instead, after the SQL Batch is sent, the data to bulk load is streamed from client to server, using a subset of the same data structures that the server uses to send results to the client.
The data to bulk insert is streamed using a table-like format. The stream starts with metadata describing the columns about to be transmitted. Next, rows are sent. Each row’s column values are encoded in a deterministic binary format corresponding with the column’s data type (for example, datetime values are precisely encoded instead of being transmitted as string which can be interpreted differently depending on the current culture). Lastly, a concluding done notification wraps up the tabular transmission.
Why You Can’t Do It or See Them
Since T-SQL doesn’t know anything about TDS structures, it’s not possible to specify the values to bulk insert directly in a query—there’s no syntax for coding up the necessary column metadata, row and done notification components.
Profiler and extended events provide ways to capture SQL Batches—which is way you can use them to see INSERT BULK statements. However, these tools don’t provide events to capture row data flowing across the wire—which is why, when you use these tools, you can’t see the actual data being bulk inserted and which is also why, using these tools, you can’t view result set rows being sent from server to client.
If you’d like to learn more about TDS, including more about how it transmits tabular data, you may enjoy watching T-SQL on the Wire: What takes place between client and server? This post is adapted from that presentation.