SQL: Passing Array

Many times you find yourself with this damned task, to pass an array to your SQL code (whether you supply the code, or whether it’s a stored procedure doesn’t really matter). You need to pass an array of elements, usually Id’s, but it’s absolutely up to you. In my app I had to pass to my stored procedure an array of Id’s. Each ID was a binary(20). Since I use MSSQL 2005, I could use XML to accomplish this task. So my stored procedure receieves a parameter of Xml:

CREATE PROCEDURE usp_MyProc
   @ids XML
AS
BEGIN
    SELECT f1, f2 FROM @ids.nodes(‘/id’) T(c) JOIN MyTable ON ID = T.c.value(‘.’, ‘MyIdType’);
END

Invoking it will be:

DECLARE @x AS XML;
SET @x = “<id>aa</id><id>bb</id>”;
EXEC usp_MyProc @x

Now the Ids in @x must be encoded in Base64, because MyIdType is Binary(20). So everything is cool now, except one thing, performance sucks big time. Now I was wondering what is the difference between attributed values and the node itself:
<id value=”id goes here”/>
( SELECT f1, f2 FROM @ids.nodes(‘/id’) T(c) JOIN MyTable ON ID = T.c.value(‘value’, ‘MyIdType’); )
and
<id>id goes here</id>

Apparently when only doing a test of parsing the Xmls input, they are almost running at the same speed, although the attributed way is a bit faster. But when you take the result (that’s the converted id as binary) and insert it into a temp table, such as: DECLARE @tmp AS TABLE (ID MyIdType), the attributed Xml seems to perform 2X faster than the nodes format. Interesting ah?

Even so, the performance was still bad, and I had to do something better, I decided to get rid off the Xml and Base64 stuff, they take too much time for SQL to evaluate. My first attempt was to convert from an input of VARCHAR directly to Binary, but without success, the conversion won’t be done well, I got my binary stream as 0x4142 if I were to supply an input of “ab” instead of a right value of 0xab…Then I looked into this substring builtin function and found out (thanks to a friend) that it can operate on varbinary types as well! Imagine that. This meant that from now on I can supply the ids array as one long binary input as varbinary…

The code is now:

DECLARE @P AS INT;
SET @P = 1;
DECLARE @L AS INT;
SET @L = LEN(@ids)+1;
WHILE @P < @L
  SELECT f1, f2 FROM MyTable WHERE ID = SUBSTRING(@ids, @P, 20);
  SET @P = @P + 20;
END

Voila. This seemed to be twice faster than attributed Xml, thus we saved string parsing and Base64 decoding and we even eliminated the use of JOIN. Yet, we do the parsing of the binary stream ourself, and it a bit blows the code with more lines, but it’s still simple code and the performance is much better.

In the beginning I didn’t realize why the above code was faster than a single select query with Xml, but I then understood that SQL scans the tables using indexing, of course. So even if it parses the Xml each time for the next node, the select query happens in reality more than a single time (as much as there is Id nodes to read in), so breaking the single select and putting it in the while statement didn’t degrade performance, but only boosted it with binary parsing. However, binary parsing is not the right phrase to use, because we only “slice” (as in Python :) ) the binary input stream to smaller chunks which doesn’t do any parsing anyways. The only disadvantage I found to this way of passing arrays, was that I cannot pass records/structures as in Xml… It might require more work then and ugly parsing for integers etc…

One Response to “SQL: Passing Array”

  1. Zashkaser says:

    hahaha ! this is good shit

Leave a Reply