Monday, September 30, 2013

31 DAYS OF SSIS – 10 THINGS ABOUT EXPRESSIONS (16/31)

The SSIS expression language is an interesting blend of T-SQL and Visual Basic functions.  Some work in ways we expect and others do not.  And while a lot of it is familiar enough that we can get by, there is some things that are a bit foreign and exclusive to SSIS.

1. EXPRESSION EVALUATION

This first one is one that I learned the hard way.  Expressions are evaluated at the time in which the expression is accessed.  This means that each time you reference the expression it’s value will be determined.  This has an impact when you use variables that change during execution and what their output will be.
Also, if you have a variable with a value that is defined by an expression, the expression will override the value that you set for the variable.  I’ve fallen foe to this a number of times in the past.  Setting a variable value, either through BIDS or at run-time with DTEXEC, will have no effect if the variable value is evaluated by an expression.

2. USE SYSTEMSTARTDATE FOR TIMESTAMPS

As mentioned above, expressions values are determined when the expression is evaluated.  Because of this, the GETDATE() function in SSIS will provide the current date and time each time it is called and the date and time at the beginning of the package execution will be different that at the end of the package execution.  As an alternative to GETDATE() function, use the SystemStartDate variable.  This variable is set at the start of the package execution and static throughout the package’s execution.

3. CONDITIONAL LOGIC

Inevitably, at some point when writing expressions you will want to perform some sort of conditional logic.  With T-SQL and Visual Basic, we can write IF-THEN-ELSE code blocks.  In Access, we have the immediate if (IIF) function.  And in Excel and C# there is the use if IF().  Of course, none of these were good enough for SSIS.
Instead in SSIS, the expression language uses ? and : for conditional logic.  These are written as:
1boolean expression ? true value : false value
After writing a boolean expression the question mark is use to indicate that the boolean expression should be evaluated.  If the boolean expression is true, then return the first value.  The colon is used to indicate the false value.

4. AND AND OR

A few times I’ve encountered package with multiple data paths coming out of Conditional Split transformations which are subsequently merged back together with Union All transformations.  The cure for this is usually a quick tutorial on how to use the AND and OR.
Quite simply, AND translates to && and OR translates to || within the SSIS expression language.  And erroneous expression might look like this:
1@[User::Value1] == 1 AND @[User::Value2] == 1 OR @[User::Value3] == 1
But can be fixed as the following:
1@[User::Value1] == 1 && @[User::Value2] == 1 || @[User::Value3] == 1
Make sure you utilize AND and OR operators rather than adding additional needless transformations.

5. BACKSLASH

Whenever you need to add a backslash into an expression through string concatenation, you may come upon an unexpected error.  When backslashes are entered, they will cause the expression to error with the following error:
image
Long and the short, backslashes require an escape from themselves.  The remedy to this is to double up your backslashes.

6. TOO MANY PARENTHESIS

Parenthesis are nice.  They take portions of your expressions and group them together.  This allows you to both visually and logically control the expression.  But what you may not know is that you can go too far with parenthesis.  Take the following expressions for example:
1(100 - (@[User::Value1] + @[User::Value2])) == 0 ? 0 : ((@[User::Value3] / (100 - (@[User::Value1] + @[User::Value2]))) * 100)
Place this in a variable expression and it will compile and works nicely.  But try and run some values in it and you will occassionally get a divide by zero error.  The logic is correct but the parenthesis seem to confuse the expression when it executes.
Drop out some of the parenthesis and you will have not problem.  Changing over to the expression below removes the issue:
1100 - @[User::Value1] + @[User::Value2] == 0 ? 0 : @[User::Value3] / (100 - @[User::Value1] + @[User::Value2]) * 100

7. TOO FEW PARENTHESIS

Of course, you need to be careful not to have too few parenthesis.  A very common mistake that I will see with too few parenthesis is when the results of condition logic is being returned.  The goal might be to return a 100-character string value with the following logic:
1(DT_STR,100,1252)"1"=="1""1" "0"
The trouble with this expectation is that the data conversion at the start of the expression only affects the boolean expression.  It does not affect the output from the true or false values.  Instead, the expression will return a 1-character unicode string.
A simple rewrite with a couple of parenthesis and the expression runs as expected:
1(DT_STR,100,1252)("1"=="1""1" "0")
It doesn’t take much to fix this.  The key to parenthesis is to only use them use you need them and to make certain the encapsulate the intended targets.

8. DATE FUNCTIONS

With SSIS packages you won’t make if very far without using some form of date functions.  Trouble is we have most of the T-SQL date functions but we don’t use the same methods for passing in the parameters.  The difference here is that the parameters for date portion of date functions are included within double-quotes.  Instead of DATEADD(d, 1, GETDATE()) the expression would be written as DATEADD(“d”, 1, GETDATE()).
Another thing to mention with date functions is the lack of the CONVERT functionality in T-SQL.  That handy function can easily reformat date and time values in T-SQL.  Unfortunately, expressions require the use of DATEPART (or similar functions) to pull all of the date values apart and then to have them concatenated back together.

9. MISSING FUNCTIONS SUCK

In the past item, I
There are some functions that you will want that are not available in the expression language.  Some functions you may look for that are not available are:
  • LEFT
  • RIGHT
  • STUFF
  • ISDATE
  • ISNUMERIC
  • MONTHNAME
There are usually two possible outcomes because these functions are missing.  First, to get around them we often need to use other functions (SUBSTRING vs. LEFT/RIGHT) to get to the results.  This indirect method is perfectly valid but makes the code slightly more difficult to decipher.  Other times, we will need to move logic from Derived Column transformations in the Script Component transformations.  This would be the case for ISDATE and ISNUMERIC functions.

10. CONCATENATION

Let’s not finish with bang.  For this last item, when you need to concatenate values together in an expression, you will be using the plus (+).  Fortunately, the syntax matches what we are used to with T-SQL. This isn’t terribly interesting or difficult to remember.
The other piece with concatenation, that is similar to T-SQL, is that it does not perform implicit conversions.  If you have a integer and a string, the integer will need to be converted to a string before the values can be concatenated.

EXPRESSION WRAP-UP

All of the expression items above are things that I often find myself explaining and teaching to others.  The expression language isn’t so different from T-SQL but there are enough differences that each adds to a developers frustration and can result in people giving up.
Do you have any pointers that you would recommend to people when working with expressions?  Please, write them in the comments below.

No comments:

Post a Comment