Working
with Precedence Constraints in SQL Server Integration Services
In SSIS, tasks are linked by precedence
constraints. A task will only execute if the condition that is set
by the precedence constraint preceding the task is met. By using these
constraints, it is possible to ensure different execution paths depending on
the success or failure of other tasks. This means that you can use tasks
with precedence constraints to determine the workflow of an SSIS package. We challenged Rob Sheldon to provide
a straightforward practical example of how to do it.
The control flow in a SQL Server Integration Services (SSIS)
package defines the workflow for that package. Not only does the control flow
determine the order in which executables (tasks and containers) will run, the
control flow also determines under what conditions they’re executed. In other
words, certain executables will run only when a set of defined conditions are
met.
You configure the workflow by using precedence constraints.
Precedence constraints link the individual executables together and determine
how the workflow moves from one executable to the next. Figure 1 shows the
control flow for the PrecedenceConstraints.dtsx
package. The precedence constraints are the green and red arrows (both solid
and dotted) that connect the tasks and container to each other. (There can also
be blue arrow, as you’ll learn later in the article.)
Figure 1: Control Flow in the
PrecedenceConstraints.dtsx SSIS package
As you would expect, the arrows define the direction of the
workflow as it moves from one executable to the next. For example, after the
first Execute SQL task runs, the
precedence constraints direct the workflow to the next Execute SQL task and the Sequence
container. One or both of these executables will run, depending on how the
precedence constraints have been configured.
When a precedence constraint connects executables, the
originating executable (the first to run) is referred to as the precedence
executable. Multiple precedence constraints can originate from the
precedence executable. In the first Execute
SQL task in Figure 1, two precedence constraints originate from that
precedence executable.
The task or container that is on the downstream end of the
precedence constraint is referred to as the constrained executable. The
constrained executable will run only if the conditions defined on the
precedence constraint are met. If the conditions are not met, the constrained
executable will not run. As a result, by configuring the precedence
constraints, you can create complex workflows, while minimizing the need to
configure duplicate tasks and containers.
Note that I created the PrecedenceConstraints.dtsx package in
SSIS 2005. However, I upgraded the package to SSIS 2008 to ensure that
precedence constraints are implemented the same in both versions. You can
download the SSIS 2005 version of the PrecedenceConstraints.dtsx package in the speech bubble.
If you want to run the PrecedenceConstraints.dtsx package, you
should first run the following Transact-SQL code against the AdventureWorks
sample database:
IF EXISTS(
SELECT
table_name FROM information_schema.tables
WHERE
table_name = 'Employees')
DROP TABLE
Employees
GO
CREATE TABLE
Employees
(
EmployeeID INT PRIMARY KEY,
FirstName NVARCHAR(50) NOT NULL,
LastName NVARCHAR(50) NOT NULL,
Jobtitle NVARCHAR(50) NOT NULL
)
GO
IF EXISTS(
SELECT
table_name FROM information_schema.tables
WHERE
table_name = 'EmployeeLog')
DROP TABLE
EmployeeLog
GO
CREATE TABLE
EmployeeLog
(
LogID INT IDENTITY PRIMARY KEY,
LogFile VARCHAR(50) NULL,
LogDateTime DATETIME NOT NULL DEFAULT GETDATE()
)
The code creates the Employees table and EmployeeLog table, both
of which are necessary to run the PrecedenceConstraints.dtsx package. The
package itself archives employee data, logs information about the package
execution, truncates the Employees table if necessary, retrieves data through
the HumanResources.vEmployee view, and loads it into the Employees table. The
package uses different types of precedence constraints to control the workflow
for these various operations. We’ll look at the workflow and precedence
constraints in closer detail as we work through the article.
Defining Workflow by Success or Failure
By default, when a precedence constraint connects two
executables, the constrained executable will run after the precedence
executable successfully runs. However, if the precedence executable fails, that
part of the workflow is interrupted, and the constrained executable does not
run. You can override this behavior by setting the Value property on the precedence constraint. The property supports
three options:
- Success: The precedence executable must
run successfully for the constrained executable to run. This is the
default value. The precedence constraint is set to green when the Success option is selected.
- Failure: The precedence executable must
fail for the constrained executable to run. The precedence constraint is
set to red when the Failure
option is selected.
- Completion: The constrained
executable will run after the precedence executable runs, whether the precedence
executable runs successfully or whether it fails. The precedence
constraint is set to blue when the Completion
option is selected.
If you refer back to Figure 1, you’ll see that the two
precedence constraints originate from the Data Flow task, one green and one red. If the Data Flow task runs successfully, the
first Send Mail task will run
because the green precedence constraint connects to that Send Mail task. Because the Value property of the precedence
constraint is set to Success,
the precedence constraint will evaluate to true and the workflow will continue
along that path.
However, if the Data
Flow task fails, the red precedence constraint will evaluate to true
because its Value property is
set to Failure. As a result, the
second Send Mail task will run.
This way, you can configure each Send
Mail task differently so that you’re sending a unique email based on
whether the Data Flow task
succeeds or fails.
Note: The
two Send Mail tasks and the SMTP connection manager are included
here for demonstration purposes only. If you want to tests these tasks, you
will need to point the connection manager to an actual SMTP server and
configure the tasks appropriately. Otherwise, you should disable the tasks or
they will fail when you try to run the package. (Even if they do fail, however,
the package will still run and load the data as expected.)
Defining Workflow by Expressions
Although defining workflow by execution outcome (success,
failure, or completion) can be useful, the workflow logic is still limited to
that outcome. However, you can further refine your workflow by adding
expressions to the precedence constraints. Any expression you add must be a
valid SSIS expression and must evaluate to true or false.
To add an expression, double-click the precedence constraint to
open the Precedence Constraint Editor
dialog box, as shown in Figure 2. (The editor shown in this figure is the one
for the precedence constraint that connects the first and second Execute SQL tasks.)
Figure 2: Defining an
expression in the Precedence Constraint Editor dialog box
When adding an expression to a precedence constraint, the first
step you must take is to select one of the following options from the Evaluation operation drop-down list:
- Constraint: The precedence
constraint is evaluated solely on the option selected in the Value property. For example, if
you select Constraint as
the Evaluation operation
option and select Success
as the Value option (the
default settings for both properties), the precedence constraint will
evaluate to true only if the precedence executable runs successfully. When
the precedence constraint evaluates to true, the workflow continues and
the constrained executable runs. (When the Constraint option is selected, the Expression property is greyed out.)
- Expression: The precedence
constraint is evaluated based on the expression defined in the Expression text box. If the
expression evaluates to true, the workflow continues and the constrained
executable runs. If the expression evaluates to false, the constrained
executable does not run. (When the Expression
option is selected, the Value
property is greyed out.)
- Expression and Constraint: The precedence
constraint is evaluated based on both the Value property and the expression. Both must evaluate to true
for the constrained executable to run. For example, in the PrecedenceConstraints.dtsx package,
the first Execute SQL task
must run successfully and the expression must evaluate to true for the
precedence constraint to evaluate to true and the constrained executable
to run.
- Expression or Constraint: The precedence
constraint is evaluated based on either the Value property or the expression. At least one of these
properties must evaluate to true for the constrained executable to run.
After you’ve selected an option from the Evaluation operation list (and set the Value property, if appropriate), you’re next step is to define the
expression. The expression I’ve used in this case (shown in Figure 2),
determines whether the @EmployeeCount property equals 0 (@EmployeeCount
== 0).
To better understand how this works, let’s take a quick look at
the first Execute SQL task. The
task runs the following SELECT statement to retrieve the number of rows in the
Employees table:
SELECT COUNT(*) FROM Employees
The task then assigns the statement’s results (a scalar integer
value) to the @EmployeeCount variable, which I defined when I set up the SSIS
package.
The precedence constraint expression then uses the variable
value to determine whether it equals 0. If it does, the expression evaluates to
true. If not, it evaluates to false. Because the Value property precedence constraint is also set to Success, the first Execute SQL task must run successfully
and the expression must evaluate to true for the precedence constraint as a
whole to evaluate to true. If it does, the second Execute SQL task runs.
The precedence constraint that connects the first Execute SQL task to the Sequence container uses similar logic.
However, the expression itself is slightly different:
@EmployeeCount > 0
In this case, the @EmployeeCount value
must be greater than 0 for the expression to evaluate to true. If it does
evaluate to true and the first Execute
SQL task runs successfully, the Sequence
container and the tasks within it will run.
Notice that the expressions in the two precedence constraints
that originate from the first Execute
SQL task are mutually exclusive. That is, only one of the two
expressions can ever evaluate to true during a specific execution. That does
not mean that you cannot have multiple precedence constraints with expressions
that all evaluate to true during a single execution, but it does mean that you
want to use caution when implementing expression to insure that they reflect
exactly the logic you’re trying to implement in your workflow. In this case, I
want to ensure that only one precedence constraint evaluates to true during a
single execution.
Note: SSIS
expressions are an entity unto themselves and unique to SSIS packages and their
components. It is beyond the scope of this article to get into the details of
expressions, but it is important to get them right. Be sure to refer to the
topic “Integration Services Expression Reference” in SQL Server Books Online if
you have any questions about SSIS expressions.
Defining Workflow by Logical AND or Logical OR
Two other important configuration options in a precedence
constraint are the Logical OR
and Logical AND settings. These
settings apply only to constrained executables and only if those executables
have more than one precedence constraint directed to it. For example, the Data Flow task in the PrecedenceConstraints.dtsx package
(shown in Figure 1) has two precedence constraints pointing to it: one from the
second Execute SQL task and one
from the Sequence container.
If you refer back to Figure 2, you’ll find the following two
options at the bottom of the Precedence
Constraint Editor dialog box:
- Logical AND: All precedence
constraints that point to the constrained executable must evaluate to true
in order for that executable to run. This is the default option. If it is
selected, the arrow is solid.
- Logical OR: Only one
precedence constraint that points to the constrained executable must
evaluate to true in order for that executable to run. If this option is
selected, the arrow is dotted.
In the case of the precedence constraints that point to the Data Flow task, it is the second
option that is selected, as indicated by the dotted lines. As a result, the
workflow can originate from either the second Execute SQL task or from the Sequence container, and only one of the precedence executables has
to run successfully.
The Logical OR
and the Logical AND options let
you define multiple execution paths, yet share those elements that are common
to each path, such as the Data Flow
task in the PrecedenceConstraints.dtsx
package. The logical AND/OR settings, along with the Value property and the use of expressions, provide you with the
options necessary to define intricate workflows while minimizing duplicate
efforts.
Despite its simplicity, the PrecedenceConstraints.dtsx package described in this article
demonstrates all these elements and should provide you with the foundation you
need to use precedence constraints to their fullest. However, you can find
additional information about precedence constraints in the topic “Setting Precedence
Constraints on Tasks and Containers” in SQL Server Books Online.
No comments:
Post a Comment