Thursday, December 5, 2013

SSIS 2005 INTERVIEW QUESTIONS

http://ssisinterviewcentral.blogspot.in/


1) Explain architecture of SSIS?
A. SSIS architecture consists of four key parts:
     a) Integration Services service: monitors running Integration Services packages and
          manages the storage of packages.
     b) Integration Services object model: includes the managed API for accessing Integration
          Services tools, command-line utilities, and custom applications.
     c) Integration Services runtime and run-time executables: the runtime saves the layout of
          packages, runs packages, and provides support for logging, breakpoints, configuration,
          connections, and transactions. The run-time executables are the package, containers,
          tasks, and event handlers that Integration Services includes, plus custom tasks.
     d) Data flow engine: provides the in-memory buffers that move data from source to
          destination.


2) What is the control flow?
A. In SSIS a workflow is called a control flow. A control flow links together our modular
     data flows as a series of operations in order to achieve a desired result.
     A control flow consists of one or more tasks and containers that execute when the
     package runs. To control the order or define the conditions for running the next task or
     container in the package control flow, you use precedence constraints to connect
     the tasks and containers in a package. A subset of tasks and containers can
     also be grouped and run repeatedly as a unit within the package control flow. SQL
     Server 2005 Integration Services (SSIS) provides three different types of control flow
     elements: containers that provide structures in packages, tasks that provide
     functionality, and precedence constraints that connect the executables, containers,
     and tasks into an ordered control flow.


3) How would you do logging in SSIS?
A. SSIS has a built-in logging feature that records the details of events such as
    OnError and OnWarning to a log provider: a flat file, a SQL Server table,
    an XML file, or SQL Server Profiler.


4) What is the data flow?
A. A data flow consists of the sources and destinations that extract and load data, the
     transformations that modify and extend data, and the paths that link sources, transformations,
     and destinations. Before you can add a data flow to a package, the package control flow
     must include a Data Flow task. The Data Flow task is the executable within the SSIS
     package that creates, orders, and runs the data flow. A separate instance of the data
     flow engine is opened for each Data Flow task in a package.
     SQL Server 2005 Integration Services (SSIS) provides three different types of data flow
     components: sources, transformations, and destinations. Sources extract data from data
     stores such as tables and views in relational databases, files, and Analysis Services
     databases. Transformations modify, summarize, and clean data. Destinations load data
     into data stores or create in-memory datasets.


5) How would you do Error Handling?
A.  An SSIS package can have two main types of errors:
      a) Procedure error: handled in the control flow through precedence constraints, by
           redirecting the execution flow.
      b) Data error: handled in the Data Flow task by redirecting the failing rows through the
           error output of a component.


6) How do you do error handling in SSIS?
A. When a data flow component applies a transformation to column data, extracts data from
     sources, or loads data into destinations, errors can occur. Errors frequently occur because
     of unexpected data values.
     For example, a data conversion fails because a column contains a string instead of a
     number, an insertion into a database column fails because the data is a date and the
     column has a numeric data type, or an expression fails to evaluate because a column
     value is zero, resulting in a mathematical operation that is not valid.
     Errors typically fall into one of the following categories:
     - Data conversion errors, which occur if a conversion results in the loss of significant
       digits, the loss of insignificant digits, or the truncation of strings. Data conversion
       errors also occur if the requested conversion is not supported.
     - Expression evaluation errors, which occur if expressions that are evaluated at run time
       perform invalid operations or become syntactically incorrect because of missing or
       incorrect data values.
     - Lookup errors, which occur if a lookup operation fails to locate a match in the lookup
       table.
     Many data flow components support error outputs, which let you control how the
     component handles row-level errors in both incoming and outgoing data. You specify how
     the component behaves when truncation or an error occurs by setting options on
     individual columns in the input or output.
     For example, you can specify that the component should fail if customer name data is
     truncated, but ignore errors on another column that contains less important data.
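
The idea of an error output can be sketched outside SSIS. The Python below is only an illustration, not SSIS code: the function names and sample rows are made up, and it simply routes rows that fail conversion or expression evaluation to a separate list instead of failing the whole load.

```python
def convert_with_error_output(rows, convert):
    """Apply convert() to each row; send failing rows to an error list."""
    good, errors = [], []
    for row in rows:
        try:
            good.append(convert(row))
        except (ValueError, ZeroDivisionError) as exc:
            # Comparable to "Redirect row": keep the row and the reason it failed.
            errors.append({"row": row, "error": type(exc).__name__})
    return good, errors


def to_typed(row):
    # Hypothetical transformation: a data conversion plus an expression.
    return {"id": int(row["id"]), "ratio": int(row["total"]) / int(row["count"])}


source_rows = [
    {"id": "1", "total": "10", "count": "2"},
    {"id": "abc", "total": "10", "count": "2"},  # string where a number is expected
    {"id": "3", "total": "10", "count": "0"},    # column value is zero
]

good_rows, error_rows = convert_with_error_output(source_rows, to_typed)
```

Here the first row lands in good_rows and the other two land in error_rows with the reason attached, which is roughly what wiring an error output to a second destination achieves.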


7) How do you pass a property value at run time? How do you implement Package
    Configuration?
A. A property value, such as the connection string of a Connection Manager, can be passed
    to the package at run time through package configurations. Package Configuration offers
    several options: XML file, environment variable, SQL Server table, registry value, or
    parent package variable.


8) How do you do logging in SSIS?
A. SSIS includes logging features that write log entries when run-time events occur and can
     also write custom messages.
     Integration Services supports a diverse set of log providers, and gives you the ability to
     create custom log providers. The Integration Services log providers can write log entries
     to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files.
     Logs are associated with packages and are configured at the package level. Each task or
     container in a package can log information to any package log. The tasks and containers
     in a package can be enabled for logging even if the package itself is not.
     To customize the logging of an event or custom message, Integration Services provides a
     schema of commonly logged information to include in log entries.
     The Integration Services log schema defines the information that you can log. You can
     select elements from the log schema for each log entry.

9) How would you deploy an SSIS package to production?
A. A) Through a deployment manifest
             1. Enable the deployment utility by setting the project's CreateDeploymentUtility
                 property to True.
             2. The utility is created in the bin\Deployment folder of the project as soon as
                 the project is built.
             3. Copy all the files in that folder and use the manifest file to deploy the
                 packages on the production server.
     B) Using the dtutil.exe utility
     C) Import the package directly into MSDB from SSMS after connecting to Integration Services.

10) How do you deploy SSIS packages?
A.   SQL Server 2005 Integration Services (SSIS) makes it simple to deploy packages to any
       computer.
       There are two steps in the package deployment process:
        1) The first step is to build the Integration Services project to create a package
            deployment utility.
        2) The second step is to copy the deployment folder that was created when you built the
            Integration Services project to the target computer, and then run the Package
            Installation Wizard to install the packages.

11) What is Execution Tree?
A.   Execution trees demonstrate how a package uses buffers and threads. At run time, the data
       flow engine breaks down Data Flow task operations into execution trees. These execution
       trees specify how buffers and threads are allocated in the package. Each tree creates a
       new buffer and may execute on a different thread. When a new buffer is created, such as
       when a partially blocking or blocking transformation is added to the pipeline, additional
       memory is required to handle the data transformation, and each new tree may also give
       you an additional worker thread.

12) What are variables and what is variable scope ?
A.   Variables store values that an SSIS package and its containers, tasks, and event handlers
      can use at run time. The scripts in the Script task and the Script component can also use 
      variables. The precedence constraints that sequence tasks and containers into a workflow 
      can use variables when their constraint definitions include expressions.
      Integration Services supports two types of variables: user-defined variables and system 
      variables. User-defined variables are defined by package developers, and system variables 
      are defined by Integration Services. You can create as many user-defined variables as a 
      package requires, but you cannot create additional system variables. 
      Scope :
      A variable is created within the scope of a package or within the scope of a container, task,     
      or event handler in the package. Because the package container is at the top of the   
      container hierarchy, variables with package scope function like global variables and can be 
      used by all containers in the package. Similarly, variables defined within the scope of a 
      container such as a For Loop container can be used by all tasks or containers within the For 
      Loop container.


13) Difference between the Union All and Merge transformations?
A.   a) The Merge transformation accepts only two inputs, whereas Union All can take more
           than two inputs.
      b) Data has to be sorted before the Merge transformation, whereas Union All has no
           such condition.
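
The two differences above can be illustrated with a small Python analogy. This is not SSIS code; the helper names are made up, and the standard library's heapq.merge stands in for a merge that relies on sorted inputs.

```python
import heapq
from itertools import chain


def union_all(*inputs):
    """Concatenate any number of inputs; no ordering is required or produced."""
    return list(chain(*inputs))


def merge_sorted(left, right, key):
    """Combine exactly two inputs that are already sorted on key, keeping the order."""
    return list(heapq.merge(left, right, key=key))


a = [1, 4, 9]
b = [2, 3, 10]
c = [7, 5]  # unsorted input is fine for union_all

unioned = union_all(a, b, c)
merged = merge_sorted(a, b, key=lambda value: value)
```

unioned simply appends the inputs one after another, while merged interleaves the two sorted inputs into one sorted output, which is why the sort requirement exists only on the merge side.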

14) True or False - Using a checkpoint file in SSIS is just like issuing the CHECKPOINT    
       command against the relational engine. It commits all of the data to the database. 
A.   False. SSIS provides a Checkpoint capability which allows a package to restart at the point  
       of failure. 

15) Can you explain what the Import\Export tool does and the basic steps in the
       wizard?
A.   The Import\Export tool is accessible via BIDS or by executing the dtswizard command.
       The tool identifies a data source and a destination to move data either within one
       database, between instances, or even from a database to a file (or vice versa).



16) How would you restart a package from the previous failure point? What are checkpoints
       and how can they be implemented in SSIS?
A.   When a package is configured to use checkpoints, information about package execution is  
       written to a checkpoint file. When the failed package is rerun, the checkpoint file is used to 
       restart the package from the point of failure. If the package runs successfully, the checkpoint 
       file is deleted, and then re-created the next time that the package is run.


18) Where are SSIS packages stored in SQL Server?
A.   In the msdb database, sysdtspackages90 stores the actual package content, and
       sysdtscategories, sysdtslog90, sysdtspackagefolders90, sysdtspackagelog, sysdtssteplog,
       and sysdtstasklog play supporting roles.


19) What are the command line tools to execute SQL Server Integration Services
       packages?
A.   DTEXECUI - When this tool is run, a user interface is loaded in order to
       configure each of the applicable parameters to execute an SSIS package.
       DTEXEC - This is a pure command line tool where all of the needed switches must be
       passed into the command for successful execution of the SSIS package.
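
As a rough sketch of what "passing all the switches" looks like, the Python below assembles a dtexec command line. The package path and variable name are hypothetical. Only the /F (file) and /SET switches are used, and the command is built but not executed here.

```python
import subprocess


def build_dtexec_command(package_path, set_values=None):
    """Build an argument list for dtexec: /F names the package file,
    and each /SET overrides one property as propertyPath;value."""
    command = ["dtexec", "/F", package_path]
    for property_path, value in (set_values or {}).items():
        command += ["/SET", f"{property_path};{value}"]
    return command


def run_package(command):
    """Run the package; only meaningful on a machine where dtexec is installed."""
    return subprocess.run(command, check=True)


# Hypothetical package file and user variable, for illustration only.
command = build_dtexec_command(
    r"C:\Packages\LoadSales.dtsx",
    {r"\Package.Variables[User::RunDate].Properties[Value]": "2013-12-05"},
)
```

Calling run_package(command) would hand the list to the operating system. Building the list separately makes it easy to log or inspect the exact switches before the package runs.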


20) How would you schedule an SSIS package?
A.   Using SQL Server Agent: create a job with a step of type SQL Server Integration Services
       Package and attach a schedule to the job.



21) Can you explain the SQL Server Integration Services functionality in Management
      Studio?
A.   You have the ability to do the following:
      Log in to the SQL Server Integration Services instance
      View the SSIS log
      View the packages that are currently running on that instance
      Browse the packages stored in MSDB or the file system
      Import or export packages
      Delete packages
      Run packages

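22) Difference between asynchronous and synchronous transformations?
A.   An asynchronous transformation has different input and output buffers, and it is up to the
      component designer to define the column structure of the output buffer and to copy
      the data over from the input.
      A synchronous transformation, by contrast, reuses the input buffer for its output and
      modifies rows in place as they pass through (for example, Derived Column).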


23) Can you name some of the core SSIS components in Business Intelligence
       Development Studio (BIDS)?
A.   Connection Managers 
      Control Flow 
      Data Flow 
      Event Handlers 
      Variables window 
      Toolbox window 
      Output window 
      Logging 
      Package Configurations




24) How to achieve parallelism in SSIS?
A.   Parallelism is controlled by the MaxConcurrentExecutables property of the package. Its
      default value is -1, which allows number of processors + 2 executables to run concurrently.
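
The arithmetic behind the default can be written out as a tiny Python helper. The function name is made up for illustration, and the processor count is read from the local machine unless one is passed in.

```python
import os


def effective_max_concurrent(setting=-1, processors=None):
    """Return how many executables may run at once for a given property value."""
    if processors is None:
        processors = os.cpu_count() or 1
    if setting == -1:
        # Default: number of processors + 2.
        return processors + 2
    return setting


on_quad_core = effective_max_concurrent(-1, processors=4)
explicit = effective_max_concurrent(3, processors=4)
```

On a four-processor server the default therefore allows six executables in parallel, while an explicit value such as 3 is used as given.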


25) True or False: SSIS has a default means to log all records updated, deleted or  
       inserted on a per table basis. 
A.   False, but a custom solution can be built to meet these needs. 




26) How do you do incremental load?
A.   The fastest way to do an incremental load is to use a timestamp column in the source table
       and to store the timestamp of the last ETL run. Each ETL run then picks only the rows
       whose timestamp is greater than the stored one, that is, only new and updated records.
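
A minimal sketch of this pattern, using Python's built-in sqlite3 module in place of SQL Server. The table and column names (source, target, etl_watermark, modified_at) are assumptions made for the example, and the "timestamp" is taken to be a last-modified datetime column.

```python
import sqlite3


def incremental_load(conn):
    """Copy rows changed since the stored watermark, then advance the watermark."""
    cur = conn.cursor()
    (last_loaded,) = cur.execute("SELECT last_loaded FROM etl_watermark").fetchone()
    rows = cur.execute(
        "SELECT id, name, modified_at FROM source "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_loaded,),
    ).fetchall()
    for row in rows:
        # New rows are inserted; rows that already exist in target are replaced.
        cur.execute(
            "INSERT OR REPLACE INTO target (id, name, modified_at) VALUES (?, ?, ?)",
            row,
        )
    if rows:
        cur.execute("UPDATE etl_watermark SET last_loaded = ?", (rows[-1][2],))
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT);
CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT);
CREATE TABLE etl_watermark (last_loaded TEXT);
INSERT INTO etl_watermark VALUES ('2013-12-01 00:00:00');
INSERT INTO source VALUES (1, 'old row', '2013-11-30 08:00:00');
INSERT INTO source VALUES (2, 'new row', '2013-12-02 09:30:00');
INSERT INTO source VALUES (3, 'updated row', '2013-12-03 14:00:00');
""")

first_run = incremental_load(conn)   # picks up rows 2 and 3
second_run = incremental_load(conn)  # nothing has changed since the first run
```

In a package, the same steps would typically map to an Execute SQL task that reads the watermark into a variable, a Data Flow task whose source query is filtered by that variable, and a final Execute SQL task that stores the new watermark.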


27) What is a breakpoint in SSIS? How is it setup? How do you disable it?
A.  A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an
     opportunity to review the status of the data, variables and the overall status of the SSIS
     package.
     10 unique conditions exist for each breakpoint.
     Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on
     the object where you want to set the breakpoint and select the 'Edit Breakpoints...' option.
     To disable a breakpoint, clear its check box in the same dialog, or use
     Debug | Disable All Breakpoints.



28) How to handle Late Arriving Dimension or Early Arriving Facts?
A.   Late arriving dimensions are sometimes unavoidable because of a delay or error in the
       dimension ETL, or because of the ETL logic itself. To handle facts that arrive before
       their dimension rows, we can create a dummy dimension row that holds the
       natural/business key and keeps the rest of the attributes as null or default.
       As soon as the actual dimension row arrives, the dummy row is updated with a Type 1
       change. These are also known as Inferred Dimensions.
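
The dummy-row approach can be sketched with Python's sqlite3 module. The dim_customer table, its columns, and the is_inferred flag are assumptions made for this example, not names from SSIS.

```python
import sqlite3


def get_or_infer_customer_key(conn, business_key):
    """Fact load: return the surrogate key, creating a dummy row if none exists."""
    row = conn.execute(
        "SELECT customer_key FROM dim_customer WHERE business_key = ?",
        (business_key,),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO dim_customer (business_key, name, is_inferred) VALUES (?, NULL, 1)",
        (business_key,),
    )
    return cur.lastrowid


def load_dimension_row(conn, business_key, name):
    """Dimension load: overwrite the dummy row in place (Type 1), or insert a new row."""
    updated = conn.execute(
        "UPDATE dim_customer SET name = ?, is_inferred = 0 WHERE business_key = ?",
        (name, business_key),
    ).rowcount
    if updated == 0:
        conn.execute(
            "INSERT INTO dim_customer (business_key, name, is_inferred) VALUES (?, ?, 0)",
            (business_key, name),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, "
    "business_key TEXT UNIQUE, name TEXT, is_inferred INTEGER)"
)

fact_key = get_or_infer_customer_key(conn, "C-100")  # fact arrives first
load_dimension_row(conn, "C-100", "Contoso Ltd")     # dimension arrives later
same_key = get_or_infer_customer_key(conn, "C-100")
final_row = conn.execute(
    "SELECT name, is_inferred FROM dim_customer WHERE customer_key = ?", (fact_key,)
).fetchone()
```

Because the dummy row is updated in place, the surrogate key already stored on the fact rows stays valid once the real attributes arrive.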

29) Can you name 5 or more of the native SSIS connection managers? 
A.  OLEDB connection - Used to connect to any data source requiring an OLEDB connection
      (e.g., SQL Server 2000)
      Flat file connection - Used to make a connection to a single file in the File System.
      Required for reading information from a File System flat file
      ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005
      or other connection exposed through managed code (like C#) in a custom task
      Analysis Services connection - Used to make a connection to an Analysis Services
      database or project. Required for the Analysis Services DDL Task and Analysis Services
      Processing Task
      File connection - Used to reference a file or folder. The options are to either use or create
      a file or folder
      Excel
      FTP
      HTTP
      MSMQ
      SMO
      SMTP
      SQLMobile
      WMI


30) How do you eliminate quotes from being uploaded from a flat file to SQL Server?
A.   In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text
       qualifier field then preview the data to ensure the quotes are not included.
       Additional information: How to strip out double quotes from an import file in SQL Server
       Integration Services
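
What a text qualifier does can be seen with Python's csv module, where the quotechar argument plays a comparable role. This is only an analogy with made-up sample data, not the SSIS setting itself.

```python
import csv
import io

# Sample flat file content: every field is wrapped in double quotes,
# and one field contains the column delimiter.
raw = '"1","Smith, John","Seattle"\n"2","Lee","Portland"\n'

# Without a text qualifier: quotes are kept and the embedded comma splits the name.
naive_first_row = raw.splitlines()[0].split(",")

# With the double quote declared as the qualifier: quotes are stripped
# and the embedded comma stays inside the field.
qualified_rows = list(csv.reader(io.StringIO(raw), delimiter=",", quotechar='"'))
```

naive_first_row ends up with four pieces that still carry quote characters, while qualified_rows holds three clean fields per row, which is the effect to look for in the preview pane.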

31) Can you name 5 or more of the main SSIS tool box widgets and their functionality? 
A.  For Loop Container 
      Foreach Loop Container 
      Sequence Container 
      ActiveX Script Task 
      Analysis Services Execute DDL Task 
      Analysis Services Processing Task 
      Bulk Insert Task 
      Data Flow Task 
      Data Mining Query Task 
      Execute DTS 2000 Package Task 
      Execute Package Task 
      Execute Process Task 
      Execute SQL Task 
      etc.


32) Can you explain one approach to deploy an SSIS package?
A.  One option is to build a deployment manifest file in BIDS, then copy the directory to the
      applicable SQL Server, then work through the steps of the Package Installation Wizard.
      A second option is to use the dtutil utility to copy, move, rename, or delete an SSIS package.
      A third option is to log in to SQL Server Integration Services via SQL Server Management
      Studio, navigate to the 'Stored Packages' folder, then right click on one of the
      child folders or an SSIS package to access the 'Import Packages...' or 'Export
      Packages...' option.
      A fourth option in BIDS is to navigate to File | Save Copy of Package and complete the
      interface.

33) Can you explain how to setup a checkpoint file in SSIS? 
A. The following items need to be configured on the properties tab for the SSIS package:
     CheckpointFileName - Specify the full path to the checkpoint file that the package uses to
     save the value of package variables and log completed tasks. Rather than using a
     hard-coded path, it's a good idea to use an expression that concatenates a path
     defined in a package variable and the package name.
     CheckpointUsage - Determines if/how checkpoints are used. Choose from these options:
     Never (default), IfExists, or Always. Never indicates that you are not using checkpoints.
     IfExists is the typical setting and implements the restart-at-the-point-of-failure behavior.
     If a checkpoint file is found, it is used to restore package variable values and restart at
     the point of failure. If a checkpoint file is not found, the package starts execution with
     the first task. The Always choice raises an error if the checkpoint file does not exist.
     SaveCheckpoints - Choose from these options: True or False (default). You must select
     True to implement the checkpoint behavior.
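
The IfExists behavior can be simulated in a few lines of Python. This is a conceptual sketch only: the JSON file format, the task names, and the run_with_checkpoint helper are inventions for the example and do not reflect the real checkpoint file layout.

```python
import json
import os
import tempfile


def run_with_checkpoint(tasks, checkpoint_path):
    """Run tasks in order, skipping those recorded as completed in the checkpoint file."""
    completed = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as handle:
            completed = json.load(handle)
    executed = []
    for name, action in tasks:
        if name in completed:
            continue  # finished in an earlier, failed run
        action()      # an exception here leaves the checkpoint file on disk
        executed.append(name)
        completed.append(name)
        with open(checkpoint_path, "w") as handle:
            json.dump(completed, handle)
    # Successful run: the checkpoint file is deleted.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
    return executed


checkpoint_path = os.path.join(tempfile.mkdtemp(), "package.chk")
state = {"fail_load": True}


def load():
    if state["fail_load"]:
        raise RuntimeError("load failed")


tasks = [("extract", lambda: None), ("load", load), ("cleanup", lambda: None)]

try:
    run_with_checkpoint(tasks, checkpoint_path)  # fails at "load"
except RuntimeError:
    pass
file_kept_after_failure = os.path.exists(checkpoint_path)

state["fail_load"] = False
rerun_executed = run_with_checkpoint(tasks, checkpoint_path)  # resumes at "load"
```

The rerun skips "extract" and starts at "load", and the file disappears after the successful run, matching the behavior described for IfExists.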

34) Can you explain different options for dynamic configurations in SSIS? 
A.  Use an XML file 
      Use custom variables 
      Use a database per environment with the variables 
      Use a centralized database with all variables  


35) How do you upgrade an SSIS Package? 
A.   Depending on the complexity of the package, one of two techniques is typically used:
       Recode the package based on the functionality in SQL Server DTS
       Use the Migrate DTS 2000 Package wizard in BIDS, then recode any portion of the
       package that is not accurate


36) Can you name five of the Perfmon counters for SSIS and the value they provide? 
A.   SQLServer:SSIS Service
       SSIS Package Instances - Total number of simultaneous SSIS Packages running
       SQLServer:SSIS Pipeline
       BLOB bytes read - Total bytes read from binary large objects during the monitoring period.
       BLOB bytes written - Total bytes written to binary large objects during the monitoring
       period.
       BLOB files in use - Number of binary large objects files used during the data flow task
       during the monitoring period.
       Buffer memory - The amount of physical or virtual memory used by the data flow task
       during the monitoring period.
       Buffers in use - The number of buffers in use during the data flow task during the
       monitoring period.
       Buffers spooled - The number of buffers written to disk during the data flow task during
       the monitoring period.
       Flat buffer memory - The total number of blocks of memory in use by the data flow task
       during the monitoring period.
       Flat buffers in use - The number of blocks of memory in use by the data flow task at a
       point in time.
       Private buffer memory - The total amount of physical or virtual memory used by data
       transformation tasks in the data flow engine during the monitoring period.
       Private buffers in use - The number of blocks of memory in use by the transformations
       in the data flow task at a point in time.
       Rows read - Total number of input rows in use by the data flow task at a point in time.
       Rows written - Total number of output rows in use by the data flow task at a point in time.
