Mastering SSIS-950: Efficient Data Transformation with Custom Scripts

Espresso

Below is a detailed example of how you would create an SSIS package to extract data from a source SQL Server table, transform it by adding a new column, and then load it into a destination SQL Server table. The example includes SQL scripts for creating the necessary tables and sample data, SSIS expressions for transformations, and C# code for custom scripts.

1. Create the Source and Destination Tables

You’ll first need to set up the source and destination tables in your SQL Server database.

-- Create the source table
CREATE TABLE SalesOrders (
    OrderID INT PRIMARY KEY,
    OrderDate DATE,
    CustomerName NVARCHAR(100),
    TotalAmount DECIMAL(18, 2)
);

-- Insert some sample data into the source table
INSERT INTO SalesOrders (OrderID, OrderDate, CustomerName, TotalAmount)
VALUES
    (1, '2024-08-01', 'John Doe', 150.00),
    (2, '2024-08-05', 'Jane Smith', 250.50),
    (3, '2024-08-10', 'Alice Brown', 300.00);

-- Create the destination table
CREATE TABLE SalesOrdersProcessed (
    OrderID INT PRIMARY KEY,
    OrderDate DATE,
    CustomerName NVARCHAR(100),
    TotalAmount DECIMAL(18, 2),
    OrderProcessedDate DATE
);

2. Create the SSIS Package in SQL Server Data Tools (SSDT)

Step 1: Add a Data Flow Task

  1. Open SQL Server Data Tools (SSDT) and create a new SSIS project.
  2. Drag and drop a Data Flow Task onto the Control Flow tab.

Step 2: Configure the Data Flow Task

  1. Double-click on the Data Flow Task to open the Data Flow Designer.

Step 3: Set Up the OLE DB Source

  1. Drag an OLE DB Source onto the design surface.
  2. Double-click the OLE DB Source to configure it.
  3. Set up a connection to your SQL Server database and select the SalesOrders table as the source.
  4. Click Columns to select the columns you want to include (e.g., OrderID, OrderDate, CustomerName, TotalAmount).

Step 4: Add a Derived Column Transformation

  1. Drag a Derived Column Transformation onto the design surface.
  2. Connect the OLE DB Source to the Derived Column Transformation.
  3. Double-click the Derived Column Transformation to configure it.
    • Derived Column Name: OrderProcessedDate
    • Derived Column: Add as a new column
    • Expression: GETDATE()
    • Data Type: DT_DATE
    This expression will add the current date to the OrderProcessedDate column.

Step 5: Configure the OLE DB Destination

  1. Drag an OLE DB Destination onto the design surface.
  2. Connect the Derived Column Transformation to the OLE DB Destination.
  3. Double-click the OLE DB Destination to configure it.
  4. Set up a connection to your SQL Server database and select the SalesOrdersProcessed table as the destination.
  5. Click Mappings to map the source columns to the destination columns.

3. C# Script for Custom Transformation (Optional)

If you need to perform a more complex transformation, you can use a Script Component. Below is an example of how you could use a C# script to calculate a discount on the TotalAmount column.

Step 1: Add a Script Component

  1. Drag a Script Component onto the Data Flow design surface.
  2. When prompted, select Transformation as the script type.

Step 2: Configure the Script Component

  1. Double-click the Script Component to configure it.
  2. In the Inputs and Outputs section, add a new output column called DiscountedTotalAmount.
  3. Set the data type of DiscountedTotalAmount to decimal[DT_DECIMAL].

Step 3: Write the C# Script

Click Edit Script to open the script editor. Here’s a sample script:

csharpCopy codepublic override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Apply a 10% discount to the TotalAmount
    Row.DiscountedTotalAmount = Row.TotalAmount * 0.9M;

    // Add the current date to the OrderProcessedDate column
    Row.OrderProcessedDate = DateTime.Now;
}

This script calculates a 10% discount on the TotalAmount and adds the current date to OrderProcessedDate.

4. Execute the SSIS Package

Once you’ve configured the Data Flow, you can execute the SSIS package:

  1. Save the package.
  2. Right-click the package and select Execute.
  3. Monitor the execution progress in the Execution Results tab.

5. Verify the Results

Run the following SQL query to verify that the data has been successfully transformed and loaded into the SalesOrdersProcessed table:

sqlCopy codeSELECT * FROM SalesOrdersProcessed;

You should see the original data from SalesOrders, with the addition of the OrderProcessedDate column (containing the current date) and, if you added the custom script, the DiscountedTotalAmount column.

Summary

This example walked you through the process of setting up an SSIS package that extracts data from a source table, applies transformations (including custom C# scripts), and loads the data into a destination table. This setup is highly customizable and can be adapted to meet more complex ETL requirements in your data integration projects.

Leave a Comment