How To Use the Unpivot Transform in SSIS

Moving data from columns to rows can be a tricky task and accomplished in a variety of ways. That may be through a series of unions in T-SQL, but that’s just ugly and means lots of passes at the data potentially. You could use a T-SQL unpivot, which isn’t bad and usually my preferred method. The problem is you may not always have the option to use T-SQL. What if your source is a flat file or a database that doesn’t support unpivot? That’s where SSIS comes in handy. Data can run through an unpivot transform on its way to the destination database or file. In this example I am going to a SQL Server table as my source and destination just for simplicity. All my code can be found in line below so you can recreate this example as well.

For those wondering about compatibility, the unpivot transformation has been in SSIS since its introduction in 2005.

To start let’s get a couple tables set up with some data to unpivot. This code will create a table called SalesColumns and populate it with some data.

CREATE TABLE [dbo].[SalesColumns](
[CustomerID] [int] NULL,
[SaleAmount] [decimal](18, 2) NULL,
[TaxAmount] [decimal](18, 2) NULL,
[ShippingAmount] [decimal](18, 2) NULL
) ON [PRIMARY]

GO
INSERT [dbo].[SalesColumns] ([CustomerID], [SaleAmount], [TaxAmount], [ShippingAmount]) VALUES (1, CAST(100.00 AS Decimal(18, 2)), CAST(7.00 AS Decimal(18, 2)), NULL)
GO
INSERT [dbo].[SalesColumns] ([CustomerID], [SaleAmount], [TaxAmount], [ShippingAmount]) VALUES (2, CAST(150.00 AS Decimal(18, 2)), CAST(10.50 AS Decimal(18, 2)), CAST(5.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[SalesColumns] ([CustomerID], [SaleAmount], [TaxAmount], [ShippingAmount]) VALUES (3, CAST(85.00 AS Decimal(18, 2)), CAST(5.95 AS Decimal(18, 2)), CAST(2.00 AS Decimal(18, 2)))
GO
INSERT [dbo].[SalesColumns] ([CustomerID], [SaleAmount], [TaxAmount], [ShippingAmount]) VALUES (1, CAST(500.00 AS Decimal(18, 2)), CAST(35.00 AS Decimal(18, 2)), CAST(25.00 AS Decimal(18, 2)))
GO

The table should look like the image below. Notice that we have 3 customers across 4 sales each with different sales, tax and shipping quantities. I intentionally left one value NULL so we can see how the unpivot transform handles that value.

Image1

The next thing we will create is a table to load the data into. It will contain a column for the CustomerID, a column for the type of amount we are storing (Sale, Tax or Shipping in this case) and a column for the amount. We will call this table SalesRows.

CREATE TABLE [dbo].[SalesRows](
[CustomerID] [int] NULL,
[ChargeType] [nvarchar](255) NULL,
[Amount] [decimal](18, 2) NULL
) ON [PRIMARY]
GO

Now that we have a source with some data and a destination table it’s time to hop over to SSIS and build our unpivot transform into the data flow. I’m going to skip the step by step on getting our package up and running, but before you go any farther you should have an SSIS package started in your favorite version of SSIS. It should contain a data flow with an OLE_DB Source that selects all the data from the SalesColumns table. In my screenshot you will see the data flow is called DFT Unpivot Data, my source is called OLE_SRC SalesColumns and is pulling from the SalesColumns table. My connection manager is called OLEDB_TRANSFORMATION.

Image2

Once your source is configured the next thing to do is drag over an Unpivot transform. If you are in SSIS 2012 or SSIS 2014 this is going to be found in the “Other Transforms” section. Connect the OLE_DB source to the Unpivot. Inside the Unpivot transform editor you will see a list of your columns and a few properties on the bottom half of the screen.

Image3

Here is a rundown of the properties and what each does before we actually configure them.

  • Checking a box next to a column name indicates you want to unpivot that column (turn it from a column to a row), this causes the column to show up as an Input Column.
  • Checking the Pass Through box means you don’t want to unpivot the column. You want it to remain as a column on the other side of the transform.
  • Destination Column – The name of the field where the unpivoted data will be stored, this will be a new column output from the transform.
  • Pivot Key Value – By default this will populate with the name of the column. This will be the value that populates a new column (Pivot key value column name) in the output.
  • Pivot key value column name – By default this will be populated with Pivot Key Value. That’s lame, choose something better. The new output column that holds the Pivot Key Value.

Now to setup this transform. Check the box next to each column you would like to unpivot. In this case it is all of our amount columns. We want to keep the CustomerID on each row so we will keep that marked as a Pass Through but not check it off in the list of available columns.

Image4

We are going to change a few properties.

  • The column where these values will be stored in our destination table is called Amount. Change the destination column name to Amount for all the input columns.
  • I don’t want each row to be labeled SaleAmount, TaxAmount and ShippingAmount. I would like those to just be labeled Sale, Tax and Shipping (no amount). Make this change in the Pivot Key Value column.
  • In our destination table this is going to populate a field called ChargeType, so rename the Pivot key value column name to ChargeType.

The Unpivot transform editor should look like this:

Image5

Click OK to close the transform editor. Now connect the Unpivot transform to an OLE_DB destination (which I have renamed OLE_DEST SalesRows). Don’t worry about the error mark in my screenshot, it will go away after the next step.

Image6

Open the destination, select the destination table (SalesRows) and map the columns.

Image7

If we look at the Unpivot transformation editor side by side with the destination mapping it is easier to see how the settings correspond to the transformation in the data. The highlights in the image below show what each option in the transform turns into in the destination. The pass through option (highlighted in green) remains a column in the destination. The destination column (highlighted in red) becomes a new column in the destination that was not in the source. This single column holds the values that were previously help in multiple columns. Finally the Pivot Key Value (highlighted in blue) is what will populate the newly created column called ChargeType which is specified in the Pivot key value column name.

Image8

After running the package the resulting table will be loaded with 11 rows. Someone out there is probably thinking, if I take 4 rows of data and turn 3 columns into rows for each of those I should end up with 12 rows of data (4 source rows * 3 columns = 12 rows). The “missing” row is the NULL value we placed in the data at the beginning of this example. When a column is NULL the unpivot transformation will ignore that record and remove it from the data set.

Image9

There you have it. The power of unpivot in SSIS to take your columns and turn them into rows! The same can be accomplished through T-SQL if you happen to be using a database engine that allows it. For SQL Server you can check out the MSDN page here: http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx

Bradley Schacht

Bradley Schacht is a Cloud Solution Architect on the state and local government team with Microsoft based in Jacksonville, FL. He has co-authored 3 SQL Server books including "SQL Server 2014 Professional Administration". As a former consultant and trainer, he uses his experience on many parts of the Microsoft BI and data platform to help customers deliver the best possible solutions. Bradley frequently presents at community events around the country. He is a contributor to sites such as SQLServerCentral.com and an active member of the Jacksonville SQL Server User Group (JSSUG).

You may also like...

1 Response

  1. alex says:

    This is great! I have similar situation:
    Vehicle_Id, Date_time, Chan_1, ….. Chan_N.
    I want to achieve:

    Vehicle_id, Date/Time, Chan_1, Value
    Vehicle_id, Date/Time, Chan_2, Value
    …..
    Vehicle_id, Date/Time, Chan_N, Value

    Vehicle_Id does not change, Date/Time is a timestamp. Compared to your example I need one column – Costomer_Id, in my case I have 2 – Vehicle_id and Date/Time. Can I achieve the same result with two columns without conversion and concatenation. The input file is csv with data coming from sensors, I have Vehicle_Id, Date/Time and a value per a sensor channel.

Leave a Reply