Transformer Stage Example
The Transformer stage is one of a processing stage. It appears under the processing category in the tool palette.
Transformer stage allows us to create transformations to apply to our data. These transformations can be simple or complex and can be applied to individual columns in our data. Transformations are specified using a set of functions.
Transformer stages can have a single input and any number of outputs. It can also have a reject link that takes any rows which have not been written to any of the outputs links by reason of a write failure or expression evaluation failure.
The data refining, validation and mapping part of the process is mainly handled by a transformer stage. Transformer stage doesn’t extract or write data to a target database. It handles extracted data, performs conversions, mappings, validations, passes values and controls the data flow.
Transformer Stage will consist of:
Stage variables: An intermediate processing variable that retains value during read and doesn’t pass the value into target column.
Constraints: Constraints are conditions that are either true or false that specifies flow of data with a link and filter out data.
Derivations: It is an expression that specifies value to be passed on to the target column.
Column Mappings: means mapping of columns from source to target so that data flows down the target from source.
The order of execution of these in transformer stage :
1. Stage Variables
4. Column mapping
Functions provided by Transformer stage:
· String Function
· Number Functions
· Raw Functions
· Vector Function
· Type Conversion Functions
· Utility Functions
Here in our example we are having employee.txt as an input and we are going to perform some Null handling and applying string functions along with defining derivations to the columns and pulling data to target.
Design the job as shown below:
Open Transformer properties window by either double click on to the stage or Right clickàselect properties from drop down menus.
Select the columns need to be pulled from source and drag and drop it to target columns.
We can create new columns as per our requirement as well like shown in below. (Creating Fullname by concatenating FirstName+MiddleName_LastName)
We can also change the datatype for columns like HireDate and DateofBirth similar to below.
Open the Stage variables properties by Right clickà stage variable properties as shown below.
Create the Stage variable providing the Name for variable, SQL type, initial value if required as below.
Click OK it creates the Stage variable.
Post stage variable creation, provide the derivation as below:
Open Stage constraints by clicking on the option at the left corner to the above of Transformer stage properties window similar to below.
Enter the constraints for the Links which will make flow of data to the targets according to the constraints provided.
Under Transformer Properties window, we have defined stage variable, Constraints, Derivations and column mappings and all of these will look like similar to shown below.
Configure the target Dataset file stage accordingly to store the result in a file.
Save the job, compile and run it. Post successful completion the job will look like:
Output Data:(Target Data)
Reject Data:(On the basis of Key column employeeID)