Wednesday, 29 June 2011

Informatica B2B Data Transformation: Basic concepts (working with Parser)

The diagrams below show what a basic Parser looks like, along with the Marker and Content anchors. They also show the various ways of searching a file for a pattern.
To open a new blank project in DT, go to File -> New -> Project -> Data Transformation Project -> Blank project. Give the project a name, and you are ready to go. Initially you will have nothing but a blank code file (.tgp file) under the Scripts section on the left-hand side (Data Transformation Explorer). If you cannot see the Data Transformation Explorer, go to Window -> Show View -> Data Transformation Explorer.
Before you begin, prepare a basic XSD file (preferably using XML Spy) that contains a Root element with a Data element as its child.

Initial Stage (Only a blank code file)
Open the images in a new tab for a larger view.







A simple XSD file

It is easier to use a tool such as XML Spy to create a schema definition file, but if you are well versed in XML and XSD, you can code it directly here.
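For readers without the screenshot, here is a minimal sketch of what such a schema might look like, embedded in a short Python snippet that only checks that it is well-formed and lists the declared elements. The post fixes only the names Root and Data and the absence of a target namespace; the `xs:string` type and the helper name `declared_element_names` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal schema: a Root element with a Data child and no
# target namespace. The xs:string type is an assumption, not from the post.
XSD_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Data" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

XS = "{http://www.w3.org/2001/XMLSchema}"


def declared_element_names(xsd_text):
    """Return the names of all xs:element declarations in document order."""
    root = ET.fromstring(xsd_text.encode("utf-8"))
    return [el.get("name") for el in root.iter(XS + "element")]


print(declared_element_names(XSD_TEXT))
```

Only the text between the triple quotes belongs in the .xsd file; the Python around it is just a quick sanity check.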



The basic components of a Parser
This is an empty Parser. The image shows the various properties of a Parser; we will go deeper into each of them in the coming posts. The example source is a required element: it is the input to the Parser.



I used simple Text as the input
Other input options are also available, but for simplicity I used a Text input. After doing this, press Ctrl-F10, and you will see the input on the right-hand side, as shown in the next image.


Marker in Action
You can see that once I add a Marker that searches for "apple" to the body of the code, the word "apple" in the input file gets marked. You can use the property expander/minimizer to view the other properties of a Marker (Phase, Marking, etc.).
Note: You can add a Marker either by selecting the word on the right-hand side and choosing "insert Marker" from the right-click menu, or by typing Marker in the code area and pressing Enter.


Content Anchor
Next, if we add a Content anchor by typing Content in the code area below the Marker, the text "very much" is highlighted in a different color. This denotes the part to be captured and put into the XSD element. For now, the data holder of the Content is blank, but we will fill it in eventually; otherwise the project will be invalid. Also note the disable and optional features, which help you exclude parts of the code during execution.


Marker Content Marker
Here we can see how Markers act as borders for the content. We define one more Marker, which searches for "much", just below the Content anchor, and the scope of the Content anchor is then limited to the word "very" (along with the surrounding spaces). Next, we will place this word into a Data element in the target XSD.
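The bordering behaviour can be mimicked in a few lines of Python. This is only a rough analogy, not how the DT engine is implemented: the sample sentence is a hypothetical input (the post does not show the exact text), and `capture_between` is a made-up helper name.

```python
def capture_between(text, opening, closing=None):
    """Return the text after `opening` and, if given, before `closing`.

    With no closing marker the capture runs to the end of the text, which
    mirrors how the Content anchor's scope shrinks once a second Marker
    is placed after it.
    """
    start = text.find(opening)
    if start == -1:
        return None  # opening marker not found
    start += len(opening)
    if closing is None:
        return text[start:]
    end = text.find(closing, start)
    if end == -1:
        return None  # closing marker not found after the opening one
    return text[start:end]


# Hypothetical input text; the post only tells us it contains
# "apple", "very" and "much" in that order.
SAMPLE = "I like apple very much"

print(repr(capture_between(SAMPLE, "apple")))          # ' very much'
print(repr(capture_between(SAMPLE, "apple", "much")))  # ' very '
```

With only the opening marker the capture is " very much"; adding the closing marker narrows it to " very ", spaces included.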




XSD element in data holder
When we click on the data holder of the Content anchor, a new pop-up box opens. It shows the schemas and variables that are available in the project, including the local variables, which cover the system variables as well as any custom variables you have created. For now, my XSD has no target namespace, so I selected the Data element from there.


Once you are done with the above steps, save the code and press Ctrl-F10 to ensure that the markings on the input file are right. If you want to experiment with properties such as Marking, you can do that too; you will understand the concept better by experimenting with the options available. To run the project, go to Run in the menu bar or press F5 or Ctrl-F5. Once you do that, you will see the Events tab below the code window filled with the execution steps, and an output file under the Results section on the left-hand side. Open the output file to view the desired XML file.


The final output file
You can browse through the execution in the Events tab to understand how the back-end engine works. You will observe that the Markers are executed first, as they are in the initial phase, followed by the Content, which is in the main phase. You will also see that the Data element contains the word "very", which verifies that the code works.
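As a loose illustration of that ordering, the sketch below locates both Markers first and only then extracts the Content and writes it into a Root/Data document. It is an analogy in plain Python, not the DT engine: the function name `run_two_phase`, the sample sentence, and the trimming of surrounding spaces are all assumptions.

```python
import xml.etree.ElementTree as ET


def run_two_phase(text, opening, closing):
    """Mimic the initial-phase / main-phase order described in the post."""
    # "Initial phase": locate both markers before touching any content.
    open_at = text.find(opening)
    if open_at == -1:
        raise ValueError("opening marker not found")
    content_start = open_at + len(opening)
    close_at = text.find(closing, content_start)
    if close_at == -1:
        raise ValueError("closing marker not found")

    # "Main phase": capture the content between the markers and store it
    # in the Data element. Trimming whitespace is an assumption made so
    # the result matches the "very" reported in the post.
    content = text[content_start:close_at].strip()
    root = ET.Element("Root")
    ET.SubElement(root, "Data").text = content
    return ET.tostring(root, encoding="unicode")


# Hypothetical input text, not taken from the post.
print(run_two_phase("I like apple very much", "apple", "much"))
```

The real output file will likely also carry an XML declaration and formatting that this sketch leaves out.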


This post showed how a Parser works. The example here is very basic, but it is the foundation for the more complex things that can be handled by a Parser in DT. Please note the important concepts here: the Marking property, the Phase property, the data holder, and the opening and closing Markers.
If you happen to pass by and have any doubts, please drop a comment on this post, and I will get back to you.

3 comments:

  1. Hi,

    I have a question on B2B Data Transformation, as follows:
    When I try to create a new project in DT, I get the error "Could not load file or assembly 'Interop.DSOFile, Version=2.1.0.0, Culture=neutral, PublicKeyToken=513f10b5fc075764' or one of its dependencies. The system cannot find the file specified."

    Could you please help me out?

    Thanks,
    Chaitanya

    1. Hi Chaitanya,
      I am not sure about the error. I searched independently on the net and saw that the file "Interop.DSOFile" is used by Microsoft Office applications. Maybe if you download this file separately, the problem will be fixed.

      Regards
