Wednesday 11 February 2015

30 TOP Ab Initio Interview Questions and Answers

Below are some important Ab Initio interview questions that are commonly asked in interviews at multinational companies, for both beginners and experienced professionals.

1.What is the difference between rollup and scan?
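Rollup produces one summary record per key group, so it cannot generate cumulative summary records. For those we use scan, which outputs a record for every input record, carrying the running summary for its group up to that point.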
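
The difference can be illustrated outside Ab Initio with a small Python sketch. This is not Ab Initio DML; the function names, the field names (region, amount) and the sample records are all hypothetical, and the input is assumed to be already sorted by key.

```python
from itertools import groupby
from operator import itemgetter


def rollup(records, key, field):
    """Rollup-style: one summary record per key group (input sorted by key)."""
    return [
        {key: k, "total": sum(r[field] for r in group)}
        for k, group in groupby(records, key=itemgetter(key))
    ]


def scan(records, key, field):
    """Scan-style: one output per input record, with a running total per group."""
    out = []
    for k, group in groupby(records, key=itemgetter(key)):
        running = 0
        for r in group:
            running += r[field]
            out.append({key: k, field: r[field], "running_total": running})
    return out


# Hypothetical sample data, sorted by the key field.
sales = [
    {"region": "east", "amount": 10},
    {"region": "east", "amount": 5},
    {"region": "west", "amount": 7},
]

rollup_result = rollup(sales, "region", "amount")
scan_result = scan(sales, "region", "amount")
```

Here rollup_result holds two records (one per region), while scan_result holds three (one per input record), each with the cumulative amount so far within its region.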

2.What is the difference between partitioning with key and round robin?
PARTITION BY KEY:
In this, we have to specify the key on which the partitioning will occur. Records with the same key value always go to the same partition, so how well the data is balanced depends on how evenly the key values are distributed. It is useful for key-dependent parallelism.
PARTITION BY ROUND ROBIN:
In this, the records are partitioned sequentially, distributing data evenly in blocksize chunks across the output partitions. It is not key based and results in well-balanced data, especially with a blocksize of 1. It is useful for record-independent parallelism.
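
The two strategies can be sketched in Python as follows. This is only an illustration, not how Ab Initio implements its partition components: the function names are hypothetical, and the CRC32 hash is an assumption standing in for whatever hash the real component uses.

```python
import zlib


def partition_by_key(records, key, n):
    """Send each record to a partition chosen by hashing its key value."""
    parts = [[] for _ in range(n)]
    for r in records:
        # Assumed hash: CRC32 of the key's string form (stable across runs).
        idx = zlib.crc32(str(r[key]).encode()) % n
        parts[idx].append(r)
    return parts


def partition_round_robin(records, n, block_size=1):
    """Deal records out in turn, block_size records at a time per partition."""
    parts = [[] for _ in range(n)]
    for i, r in enumerate(records):
        parts[(i // block_size) % n].append(r)
    return parts


# Hypothetical sample data.
orders = [{"cust": c, "seq": i} for i, c in enumerate("ABABCA")]

by_key = partition_by_key(orders, "cust", 3)
rr_1 = partition_round_robin(list(range(6)), 3)
rr_2 = partition_round_robin(list(range(6)), 3, block_size=2)
```

With key partitioning, every record for a given customer lands in the same partition, but the partition sizes follow the key distribution. With round robin, the partition sizes are even regardless of content, and records sharing a key may be split across partitions.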

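3.How do you truncate a table?
There are several ways to do it.
1. Probably the easiest way is to use the Truncate Table component.
2. The Run SQL or Update Table components can be used to do the same thing.
3. The Run Program component can be used to run a command that truncates the table.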

4.What is the difference between a DB config and a CFG file?
A .dbc file has the information required for Ab Initio to connect to the database to extract or load tables or views, whereas a .cfg file is the table configuration file created by db_config when using components like Load DB Table.

5.Types of parallelism in detail.
There are 3 types of parallelism in Ab Initio.
1) Data parallelism:
The data is divided into partitions that are processed at the same time, often on different servers.
2) Pipeline parallelism:
In this, the records are processed in a pipeline, i.e. a component does not have to wait for all the records to be processed. Records that have been processed are passed on to the next component in the pipeline.
3) Component parallelism:
In this, two or more components process separate data in parallel.
Component parallelism :- A graph with multiple processes running simultaneously on separate data uses component parallelism.
Data parallelism :- A graph that deals with data divided into segments and operates on each segment simultaneously uses data parallelism. Nearly all commercial data processing tasks can use data parallelism. To support this form of parallelism, Ab Initio provides Partition components to segment data, and Departition components to merge segmented data back together.
Pipeline parallelism :- A graph with multiple components running simultaneously on the same data uses pipeline parallelism. Each component in the pipeline continuously reads from upstream components, processes data, and writes to downstream components. Since a downstream component can process records previously written by an upstream component, both components can operate in parallel.
NOTE: To limit the number of components running simultaneously, set phases in the graph.
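
The partition, process and departition flow behind data parallelism can be sketched in Python. This is a rough analogy only, assuming a thread pool in place of Ab Initio's partitions; the function names and the doubling transform are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n):
    """Partition step: split the records into n segments (round-robin style)."""
    return [records[i::n] for i in range(n)]


def process_segment(segment):
    """Hypothetical per-record transform, applied to one segment."""
    return [x * 2 for x in segment]


def run_data_parallel(records, n=3):
    segments = partition(records, n)
    # Each segment is handed to its own worker, so segments are processed concurrently.
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(process_segment, segments))
    # Departition step: merge the processed segments back into one flow.
    return [x for seg in results for x in seg]


doubled = run_data_parallel([1, 2, 3, 4, 5, 6], n=3)
```

The merged output contains every processed record, but not in the original order, because the segments are simply concatenated. That mirrors why the choice of departition method matters when ordering has to be preserved.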
