SAP ETL Questions part I


1. What are the extractor types?
Application Specific
o BW Content: FI, HR, CO, SAP CRM, LO Cockpit
o Customer-Generated Extractors: LIS, FI-SL, CO-PA
Cross Application (Generic Extractors)
o DB View, InfoSet, Function Module
2. What are the steps involved in LO Extraction?
The steps are:
o RSA5: Select the DataSources
o LBWE: Maintain DataSources and activate extract structures
o LBWG: Delete setup tables
o OLI*BW: Fill setup tables (for example, OLI7BW for sales orders)
o RSA3: Check extraction and the data in setup tables
o LBWQ: Check the extraction queue
o LBWF: Log for LO extract structures
o RSA7: BW delta queue monitor
3. How to create a connection with LIS InfoStructures?
Use transaction LBW0 to connect LIS InfoStructures to BW.
4. What is the difference between ODS and InfoCube and MultiProvider?
ODS: Provides granular data and allows overwriting. The data is stored in transparent tables. Ideal for drilldown and for the Report-to-Report Interface (RRI).
CUBE: Follows the star schema. We can only append data. Ideal for primary reporting.
MultiProvider: Does not hold physical data. It gives access to data from different InfoProviders (InfoCube, ODS, InfoObject). It is also preferred for reporting.
5. What are Start routines, Transfer routines and Update routines?
Start Routines: The start routine is run for each DataPackage after the data has been written to the PSA and before the transfer rules have been executed. It allows complex computations for a key figure or a characteristic. It has no return value. Its purpose is to execute preliminary calculations and to store them in global DataStructures. This structure or table can be accessed in the other routines. The entire DataPackage in the transfer structure format is used as a parameter for the routine.
Transfer / Update Routines: These are defined at the InfoObject level. Like the start routine, they are independent of the DataSource. We can use them to define global data and global checks.
6. What is the difference between start routine and update routine, when, how and why are they called?
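The start routine is called once per DataPackage, before the individual rules run, and gives access to all records in the package. Update routines are called while the data targets are being updated and compute the value of a single key figure or characteristic.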
7. What is the table that is used in start routines?
The table always has the structure of an ODS object or InfoCube. For example, for an ODS object it has the structure of the active table.
8. Explain how you used Start routines in your project?
Start routines are used for mass processing of records. In the start routine, all the records of the DataPackage are available, so we can process them together. In one scenario, we wanted to apply size percentages to the forecast data. For example, suppose material M1 is forecast at 100 in May. After applying the size percentages (Small 20%, Medium 40%, Large 20%, Extra Large 20%), we wanted four records for each single record arriving in the DataPackage. This is achieved in the start routine.
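The size split described above can be sketched outside ABAP. The following is a minimal Python illustration, not the project's actual start routine; the record layout (material, month, quantity) and the function name split_by_size are assumptions made for the example.

```python
from decimal import Decimal

# Size shares from the example above; they add up to 100%.
SIZE_SHARES = {
    "Small": Decimal("0.20"),
    "Medium": Decimal("0.40"),
    "Large": Decimal("0.20"),
    "Extra Large": Decimal("0.20"),
}


def split_by_size(data_package):
    """Return a new package with one record per size for every input record.

    The whole package is processed in one pass, which mirrors how a start
    routine sees all records of a DataPackage at once.
    """
    result = []
    for record in data_package:
        for size, share in SIZE_SHARES.items():
            new_record = dict(record)          # copy the incoming fields
            new_record["size"] = size          # hypothetical extra field
            new_record["quantity"] = record["quantity"] * share
            result.append(new_record)
    return result


# Hypothetical incoming record: material M1 forecast at 100 in May.
package = [{"material": "M1", "month": "05", "quantity": Decimal("100")}]
for row in split_by_size(package):
    print(row["material"], row["size"], row["quantity"])
```

Decimal is used instead of float so that the four parts add up exactly to the original quantity.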
9. What are Return Tables?
When we want to return multiple records instead of a single value, we use the return table in the update routine. Example: if we have the total telephone expense for a cost center, a return table lets us produce one record with the expense per employee.
10. How do start routine and return table synchronize with each other?
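The start routine runs first, once per DataPackage. The return table is filled afterwards, inside an update routine, to return several records instead of a single value. Values that the start routine stored in global data can be read when filling the return table.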
11. What is the difference between V1, V2 and V3 updates?
V1 Update: It is a Synchronous update. Here the Statistics update is carried out at the same time as the document update (in the application tables).
V2 Update: It is an Asynchronous update. Statistics update and the Document update take place as different tasks.
o V1 & V2 don’t need scheduling.
Serialized V3 Update: The V3 collective update must be scheduled as a job (via LBWE). Here, document data is collected in the order it was created and transferred into the BW as a batch job. The transfer sequence may not be the same as the order in which the data was created in all scenarios. V3 update only processes the update data that is successfully processed with the V2 update.
12. What is compression?
Compression removes the Request IDs from the InfoCube data. Records that then have identical keys are combined into one, which saves space.
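The effect of compression can be pictured with a small Python sketch. This is only an illustration of the idea, assuming that rows with the same characteristic values are summed once the request ID is dropped; the field names and the function compress are hypothetical.

```python
from collections import defaultdict


def compress(request_rows):
    """Drop the request ID and sum rows that share the same remaining key."""
    totals = defaultdict(int)
    for row in request_rows:
        key = (row["material"], row["month"])   # key without the request ID
        totals[key] += row["quantity"]
    return [
        {"material": material, "month": month, "quantity": quantity}
        for (material, month), quantity in sorted(totals.items())
    ]


# Hypothetical uncompressed rows: two requests loaded data for M1 in May.
uncompressed = [
    {"request": "REQ1", "material": "M1", "month": "05", "quantity": 100},
    {"request": "REQ2", "material": "M1", "month": "05", "quantity": 20},
    {"request": "REQ2", "material": "M2", "month": "05", "quantity": 5},
]
for row in compress(uncompressed):
    print(row)
```

After compression the individual requests can no longer be told apart, so a single request cannot be deleted from the compressed data.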
13. What is Rollup?
This is used to load new DataPackages (requests) into the InfoCube aggregates. If we have not performed a rollup then the new InfoCube data will not be available while reporting on the aggregate.
14. What is table partitioning and what are the benefits of partitioning in an InfoCube?
Partitioning divides a table into smaller parts so that data can be located quickly. SAP uses fact table partitioning to improve performance. We can partition only on 0CALMONTH or 0FISCPER. Table partitioning helps reports run faster because data is stored in the relevant partitions. Table maintenance also becomes easier. Oracle, Informix and IBM DB2/390 support table partitioning, while SAP DB, Microsoft SQL Server and IBM DB2/400 do not.
15. How many extra partitions are created and why?
Two extra partitions are created: one for data before the begin date and one for data after the end date.
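As a rough illustration of the count, the sketch below assumes partitioning on 0CALMONTH with one partition per month in the chosen range and no limit on the number of partitions. The function names and the (year, month) tuples are assumptions for the example.

```python
def partition_count(begin, end):
    """Number of partitions for an inclusive (year, month) range.

    One partition per month in the range, plus two extra partitions for
    data before the begin month and after the end month.
    """
    months = (end[0] - begin[0]) * 12 + (end[1] - begin[1]) + 1
    if months < 1:
        raise ValueError("end must not be earlier than begin")
    return months + 2


def partition_for(calmonth, begin, end):
    """Return which partition a (year, month) value falls into."""
    if calmonth < begin:
        return "before range"
    if calmonth > end:
        return "after range"
    return "%04d.%02d" % calmonth


print(partition_count((2004, 1), (2005, 12)))
print(partition_for((2003, 11), (2004, 1), (2005, 12)))
```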
16. What are the options available in transfer rule?
InfoObject
Constant
Routine
Formula
17. How would you optimize the dimensions?
We should define as many dimensions as possible and we have to take care that no single dimension crosses more than 20% of the fact table size.
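The 20% rule of thumb can be checked with simple arithmetic once the row counts are known. The sketch below is a Python illustration with made-up table names and row counts; in a real system the counts would come from the database statistics.

```python
def oversized_dimensions(fact_rows, dimension_rows, threshold=0.20):
    """Return dimensions whose row count exceeds the threshold share
    of the fact table row count, with their ratio."""
    if fact_rows <= 0:
        raise ValueError("fact_rows must be positive")
    return {
        name: rows / fact_rows
        for name, rows in dimension_rows.items()
        if rows / fact_rows > threshold
    }


# Hypothetical row counts for one InfoCube.
dimensions = {"customer": 250_000, "material": 40_000, "time": 36}
print(oversized_dimensions(1_000_000, dimensions))
```

A dimension flagged this way is a candidate for splitting its characteristics across several dimensions.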
18. What are Conversion Routines for units and currencies in the update rule?
Using this option we can write ABAP code for unit or currency conversion. If we enable this flag, the unit of the key figure appears in the ABAP code as an additional parameter. For example, we can convert units from pounds to kilograms.
19. Can an InfoObject be an InfoProvider, how and why?
Yes, when we want to report on Characteristics or Master Data. We have to right click on the InfoArea and select “Insert characteristic as data target”. For example, we can make 0CUSTOMER as an InfoProvider and report on it.
20. What is Open Hub Service?
The Open Hub Service enables us to distribute data from an SAP BW system into external Data Marts, analytical applications, and other applications. We can ensure controlled distribution using several systems. The central object for exporting data is the InfoSpoke. We can define the source and the target object for the data. BW becomes a hub of an enterprise data warehouse. The distribution of data becomes clear through central monitoring from the distribution status in the BW system.
21. How do you transform Open Hub Data?
Using BADI we can transform Open Hub Data according to the destination requirement.
22. What is ODS?
The Operational Data Store (ODS) is used for detailed storage of data. We can overwrite data in the ODS. The data is stored in transparent tables.
23. What are BW Statistics and what is its use?
They are a group of Business Content InfoCubes used to measure performance for query and load monitoring. They also show the usage of aggregates, OLAP and warehouse management.
24. What are the steps to extract data from R/3?
Replicate DataSources
Assign InfoSources
Maintain Communication Structure and Transfer rules
Create an InfoPackage
Load Data
25. What are the delta options available when you load from flat file?
The 3 options for Delta Management with Flat Files:
o Full Upload
o New Status for Changed records (ODS Object only)
o Additive Delta (ODS Object & InfoCube)
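The difference between the last two options can be shown with a small Python sketch. It assumes that "new status" records carry the full new value and overwrite the stored one, while "additive delta" records carry only the change and are summed. The dictionary target and the function names are hypothetical.

```python
def apply_new_status(target, records):
    """Each record holds the new full value: overwrite what is stored."""
    for key, value in records:
        target[key] = value
    return target


def apply_additive_delta(target, records):
    """Each record holds only the change: add it to what is stored."""
    for key, value in records:
        target[key] = target.get(key, 0) + value
    return target


# Order A currently has quantity 100 and changes to 120.
print(apply_new_status({"A": 100}, [("A", 120)]))
print(apply_additive_delta({"A": 100}, [("A", 20)]))
```

Both calls end with the same stored value. Overwriting needs a target that supports it, which is why the "new status" option is limited to ODS objects.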
26. What are the inputs for an InfoSet?
The inputs for an InfoSet are ODS objects and InfoObjects (with master data or text).
27. What internally happens when BW objects like InfoObject, InfoCube or ODS are created and activated?
When an InfoObject, InfoCube or ODS object is created, BW maintains a saved version of that object but does not make it available for use. Once the object is activated, BW creates an active version that is available for use.
28. What is the maximum number of key fields that you can have in an ODS object?
16
29. What is the importance of 0REQUID?
It is the InfoObject for the Request ID. 0REQUID enables BW to distinguish between data records that were loaded with different requests.
30. Can you add programs in the scheduler?
Yes. Through event handling.
31. What does a Data IDoc contain?
Data IDoc contains:
o Control Record: Contains administrative information such as receiver, sender and client.
o Data Record
o Status Record: Describes the status of the record, e.g., modified.
32. What is the importance of the table ROIDOCPRMS?
It holds the IDoc parameters of the source system. This table contains the details of the data transfer, such as the source system of the data, the data packet size and the maximum number of lines in a data packet. The contents of this table, including the data packet size, can be changed through the control parameters option in SBIW.
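To show what a maximum number of lines per data packet means in practice, here is a minimal Python sketch. It only considers the line limit and ignores the size limit; the function name and the sample numbers are assumptions for the example.

```python
def split_into_packets(records, max_lines):
    """Split extracted records into packets of at most max_lines each."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    return [
        records[start:start + max_lines]
        for start in range(0, len(records), max_lines)
    ]


# Ten extracted records with a hypothetical limit of four lines per packet.
packets = split_into_packets(list(range(10)), max_lines=4)
print([len(packet) for packet in packets])
```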
33. When is IDOC data transfer used?
IDOCs are used for communication between logical systems such as SAP R/3, R/2 and non-SAP systems using ALE, and for communication between an SAP R/3 system and a non-SAP system. In BW, an IDOC is a data container for data exchange between SAP systems, or between SAP systems and external systems, based on an EDI interface. IDOCs support a limited size of 1000 bytes. For this reason IDOCs are not used when loading data into the PSA, since the data there is more detailed. They are used when the size is less than 1000 bytes.
34. When an ODS is in 'overwrite' mode, does uploading the same data again and again create new entries in the change log each time data is uploaded?
No.
35. What is the function of 'selective deletion' tab in the manage contents of an InfoCube?
It allows us to select a particular value of a particular field and delete its contents.
36. When we collapse an InfoCube, is the consolidated data stored in the same InfoCube or is it stored in the new InfoCube?
When the cube is collapsed, the data stays in the same cube. The data is stored in the F table before compression and in the E table after compression. Both tables belong to the same cube.
37. What happens when you load transaction data without loading master data?
The transaction data gets loaded and the master data fields remain blank.
38. When given a choice between using an InfoCube and a MultiProvider, what factors to consider before making a decision?
We first have to check whether the InfoCubes are used individually. If they are often used individually, it is better to use a MultiProvider over several InfoCubes, since a query on an individual InfoCube runs faster than a query on one big InfoCube with a lot of data.
39. How many hierarchy levels can be created for a characteristic InfoObject?
Maximum of 98 levels.
40. What is the function of 'reconstruction' tab in an InfoCube?
It reconstructs the deleted requests from the InfoCube. If a request has been deleted and we want the data records of that request to be added to the InfoCube, we can use the reconstruction tab to add those records. It goes to the PSA and brings the data to the InfoCube.
41. What are secondary indexes with respect to InfoCubes?
It is an Index created in addition to the primary index of the InfoCube. When you activate a table in the ABAP Dictionary, an index is created on the primary key fields of the table. Further indexes created for the table are called secondary indexes.
42. What is DB Connect and where is it used?
DB Connect lets BW open a connection to an external database management system and load data from its tables or views. It is used to bring data from third-party databases into BW for reporting.
43. What is the common method of finding the tables used in any R/3 extraction?
By using the transaction LISTSCHEMA we can navigate the tables.
44. What is the difference between table view and InfoSet query?
An InfoSet Query is a query using flat tables while a view table is a view of one or more existing tables. Parts of these tables are hidden, and others remain visible.
45. How to load data from one InfoCube to another InfoCube?
Through DataMarts data can be loaded from one InfoCube to another InfoCube.
46. What is the difference between extract structure and DataSource?
A DataSource defines the data that a source system provides to BW. After the DataSource is replicated, the transfer rules for it are defined in BW.
The extract structure is the record layout (field list) in which the extractor delivers the data.
The extract structure is defined in the source system.
47. What is entity relationship model in data modeling?
An ERD (Entity Relation Diagram) can be used to generate a physical database.
It is a high level data model.
It is a schematic that shows all the entities within the scope of integration and the direct relationship between the entities.
48. What is DataMining concept?
Process of finding hidden patterns and relationships in the data.
With typical data analysis requirements fulfilled by data warehouses, business users have an idea of what information they want to see.
Some opportunities embody data discovery requirements, where the business user wants to correlate sets of data to determine anomalies or patterns in the data.
49. How does the time dependency work for BW objects?
Time Dependent attributes have values that are valid for a specific range of dates (i.e., valid period).
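A lookup of a time-dependent attribute can be sketched as a search over validity intervals. The Python example below is only an illustration; the interval layout (valid from, valid to, value), the function name and the sample data are assumptions.

```python
from datetime import date


def attribute_on(intervals, key_date):
    """Return the attribute value whose validity period contains key_date,
    or None if no period matches."""
    for valid_from, valid_to, value in intervals:
        if valid_from <= key_date <= valid_to:
            return value
    return None


# Hypothetical history: an employee moves to another cost center in July.
cost_center_history = [
    (date(2004, 1, 1), date(2004, 6, 30), "CC100"),
    (date(2004, 7, 1), date(9999, 12, 31), "CC200"),
]
print(attribute_on(cost_center_history, date(2004, 3, 15)))
print(attribute_on(cost_center_history, date(2005, 1, 1)))
```

The date used for the lookup decides which value a report shows, so the same employee can appear under different cost centers for different dates.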
50. What is I_ISOURCE?
Name of the InfoSource