Monday, January 07, 2008

Fwd: Kimball's DW interview question sample

Sample Interview Questionnaire
Analysis
1. What is a logical data mapping and what does it mean to the ETL team?
2. What are the primary goals of the data discovery phase of the data warehouse project?
3. How is the system-of-record determined?
Architecture
4. What are the four basic Data Flow steps of an ETL process?
5. What are the permissible data structures for the data staging area? Briefly describe the pros and cons of each.
6. When should data be written to disk for safekeeping during the ETL?
Extract
7. Describe techniques for extracting from heterogeneous data sources.
8. What is the best approach for handling ERP source data?
9. Explain the pros and cons of communicating with databases natively versus ODBC.
10. Describe three change data capture (CDC) practices and the pros and cons of each.
Data Quality
11. What are the four broad categories of data quality checks? Provide an implementation technique for each.
12. At which stage of the ETL should data be profiled?
13. What are the essential deliverables of the data quality portion of ETL?
14. How can data quality be quantified in the data warehouse?
Building mappings
15. What are surrogate keys? Explain how the surrogate key pipeline works.
16. Why do dates require special treatment during the ETL process?
17. Explain the three basic delivery steps for conformed dimensions.
18. Name the three fundamental fact grains and describe an ETL approach for each.
19. How are bridge tables delivered to classify groups of dimension records associated with a single fact?
20. How does late arriving data affect dimensions and facts? Share techniques for handling each.
Metadata
21. Describe the different types of ETL metadata and provide examples of each.
22. Share acceptable mechanisms for capturing operational metadata.
23. Offer techniques for sharing business and technical metadata.
Optimization/Operations
24. State the primary types of tables found in a data warehouse and the order in which they must be loaded to enforce referential integrity.
25. What are the characteristics of the four levels of the ETL support model?
26. What steps do you take to determine the bottleneck of a slow-running ETL process?
27. Describe how to estimate the load time of a large ETL job.
Real Time ETL
28. Describe the architecture options for implementing real-time ETL.
29. Explain the different real-time approaches and how they can be applied in different business scenarios.
30. Outline some challenges faced by real-time ETL and describe how to overcome them.
