Data Warehouse Interview Questions and Answers

Data Warehouse interview questions

These Data Warehouse questions have been designed for various interviews, competitive exams and entrance tests. We have covered questions on both basic and advanced concepts which will help you improve your skills to face interview questions on Data Warehouse.

Who are these Data Warehouse interview questions designed for?

All the Data Warehouse Developers, Business Intelligence Managers, ETL Developers, Data Warehouse ETL Testing, Data Warehouse Specialists, Data Warehousing Technical Architects , Data Warehousing Business Analysts and Data Warehouse Architects will find these questions extremely useful. All freshers, BCA, BE, BTech, MCA and college students wanting to make a career in front end designing will be highly benefited by these questions.

Data Warehouse interview questions topics

This section covers Data Warehouse topics like - Business Intelligence, ETL, ER Modeling, dimensional modeling, Fact Table, Dimension Table, Data Mining, Materialized View, Star Schema, Snowflake Star Schema etc.

1. What is a Data Warehousing? How Data Warehouse works?

Data warehousing is process of accumulating a wide variety of organization's historical data and organizing it into simple structure so that analysts and programs can leverage it for better decision making.

A Data warehouse acts as a central repository where data is collected from different databases. The data is then processed, transformed and loaded into a suitable format. The processed data in the Data warehouse can be accessed through Business Intelligence tools, SQL clients, spreadsheets etc.

Video : Data Warehouse Interview Questions and Answers - For Freshers and Experienced

2. What is Business Intelligence?

BI(Business Intelligence) is a software suite to transform raw data into actionable information to help make informed business decisions. BI tools provide relevant, reliable and actionable information to the right people at the right time with the goal of achieving better decision faster. It creates reports, summaries, dashboards, maps, graphs, charts etc. and provides users with detailed intelligence about the nature of the business.  

3. What is ETL?

ETL stands for Extract, Transform and Load. It is a process of extracting the data from disparate sources, transforming it into the required format and finally loading it into the Data Warehouse system.

To perform all these functions, certain tools are used which are called the ETL tools such as Oracle Warehouse Builder, PowerCenter Informatica, SQL Server Integration Services (SSIS), Open Text Integration Center etc.

4. What is Fact and Dimension in Data Warehouse?

Fact in Data warehouse provides quantitative information about the business process. They are also called as measurements and metrics. Quantity, Sales Amount, Profit, Turnover etc. are some examples of Fact.

Dimension is an object that describes facts or business numbers. Product, Location, Time are a few examples of Dimension.

5. Compare Database and Data Warehouse Utility

A Database stores current data and processes day-to-day transactions in an organization. Organizations most often use databases for online transaction processing (OLTP).

A Data warehouses is used for online analytical processing (OLAP), which uses complex queries to analyze rather than process transactions.

Data Structure

Most databases use a normalized data structure.
Data warehouses normally use a de-normalized data structure helping better performance when reading data for analytical purposes.

Concurrent Users

An OLTP database supports thousands of concurrent users.
Data warehouses support a limited number of concurrent users.

Purpose

A Database is designed to record day day-to-day transaction in an organization
A Data warehouse focuses on providing high-level reporting and analysis.

Modeling technique

Entity Relational modeling techniques are put in use for designing RDMS databases
Data modeling techniques are utilized for designing a data warehouse.

Optimization

Database is optimized for write operation.
Data warehouse is optimized for read operations.

Performance

Performance is low for analysis queries in a database.
A data warehouse extends high performance for analytical queries.

6. What do you understand by dimensional modeling?

Dimensional model is a data modeling technique used in data warehouse design. In this design model, all the data is stored in two types of tables - Facts table and Dimension table. Fact tables hold numeric data.  Dimension tables hold the descriptive information to outline the facts.

7. What is Fact Table? What is Dimension table?

Fact Table

A fact table is used in the dimensional model in data warehouse design.
A Fact table contains the transactional measurements/facts or the quantitative information of business processes and foreign keys to dimension table.

Dimension Table

A dimension table contains dimensions of a fact. It provides descriptive information for all the measurements recorded in fact table, i.e. Product ID, Product Category etc. They are joined to fact table via a foreign key.

8. What is the difference between ER Modeling and Dimensional Modeling?

ER Modeling

i.) ER Modeling is used for Online Line Transaction Processing (OLTP) application design and optimized for Insert/Update/Delete/Select data.

ii.) ER Modeling revolves around the Entities and their Relationships to capture the overall process of the system or transactions.

iii.) In ER Modeling, tables are the units of storage.

iv.) ER Modeling contains normalized data.

Dimensional Modeling

i.) Dimensional Modeling is used for Online Analytical Processing (OLAP) applications design and optimized for retrieving data or answering queries on business process.

ii.) Dimensional Modeling revolves around Dimensions for decision making and doesn't capture the process.

iii.) In Dimensional Modeling, Cubes are the units of storage.

iv.) Dimensional Modeling contains de-normalized data.

9. What is Data Mining?

Data Mining is the procedure of mining knowledge from huge sets of data. It generates detailed insights of the business. The insights derived via Data Mining can be used for marketing, fraud detection, decision-making process etc.

10. What is the difference between view and materialized view?

View

i.) A view is a virtual table formed from one or more base tables or views. It doesn't physically hold any data.

ii.) Since a View is not pre-computed and stored on a disk, you always get the updated data in a View when any changes are made to the original base table.

iii.) It is used for security purpose. Using Views, you can restrict the user from accessing sensitive information in a database.

iv.) It reduces the complexity of queries by getting data from several tables into a single customized View.

Materialized view

i.) A Materialized View is the physical copy of the original base tables. It holds data physically in a table.

ii.) It is pre-computed and stored on a disk like an object, and it is not updated each time it is used.

iii.) Materialized view improves performance. Since it is pre-computed, it responds faster in comparison to View.