Data Warehouse

Definition of Data Warehouse

According to Bill Inomn (1990)
“A data warehouse is a subject oriented, integrated, time-variant and non-volatile collection of data. This data helps analysts to make informed decisions in an organization.”

  • A Data Warehouse is a relational database which is designed to support management and decision – making.
  • It is designed for query analysis rather than transaction processing.
  • It contains historical data which is derived from transactional data, but it can include data from various sources.
  • It contains a wide variety of data that presents a coherent picture of business conditions at a single point in time.

Features of Data Warehouse

  • Subject Oriented - Data warehouse provides the information about major subject areas of the organization.
  • Integrated - It is constructed by integrating data and enhances the effective analysis of data.
  • Non-Volatile - Data warehouse does not erase the previous data when new data is added to it.
  • Time Variant - It provides the information in a historical view.

Advantages of Data Warehouse

  • Data Warehouse reduces the cost used to access historical data.
  • It removes informational processing load from transaction – oriented databases.
  • It allows others to access and share the data.
  • It improves turnaround time for analysis and reporting.

Disadvantages of Data Warehouse

  • Data Warehouse is not easy to maintain. It can be costly to maintain it.
  • Data Warehouse has security issues.
  • It is a time consuming process.
  • It is difficult to accommodate the changes in data types and ranges and also in the data source schema, indexed and queries.

Types of Data Warehouse

Following are the types of Data Warehouse,
1. Information Processing
2. Analytical Processing
3. Data Mining

1. Information Processing allows to process the data which is stored in Data Warehouse.
2. Analytical Processing uses some basic OLAP operations such as slice-and-dice, drill down, drill up and pivoting.
3. Data Mining can be represented using visualization tools and supports knowledge discovery.

Data Mart

  • A data mart is a simple form of data warehouse.
  • It is a repository of data which is designed to serve a particular community of knowledge workers.
  • Data mart is focused on a single subject, such as sales, finance or marketing.
  • It is controlled by a single department within an organization.
  • It deals with multiple subject areas.
  • It is implemented and controlled by a central organizational unit such as the Corporate Information Technology (IT) group.
  • Data marts are smaller and less complex than a data warehouse.
  • They are easier to build and maintain.
Following are the differences between Data Warehouse and Data Marts,

Data WarehouseData Mart
Data warehouse is related to a central repository for all organization's data.Data Mart is only related to a specific group of users within the organization.
It holds multiple subjects.It holds single subject.
It takes the data from many data sources.It takes the data from a few data sources.
Size required can be 100 GB – TB and above.Size required can be less than 100 GB.
Implementation time can be from months to years.Implementation time can be in months.
It holds detailed information.It holds summarized data.