Data Mining

What is Data Mining?

  • Data mining is the process of extracting useful information from large databases.
  • It extracts hidden, predictive information, that is, patterns that are not obvious from the raw records.
  • Data mining is the practice of automatically searching large stores of data to discover patterns.
  • Data mining is a powerful technology that helps organizations focus on the most important information in their data warehouses.
  • It uses mathematical algorithms to segment the data and evaluate the probability of future events.
  • Data mining is a powerful tool for retrieving useful information from available data warehouses.
  • Data mining can be applied to relational databases, object-oriented databases, data warehouses, and both structured and unstructured data stores.
  • Data mining is also known as Knowledge Discovery in Databases (KDD), although strictly speaking it is one step within the KDD process.
[Figure: Knowledge discovery in databases (KDD)]

The steps of KDD, as shown in the diagram above, are:

1. Data cleaning: Noisy, inconsistent, and irrelevant data is removed from the database.
2. Data integration: Heterogeneous data sources are merged into a single data source.
3. Data selection: The data relevant to the analysis is retrieved from the database.
4. Data transformation: The selected data is transformed into forms suitable for data mining.
5. Data mining: Various techniques are applied to extract data patterns.
6. Pattern evaluation: The extracted patterns are evaluated to identify the ones that are genuinely interesting.
7. Knowledge representation: This is the final step of KDD, in which the discovered knowledge is presented to the user.
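
The seven steps above can be sketched as a small Python pipeline. This is only an illustrative sketch under stated assumptions, not a definitive implementation: the two data sources, their field names, the sample records, and the 0.5 support threshold are all made up for the example, and simple item-pair counting stands in for the "various techniques" of the mining step.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: two sources with different schemas.
store_sales = [
    {"customer": "a", "items": "bread,milk"},
    {"customer": "b", "items": "bread,butter,milk"},
    {"customer": "c", "items": ""},               # empty basket (noise)
]
web_sales = [
    {"user": "d", "basket": ["bread", "butter"]},
    {"user": "e", "basket": ["milk", "bread"]},
    {"user": "e", "basket": ["milk", "bread"]},   # duplicate record
]

def clean_store(rows):
    # Step 1 (cleaning): drop records with an empty basket.
    return [r for r in rows if r["items"]]

def clean_web(rows):
    # Step 1 (cleaning): drop exact duplicate records.
    seen, unique = set(), []
    for r in rows:
        key = (r["user"], tuple(r["basket"]))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

def integrate(store_rows, web_rows):
    # Step 2 (integration): map both schemas onto a single one.
    merged = [{"customer": r["customer"], "items": r["items"].split(",")}
              for r in store_rows]
    merged += [{"customer": r["user"], "items": r["basket"]}
               for r in web_rows]
    return merged

def select(rows):
    # Step 3 (selection): keep only the field relevant to the analysis.
    return [r["items"] for r in rows]

def transform(baskets):
    # Step 4 (transformation): represent each basket as a set of items.
    return [frozenset(b) for b in baskets]

def mine(baskets):
    # Step 5 (mining): count how often each pair of items occurs together.
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return counts

def evaluate(counts, n_baskets, min_support=0.5):
    # Step 6 (evaluation): keep pairs whose support meets the threshold.
    return {pair: c / n_baskets for pair, c in counts.items()
            if c / n_baskets >= min_support}

def represent(patterns):
    # Step 7 (representation): readable lines, strongest pattern first.
    ordered = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in ordered]

baskets = transform(select(integrate(clean_store(store_sales),
                                     clean_web(web_sales))))
report = represent(evaluate(mine(baskets), len(baskets)))
for line in report:
    print(line)
```

With this sample data, four baskets survive cleaning, and only the item pairs that appear in at least half of them are reported. Each function corresponds to one numbered step, so a step can be swapped out (for example, a different mining technique in step 5) without touching the rest.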