Knowledge discovery as a process consists of an iterative sequence of the following steps:
- Data cleaning:
Removes noise and corrects inconsistencies in the data.
- Data integration:
Data integration merges data from multiple sources into a coherent data store, such as a data warehouse.
- Data selection:
Retrieves the data relevant to the analysis task from the database.
- Data transformation:
Transforms or consolidates data into forms appropriate for mining, for instance by performing summary or aggregation operations.
For example, normalization may improve the accuracy and efficiency of mining algorithms that involve distance measurements.
- Data mining:
The essential step, in which intelligent methods are applied to extract data patterns.
- Pattern evaluation:
Identifies the truly interesting patterns that represent knowledge, based on interestingness measures.
- Knowledge presentation:
Uses visualization and knowledge representation techniques to present the mined knowledge to the user.
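
The steps above can be sketched as a small pipeline. This is only an illustrative sketch, not a reference implementation: the toy purchase records, the function names (`clean`, `integrate`, `select`, `transform`, `mine`, `evaluate`, `present`, `run_pipeline`), the choice of pair co-occurrence counting as the mining method, and the use of support with a `min_support` threshold as the interestingness measure are all assumptions made for the example. It runs a single pass, whereas the real process is iterative.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two sources of purchase records, with some noise.
STORE_A = [
    {"id": 1, "items": ["Milk ", "bread"]},
    {"id": 2, "items": ["milk", "Bread", "eggs"]},
    {"id": 3, "items": None},  # noisy record: items are missing
]
STORE_B = [
    {"id": 4, "items": ["bread", "MILK"]},
    {"id": 5, "items": ["eggs", "tea"]},
]


def clean(records):
    """Data cleaning: drop records without items, fix inconsistent spelling."""
    cleaned = []
    for rec in records:
        if not rec.get("items"):
            continue
        items = [i.strip().lower() for i in rec["items"] if i and i.strip()]
        cleaned.append({"id": rec["id"], "items": items})
    return cleaned


def integrate(*sources):
    """Data integration: merge several sources into one store keyed by id."""
    store = {}
    for source in sources:
        for rec in source:
            store[rec["id"]] = rec
    return list(store.values())


def select(records):
    """Data selection: keep only the field relevant to the analysis task."""
    return [rec["items"] for rec in records]


def transform(item_lists):
    """Data transformation: consolidate each record into a set of items."""
    return [frozenset(items) for items in item_lists]


def mine(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for transaction in transactions:
        for pair in combinations(sorted(transaction), 2):
            counts[pair] += 1
    return counts


def evaluate(pair_counts, n_transactions, min_support=0.5):
    """Pattern evaluation: keep pairs whose support reaches the threshold."""
    patterns = {}
    for pair, count in pair_counts.items():
        support = count / n_transactions
        if support >= min_support:
            patterns[pair] = support
    return patterns


def present(patterns):
    """Knowledge presentation: render the patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


def run_pipeline(min_support=0.5):
    """Run one pass over all seven steps, in order."""
    records = integrate(clean(STORE_A), clean(STORE_B))
    transactions = transform(select(records))
    pair_counts = mine(transactions)
    patterns = evaluate(pair_counts, len(transactions), min_support)
    return present(patterns)


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

Keeping each step as its own function mirrors the list above and makes it easy to rerun a single step, for example lowering `min_support` and repeating only evaluation and presentation, which is one way the iterative nature of the process shows up in practice.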