Explain Data Mining Primitives.
1 Answer

• Each user will have a data mining task in mind, that is, some form of data analysis that he or she would like to have performed.

• A data mining task can be specified in the form of a data mining query, which is input to the data mining system. A data mining query is defined in terms of data mining task primitives.

• These primitives allow the user to interactively communicate with the data mining system during discovery in order to direct the mining process, or examine the findings from different angles or depths.

The data mining primitives specify the following:

• The set of task-relevant data to be mined: This specifies the portions of the database or the set of data in which the user is interested. This includes the database attributes or data warehouse dimensions of interest (referred to as the relevant attributes or dimensions).

• The kind of knowledge to be mined: This specifies the data mining functions to be performed, such as characterization, discrimination, association or correlation analysis, classification, prediction, clustering, outlier analysis, or evolution analysis.

• The background knowledge to be used in the discovery process: This knowledge about the domain to be mined is useful for guiding the knowledge discovery process and for evaluating the patterns found. Concept hierarchies are a popular form of background knowledge; they allow data to be mined at multiple levels of abstraction. A concept hierarchy can be defined, for example, for the attribute (or dimension) age. User beliefs regarding relationships in the data are another form of background knowledge.

• The interestingness measures and thresholds for pattern evaluation: These may be used to guide the mining process or, after discovery, to evaluate the discovered patterns. Different kinds of knowledge may have different interestingness measures. For example, interestingness measures for association rules include support and confidence. Rules whose support and confidence values are below user-specified thresholds are considered uninteresting.

• The expected representation for visualizing the discovered patterns: This refers to the form in which discovered patterns are to be displayed, which may include rules, tables, charts, graphs, decision trees, and cubes.

A data mining query language can be designed to incorporate these primitives, allowing users to flexibly interact with data mining systems. Having a data mining query language provides a foundation on which user-friendly graphical interfaces can be built.
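To make the five primitives concrete, here is a minimal sketch in Python of how a query built from them might drive a tiny association-rule miner. It is not a real data mining query language: the `MiningQuery` class, the helper functions, the age levels (`young`, `middle_aged`, `senior`) and the toy records are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, List


@dataclass
class MiningQuery:
    """Hypothetical container for the five task primitives."""
    relevant_attributes: List[str]              # task-relevant data
    knowledge_kind: str                         # kind of knowledge to mine
    concept_hierarchy: Dict[str, Callable]      # background knowledge
    min_support: float                          # interestingness thresholds
    min_confidence: float
    presentation: str                           # expected representation


def age_level(age):
    """Assumed one-level concept hierarchy for the attribute age."""
    if age < 30:
        return "young"
    if age < 60:
        return "middle_aged"
    return "senior"


def generalize(record, query):
    """Keep only relevant attributes and lift values up the hierarchy."""
    items = set()
    for attr in query.relevant_attributes:
        value = record[attr]
        lift = query.concept_hierarchy.get(attr)
        if lift is not None:
            value = lift(value)
        items.add(f"{attr}={value}")
    return frozenset(items)


def mine_rules(records, query):
    """Find single-item rules lhs => rhs that pass both thresholds."""
    transactions = [generalize(r, query) for r in records]
    n = len(transactions)

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    items = sorted(set().union(*transactions))
    for a, b in combinations(items, 2):
        both = count({a, b})
        for lhs, rhs in ((a, b), (b, a)):
            support = both / n                  # fraction containing lhs and rhs
            confidence = both / count({lhs})    # of those with lhs, share with rhs
            if support >= query.min_support and confidence >= query.min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


def present(rules, query):
    """Render rules in the representation the query asks for."""
    if query.presentation == "table":
        return [f"{l} | {r} | {s:.2f} | {c:.2f}" for l, r, s, c in rules]
    return [f"{l} => {r} [support={s:.2f}, confidence={c:.2f}]"
            for l, r, s, c in rules]


# Toy data; "name" is present but not task-relevant, so it is ignored.
records = [
    {"name": "a", "age": 25, "buys": "laptop"},
    {"name": "b", "age": 27, "buys": "laptop"},
    {"name": "c", "age": 45, "buys": "printer"},
    {"name": "d", "age": 52, "buys": "laptop"},
    {"name": "e", "age": 23, "buys": "laptop"},
]

query = MiningQuery(
    relevant_attributes=["age", "buys"],
    knowledge_kind="association",
    concept_hierarchy={"age": age_level},
    min_support=0.4,
    min_confidence=0.8,
    presentation="rules",
)

rules = mine_rules(records, query)
for line in present(rules, query):
    print(line)
```

With these toy records and thresholds, only the rule age=young => buys=laptop passes (support 0.60, confidence 1.00). The reverse rule buys=laptop => age=young has the same support but a confidence of 0.75, so it falls below the confidence threshold and is treated as uninteresting.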
