Data Mining & Business Intelligence - Dec 2016
Information Technology (Semester 6)
TOTAL MARKS: 80
TOTAL TIME: 3 HOURS
(1) Question 1 is compulsory.
(2) Attempt any three from the remaining questions.
(3) Assume data if required.
(4) Figures to the right indicate full marks.
1(a) Define "Data Mining". Enumerate five example applications that can benefit by using Data Mining. (5 marks)
1(b) Clearly explain the data preprocessing phase for data mining. (5 marks)
1(c) Describe one hierarchical clustering algorithm using an example dendrogram. (5 marks)
1(d) Explain the concept of a decision support system with the help of an example application. (5 marks)
2(a) Partition the given data into 4 bins using Equi-depth binning method and perform smoothing according to the following methods.
Smoothing by bin mean
Smoothing by bin median
Smoothing by bin boundaries
Data: 11, 13, 13, 15, 15, 16, 19, 20, 20, 20, 21, 21, 22, 23, 24, 30, 40, 45, 45, 71, 72, 73, 75 (10 marks)
2(b) For the same set of data points as in question 2(a):
a) Find the mean, median and mode.
b) Show a boxplot of the data, clearly indicating the five-number summary. (10 marks)
3(a) The table below shows a sample dataset of whether a customer responds to a survey or not. "Outcome" is the class label. Construct a Naive Bayes classifier for the dataset. For a new example (Rural, semidetached, low, No), what will be the predicted class label?
|District||House Type||Income||Previous Customer||Outcome|
|01||A, B, D, E, F|
|02||B, C, E|
|04||A, B, D, E|
|04||A, B, C, E|
|05||A, B, C, D, E,F|
|06||B, C, D|
|07||A, B, D,E|