Data Mining & Business Intelligence - May 2015
Information Technology (Semester 6)
TOTAL MARKS: 80
TOTAL TIME: 3 HOURS
(1) Question 1 is compulsory.
(2) Attempt any three from the remaining questions.
(3) Assume data if required.
(4) Figures to the right indicate full marks.
1 (a) Describe the different types of attributes one may come across in a data mining data set, with two examples of each type. (5 marks)
1 (b) Explain the different distance measures that can be used to compute the distance between two clusters. (5 marks)
1 (c) Define "Business Intelligence" and "Support System" with examples. (5 marks)
1 (d) Define "Outlier". What are the different types of outliers that occur in a data set? (5 marks)
2 (a) Consider the following data points: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
i) What is the mean of the data? What is the median?
ii) What is the mode of the data?
iii) What is the mid-range of the data?
iv) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?
v) Show a box plot of the data. (10 marks)
2 (b) Design a BI system for fraud detection. Clearly describe all the steps from data collection to decision making. (10 marks)
3 (a) Illustrate any one classification technique for the above data set. Show how we can classify a new tuple with (Homeowner=Yes; status=Employed; Income=Average).
Apply the Apriori algorithm with a minimum support of 30% and a minimum confidence of 70%, and find all the association rules in the data set. (10 marks)
5 (b) Explain the different methods that can be used to evaluate and compare the accuracy of different classification algorithms. (10 marks)
6 (a) DBSCAN clustering algorithm with an example. (10 marks)
6 (b) Multilevel and Multidimensional Association rules. (10 marks)