Data Mining & Business Intelligence : Question Paper May 2015 - Information Technology (Semester 6)


written 8.1 years ago by

Data Mining & Business Intelligence - May 2015

Information Technology (Semester 6)

TOTAL MARKS: 80
TOTAL TIME: 3 HOURS

(1) Question 1 is compulsory.
(2) Attempt any three from the remaining questions.
(3) Assume data if required.
(4) Figures to the right indicate full marks.

1 (a) Describe the different types of attributes one may come across in a data mining data set, with two examples of each type. (5 marks)

1 (b) Explain the different distance measures that can be used to compute the distance between two clusters. (5 marks)

1 (c) Define "Business Intelligence" and "Support System" with examples. (5 marks)

1 (d) Define "Outlier". What are the different types of outliers that occur in a dataset? (5 marks)

2 (a) Consider the following data points: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
i) What is the mean of the data? What is the median?
ii) What is the mode of the data?
iii) What is the mid-range of the data?
iv) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?
v) Show a box plot of the data. (10 marks)

2 (b) Design a BI system for fraud detection. Describe all the steps from Data collection to Decision Making clearly. (10 marks)

3 (a) Illustrate any one classification technique for the following data set. Show how a new tuple (Homeowner=Yes; Status=Employed; Income=Average) can be classified.

Id	Homeowner	Status	Income	Defaulted
1	Yes	Employed	High	No
2	No	Business	Average	No
3	No	Employed	Low	No
4	Yes	Business	High	No
5	No	Unemployed	Average	Yes
6	No	Business	Low	No
7	Yes	Unemployed	High	No
8	No	Employed	Average	Yes
9	No	Business	Low	No
10	No	Employed	Average	Yes

(10 marks)

3 (b) Why is Data Preprocessing required? Explain the different steps involved in Data Preprocessing. (10 marks)

4 (a) Use K-means to cluster the following data set into 3 clusters.

Protein	20	21	15	22	20	25	26	20	18	20
Fat	9	9	7	17	8	12	14	9	9	9

(10 marks)

4 (b) Describe the different visualization techniques that can be used in Data Mining. (10 marks)

5 (a) Consider the following transaction database:

TID	Items
01	A,B,C,D
02	A,B,C,D,E,G
03	A,C,G,H,K
04	B,C,D,E,K
05	D,E,F,H,L
06	A,B,C,D,L
07	B,I,E,K,L
08	A,B,D,E,LK
09	A,E,E,H,L
10	B,C,D,F

Apply the Apriori algorithm with a minimum support of 30% and a minimum confidence of 70%, and find all the association rules in the data set. (10 marks)

5 (b) Explain the different methods that can be used to evaluate and compare the accuracy of different classification algorithms. (10 marks)

6 (a) DBSCAN clustering algorithm with an example. (10 marks)

6 (b) Multilevel and Multidimensional Association rules. (10 marks)



written 8.0 years ago by

bhagyashreekhole • 0

DATA MINING attribute types:

There are four types of attributes:

Nominal. Examples: ID numbers, eye color, zip codes
Ordinal. Examples: rankings (e.g., taste of potato chips on a scale from 1-10), grades, height in {tall, medium, short}
Interval. Examples: calendar dates, temperatures in Celsius or Fahrenheit
Ratio. Examples: temperature in Kelvin, length, time, counts

1. Nominal: The values of a nominal attribute are just different names, i.e., nominal attributes provide only enough information to distinguish one object from another (=, ≠). Examples: zip codes, employee ID numbers, eye color, sex: {male, female}.

2. Ordinal: The values of an ordinal attribute provide enough information to order objects (<, >). Examples: hardness of minerals, {good, better, best}, grades, street numbers.
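
The difference between the four types comes down to which operations make sense on the values. A minimal Python sketch, reusing the examples listed above (the variable names, the `Height` enum and its integer codes are assumptions made up for illustration):

```python
from enum import IntEnum

# Nominal: values are only names, so equality is the only meaningful test.
eye_color_a, eye_color_b = "brown", "blue"
same_color = eye_color_a == eye_color_b

# Ordinal: values can be ordered. The integer codes below are arbitrary
# (hypothetical); only their order carries meaning, not their spacing.
class Height(IntEnum):
    SHORT = 1
    MEDIUM = 2
    TALL = 3

is_taller = Height.TALL > Height.MEDIUM

# Interval: differences are meaningful, but there is no true zero,
# so "20 C is twice as hot as 10 C" is not a valid statement.
celsius_today, celsius_yesterday = 20.0, 10.0
celsius_diff = celsius_today - celsius_yesterday

# Ratio: Kelvin has a true zero, so ratios are meaningful as well.
kelvin_today = celsius_today + 273.15
kelvin_yesterday = celsius_yesterday + 273.15
kelvin_ratio = kelvin_today / kelvin_yesterday

print(same_color, is_taller, celsius_diff, round(kelvin_ratio, 3))
```

Note that the same two temperatures give a ratio of 2 in Celsius but only about 1.035 in Kelvin, which is why Celsius is treated as an interval attribute and Kelvin as a ratio attribute.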
