written 8.6 years ago by |
Data Mining & Business Intelligence - May 2015
Information Technology (Semester 6)
TOTAL MARKS: 80
TOTAL TIME: 3 HOURS
(1) Question 1 is compulsory.
(2) Attempt any three from the remaining questions.
(3) Assume data if required.
(4) Figures to the right indicate full marks.
1 (a) Describe the different types of attributes one may come across in a data mining data set with two examples of each type.(5 marks)
1 (b) Explain the different distance measures that can be used to compute distance between two clusters.(5 marks)
1 (c) Define "Business Intelligence" and "Support System" with examples.(5 marks)
1 (d) Define "Outlier". What are the different types of outliers that occur in a dataset?(5 marks)
2 (a) Consider the following data points: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
i) What is the mean of the data? What is the median?
ii) What is the mode of the data?
iii) What is the mid-range of the data?
iv) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?
v) Show a box plot of the data.(10 marks)
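A rough way to check the arithmetic for 2 (a) is sketched below using Python's standard `statistics` module. The helper name `describe` is only illustrative. Quartile conventions differ between textbooks, so the Q1 and Q3 values depend on the assumption that Python's default "exclusive" method is acceptable for a "rough" answer. The five-number summary is what a box plot would be drawn from.

```python
import statistics

# Data points from question 2 (a)
data = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def describe(values):
    """Return the summary statistics asked for in parts i) to v)."""
    ordered = sorted(values)
    # Assumption: quartiles use the default 'exclusive' method of
    # statistics.quantiles; other conventions can give slightly different values.
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "modes": statistics.multimode(ordered),  # every most-frequent value
        "midrange": (ordered[0] + ordered[-1]) / 2,
        "q1": q1,
        "q3": q3,
        # min, Q1, median, Q3, max: the values a box plot is drawn from
        "five_number": (ordered[0], q1, q2, q3, ordered[-1]),
    }


summary = describe(data)
for name, value in summary.items():
    print(name, value)
```

With this data the sketch reports two modes (the data is bimodal), and the midrange is simply the average of the smallest and largest values.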
2 (b) Design a BI system for fraud detection. Describe all the steps from Data collection to Decision Making clearly.(10 marks)
3 (a) Illustrate any one classification technique for the following data set. Show how we can classify a new tuple with (Homeowner=Yes; Status=Employed; Income=Average).
Id | Homeowner | Status | Income | Defaulted |
1 | Yes | Employed | High | No |
2 | No | Business | Average | No |
3 | No | Employed | Low | No |
4 | Yes | Business | High | No |
5 | No | Unemployed | Average | Yes |
6 | No | Business | Low | No |
7 | Yes | Unemployed | High | No |
8 | No | Employed | Average | Yes |
9 | No | Business | Low | No |
10 | No | Employed | Average | Yes |
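The question leaves the choice of technique open. As one possible illustration, here is a minimal Naive Bayes sketch in Python that scores the new tuple against the ten rows above. Naive Bayes is an assumed choice here, the function name `naive_bayes_scores` is hypothetical, and no Laplace smoothing is applied, so a class with a zero count for any attribute value gets a score of zero.

```python
from collections import Counter

# Training rows copied from the table: (Homeowner, Status, Income, Defaulted)
rows = [
    ("Yes", "Employed", "High", "No"),
    ("No", "Business", "Average", "No"),
    ("No", "Employed", "Low", "No"),
    ("Yes", "Business", "High", "No"),
    ("No", "Unemployed", "Average", "Yes"),
    ("No", "Business", "Low", "No"),
    ("Yes", "Unemployed", "High", "No"),
    ("No", "Employed", "Average", "Yes"),
    ("No", "Business", "Low", "No"),
    ("No", "Employed", "Average", "Yes"),
]


def naive_bayes_scores(rows, query):
    """Return P(class) * product of P(attribute=value | class) for each class.

    Assumption: plain relative frequencies, with no smoothing.
    """
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for label, count in class_counts.items():
        score = count / len(rows)  # prior probability of the class
        for i, value in enumerate(query):
            matches = sum(1 for row in rows
                          if row[-1] == label and row[i] == value)
            score *= matches / count  # conditional probability
        scores[label] = score
    return scores


query = ("Yes", "Employed", "Average")  # the new tuple from the question
scores = naive_bayes_scores(rows, query)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

None of the three defaulters in the table is a homeowner, so the score for Defaulted=Yes is zero and the sketch assigns the new tuple to Defaulted=No.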
Protein | 20 | 21 | 15 | 22 | 20 | 25 | 26 | 20 | 18 | 20 |
Fat | 9 | 9 | 7 | 17 | 8 | 12 | 14 | 9 | 9 | 9 |
TID | Items |
01 | A,B,C,D |
02 | A,B,C,D,E,G |
03 | A,C,G,H,K |
04 | B,C,D,E,K |
05 | D,E,F,H,L |
06 | A,B,C,D,L |
07 | B,I,E,K,L |
08 | A,B,D,E,LK |
09 | A,E,E,H,L |
10 | B,C,D,F |
Apply the Apriori algorithm with minimum support of 30% and minimum confidence of 70% and find all the association rules in the data set.(10 marks)
5 (b) Explain different methods that can be used to evaluate and compare the accuracy of different classification algorithms.(10 marks)
6 (a) DBSCAN clustering algorithm with an example.(10 marks)
6 (b) Multilevel and Multidimensional Association rules.(10 marks)
written 8.4 years ago by |
Attribute types in data mining:
A data set can contain four types of attributes:
- Nominal. Examples: ID numbers, eye color, zip codes
- Ordinal. Examples: rankings (e.g., taste of potato chips on a scale from 1-10), grades, height in {tall, medium, short}
- Interval. Examples: calendar dates, temperatures in Celsius or Fahrenheit
- Ratio. Examples: temperature in Kelvin, length, time, counts
1. Nominal: The values of a nominal attribute are just different names, i.e., nominal attributes provide only enough information to distinguish one object from another. The meaningful operations are equality tests (=, ≠). Examples: zip codes, employee ID numbers, eye color, sex: {male, female}
2. Ordinal: The values of an ordinal attribute provide enough information to order objects. The meaningful operations are order comparisons (<, >). Examples: hardness of minerals, {good, better, best}, grades, street numbers
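The idea that each attribute type supports a particular set of operations can be sketched in Python as below. The names `LEVELS`, `NEW_OPERATIONS`, `allowed_operations` and `grade_less_than` are made up for illustration. The operations for nominal and ordinal come from the definitions above; the ones shown for interval (+, -) and ratio (*, /) are not stated in this answer and follow the usual textbook treatment of measurement scales.

```python
# Attribute types from weakest to strongest; each type also supports
# every operation of the types listed before it.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Operations that become meaningful at each level.
# Assumption: the interval and ratio entries follow the standard
# measurement-scale treatment, not the text of this answer.
NEW_OPERATIONS = {
    "nominal": {"=", "!="},
    "ordinal": {"<", ">"},
    "interval": {"+", "-"},
    "ratio": {"*", "/"},
}


def allowed_operations(attribute_type):
    """Return every operation that is meaningful for the attribute type."""
    index = LEVELS.index(attribute_type)
    allowed = set()
    for level in LEVELS[: index + 1]:
        allowed |= NEW_OPERATIONS[level]
    return allowed


# Ordinal example using {good, better, best}: only the ranking matters,
# so values are compared by their position in the ordering.
GRADE_ORDER = ["good", "better", "best"]


def grade_less_than(a, b):
    """Compare two ordinal grades by rank."""
    return GRADE_ORDER.index(a) < GRADE_ORDER.index(b)


print(sorted(allowed_operations("ordinal")))
print(grade_less_than("good", "best"))
```

For instance, eye color only supports the equality tests, while a Kelvin temperature supports all eight operations because ratios of Kelvin values are meaningful.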
written 8.4 years ago by |
1a) Answer : https://www.ques10.com/p/2774/describe-the-different-types-of-attributes-one-may/#2775