Data Warehouse And Mining Question Paper - Dec 18 - Computer Engineering (Semester 8) - Mumbai University (MU)

## Data Warehouse And Mining - Dec 18

### Computer Engineering (Semester 8)

Total marks: 80
Total time: 3 Hours
INSTRUCTIONS
(1) Question 1 is compulsory.
(2) Attempt any three from the remaining questions.
(3) Draw neat diagrams wherever necessary.

1.a. Information requirements are recorded for "Hotel occupancy", considering dimensions such as Hotel, Room and Time. A few of the facts recorded are vacant rooms, occupied rooms, number of occupants, etc.

Answer the following questions for this problem:

i. Design the star schema.

ii. Can you convert this star schema to a snowflake schema? If yes, justify and draw the snowflake schema.

(10 marks)

1.b. Explain data mining as a step in KDD. Illustrate the architecture of a typical data mining system.
(10 marks)

2.a. The college wants to record the marks for the courses completed by students, using the dimensions I) Course, II) Student and III) Time, and the measure Aggregate marks.

Create a cube and perform the following OLAP operations:

i) Roll-up ii) Drill-down iii) Slice iv) Dice v) Pivot.

(10 marks)

2.b. Apply the Naive Bayes classifier algorithm to classify an unknown sample X (outlook = sunny, temperature = cool, humidity = high, windy = false). The sample data set is as follows:

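| Outlook | Temperature | Humidity | Windy | Class |
|---|---|---|---|---|
| Sunny | Hot | High | False | N |
| Sunny | Hot | High | True | N |
| Overcast | Hot | High | False | P |
| Rain | Mild | High | False | P |
| Rain | Cool | Normal | False | P |
| Rain | Cool | Normal | True | N |
| Overcast | Cool | Normal | True | P |
| Sunny | Mild | High | False | N |
| Sunny | Cool | Normal | False | P |
| Rain | Mild | Normal | False | P |
| Sunny | Mild | Normal | True | P |
| Overcast | Mild | High | True | P |
| Overcast | Hot | Normal | False | P |
| Rain | Mild | High | True | P |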

(10 marks)
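
A hand-worked answer to 2.b can be cross-checked with a short script. The following is a minimal sketch, assuming plain relative-frequency estimates with no Laplace smoothing and the 14 rows exactly as listed in the question (values lower-cased). The names `rows`, `naive_bayes_scores` and `predicted` are illustrative and not part of the question.

```python
from collections import Counter

# Training rows copied from the question: (outlook, temperature, humidity, windy, class)
rows = [
    ("sunny", "hot", "high", "false", "N"),
    ("sunny", "hot", "high", "true", "N"),
    ("overcast", "hot", "high", "false", "P"),
    ("rain", "mild", "high", "false", "P"),
    ("rain", "cool", "normal", "false", "P"),
    ("rain", "cool", "normal", "true", "N"),
    ("overcast", "cool", "normal", "true", "P"),
    ("sunny", "mild", "high", "false", "N"),
    ("sunny", "cool", "normal", "false", "P"),
    ("rain", "mild", "normal", "false", "P"),
    ("sunny", "mild", "normal", "true", "P"),
    ("overcast", "mild", "high", "true", "P"),
    ("overcast", "hot", "normal", "false", "P"),
    ("rain", "mild", "high", "true", "P"),
]


def naive_bayes_scores(rows, sample):
    """Return P(class) * product of P(attribute value | class) for each class."""
    class_counts = Counter(r[-1] for r in rows)
    total = len(rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(class)
        for i, value in enumerate(sample):
            # Conditional probability estimated as a relative frequency (no smoothing)
            matches = sum(1 for r in rows if r[-1] == label and r[i] == value)
            score *= matches / n_label
        scores[label] = score
    return scores


sample = ("sunny", "cool", "high", "false")
scores = naive_bayes_scores(rows, sample)
predicted = max(scores, key=scores.get)
print(scores, predicted)
```

The class with the larger unnormalised score is the predicted class; dividing each score by their sum would give posterior probabilities, but that step does not change the decision.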

3.a. Discuss data warehouse design strategies in detail.
(10 marks)

3.b. Discuss the types of attributes and data visualization for data exploration.
(10 marks)

4.a. Discuss various OLAP models and their architecture.
(10 marks)

4.b. Find clusters using the k-means clustering algorithm. There are four objects (four types of medicine), and each object has two attributes or features, as shown in the table below. The intention is to group these objects into k = 2 groups of medicine based on the two features (weight index and pH).

| Object | Attribute 1 (X): Weight Index | Attribute 2 (Y): pH |
|---|---|---|
| Medicine A | 1 | 1 |
| Medicine B | 2 | 1 |
| Medicine C | 4 | 3 |
| Medicine D | 5 | 4 |

(10 marks)
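
The iterations for 4.b can be traced with a small script. This is a minimal sketch, assuming Euclidean distance and, because the question does not state the initial centroids, assuming Medicine A and Medicine B as the two starting centroids. The names `points`, `kmeans`, `assignment` and `centroids` are illustrative.

```python
import math

# (weight index, pH) for each medicine, copied from the question
points = {"A": (1.0, 1.0), "B": (2.0, 1.0), "C": (4.0, 3.0), "D": (5.0, 4.0)}


def kmeans(points, initial_centroids, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = list(initial_centroids)
    assignment = {}
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point
        new_assignment = {
            name: min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members
        for i in range(len(centroids)):
            members = [points[n] for n, c in assignment.items() if c == i]
            if members:
                centroids[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids


# Assumption: A and B are the initial centroids (not specified in the question)
assignment, centroids = kmeans(points, [points["A"], points["B"]])
print(assignment, centroids)
```

With these assumed seeds the run settles on the groups {A, B} and {C, D}, with centroids (1.5, 1) and (4.5, 3.5). A different choice of initial centroids may take a different number of iterations.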

5.a. Discuss the process of extraction, transformation and loading with a neat and labelled diagram.
(10 marks)

5.b. A database has five transactions. Let minimum support = 40% and minimum confidence = 60%.
i) Find all frequent patterns using the Apriori algorithm.
ii) List the strong association rules.

| Transaction-Id | Items |
|---|---|
| A | 1, 3, 4, 6 |
| B | 2, 3, 5, 7 |
| C | 1, 2, 3, 5, 8 |
| D | 2, 5, 9, 10 |
| E | 1, 4 |

(10 marks)
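
The level-wise search asked for in 5.b can be sketched as follows. This is a minimal illustration, not a tuned implementation: it assumes the five transactions exactly as listed, treats 40% support as a count of at least 2 out of 5, and uses the illustrative names `apriori`, `strong_rules`, `frequent` and `rules`.

```python
from itertools import combinations

# Transactions A to E, copied from the question
transactions = [
    frozenset({1, 3, 4, 6}),
    frozenset({2, 3, 5, 7}),
    frozenset({1, 2, 3, 5, 8}),
    frozenset({2, 5, 9, 10}),
    frozenset({1, 4}),
]


def apriori(transactions, min_count):
    """Return {frequent itemset: support count} using join, prune and count steps."""
    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({i for t in transactions for i in t})
    current = [s for s in (frozenset([i]) for i in items) if count(s) >= min_count]
    support = {s: count(s) for s in current}
    k = 2
    while current:
        prev = set(current)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = [c for c in candidates if count(c) >= min_count]
        support.update({c: count(c) for c in current})
        k += 1
    return support


def strong_rules(support, min_conf):
    """Return (antecedent, consequent, confidence) for rules meeting min_conf."""
    rules = []
    for itemset, n in support.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), r)):
                conf = n / support[lhs]  # support(itemset) / support(antecedent)
                if conf >= min_conf:
                    rules.append((lhs, itemset - lhs, conf))
    return rules


frequent = apriori(transactions, min_count=2)  # 40% of 5 transactions
rules = strong_rules(frequent, min_conf=0.6)
for lhs, rhs, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

Every subset of a frequent itemset is itself frequent, so the lookup `support[lhs]` in `strong_rules` always succeeds.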

Q.6 Write short notes on the following (answer any FOUR). [20]

6.a. Applications of Data Mining (minimum two in detail)
(5 marks)

6.b. Data pre-processing
(5 marks)

6.c. FP Tree
(5 marks)

(5 marks)

6.e. Metadata with example
(5 marks)