Question Paper: Data Warehouse & Mining : Question Paper May 2012 - Computer Engineering (Semester 8) | Mumbai University (MU)

Data Warehouse & Mining - May 2012

Computer Engineering (Semester 8)

TOTAL MARKS: 100
TOTAL TIME: 3 HOURS
(1) Question 1 is compulsory.
(2) Attempt any four from the remaining questions.
(3) Assume data wherever required.
(4) Figures to the right indicate full marks.
1 (a) Define a data warehouse. Explain the need for developing a data warehouse, and hence explain its architecture. (10 marks)
1 (b) Compare OLTP and OLAP systems. Explain the steps in KDD with a suitable block diagram. (10 marks)
2 (a) What is meant by ETL? Explain the ETL process in detail. (10 marks)
2 (b) State and explain the various schemas used in data warehousing, with an example of each. (10 marks)
3 (a) Differentiate between the top-down and bottom-up approaches for building a data warehouse. Explain the advantages and disadvantages of each. (10 marks)
3 (b) Define what is meant by an information package diagram. For recording the information requirements for "hotel occupancy", with dimensions such as time, hotel, etc., give the information package diagram, and also draw the star schema and snowflake schema. (10 marks)
4 (a) What is meant by metadata? Explain with an example. Explain the different types of metadata stored in a data warehouse. (10 marks)
4 (b) What is meant by association rule mining? For the example given below, perform the Apriori algorithm. Also -
(1) Determine the frequent k-itemsets obtained.
(2) Justify the strong association rules that have been determined, i.e. specify which is the strongest rule obtained.
The table is as follows -

TID Items
01 1,3,4,6
02 2,3,5,7
03 1,2,3,5,8
04 2,5,9,10
05 1,4

Assume minimum support 30% and minimum confidence 75%. (10 marks)
5 (a) Explain dimensional modelling in detail. (10 marks)
5 (b) Explain what is meant by clustering. State and explain the various types with a suitable example. (10 marks)
6 (a) What is meant by classification? Justify why classification is said to be supervised learning. How is classifier accuracy determined? Also explain its various types. (10 marks)
6 (b) What is meant by Market Basket Analysis? Explain with an example. State and explain, with formulae, the meaning of the terms:-
(i) Support
(ii) Confidence
(iii) Iceberg queries
Hence explain how to mine multilevel association rules from transactional databases, with an example of each.
(10 marks)


Write short notes on (any two)

7 (a) OLAP operations (10 marks)
7 (b) Data warehouse deployment and maintenance (10 marks)
7 (c) Attribute-oriented induction (10 marks)
7 (d) Web mining (10 marks)
