Data Warehouse & Mining - Dec 2016
Computer Engineering (Semester 8)
TOTAL MARKS: 100
TOTAL TIME: 3 HOURS
(1) Question 1 is compulsory.
(2) Attempt any four from the remaining questions.
(3) Assume data wherever required.
(4) Figures to the right indicate full marks.
1(a) Consider the following dimensions for a Hypermarket chain: Product, Store, Time and Promotion. With respect to this business scenario, answer the following questions. Clearly state any reasonable assumptions you make. Design a star schema. Can the star schema be converted to a snowflake schema? Justify your answer and draw the snowflake schema for the data warehouse (clearly mention the fact table(s), dimension table(s), their attributes and measures). (10 marks)
1(b) Define linear, non-linear and multiple regression. Plan a regression model for disease development with respect to changes in weather parameters. (10 marks)
2(a) What is meant by metadata in the context of a data warehouse? Explain the different types of metadata stored in a data warehouse. Illustrate with a suitable example. (10 marks)
2(b) Describe the various functionalities of data mining as a step in the process of Knowledge Discovery. (10 marks)
3(a) In what way can the ETL cycle be used in a typical data warehouse? Explain with a suitable example. (10 marks)
3(b) What is a clustering technique? Discuss the agglomerative algorithm with the following data and plot a dendrogram using the single-link approach. The table below comprises sample data items indicating the distance between the elements.
|T-1000|M, O, N, K, E, Y|
|T-1001|D, O, N, K, E, Y|
|T-1002|M, A, K, E|
|T-1003|M, U, C, K, Y|
|T-1004|C, O, O, K, E|
i) OLTP Vs. OLAP
ii) Data Warehouse Vs. Data Mart (10 marks)
5(b) Why is naive Bayesian classification called "naive"? Briefly outline the major ideas of naive Bayesian classification. (10 marks)
Write short notes on any four of Q6 (i, ii, iii, iv, v).
6(i) Application of Data Mining to Financial Analysis (5 marks)
6(ii) Factless Fact Table (5 marks)
6(iii) Indexing OLAP data (5 marks)
6(iv) Data Quality (5 marks)
6(v) Decision Tree based Classification Approach (5 marks)