Big Data Analytics Question Paper - Dec 16 - Information Technology (Semester 8)

Total marks: 80
Total time: 3 Hours

INSTRUCTIONS
(1) Question 1 is compulsory.
(2) Attempt any three from the remaining questions.
(3) Draw neat diagrams wherever necessary.

1(a) What are the three Vs of Big Data? Give two examples of big data case studies. Indicate which Vs are satisfied by these case studies.

5 marks

1(b) What is the role of a "combiner" in the MapReduce framework? Explain with the help of an example.

5 marks

1(c) Illustrate through an example how the triangular array can be used to optimally store and count pairs in a frequent itemset mining algorithm.

5 marks

1(d) List the different issues and challenges in data stream query processing.

5 marks

2(a) What are the different data architecture patterns in NoSQL? Explain the "key-value" store and "document" store patterns with relevant examples.

10 marks

2(b) Show the MapReduce implementation for the following two tasks using pseudocode.
i) Multiplication of two matrices
ii) Computing group-by and aggregation of a relational table

10 marks

3(a) Give a formal definition of the Nearest Neighbor problem. Show how finding plagiarism in documents is a Nearest Neighbor problem. What similarity measures can be used?

10 marks

3(b) Clearly explain the concept of a Bloom Filter with the help of an example.

10 marks

4(a) Suppose a data stream consists of the integers 3, 1, 4, 1, 5, 9, 2, 6, 5. Let the hash function be h(x) = 3x + 1 mod 5. Show how the Flajolet-Martin algorithm will estimate the number of distinct elements in this stream.

10 marks

4(b) Clearly explain how the CURE algorithm can be used to cluster big data sets.

10 marks

5(a) Define collaborative filtering. Using an example of an e-commerce site like Flipkart or Amazon, describe how it can be used to provide recommendations to users.

10 marks

5(b) Define PageRank. Using the web graph shown below, compute the PageRank at every node at the end of the second iteration. Use a teleport factor of 0.8.

[Figure: web graph for question 5(b); image not available]

10 marks

6(a) Explain clearly with diagrams how the PCY algorithm helps to perform frequent itemset mining for large datasets.

10 marks

6(b) For the graph given below, use the betweenness factor to find all communities.

[Figure: graph for question 6(b); image not available]

10 marks
