Big Data Systems (ELL886)
Credit
3.00 (L-T-P: 3-0-0)
Department / Center / School / Unit
Course Contents
Introduction; Hadoop, Map-Reduce, GFS/HDFS, Bigtable/HBase; Extensions of Map-Reduce: iMapReduce (iterative), incremental Map-Reduce. SQL and data-parallel programming, DryadLINQ. Data-flow parallelism vs. message passing. Data locality. Memory hierarchies. Sequential versus random access to secondary storage. NoSQL systems. NewSQL systems. Finding similar items and LSH; Search technology: link analysis and the PageRank algorithm; Large-scale graph processing; Mining streaming data and real-time analytics: window semantics and window joins. Sampling and approximating aggregates (no joins). Querying histograms. Maintaining histograms of streams. Use of Haar wavelets. Incremental and online query processing: online aggregation.