CPSC 324: Big Data Analytics

Spring 2024

Course Information

Course Resources


Weekly Schedule
Week          Topic                                       HW
1 (1/17)      Course Overview
1 (1/19)      Data Architectures
2 (1/22)      Reading Papers / Data Architectures         R-1 (clusters, borg)
2 (1/24)      Machines and Networks
2 (1/26)      Machines and Networks                       Q-1
3 (1/29)      GCP Basics and System Properties            HW-1 (gcp, cloud storage)
3 (1/31)      System Properties
3 (2/2)       Cluster Management
4 (2/5)       Cluster Management                          R-2 (gfs, mr)
4 (2/7)       Distributed File Systems
4 (2/9)       Distributed File Systems                    Q2
5 (2/12)      BigQuery Basics                             HW-2 (bq)
5 (2/14)      MapReduce
5 (2/16)      MapReduce
6 (2/19)      No Class: President's Day Holiday
6 (2/21)      MapReduce
6 (2/23)      MapReduce Examples                          Q3
7 (2/26)      Looker Studio Overview                      HW-3 (looker)
7 (2/28)      Data Warehouse Concepts
7 (3/1)       Exam 1
8 (3/4)       Data Processing Architectures               R-3 (dremel, spark)
8 (3/6)       Data Processing Architectures
8 (3/8)       Query Processing
(3/11-3/15)   No Class: Spring Break
9 (3/18)      Dataproc (Spark) Basics, Query Processing   HW-4 (spark), Project
9 (3/20)      Query Processing
9 (3/22)      Query Processing                            Q4
10 (3/25)     Query Processing                            R-4 (dataflow)
10 (3/27)     Open Data Formats (intro)
10 (3/29)     No Class: Good Friday
11 (4/1)      No Class: Easter Holiday
11 (4/3)      BQ ML Overview, Open Data Formats (cont)    HW-5 (bq ml)
11 (4/5)      Open Data Formats (PAX, Parquet)            Proj part 1 due
12 (4/8)      Open Data Formats (Parquet details)
12 (4/10)     SQL on MapReduce
12 (4/12)     SQL on MapReduce                            Q5, R-5 (tensorflow)
13 (4/15)     SQL on MapReduce
13 (4/17)     Distributed SQL: Presto                     HW-6 (pipelines)
13 (4/19)     Exam 2
14 (4/22)     Pub/Sub, Dataflow                           Proj part 2 due
14 (4/24)     Distributed SQL: Presto
14 (4/26)     Distributed SQL: Presto                     Q6
15 (4/29)     Distributed SQL: Presto
15 (5/1)      Distributed SQL: Dremel
15 (5/3)      Wrap Up                                     Make-Up Quiz
16 (5/8)      FINAL EXAM, WED, 3:30-5:30                  Proj due (5/10)