CIS-4250/5250 (Big Data Processing) Home Page

This is the home page for Peter Chapin's CIS-4250/5250 course notes for the Fall 2018 semester. Here you will find class handouts, slides used during the lectures, homework assignments, and links to other references of interest.

File: Jumbo-2018-08-25.ova (lemuria), Jumbo-2018-08-25.ova (external)
Size: 14,155,683,840 bytes
MD5: 37ab28dfb2a9da09562c7ca6b8924a57
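
The size and MD5 listed above can be used to check that a download of the virtual machine image completed intact. The following is a minimal sketch in Python, not an official course tool. The function names and the local file name passed to them are assumptions made for illustration; only the size and digest values come from the listing above.

```python
import hashlib
import os

# Values copied from the file listing above.
EXPECTED_SIZE = 14_155_683_840
EXPECTED_MD5 = "37ab28dfb2a9da09562c7ca6b8924a57"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte image never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_size=EXPECTED_SIZE, expected_md5=EXPECTED_MD5):
    """Return True only if both the byte count and the MD5 digest match.

    The size is checked first because it is cheap; the digest is only
    computed when the size already agrees.
    """
    if os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

For example, assuming the image was saved in the current directory under its original name, `verify_download("Jumbo-2018-08-25.ova")` returns True when the file matches the listing and False otherwise.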


All live lectures will be accessed from the following Zoom URL:

The list below gives approximate lecture-by-lecture topics for this course. The topics with links to Zoom lectures are for this (Fall 2018) edition of the course. The topics without links are approximate and subject to change.




  1. Homework #01 Average Temperatures (Awk). Due: 2018-09-06
  2. Homework #02 Average Temperatures (Hadoop). Due: 2018-09-21
  3. Homework #03 Analyzing the Ga Data Set. Due: 2018-10-16 (see AverageTemperature.scala as a template for using the Tool interface)
  4. Homework #04 Ga with Spark. Due: 2018-10-26
  5. Homework #05 RFCAnalyzer. Due: 2018-12-14
  6. Homework #06 Kafka. Due: 2018-12-14


The following are links to relevant resources for this class.

Last Revised: 2018-12-12
© Copyright 2018 by Peter C. Chapin