The main goal of this assignment is to prepare you for, and test your knowledge of, the first objective: knowing and preparing your data. It helps you understand the characteristics of your data so that you can select the correct techniques and technologies to process and analyze it, and to make decisions.
In this assignment, you will demonstrate your knowledge of and experience with Hadoop (or Spark) installation and configuration, a suitable programming language and environment, and the modern big data processing framework MapReduce, with its mapper and reducer modules.
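As an illustration of the mapper and reducer modules mentioned above, here is a minimal sketch of a word count written in the MapReduce style. It runs in a single Python process and only imitates the map, shuffle, and reduce phases; it is not tied to Hadoop or Spark, and the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are hypothetical names chosen for this example, not part of any framework API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def mapper(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group values by key (a real framework does this for you)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Sort by key, as MapReduce frameworks present keys to reducers in order.
    return dict(sorted(groups.items()))


def reducer(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Reduce phase: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory list of lines."""
    grouped = shuffle(mapper(lines))
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data", "Big Data tools"]  # assumed toy input
    print(word_count(sample))
```

On a real cluster, the mapper and reducer would typically be separate modules and the framework would handle the shuffle; this sketch keeps everything in one file only so that the data flow between the phases is easy to follow.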
The main purpose of this assignment is to test your ability to implement at least one machine learning technique on both a regular computing platform and a big data computing platform, and to analyze your data sets.