This course adopts the textbook "Machine Learning Models and Algorithms for Big Data Classification: Thinking with Examples for Effective Learning," authored by Shan Suthaharan and published by Springer US in 2015-2016. It covers the main topics suitable for learning about big data and machine learning under four categories: understanding of data, understanding of systems, understanding of machine learning, and understanding of scaling up machine learning. In essence, this course will prepare you to become a data analyst, a data scientist, or an educator, not only by providing you with the theoretical background in these topics, but also by offering many worked examples that will challenge you to extend the ideas to your own applications.
The main motivation for developing and delivering the course is the need for experts to work on the massive data problems currently encountered by industry and research organizations. This problem is expected to grow exponentially with the growth of data and with the challenges that data complexity and technological limitations bring to the problem domain. Therefore, it is important to educate and train as many professionals as possible and to help them understand the necessity of collaborative (teamwork) efforts. The development and delivery of the course are also motivated by the support and encouragement of the University of North Carolina at Greensboro, the "Center for Science of Information" at Purdue University, and Springer US.