Applied Machine Learning

LACOL Shared Course: Applied Machine Learning (ML)

Summer Term: June 7 – August 4, 2023

Format: Fully online class with synchronous and asynchronous elements

Summer 2023 Teaching Team: 

  • Simon Hoellerbauer, Postdoctoral Fellow in Data Science and Society, Vassar College
  • Natalia Toporikova, Assoc. Professor of Neuroscience and Data Science, Washington and Lee University
  • additional teaching staff TBA

Project Advisors: 

  • Laurie Heyer, Professor of Mathematics and Computer Science, Davidson College
  • Monika Hu, Associate Professor of Statistics, Vassar College

Eligibility: Open to approx. 25 students from across LACOL institutions

Credit: The availability and type of credit students may receive vary by home institution.

Prerequisites: This course is designed for students majoring in STEM or Social Science fields outside* of Computer Science or Statistics. 

  • Calculus I is required.
  • Basic proficiency in R or Python is required, gained through prior coursework such as Introduction to Data Science, Introduction to Statistics, or a Modeling/Methods class where the student has encountered linear models and basic coding.

*Note: CS/Stats majors may be more likely to take an ML class with advanced math prerequisites in their home department. However, the LACOL class may be open to CS/Stats majors as enrichment.

Course Learning Goals/Outcomes:

  • Become familiar with basic concepts of selected commonly used ML models
  • Apply existing ML models to authentic data
  • Translate model findings into practical recommendations/decision making
  • Recognize bias and potential for misuse of ML methods

Course Description:

Machine Learning is an important modern approach to data-driven decision-making. It brings together computer science, mathematics, and statistics to extract new information from data. It is commonly used in many STEM and social science disciplines and is an established tool for identifying trends and making predictions.

This class will probe further into the question “what is machine learning?” and teach you how to investigate data using machine learning models. It will teach you how to extract and identify useful features that best represent your data. You will learn a few of the most important machine learning algorithms (e.g. logistic regression, k-nearest neighbors, support vector machines, and random forests) and learn how to evaluate their performance.  

You will also explore some of the inductive biases of ML methods (i.e. what assumptions about the data and the world are baked into the structure of a given algorithm) and implications for the types of problems the algorithm is appropriate for. You will examine the potential for misuse of ML – sometimes unintended, sometimes malicious – through concrete examples (drawing on sources such as Atlas of AI) and discuss the value judgments, critical thinking, and societal impacts that must be considered whenever ML is applied in the real world. 

Finally, with a team of your classmates, you will develop your own machine learning model and apply it to understand real-life data.

Machine Learning Algorithms That May Be Covered Include:

  • Regularized regression (LASSO, ridge regression) 
  • k-Nearest-Neighbors
  • Support vector machines (SVM)
  • Clustering (k-means, hierarchical)
  • Trees (CART, random forests, boosting)
  • Debiased machine learning
  • Neural networks (including deep learning) if time permits

Course Technology:

  • Enrolled students will be granted access (at no cost to the student) to:
    • course website / Canvas LMS
    • coding environment – JupyterHub, replit, or RStudio Server (platform details TBA)
  • Students must ensure their own access to a computer with a webcam and broadband internet on a daily basis during the summer term.

Programming Language: Students will be introduced to ML libraries and algorithms through hands-on coding labs using either R or Python. The choice between R and Python will be announced early in 2023.

Texts

LACOL’s shared course model:

  • As a fully online, shared course, the Applied Machine Learning class serves various Data Science programs across the membership.
  • As a subject, Applied ML is amenable to, and benefits from, online and collaborative learning.
  • The Applied ML course gives LACOL faculty and students a fresh context to experiment with digital teaching, learning, and collaboration at a more advanced level. (See also: LACOL Introduction to Data Science.)