Applied Machine Learning

LACOL Shared Course: Applied Machine Learning (ML)

Summer Term: June 7 – August 4, 2023

Format: Fully online class with synchronous and asynchronous elements

Summer 2023 Teaching Team: 

  • Simon Hoellerbauer, Postdoctoral Fellow in Data Science and Society, Vassar College (Course Director)
  • Natalia Toporikova, Assoc. Professor of Neuroscience and Data Science, Washington and Lee University
  • LACOL college lecturers in applied topics TBA
  • Student Teaching Assistants: C. Klein, Vassar College ’26 and TBA

Course Advisors: 

  • Laurie Heyer, Professor of Mathematics and Computer Science, Davidson College
  • Monika Hu, Associate Professor of Statistics, Vassar College

Eligibility: Open to approx. 25 students from across LACOL institutions

Credit: The option and type of credit students may receive vary by home institution.

Prerequisites: This course is designed for students majoring in STEM or Social Science fields outside* of Computer Science or Statistics. 

  • Calculus I is required.
  • Basic proficiency in R is required, gained through prior coursework such as an Introduction to Data Science class. Students should be familiar with linear models, data frames, and basic coding.

*Note: CS/Stats majors may be more likely to take an ML class with advanced math prerequisites in their home department.

Course Learning Goals/Outcomes:

  • Become familiar with basic concepts of selected commonly used ML models
  • Apply existing ML models to authentic data
  • Translate model findings into practical recommendations/decision making
  • Recognize bias and potential for misuse of ML methods

Course Description:

Machine Learning is an important modern approach to data-driven decision-making. It brings together computer science, mathematics, and statistics to extract new information from data. It is commonly used in many STEM and social science disciplines and is an established tool for identifying trends and making predictions.

This class will probe further into the question “what is machine learning?” and teach you how to investigate data using machine learning models. It will teach you how to extract and identify useful features that best represent your data. You will learn a few of the most important machine learning algorithms (e.g. logistic regression, k-nearest neighbors, support vector machines, and random forests) and learn how to evaluate their performance.  

You will also explore some of the inductive biases of ML methods (i.e. what assumptions about the data and the world are baked into the structure of a given algorithm) and implications for the types of problems the algorithm is appropriate for. You will examine the potential for misuse of ML – sometimes unintended, sometimes malicious – through concrete examples (drawing on sources such as Atlas of AI) and discuss the value judgments, critical thinking, and societal impacts that must be considered whenever ML is applied in the real world. 

Finally, with a team of your classmates, you will develop your own machine learning model and apply it to understand real-life data.

Machine Learning Topics and Algorithms:

  • Review: data wrangling/cleaning
  • Review: linear algebra
  • GitHub and Git
  • How to come up with a research question
  • Regularized regression (LASSO, ridge regression) 
  • k-Nearest-Neighbors
  • Support vector machines (SVM)
  • Clustering (k-means, hierarchical)
  • Trees (CART, random forests, boosting)
  • Debiased machine learning
  • Neural networks (including deep learning) if time permits

Course Technology:

  • Enrolled students will be granted access (at no cost to the student) to:
    • Course website / LMS
    • R/RStudio coding environment (Quarto or R Markdown)
  • Students must ensure their own access to a computer with a webcam and broadband internet on a daily basis during the summer term.

Programming Language: Students will be introduced to ML libraries and algorithms through hands-on coding labs using R.

Text: https://www.statlearning.com/ 

Course Information – Summer 2023:

LACOL’s shared course model:

  • As a fully online, shared course, the Applied Machine Learning class serves various Data Science programs across the membership.
  • As a subject, Applied ML is amenable to, and benefits from, online and collaborative learning.
  • The Applied ML course gives LACOL faculty and students a fresh context in which to experiment with digital teaching, learning, and collaboration at a more advanced level. (See also: LACOL Introduction to Data Science.)