Data Science & Engineering
ECE 204 (formerly ECE 379)
University of Wisconsin–Madison
Instructor: Laurent Lessard
A hands-on introduction to Data Science using the Python programming language. The course is intended for freshmen and sophomores of any major who have limited prior experience in computer programming or data science. The course teaches how to think about data-centric problems in a computational way. Given data from real-world phenomena, students will learn to describe, analyze, and make predictions. To this end, the course will also introduce programming in Python, which is among the most widely used programming languages in the data science industry. Topics covered include: how to import, manipulate, summarize, and visualize data of various types, how to perform descriptive analyses such as clustering and principal component analysis, how to perform predictive analyses such as classification and regression, and notions of bias, fairness, and ethics in data science.
Prerequisites: There are no prerequisites for this course. We will provide you with the tools you need and teach you how to use them. Most importantly, we will equip you with the knowledge and ability to continue using what you’ve learned long after you complete the class, throughout your studies and beyond.
IMPORTANT: The materials below are from Fall 2019-20, which was the last time Prof. Lessard taught this course. More recent offerings of the course might use different notes/materials.
The class is organized into modules.
- Python modules show how to perform specific computations and tasks using Python and/or Jupyter notebooks. These modules consist of lecture slides containing explanations and code snippets.
- Concept modules explain a new concept, typically from a mathematical, geometric, or intuitive perspective, with illustrative examples. These modules consist of lecture slides.
- Case studies apply the concepts covered in previous modules to a realistic use case. This typically includes manipulating and analyzing data sets, and interpreting and visualizing the results. Case study modules are Jupyter notebooks (.ipynb files) that end with a short quiz.
- Introduction/Survey modules give a high-level overview of what’s to come or a summary of what has been covered thus far. These modules consist of lecture slides.
Lecture slides are also available for download as PDFs. The typical pace of the class is two modules per lecture.
Part I: Python and Jupyter basics
Part II: Unsupervised learning
Part III: Supervised learning
Part IV: Time series
ECE 204 is intended to be a first course in programming and learning to reason with data. It was the first course of its kind at UW-Madison, created and first taught by Laurent Lessard with help from Teaching Assistants Scott Sievert, Pankaj Kabra, and Shashank Varma. The course is still under active development and continues to evolve.
Learning outcomes
In other words: what are the skills you will acquire upon completing this class?
- Write working code in Python to import, manipulate, analyze, visualize, and otherwise interact with datasets of various types. If you don’t know what “writing code” even means, you’ll learn that too!
- Perform descriptive analyses to extract, summarize, and interpret salient features from datasets.
- Perform predictive analyses to model trends and make predictions from datasets.
- Apply techniques to identify and clean data that contains missing entries, outliers, or other forms of noise or uncertainty.
- Recognize and evaluate potential issues pertaining to bias, fairness, privacy, and ethics in applying data science techniques. Also understand the limits of what data can do.
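To give a rough sense of what the first few outcomes look like in practice, here is a minimal sketch that imports a small dataset, removes missing entries, flags an outlier, and summarizes what remains. It uses only the Python standard library. The data, the column names, the helper functions, and the outlier threshold are all hypothetical and are not taken from the course materials.

```python
import csv
import io
import statistics

# Hypothetical data: daily temperature readings with one missing entry
# and one obvious outlier. These values are made up for illustration.
RAW_CSV = """day,temp_c
Mon,21.5
Tue,
Wed,22.0
Thu,95.0
Fri,20.5
Sat,21.0
"""


def load_rows(text):
    """Import: parse CSV text into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_missing(rows, column):
    """Clean: keep rows where `column` is non-empty and convert it to float."""
    return [
        {**row, column: float(row[column])}
        for row in rows
        if row[column].strip() != ""
    ]


def split_outliers(rows, column, threshold=10.0):
    """Separate rows whose value is more than `threshold` from the median.

    The threshold is an arbitrary choice for this example, not a general rule.
    """
    median = statistics.median(row[column] for row in rows)
    kept = [r for r in rows if abs(r[column] - median) <= threshold]
    outliers = [r for r in rows if abs(r[column] - median) > threshold]
    return kept, outliers


rows = load_rows(RAW_CSV)
clean = drop_missing(rows, "temp_c")
kept, outliers = split_outliers(clean, "temp_c")

print("rows read:", len(rows))
print("rows after dropping missing entries:", len(clean))
print("outlier days:", [r["day"] for r in outliers])
print("mean of remaining temperatures:", statistics.mean(r["temp_c"] for r in kept))
```

Deciding what counts as an outlier is itself a judgment call: the fixed distance from the median used here is only one of many possible rules, and a different rule could keep or discard different rows.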
Evaluation
Evaluation is based on a combination of in-class activities, homework assignments, midterm exams, and a final exam. These will largely be hands-on activities where you will complete tasks on your computer and submit your answers electronically.
Materials required
The only thing you will need is a laptop. All course-related materials and software will be provided.