11-785 Deep Learning

11-785 Introduction to Deep Learning

Spring 2021
Zoom Link to Lecture

Bulletin and Active Deadlines

Assignment	Deadline	Description	Links
This piece is performed by the Chinese Music Institute at Peking University (PKU) together with PKU's Chinese orchestra. This is an adaptation of Beethoven: Serenade in D major, Op.25 - 1. Entrata (Allegro), for Chinese transverse flute (Dizi), clarinet and flute.
HW4P1	May 2	Language Modelling using RNNs	Autolab, Writeup (*.pdf)
HW4P2	May 2 (See Piazza for early deadline)	Listen, Attend, and Spell	Kaggle, Writeup (*.pdf)
Sign Up for Project Groups	Feb. 5	-	Piazza
Project Gallery
Here's an example of a successful project from Fall 2020. The team developed an AI Limerick generator, and compiled a book from the AI Poet's creations.	-	-	Project Report, Project Video, Book (Amazon)

The Course

“Deep Learning” systems, typified by deep neural networks, are increasingly taking over all AI tasks, ranging from language understanding, and speech and image recognition, to machine translation, planning, and even game playing and autonomous driving. As a result, expertise in deep learning is fast changing from an esoteric desirable to a mandatory prerequisite in many advanced academic settings, and a large advantage in the industrial job market.

In this course we will learn about the basics of deep neural networks, and their applications to various AI tasks. By the end of the course, it is expected that students will have significant familiarity with the subject, and be able to apply Deep Learning to a variety of tasks. They will also be positioned to understand much of the current literature on the topic and extend their knowledge through further study.

If you are only interested in the lectures, you can watch them on the YouTube channel.

Course description from student point of view

The course is well rounded in terms of concepts and helps us understand the fundamentals of Deep Learning. It starts off gradually with MLPs and progresses to more complicated concepts such as attention and sequence-to-sequence models. We get complete hands-on experience with PyTorch, which is essential for implementing Deep Learning models. As a student, you will learn the tools required for building Deep Learning models. The homeworks usually have two components, Autolab and Kaggle. The Kaggle components allow us to explore multiple architectures and understand how to fine-tune and continuously improve models. The tasks across the homeworks were similar, and it was interesting to learn how the same task can be solved using multiple Deep Learning approaches. Overall, by the end of this course you will be confident enough to build and tune Deep Learning models.

Prerequisites

We will be using NumPy and PyTorch in this class, so you will need to be able to program in Python 3.
You will need familiarity with basic calculus (differentiation, chain rule), linear algebra and basic probability.

Units

Courses 11-785, 18-786, and 11-685 are equivalent 12-unit graduate courses, and have a final project. Course 11-485 is the undergraduate version worth 9 units, the only difference being that there is no final project.

Acknowledgments

Your Supporters

Instructors:

Bhiksha Raj : bhiksha@cs.cmu.edu
Rita Singh : rsingh@cs.cmu.edu

TAs:

Akshat Gupta: akshatgu@andrew.cmu.edu
Anurag Katakkar: akatakka@andrew.cmu.edu
Alex Kwark: hkwark@andrew.cmu.edu
David Park: jinhyun1@andrew.cmu.edu
Hao Chen: haoc3@andrew.cmu.edu
Joseph Konan: jkonan@andrew.cmu.edu
Kinori Rosnow: krosnow@andrew.cmu.edu
Owen Wang: owenw@andrew.cmu.edu
Sai Prahladh: saiprahp@andrew.cmu.edu
Shayeree Sarkar: shayeres@andrew.cmu.edu
Shentong Mo: shentonm@andrew.cmu.edu
Vaidehi Joshi: vaidehij@andrew.cmu.edu
(China/India) Haoxuan Zhu: haoxuanz@andrew.cmu.edu
(China/India) Yi Shen: yishen@andrew.cmu.edu
(Doha) Nour Ali: ntali@andrew.cmu.edu
(India) Shriti Priya: shritip@andrew.cmu.edu
(Kigali) Charles Yusuf: cyusuf@andrew.cmu.edu
(Kigali) Tanya Akumu: takumu@andrew.cmu.edu

Pittsburgh Schedule (Eastern Time)

Lecture: Monday and Wednesday, 8:20 a.m. - 9:40 a.m.

Recitation: Friday, 8:20 a.m. - 9:40 a.m.

Office hours:

We will be using OHQueue and Zoom links listed on Piazza to manage office hours. The tentative schedule will be updated soon.

Course Work

Policy
Breakdown
Score Assignment		Grading will be based on weekly quizzes (24%), homeworks (50%) and a course project (25%). Note that 1% of your grade is assigned to Attendance.
Quizzes
Quizzes		There will be weekly quizzes, 14 in all. We will retain your best 12 out of these 14 quizzes. Quizzes will generally (but not always) be released on Friday and due 48 hours later. Quizzes are scored by the number of correct answers. Quizzes will be worth 24% of your overall score.
Assignments
Assignments		There will be five assignments in all. Assignments will include Autolab components, where you must complete designated tasks, and a Kaggle component, where you compete with your colleagues. Autolab components are scored according to the number of correctly completed parts. We will post performance cutoffs for A (100%), B (80%), C (60%), D (40%) and F (0%) for Kaggle competitions. Scores will be interpolated linearly between these cutoffs. Assignments will have an “early submission deadline”, an “on-time submission deadline” and a “late submission deadline.” Early submission deadline: You are required to make at least one submission to Kaggle by this deadline. People who miss this deadline will automatically lose 10% of subsequent marks they may get on the homework. This is intended to encourage students to begin working on their assignments early. On-time deadline: People who submit by this deadline are eligible for up to five bonus points. These points will be computed by interpolation between the A cutoff and the highest performance obtained for the HW. The highest performance will get 105. Late deadline: People who miss the on-time deadline can still submit until the late deadline. There is a 10% penalty applied to your final score for submitting late. Slack days: Everyone gets up to 7 slack days, which they can distribute across all their homework P2s only. Once you use up your slack days you will fall into the late-submission category by default. Slack days are accumulated over all parts of all homeworks, except HW0, to which no slack applies. Kaggle scoring: We will use max(max(on-time score), max(slack-day score), 0.9*max(late-submission score)) as your final score for the HW. If this happens to be a slack-day submission, the slack days corresponding to the selected submission will be counted. Assignments carry 50% of your total score. HW0 is not graded (but is mandatory), while each of the subsequent four is worth 12.5%.
A fifth HW, HW5, will be released later in the course and will have the same weight as a course project. Please see Project section below for more details.
Project
Project		All students taking a graduate version of the course are required to do a course project. The project is worth 25% of your grade. These points are distributed as follows: 10% - Proposal; 15% - Midterm Report; 30% - Project Video; 5% - Responding to comments on Piazza; 40% - Paper peer review. Note that a Project is mandatory for 11-785/18-786 students. In the event of a catastrophe (remember Spring 2020), the Project may be substituted with HW5. 11-685 students may choose to do a Project instead of HW5. Either your Project OR HW5 will be graded.
Attendance
Attendance		If you are in section A, you are expected to attend the Zoom lectures. We will tag you as having attended a lecture if you are present for at least 60 minutes of its duration. If you are in any of the other (out-of-timezone) sections, you may either watch the real-time Zoom lectures or the recorded lectures on mediatech. If viewed on mediatech, the lectures of each week must be viewed before 8 AM of the Monday following the following week (otherwise, it doesn't count). At the end of the semester, we will select a random subset of 50% of the lectures and tabulate attendance. If you have attended at least 70% of these (randomly chosen) lectures, you get the attendance point.
Final grade
Final grade		The end-of-term grade is curved. Your overall grade will depend on your performance relative to your classmates.
Pass/Fail
Pass/Fail		Students registered for pass/fail must complete all quizzes, HWs and, if they are in the graduate course, the project. A grade equivalent to B- is required to pass the course.
Auditing
Auditing		Auditors are not required to complete the course project, but must complete all quizzes and homeworks. We encourage doing a course project regardless.
		End Policy
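
The scoring rules in the policy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's actual grading script: the function names, the example cutoff values, and the idea that a higher Kaggle metric is better are all hypothetical, and the sketch ignores the 10% early-deadline penalty, slack-day accounting, and the final curve. It also assumes each of the four graded homeworks is summarized by a single percentage.

```python
from bisect import bisect_right

# Score levels attached to the F, D, C, B and A cutoffs in the policy.
LEVELS = [0.0, 40.0, 60.0, 80.0, 100.0]


def kaggle_score(perf, cutoffs, best=None):
    """Map a leaderboard metric to a score by linear interpolation.

    cutoffs: metric values for F, D, C, B, A in increasing order
             (hypothetical here; the real ones are posted per homework).
    best:    highest metric in the class; pass it only for on-time
             submissions, which are the ones eligible for the bonus.
    """
    if perf <= cutoffs[0]:
        return 0.0
    if perf >= cutoffs[-1]:
        # Bonus region: 100 at the A cutoff, rising to 105 at the best score.
        if best is not None and best > cutoffs[-1]:
            frac = min(1.0, (perf - cutoffs[-1]) / (best - cutoffs[-1]))
            return 100.0 + 5.0 * frac
        return 100.0
    i = bisect_right(cutoffs, perf) - 1  # index of the cutoff just below perf
    lo, hi = cutoffs[i], cutoffs[i + 1]
    return LEVELS[i] + (LEVELS[i + 1] - LEVELS[i]) * (perf - lo) / (hi - lo)


def final_hw_score(on_time, slack, late):
    """max(max(on-time), max(slack-day), 0.9 * max(late)); empty lists count as 0."""
    return max(
        max(on_time, default=0.0),
        max(slack, default=0.0),
        0.9 * max(late, default=0.0),
    )


def quiz_average(quiz_scores, keep=12):
    """Mean of the best `keep` quiz percentages; missing quizzes count as 0."""
    best = sorted(quiz_scores, reverse=True)[:keep]
    return sum(best) / keep


def course_total(quiz_scores, hw_scores, project, attendance_ok):
    """Raw total: quizzes 24%, four homeworks at 12.5% each, project 25%, attendance 1%."""
    return (
        0.24 * quiz_average(quiz_scores)
        + sum(0.125 * s for s in hw_scores)
        + 0.25 * project
        + (1.0 if attendance_ok else 0.0)
    )


if __name__ == "__main__":
    cutoffs = [0.10, 0.40, 0.55, 0.65, 0.75]  # made-up accuracies for F..A
    print(kaggle_score(0.60, cutoffs))              # halfway between C and B
    print(final_hw_score([70.0], [], [85.0]))       # late score is scaled by 0.9
```

Because the end-of-term grade is curved, the value returned by `course_total` is only a raw percentage and does not by itself determine a letter grade.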

Study groups

This semester we will be implementing study groups. It is highly recommended that you join a study group; see the forms on the bulletin.

Piazza: Discussion Board

Piazza is what we use for discussions. You should be automatically signed up if you're enrolled at the start of the semester. If not, please sign up here. Also, please follow the Piazza Etiquette when you use Piazza.

AutoLab: Software Engineering

AutoLab is what we use to test your understanding of low-level concepts, such as engineering your own libraries, implementing important algorithms, and developing optimization methods from scratch.

Kaggle: Data Science

Kaggle is where we test your understanding and ability to extend neural network architectures discussed in lecture. Similar to how AutoLab shows scores, Kaggle also shows scores, so don't feel intimidated -- we're here to help. We work on hot AI topics, like speech recognition, face recognition, and neural machine translation.

Media Services/YouTube: Lecture and Recitation Recordings

CMU students who are not in the live lectures should watch the uploaded lectures at Media Services in order to get attendance credit. Links to individual videos will be posted as they are uploaded.

YouTube is where non-CMU folks can view all lecture and recitation recordings. Videos marked “Old” are not current, so please be aware of the video title.

Books and Other Resources

The course will not follow a specific book, but will draw from a number of sources. We list relevant books at the end of this page. We will also put up links to relevant reading material for each class. Students are expected to familiarize themselves with the material before the class. The readings will sometimes be arcane and difficult to understand; if so, do not worry, we will present simpler explanations in class.

You can also find a nice catalog of models that are current in the literature here. We expect that you will be in a position to interpret, if not fully understand, many of the architectures on the wiki and the catalog by the end of the course.

Academic Integrity

You are expected to comply with the University Policy on Academic Integrity and Plagiarism.

You are allowed to talk with and work with other students on homework assignments.
You can share ideas but not code. You should submit your own code.

Your course instructor reserves the right to determine an appropriate penalty based on the violation of academic integrity that occurs. Violations of the university policy can result in severe penalties including failing this course and possible expulsion from Carnegie Mellon University. If you have any questions about this policy and any work you are doing in the course, please feel free to contact your instructor for help.

Class Notes

A book containing class notes is being developed in tandem with this course; check it out.

Tentative Schedule of Lectures

Video of Student Discussion for HW2P2 (YT)

Lecture	Date	Topics	Slides and Video	Additional Materials	Quiz
0	-	Course Logistics Learning Objectives Grading Deadlines	Slides (*.pdf) Video (YT)		No Quiz
1	Monday Feb. 1	Introduction	Slides (*.pdf) Video (YT) Video (MT) Chat (*.txt) Polls (*.rtf)	The New Connectionism (1988) On Alan Turing's Anticipation of Connectionism	Quiz 1
2	Wednesday Feb. 3	Neural Nets as Universal Approximators	Slides (*.pdf) Video (YT) Video (MT) Chat (*.txt) Polls (*.rtf)	Hornik et al. (1989) Shannon (1949) On the Bias-Variance Tradeoff	Quiz 1
3	Monday Feb. 8	Learning a Neural Net	Slides (*.pdf) Video (YT) Video (MT) Chat (*.txt) Polls (*.rtf)	Widrow and Lehr (1992) Convergence of perceptron algorithm	Quiz 2
4	Wednesday Feb. 10	Backpropagation Calculus of backpropagation	Slides (*.pdf) Video (YT) Video (MT) Chat (*.txt) Polls (*.docx)	Werbos (1990) Rumelhart, Hinton and Williams (1986)	Quiz 2
5	Monday Feb. 15	Backpropagation, continued Calculus of backpropagation, continued	Slides (*.pdf) Video (YT, Part 1) Video (YT, Part 2) Video (MT) Chat (*.txt) Polls (*.docx)	Werbos (1990) Rumelhart, Hinton and Williams (1986)	Quiz 3
6	Wednesday Feb. 17	Convergence issues Loss Surfaces Momentum	Slides (*.pdf) Video (YT) Video (MT) Chat (*.txt) Polls (*.docx)	Backprop fails to separate, where perceptrons succeed, Brady et al. (1989) Why Momentum Really Works	Quiz 3
7	Monday Feb. 22	Batch Size, SGD, Minibatch, second-order methods	Slides (*.pdf) Video (YT) Video (MT) Chat (*.txt) Polls (*.docx)	Momentum, Polyak (1964) Nesterov (1983)	Quiz 4
8	Wednesday Feb. 24	Optimizers and Regularizers Choosing a divergence (loss) function Batch normalization Dropout	Slides (*.pdf) Video (YT) Video Part 2 (YT) Video (MT) Chat (*.txt) Polls (*.docx)	ADAGRAD, Duchi, Hazan and Singer (2011) Adam: A method for stochastic optimization, Kingma and Ba (2014)	Quiz 4
9	Monday March 1	Shift invariance and Convolutional Neural Networks	Slides (*.pdf) Polls (*.docx) Chat (*.txt) Video (YT) Video (MT)		Quiz 5
10	Wednesday March 3	Models of vision, Convolutional Neural Networks	Slides (*.pdf) Polls (*.docx) Chat (*.txt) Video (YT) Video (MT)		Quiz 5
11	Monday March 8	Learning in Convolutional Neural Networks	Slides (*.pdf) Polls (*.docx) Chat (*.txt) Video (YT)	CNN Explainer	Quiz 6
12	Wednesday March 10	Learning in CNNs, transpose Convolution	Slides (*.pdf) Video (YT) Polls (*.docx) Chat (*.txt)		Quiz 6
13	Monday March 15	Time Series and Recurrent Networks	Slides (*.pdf) Video Part 1 (YT) Video Part 2 (YT) Polls (*.docx) Chat (*.txt)	Fahlman and Lebiere (1990) How to compute a derivative, extra help for HW3P1 (*.pptx)	Quiz 7
14	Wednesday March 17	Stability and Memory, LSTMs	Slides (*.pdf) Video (MT) Video (YT) Polls (*.docx) Chat (*.txt)	Bidirectional Recurrent Neural Networks	Quiz 7
15	Monday March 22	Loss Functions in RNNs, Sequence Prediction	Video (MT) Slides (*.pdf) Video (YT)	LSTM	Quiz 8
-	Monday Mar. 22	CNNs Office Hours	Video (YT)
16	Wednesday March 24	Connectionist Temporal Classification Sequence prediction	Video (MT) Video (YT) Polls (*.docx) Slides (*.pdf)	See recitation 2 on computing derivatives
17	Monday March 29	Connectionist Temporal Classification (CTC) Sequence To Sequence Prediction	Video (MT) Video (YT) Slides (*.pdf) Polls (*.docx)	Labelling Unsegmented Sequence Data with Recurrent Neural Networks	Quiz 9
18	Wednesday March 31	Sequence To Sequence Methods Attention	Video (MT) Video (YT) Slides (*.pdf) Polls (*.docx) Chat (*.txt)		Quiz 9
-	Monday April 5	No class			Quiz 10
19	Wednesday April 7	Representations and Autoencoders	Video (MT) Video (YT) Slides (*.pdf) Polls (*.docx)	Quiz Reading (*.pdf)	Quiz 10
20	Monday April 12	Variational Auto Encoders : EM and Variational Bounds	Video (MT) Video (YT) Slides (*.pdf) Polls (*.docx)		Quiz 11
21	Wednesday April 14	Variational Auto Encoders	Video (MT) Video (YT) Slides (*.pdf)	Tutorial on VAEs (Doersch) Autoencoding variational Bayes (Kingma)	Quiz 11
22	Monday April 19	Generative Adversarial Networks, 1	Video (MT) Video (YT) Slides (*.pdf)		Quiz 12
23	Wednesday April 21	Generative Adversarial Networks, 2	Video (MT) Slides (*.pdf) Video (YT)		Quiz 12
24	Monday April 26	Hopfield Nets	Video (MT) Slides (*.pdf)		Quiz 13
25	Wednesday April 28	Hopfield Nets and Boltzmann Machines	Video (MT) Slides (*.pdf) Slides Boltzmann Machines (*.pdf)		Quiz 13
26	Monday May 3	Wrap Up : A quick run through over everything we covered	Video (MT) Slides (*.pdf)		Quiz 14
27	Wednesday May 5	Guest Lecture - Mahaveer Jain, Facebook	Slides (*.pdf)		Quiz 14

Tentative Schedule of Recitations

Recitation	Date	Topics	Materials	Videos	Instructor
0A	Due Feb. 1	Object Oriented Programming	Notebook (*.zip)	Video (YT)	Shayeree Sarkar
0B	Due Feb. 1	Fundamentals of NumPy and PyTorch	Notebook (*.zip)	Video (YT)	Nour Ali
0C	Due Feb. 1	AWS Setup	Handout	Video (YT)	Vaidehi Joshi
0D	Due Feb. 1	Introduction to Google Colab	Handout	Video (YT)	Haoxuan Zhu
0E	Due Feb. 1	Debugging	Notebook (*.zip)	Video (YT)	Owen Wang
0F	Due Feb. 1	Remote Notebooks	Handout	Video (YT)	Zhihao Wang
1	Out Feb. 6	Your First Deep Learning Code	Slides (*.pdf)	Video (MT)	David Park
1	Out Feb. 6	Basics of an MLP	Slides (*.pdf)	Video (YT)	Tanya Akumu
2	Out Feb. 12	Computing Derivatives	Slides (*.pdf)	Video (YT)	Kinori and Sai
HW 1 Bootcamp	Out Feb. 12	How to get started with HW1	Notebook (*.ipynb)	Video (YT)	Vaidehi and Sai
3	Out Feb. 19	Optimizing the Network	Notebook 1 (*.ipynb) Notebook 2 (*.ipynb) Slides (*.pdf)	Video (YT)	Alex and Shentong
4	Out Feb. 26	Convolutional Neural Networks	CNN Basics (*.pdf) CNN Backprop (*.pptx)	Video (YT)	Nour, Vaidehi, Shayeree and Tanya
5	Out Mar. 5	CNNs: Classification and Verification	Slides (*.pdf) Handout (*.zip)	Video (MT) Video (YT)	David Park
HW 2 Bootcamp	Out Feb. 9	How to get started with HW2	Handout (*.zip) Slides (*.pdf)	Video (YT)	Shriti and Shayeree
6	Out Mar. 12	RNN Basics	Slides (*.pdf) Handout (*.zip)	Video (YT)	Joseph Konan and Kinori Rosnow
7	Out Mar. 19	CTC and Beam Search	Slides (*.pdf) Handout (*.ipynb)	Video (YT)	Akshat Gupta and Charles Yusuf
HW 3 Bootcamp	Out Mar. 24	How to get started with HW3	Slides (*.pdf)	Video (YT)	Owen, Charles, and Kinori
8	Out Mar. 26	Attention	Slides (*.pdf) Additional Notes used in Recitation (*.pdf) Handout (*.zip)	Video (YT)	Anurag Katakkar and Shriti Priya
9	Out Mar. 28	Autograd Bootcamp	Slides (*.pdf)	Video (YT)	Kinori Rosnow
10	Out Apr. 2	HW4P2 Bootcamp, Listen Attend Spell		Video (YT)	Eason
11	Out Apr. 9	Representations and Autoencoders	Slides - Representation Learning (*.pdf) Slides - Autoencoders (*.pdf)	Video (YT)	Anurag and Shentong
12	Out Apr. 16	Autoencoders and VAEs	Handout (*.zip)	Video (YT)	Akshat and Joseph
13	Out Apr. 16	GANs	Slides (*.pdf) Notebook (*.ipynb)		Akshat
14	Out Apr. 16	GANs	Slides (*.pdf)		Akshat

Assignments and Quizzes

∑ Ongoing, ∏ Upcoming

Assignment	Released	Due	Material / Links
HW0p1	Winter Break	Feb 8	Autolab, handout (see recitation 0s)
HW0p2	Winter Break	Feb 8	Autolab, handout (see recitation 0s)
Quiz 1	Feb 6, 12:00 AM EST	Feb 7, 11:59 PM EST	Canvas
HW1P1	Feb 8, 12:00 AM EST	Feb 28, 11:59 PM EST	Autolab, Writeup (*.pdf)
HW1P2	Feb 8, 12:00 AM EST	Feb 14, 11:59 PM EST (Early Deadline) Feb 28, 11:59 PM EST (Final Deadline)	Kaggle, Writeup (*.pdf)
Quiz 2	Feb 13, 12:00 AM EST	Feb 14, 11:59 PM EST	Canvas
Quiz 3	Feb 20, 12:00 AM EST	Feb 21, 11:59 PM EST	Canvas
∑ HW1P1 BONUS	Feb 20, 12:00 AM EST	Apr 29, 11:59 PM EST	Autolab, Handout (*.zip)
Quiz 4	Feb 27, 12:00 AM EST	Feb 28, 11:59 PM EST	Canvas
∑ HW2P1	Mar. 1, 12:00 AM EST	Mar. 21, 11:59 PM EST	Autolab, Writeup (*.pdf)
∑ HW2P2	Mar. 1, 12:00 AM EST	Mar. 7, 11:59 PM EST (Early Deadline) Mar. 21, 11:59 PM EST (Final Deadline)	Kaggle, Writeup (*.pdf) General Tips for Training CNNs: Revisiting ResNets, Bag of Tricks
Quiz 5	Mar. 6, 12:00 AM EST	Mar. 7, 11:59 PM EST	Canvas
∏ Project Proposal Submission	-	Mar. 10, 11:59 PM EST	Canvas
Quiz 6	Mar. 13, 12:00 AM EST	Mar. 14, 11:59 PM EST	Canvas
Quiz 7	Mar. 20, 12:00 AM EST	Mar. 21, 11:59 PM EST	Canvas
∑ HW3P1	Mar. 21, 12:00 AM EST	Apr. 11, 11:59 PM EST	Autolab, Writeup (*.pdf)
∑ HW3P2	Mar. 21, 12:00 AM EST	Mar. 28, 11:59 PM EST (Early Deadline) Apr. 11, 11:59 PM EST (Final Deadline)	Kaggle, Writeup (*.pdf)
∏ Quiz 8	Mar. 27, 12:00 AM EST	Mar. 28, 11:59 PM EST	Canvas
∏ Project Midterm Report	Mar. 11, 12:00 AM EST	Apr. 10, 11:59 PM EST	Canvas
∏ Project Video Upload and Preliminary Report	May 1, 12:00 AM EST	May 6, 11:59 PM EST	Upload video to YouTube, Preliminary report to Canvas
∏ Project Defense	May 7, 12:00 AM EST	May 10, 11:59 PM EST	Piazza
∏ Project Final Reports	May 9, 12:00 AM EST	May 12, 11:59 PM EST	Canvas
∑ HW4P1	Apr. 12, 12:00 AM EST	May 2, 11:59 PM EST	Autolab, Writeup (*.pdf)
∑ HW4P2	Apr 12, 12:00 AM EST	Apr. 18, 11:59 PM EST (Early Deadline) May 2, 11:59 PM EST (Final Deadline)	Kaggle, Writeup (*.pdf)