11-485/11-685 Deep Learning
Spring 2017

“Deep Learning” systems, typified by deep neural networks, are increasingly taking over AI tasks, ranging from language understanding, speech and image recognition, and machine translation to planning, game playing, and autonomous driving. As a result, expertise in deep learning is fast changing from an esoteric specialty to a mandatory prerequisite in many advanced academic settings, and a large advantage in the industrial job market.

In this course we will learn the basics of deep neural networks and their applications to various AI tasks. By the end of the course, students are expected to have significant familiarity with the subject and to be able to apply deep learning to a variety of tasks. They will also be positioned to understand much of the current literature on the topic and to extend their knowledge through further study.

Instructor: Bhiksha Raj

TAs: Mohammed Ahmed Shah (mshah1@andrew.cmu.edu)

Time: Mondays, 1.30pm-2.50pm Doha time (12.30pm-1.50pm Kigali)

Office hours:


11-485/11-685 is open to all but is recommended for CS Seniors and Juniors, Quantitative Masters students, and non-SCS PhD students.

  1. We will be using one of several toolkits, which are largely programmed in Python or Lua. You will need to be able to program in at least one of these languages. Alternatively, you will be responsible for finding and learning a toolkit that requires programming in a language you are comfortable with.
  2. You will need familiarity with basic calculus (differentiation, chain rule) and linear algebra.


This course is a 6-unit application elective.

Course Work


Grading will be based on homework assignments and a final project. There will be a minimum of two and a maximum of three assignments.

Assignments (2 or 3): total contribution to grade 40%
Project: total contribution to grade 40%
Attendance (mandatory): contribution to grade 20%


Deep learning is a relatively new, fast-developing topic, and there is no standard textbook that covers the state of the art, although there are several excellent tutorial books one can refer to. The topics in this course are collected from a variety of sources, including recent papers, so we do not specify a single standard textbook. However, we list a number of useful books at the end of this page, which we strongly encourage students to read, as they provide much of the background for the course. We will also post links to relevant reading material for each class. Students are expected to familiarize themselves with the material before class. The readings will sometimes be arcane and difficult to understand; if so, do not worry: we will present simpler explanations in class.

Discussion board: Piazza

We will use Piazza for discussions. Here is the link. Please sign up.

Academic Integrity

You are expected to comply with the University Policy on Academic Integrity and Plagiarism.

Your course instructor reserves the right to determine an appropriate penalty based on the severity of the academic-integrity violation that occurs. Violations of the university policy can result in severe penalties, including failing this course and possible expulsion from Carnegie Mellon University. If you have any questions about this policy and any work you are doing in the course, please feel free to contact your instructor for help.


Schedule (lecture notes/slides and additional readings, if any, will be linked for each class)

Week 1 (January 16)
  • Introduction to deep learning
  • Course logistics
Week 2 (January 23)
  • Projects
  • The neural net as a universal approximator
Week 3 (January 30)
  • Training a neural network
  • Hebb's rule
  • Perceptron learning rule
  • Optimization by gradient descent
Week 4 (February 6)
  • Backpropagation
  • Choosing a divergence
  • Speed-up methods
    • Acceleration
    • Nesterov's method
    • Adagrad and derivatives
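The optimization topics of Weeks 3 and 4 can be previewed with a minimal sketch: plain gradient descent versus Nesterov's accelerated method, applied to a one-dimensional quadratic. This is illustrative code only, not course material; the function names and hyperparameters below are assumptions chosen for the demo.

```python
# Illustrative sketch (not course-provided code): gradient descent vs.
# Nesterov's accelerated gradient on f(w) = (w - 3)^2, whose minimum is w = 3.

def grad(w):
    """Gradient of f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)

def gradient_descent(w=0.0, lr=0.1, steps=50):
    """Plain gradient descent: step against the gradient at the current point."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def nesterov(w=0.0, lr=0.1, momentum=0.9, steps=50):
    """Nesterov's method: evaluate the gradient at a look-ahead point."""
    v = 0.0  # velocity (accumulated momentum)
    for _ in range(steps):
        # Key difference from classical momentum: the gradient is taken
        # at the anticipated position w + momentum * v, not at w itself.
        v = momentum * v - lr * grad(w + momentum * v)
        w += v
    return w

print(gradient_descent())  # both runs should approach the minimum at w = 3
print(nesterov())
```

On this toy problem both methods converge to the minimizer; Nesterov's look-ahead gradient is what distinguishes it from classical (heavy-ball) momentum, and it typically damps the oscillations that plain momentum can produce.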

Documentation and Tools


Neural Networks and Deep Learning, by Michael Nielsen. Online book, 2016.
Deep Learning with Python, by J. Brownlee.