00:43:19 Anon. Nonlinear Transform: Direction of maximum decrease
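[Editor's note: the answer above refers to the negative gradient, which points in the direction of maximum local decrease of the loss. A minimal numeric sketch of stepping along it; the loss function, starting point, and step size below are all illustrative, not from the lecture:]

```python
import numpy as np

# Illustrative quadratic loss: L(w) = w0^2 + 2*w1^2, minimized at (0, 0)
def loss(w):
    return w[0]**2 + 2 * w[1]**2

def grad(w):
    # Analytic gradient of the loss above
    return np.array([2 * w[0], 4 * w[1]])

w = np.array([3.0, -2.0])   # arbitrary starting point
lr = 0.1                    # illustrative step size
for _ in range(100):
    w = w - lr * grad(w)    # step along the direction of maximum decrease

print(loss(w))  # close to 0
```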
00:44:35 Reshmi Ghosh (TA): Folks, you can post answers here, or unmute yourself and answer! We highly encourage answering questions :)
00:44:49 Anon. CTC: kth layer
00:44:51 Anon. Kernel: Are cross entropy loss and negative log-likelihood loss the same concept? Looks like they are the same thing according to the H1 writeup.
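[Editor's note: yes, for a one-hot target the cross entropy H(p, q) = −Σ_c p_c log q_c reduces to −log q_true, i.e. the negative log-likelihood of the true class. A quick numeric check; logits and class index are illustrative:]

```python
import numpy as np

logits = np.array([2.0, 0.5, -1.0])
target = 1  # index of the true class (illustrative)

probs = np.exp(logits) / np.exp(logits).sum()  # softmax

# Cross entropy with a one-hot target distribution
one_hot = np.zeros(3)
one_hot[target] = 1.0
cross_entropy = -(one_hot * np.log(probs)).sum()

# Negative log-likelihood of the true class
neg_log_likelihood = -np.log(probs[target])

print(np.isclose(cross_entropy, neg_log_likelihood))  # True
```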
00:45:02 Anon. Deep Dream: from neuron i to neuron j
00:45:06 Anon. Matrix: previous node i to current node j
00:45:07 Anon. Spiking NN: weight going from i to j at kth layer
00:45:09 Anon. BERT: Weight at layer k between the ith and jth node
00:46:30 Anon. Alpha: is it the weight from ith node in kth layer to jth node in (k+1)th layer?
00:46:33 Anon. CTC: Here T refers to the number of training examples?
00:47:18 Anon. Loss Surface: delta y = alpha delta x
00:47:38 Anon. Matrix: Row vector
00:50:02 Anon. Dropout (for NNs): f'(g(x)) · g'(x)
00:53:52 Anon. Neurotransmitter: total derivative
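[Editor's note: the chain rule d/dx f(g(x)) = f'(g(x)) · g'(x) can be sanity-checked numerically against a finite difference; the choice of f and g below is illustrative:]

```python
import math

# Illustrative composite: f(u) = sin(u), g(x) = x^2
f = math.sin
g = lambda x: x**2

def analytic(x):
    # Chain rule: f'(g(x)) * g'(x), with f' = cos and g'(x) = 2x
    return math.cos(g(x)) * 2 * x

def numeric(x, h=1e-6):
    # Central finite difference of the composite f(g(x))
    return (f(g(x + h)) - f(g(x - h))) / (2 * h)

x = 1.3
print(abs(analytic(x) - numeric(x)) < 1e-5)  # True
```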
00:58:42 Anon. CTC: Here we assume that the coefficient for each edge is 1?
00:59:52 Reshmi Ghosh (TA): POLL
00:59:58 Anon. Vector: Sorry, do delta x and dx mean the same thing?
01:00:11 Anxiang Zhang (TA): 10 seconds
01:00:29 Anon. Seq2Seq: can i think of an influence diagram as a weighted diagram where all the weights are 1?
01:03:59 Anon. grad_fn: Why is the y in the top right labeled as being in the first layer?
01:04:39 Reshmi Ghosh (TA): Sorry I did not see the diagram
01:04:42 Reshmi Ghosh (TA): Was it y1?
01:05:12 Reshmi Ghosh (TA): Could you specify the slide number? I can check or ask Bhiksha
01:05:19 Anon. grad_fn: The subscript was 2 and the superscript was 1 and it was in the top right on slide 107
01:05:42 Anon. grad_fn: Thanks!
01:06:16 Reshmi Ghosh (TA): Damn, the slide numbers are different. Getting back to you, @Dennis
01:06:31 Anon. grad_fn: Take your time sorry!
01:10:01 Anon. grad_fn: I can post on piazza after class
01:10:25 Reshmi Ghosh (TA): Haha, that helps. Because I am randomly looking through the slide deck posted on the course webpage. Sorry! Will get back to you
01:11:53 Anon. grad_fn: Definitely sorry for the confusion! Thanks (:
01:24:14 Anon. grad_fn: This is without activation functions correct?
01:24:24 Anon. Curse of Dimensionality: This does not take into account activation functions, correct?
01:25:18 Anon. grad_fn: yes
01:39:32 Anon. Vector: The function would not be increasing
01:41:39 Anon. Boltzmann: 0
01:43:16 Reshmi Ghosh (TA): Poll, folks
01:43:25 Anon. Curse of Dimensionality: yea
01:47:33 Anon. Scheduler: So what is exactly used to update the parameters in the network?
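[Editor's note: the parameters are updated using the gradient of the loss with respect to each weight (computed by backpropagation), scaled by a learning rate: w ← w − η ∂L/∂w. A minimal sketch with a single linear neuron and squared-error loss; data, initialization, and learning rate are all illustrative:]

```python
import numpy as np

x, y = np.array([1.0, 2.0]), 3.0  # one training example (illustrative)
w, b = np.zeros(2), 0.0           # parameters to be updated
lr = 0.05                          # learning rate

for _ in range(200):
    y_hat = w @ x + b               # forward pass
    dL_dyhat = 2 * (y_hat - y)      # dL/dy_hat for L = (y_hat - y)^2
    w -= lr * dL_dyhat * x          # chain rule: dL/dw = dL/dy_hat * x
    b -= lr * dL_dyhat              # dL/db = dL/dy_hat

print(abs(w @ x + b - y) < 1e-6)  # True: the neuron fits the example
```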