22:19:08 Anon. Batman: test
22:23:34 Anon. Fifth: Linear + constant term
22:23:55 Anon. S. Highland: the coefficients add up to 1
22:23:59 Anon. Phillips: Sum to 1
22:24:00 Anon. SpyKid2: Linear combination + constant term
22:25:04 Anon. Atom: Linear
22:25:07 Anon. Fifth: hyperplane
22:25:08 Anon. Murray: line
22:25:12 Anon. IronMan: linear boundary
22:28:03 Jinhyung David Park (TA): 30 seconds left!
22:29:54 Anon. Wasp: But can't you just fold the constant in to make an affine function into a linear combination?
22:30:14 Anon. Bartlett: Would it be correct to say all linear functions are affine functions?
22:30:41 Anon. Fifth: @sdfs yes, I think it is
22:31:07 Anon. Spiderman: many layers
22:31:10 Anon. Ellsworth: 2+ layers
22:31:10 Anon. Fury: More than one layer
22:31:11 Anon. SpyKid2: No. of layers
22:31:11 Anon. Murray: a lot of layers
22:31:12 Anon. Mantis: Many hidden layers
22:31:12 Anon. Jarvis: more number of layers
22:31:12 Anon. Beechwood: Many hidden layers
22:31:13 Anon. Star-Lord: many layers
22:31:14 Anon. N.Craig: More than one hidden layer
22:31:16 Anon. Odin: longest path is long
22:31:17 Anon. Heimdall: more than 3?
22:31:18 Anon. Thor: >=2 hidden
22:31:18 Anon. BlackWidow: multiple layers
22:31:19 Anon. SpyKid2: 2
22:31:20 Anon. Loki: More than 1
22:31:20 Anon. Murray: 152
22:31:21 Anon. Falcon: > 5
22:31:25 Anon. Frew: 2
22:33:15 Anon. Hawkeye: 2
22:33:16 Anon. Strange: 2
22:33:17 Anon. Capt. America: 2
22:33:17 Anon. Flash: 2
22:33:18 Anon. Jarvis: 2
22:33:18 Anon. Loki: 2
22:33:19 Anon. Heimdall: 2
22:33:19 Anon. S. Highland: 2
22:33:19 Anon. Thor: 2
22:33:20 Anon. SpyKid2: 2
22:33:20 Anon. Odin: 2
22:33:20 Anon. Beechwood: 2
22:33:21 Anon. IronWoman: 2
22:33:21 Anon. WonderWoman: 2
22:33:21 Anon. Aquaman: 2
22:33:21 Anon. Myrtle: 2
22:33:22 Anon. Morewood: 2
22:33:24 Anon. Penn: 2
22:33:24 Anon. Forward: 2
22:33:24 Anon. Wilkins: 2
22:33:24 Anon. Baum: 2
22:33:25 Anon. P.J. McArdle: 2
22:33:25 Anon. Butler: 2
22:33:26 Anon. Groot: 2
22:33:26 Anon. Grandview: 2
22:33:27 Anon. Phillips: 2
22:33:27 Anon. Nebula: 2
22:33:27 Anon. N.Craig: 2
22:33:28 Anon. S. Aiken: 2
22:33:29 Anon. Star-Lord: 2
22:33:30 Anon. Darlington: 2
22:33:30 Anon. Friendship: 2
22:33:32 Anon. BlackPanther: 2
22:33:33 Anon. Beacon: 2
22:33:34 Anon. Smithfield: 2
22:33:37 Anon. Fury: 2
22:33:39 Anon. Hawkeye: 4
22:33:40 Anon. Strange: 4
22:33:41 Anon. Capt. America: 4
22:33:42 Anon. Walnut: 4
22:33:42 Anon. Jarvis: 4
22:33:42 Anon. Aquaman: 4
22:33:42 Anon. Tech: 4
22:33:43 Anon. Nebula: 4
22:33:43 Anon. Phillips: 4
22:33:43 Anon. Myrtle: 4
22:33:43 Anon. SpyKid2: 4
22:33:43 Anon. Rocket: 4
22:33:43 Anon. Beechwood: 4
22:33:44 Anon. Forward: 4
22:33:44 Anon. BlackPanther: 4
22:33:45 Anon. Firestorm: 4
22:33:45 Anon. Star-Lord: 4
22:33:45 Anon. N.Craig: 4
22:33:46 Anon. Wanda: 4
22:33:46 Anon. Hobart: 4
22:33:46 Anon. WonderWoman: 4
22:33:46 Anon. Heimdall: dense net?
22:33:47 Anon. Wilkins: 4
22:33:53 Anon. GreenArrow: what if the graph contains a cycle? what can we say about the depth?
22:34:02 Anon. Tech: same depth?
22:34:04 Anon. Gamora: Are there NNs that aren't DAGs?
22:34:42 Anon. Murray: recurrent neural network?
22:35:34 Jinhyung David Park (TA): When we talk about depth, we don't usually consider the time dimension (the same network repeated over time) as part of the depth
22:35:58 Jinhyung David Park (TA): depth is not well defined for networks with cycles
22:36:08 Anon. GreenArrow: ok thanks
22:40:23 Anon. Bartlett: Wasn't the last gate an OR gate since the threshold was 1?
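A minimal sketch of the threshold-gate idea behind this question (the function names and code are illustrative, not from the lecture): a perceptron with all weights equal to 1 and threshold 1 fires if any input is 1, i.e., it computes OR, while a threshold equal to the number of inputs gives AND.

```python
# Illustrative sketch, assuming unit weights on every input (not course-provided code).
def threshold_gate(inputs, weights, threshold):
    """Fire (return 1) iff the weighted sum of inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def OR(*bits):
    # Threshold 1: fires if at least one input is 1.
    return threshold_gate(bits, [1] * len(bits), threshold=1)

def AND(*bits):
    # Threshold = number of inputs: fires only if all inputs are 1.
    return threshold_gate(bits, [1] * len(bits), threshold=len(bits))

assert OR(0, 0) == 0 and OR(0, 1) == 1 and OR(1, 1) == 1
assert AND(0, 1) == 0 and AND(1, 1) == 1
```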
22:40:42 Anon. Capt. America: Yes
22:40:50 Anon. Bellefield: l
22:40:51 Anon. Mantis: L
22:40:51 Anon. Forward: l
22:40:52 Anon. Nebula: L
22:40:52 Anon. Murdoch: L
22:40:53 Anon. Penn: L
22:40:54 Anon. Wanda: L
22:40:55 Anon. P.J. McArdle: L
22:40:56 Anon. SpyKid2: l
22:40:56 Anon. Wilkins: L
22:41:06 Anon. Jarvis: L-1
22:41:07 Anon. Tech: L-1
22:41:16 Anon. Beechwood: What does -1 mean?
22:41:23 Anon. SpyKid2: weight
22:41:28 Anon. Nebula: the weight is -1
22:41:55 Anon. Nebula: all 0
22:41:56 Anon. Jarvis: all are 0
22:42:12 Anon. Fifth: The first L are all 0, and the last ones are all 1
22:42:35 Anon. Strange: -N
22:42:38 Anon. S. Highland: L - N + 1
22:42:40 Anon. Batman: L-N
22:46:10 Anon. Nebula: O(n)?
22:46:12 Anon. Batman: related to the number of squares
22:46:32 Anon. WonderWoman: How is the complexity of a Boolean function defined?
22:46:40 Anon. SpyKid2: me
22:48:09 Anon. Shady: would this make your neural network have exponentially many nodes for some boolean functions?
22:49:12 Jinhyung David Park (TA): Yes
22:50:21 Anon. S. Highland: 2^n
22:50:22 Anon. Murray: 2^n
22:50:22 Anon. Nebula: 2^n
22:50:26 Anon. Beechwood: 2^n+1
22:50:37 Anon. Green Lantern: I am
22:55:17 Anon. Nebula: checkerboard
22:55:19 Anon. Ellsworth: checkers
22:55:21 Anon. SpyKid2: checkered
22:55:24 Anon. Phillips: checkered
22:55:58 Anon. Falcon: 8
22:55:58 Anon. IronMan: 8
22:55:58 Anon. Aquaman: 9
22:55:58 Anon. Capt. America: 8
22:55:59 Anon. Bartlett: 8
22:55:59 Anon. Fifth: 8
22:56:00 Anon. P.J. McArdle: 8
22:56:00 Anon. Phillips: 8
22:56:00 Anon. Nebula: 8
22:56:01 Anon. Flash: 8
22:56:02 Anon. Strange: 9
22:56:02 Anon. Heimdall: 8
22:56:03 Anon. Wightman: 8
22:56:03 Anon. Drax: 8
22:56:07 Anon. SpyKid2: 8
22:56:10 Anon. Firestorm: 8 in the hidden layer
22:56:11 Anon. Forbes: 8
22:56:13 Anon. Bigelow: 8
22:56:19 Anon. Wilkins: 8
22:56:21 Anon. Vision: 8
22:56:24 Anon. Atom: 9
22:56:26 Anon. Atom: 8
22:56:27 Anon. Gamora: 7?
22:56:51 Anon. Superman: 8
22:57:04 Anon. Heimdall: 32
22:57:05 Anon. Fifth: 32
22:57:06 Anon. P.J. McArdle: 32
22:57:07 Anon. SpyKid2: 32
22:57:08 Anon. Bellefonte: 32
22:57:14 Anon. Friendship: 32
22:57:16 Anon. Firestorm: 32
22:57:18 Anon. Thor: 32
22:58:23 Jinhyung David Park (TA): 30 seconds left!
22:59:23 Anon. Green Lantern: XOR
23:06:54 Anon. Strange: What if the # of neurons is greater than what is needed?
23:07:20 Anon. Northumberland: What is the optimal depth? Or can we increase the depth as much as we can and thereby reduce the number of neurons?
23:07:38 Jinhyung David Park (TA): @sdfsd, what do you mean?
23:07:57 Anon. SpyKid2: What are the cons of having maximum depth?
23:08:13 Anon. Spiderman: I think if it's too deep you get some gradient issues, right? During backprop?
23:08:22 Anon. S.Craig: is it always better to choose more depth over more width?
23:08:37 Jinhyung David Park (TA): @dgfg in practice, it's a lot of hyperparameter tuning, still an active area of research. There is some evidence to show that increasing depth and width together is better than just one or the other
23:08:43 Anon. SpyKid2: Fewer neurons = less computation
23:09:02 Jinhyung David Park (TA): @dfgd not always, it's a balancing act between the two
23:09:50 Jinhyung David Park (TA): @sdfsdf vanishing gradients is indeed a problem as you will see later on in the course, but residual connections have been a very good solution to this issue
23:10:03 Jinhyung David Park (TA): Along with batch normalization
23:10:09 Jinhyung David Park (TA): (you will see later)
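A hedged sketch of the residual connection the TA mentions (layer widths, depth, and the form of the residual branch are illustrative choices, not course code): the block computes x + f(x), so the gradient always has an identity path back to the input, which is why very deep stacks of such blocks are less prone to vanishing gradients.

```python
# Illustrative residual-block sketch in PyTorch (assumed framework, not lecture code).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # f(x): a small affine map + nonlinearity; the exact form is an arbitrary choice here.
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.f(x)  # skip connection: identity plus learned residual

# Even with many stacked blocks, gradients still flow through the identity paths.
net = nn.Sequential(*[ResidualBlock(16) for _ in range(50)])
x = torch.randn(4, 16, requires_grad=True)
net(x).sum().backward()
print(x.grad.abs().mean())  # stays well away from zero despite the depth
```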
23:12:15 Anon. Ellsworth: star
23:12:16 Anon. Fifth: The star around the pentagon
23:12:16 Anon. Ivy: The star
23:12:16 Anon. IronMan: star
23:12:17 Anon. Schenley: star
23:13:09 Jinhyung David Park (TA): 30 seconds left!
23:15:59 Anon. Fifth: Do we require all shapes we break it up into to be convex?
23:21:13 Anon. Odin: @dfgd Like the star shape is obtained if we set the threshold to 4, so not necessarily?
23:22:04 Jinhyung David Park (TA): Yes, the shapes need not be convex.
23:22:29 Anon. Strange: Why is the value N/2?
23:23:11 Anon. SpyKid2: What do the axes represent?
23:25:22 Anon. Murray: I think the axes represent input features
23:25:34 Anon. IronMan: For each circle we need infinite neurons, right?
23:25:45 Jinhyung David Park (TA): yes
23:25:47 Anon. Northumberland: How do we establish a network for a single circular boundary?
23:26:51 Anon. IronMan: Put almost infinitely many neurons and take the AND of the outputs
23:27:44 Anon. Shady: you can approximate a circle as a 100-sided polygon, 1000-sided polygon, etc. and do the same thing as the pentagon net
23:28:08 Anon. Odin: or just don't use an affine perceptron?
23:28:24 Anon. IronMan: Activation layer will help
23:28:33 Anon. IronMan: Non-linear activation*
23:28:44 Anon. Northumberland: What type of activation would we need?
23:29:14 Anon. IronMan: Put x1^2 and x2^2 features and you will need only one neuron to produce a circle
23:30:25 Anon. Northumberland: Equation of a circle... I see
23:35:52 Anon. SilverSurfer: So the first layer must capture all the information necessary to solve the problem?
23:36:43 Anon. Grandview: I think every layer matters
23:42:50 Anon. Centre: can't select multiple
23:42:51 Anon. Liberty: "Mark all" but you can only choose one.
23:44:22 Anon. Friendship: Thank you
23:46:07 Anon. Jarvis: it's still linear
23:49:13 Anon. Heimdall: leaky ReLU?
23:50:04 Anon. Falcon: C*(A*x+B)+D is equivalent to (C*A)*x + (C*B+D)
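A small numerical check of that closing point (matrix shapes and names are illustrative): composing two affine maps with no nonlinearity in between collapses into a single affine map, C(Ax + B) + D = (CA)x + (CB + D), which is why stacking linear layers without activations adds no expressive power.

```python
# Illustrative verification that two stacked affine layers equal one affine layer.
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.standard_normal((3, 5)), rng.standard_normal(3)  # layer 1: y = A x + B
C, D = rng.standard_normal((2, 3)), rng.standard_normal(2)  # layer 2: z = C y + D
x = rng.standard_normal(5)

two_layers = C @ (A @ x + B) + D       # layer 1 then layer 2, no activation in between
one_layer = (C @ A) @ x + (C @ B + D)  # the single equivalent affine layer
assert np.allclose(two_layers, one_layer)
```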