18:49:34 Anon. IronMan: Sorry, when is the first deadline of our project?
18:55:28 Anon. Penn: is it possible to make people muted by default upon joining?
19:01:20 Anon. Liberty: is stochastic?
19:01:33 Anon. Flash: I didn't understand why doing it with a batch was not a good option.
19:02:40 Anon. Friendship: we can't hear you, professor
19:02:49 Anon. BlackPanther: I can
19:02:50 Anon. P.J. McArdle: it's ok here
19:02:51 Anon. Bigelow: I can hear you
19:02:52 Anon. Forbes: I can
19:02:52 Anon. Smithfield: yes
19:02:53 Anon. Beacon: I can
19:02:55 Anon. S. Highland: I can
19:02:55 Anon. Friendship: Or is it just me?
19:02:56 Anon. Shady: I can
19:03:21 Anon. BlackPanther: Ohh, batch doesn't mean mini-batch, right?
19:03:25 Anon. BlackPanther: Batch is the whole thing
19:03:28 Anon. BlackPanther: this is just one?
19:03:55 Anon. Liberty: but for a small number of data points?
19:04:24 Anon. Liberty: isn't it better to use a batch?
19:04:53 Anon. IronWoman: Can't we initially do it for a few points and then do a batch for the remaining ones?
19:09:31 Anon. Rocket: do we have any standard or common approach for distributing training data?
19:10:47 Anon. Liberty: is mini-batch better, or stochastic?
19:11:18 Anon. Myrtle: we pass with batches when we set shuffle to true, right? is that the same?
19:11:32 Anon. Flash: We can pass with individual samples as well
19:11:53 Anon. Flash: And that's the same
19:13:14 Anon. S.Craig: same
19:13:25 Anon. Beacon: no
19:13:26 Anon. Batman: no
19:13:28 Anon. S. Highland: No
19:13:31 Anon. IronWoman: no
19:15:08 Anon. Rocket: how can we understand this kind of situation in the data?
19:21:07 Anon. Beacon: infinite
19:21:15 Anon. Atom: infinite
19:23:11 Anon. Beacon: The first is so that we can “get anywhere”, the second is so that we can “stay somewhere”. Is this correct?
19:23:38 Anon. Odin: 1
19:24:09 Anon. Baum: Why must the learning rate be squared?
19:24:29 Anon. Flash: That is the convergence criterion
19:25:18 Anon. Hawkeye: Cauchy sequence
19:25:28 Anon. Beacon: 1/x
19:25:37 Anon. Jarvis: 1/n
19:30:29 Anon. Tech: I thought the global optimum is not guaranteed?
19:34:59 Anon. S.Craig: pull the sum out
19:42:53 Anon. S.Craig: Also, we would want more uniform distributions?
19:43:17 Anon. S.Craig: Not just the number of samples?
19:44:23 Anon. Butler: mini-batch?
19:47:41 Anon. Atom: expected divergence
19:47:49 Anon. Jarvis: expected divergence
19:54:58 Anon. BlackPanther: is the learning rate schedule lenient? As in, as long as it decreases and fits the criteria, it should be good? Or is there some optimal setup that works significantly better but takes a while to find?
19:55:16 Anon. Wilkins: Didn't get the poll this time
20:02:58 Anon. Hobart: First
20:03:03 Anon. Butler: left
20:03:09 Anon. Jarvis: 1st
20:22:57 Anon. Groot: thank you professor!
20:22:58 Anon. P.J. McArdle: thank you professor!
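
Reference note on the batch vs. mini-batch vs. per-sample thread (19:03-19:10): a minimal sketch, not taken from the lecture, using a hypothetical least-squares loss on synthetic data. The only difference between the three regimes is how many samples feed each gradient step; everything else (shuffling, learning rate, epochs) is held fixed.

    # Sketch: batch, mini-batch, and per-sample (stochastic) gradient descent
    # on a hypothetical least-squares problem. Names and values are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 4))                                  # synthetic inputs
    y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=256)

    def grad(w, Xb, yb):
        # Gradient of the mean squared error on the chunk (Xb, yb).
        return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

    def train(batch_size, epochs=20, lr=0.05):
        w = np.zeros(4)
        n = len(y)
        for _ in range(epochs):
            order = rng.permutation(n)                             # reshuffle each epoch
            for start in range(0, n, batch_size):
                idx = order[start:start + batch_size]
                w -= lr * grad(w, X[idx], y[idx])                  # one update per chunk
        return w

    w_batch = train(batch_size=len(y))   # "batch": the whole training set per update
    w_mini  = train(batch_size=32)       # mini-batch
    w_sgd   = train(batch_size=1)        # stochastic: one sample per update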
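
Reference note on the 19:11 shuffle question: in PyTorch, shuffle=True only controls whether the data order is reshuffled each epoch; the size of each update is set separately by batch_size. A batch_size of 1 gives per-sample (stochastic) updates, larger values give mini-batches. A minimal sketch with hypothetical data:

    import torch
    from torch.utils.data import TensorDataset, DataLoader

    dataset = TensorDataset(torch.randn(256, 4), torch.randn(256))  # hypothetical data
    per_sample = DataLoader(dataset, batch_size=1, shuffle=True)    # stochastic updates
    mini_batch = DataLoader(dataset, batch_size=32, shuffle=True)   # mini-batch updates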
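
Reference note on the "infinite", "1/x", "1/n" answers around 19:21-19:25: these refer to the standard step-size conditions for SGD convergence (the Robbins-Monro conditions). Writing the learning rate at step $k$ as $\eta_k$:

$$\sum_{k=1}^{\infty} \eta_k = \infty, \qquad \sum_{k=1}^{\infty} \eta_k^2 < \infty,$$

which a schedule such as $\eta_k = 1/k$ satisfies. The first condition lets the iterates travel arbitrarily far ("get anywhere"); the second makes the accumulated gradient noise summable so the iterates can settle ("stay somewhere"), matching the 19:23 comment.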