Reading lists for new MILA students



 

Research in General

 

How to write a great research paper

Basic concepts of information theory in visual terms: http://colah.github.io/posts/2015-09-Visual-Information/

Blog post from Christopher Olah on visualizing the representations of neural networks

 

 

Basics of machine learning

 

●  DL book chapter on linear algebra: http://www.deeplearningbook.org/contents/linear_algebra.html (also at http://www.iro.umontreal.ca/~bengioy/DLbook/linear_algebra.html)

●  DL book chapter on probability: http://www.iro.umontreal.ca/~bengioy/dlbook/prob.html

●  DL book chapter on numerical computation: http://www.iro.umontreal.ca/~bengioy/dlbook/numerical.html

●  DL book chapter on machine learning: http://www.iro.umontreal.ca/~bengioy/DLbook/ml.html

 

Basics of deep learning

●  Intro to deep learning: http://www.iro.umontreal.ca/~bengioy/DLbook/intro.html

●  Feedforward multi-layer nets: http://www.iro.umontreal.ca/~bengioy/DLbook/mlp.html


●  Learning deep architectures for AI

●  Practical recommendations for gradient-based training of deep architectures

●  Quick’n’dirty introduction to deep learning: Advances in Deep Learning

●  A fast learning algorithm for deep belief nets

●  Greedy Layer-Wise Training of Deep Networks

●  Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion

●  Contractive auto-encoders: Explicit invariance during feature extraction

●  Why does unsupervised pre-training help deep learning?

●  An Analysis of Single Layer Networks in Unsupervised Feature Learning

●  The importance of Encoding Versus Training With Sparse Coding and Vector Quantization

●  Representation Learning: A Review and New Perspectives

●  Deep Learning of Representations: Looking Forward

●  Measuring Invariances in Deep Networks

●  Neural networks course at USherbrooke [youtube]

Feedforward nets

●  http://www.iro.umontreal.ca/~bengioy/DLbook/mlp.html

●  “Improving Neural Nets with Dropout” by Nitish Srivastava (a minimal dropout sketch follows this list)

●  Batch Normalization

●  “Fast Dropout”

●  “Deep Sparse Rectifier Neural Networks”

●  “What is the best multi-stage architecture for object recognition?”

●  “Maxout Networks”
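
The two ideas that recur throughout this list are simple to state in code. Below is a minimal NumPy sketch (mine, not any paper's reference implementation) of a ReLU feedforward layer with inverted dropout; the layer sizes and names are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)

    def relu(x):
        return np.maximum(0.0, x)

    def dense_dropout_forward(x, W, b, drop_prob=0.5, train=True):
        # One feedforward layer with ReLU and inverted dropout: at train time,
        # units are zeroed with probability drop_prob and the survivors are
        # rescaled by 1/(1 - drop_prob), so no rescaling is needed at test time.
        h = relu(x @ W + b)
        if train and drop_prob > 0.0:
            mask = rng.random(h.shape) >= drop_prob
            h = h * mask / (1.0 - drop_prob)
        return h

    # toy usage: a batch of 4 inputs through a 10 -> 8 layer
    x = rng.standard_normal((4, 10))
    W = rng.standard_normal((10, 8)) * 0.1
    b = np.zeros(8)
    print(dense_dropout_forward(x, W, b).shape)  # (4, 8)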

MCMC

●  Iain Murray’s MLSS slides

●  Radford Neal’s Review Paper (old but still very comprehensive)

●  Better Mixing via Deep Representations

●  Bayesian Learning via Stochastic Gradient Langevin Dynamics (a toy SGLD sampler follows this list)
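
As a concrete anchor for the Welling & Teh paper above, here is a toy stochastic gradient Langevin dynamics sampler for the mean of a Gaussian; the model, step size, and batch size are illustrative assumptions, not taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # toy data: x_i ~ N(theta_true, 1); we sample from p(theta | data) with SGLD
    theta_true, N = 2.0, 1000
    data = rng.normal(theta_true, 1.0, size=N)

    def grad_log_prior(theta):       # prior: theta ~ N(0, 10^2)
        return -theta / 100.0

    def grad_log_lik(theta, batch):  # likelihood: x ~ N(theta, 1)
        return np.sum(batch - theta)

    theta, eps, m = 0.0, 1e-4, 50
    samples = []
    for t in range(5000):
        batch = rng.choice(data, size=m, replace=False)
        # gradient step on the (rescaled) minibatch posterior plus Gaussian noise
        g = grad_log_prior(theta) + (N / m) * grad_log_lik(theta, batch)
        theta += 0.5 * eps * g + rng.normal(0.0, np.sqrt(eps))
        samples.append(theta)

    print(np.mean(samples[1000:]))   # close to the posterior mean (~theta_true)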

Restricted Boltzmann Machines

●  Unsupervised learning of distributions of binary vectors using 2-layer networks

●  A practical guide to training restricted Boltzmann machines

●  Training restricted Boltzmann machines using approximations to the likelihood gradient

●  Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines

●  How to Center Binary Restricted Boltzmann Machines

●  Enhanced Gradient for Training Restricted Boltzmann Machines

●  Using fast weights to improve persistent contrastive divergence

●  Training Products of Experts by Minimizing Contrastive Divergence (a CD-1 sketch follows this list)
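
Since contrastive divergence comes up in nearly every paper above, a didactic CD-1 update for a binary RBM may help ground the reading; this is a sketch under my own variable naming, not code from any of the papers.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_update(v0, W, b, c, lr=0.05):
        # One CD-1 step for a binary RBM.
        # v0: (n, n_vis) batch; W: (n_vis, n_hid); b: visible bias; c: hidden bias.
        ph0 = sigmoid(v0 @ W + c)                        # positive phase p(h|v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = sigmoid(h0 @ W.T + b)                      # one Gibbs step down
        v1 = (rng.random(pv1.shape) < pv1).astype(float)
        ph1 = sigmoid(v1 @ W + c)                        # negative phase p(h|v1)
        n = v0.shape[0]
        # approximate gradient ascent on the log-likelihood
        W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        b += lr * (v0 - v1).mean(axis=0)
        c += lr * (ph0 - ph1).mean(axis=0)
        return ((v0 - pv1) ** 2).mean()                  # reconstruction error monitor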

   

Boltzmann Machines

●  Deep Boltzmann Machines (Salakhutdinov & Hinton)

●  Multimodal Learning with Deep Boltzmann Machines

●  Multi-Prediction Deep Boltzmann Machines

●  A Two-stage Pretraining Algorithm for Deep Boltzmann Machines

 

Regularized Auto-Encoders

●  The Manifold Tangent Classifier

●  DL book chapter on unsupervised learning: http://www.iro.umontreal.ca/~bengioy/dlbook/unsupervised.html

●  DL book chapter on manifolds: http://www.iro.umontreal.ca/~bengioy/dlbook/manifolds.html

●  Representation Learning: A Review and New Perspectives, in particular section 7.

 

Regularization

 

Stochastic Nets & GSNs

●  Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

●  Learning Stochastic Feedforward Neural Networks

●  Generalized Denoising Auto-Encoders as Generative Models (a denoising auto-encoder sketch follows this list)

●  Deep Generative Stochastic Networks Trainable by Backprop
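
The denoising auto-encoder at the heart of the GSN papers above fits in a few lines. The following is a minimal sketch (tied weights, masking noise, cross-entropy loss) under my own naming, not the papers' code.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def dae_step(x, W, b, c, lr=0.1, corrupt=0.3):
        # One SGD step for a tied-weight denoising auto-encoder on binary inputs:
        # corrupt, encode, decode, then update so the reconstruction of the
        # CLEAN x improves.
        x_tilde = x * (rng.random(x.shape) >= corrupt)  # masking corruption
        h = sigmoid(x_tilde @ W + c)                    # encoder
        x_hat = sigmoid(h @ W.T + b)                    # tied-weight decoder
        d_xhat = x_hat - x            # grad of cross-entropy wrt decoder pre-activation
        d_h = (d_xhat @ W) * h * (1 - h)
        n = x.shape[0]
        W -= lr * (x_tilde.T @ d_h + d_xhat.T @ h) / n  # encoder + decoder terms
        b -= lr * d_xhat.mean(axis=0)
        c -= lr * d_h.mean(axis=0)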

 

Others

●  Slow, Decorrelated Features for Pretraining Complex Cell-like Networks

●  What Regularized Auto-Encoders Learn from the Data Generating Distribution

●  Generalized Denoising Auto-Encoders as Generative Models

●  Why the logistic function?

 

Recurrent Nets

●  DL book chapter on recurrent nets

●  Learning long-term dependencies with gradient descent is difficult (an RNN forward-pass sketch follows this list)

●  Advances in Optimizing Recurrent Networks

●  Learning recurrent neural networks with Hessian-free optimization

●  On the importance of initialization and momentum in deep learning

●  Long short-term memory (Hochreiter & Schmidhuber)

●  Generating Sequences With Recurrent Neural Networks

●  Long Short-Term Memory in Echo State Networks: Details of a Simulation Study

●  The "echo state" approach to analysing andtraining recurrent neural networks

●  Backpropagation-Decorrelation: online recurrent learning with O(N) complexity

●  New results on recurrent network training: Unifying the algorithms and accelerating convergence

●  Audio Chord Recognition with Recurrent Neural Networks

●  Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription
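
A bare-bones recurrent forward pass makes the vanishing/exploding-gradient discussion in the papers above concrete. This sketch (a vanilla Elman RNN plus the standard gradient-clipping remedy) is illustrative, not taken from any listed paper.

    import numpy as np

    def rnn_forward(xs, h0, Wxh, Whh, bh):
        # h_t = tanh(x_t Wxh + h_{t-1} Whh + bh); the repeated multiplication
        # by Whh is exactly what makes gradients vanish or explode over long
        # spans (the LSTM is the classic fix).
        hs, h = [], h0
        for x in xs:                 # xs: (T, n_in)
            h = np.tanh(x @ Wxh + h @ Whh + bh)
            hs.append(h)
        return np.stack(hs)          # (T, n_hid)

    def clip_gradient(g, max_norm=5.0):
        # global-norm clipping, the usual remedy for exploding gradients
        norm = np.linalg.norm(g)
        return g * (max_norm / norm) if norm > max_norm else g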

 

Memory networks

●  Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014).

●  Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural Turing Machines." arXiv preprint arXiv:1410.5401 (2014). (An attention-read sketch follows this list.)

●  Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. "Pointer networks." arXiv preprint arXiv:1506.03134 (2015).

●  Kurach, Karol, Marcin Andrychowicz, and Ilya Sutskever. "Neural Random-Access Machines." arXiv preprint arXiv:1511.06392 (2015).

●  Cho, Kyunghyun, Aaron Courville, and Yoshua Bengio. "Describing Multimedia Content using Attention-based Encoder-Decoder Networks." arXiv preprint arXiv:1507.01053 (2015).

●  Salakhutdinov, Ruslan, and Geoffrey Hinton. "Semantic hashing." International Journal of Approximate Reasoning 50.7 (2009): 969-978.

●  Hinton, Geoffrey E. "Distributed representations." (1984).
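
The differentiable, content-based read shared by most models above can be sketched in a few lines; the names and the sharpening parameter are illustrative, not the exact Neural Turing Machine equations.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def attention_read(memory, key, beta=1.0):
        # memory: (n_slots, d); key: (d,). Weight each slot by (sharpened)
        # cosine similarity to the key, then return the weighted average.
        # Everything is differentiable, so a controller can train by backprop.
        sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
        w = softmax(beta * sims)     # beta sharpens or flattens the focus
        return w @ memory, w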

Convolutional Nets

●  DL book chapter on convolutional nets: http://www.iro.umontreal.ca/~bengioy/DLbook/convnets.html

●  Generalization and Network Design Strategies (LeCun)

●  ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, NIPS 2012 (a naive convolution sketch follows this list).

●  On Random Weights and Unsupervised Feature Learning
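
For intuition about what the convolutional layers in these papers compute, here is a deliberately naive "valid" 2-D convolution (strictly, cross-correlation, as in most DL frameworks); real implementations are heavily optimized, so treat this as a teaching sketch only.

    import numpy as np

    def conv2d_valid(img, kernel):
        # slide one small kernel over every location: this parameter sharing
        # is what gives convnets their translation equivariance
        H, W = img.shape
        kh, kw = kernel.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
        return out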

Optimization issues with DL

●  Curriculum Learning 

●  Evolving Culture vs Local Minima

●  Knowledge Matters: Importance of Prior Information for Optimization

●  Efficient Backprop

●  Practical recommendations for gradient-based training of deep architectures

●  Batch Normalization (a forward-pass sketch follows this list)

●  Natural Gradient Works Efficiently (Amari 1998)

●  Hessian Free

●  Natural Gradient (TONGA)

●  Revisiting Natural Gradient
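
Of the tricks above, batch normalization is the easiest to state exactly. This is the training-time forward transform from Ioffe & Szegedy's paper as a short NumPy sketch (the running statistics used at test time are omitted).

    import numpy as np

    def batchnorm_forward(x, gamma, beta, eps=1e-5):
        # x: (n, d) minibatch; gamma, beta: learned scale/shift, shape (d,)
        mu = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mu) / np.sqrt(var + eps)  # normalize each feature over the batch
        return gamma * x_hat + beta            # scale/shift restores representational power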

 

NLP + DL

●  The first journal paper on neural language models (there was a NIPS’2000 paper before): A Neural Probabilistic Language Model

●  Natural Language Processing (Almost) from Scratch

●  DeViSE: A Deep Visual-Semantic Embedding Model

●  Distributed Representations of Words and Phrases and their Compositionality (a skip-gram sketch follows this list)

●  Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection
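
The word2vec paper above trains embeddings with an update small enough to show inline. Below is a hedged sketch of one skip-gram step with negative sampling; the array names and learning rate are my own choices, not the paper's code.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def skipgram_neg_step(center, context, negatives, W_in, W_out, lr=0.025):
        # center/context: word indices; negatives: sampled "noise" word indices.
        # Pulls the true context vector toward the center embedding and pushes
        # the negative samples away (logistic loss on each pair).
        v = W_in[center].copy()
        grad_v = np.zeros_like(v)
        for idx, label in [(context, 1.0)] + [(k, 0.0) for k in negatives]:
            u = W_out[idx]
            g = sigmoid(v @ u) - label   # derivative of the logistic loss
            grad_v += g * u
            W_out[idx] = u - lr * g * v
        W_in[center] -= lr * grad_v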

 

CV + RBM

●  Fields of Experts

●  What makes a good model of natural images?

●  Phone Recognition with the mean-covariance restricted Boltzmann machine

●  Unsupervised Models of Images by Spike-and-Slab RBMs

 

CV + DL

●  Imagenet classification with deep convolutional neural networks

●  Learning to relate images

 

Scaling Up

●  Large Scale Distributed Deep Networks

●  Random search for hyper-parameter optimization (a sampling sketch follows this list)

●  Practical Bayesian Optimization of Machine Learning Algorithms
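
Bergstra & Bengio's random search is a two-minute implementation, which is part of its appeal. A sketch, with illustrative hyperparameter names and ranges (not from the paper):

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_config():
        # log-uniform sampling for scale-type hyperparameters
        return {
            "learning_rate": 10 ** rng.uniform(-5, -1),
            "n_hidden": int(2 ** rng.uniform(5, 10)),
            "dropout": rng.uniform(0.0, 0.7),
        }

    # evaluate, say, 50 independent draws and keep the best validation score;
    # unlike grid search, every draw explores a new value of every hyperparameter
    configs = [sample_config() for _ in range(50)]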

 

DL + Reinforcement learning

●  Playing Atari with Deep Reinforcement Learning (a tabular Q-learning sketch follows this list)

●  True Online TD(λ)
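
The TD backup underneath both papers is tiny. Here is the tabular version; DQN replaces the table with a convnet and adds experience replay and a target network.

    import numpy as np

    def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
        # Q: (n_states, n_actions) table of action values
        td_target = r + gamma * Q[s_next].max()     # bootstrap from the next state
        Q[s, a] += alpha * (td_target - Q[s, a])    # move toward the TD target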

 

 

Graphical Models Background

●  An Introduction to Graphical Models (Mike Jordan, brief course notes)

●  A View of the EM Algorithm that Justifies Incremental, Sparse and Other Variants (Neal & Hinton, important paper to the modern understanding of Expectation-Maximization; a toy EM step follows this list)

●  A Unifying Review of Linear Gaussian Models (Roweis & Ghahramani, ties together PCA, factor analysis, hidden Markov models, Gaussian mixtures, k-means, linear dynamical systems)

●  An Introduction to Variational Methods for Graphical Models (Jordan et al., mean-field, etc.)
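
To make the Neal & Hinton reading concrete, here is one EM step for a two-component 1-D Gaussian mixture (a didactic sketch; initialization and convergence checks omitted).

    import numpy as np

    def em_step(x, pi, mu, var):
        # E-step: responsibilities r[k, i] = p(component k | x_i) via Bayes' rule
        dens = np.stack([
            pi[k] * np.exp(-0.5 * (x - mu[k]) ** 2 / var[k]) / np.sqrt(2 * np.pi * var[k])
            for k in range(2)
        ])
        r = dens / dens.sum(axis=0)
        # M-step: weighted maximum-likelihood updates
        nk = r.sum(axis=1)
        pi = nk / len(x)
        mu = (r @ x) / nk
        var = np.array([(r[k] @ (x - mu[k]) ** 2) / nk[k] for k in range(2)])
        return pi, mu, var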

 

Writing

●  Writing a great research paper (video of the presentation)

 

Software documentation

●  Python, Theano, Pylearn2, Linux (bash) (at least the first 5 sections), git (first 5 sections), GitHub / contributing to it (Theano doc), vim tutorial or emacs tutorial

 

Software lists of built-in commands/functions

●  Bash commands

●  List of Built-in Python Functions

●  vim commands

 

Other software stuff to know about:

●  screen/tmux

●  ssh

●  ipython & ipython notebook (now Jupyter)

●  matplotlib (a minimal plotting example follows this list)

●  Caffe - caffe.berkeleyvision.org

●  DIGITS - https://developer.nvidia.com/digits
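
As a tiny matplotlib starter (the kind of plot you will make constantly), here is a hypothetical train/validation learning curve saved to disk, written so it also works on a headless cluster node.

    import numpy as np
    import matplotlib
    matplotlib.use("Agg")            # render without a display (e.g., over ssh)
    import matplotlib.pyplot as plt

    epochs = np.arange(1, 51)
    train = 1.0 / epochs                      # made-up losses, for illustration
    valid = 1.0 / epochs + 0.002 * epochs     # starts to overfit

    plt.plot(epochs, train, label="train loss")
    plt.plot(epochs, valid, label="valid loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.savefig("learning_curve.png")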

 

 

 
