Neural Network Architectures

This project aims to examine the complexity of neural network architectures and to leverage their structure to improve the training phase.

Notebook

My notes on many things ML and DL

Conferences

  • International Conference on Learning Representations 2017 (April 24 – 26, 2017)
  • AAAI 2018 (February 2 – 7, 2018, New Orleans, Louisiana, USA)

Argonne ML

  • NN Zoo
  • Neural Architecture Search with Reinforcement Learning (Barret Zoph and Quoc V. Le)

Network Architectures

  • [ ] Neural Architecture Search with Reinforcement Learning (https://openreview.net/pdf?id=r1Ue8Hcxg)
  • [ ] Deep Convolutional Neural Networks for LVCSR

  • [ ] Designing Neural Network Architectures using Reinforcement Learning
  • [ ] Making Neural Programming Architectures Generalize via Recursion
  • [ ] DSD: Dense-Sparse-Dense Training for Deep Neural Networks
  • [ ] Introspection: Accelerating Neural Network Training By Learning Weight Evolution

  • Incremental Growth of Semantic Branches on CNNs via Multi-Shot Learning (Quanshi Zhang, Ruiming Cao, Ying Nian Wu and Song-Chun Zhu)

  • Unsupervised Large Graph Embedding (Feiping Nie, Wei Zhu and Xuelong Li)

  • Regularization for Unsupervised Deep Neural Nets (Baiyang Wang and Diego Klabjan)

  • Efficient Hyperparameter Optimization of Deep Learning Algorithms Using Deterministic RBF Surrogates (Ilija Ilievski, Jiashi Feng, Taimoor Akhtar and Christine Shoemaker)

  • Tunable Sensitivity to Large Errors in Neural Network Training (Gil Keren, Sivan Sabato and Björn Schuller)

  • Understanding the Semantic Structures of Tables with a Hybrid Deep Neural Network Architecture (Kyosuke Nishida, Kugatsu Sadamitsu, Ryuichiro Higashinaka and Yoshihiro Matsuo)

  • Learning to learn by gradient descent by gradient descent (Marcin Andrychowicz et al.)

Feature engineering - the data you have may already contain all the information the model needs, but not in a form the model can leverage; a minimal sketch follows.
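
As an illustration of that point (the encoding and names here are my own, not from any particular project): a raw hour-of-day carries the information, but a cyclic encoding exposes it in a form a model can actually use.

```python
import math
from datetime import datetime

def hour_features(ts):
    """Map hour-of-day onto the unit circle so that 23:00 and 00:00 end up
    close together; the raw integer hour (23 vs. 0) hides this proximity."""
    angle = 2 * math.pi * ts.hour / 24
    return math.sin(angle), math.cos(angle)

print(hour_features(datetime(2017, 1, 1, 23)))  # ~(-0.259, 0.966)
print(hour_features(datetime(2017, 1, 2, 0)))   # (0.0, 1.0)
```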

Hyperparameters

Basic Definitions

  • Affine: a linear transformation plus a translation (bias), e.g. y = Wx + b
  • [Neural Networks and Deep Learning, Chapter 6](http://neuralnetworksanddeeplearning.com/chap6.html)

  • DL Bells and Whistles: Neural Network Hyper-Parameters

  • It has been shown that the use of computer clusters for hyper-parameter selection can have an important effect on results (Pinto et al., 2009).

  • We define a hyper-parameter for a learning algorithm A as a value to be selected prior to the actual application of A to the data, a value that is not directly selected by the learning algorithm itself (see the grid-search sketch after this list).

  • Matrices (standard definitions are summarized in the formulas after this list):

    • Hessian matrix: a square matrix of second-order partial derivatives of a scalar-valued function or scalar field.

    • Gauss-Newton matrix

    • Fisher information matrix
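
For reference, the standard textbook definitions of the three matrices above, in my own notation (for a scalar loss f(θ), residuals r(θ) with Jacobian J, and model density p_θ), not taken from any one of the papers cited here:

```latex
\[
\begin{aligned}
H_{ij} &= \frac{\partial^2 f}{\partial \theta_i \, \partial \theta_j}
  && \text{Hessian of a scalar loss } f(\theta) \\
G &= J^\top J, \quad J = \frac{\partial r}{\partial \theta}
  && \text{Gauss-Newton matrix for } f = \tfrac{1}{2}\lVert r(\theta)\rVert^2 \\
F &= \mathbb{E}_{x \sim p_\theta}\!\left[ \nabla_\theta \log p_\theta(x) \,
      \nabla_\theta \log p_\theta(x)^\top \right]
  && \text{Fisher information matrix}
\end{aligned}
\]
```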
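
And a minimal grid-search sketch of the hyper-parameter definition above: the grid values are fixed before the learning algorithm runs, and train_and_evaluate is a hypothetical stand-in, not a real API.

```python
from itertools import product

# Hyper-parameters: chosen *before* the learning algorithm runs;
# the algorithm itself only selects the model weights.
grid = {
    "learning_rate": [1e-1, 1e-2, 1e-3],
    "hidden_units": [64, 128, 256],
}

def train_and_evaluate(learning_rate, hidden_units):
    # Hypothetical stand-in: a real version would train a network with
    # these settings and return its validation score.
    return 1.0 / (1.0 + abs(learning_rate - 1e-2)) + hidden_units / 1e4

best = max(
    ({"learning_rate": lr, "hidden_units": hu}
     for lr, hu in product(grid["learning_rate"], grid["hidden_units"])),
    key=lambda cfg: train_and_evaluate(**cfg),
)
print(best)  # {'learning_rate': 0.01, 'hidden_units': 256}
```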

Training Time

What counts as a long time? For standard deep learning tasks on standard datasets such as MNIST, a single training run is typically a matter of minutes on a modern GPU, while larger models on larger datasets can take days to weeks.
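
One way to make that concrete is simply to measure it; a minimal sketch, where train_step and batches are hypothetical placeholders for your own training loop:

```python
import time

def time_one_epoch(train_step, batches):
    """Wall-clock one pass over the data; multiply by the planned number
    of epochs for a rough estimate of total training time."""
    start = time.perf_counter()
    for batch in batches:
        train_step(batch)
    return time.perf_counter() - start

# Example with dummy stand-ins (a real train_step would run an SGD update):
print(time_one_epoch(lambda batch: sum(batch), [[1, 2, 3]] * 1000))
```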

Different NN Architectures

Frameworks

TensorFlow

To avoid the warning, let's install TensorFlow from source.

  • Dependencies
    • Install Bazel: brew install bazel (on macOS).
    • Once installed, you can upgrade to a newer version of Bazel with: sudo apt-get upgrade bazel (on Debian/Ubuntu).
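
From the TensorFlow 1.x build documentation as I remember it (worth double-checking against the current docs), the build itself then looks roughly like this; --config=opt is what enables the CPU instruction sets (e.g. SSE4.2) that the warning below complains about:

  • ./configure (run from the TensorFlow source checkout)
  • bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
  • bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
  • pip install the wheel that lands in /tmp/tensorflow_pkg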

  • Warnings
    • W tensorflow/core/platform/cpu_feature_guard.cc:45 The TensorFlow library wasn’t compiled to use SSE4.2 instructions
  • Invalid path to CUDA 8.0 toolkit. /usr/local/cuda/lib/libcudart.8.0.dylib cannot be found