ML Notes on Learning and Working with DNNs

Deep learning: a brief walk through the foundations underlying neural networks and deep learning.

Neural Networks

Representation learning is a set of methods that allows an algorithm to automatically find the representations needed for detection or classification.

Historical Notes

  • Convolutional Networks These network architectures descend from the neocognitron (Fukushima, 1980), a model for processing images inspired by the structure of the mammalian visual system; see LeCun et al. (1998).

  • Rectified linear unit

  • Models of symbolic reasoning

  • Connectionism and concepts of distributed representation

  • Back-propagation is the algorithm that dominates the way we train deep models

  • Sequence modeling tasks An example is natural language processing (as used in Google technology)

  • Kernel Machines

  • Graphical models

  • Deep belief networks (Hinton, 2006)

Machine Learning

In machine learning we start out with a concrete problem or challenge. This problem is usually coupled to a dataset. You have arrived at a new level of computing when you are tackling problems at scale, in the realm of big data. In other words, you now have to be in tune with your computing resources (e.g., one needs to understand what CPUs and GPUs are on the target computing system).
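As a minimal sketch of "being in tune with your resources," the following Python snippet inspects the local machine using only the standard library; the nvidia-smi call is an assumption (it only works where the NVIDIA driver tools are installed).

    import multiprocessing
    import platform
    import subprocess

    # CPU information from the standard library.
    print("Platform:", platform.platform())
    print("Logical CPU cores:", multiprocessing.cpu_count())

    # GPU information: assumes nvidia-smi is on the PATH; otherwise we
    # simply report that no NVIDIA GPU was detected.
    try:
        gpus = subprocess.check_output(["nvidia-smi", "-L"]).decode().strip()
        print("GPUs:\n" + gpus)
    except (OSError, subprocess.CalledProcessError):
        print("No NVIDIA GPU detected (or nvidia-smi not installed).")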

Types of learning

  • Representation learning
  • Deep learning
  • Supervised, unsupervised, and semi-supervised learning

Before we look at the various methods available in machine learning for solving problems, let’s start by considering data and the questions around that data motivating our interest.

The Data

  • Training data
  • Unseen data (or test data)

Supervised learning

  • Support Vector Machine This algorithm can handle an infinite number of features (or attributes). SVM is a classification algorithm: the type of question you can ask is whether something belongs to a particular class. The objective of this algorithm is to find the optimal separating hyperplane. Think in terms of the largest margin we can find on each side of the line for the given training data; see the sketch after this list.

    Is the data linearly separable? If not, do we need a nonlinear separation?

  • Links: Kathy at Columbia, analyticsvidhya

  • Unsupervised learning
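Here is the SVM sketch referred to above: a minimal example assuming scikit-learn is available, with toy 2-D points chosen purely for illustration.

    import numpy as np
    from sklearn import svm

    # Toy, linearly separable training data: two clusters in 2-D.
    X = np.array([[0.0, 0.0], [1.0, 1.0], [0.2, 0.1],
                  [3.0, 3.0], [4.0, 3.5], [3.5, 4.0]])
    y = np.array([0, 0, 0, 1, 1, 1])

    # A linear kernel searches for the separating hyperplane with the largest margin.
    clf = svm.SVC(kernel="linear", C=1.0)
    clf.fit(X, y)

    print("Support vectors:", clf.support_vectors_)
    print("Prediction for [2, 2]:", clf.predict([[2.0, 2.0]]))
    # If the data were not linearly separable, a nonlinear kernel
    # (e.g., kernel="rbf") would be the usual next step.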

Minibatch stochastic gradient descent algorithm

Objective function

  • Sum of differentiable functions (finding its minima or maxima by iteration).
  • A differentiable function (from Wikipedia): in calculus, a differentiable function of one real variable is a function whose derivative exists at each point in its domain.

    Gradient Descent

Stochastic gradient descent
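To make the pieces above concrete, here is a minimal NumPy sketch of minibatch stochastic gradient descent on a least-squares objective (a sum of differentiable per-example losses). The synthetic data, learning rate, and batch size are assumptions made only for this example.

    import numpy as np

    # Synthetic data: y = 3x + 2 plus noise.
    rng = np.random.RandomState(0)
    X = rng.rand(1000, 1)
    y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.randn(1000)

    w, b = 0.0, 0.0            # parameters to learn
    lr, batch_size = 0.1, 32   # step size and minibatch size (assumed values)

    for epoch in range(50):
        idx = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            batch = idx[start:start + batch_size]
            xb, yb = X[batch, 0], y[batch]
            err = (w * xb + b) - yb
            # Gradients of the minibatch mean squared error.
            grad_w = 2.0 * np.mean(err * xb)
            grad_b = 2.0 * np.mean(err)
            w -= lr * grad_w
            b -= lr * grad_b

    print("Learned w, b:", w, b)  # should approach 3 and 2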

System Setup

Setting up a dev machine.

  • Note that on a MacBook Pro (Retina, 13-inch, Early 2015), Theano cannot use the GPU devices.
  • On a mid-2010 Mac mini with NVIDIA graphics, see the section below.

Mac Mini with NVIDIA graphics

Getting the code: CUDA downloads. Specifically, get this image: CUDA Toolkit dmg.

Working Locally

 # To activate this environment, use:
 # > source activate ipykernel_py3
 #
 # To deactivate this environment, use:
 # > source deactivate ipykernel_py3
 #

Working locally on Mac OS

To set it up, here is what I had to do:

saguinag@sailntrpy:~$ pip install --upgrade pip
Collecting pip
  Downloading pip-9.0.1-py2.py3-none-any.whl (1.3MB)
    100% |████████████████████████████████| 1.3MB 971kB/s
Installing collected packages: pip
  Found existing installation: pip 9.0.0
    Uninstalling pip-9.0.0:
      Successfully uninstalled pip-9.0.0
Successfully installed pip-9.0.1
saguinag@sailntrpy:~$ sudo -H pip install --upgrade virtualenv
Requirement already up-to-date: virtualenv in /usr/local/lib/python3.5/site-packages
saguinag@sailntrpy:~$ virtualenv --system-site-packages ~/tensorflow


~$ which virtualenv
	/usr/local/bin/virtualenv

virtualenv --system-site-packages ~/tensorflow
	Using base prefix '/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5'
	New python executable in /Users/saguinag/tensorflow/bin/python3.5
	Not overwriting existing python script /Users/saguinag/tensorflow/bin/python (you must use /Users/saguinag/tensorflow/bin/python3.5)
	Installing setuptools, pip, wheel...done.

source ~/tensorflow/bin/activate
(tensorflow) saguinag@sailntrpy:~$

Connect to DSGx

If it applies, first connect via VPN. Then mount DSGx locally (on Mac OS X) using the following command: sshfs -o IdentityFile=~/.ssh/id_rsa username@dsg2.crc.nd.edu:/home/username/ /local/Vol/working_dir/

But this doesn’t work with virtualenv, so log in to DSGx via ssh (!! this might not be entirely true).

To unmount, do: umount -f DIR/

virtualenv -p /usr/bin/python2.7 venv/
source venv/bin/activate

With DSG2 and TensorFlow using virtualenv

Got it to work on DSG2:

  virtualenv --system-site-packages ~/tensorflow
  source ~/tensorflow/bin/activate
  export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0rc2-cp27-none-linux_x86_64.whl
  pip install --upgrade $TF_BINARY_URL
  python
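A quick, hedged sanity check that the install above worked; this uses the pre-1.0 session API that matches the 0.11 wheel.

    import tensorflow as tf

    # Build a trivial graph and run it on the CPU build installed above.
    hello = tf.constant("Hello from TensorFlow")
    a = tf.constant(2)
    b = tf.constant(3)

    sess = tf.Session()
    print(sess.run(hello))
    print("2 + 3 =", sess.run(a + b))
    sess.close()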

TensorFlow

Links to TensorFlow tutorials

ToDo: Need to install it locally for now

The working directory is /Volumes/theory/entropy/DeepLerning. Go into this directory and launch Jupyter using the following command: jupyter notebook &

Basic Test
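A minimal basic test, as a sketch: a small matrix multiplication using the same pre-1.0 session API.

    import tensorflow as tf

    # Two small constant matrices and their product.
    m1 = tf.constant([[3.0, 3.0]])     # shape (1, 2)
    m2 = tf.constant([[2.0], [2.0]])   # shape (2, 1)
    product = tf.matmul(m1, m2)

    with tf.Session() as sess:
        print(sess.run(product))  # expect [[ 12.]]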

Theano

On your favorite search engine, type "download install theano" and you will end up at this site: the docs for Theano 0.8 can be found at Installing Theano Docs. Then, depending on your platform, go to the specific instructions and pip install it.

Following are my development notes.

To test that the system has GPU capability
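A hedged sketch of such a check, adapted from the GPU test in the Theano documentation: it times an elementwise exp and reports which device Theano used. Run it with something like THEANO_FLAGS=device=gpu,floatX=float32 python check_gpu.py, where check_gpu.py is just an example filename.

    import time
    import numpy
    import theano
    import theano.tensor as T

    vlen = 10 * 30 * 768  # roughly 10 x cores x threads per core
    iters = 1000

    rng = numpy.random.RandomState(22)
    x = theano.shared(numpy.asarray(rng.rand(vlen), theano.config.floatX))
    f = theano.function([], T.exp(x))

    t0 = time.time()
    for _ in range(iters):
        f()
    print("Looping %d times took %f seconds" % (iters, time.time() - t0))
    print("Device used:", theano.config.device)  # 'gpu*' means the GPU was used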

My MacBook Pro's Intel Iris Graphics 6100 hardware is apparently not compatible with Theano in GPU mode. Thus, we are going to switch to a system with NVIDIA graphics.

Installation macOS

Examples

Baby Steps
