Personal/Andrew Ng Notes/VIII Neural Networks - Representation



Non-linear hypotheses

https://class.coursera.org/ml-003/lecture/43

Non-linear classification

  • With two features x1 and x2, you can build non-linear hypotheses from combinations of the features, such as x1, x2, x1*x2, x1^2 * x2, x1 * x2^2, and so on. A hypothesis built this way may work well for the problem.
  • But what if there are many more features?
  • Including all the combinations results in too many features: the model may overfit, and it may be computationally expensive to train.
  • Image detection is an example: there are far too many features (pixel intensities, or RGB values) for logistic regression with polynomial terms to be practical (see the counting sketch after this list).
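
A rough sense of the blow-up: for n raw features there are about n^2/2 distinct quadratic terms, so a 50x50 grayscale image (n = 2500 pixels, an illustrative size) already yields roughly 3 million quadratic features. A minimal counting sketch in Python:

  def num_quadratic_features(n):
      # Distinct terms x_i * x_j with i <= j: n*(n-1)/2 cross terms plus n squares.
      return n * (n + 1) // 2

  n = 50 * 50                          # 50x50 grayscale image => 2500 pixel features
  print(num_quadratic_features(n))     # 3126250 quadratic terms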

Neurons and the Brain

https://class.coursera.org/ml-003/lecture/44

  • Neural networks often work better than other machine learning techniques; they are very effective on many problems.
  • Origin: algorithms that try to mimic the brain.
  • Popular in the 80s and early 90s; popularity diminished in the late 90s, perhaps because they needed too much computational power.
  • Recent resurgence: a state-of-the-art technique for many applications.

One learning algorithm hypothesis

  • The auditory cortex, the part of the brain wired to the ears, knows how to hear.
  • Cut that connection and rewire the auditory cortex to the eyes, and it learns to see.
  • The somatosensory cortex, which is responsible for touch, also learns to see when rewired.
  • These neuro-rewiring experiments suggest brain tissue can learn almost anything; the ability is not specific to one region, so vision can be handled by other parts.
  • Connect any sensor to any part of the brain, and the brain learns how to use it.

Model representation

https://class.coursera.org/ml-003/lecture/45 https://class.coursera.org/ml-003/lecture/46

Neurons in the brain

  • nucleus: inside the cell body
  • dendrites: input wires
  • axon: output wire
  • the cell body (which contains the nucleus) does the computation.
  • neurons communicate via pulses of electricity ("spikes").
  • axon => (pulse) => dendrite of another neuron => ...

Neuron model

  • A neuron modeled as a logistic unit.
  • Input wires: receive a number of inputs.
  • The neuron does the computation.
  • Output wire: emits the computed value h_theta(x).
  • x0: the bias unit, always equal to 1.
  • Sigmoid (logistic) activation function: g(z) = 1 / (1 + e^-z).
  • theta: the weights (parameters) of the model (see the sketch after this list).
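
A minimal sketch of such a logistic unit in Python/numpy (function and variable names are my own, not from the course):

  import numpy as np

  def sigmoid(z):
      """Logistic activation g(z) = 1 / (1 + e^-z)."""
      return 1.0 / (1.0 + np.exp(-z))

  def logistic_unit(theta, x):
      """One neuron: prepend the bias unit x0 = 1, then compute g(theta' * x)."""
      x = np.concatenate(([1.0], x))        # add bias unit x0
      return sigmoid(theta @ x)

  theta = np.array([-30.0, 20.0, 20.0])     # these happen to be the AND weights used later
  print(logistic_unit(theta, np.array([1.0, 1.0])))   # ~0.99995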

Neural network

  • A neural network is a group of neurons connected together.
  • Layer 1: inputs x1, x2, x3 => input layer
  • Layer 2: a1, a2, a3 => hidden layer
  • Layer 3: a single neuron => output layer, which produces the output h_theta(x)
  • a_i(j): activation of unit i in layer j
  • THETA(j): matrix of weights mapping layer j to layer j + 1; if layer j has s_j units and layer j+1 has s_(j+1) units, THETA(j) has dimension s_(j+1) x (s_j + 1)
  • each unit uses the sigmoid activation function (worked equations below)
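
Written out for the 3-layer network above (standard notation from the lecture, in LaTeX; a_0^{(2)} = 1 is the hidden layer's bias unit):

  a_1^{(2)} = g(\Theta_{10}^{(1)} x_0 + \Theta_{11}^{(1)} x_1 + \Theta_{12}^{(1)} x_2 + \Theta_{13}^{(1)} x_3)
  a_2^{(2)} = g(\Theta_{20}^{(1)} x_0 + \Theta_{21}^{(1)} x_1 + \Theta_{22}^{(1)} x_2 + \Theta_{23}^{(1)} x_3)
  a_3^{(2)} = g(\Theta_{30}^{(1)} x_0 + \Theta_{31}^{(1)} x_1 + \Theta_{32}^{(1)} x_2 + \Theta_{33}^{(1)} x_3)
  h_\Theta(x) = a_1^{(3)} = g(\Theta_{10}^{(2)} a_0^{(2)} + \Theta_{11}^{(2)} a_1^{(2)} + \Theta_{12}^{(2)} a_2^{(2)} + \Theta_{13}^{(2)} a_3^{(2)})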

Forward propagation: vectorized implementation
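
A minimal vectorized sketch in Python/numpy, assuming the 3-layer architecture above (function and variable names are my own): z(2) = THETA(1) a(1), a(2) = g(z(2)), z(3) = THETA(2) a(2), h_theta(x) = g(z(3)).

  import numpy as np

  def sigmoid(z):
      return 1.0 / (1.0 + np.exp(-z))

  def forward_propagate(Theta1, Theta2, x):
      """Vectorized forward propagation for a 3-layer network.

      Theta1: (s2, n+1) weights mapping input layer to hidden layer
      Theta2: (1, s2+1) weights mapping hidden layer to output layer
      x:      (n,) input features, without the bias unit
      """
      a1 = np.concatenate(([1.0], x))               # add bias unit x0 = 1
      z2 = Theta1 @ a1
      a2 = np.concatenate(([1.0], sigmoid(z2)))     # add bias unit a0(2) = 1
      z3 = Theta2 @ a2
      return sigmoid(z3)                            # h_theta(x)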

Neural network learning its own features

  • similar to logistic regression
  • the hidden-layer activations a1, a2, a3 act as learned features, playing the role of x1, x2, x3
  • each neuron effectively performs logistic regression on the previous layer's activations
  • this lets the network implement complicated non-linear models

Neural Network Architecture

  • How the neurons are connected.
  • Any layer that is neither the input nor the output layer is called a hidden layer.

Example and Intuitions

https://class.coursera.org/ml-003/lecture/47 https://class.coursera.org/ml-003/lecture/48

Logic gates

  • (x1, x2) are binary (0 or 1)
  • AND logic does not need a hidden layer.
  • A single logistic-regression unit can implement AND and OR logic (see the sketch after this list).
  • sigmoid values: sigmoid(4.6) ≈ 0.99 and sigmoid(-4.6) ≈ 0.01, so the outputs are effectively 0 or 1.
  • AND logic: theta = [ -30; 20; 20 ]
  • OR logic: theta = [ -10; 20; 20 ]
  • NOT logic: theta = [ 10; -20 ]
  • (NOT x1) AND (NOT x2): output is 1 only for (x1, x2) = (0, 0)
    • positive weight for the bias
    • negative weights for both x1 and x2, as in NOT logic
    • abs(each negative weight) should be bigger than the positive bias weight, so any input being 1 drives the output to 0
    • theta = [ 10; -20; -20 ]
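
A quick check of these weights using a single logistic unit, as a minimal sketch in Python/numpy (gate names and code structure are mine):

  import numpy as np
  from itertools import product

  def sigmoid(z):
      return 1.0 / (1.0 + np.exp(-z))

  gates = {
      "AND":                   np.array([-30.0,  20.0,  20.0]),
      "OR":                    np.array([-10.0,  20.0,  20.0]),
      "(NOT x1) AND (NOT x2)": np.array([ 10.0, -20.0, -20.0]),
  }

  # Truth tables over (x1, x2) = (0,0), (0,1), (1,0), (1,1)
  for name, theta in gates.items():
      table = [round(float(sigmoid(theta @ np.array([1.0, x1, x2]))))
               for x1, x2 in product([0.0, 1.0], repeat=2)]
      print(name, table)   # AND: [0, 0, 0, 1], OR: [0, 1, 1, 1], NOR-like: [1, 0, 0, 0]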

Non-linear classification example: XOR, XNOR

  • XNOR: output 1 when x1 and x2 are both 0 or both 1 (truth tables below are over (x1, x2) = (0,0), (0,1), (1,0), (1,1))
  • for (x1 AND x2): theta = [ -30; 20; 20 ]
  • for ((NOT x1) AND (NOT x2)): theta = [ 10; -20; -20 ]
  • for (x1 OR x2): theta = [ -10; 20; 20 ]
  • input layer to a1(2): x1 AND x2 => (0 0 0 1)
  • input layer to a2(2): (NOT x1) AND (NOT x2) => (1 0 0 0)
  • a1(2), a2(2) to a1(3): OR => (1 0 0 1) => XNOR
  • XOR has truth table (0 1 1 0)
  • XOR = (x1 OR x2) AND ((NOT x1) OR (NOT x2)): (0 1 1 1) AND (1 1 1 0) => (0 1 1 0)
  • one hidden layer is enough (see the sketch after this list)
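
Putting the pieces together, a minimal sketch of the XNOR network in Python/numpy, assuming the weight vectors listed above:

  import numpy as np
  from itertools import product

  def sigmoid(z):
      return 1.0 / (1.0 + np.exp(-z))

  # Hidden layer: row 1 computes x1 AND x2, row 2 computes (NOT x1) AND (NOT x2).
  Theta1 = np.array([[-30.0,  20.0,  20.0],
                     [ 10.0, -20.0, -20.0]])
  # Output layer: OR of the two hidden units.
  Theta2 = np.array([[-10.0,  20.0,  20.0]])

  def xnor_net(x1, x2):
      a1 = np.array([1.0, x1, x2])                         # input layer with bias unit
      a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))   # hidden layer with bias unit
      return round(float(sigmoid(Theta2 @ a2)[0]))

  print([xnor_net(x1, x2) for x1, x2 in product([0.0, 1.0], repeat=2)])   # [1, 0, 0, 1]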

Handwritten digit classification (Yann LeCun)

  • Yann LeCun: NYU. A founding father of convolutional neural networks (CNNs).
  • CNNs are feed-forward networks.

Multiclass classification

https://class.coursera.org/ml-003/lecture/49

Multiclass classification

  • e.g. digit recognition
  • multiple output units: one-vs-all
  • to classify an image into one of the classes A, B, C, D => 4 output units, each doing one-vs-all; the target for each example is a 4-dimensional vector such as [1; 0; 0; 0] (see the sketch after this list)
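
A minimal sketch of the one-vs-all output encoding in Python/numpy (the class names A/B/C/D follow the note above; the helper names are mine):

  import numpy as np

  classes = ["A", "B", "C", "D"]

  def one_hot(label):
      """Training target y for a class: e.g. "B" => [0, 1, 0, 0]."""
      y = np.zeros(len(classes))
      y[classes.index(label)] = 1.0
      return y

  def predict(h):
      """Pick the output unit with the largest activation (one-vs-all)."""
      return classes[int(np.argmax(h))]

  print(one_hot("B"))                                  # [0. 1. 0. 0.]
  print(predict(np.array([0.1, 0.7, 0.1, 0.05])))      # B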