Deep Learning 7 - Reduce the value of a loss function with the gradient
In order to minimize the value of a loss function, you need to calculate the gradient of the loss with respect to the weights. If the weight is a 2x3 matrix;

$$W = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{pmatrix}$$

Then, the gradient is represented by;

$$\frac{\partial L}{\partial W} = \begin{pmatrix} \frac{\partial L}{\partial w_{11}} & \frac{\partial L}{\partial w_{12}} & \frac{\partial L}{\partial w_{13}} \\ \frac{\partial L}{\partial w_{21}} & \frac{\partial L}{\partial w_{22}} & \frac{\partial L}{\partial w_{23}} \end{pmatrix}$$

In numerical differentiation, the first element is calculated as follows, where h is a small value such as 1e-4 and all the other weights are held fixed;

$$\frac{\partial L}{\partial w_{11}} \approx \frac{L(w_{11} + h) - L(w_{11} - h)}{2h}$$
The source code is not so difficult.
07_gradient.py
import numpy as np

def numerical_gradient(f, x):
    h = 1e-4  # small step for the central difference
    grad = np.zeros_like(x)
    # iterate over every element of x, whatever its shape
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp = x[idx]
        # f(x + h)
        x[idx] = float(tmp) + h
        fxh1 = f(x)
        # f(x - h)
        x[idx] = tmp - h
        fxh2 = f(x)
        # central difference
        grad[idx] = (fxh1 - fxh2) / (2*h)
        x[idx] = tmp  # restore the original value
        it.iternext()
    return grad
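As a quick sanity check, you can apply numerical_gradient to a simple function whose gradient is known analytically. This snippet is only an illustrative sketch and is not part of the original script;

def square_sum(x):
    # f(x0, x1) = x0**2 + x1**2, whose analytic gradient is (2*x0, 2*x1)
    return np.sum(x ** 2)

x = np.array([3.0, 4.0])
print(numerical_gradient(square_sum, x))  # approximately [6. 8.]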
Finally, let’s calculate the gradient of the loss in a simple neural network, using the softmax from Deep Learning 2 and the cross_entropy_error from Deep Learning 6.
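If you don’t have functions.py from the earlier posts at hand, a minimal sketch of the two helpers could look like this (assuming a single sample and a one-hot label t, as in Deep Learning 6);

import numpy as np

def softmax(a):
    # subtract the max for numerical stability before exponentiating
    c = np.max(a)
    exp_a = np.exp(a - c)
    return exp_a / np.sum(exp_a)

def cross_entropy_error(y, t):
    # small constant to avoid log(0)
    delta = 1e-7
    return -np.sum(t * np.log(y + delta))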
07_gradient.py
import numpy as np
from functions import softmax, cross_entropy_error
from gradient import numerical_gradient

class simpleNet:
    def __init__(self):
        # initialize the weights with a standard normal distribution (2 inputs, 3 outputs)
        self.W = np.random.randn(2, 3)

    def predict(self, x):
        return np.dot(x, self.W)

    def loss(self, x, t):
        z = self.predict(x)
        y = softmax(z)
        loss = cross_entropy_error(y, t)
        return loss

x = np.array([0.6, 0.9])   # input
t = np.array([0, 0, 1])    # one-hot label
net = simpleNet()
# the dummy argument w is required by numerical_gradient; the loss actually depends on net.W
f = lambda w: net.loss(x, t)
dW = numerical_gradient(f, net.W)
print(dW)
This is the result of the calculation;
[[ 0.02738297 0.22400595 -0.25138893]
[ 0.04107446 0.33600893 -0.37708339]]
You can see that some elements of the gradient are positive and others negative, which means the corresponding weights should be updated in a negative and a positive direction respectively, from the perspective of reducing the loss function.
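To actually reduce the loss, the gradient is used to update the weights, for example with a plain gradient descent step. The snippet below is a minimal sketch continuing from the script above; the learning rate and number of iterations are arbitrary choices, not values from the original post;

learning_rate = 0.1  # arbitrary choice for this sketch

for i in range(100):
    dW = numerical_gradient(f, net.W)
    net.W -= learning_rate * dW      # move against the gradient
    if i % 10 == 0:
        print(net.loss(x, t))        # the loss should gradually decrease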
The sample code is here.