Deep Learning 7 - Reducing the value of a loss function with a gradient

In order to minimize the value of a loss function, you need to calculate the gradient with respect to the weights. Suppose the weight is a $2 \times 3$ matrix:

$$W = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{pmatrix}$$

Then the gradient is represented by:

$$\frac{\partial L}{\partial W} = \begin{pmatrix} \dfrac{\partial L}{\partial w_{11}} & \dfrac{\partial L}{\partial w_{12}} & \dfrac{\partial L}{\partial w_{13}} \\ \dfrac{\partial L}{\partial w_{21}} & \dfrac{\partial L}{\partial w_{22}} & \dfrac{\partial L}{\partial w_{23}} \end{pmatrix}$$

In numerical differentiation, the first element is calculated as follows, where $h$ is a small value such as $10^{-4}$:

$$\frac{\partial L}{\partial w_{11}} \approx \frac{L(w_{11} + h) - L(w_{11} - h)}{2h}$$

The source code is not so difficult.

07_gradient.py
import numpy as np


def numerical_gradient(f, x):
    h = 1e-4
    grad = np.zeros_like(x)

    # Iterate over every element of x, whatever its shape.
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp = x[idx]

        # f(x + h)
        x[idx] = float(tmp) + h
        fxh1 = f(x)

        # f(x - h)
        x[idx] = tmp - h
        fxh2 = f(x)

        # Central difference for this element.
        grad[idx] = (fxh1 - fxh2) / (2*h)

        x[idx] = tmp  # restore the original value
        it.iternext()

    return grad
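
As a quick sanity check, you can apply numerical_gradient to a function whose gradient is known analytically. For $f(x) = x_0^2 + x_1^2$ the gradient is $(2x_0, 2x_1)$, so at $(3.0, 4.0)$ it should return values close to $(6.0, 8.0)$:

def square_sum(x):
    return x[0]**2 + x[1]**2

print(numerical_gradient(square_sum, np.array([3.0, 4.0])))   # approximately [6. 8.]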

Finally, let’s calculate the gradient of a simple neural network by using the softmax function from Deep Learning 2 and cross_entropy_error from Deep Learning 6.
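
If the functions module from those earlier posts is not at hand, a minimal sketch of the two helpers (functions.py), assuming the standard single-sample definitions used in this series, looks like this:

import numpy as np

def softmax(a):
    c = np.max(a)               # subtract the max for numerical stability
    exp_a = np.exp(a - c)
    return exp_a / np.sum(exp_a)

def cross_entropy_error(y, t):
    delta = 1e-7                # a tiny delta avoids log(0)
    return -np.sum(t * np.log(y + delta))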

07_gradient.py
import numpy as np
from functions import softmax, cross_entropy_error
from gradient import numerical_gradient


class simpleNet:
    def __init__(self):
        self.W = np.random.randn(2,3)   # 2x3 weight matrix initialized with a standard normal distribution

    def predict(self, x):
        return np.dot(x, self.W)

    def loss(self, x, t):
        z = self.predict(x)
        y = softmax(z)
        loss = cross_entropy_error(y, t)

        return loss

x = np.array([0.6, 0.9])
t = np.array([0, 0, 1])

net = simpleNet()

# The dummy argument w is ignored: numerical_gradient modifies net.W in place,
# so net.loss(x, t) is always evaluated with the current weights.
f = lambda w: net.loss(x, t)
dW = numerical_gradient(f, net.W)

print(dW)

This is the result of the calculation:

[[ 0.02738297  0.22400595 -0.25138893]
 [ 0.04107446  0.33600893 -0.37708339]]

You can see that, for example, $\partial L / \partial w_{11}$ is positive while $\partial L / \partial w_{13}$ is negative, which means $w_{11}$ and $w_{13}$ should be updated in a negative and a positive direction respectively from the perspective of the loss function.
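
As a rough illustration of that point (assuming a learning rate of 0.1, which is not part of the script above), updating the weights against the gradient should reduce the loss:

lr = 0.1                      # assumed learning rate for this illustration
print(net.loss(x, t))         # loss before the update
net.W -= lr * dW              # move each weight against its gradient
print(net.loss(x, t))         # loss after the update; it should be smaller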

The sample code is here.
