Bipolar Sigmoid Activation Function

5/28/2018

Perceptron networks are single-layer. Because of their non-continuous (step) activation function you can't use the back-propagation algorithm on them, so they can't be made multi-layer. The sigmoid function, by contrast, is differentiable, so back-propagation can be applied. In a perceptron you adjust weights with W(new) = W(old) + a(t - y)x, where a is the learning rate, t is the target value, x is the input vector, and y is the output. When you use a sigmoid activation instead, you have to use gradient-based algorithms, which adjust the weights according to the error derivative: W(new) = W(old) - a(dE/dW). In a multi-layer network you can't use the perceptron algorithm, because it needs the correct output of each neuron, and you don't know the correct output of a hidden neuron.
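For concreteness, here is a minimal sketch of the perceptron update rule just described, applied to a single training example; the step-activation threshold, learning rate, and example vectors are illustrative assumptions, not from the original.

    import numpy as np

    # One perceptron update step: W(new) = W(old) + a(t - y)x
    # a: learning rate, t: target (0 or 1), x: input vector, y: step-function output.
    def perceptron_update(w, x, t, a=0.1):
        y = 1.0 if np.dot(w, x) >= 0 else 0.0   # non-continuous step activation
        return w + a * (t - y) * x              # no change when the prediction is already correct

    w = np.array([0.2, -0.5, 0.1])
    x = np.array([1.0, 0.3, -0.7])              # the first component can serve as a bias input
    w = perceptron_update(w, x, t=1.0)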


In "Efficient FPGA Implementation of Sigmoid and Bipolar Sigmoid Activation Functions for Multilayer Perceptrons" (www.iosrjen.org), Fig. 1 represents a basic artificial neuron model. The inputs to the neuron are x0, x1, x2, and w0, w1, w2 are the corresponding weight values; the bipolar sigmoid function is then applied to the weighted sum for activation.
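As a rough sketch of that neuron model, assuming three made-up inputs x0, x1, x2 and weights w0, w1, w2 (the values themselves are arbitrary):

    import numpy as np

    def bipolar_sigmoid(z):
        # Maps the weighted sum z into the range (-1, 1).
        return 2.0 / (1.0 + np.exp(-z)) - 1.0

    x = np.array([0.5, -1.0, 0.25])   # inputs x0, x1, x2
    w = np.array([0.4, 0.1, -0.6])    # weights w0, w1, w2
    z = np.dot(w, x)                  # total input Z = sum_i Wi Xi
    y = bipolar_sigmoid(z)            # neuron output: activation applied to Z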

So in multi-layer networks you have to use a gradient-based algorithm, with back-propagation to carry the error and dE/dW backwards through the layers. In a single-layer neural network you can use either the perceptron rule or a gradient-based algorithm.
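To make the hidden-layer point concrete, here is a minimal sketch of one back-propagation step through a single hidden layer of bipolar sigmoid units, using the squared error E = 0.5(T - Y)^2 from the derivation below; the layer sizes, learning rate, random initialization, and example values are illustrative assumptions.

    import numpy as np

    def bipolar_sigmoid(z):
        return 2.0 / (1.0 + np.exp(-z)) - 1.0

    def bipolar_sigmoid_prime(y):
        # Derivative written in terms of the output y: f'(Z) = 0.5(1 + Y)(1 - Y)
        return 0.5 * (1.0 + y) * (1.0 - y)

    rng = np.random.default_rng(0)
    W1 = rng.normal(scale=0.5, size=(4, 3))   # hidden-layer weights: 4 hidden units, 3 inputs
    W2 = rng.normal(scale=0.5, size=(1, 4))   # output-layer weights
    a = 0.1                                   # learning rate

    x = np.array([0.5, -1.0, 0.25])           # input vector
    t = np.array([1.0])                       # target for the output neuron only

    # Forward pass
    h = bipolar_sigmoid(W1 @ x)               # hidden activations
    y = bipolar_sigmoid(W2 @ h)               # network output

    # Backward pass: propagate dE/dZ; no target is needed for the hidden neurons
    delta_out = (y - t) * bipolar_sigmoid_prime(y)                # dE/dZ at the output
    delta_hidden = (W2.T @ delta_out) * bipolar_sigmoid_prime(h)  # dE/dZ at the hidden layer

    # Gradient-descent updates: W(new) = W(old) - a(dE/dW)
    W2 -= a * np.outer(delta_out, h)
    W1 -= a * np.outer(delta_hidden, x)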

You can't say which one is better: the perceptron rule gives you better grouping, while gradient-based learning gives you more noise resistance. In gradient-based algorithms you use the derivative of the activation function in order to find dE/dW. If Z is the total input of the neuron (Z = sum_i Wi Xi) and Y = f(Z) is its output, then:

dE/dWi = Xi (dE/dZ)
dE/dZ = f'(Z) (dE/dY)

In our case, because we used a sigmoid, f'(Z) is Y(1 - Y) for the binary sigmoid and 0.5(1 - Y)(1 + Y) for the bipolar sigmoid. Normally we use the following error (cost) function: E = 0.5(T - Y)^2. So for the bipolar sigmoid the equations become:

dE/dY = Y - T
dE/dZ = 0.5 (1 + Y)(1 - Y)(Y - T)
dE/dWi = 0.5 Xi (1 + Y)(1 - Y)(Y - T)
W(new) = W(old) - a(dE/dWi) = W(old) + 0.5 a Xi (1 + Y)(1 - Y)(T - Y)

If you use this rule for updating the weights, I think your problems will be solved.
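Putting the derivation together, here is a minimal sketch of one weight update for a single bipolar-sigmoid neuron; the learning rate and example values are illustrative assumptions.

    import numpy as np

    def bipolar_sigmoid(z):
        return 2.0 / (1.0 + np.exp(-z)) - 1.0

    def update_weights(w, x, t, a=0.1):
        y = bipolar_sigmoid(np.dot(w, x))   # Z = sum_i Wi Xi, Y = f(Z)
        # dE/dWi = 0.5 Xi (1 + Y)(1 - Y)(Y - T), so
        # W(new) = W(old) - a(dE/dW) = W(old) + 0.5 a Xi (1 + Y)(1 - Y)(T - Y)
        return w + 0.5 * a * x * (1.0 + y) * (1.0 - y) * (t - y)

    w = np.array([0.2, -0.4, 0.1])
    x = np.array([1.0, 0.5, -0.5])
    w = update_weights(w, x, t=1.0)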

The following is the differentiation of the sigmoid function. In the code below, np.exp(-x) computes e^(-x), where e is the mathematical constant that is the base of the natural logarithm: the unique number whose natural logarithm is equal to one, approximately 2.71828 (Wikipedia).

    import numpy as np

    # This is how the derivative of the sigmoid is computed mathematically.
    # The intermediate variables just walk through the differentiation step by step.
    x = 0.32
    sigmoid = 1 / (1 + np.exp(-x))
    differentiate = np.exp(-x) / (1 + np.exp(-x))**2
    differentiate_1 = ((1 + np.exp(-x)) - 1) / (1 + np.exp(-x))**2
    differentiate_2 = (1 + np.exp(-x)) / (1 + np.exp(-x))**2 - (1 / (1 + np.exp(-x)))**2
    differentiate_3 = sigmoid - sigmoid**2
    sigmoid_prime = sigmoid * (1 - sigmoid)

The transfer function, or sigmoid function, squashes values into the range 0 to 1, so they can be read as probabilities. Sigmoid prime has a nice bell-shaped curve and takes values in the range 0 to 0.25, with its maximum at x = 0.
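As a quick sanity check (not from the original post), the closed-form sigmoid_prime can be compared against a central finite-difference approximation; the helper function and the step size h below are illustrative.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x, h = 0.32, 1e-6
    analytic = sigmoid(x) * (1.0 - sigmoid(x))              # closed-form derivative
    numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)   # central difference
    assert abs(analytic - numeric) < 1e-8                   # the two agree closely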
