The XOR problem is known to be solvable by a multi-layer perceptron (MLP): given all 4 Boolean input/output pairs, it trains and stores the weights needed to reproduce the I/O.
import numpy as np

np.random.seed(0)

def sigmoid(x):  # Squashes each value into the range (0, 1).
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(sx):
    # Note: sx is expected to already be sigmoid(x).
    # See https://math.stackexchange.com/a/1225116
    return sx * (1 - sx)

# Cost function.
def cost(predicted, truth):
    return truth - predicted

xor_input = np.array([[0,0], [0,1], [1,0], [1,1]])
xor_output = np.array([[0,1,1,0]]).T

X = xor_input
Y = xor_output

# Define the shape of the weight vector.
num_data, input_dim = X.shape
# Set the dimension of the intermediate (hidden) layer.
hidden_dim = 5
# Initialize weights between the input layer and the hidden layer.
W1 = np.random.random((input_dim, hidden_dim))
# Define the shape of the output vector.
output_dim = len(Y.T)
# Initialize weights between the hidden layer and the output layer.
W2 = np.random.random((hidden_dim, output_dim))

num_epochs = 10000
learning_rate = 1.0

for epoch_n in range(num_epochs):
    layer0 = X
    # Forward propagation.
    layer1 = sigmoid(np.dot(layer0, W1))
    layer2 = sigmoid(np.dot(layer1, W2))

    # Back propagation (Y -> layer2).
    # How much did we miss in the predictions?
    layer2_error = cost(layer2, Y)
    # In what direction is the target value?
    # Were we really close? If so, don't change too much.
    layer2_delta = layer2_error * sigmoid_derivative(layer2)

    # Back propagation (layer2 -> layer1).
    # How much did each layer1 value contribute to the layer2 error
    # (according to the weights)?
    layer1_error = np.dot(layer2_delta, W2.T)
    layer1_delta = layer1_error * sigmoid_derivative(layer1)

    # Update weights.
    W2 += learning_rate * np.dot(layer1.T, layer2_delta)
    W1 += learning_rate * np.dot(layer0.T, layer1_delta)
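A quick convergence check can be added after the loop (a minimal sketch, not in the original post; it reuses layer2 and Y from the final epoch):

# Sanity check: the mean absolute error should be close to 0 once the
# network has memorized all four input/output pairs.
print("Mean absolute error:", np.mean(np.abs(cost(layer2, Y))))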
We can see that the network has been trained well enough to memorize the XOR outputs:

# On the training data
[int(prediction > 0.5) for prediction in layer2]

Output:
[0, 1, 1, 0]

If we feed in the same inputs again, we get the same outputs:
for x, y in zip(X, Y):
    layer1_prediction = sigmoid(np.dot(W1.T, x))  # Feed the input through the trained W1.
    prediction = layer2_prediction = sigmoid(np.dot(W2.T, layer1_prediction))  # Then through the trained W2.
    print(int(prediction > 0.5), y)

Output:
0 [0]
1 [1]
1 [1]
0 [0]

But if we retrain the parameters (W1 and W2) without one of the data points, i.e. given
xor_input = np.array([[0,0], [0,1], [1,0], [1,1]])
xor_output = np.array([[0,1,1,0]]).T

let's drop the last row of data and use it as the unseen test:
X = xor_input[:-1]
Y = xor_output[:-1]

With the rest of the code unchanged, no matter how I change the hyperparameters, the network cannot learn the XOR function and reproduce the I/O.
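For concreteness, the retraining can be wrapped in a small helper; train_mlp is a name introduced here for illustration, not from the original post, and it reuses the functions and hyperparameters defined above:

def train_mlp(X, Y, hidden_dim=5, num_epochs=10000, learning_rate=1.0):
    # Hypothetical wrapper around the same training loop as above,
    # so it can be re-run on an arbitrary subset of the data.
    W1 = np.random.random((X.shape[1], hidden_dim))
    W2 = np.random.random((hidden_dim, Y.shape[1]))
    for _ in range(num_epochs):
        layer1 = sigmoid(np.dot(X, W1))
        layer2 = sigmoid(np.dot(layer1, W2))
        layer2_delta = cost(layer2, Y) * sigmoid_derivative(layer2)
        layer1_delta = np.dot(layer2_delta, W2.T) * sigmoid_derivative(layer1)
        W2 += learning_rate * np.dot(layer1.T, layer2_delta)
        W1 += learning_rate * np.dot(X.T, layer1_delta)
    return W1, W2

# Retrain on the first three rows only; [1, 1] is held out.
W1, W2 = train_mlp(xor_input[:-1], xor_output[:-1])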
Evaluating on all four inputs, including the held-out one:

for x, y in zip(xor_input, xor_output):
    layer1_prediction = sigmoid(np.dot(W1.T, x))  # Feed the (possibly unseen) input through the trained W1.
    prediction = layer2_prediction = sigmoid(np.dot(W2.T, layer1_prediction))  # Then through the trained W2.
    print(int(prediction > 0.5), y)

Output:
0 [0]
1 [1]
1 [1]
1 [0]

Even if we shuffle the inputs/outputs:
import random

# Shuffle the order of the inputs.
_temp = list(zip(X, Y))
random.shuffle(_temp)
xor_input_shuff, xor_output_shuff = map(np.array, zip(*_temp))

we still cannot fully train the XOR function:
for x, y in zip(xor_input, xor_output):
    layer1_prediction = sigmoid(np.dot(W1.T, x))  # Feed the (possibly unseen) input through the trained W1.
    prediction = layer2_prediction = sigmoid(np.dot(W2.T, layer1_prediction))  # Then through the trained W2.
    print(x, int(prediction > 0.5), y)

Output:
[0 0] 1 [0]
[0 1] 1 [1]
[1 0] 1 [1]
[1 1] 0 [0]

So when the literature says that the multi-layer perceptron (i.e., basic deep learning) solves XOR, does that simply mean it can fully learn and memorize the weights given the complete set of inputs/outputs, but cannot generalize the XOR problem when even one data point is missing?
Here is the link to the Kaggle dataset so that answerers can test the network themselves: https://www.kaggle.com/alvations/xor-with-mlp/
Answered on 2018-02-05 01:46:36
I think learning (generalizing) XOR and memorizing XOR are different things.

A two-layer perceptron can memorize XOR; that is, there exists a combination of weights for which the loss is at its minimum and equal to 0 (the absolute minimum).

If the weights are randomly initialized, you might end up actually learning XOR and not just memorizing it.
Note that a multi-layer perceptron is a non-convex function, so there can be multiple minima (even multiple global minima). When the data is missing one input, there are multiple minima (all equal in loss), and among them are minima in which the missing point would be classified correctly. Hence, an MLP can learn XOR (although finding such a weight combination may be hard with a point missing); a concrete example follows.
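To make the existence claim concrete, here is one hand-constructed weight combination that classifies all four points correctly (a minimal sketch, not from the original answer; it adds bias terms, which the question's network does not use, to make the classic OR/AND construction explicit):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hidden unit 1 approximates OR(x1, x2), hidden unit 2 approximates
# AND(x1, x2); the output computes h1 AND NOT h2, which is XOR.
W1 = np.array([[20., 20.],
               [20., 20.]])
b1 = np.array([-10., -30.])
W2 = np.array([[20.], [-20.]])
b2 = np.array([-10.])

for x in [[0, 0], [0, 1], [1, 0], [1, 1]]:
    h = sigmoid(np.dot(x, W1) + b1)
    y = sigmoid(np.dot(h, W2) + b2)
    print(x, int(y[0] > 0.5))  # Prints 0, 1, 1, 0.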
A neural network is a universal function approximator, and can even fit nonsensical (e.g., randomly assigned) labels. In that context, you may want to look at this work: https://arxiv.org/abs/1611.03530
https://stackoverflow.com/questions/48614723