Question: A 4-layer neural network using the ReLU activation function does not work well

Stack Overflow user

Asked 2019-04-18 15:59:20

Answers: 1, Views: 95, Followers: 0, Votes: 0

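I have been trying to build a 4-layer neural network using the ReLU activation function, but it does not work well.

I suspect the problem is in the backpropagation part, because the rest of the code works fine when I use the sigmoid activation function. The backpropagation part is the only thing I changed.

Can anyone tell me what is wrong with my code?

The code below is part of my neural network class.

Also, I don't want to use any deep learning framework, sorry!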

import numpy

# train the neural network (method of the neural network class)
def train(self, inputs_list, targets_list):
    # convert the input and target lists to 2d column vectors
    inputs = numpy.array(inputs_list, ndmin=2).T
    targets = numpy.array(targets_list, ndmin=2).T

    # calculate signals into the first hidden layer
    hidden_inputs = numpy.dot(self.wih, inputs)
    # calculate the signals emerging from the first hidden layer
    hidden_outputs = self.activation_function(hidden_inputs)

    # calculate signals into the second hidden layer
    hidden_inputs2 = numpy.dot(self.wh1h2, hidden_outputs)
    # calculate the signals emerging from the second hidden layer
    hidden_outputs2 = self.activation_function(hidden_inputs2)

    # calculate signals into the final output layer
    final_inputs = numpy.dot(self.wh2o, hidden_outputs2)
    # calculate the signals emerging from the final output layer
    final_outputs = self.activation_function(final_inputs)

    # output layer error is the (target - actual)
    output_errors = targets - final_outputs
    # second hidden layer error is the output_errors, split by weights, recombined at hidden nodes
    hidden_errors2 = numpy.dot(self.wh2o.T, output_errors)
    # first hidden layer error is hidden_errors2, split by weights, recombined at hidden nodes
    hidden_errors = numpy.dot(self.wh1h2.T, hidden_errors2)

    # Back propagation part
    # update the weights for the links between the second hidden layer and the output layer
    # (previous sigmoid version, kept for reference)
    # self.wh2o += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs2))
    # numpy.heaviside(x, 0.0) is used as the ReLU derivative: 1 where x > 0, else 0
    self.wh2o += self.lr * numpy.dot((output_errors * numpy.heaviside(final_inputs, 0.0)), numpy.transpose(hidden_outputs2))

    # update the weights for the links between the first and second hidden layers
    self.wh1h2 += self.lr * numpy.dot((hidden_errors2 * numpy.heaviside(hidden_inputs2, 0.0)), numpy.transpose(hidden_outputs))

    # update the weights for the links between the input layer and the first hidden layer
    self.wih += self.lr * numpy.dot((hidden_errors * numpy.heaviside(hidden_inputs, 0.0)), numpy.transpose(inputs))

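wh2o denotes the weights that propagate the second hidden layer to the output layer.

wh1h2 denotes the weights that propagate the first hidden layer to the second hidden layer.

wih denotes the weights that propagate the input layer to the first hidden layer.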
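One way to test the suspicion about the backpropagation part is a finite-difference gradient check. The following is a minimal sketch, not the asker's class: the layer sizes, the random data, the helper names (`relu`, `relu_grad`, `forward`, `loss`, `analytic_grads`, `numeric_grads`) and the squared-error loss are all assumptions made for illustration. It keeps ReLU on every layer, as in the posted code, but multiplies by the ReLU derivative at each layer before passing the error further back, which is what the chain rule gives. The posted code instead passes the raw errors through the transposed weights. Note also that the sketch returns gradients of the loss, so a matching update would be `w -= lr * grad`, whereas the posted code uses `+=` with `targets - final_outputs`.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_grad(x):
    # Same values as numpy.heaviside(x, 0.0): 1 where x > 0, else 0.
    return np.heaviside(x, 0.0)

def forward(weights, inputs):
    # weights = [wih, wh1h2, wh2o], named after the question's matrices.
    wih, wh1h2, wh2o = weights
    z1 = wih @ inputs
    a1 = relu(z1)
    z2 = wh1h2 @ a1
    a2 = relu(z2)
    z3 = wh2o @ a2
    a3 = relu(z3)
    return z1, a1, z2, a2, z3, a3

def loss(weights, inputs, targets):
    # Assumed loss: half the sum of squared errors.
    return 0.5 * np.sum((targets - forward(weights, inputs)[-1]) ** 2)

def analytic_grads(weights, inputs, targets):
    wih, wh1h2, wh2o = weights
    z1, a1, z2, a2, z3, a3 = forward(weights, inputs)
    # Chain rule: each delta includes the derivative of its own layer.
    d3 = (a3 - targets) * relu_grad(z3)
    d2 = (wh2o.T @ d3) * relu_grad(z2)
    d1 = (wh1h2.T @ d2) * relu_grad(z1)
    return [d1 @ inputs.T, d2 @ a1.T, d3 @ a2.T]

def numeric_grads(weights, inputs, targets, eps=1e-6):
    # Central differences, perturbing one weight at a time in place.
    grads = []
    for w in weights:
        g = np.zeros_like(w)
        for idx in np.ndindex(w.shape):
            old = w[idx]
            w[idx] = old + eps
            plus = loss(weights, inputs, targets)
            w[idx] = old - eps
            minus = loss(weights, inputs, targets)
            w[idx] = old
            g[idx] = (plus - minus) / (2 * eps)
        grads.append(g)
    return grads

# Hypothetical sizes: 3 inputs, two hidden layers of 4 nodes, 2 outputs.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(2, 4))]
inputs = rng.random((3, 1))
targets = rng.random((2, 1))

max_diff = max(
    np.abs(a - n).max()
    for a, n in zip(analytic_grads(weights, inputs, targets),
                    numeric_grads(weights, inputs, targets))
)
print("largest analytic/numeric difference:", max_diff)
```

If the printed difference is tiny, the analytic gradients agree with the loss. The same check can be pointed at any hand-written update rule by comparing its weight change (divided by the learning rate) against `numeric_grads`.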

python-3.x

deep-learning

1 Answer

Stack Overflow user

Answered 2019-04-18 16:22:07

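Without looking at the details of the code: ReLU's popularity stems mainly from its success in CNNs. For a small regression problem like this one it is a rather poor choice, because it brings the vanishing-gradient problem to the forefront; for complicated reasons, this is much less of an issue in large CNN problems. There are many ways to make your architecture more robust to vanishing gradients, but my first suggestion is simply not to use ReLU (maxout is your best friend; vanishing gradients are a long-standing problem). In the end, this quite possibly has nothing to do with a bug in your code and is purely an architectural issue.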
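The answer names maxout without showing it. As a rough sketch only: a maxout unit computes several affine functions of its input and outputs their element-wise maximum, so at least one piece always carries a nonzero gradient. The function name `maxout`, the argument shapes, and the example weights below are assumptions for illustration and are not taken from the answer; the column-vector layout follows the question's code.

```python
import numpy as np

def maxout(x, W, b):
    """Maxout layer sketch.

    x: (n_in, batch)        column-vector inputs, as in the question's code
    W: (k, n_out, n_in)     k affine pieces per output node
    b: (k, n_out, 1)        one bias per piece and output node
    returns: (n_out, batch) element-wise maximum over the k pieces
    """
    z = np.einsum('koi,ib->kob', W, x) + b
    return z.max(axis=0)

x = np.array([[-1.0, 2.0],
              [3.0, -4.0]])
identity = np.eye(2)
zero_bias = np.zeros((2, 2, 1))

# With the pieces (x, 0), maxout reduces to ReLU.
W_relu = np.stack([identity, np.zeros((2, 2))])
relu_like = maxout(x, W_relu, zero_bias)

# With the pieces (x, -x), maxout reduces to the absolute value.
W_abs = np.stack([identity, -identity])
abs_like = maxout(x, W_abs, zero_bias)

print(relu_like)
print(abs_like)
```

In a trained network the pieces are learned rather than fixed, so the layer has k times as many weights as a plain layer with the same number of output nodes.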

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/55741553
