Q: MNIST 2-layer neural network fails to recognize certain digits

Data Science user

Asked 2020-12-14 03:36:47

Answers: 1 · Views: 120 · Followers: 0 · Votes: 0

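Recently I have been studying basic multilayer neural networks, and I decided to try a 2-layer network with 100 neurons in the hidden layer on the MNIST database of handwritten digits.

My network has one hidden layer that uses ReLU, and the output layer uses the sigmoid function. The cost is computed with MSE, and the weights are updated with SGD. The MATLAB code that trains the network is included below.

While training this network, I found that it somehow fails to recognize the 8s and 9s in the data set. It has some trouble with other digits too, but not nearly as much as with 8 or 9.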
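To see exactly which digits are misclassified, and what they are mistaken for, a confusion matrix with per-digit accuracy can help. The following is only a minimal sketch: the label vectors `trueLabels` and `predLabels` are hypothetical stand-ins for the real test labels and the network's predictions, which are not shown in the question.

```matlab
% Hypothetical example data: true and predicted digit labels (0-9).
trueLabels = [0 1 8 8 9 9 3 8];
predLabels = [0 1 3 8 4 9 3 3];

% Confusion matrix: row = true digit, column = predicted digit.
% Digits are shifted by 1 because MATLAB indices start at 1.
confusion = accumarray([trueLabels' + 1, predLabels' + 1], 1, [10 10]);

% Per-digit accuracy (recall): correct predictions divided by the number
% of true examples of that digit (NaN where a digit does not occur).
perDigitAccuracy = diag(confusion) ./ sum(confusion, 2);

for digit = 0:9
    fprintf('Digit %d: %.2f\n', digit, perDigitAccuracy(digit + 1));
end
```

Row 9 of `confusion` (the digit 8) then shows which digits the 8s are being confused with, which is usually more informative than a single overall accuracy figure.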

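How can I fix this?

Edit: The results of training the network from a random set of weights are very inconsistent, and the accuracy is often low. Is something wrong with the backpropagation step, or is this network structure simply not the best approach to digit recognition?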
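Since the question does not show how the random starting weights are generated, the following is only a sketch of one common way to make runs repeatable and keep the initial activations in a reasonable range. The seed, the input size of 784 and the output size of 10 are assumptions; only the hidden layer size of 100 comes from the question.

```matlab
rng(42);                        % fixed seed (arbitrary) for repeatable runs

numInputs  = 784;               % assumption: 28x28 images flattened to rows
numHidden  = 100;               % hidden layer size stated in the question
numOutputs = 10;                % assumption: one-hot targets for digits 0-9

% He-style scaling for the ReLU layer: standard deviation sqrt(2 / fan-in).
weights1 = randn(numHidden, numInputs) * sqrt(2 / numInputs);

% Xavier-style scaling for the sigmoid output layer: sqrt(1 / fan-in).
weights2 = randn(numOutputs, numHidden) * sqrt(1 / numHidden);
```

Weights drawn with a much larger spread can saturate the sigmoid outputs early on, which is one possible (not confirmed) source of inconsistent results.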

*dsigmoid and dReLU are the derivatives of their respective functions.
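The helper functions themselves are not included in the question. A plausible set of definitions, written here as anonymous functions purely as an assumption about what they compute, would be:

```matlab
% Assumed definitions; the original helper files are not shown.
ReLU     = @(z) max(z, 0);
dReLU    = @(z) double(z > 0);                   % derivative taken as 0 at z = 0
sigmoid  = @(z) 1 ./ (1 + exp(-z));
dsigmoid = @(z) sigmoid(z) .* (1 - sigmoid(z));  % argument is the pre-activation z
```

Note that `dsigmoid` in this sketch expects the pre-activation `Z2`, not the already-activated output, which matches how it is called in the training code.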

function [trainedWeights1, trainedWeights2] = twoLayerSGD(weights1, weights2, inputs, outputs, alpha, shuffle)
% Trains a 2 layer perceptron using stochastic gradient descent and MSE
%
% function [trainedWeights1, trainedWeights2] = twoLayerSGD(weights1, weights2, inputs, outputs, alpha, shuffle)
% Inputs:
%   weights1 = weights between input layer and hidden layer as a matrix
%              where row i and column j represents the weight between node
%              j of the input layer and node i of the hidden layer
%   weights2 = weights between hidden layer and output layer as a matrix
%              where row i and column j represents the weight between node j of the
%              hidden layer and node i of the output layer
%   inputs = inputs of the training data, where each set of inputs is
%            stored as a row vector
%   outputs = the correct outputs of the training data, where each set of
%             outputs is stored as a row vector
%   alpha = the learning rate, default is 0.005
%   shuffle = boolean value, determines whether the training cases will be
%             randomly ordered, default is false
%
% Outputs:
%   trainedWeights1 = the trained weight matrix for the weights between the
%                     input layer and hidden layer
%   trainedWeights2 = the trained weight matrix for the weights between the
%                     hidden layer and output layer

if ~exist('alpha','var') || isempty(alpha)
    alpha = 0.005;
end

if ~exist('shuffle','var') || isempty(shuffle)
    shuffle = false;
end

% Determining the size of the input data set
N = size(inputs,1);
fprintf('Data set size: %.0f\n', N);

% Shuffle input data
if shuffle
    fprintf('Shuffling data...\n')
    order = randperm(size(inputs, 1));
    inputs = inputs(order, :);
    outputs = outputs(order, :);
    fprintf('Shuffle complete\n')
end

% Train Model
fprintf('Starting training process...\n')
trainedWeights1 = weights1;
trainedWeights2 = weights2;

for dataSet = 1:N
    % Get inputs of current data set
    input = inputs(dataSet,:); % row vector
    
    % Calculate activations of hidden nodes and the output
    Z1 = trainedWeights1*input'; % column vector
    hiddenActivation = ReLU(Z1); % column vector
    Z2 = trainedWeights2*hiddenActivation; % column vector
    output = sigmoid(Z2)'; % row vector
    
    % Calculate the error/cost of the current weights
    correctOutput = outputs(dataSet,:); % row vector
    cost = sum((output - correctOutput).^2);
    
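    % Update Weights
    % Both deltas are computed before any weights change, so the hidden
    % layer gradient uses the same weights2 as the forward pass.
    delta2 = 2*(output' - correctOutput') .* dsigmoid(Z2); % column vector
    delta1 = (trainedWeights2' * delta2) .* dReLU(Z1); % column vector
    trainedWeights2 = trainedWeights2 - alpha * delta2 * hiddenActivation';
    trainedWeights1 = trainedWeights1 - alpha * delta1 * input;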
    
    % Display status
    if mod(dataSet, 1000) == 0
        fprintf('Data set %.0fk/%.0fk complete!\n',dataSet/1000,N/1000);
    end
end
end
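One way to test whether a backpropagation step is correct is a numerical gradient check: compare the analytic gradients against central finite differences on a tiny network. The sketch below is not part of the original code. The network sizes, the data and the helper definitions are all hypothetical, chosen only to mirror the ReLU, sigmoid and MSE setup described above.

```matlab
rng(0);                                  % fixed seed so the check is repeatable
ReLU     = @(z) max(z, 0);
dReLU    = @(z) double(z > 0);
sigmoid  = @(z) 1 ./ (1 + exp(-z));
dsigmoid = @(z) sigmoid(z) .* (1 - sigmoid(z));

% Tiny hypothetical network: 4 inputs, 3 hidden nodes, 2 outputs.
W1 = randn(3, 4);  W2 = randn(2, 3);
x  = rand(1, 4);   y  = [1 0];           % row vectors, as in twoLayerSGD

% Squared-error cost as a function of both weight matrices.
cost = @(A, B) sum((sigmoid(B * ReLU(A * x'))' - y).^2);

% Analytic gradients from backpropagation.
Z1 = W1 * x';  H = ReLU(Z1);  Z2 = W2 * H;
delta2 = 2 * (sigmoid(Z2) - y') .* dsigmoid(Z2);
delta1 = (W2' * delta2) .* dReLU(Z1);
gradW2 = delta2 * H';
gradW1 = delta1 * x;

% Numerical gradients from central differences, one weight at a time.
h = 1e-6;
numW1 = zeros(size(W1));
for k = 1:numel(W1)
    E = zeros(size(W1));  E(k) = h;
    numW1(k) = (cost(W1 + E, W2) - cost(W1 - E, W2)) / (2 * h);
end
numW2 = zeros(size(W2));
for k = 1:numel(W2)
    E = zeros(size(W2));  E(k) = h;
    numW2(k) = (cost(W1, W2 + E) - cost(W1, W2 - E)) / (2 * h);
end

fprintf('Max difference W1: %g\n', max(abs(gradW1(:) - numW1(:))));
fprintf('Max difference W2: %g\n', max(abs(gradW2(:) - numW2(:))));
```

If both differences are tiny relative to the gradient magnitudes, the backpropagation formulas agree with the cost function. A large difference points to an error in the corresponding update.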

neural-network

matlab

mnist

1 Answer

Data Science user

Accepted answer

Answered 2020-12-17 20:20:25

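A sigmoid activation combined with MSE loss usually does not work well. As a first step, I would use the standard approach: softmax with cross-entropy loss.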
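As a rough illustration of that suggestion, the output layer could be changed along the following lines. This is a sketch under assumptions, not the asker's code: the values in `Z2` are made up, and the target is assumed to be a one-hot column vector in which index 9 encodes the digit 8.

```matlab
% Hypothetical pre-activations of the 10 output nodes and a one-hot target.
Z2 = [2.0; 1.0; 0.1; -1.0; 0.5; 0.0; -0.5; 1.5; 3.0; 0.2];
correctOutput = zeros(10, 1);
correctOutput(9) = 1;                    % assumed encoding: index 9 = digit 8

% Softmax, shifted by max(Z2) for numerical stability.
expZ = exp(Z2 - max(Z2));
output = expZ / sum(expZ);               % entries are positive and sum to 1

% Cross-entropy cost for a one-hot target.
cost = -sum(correctOutput .* log(output));

% With softmax plus cross-entropy the output delta is simply
% (output - correctOutput); no dsigmoid factor is needed.
delta2 = output - correctOutput;
```

In the training loop from the question, `delta2` would then replace the term `2*(output'-correctOutput') .* dsigmoid(Z2)`. Because the derivative of the sigmoid no longer multiplies the error, a confidently wrong output still produces a large gradient.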

Votes: 0

Original page content provided by Data Science.

Original link: https://datascience.stackexchange.com/questions/86652
