I'm having trouble reading the open-source MLlib code for SGD with L2 regularization.
The code is
    class SquaredL2Updater extends Updater {
      override def compute(
          weightsOld: Vector,
          gradient: Vector,
          stepSize: Double,
          iter: Int,
          regParam: Double): (Vector, Double) = {
        // add up both updates from the gradient of the loss (= step) as well as
        // the gradient of the regularizer (= regParam * weightsOld)
        // w' = w - thisIterStepSize * (gradient + regParam * w)
        // w' = (1 - thisIterStepSize * regParam) * w - thisIterStepSize * gradient
        val thisIterStepSize = stepSize / math.sqrt(iter)
        val brzWeights: BV[Double] = weightsOld.toBreeze.toDenseVector
        brzWeights :*= (1.0 - thisIterStepSize * regParam)
        brzAxpy(-thisIterStepSize, gradient.toBreeze, brzWeights)
        val norm = brzNorm(brzWeights, 2.0)
        (Vectors.fromBreeze(brzWeights), 0.5 * regParam * norm * norm)
      }
    }

The part I'm having trouble with is
    brzWeights :*= (1.0 - thisIterStepSize * regParam)

The Breeze library documents the :*= operator as follows:

    /** Mutates this by element-wise multiplication of b into this. */
    final def :*=[TT >: This, B](b: B)(implicit op: OpMulScalar.InPlaceImpl2[TT, B]): This = {
      op(repr, b)
      repr
    }

It looks as though the vector is simply being multiplied by a scalar.
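For the L2-regularized case, the gradient formula I found is

$$\nabla_w \left( L(w) + \frac{\lambda}{2} \lVert w \rVert_2^2 \right) = \nabla_w L(w) + \lambda w$$

where L(w) is the unregularized loss and λ is the regularization parameter (regParam in the code).

How does the code represent this gradient in the update? Can anyone help?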
Posted on 2015-09-08 23:05:45
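OK, I figured it out. The updater equation is

$$w' = w - \alpha \left( \nabla_w L(w) + \lambda w \right)$$

where L(w) is the unregularized loss, α is thisIterStepSize and λ is regParam. Rearranging terms:

$$w' = (1 - \alpha \lambda)\, w - \alpha \nabla_w L(w)$$

The last term is just the step size times the gradient of the loss:

$$-\alpha \nabla_w L(w)$$

In the code, this corresponds to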
    brzAxpy(-thisIterStepSize, gradient.toBreeze, brzWeights)

Expanding it:

    brzWeights = brzWeights + -thisIterStepSize * gradient.toBreeze

The preceding line is

    brzWeights :*= (1.0 - thisIterStepSize * regParam)

which means

    brzWeights = brzWeights * (1.0 - thisIterStepSize * regParam)

So, finally,

    brzWeights = brzWeights * (1.0 - thisIterStepSize * regParam) + (-thisIterStepSize) * gradient.toBreeze

Now the code and the equation match up to a normalization factor, which I believe is handled in the lines that follow.
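To check the algebra numerically, here is a minimal sketch in plain Scala that applies the update both ways: as the two in-place steps (scale, then axpy) and as the single formula w' = w - step * (gradient + regParam * w). It uses plain arrays rather than Breeze so it runs without Spark. The object L2UpdateSketch, its helpers twoStep, oneStep and regValue, and the sample numbers are all made up for illustration; they are not part of MLlib.

```scala
object L2UpdateSketch {
  // Hypothetical helper: mirrors the two in-place steps from the question
  // on plain arrays instead of Breeze vectors.
  def twoStep(w: Array[Double], g: Array[Double], stepSize: Double,
              iter: Int, regParam: Double): Array[Double] = {
    val step = stepSize / math.sqrt(iter)
    val out = w.clone()
    // Same effect as: brzWeights :*= (1.0 - thisIterStepSize * regParam)
    for (i <- out.indices) out(i) *= (1.0 - step * regParam)
    // Same effect as: brzAxpy(-thisIterStepSize, gradient, brzWeights), i.e. y += a * x
    for (i <- out.indices) out(i) += -step * g(i)
    out
  }

  // Hypothetical helper: the un-rearranged form
  // w' = w - step * (gradient + regParam * w)
  def oneStep(w: Array[Double], g: Array[Double], stepSize: Double,
              iter: Int, regParam: Double): Array[Double] = {
    val step = stepSize / math.sqrt(iter)
    w.zip(g).map { case (wi, gi) => wi - step * (gi + regParam * wi) }
  }

  // Hypothetical helper: 0.5 * regParam * ||w||^2, the same expression as the
  // second element of the tuple returned by compute in the question.
  def regValue(w: Array[Double], regParam: Double): Double =
    0.5 * regParam * w.map(x => x * x).sum

  def main(args: Array[String]): Unit = {
    // Arbitrary sample values, chosen only for illustration.
    val w = Array(1.0, -2.0, 0.5)
    val g = Array(0.3, 0.1, -0.4)
    val a = twoStep(w, g, stepSize = 1.0, iter = 4, regParam = 0.1)
    val b = oneStep(w, g, stepSize = 1.0, iter = 4, regParam = 0.1)
    val maxDiff = a.zip(b).map { case (x, y) => math.abs(x - y) }.max
    println(s"two-step: ${a.mkString(", ")}")
    println(s"one-step: ${b.mkString(", ")}")
    println(s"max difference: $maxDiff")
    println(s"regularization value: ${regValue(a, 0.1)}")
  }
}
```

The two results should agree up to floating-point rounding. One thing the sketch makes visible: in the quoted code, the norm computed after the update feeds the second return value, 0.5 * regParam * norm * norm, and is not applied back to the weights.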
https://stackoverflow.com/questions/32403958