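I'm new to Scala and Spark and am trying to perform the operation below on a DataFrame column. I have a column containing alphanumeric values, and I'd like to update those values based on a mathematical operation: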
+--------------------------------------+
|Error |
+--------------------------------------+
|value: 0.25 Does not meet Requirements|
|value: 0.5 Does not meet Requirements|
|value: 0.75 Does not meet Requirements|
|value: 0.66 Does not meet Requirements|
|value: 0.34 Does not meet Requirements|
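+--------------------------------------+
I want to perform the numeric operation (1 - {Numeric from String}) and update the column with the new value.
For example, I'd like the output to look like this: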
+--------------------------------------+
|Error |
+--------------------------------------+
|value: 0.75 Does not meet Requirements|
|value: 0.5 Does not meet Requirements|
|value: 0.25 Does not meet Requirements|
|value: 0.34 Does not meet Requirements|
|value: 0.66 Does not meet Requirements|
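+--------------------------------------+
Any help would be greatly appreciated. I have learned the column methods that work with regular expressions, but I have no clue how to perform the math operation.
Regards,
Mahi
Answered 2020-05-05 01:08:17
Suppose you have multiple columns: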
+------+--------------------+
| col1| Error|
+------+--------------------+
| first|value: 0.25 Does ...|
|second|value: 0.5 Does ...|
| third|value: 0.75 Does ...|
|fourth|value: 0.66 Does ...|
| fifth|value: 0.34 Does ...|
+------+--------------------+
You can use split and mkString to update the Error column.
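import org.apache.spark.sql.{Encoder, Row}
import org.apache.spark.sql.catalyst.encoders.RowEncoder

// Compute 1 - number with BigDecimal and render the result as a String
val subtractFromOne: Double => String = number =>
  (BigDecimal(1.0) - BigDecimal(number)).toString()

// Replace the second space-separated token (the number) with 1 - number
val transform: String => String = s => s.split(' ') match {
  case Array(first, number, rest @ _*) =>
    (Seq(first, subtractFromOne(number.toDouble)) ++ rest).mkString(" ")
  case _ => s // if the string is invalid, return it unchanged
}

// Mapping over Rows needs an explicit Row encoder for the DataFrame's schema
implicit val enc: Encoder[Row] = RowEncoder(df.schema)

df
  .map(row => Row(row(0), transform(row.getString(1))))
  .show(truncate = false) // print full strings rather than cutting them at 20 characters

This will output: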
+------+--------------------------------------+
| col1| Error|
+------+--------------------------------------+
| first|value: 0.75 Does not meet Requirements|
|second|value: 0.5 Does not meet Requirements|
| third|value: 0.25 Does not meet Requirements|
|fourth|value: 0.34 Does not meet Requirements|
| fifth|value: 0.66 Does not meet Requirements|
+------+--------------------------------------+
BigDecimal is used to preserve the scale of the result.
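To illustrate why BigDecimal matters here, the following is a minimal plain-Scala sketch (no Spark required) that compares the subtraction done with BigDecimal against the same subtraction done with Double. The names SubtractDemo, subtractExact, and subtractDouble are hypothetical and not part of the answer above. Unlike subtractFromOne, this variant parses the string straight into a BigDecimal, which is an assumption made for the illustration.

```scala
object SubtractDemo {
  // 1 - number, computed in decimal arithmetic so the scale of the input is kept
  def subtractExact(number: String): String =
    (BigDecimal(1) - BigDecimal(number)).toString

  // 1 - number, computed in binary floating point, for comparison only
  def subtractDouble(number: String): String =
    (1.0 - number.toDouble).toString

  // Same token-replacement idea as the answer's transform, built on subtractExact
  def transform(s: String): String = s.split(' ') match {
    case Array(first, number, rest @ _*) =>
      (Seq(first, subtractExact(number)) ++ rest).mkString(" ")
    case _ => s // fewer than two tokens: return the input unchanged
  }

  def main(args: Array[String]): Unit = {
    // Print both results side by side for a few of the sample values
    Seq("0.25", "0.5", "0.66").foreach { n =>
      println(s"$n -> BigDecimal: ${subtractExact(n)}, Double: ${subtractDouble(n)}")
    }
    println(transform("value: 0.66 Does not meet Requirements"))
  }
}
```

With BigDecimal, 1 - 0.66 comes out as 0.34, while the Double version prints a value along the lines of 0.33999999999999997, because 0.66 has no exact binary representation. That difference would otherwise end up inside the rewritten Error string.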
https://stackoverflow.com/questions/61596938