AnswerBun.com

StandardScaler returns NaN

env:

spark-1.6.0 with scala-2.10.4

usage:

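import org.apache.spark.ml.feature.StandardScaler

// Each row of df (a DataFrame) is (String, String, Double, Vector) as (id1, id2, label, feature)
val df = sqlContext.read.parquet("data/Labeled.parquet")

// Fit a scaler that divides by the standard deviation without centering on the mean
val SC = new StandardScaler()
  .setInputCol("feature").setOutputCol("scaled")
  .setWithMean(false).setWithStd(true).fit(df)

// Replace the original feature column with the scaled one
val scaled = SC.transform(df)
  .drop("feature").withColumnRenamed("scaled", "feature")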

The code follows the example at http://spark.apache.org/docs/latest/ml-features.html#standardscaler

NaN values appear in scaled, SC.mean, and SC.std.

I don't understand how StandardScaler can produce NaN, even in the mean, or how to handle this situation. Any advice is appreciated.

The data is 1.6 GiB as Parquet. If anyone needs it, just let me know.

UPDATE:

After reading through the StandardScaler code, this looks like a Double precision problem that occurs when MultivariateOnlineSummarizer aggregates the values.
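One way to check this hypothesis is to scan the feature values for entries that are non-finite or so large that squaring them leaves the Double range. The sketch below is plain Scala with no Spark dependency; the object name, the helper names, and the threshold are assumptions made for illustration and are not part of Spark.

```scala
object SuspectValues {
  // Squaring a magnitude well above this limit overflows to Infinity.
  // Assumption: the square root of Double.MaxValue is a reasonable cut-off.
  val SquareSafeLimit: Double = math.sqrt(Double.MaxValue)

  // True for NaN, +/-Infinity, and magnitudes above the limit.
  def isSuspect(x: Double): Boolean =
    x.isNaN || x.isInfinity || math.abs(x) > SquareSafeLimit

  // Indices of the suspect entries in one feature array.
  def suspectIndices(features: Array[Double]): Seq[Int] =
    features.indices.filter(i => isSuspect(features(i)))

  def main(args: Array[String]): Unit = {
    val row = Array(1.0, Double.MaxValue, Double.NaN, -3.5)
    println(suspectIndices(row))
  }
}
```

With a Spark Vector, the array passed in would presumably come from the vector's toArray method, applied row by row over the feature column.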

Stack Overflow Asked by skywalkerytx on December 31, 2020

One Answer

One of the values is equal to Double.MaxValue, so when StandardScaler sums the columns, the result overflows.
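The overflow can be reproduced in plain Scala without Spark. The sketch below uses a textbook one-pass variance formula with Double accumulators. It is a hypothetical helper for illustration, not the code Spark's summarizer runs, but it shows how a single Double.MaxValue entry turns an intermediate result into Infinity and the final statistic into NaN.

```scala
object OverflowDemo {
  // Textbook variance E[x^2] - E[x]^2 using plain Double arithmetic.
  // Hypothetical helper; not Spark's MultivariateOnlineSummarizer.
  def naiveVariance(xs: Seq[Double]): Double = {
    val n = xs.size.toDouble
    val mean = xs.sum / n
    val meanOfSquares = xs.map(x => x * x).sum / n
    meanOfSquares - mean * mean
  }

  def main(args: Array[String]): Unit = {
    // A sum that exceeds the Double range becomes Infinity.
    println(Double.MaxValue + Double.MaxValue)

    // Both terms of the subtraction overflow, and Infinity - Infinity is NaN.
    println(naiveVariance(Seq(1.0, 2.0, Double.MaxValue)))

    // The same formula behaves normally on ordinary values.
    println(naiveVariance(Seq(1.0, 2.0, 3.0)))
  }
}
```

Once one partial result is Infinity or NaN, every statistic derived from it is affected, which is consistent with NaN showing up in the mean, the standard deviation, and the scaled output at the same time.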

Simply casting those columns to scala.math.BigDecimal works.
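A minimal sketch of why BigDecimal helps, assuming the values are available as plain Doubles: accumulating in BigDecimal keeps the running total exact enough that it never leaves the representable range, and only the final mean is converted back to Double. The object and method names are hypothetical, and how the cast is applied to the DataFrame columns is not shown in this answer.

```scala
object BigDecimalSum {
  // Accumulate in BigDecimal so the running total cannot overflow to Infinity.
  // Note: BigDecimal(x) throws for NaN or infinite inputs, so filter those first.
  def exactSum(xs: Seq[Double]): BigDecimal =
    xs.foldLeft(BigDecimal(0))((acc, x) => acc + BigDecimal(x))

  // Mean computed from the BigDecimal total, converted to Double at the end.
  def safeMean(xs: Seq[Double]): Double =
    (exactSum(xs) / BigDecimal(xs.size)).toDouble

  // Plain Double mean, kept for comparison.
  def naiveMean(xs: Seq[Double]): Double = xs.sum / xs.size

  def main(args: Array[String]): Unit = {
    val column = Seq(Double.MaxValue, Double.MaxValue)
    println(naiveMean(column)) // the Double sum overflows before the division
    println(safeMean(column))  // stays finite
  }
}
```

BigDecimal arithmetic is considerably slower than Double arithmetic, so on a dataset of this size it may be worth checking first whether the extreme values are genuine data or sentinel placeholders that can be filtered out.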

Reference:

http://www.scala-lang.org/api/current/index.html#scala.math.BigDecimal

Correct answer by skywalkerytx on December 31, 2020
