public class StandardScaler
extends Object
implements org.apache.spark.internal.Logging
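Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set.
The "unit std" is computed using the corrected sample standard deviation (https://en.wikipedia.org/wiki/Standard_deviation#Corrected_sample_standard_deviation), which is the square root of the unbiased sample variance.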
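To make the definition concrete, here is a minimal plain-Java sketch of the corrected sample standard deviation for a single column of values. It does not use Spark. The names `SampleStd` and `correctedStd` are hypothetical and exist only to show the `n - 1` divisor.

```java
/** Illustrative helper only; not part of Spark. */
final class SampleStd {
    /** Returns the square root of the unbiased sample variance (divisor n - 1). */
    static double correctedStd(double[] values) {
        int n = values.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        // First pass: arithmetic mean of the column.
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        // Second pass: sum of squared deviations from the mean.
        double squaredDeviations = 0.0;
        for (double v : values) {
            double d = v - mean;
            squaredDeviations += d * d;
        }
        // Dividing by n - 1 (not n) gives the unbiased sample variance.
        return Math.sqrt(squaredDeviations / (n - 1));
    }
}
```

For the values 1, 2, 3 the mean is 2 and the squared deviations sum to 2. Dividing by n - 1 = 2 gives a variance of 1, so the corrected sample standard deviation is 1.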
param: withMean False by default. Centers the data with mean before scaling. It will build a dense output, so take care when applying to sparse input.
param: withStd True by default. Scales the data to unit standard deviation.
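The effect of the two flags on one column can be sketched in plain Java as follows. This is an illustration of the arithmetic under stated assumptions, not Spark's implementation. `ColumnScaling` and `scale` are hypothetical names, and leaving a zero-deviation column unscaled is this sketch's own choice.

```java
/** Illustrative helper only; not part of Spark. */
final class ColumnScaling {
    /** Optionally centers a column by its mean and/or divides it by its corrected sample std. */
    static double[] scale(double[] column, boolean withMean, boolean withStd) {
        int n = column.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double sum = 0.0;
        for (double v : column) {
            sum += v;
        }
        double mean = sum / n;
        double squaredDeviations = 0.0;
        for (double v : column) {
            double d = v - mean;
            squaredDeviations += d * d;
        }
        double std = Math.sqrt(squaredDeviations / (n - 1));

        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double v = column[i];
            if (withMean) {
                v -= mean; // centering: a zero input generally becomes non-zero
            }
            if (withStd && std != 0.0) {
                v /= std; // sketch assumption: a zero-std column is left unscaled
            }
            out[i] = v;
        }
        return out;
    }
}
```

With `withMean` set to false, a zero entry stays zero because 0 / std = 0, so a sparse input can stay sparse. With `withMean` set to true, the same entry becomes -mean / std, which is why centering produces a dense output.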
| Constructor and Description |
|---|
| StandardScaler() |
| StandardScaler(boolean withMean, boolean withStd) |
| Modifier and Type | Method and Description |
|---|---|
| StandardScalerModel | fit(RDD<Vector> data) Computes the mean and variance and stores as a model to be used for later scaling. |
Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging: $init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize
public StandardScaler(boolean withMean, boolean withStd)
public StandardScaler()
public StandardScalerModel fit(RDD<Vector> data)
data - The data used to compute the mean and variance to build the transformation model.