StandardScaler (Spark 3.0.0 JavaDoc)

Object
- org.apache.spark.mllib.feature.StandardScaler

All Implemented Interfaces:

org.apache.spark.internal.Logging
```
public class StandardScaler
extends Object
implements org.apache.spark.internal.Logging
```
Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set.
The "unit std" is computed using the corrected sample standard deviation (https://en.wikipedia.org/wiki/Standard_deviation#Corrected_sample_standard_deviation), which is computed as the square root of the unbiased sample variance.
param: withMean False by default. Centers the data with the mean before scaling. This builds a dense output, so take care when applying it to sparse input.
param: withStd True by default. Scales the data to unit standard deviation.
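The arithmetic described above can be sketched in plain Java without any Spark dependency. This is only an illustration under stated assumptions, not Spark's implementation: the class `StandardizeSketch` and its methods `columnMeans`, `columnStds`, and `standardize` are hypothetical names, the samples are held in a `double[][]` rather than an `RDD<Vector>`, and mapping zero-variance columns to 0.0 is a choice made for this sketch.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Spark API.
final class StandardizeSketch {

    /** Per-column mean of a row-major sample matrix. */
    static double[] columnMeans(double[][] rows) {
        int d = rows[0].length;
        double[] mean = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < d; j++) {
            mean[j] /= rows.length;
        }
        return mean;
    }

    /**
     * Corrected sample standard deviation per column: the square root of the
     * unbiased sample variance (sum of squared deviations divided by n - 1).
     * Requires at least two rows.
     */
    static double[] columnStds(double[][] rows, double[] mean) {
        int n = rows.length;
        int d = mean.length;
        double[] std = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                double diff = row[j] - mean[j];
                std[j] += diff * diff;
            }
        }
        for (int j = 0; j < d; j++) {
            std[j] = Math.sqrt(std[j] / (n - 1));
        }
        return std;
    }

    /** Applies optional centering (withMean) and optional scaling (withStd). */
    static double[] standardize(double[] x, double[] mean, double[] std,
                                boolean withMean, boolean withStd) {
        double[] out = new double[x.length];
        for (int j = 0; j < x.length; j++) {
            double v = withMean ? x[j] - mean[j] : x[j];
            if (withStd) {
                // Assumption of this sketch: a zero-variance column maps to 0.0.
                v = (std[j] != 0.0) ? v / std[j] : 0.0;
            }
            out[j] = v;
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] rows = {{1.0, 10.0}, {2.0, 10.0}, {3.0, 10.0}};
        double[] mean = columnMeans(rows);
        double[] std = columnStds(rows, mean);
        System.out.println(Arrays.toString(standardize(rows[2], mean, std, true, true)));
    }
}
```

Centering subtracts a generally nonzero mean from every entry, including entries that were zero in a sparse vector, which is why the withMean option produces dense output.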

Constructor Summary

Constructors
Constructor and Description

StandardScaler()

StandardScaler(boolean withMean, boolean withStd)

Method Summary

Modifier and Type	Method and Description
`StandardScalerModel`	`fit(RDD<Vector> data)` Computes the mean and variance and stores them as a model to be used for later scaling.

Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.internal.Logging
$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize

- Constructor Detail
  - StandardScaler
```
public StandardScaler(boolean withMean,
                      boolean withStd)
```
  - StandardScaler
```
public StandardScaler()
```
- Method Detail
  - fit
```
public StandardScalerModel fit(RDD<Vector> data)
```
    Computes the mean and variance and stores them as a model to be used for later scaling.
    
    Parameters:
    
    data - The data used to compute the mean and variance to build the transformation model.
    
    Returns:
    
    a StandardScalerModel
