ALS
- 
class pyspark.mllib.recommendation.ALS
- Alternating Least Squares matrix factorization.
- New in version 0.9.0.
- Methods
  - train(ratings, rank[, iterations, lambda_, …]): Train a matrix factorization model given an RDD of ratings by users for a subset of products.
  - trainImplicit(ratings, rank[, iterations, …]): Train a matrix factorization model given an RDD of ‘implicit preferences’ of users for a subset of products.
- Methods Documentation
- 
classmethod train(ratings, rank, iterations=5, lambda_=0.01, blocks=-1, nonnegative=False, seed=None)
- Train a matrix factorization model given an RDD of ratings by users for a subset of products. The ratings matrix is approximated as the product of two lower-rank matrices of a given rank (number of features). To solve for these features, ALS is run iteratively with a configurable level of parallelism.
- New in version 0.9.0.
- Parameters
- ratings : pyspark.RDD
  - RDD of Rating or (userID, productID, rating) tuple.
- rank : int
  - Number of features to use (also referred to as the number of latent factors).
- iterations : int, optional
  - Number of iterations of ALS. (default: 5)
- lambda_ : float, optional
  - Regularization parameter. (default: 0.01)
- blocks : int, optional
  - Number of blocks used to parallelize the computation. A value of -1 will use an auto-configured number of blocks. (default: -1)
- nonnegative : bool, optional
  - A value of True will solve least-squares with nonnegativity constraints. (default: False)
- seed : int, optional
  - Random seed for initial matrix factorization model. A value of None will use system time as the seed. (default: None)
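To make the "product of two lower-rank matrices" description concrete, here is a minimal pure-Python sketch of the alternating step for the special case rank = 1. It is an illustration only, not Spark's implementation: the function name `als_rank1` is hypothetical, the factors start from a fixed value of 1.0 instead of a seeded random model, and there is no blocking or nonnegativity constraint. It takes the same (userID, productID, rating) tuple shape as `ratings` and uses `iterations` and `lambda_` in the same roles as the parameters above.

```python
def als_rank1(ratings, iterations=5, lambda_=0.01):
    """Fit rating(u, p) ~= x[u] * y[p] by alternating regularized least squares.

    ratings: list of (user, product, rating) tuples.
    Returns two dicts: user factors x and product factors y.
    """
    users = sorted({u for u, _, _ in ratings})
    products = sorted({p for _, p, _ in ratings})
    x = {u: 1.0 for u in users}      # user factors (fixed initial guess, an assumption)
    y = {p: 1.0 for p in products}   # product factors

    for _ in range(iterations):
        # Hold product factors fixed; each user factor is a 1-D ridge solution.
        for u in users:
            num = sum(r * y[p] for uu, p, r in ratings if uu == u)
            den = lambda_ + sum(y[p] ** 2 for uu, p, _ in ratings if uu == u)
            x[u] = num / den
        # Hold user factors fixed; each product factor is a 1-D ridge solution.
        for p in products:
            num = sum(r * x[u] for u, pp, r in ratings if pp == p)
            den = lambda_ + sum(x[u] ** 2 for u, pp, _ in ratings if pp == p)
            y[p] = num / den
    return x, y


# Toy data that is exactly rank 1, so a rank-1 model can fit it closely.
toy_ratings = [(1, 10, 1.0), (1, 20, 3.0), (2, 10, 2.0), (2, 20, 6.0)]
user_factors, product_factors = als_rank1(toy_ratings, iterations=10)
predicted = user_factors[2] * product_factors[20]  # estimate for user 2, product 20
```

With an existing SparkContext `sc`, the corresponding call on this class would look like `ALS.train(sc.parallelize(toy_ratings), rank=1, iterations=10, lambda_=0.01, seed=42)`; the seed value here is an arbitrary choice for reproducibility.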
 
 
 - 
classmethod trainImplicit(ratings, rank, iterations=5, lambda_=0.01, blocks=-1, alpha=0.01, nonnegative=False, seed=None)
- Train a matrix factorization model given an RDD of ‘implicit preferences’ of users for a subset of products. The ratings matrix is approximated as the product of two lower-rank matrices of a given rank (number of features). To solve for these features, ALS is run iteratively with a configurable level of parallelism.
- New in version 0.9.0.
- Parameters
- ratings : pyspark.RDD
  - RDD of Rating or (userID, productID, rating) tuple.
- rank : int
  - Number of features to use (also referred to as the number of latent factors).
- iterations : int, optional
  - Number of iterations of ALS. (default: 5)
- lambda_ : float, optional
  - Regularization parameter. (default: 0.01)
- blocks : int, optional
  - Number of blocks used to parallelize the computation. A value of -1 will use an auto-configured number of blocks. (default: -1)
- alpha : float, optional
  - A constant used in computing confidence. (default: 0.01)
- nonnegative : bool, optional
  - A value of True will solve least-squares with nonnegativity constraints. (default: False)
- seed : int, optional
  - Random seed for initial matrix factorization model. A value of None will use system time as the seed. (default: None)
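The role of `alpha` can be sketched in plain Python. The text above says only that it is "a constant used in computing confidence"; the formula below is an assumption based on the commonly used implicit-feedback formulation, where each raw value r becomes a binary preference (1 if r > 0, else 0) weighted by a confidence of 1 + alpha * |r|. The helper name `implicit_targets` is hypothetical and is not part of the PySpark API.

```python
def implicit_targets(ratings, alpha=0.01):
    """Map raw implicit-feedback values to (preference, confidence) pairs.

    Assumed formulation (not stated in the reference text above):
      preference = 1.0 if r > 0 else 0.0
      confidence = 1.0 + alpha * abs(r)
    """
    targets = {}
    for user, product, r in ratings:
        preference = 1.0 if r > 0 else 0.0
        confidence = 1.0 + alpha * abs(r)
        targets[(user, product)] = (preference, confidence)
    return targets


# Example: user 1 interacted with product 10 five times and never with product 20.
observations = [(1, 10, 5.0), (1, 20, 0.0)]
targets = implicit_targets(observations, alpha=0.01)
```

Under this assumption, a larger `alpha` makes frequently observed interactions count for more relative to unobserved ones. With an existing SparkContext `sc`, the matching call on this class would be `ALS.trainImplicit(sc.parallelize(observations), rank=1, alpha=0.01)`.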
 
 
 
- 
classmethod