LinearDataGenerator
- class pyspark.mllib.util.LinearDataGenerator
Utils for generating linear data.
New in version 1.5.0.
Methods
generateLinearInput(intercept, weights, ...)
New in version 1.5.0.
generateLinearRDD(sc, nexamples, nfeatures, eps)
Generate an RDD of LabeledPoints.
Methods Documentation
- static generateLinearInput(intercept, weights, xMean, xVariance, nPoints, seed, eps)
New in version 1.5.0.
- Parameters
- intercept : float
Bias factor, the term c in X’w + c.
- weights : pyspark.mllib.linalg.Vector or convertible
Feature vector, the term w in X’w + c.
- xMean : pyspark.mllib.linalg.Vector or convertible
Point around which the data X is centered.
- xVariance : pyspark.mllib.linalg.Vector or convertible
Variance of the given data.
- nPoints : int
Number of points to be generated.
- seed : int
Random seed.
- eps : float
Scales the noise: the larger eps is, the more Gaussian noise is added.
- Returns
- list
of pyspark.mllib.regression.LabeledPoint objects, of length nPoints
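The parameters above describe a linear model with additive noise, label = X’w + c + eps * noise. The following is a minimal pure-Python sketch of that model, not Spark's implementation: the function name `sketch_linear_input` is hypothetical, drawing each feature from a Gaussian with the given mean and variance is an assumption (the documentation above does not state the feature distribution), and plain (label, features) tuples stand in for LabeledPoint objects.

```python
import random


def sketch_linear_input(intercept, weights, x_mean, x_variance, n_points, seed, eps):
    """Hypothetical stand-in that returns a list of (label, features) tuples."""
    rng = random.Random(seed)  # seeded so repeated calls give the same data
    points = []
    for _ in range(n_points):
        # Assumption: each feature is Gaussian with the given mean and variance.
        features = [rng.gauss(m, v ** 0.5) for m, v in zip(x_mean, x_variance)]
        # Noise-free label: the dot product X'w plus the intercept c.
        label = sum(x * w for x, w in zip(features, weights)) + intercept
        # Gaussian noise scaled by eps; eps = 0 gives an exactly linear label.
        label += eps * rng.gauss(0.0, 1.0)
        points.append((label, features))
    return points


data = sketch_linear_input(
    intercept=0.5,
    weights=[2.0, -1.0],
    x_mean=[0.0, 1.0],
    x_variance=[1.0, 0.25],
    n_points=5,
    seed=42,
    eps=0.1,
)
for label, features in data:
    print(label, features)
```

With eps set to 0 every label falls exactly on the plane defined by weights and intercept, which is a convenient sanity check before fitting a regression model to noisier data.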