org.apache.spark.mllib.regression
Construct a StreamingLinearRegression object with default parameters:
{stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
Initial weights must be set before using trainOn or predictOn (see StreamingLinearAlgorithm).
The algorithm to use for updating.
Return the latest model.
The model to be updated and used for prediction.
Java-friendly version of predictOn.
Use the model to make predictions on batches of data from a DStream.
DStream containing feature vectors
DStream containing predictions
Java-friendly version of predictOnValues.
Use the model to make predictions on the values of a DStream and carry over its keys.
key type
DStream containing feature vectors
DStream containing the input keys and the predictions as values
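As a rough illustration of how keys are carried over, the sketch below uses each test point's true label as the key, so the resulting stream pairs labels with predictions. The object and helper names (PredictOnValuesSketch, predictWithLabels) are hypothetical, and the model passed in is assumed to already have its initial weights set.

```scala
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
import org.apache.spark.streaming.dstream.DStream

object PredictOnValuesSketch {
  // Hypothetical helper: key each feature vector by its true label,
  // then predict on the values while the keys pass through unchanged.
  def predictWithLabels(
      model: StreamingLinearRegressionWithSGD,
      testData: DStream[LabeledPoint]): DStream[(Double, Double)] = {
    val keyed: DStream[(Double, Vector)] =
      testData.map(lp => (lp.label, lp.features))
    // Each output pair is (label, prediction).
    model.predictOnValues(keyed)
  }
}
```

Keying by label is only one choice; any key type could be used, for example a record identifier.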
Set the initial weights. Default: [0.0, 0.0].
Set the fraction of each batch to use for updates. Default: 1.0.
Set the number of iterations of gradient descent to run per update. Default: 50.
Set the step size for gradient descent. Default: 0.1.
Java-friendly version of trainOn.
Update the model by training on batches of data from a DStream. This operation registers a DStream for training the model, and updates the model based on every subsequent batch of data from the stream.
DStream containing labeled data
:: Experimental :: Train or predict a linear regression model on streaming data. Training uses Stochastic Gradient Descent to update the model based on each new batch of incoming data from a DStream (see LinearRegressionWithSGD for the model equation). Each batch of data is assumed to be an RDD of LabeledPoints. The number of data points per batch can vary, but the number of features must be constant. An initial weight vector must be provided.
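To make the per-batch update concrete, here is a simplified sketch in plain Scala with no Spark dependency. It is not MLlib's implementation: the names (StreamingSgdSketch, Point, updateOnBatch) are hypothetical, and the sketch assumes a least-squares gradient, no intercept, no mini-batch sampling, and a step size that decays as stepSize / sqrt(iteration). The point it illustrates is that the weights learned on one batch become the starting weights for the next, and that every batch must have the same number of features.

```scala
object StreamingSgdSketch {
  // One labeled example: a label and a fixed-length feature array.
  final case class Point(label: Double, features: Array[Double])

  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // Run `numIterations` gradient steps on one batch, starting from `weights`.
  def updateOnBatch(
      weights: Array[Double],
      batch: Seq[Point],
      stepSize: Double,
      numIterations: Int): Array[Double] = {
    if (batch.isEmpty) {
      weights // an empty batch leaves the model unchanged
    } else {
      var w = weights.clone()
      for (iter <- 1 to numIterations) {
        // Average least-squares gradient over the batch: (w . x - y) * x
        val grad = new Array[Double](w.length)
        for (p <- batch) {
          val diff = dot(w, p.features) - p.label
          for (j <- w.indices) grad(j) += diff * p.features(j) / batch.size
        }
        // Assumed decay schedule for this sketch.
        val step = stepSize / math.sqrt(iter.toDouble)
        w = w.indices.map(j => w(j) - step * grad(j)).toArray
      }
      w
    }
  }

  def main(args: Array[String]): Unit = {
    // Two batches of different sizes drawn from y = 2 * x.
    val batches = Seq(
      Seq(Point(2.0, Array(1.0)), Point(4.0, Array(2.0)), Point(3.0, Array(1.5))),
      Seq(Point(6.0, Array(3.0)), Point(1.0, Array(0.5)))
    )
    // The weights carry over from one batch to the next.
    val finalWeights = batches.foldLeft(Array(0.0)) { (w, b) =>
      updateOnBatch(w, b, stepSize = 0.1, numIterations = 50)
    }
    println(finalWeights.mkString(","))
  }
}
```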
Use a builder pattern to construct a streaming linear regression analysis in an application, like:
  // trainOn returns Unit, so configure the model first and register the stream separately
  val model = new StreamingLinearRegressionWithSGD()
    .setStepSize(0.5)
    .setNumIterations(10)
    .setInitialWeights(Vectors.dense(...))
  model.trainOn(DStream)
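For context, a fuller application might look roughly like the sketch below. It is one possible wiring rather than a prescribed one: the object name, the buildModel helper, the command-line arguments, the 5-second batch interval, and the choice of zero initial weights are all assumptions made for illustration. Input lines are assumed to be in the text form accepted by LabeledPoint.parse.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingRegressionSketch {
  // Hypothetical helper: configure a model with zero initial weights.
  def buildModel(numFeatures: Int): StreamingLinearRegressionWithSGD =
    new StreamingLinearRegressionWithSGD()
      .setStepSize(0.5)
      .setNumIterations(10)
      .setInitialWeights(Vectors.zeros(numFeatures))

  def main(args: Array[String]): Unit = {
    // Assumed arguments: <trainingDir> <testDir> <numFeatures>
    require(args.length >= 3, "usage: <trainingDir> <testDir> <numFeatures>")
    val trainingDir = args(0)
    val testDir = args(1)
    val numFeatures = args(2).toInt

    val conf = new SparkConf().setAppName("StreamingRegressionSketch")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Each line is expected in LabeledPoint text form, e.g. (1.0,[0.5,2.0,3.1])
    val trainingData = ssc.textFileStream(trainingDir).map(LabeledPoint.parse).cache()
    val testData = ssc.textFileStream(testDir).map(LabeledPoint.parse)

    val model = buildModel(numFeatures)
    // Register the training stream; the model is updated on every new batch.
    model.trainOn(trainingData)
    // Print a sample of predictions for each batch of test features.
    model.predictOn(testData.map(_.features)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The sketch leaves the master URL unset on the assumption that the application is launched through spark-submit; the number of features passed in must match the feature count of every incoming point.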