NaiveBayes
- class pyspark.mllib.classification.NaiveBayes
- Train a Multinomial Naive Bayes model.
- New in version 0.9.0.
- Methods
- train(data[, lambda_]): Train a Naive Bayes model given an RDD of (label, features) vectors.
- Methods Documentation
- classmethod train(data, lambda_=1.0)
- Train a Naive Bayes model given an RDD of (label, features) vectors.
- This is Multinomial NB, which can handle all kinds of discrete data. For example, after documents are converted into TF-IDF vectors, it can be used for document classification. If every feature vector is a 0-1 vector, it can also be used as Bernoulli NB. The input feature values must be nonnegative.
- New in version 0.9.0.
- Parameters
- data: pyspark.RDD
- The training data, an RDD of pyspark.mllib.regression.LabeledPoint.
- lambda_: float, optional
- The smoothing parameter. (default: 1.0) 
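
To make the role of the smoothing parameter concrete, here is a minimal pure-Python sketch of multinomial Naive Bayes with additive smoothing. It is an illustration under stated assumptions, not Spark's implementation: the helper names `train_multinomial_nb` and `predict` are hypothetical, the input is a plain list of (label, features) pairs rather than an RDD of LabeledPoint, and the smoothing formula is the standard additive (Laplace) form.

```python
import math
from collections import defaultdict


def train_multinomial_nb(data, lambda_=1.0):
    """Hypothetical helper: fit log priors and log feature weights.

    data is a list of (label, features) pairs with nonnegative features.
    lambda_ is added to every count so no probability is exactly zero.
    """
    num_features = len(data[0][1])
    doc_counts = defaultdict(int)
    feature_sums = defaultdict(lambda: [0.0] * num_features)

    for label, features in data:
        if any(v < 0 for v in features):
            raise ValueError("feature values must be nonnegative")
        doc_counts[label] += 1
        sums = feature_sums[label]
        for j, v in enumerate(features):
            sums[j] += v

    n = len(data)
    num_labels = len(doc_counts)
    model = {}
    for label, count in doc_counts.items():
        # Smoothed class prior: (count + lambda) / (n + num_labels * lambda)
        log_prior = math.log(count + lambda_) - math.log(n + num_labels * lambda_)
        # Smoothed per-feature weight within this class
        total = sum(feature_sums[label]) + num_features * lambda_
        log_theta = [math.log(s + lambda_) - math.log(total)
                     for s in feature_sums[label]]
        model[label] = (log_prior, log_theta)
    return model


def predict(model, features):
    """Return the label with the highest log posterior score."""
    def score(label):
        log_prior, log_theta = model[label]
        return log_prior + sum(v * t for v, t in zip(features, log_theta))
    return max(model, key=score)


# Tiny 0-1 feature vectors, as in the Bernoulli-style use described above.
training = [(0.0, [0.0, 0.0]), (0.0, [0.0, 1.0]), (1.0, [1.0, 0.0])]
model = train_multinomial_nb(training, lambda_=1.0)
print(predict(model, [0.0, 1.0]))
print(predict(model, [1.0, 0.0]))
```

In this sketch, setting lambda_ to 0 would make any feature that never occurs in a class have probability zero, so its logarithm is undefined; a positive value avoids that. With PySpark itself, the equivalent call would be NaiveBayes.train on an RDD of LabeledPoint, optionally passing lambda_.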
 