Spark/MLlib/Pipeline: Multiple Algorithms in a Single Pipeline
Monday, May 2, 2016
MLlib provides a Pipeline API that chains multiple algorithms and executes them as a sequence of stages. Each stage is either a Transformer or an Estimator. The stages run in order, and the input DataFrame is transformed as it passes through each one.
For example, in the code below for a text analysis classifier, the Tokenizer, term frequency, inverse document frequency, and Naive Bayes classifier stages are run through a pipeline to create a sequential flow.
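To make the stage mechanics concrete, here is a minimal sketch in plain Python. It is not Spark code and does not use the MLlib API; it only mimics the idea that a Transformer maps data to data, an Estimator is fit on data to produce a Transformer, and a pipeline runs its stages in order. The class `VocabIndexer`, its model class, and the list-of-dicts "rows" are hypothetical stand-ins chosen for illustration.

```python
# Toy illustration of pipeline stages; NOT the Spark MLlib API.
# Rows are plain dicts standing in for DataFrame rows (an assumption).

class Transformer:
    """A stage that maps one dataset to another."""
    def transform(self, rows):
        raise NotImplementedError


class Estimator:
    """A stage that is fit on data; fit() returns a Transformer."""
    def fit(self, rows):
        raise NotImplementedError


class Tokenizer(Transformer):
    """Adds a 'words' column by lowercasing and splitting 'text'."""
    def transform(self, rows):
        return [dict(r, words=r["text"].lower().split()) for r in rows]


class VocabIndexerModel(Transformer):
    """Hypothetical fitted stage: maps known words to integer ids."""
    def __init__(self, vocab):
        self.vocab = vocab

    def transform(self, rows):
        return [
            dict(r, ids=[self.vocab[w] for w in r["words"] if w in self.vocab])
            for r in rows
        ]


class VocabIndexer(Estimator):
    """Hypothetical estimator: learns a vocabulary from the 'words' column."""
    def fit(self, rows):
        vocab = {}
        for r in rows:
            for w in r["words"]:
                vocab.setdefault(w, len(vocab))
        return VocabIndexerModel(vocab)


class PipelineModel(Transformer):
    """A fitted pipeline: every stage is now a Transformer."""
    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline(Estimator):
    """Runs stages in order, fitting each Estimator on the data so far."""
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(rows)   # Estimator becomes a Transformer
            fitted.append(stage)
            rows = stage.transform(rows)  # output feeds the next stage
        return PipelineModel(fitted)


train = [{"text": "spark is fast"}, {"text": "spark pipelines chain stages"}]
model = Pipeline([Tokenizer(), VocabIndexer()]).fit(train)
print(model.transform([{"text": "Spark stages"}]))
```

The point of the sketch is the loop in `Pipeline.fit`: each stage sees the output of the stage before it, and fitting the pipeline yields a model that can be applied to new data with a single `transform` call.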