Let’s keep going with MLlib. Today, let’s study linear SVM and logistic regression.
For the mathematical background, you can refer to these links.
Of course, Andrew Ng’s Machine Learning course is an excellent, hard-to-beat tutorial for ML beginners, and I strongly recommend it. Here’s the Coursera link
As before, I will paste my IPython notebook code here; the GitHub repo is here.
With a higher-degree kernel function, the model fits the training data better, but it consumes more resources and may overfit.
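To give a rough feel for why a higher degree means a more flexible model, here is a minimal pure-Python sketch of a polynomial kernel and of the size of the feature space it implicitly works in. The function names `poly_kernel` and `implicit_feature_count` are made up for this illustration, and the `gamma` and `coef0` parameter names are assumed to follow the common scikit-learn convention; this is not the notebook's original code.

```python
from math import comb


def poly_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def implicit_feature_count(n_features, degree):
    """Count the monomials of total degree <= `degree` in `n_features` variables.

    With coef0 > 0, this is the dimension of the feature space that the
    polynomial kernel implicitly maps into.
    """
    return comb(n_features + degree, degree)


if __name__ == "__main__":
    # Two made-up samples with two features each (illustrative values only).
    x, y = [5.1, 3.5], [4.9, 3.0]
    for d in (1, 2, 3, 5, 10):
        print(d, implicit_feature_count(2, d), poly_kernel(x, y, degree=d))
```

The count grows quickly with the degree, so the decision boundary has many more shapes available to it. That is what lets the model hug the training points more closely, and also what makes overfitting more likely.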
We can see that species 1 and species 0 do correspond to different sepal_length and sepal_length combinations.
III. PySpark SVM
IV. PySpark LogisticRegression