scala - How is logistic regression parallelized in Spark?
I would like some insight into the method used to parallelize logistic regression in the ML library. I tried to check the source code, but I didn't understand the process.
Spark uses what is called mini-batch gradient descent for the regression:
http://ruder.io/optimizing-gradient-descent/index.html#minibatchgradientdescent
In a nutshell, it works like this:
- Select a sample of the data
- Compute the gradient on each row of the sample
- Aggregate the gradients
- Go back to step 1
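
The steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not Spark's code: partitions are simulated as ordinary lists, and the function names (`partition_gradient`, `combine`, `minibatch_gd`), the toy data, and the default step size and sampling fraction are all made up for the example. The point it tries to show is that each partition can sample its rows and sum their gradients independently, so only small partial sums need to be merged before the weights are updated.

```python
import math
import random
from functools import reduce

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def partition_gradient(rows, weights, fraction, rng):
    """Work done independently on one partition (steps 1 and 2)."""
    grad = [0.0] * len(weights)
    count = 0
    for features, label in rows:
        if rng.random() >= fraction:  # keep each row with probability `fraction`
            continue
        z = sum(w * x for w, x in zip(weights, features))
        error = sigmoid(z) - label    # derivative of the log-loss w.r.t. z
        for j, x in enumerate(features):
            grad[j] += error * x
        count += 1
    return grad, count

def combine(a, b):
    """Merge two partial results: element-wise gradient sum and row count (step 3)."""
    return [x + y for x, y in zip(a[0], b[0])], a[1] + b[1]

def minibatch_gd(partitions, dim, iterations=200, step=1.0, fraction=0.5, seed=42):
    rng = random.Random(seed)
    weights = [0.0] * dim
    for _ in range(iterations):
        # In a cluster these calls would run in parallel, one per partition.
        partials = [partition_gradient(p, weights, fraction, rng) for p in partitions]
        total, count = reduce(combine, partials)
        if count == 0:
            continue  # the sample happened to be empty; draw again
        # Update with the averaged gradient, then go back to step 1.
        weights = [w - step * g / count for w, g in zip(weights, total)]
    return weights

def predict(weights, features):
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

# Hypothetical toy data split over two "partitions": label 1 when both features are positive.
partitions = [
    [([-2.0, -1.0], 0), ([1.0, 0.5], 1)],
    [([-1.0, -0.5], 0), ([2.0, 1.0], 1)],
]
weights = minibatch_gd(partitions, dim=2)
print(weights, predict(weights, [2.0, 1.0]), predict(weights, [-2.0, -1.0]))
```

Because the gradient of the loss is a sum over rows, the per-partition sums can be added in any order, which is what makes the aggregation step cheap to distribute.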
The actual optimisation code in Spark is at this line: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/optimization/gradientdescent.scala#l234