java - Connect to elasticsearch 2.4.4 with spark 2.x -


from offical doc can see :

elasticsearch-hadoop allows elasticsearch used in spark in 2 ways: through dedicated support available since 2.1 or through map/reduce bridge since 2.0

but when try through dedicated support way below:

import org.elasticsearch.spark.rdd.api.java.javaesspark;  sparkconf conf = new sparkconf().setappname("myapp")         .set("spark.serializer", "org.apache.spark.serializer.kryoserializer")         .set("es.nodes", "localhost")         .set("es.port", "9200")         .set("es.resource", "test/main")         .set("es.index.auto.create", "true");  javasparkcontext sc = new javasparkcontext(conf);  javardd<string> input = sc.textfile("file:///home/zht/pycharmprojects/test/text_file.txt"); javardd<map<string, string>> formattedrdd = input.map(...);  javaesspark.savetoes(formattedrdd, "test/spark"); 

and run commind line:

spark-submit --conf spark.es.resource=test/main --jars $spark_home/jars/elasticsearch-hadoop-2.4.4.jar --class org.spark_examples.something.launchspark sparkexamples-1.0-snapshot-jar-with-dependencies.jar 

i got error:

java.lang.nosuchmethoderror: org.apache.spark.taskcontext.addoncompletecallback(lscala/function0;)v     @ org.elasticsearch.spark.rdd.esrddwriter.write(esrddwriter.scala:42)     @ org.elasticsearch.spark.rdd.esspark$$anonfun$dosavetoes$1.apply(esspark.scala:84)     @ org.elasticsearch.spark.rdd.esspark$$anonfun$dosavetoes$1.apply(esspark.scala:84)     @ org.apache.spark.scheduler.resulttask.runtask(resulttask.scala:87)     @ org.apache.spark.scheduler.task.run(task.scala:108)     @ org.apache.spark.executor.executor$taskrunner.run(executor.scala:335)     @ java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1149)     @ java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:624)     @ java.lang.thread.run(thread.java:748) 

how fix that?

the addoncompletecallback method removed taskcontext in spark 2.0. spark 2.0 not binary compatible previous releases.

if using maven build project use below code snippet.

 <dependency>            <groupid>org.elasticsearch</groupid>            <artifactid>elasticsearch-hadoop</artifactid>            <version>5.1.2</version>       </dependency>  

if not, please use elasticsearch-hadoop jar version greater version 5.

reference: github issue tracker


Comments

Popular posts from this blog

node.js - Node js - Trying to send POST request, but it is not loading javascript content -

javascript - Replicate keyboard event with html button -

javascript - Web audio api 5.1 surround example not working in firefox -