hadoop - Use Spark 2.2.0 to read from Hive metastore 2.x
Prior to version 2.2.0, Spark was unable to communicate with Hive 2.x, so I was stuck using Hive 1 + Spark 1/2. From what I have read in both:
https://issues.apache.org/jira/browse/spark-18112 https://spark.apache.org/releases/spark-release-2-2-0.html
it should now be possible to use Spark 2 + Hive 2, but I'm still facing issues. Using the pre-compiled spark-without-hadoop build, I get the following error when accessing a temporary Hive table:
Exception in thread "main" java.lang.IllegalArgumentException: Unable to instantiate SparkSession with Hive support because Hive classes are not found.
    at org.apache.spark.sql.SparkSession$Builder.enableHiveSupport(SparkSession.scala:845)
    at io.bigdatabenchmark.v2.queries.q05.LogisticRegression$.main(LogisticRegression.scala:87)
    at io.bigdatabenchmark.v2.queries.q05.LogisticRegression.main(LogisticRegression.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I usually solve this issue by compiling my own version of Spark with the options "-Phive -Phive-thriftserver", but the default Spark build still comes with Hive 1.2.1 bindings, as seen in the documentation.
So, it seems that Spark 2.2.0 solves the issue of the Spark 2 -> Hive 2 binding, but I can't find the proper way to compile it so that it can access a metastore with schema 2.x.
Thanks for your help!
Add the following dependency to your Maven project:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>2.2.0</version>
        <scope>provided</scope>
    </dependency>
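For illustration, here is a minimal sketch of how a session might then be pointed at a Hive 2.x metastore. It is not taken from the question or the answer. The object name, the application name, and the metastore version "2.1.1" are assumptions to adjust for your cluster. The settings spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars are standard Spark SQL options, but whether "maven" is a suitable value depends on your environment (it downloads the matching Hive jars at startup). Note also that with the "provided" scope the Hive classes are not bundled into your application jar, so they still have to be present in the Spark installation at runtime.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; the name is illustrative only.
object HiveMetastoreExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-metastore-2x-example") // assumed application name
      // Assumption: the metastore schema is 2.1.1; set this to your actual version.
      .config("spark.sql.hive.metastore.version", "2.1.1")
      // "maven" asks Spark to fetch Hive client jars matching the version above.
      // A classpath of locally installed Hive jars can be given instead.
      .config("spark.sql.hive.metastore.jars", "maven")
      // Fails with the IllegalArgumentException from the question
      // if the spark-hive classes are not on the classpath.
      .enableHiveSupport()
      .getOrCreate()

    // Simple check that the metastore is reachable.
    spark.sql("SHOW DATABASES").show()

    spark.stop()
  }
}
```

The master URL is deliberately left out here on the assumption that the job is launched through spark-submit, as in the stack trace above.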