How do I add a jar using sparklyr?
When I try to access a Hive table from RStudio with sparklyr, using this code:
    library(sparklyr)
    library(dplyr)

    Sys.setenv(SPARK_HOME = "/usr/hdp/current/spark2-client")  # taken from the Ambari Spark2 configs

    config <- spark_config()  # config was not shown in the original snippet; spark_config() is the usual starting point
    sc <- spark_connect(master = "yarn-client", config = config, version = "2.1.0")

    library(DBI)
    tabTweets <- dbGetQuery(sc, "SELECT * FROM tweets0 LIMIT 10")
I get the error:
    Error in value[[3L]](cond) : Failed to fetch data: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.openx.data.jsonserde.JsonSerDe
This is because tweets0 was created using JsonSerDe. When I hit the same problem in the Hive CLI, for example, I fixed it by running:
    ADD JAR /usr/hdp/2.4.2.0-258/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar;
So how do I do the equivalent of ADD JAR using sparklyr?
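One approach that looks promising (I have not verified it in this environment) is to pass the jar through the connection config before calling spark_connect(). The sparklyr.jars.default option and the plain spark.jars Spark property used below are assumptions taken from the sparklyr and Spark configuration docs; the jar path is the one from the Hive CLI fix above:

    library(sparklyr)

    jar_path <- "/usr/hdp/2.4.2.0-258/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar"

    config <- spark_config()
    # sparklyr-level option: jars included with connections made using this config (assumption)
    config$sparklyr.jars.default <- jar_path
    # plain Spark property as an alternative route to the same end
    config[["spark.jars"]] <- jar_path

    sc <- spark_connect(master = "yarn-client", config = config, version = "2.1.0")

In principle the Hive-style statement could also be sent over the DBI connection, e.g. dbGetQuery(sc, "ADD JAR ..."), but I have not confirmed that Spark SQL accepts that here.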
EDIT: I tried this:
    spark_dependencies <- function(spark_version, scala_version, ...) {
      sparklyr::spark_dependency(
        jars = c(
          system.file(
            sprintf("/usr/hdp/2.4.2.0-258/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar"),
            package = "jsonserde"
          )
        )
      )
    }

    .onLoad <- function(libname, pkgname) {
      sparklyr::register_extension(pkgname)
    }

    library(jsonserde)
But I still get the same error, and library(jsonserde) gives:
    Error in library(jsonserde) : there is no package called ‘jsonserde’
I also see nothing in the Spark log about the dependency being added.
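My guess (unconfirmed) is that the snippet above only works inside an R package called jsonserde that registers itself as a sparklyr extension; since no such package is installed, library(jsonserde) fails and system.file() quietly returns an empty string, so no jar ever reaches Spark. A quick way to see the system.file() behaviour:

    # system.file() returns "" (rather than an error) when the named package is not installed,
    # so the jars argument passed to spark_dependency() above ends up empty
    system.file("anything.jar", package = "jsonserde")
    #> [1] ""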
The issue turned out not to be a sparklyr one: it was the tez.lib.uris setting. I changed it to:
    /hdp/apps/${hdp.version}/tez/tez.tar.gz,hdfs://master.royble.co.uk:8020/jars/json-serde-1.3.7-jar-with-dependencies.jar
(Note: there is no space after the comma, and the second entry is an HDFS path.) However, the issue of:
    Error in library(jsonserde) : there is no package called ‘jsonserde’
still stands, and I will accept an answer that addresses it.
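For completeness, this is how I would expect to verify the fix from sparklyr once tez.lib.uris points at the jar (same connection as above; I have not re-run this exact snippet while writing this up):

    # re-run the original query against the JsonSerDe-backed table
    library(DBI)
    tabTweets <- dbGetQuery(sc, "SELECT * FROM tweets0 LIMIT 10")
    head(tabTweets)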