Spark: reading Avro output from a previous write fails with "Not an avro data file" due to the _SUCCESS file
I'm using the Databricks spark-avro connector to read and write Avro files. I have the following code:
    df.write.mode(SaveMode.Overwrite).avro(someDirectory)
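For context, here is a fuller sketch of what I'm running (the path and the SparkContext/DataFrame setup are placeholders, not my real values):

    import com.databricks.spark.avro._
    import org.apache.spark.sql.{SQLContext, SaveMode}

    // assumes an existing SparkContext `sc` and a DataFrame `df`
    val sqlContext = new SQLContext(sc)
    val someDirectory = "/tmp/avro-output"  // placeholder output path

    // on successful completion, Hadoop's FileOutputCommitter also drops a
    // _SUCCESS marker file into this directory alongside the .avro part files
    df.write.mode(SaveMode.Overwrite).avro(someDirectory)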
The problem is that when I try to read the directory back using sqlContext.read.avro(someDirectory), it fails with

    java.io.IOException: Not an avro data file

due to the existence of the _SUCCESS file in that directory.
Setting

    sc.hadoopConfiguration.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "false")

solves the issue, but I would rather avoid doing that.
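For completeness, this is the workaround in context (same placeholder names as above): disabling the _SUCCESS marker before the write, so the subsequent read only ever sees Avro files.

    // disable the _SUCCESS marker written by Hadoop's FileOutputCommitter
    sc.hadoopConfiguration.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "false")

    df.write.mode(SaveMode.Overwrite).avro(someDirectory)

    // with no _SUCCESS file present, the read no longer fails
    val readBack = sqlContext.read.avro(someDirectory)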
This sounds like a fairly generic problem, so am I doing something wrong?