Azure Data Lake - U-SQL issue using gzip and virtual column
I have a strange issue with a U-SQL job that processes gzipped files. If I run the U-SQL on a normal CSV file, it works fine. If I gzip the file, it doesn't work anymore (generating E_RUNTIME_USER_EXTRACT_ENCODING_ERROR: Encoding error occurred after processing 0 record(s) in the vertex' input split.)
So this code works:
    DECLARE @path string = "output/{ids}/{*}.csv";
    @data = EXTRACT a string, b string, c string, d string, ids string FROM @path USING Extractors.Csv(skipFirstNRows : 1, silent : true);
    @output = SELECT * FROM @data WHERE ids == "test";
    OUTPUT @output TO "output/res.csv" USING Outputters.Csv(quoting : false, outputHeader : true);
This code does not work (gz version of the file):
    DECLARE @path string = "output/{ids}/{*}.csv.gz";
    @data = EXTRACT a string, b string, c string, d string, ids string FROM @path USING Extractors.Csv(skipFirstNRows : 1, silent : true);
    @output = SELECT * FROM @data WHERE ids == "test";
    OUTPUT @output TO "output/res.csv" USING Outputters.Csv(quoting : false, outputHeader : true);
If I remove the virtual column "ids", the gz version works:
    DECLARE @path string = "output/test/{*}.csv.gz";
    @data = EXTRACT a string, b string, c string, d string FROM @path USING Extractors.Csv(skipFirstNRows : 1, silent : true);
    @output = SELECT * FROM @data;
    OUTPUT @output TO "output/res.csv" USING Outputters.Csv(quoting : false, outputHeader : true);
I attached the 2 files I am using. Does anyone have a clue what is going on? If I remove the virtual column ids, it works for both.
I only get the error when I run against files in Data Lake Storage. If I run against the files locally, it works fine.
The detailed error I receive:
    "internaldiagnostics":"","innererror":{"diagnosticcode":195887128,"severity":"error","component":"runtime","source":"user","errorid":"E_RUNTIME_USER_EXTRACT_INVALID_CHARACTER","message":"invalid character for utf-8 encoding in input stream.","description":"found invalid character for utf-8 encoding in the input.","resolution":"correct the invalid character in the input file, or correct the encoding in the extractor, and try again."
An addition regarding this issue:
This was a confirmed defect. We have deployed a fix, so you should not encounter the issue any longer, with or without the

    SET @@FeaturePreviews = "FileSetV2Dot5:on";

flag. The flag above was a correct workaround, since it forces a different plan to be generated, one in which the defect did not exist. The flag is still turned off by default.
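For readers on a deployment that still shows the error, here is a minimal sketch of how the workaround might be applied to the failing script from the question. It assumes the first column is named `a` (the name is missing in the question as posted) and that the preview flag is set at the top of the script, before the first rowset statement. With the fix deployed, the SET line should no longer be necessary.

```usql
// Workaround sketch: opt in to the FileSetV2Dot5 preview so that a different
// plan is generated for the file set.
SET @@FeaturePreviews = "FileSetV2Dot5:on";

DECLARE @path string = "output/{ids}/{*}.csv.gz";

@data =
    EXTRACT a string,      // column name "a" is an assumption
            b string,
            c string,
            d string,
            ids string     // virtual column bound to {ids} in @path
    FROM @path
    USING Extractors.Csv(skipFirstNRows : 1, silent : true);

// Keep only the rows that came from the "test" folder.
@output =
    SELECT *
    FROM @data
    WHERE ids == "test";

OUTPUT @output
TO "output/res.csv"
USING Outputters.Csv(quoting : false, outputHeader : true);
```

The rest of the script is unchanged from the question. Only the SET statement is added, so it can be removed again once the job runs cleanly without it.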