Azure Data Lake - U-SQL issue using gzip and virtual column


I have a strange issue with a U-SQL job that processes zipped files. If I run the U-SQL script on a normal CSV file, it works fine. If I gzip the file, it doesn't work anymore (it generates E_RUNTIME_USER_EXTRACT_ENCODING_ERROR: encoding error occured after processing 0 record(s) in the vertex' input split.)

So this code works:

DECLARE @path string = "output/{ids}/{*}.csv";

@data =
    EXTRACT
        a string,
        b string,
        c string,
        d string,
        ids string
    FROM @path
    USING
        Extractors.Csv(skipFirstNRows:1, silent: true);

@output =
    SELECT *
    FROM @data
    WHERE ids == "test";

OUTPUT @output
TO "output/res.csv"
USING Outputters.Csv(quoting : false, outputHeader: true);

This code does not work (gz version of the file):

DECLARE @path string = "output/{ids}/{*}.csv.gz";

@data =
    EXTRACT
        a string,
        b string,
        c string,
        d string,
        ids string
    FROM @path
    USING
        Extractors.Csv(skipFirstNRows:1, silent: true);

@output =
    SELECT *
    FROM @data
    WHERE ids == "test";

OUTPUT @output
TO "output/res.csv"
USING Outputters.Csv(quoting : false, outputHeader: true);

If I remove the virtual column "ids", it works for the gz version:

DECLARE @path string = "output/test/{*}.csv.gz";

@data =
    EXTRACT
        a string,
        b string,
        c string,
        d string
    FROM @path
    USING
        Extractors.Csv(skipFirstNRows:1, silent: true);

@output =
    SELECT *
    FROM @data;

OUTPUT @output
TO "output/res.csv"
USING Outputters.Csv(quoting : false, outputHeader: true);

I have attached the 2 files I am using. Does anyone have a clue what is going on? Why does it work for both files once I remove the virtual column ids?

test.csv

test.csv.gz

I only get the error when I run against the file in Data Lake Storage. If I run against the files locally, it works fine.

The detailed error I receive is: "internalDiagnostics":"","innerError":{"diagnosticCode":195887128,"severity":"Error","component":"RUNTIME","source":"User","errorId":"E_RUNTIME_USER_EXTRACT_INVALID_CHARACTER","message":"Invalid character for UTF-8 encoding in input stream.","description":"Found invalid character for UTF-8 encoding in the input.","resolution":"Correct the invalid character in the input file, or correct the encoding in the extractor, and try again."

To add some additional information on the issue:

  1. This is a confirmed defect. We have deployed a fix, and you should not encounter the issue any longer, with or without the SET @@FeaturePreviews = "FileSetV2Dot5:on" flag.

  2. The SET @@FeaturePreviews = "FileSetV2Dot5:on" flag above is a correct workaround, since it forces a different plan to be generated in which the defect did not exist.

  3. SET @@FeaturePreviews = "FileSetV2Dot5:on" is still turned off by default.
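
For readers on an environment where the fix has not yet landed, the workaround described in the list above might look roughly like the following sketch. It is the failing gz script from the question with the preview flag set at the top. The flag is spelled here as the U-SQL system variable @@FeaturePreviews, and the first column name `a` is an assumption, since it is missing from the scripts as posted.

```
// Workaround from the answer: opt in to the FileSetV2Dot5 preview so that
// a different plan is generated for the file set with the virtual column.
SET @@FeaturePreviews = "FileSetV2Dot5:on";

// {ids} is the virtual column taken from the folder name.
DECLARE @path string = "output/{ids}/{*}.csv.gz";

@data =
    EXTRACT
        a string,   // assumed name; the original first column name is not shown
        b string,
        c string,
        d string,
        ids string
    FROM @path
    USING
        Extractors.Csv(skipFirstNRows:1, silent: true);

@output =
    SELECT *
    FROM @data
    WHERE ids == "test";

OUTPUT @output
TO "output/res.csv"
USING Outputters.Csv(quoting : false, outputHeader: true);
```

According to the answer, the flag should no longer be needed once the deployed fix is in place, so the SET line can be dropped at that point.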

