scala - Check column datatype and execute SQL only on Integer and Decimal in Spark SQL


I'm trying to check the datatype of each column in an input Parquet file, and if the datatype is Integer or Decimal, run Spark SQL against that column.

// get the Array of StructFields
val datatypes = parquetRDD_subset.schema.fields

// check the datatype of each column
for (val_datatype <- datatypes) {
  if (val_datatype.dataType.typeName == "integer" || val_datatype.dataType.typeName.contains("decimal")) {
    // get the field names
    val x = parquetRDD_subset.schema.fieldNames
    val dfs = x.map(field => spark.sql(s"SELECT 'DataProfilerStats' AS table_name, (SELECT 100 * approx_count_distinct($field)/count(1) FROM parquetDFTable) AS percentage_unique_value FROM parquetDFTable"))
  }
}

The issue is that although the datatype validation succeeds, inside the loop the field names are not restricted to Integer or Decimal columns, so the query ends up being run on String columns as well. How do I limit the fields to only Decimal or Integer columns? How do I address this?

This is how you can filter the columns of Integer and Double type:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DoubleType, IntegerType}

// filter the columns by datatype
val columns = df.schema.fields.filter(x => x.dataType == IntegerType || x.dataType == DoubleType)

// use the filtered columns in a select
df.select(columns.map(x => col(x.name)): _*)
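Putting the two pieces together, here is a minimal sketch of how the filtered fields could drive the uniqueness query from the question. It assumes, as in the question, that spark is an active SparkSession, the data has been read into parquetRDD_subset, and that DataFrame is registered as the temporary view parquetDFTable; since DecimalType carries precision and scale, it is matched with isInstanceOf rather than equality.

import org.apache.spark.sql.types.{DecimalType, IntegerType}

// assumption: parquetRDD_subset is the DataFrame read from the Parquet file,
// registered here as the temp view "parquetDFTable" used in the SQL below
parquetRDD_subset.createOrReplaceTempView("parquetDFTable")

// keep only the names of Integer and Decimal columns
val numericFields = parquetRDD_subset.schema.fields
  .filter(f => f.dataType == IntegerType || f.dataType.isInstanceOf[DecimalType])
  .map(_.name)

// run the profiling query only for those columns
val dfs = numericFields.map { field =>
  spark.sql(
    s"SELECT 'DataProfilerStats' AS table_name, " +
    s"100 * approx_count_distinct($field) / count(1) AS percentage_unique_value " +
    s"FROM parquetDFTable")
}

Because the filtering happens on the schema before any SQL is issued, String columns never reach the map, which is what the original loop was missing.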

I hope this helps!

