regex - Spark Column rlike converts int to boolean


I'm using a regex with Spark's Column rlike to extract the last digit from a string. The problem is that after it extracts the digit, the result is automatically converted to a boolean. Is there a way to stop it from being automatically converted to a boolean?

test.withColumn("quarter", $"month".rlike("\\d+$"))

For example:

Input:

2015 q 1
2015 q 1
2015 q 2
2015 q 2

Output:

true
true
true
true

Expected:

1
1
2
2

I tried casting to integer afterwards, but that returns 1 because the value is converted from boolean to int.

test.withColumn("quarter", $"month".rlike("\\d+$").cast("integer"))

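rlike only tests whether the column matches the regex, so it always returns a boolean; nothing is extracted and then converted. Spark has a function for extracting the part of a string that matches a regex: you can use regexp_extract for this.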

scala> val df = Seq("2015 q 1", "2015 q 1", "2015 q 2", "2015 q 2").toDF("col1")
df: org.apache.spark.sql.DataFrame = [col1: string]

scala> import org.apache.spark.sql.functions._
import org.apache.spark.sql.functions._

scala> df.withColumn("quarter", regexp_extract($"col1", ".*(\\d+)$", 1)).show
+--------+-------+
|    col1|quarter|
+--------+-------+
|2015 q 1|      1|
|2015 q 1|      1|
|2015 q 2|      2|
|2015 q 2|      2|
+--------+-------+
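
Two follow-ups that may help, offered as a sketch rather than part of the original answer. First, regexp_extract returns a string, so if you want an integer column you can cast the result. Second, in the pattern ".*(\\d+)$" the greedy .* consumes every digit except the last one, which is fine for single-digit quarters but would keep only the final digit of a longer number; dropping the .* captures the whole trailing run. The row "2015 q 12" and the column name has_digits below are made up purely to illustrate this, and the local SparkSession setup is an assumption about how you run the code.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.regexp_extract

// Local session for illustration only (assumption); inside spark-shell,
// getOrCreate() simply reuses the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("quarter-extract-sketch")
  .getOrCreate()
import spark.implicits._

// "2015 q 12" is a hypothetical row added to show multi-digit behaviour.
val df = Seq("2015 q 1", "2015 q 2", "2015 q 12").toDF("col1")

val result = df
  // rlike only answers "does the value match?", so this column is boolean.
  .withColumn("has_digits", $"col1".rlike("\\d+$"))
  // Group 1 of "(\\d+)$" is the full trailing run of digits, returned as a
  // string; the cast turns it into an integer column.
  .withColumn("quarter", regexp_extract($"col1", "(\\d+)$", 1).cast("int"))

result.printSchema()
result.show()
```

With this version the quarter column has type integer, and the hypothetical "2015 q 12" row yields 12 rather than 2.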
