Apache Spark - WrappedArray of WrappedArray to Java array
I have a column of type set, and using collect_set() from the Spark Dataset API on it returns a wrapped array of wrapped arrays. I want a single array containing the values of all the nested wrapped arrays. How can I do that?

E.g., a Cassandra table:

col1
{1,2,3}
{1,5}

I'm using the Spark Dataset API. row.get(0) returns a wrapped array of wrapped arrays.
Consider that you have a Dataset<Row> ds which has a value column.

+-----------------------+
|value                  |
+-----------------------+
|[WrappedArray(1, 2, 3)]|
+-----------------------+

and it has the schema below:

root
 |-- value: array (nullable = true)
 |    |-- element: array (containsNull = true)
 |    |    |-- element: integer (containsNull = false)

Using a UDF
Define a UDF1 as below.

static UDF1<WrappedArray<WrappedArray<Integer>>, List<Integer>> getValue =
        new UDF1<WrappedArray<WrappedArray<Integer>>, List<Integer>>() {
    public List<Integer> call(WrappedArray<WrappedArray<Integer>> data) throws Exception {
        List<Integer> intList = new ArrayList<Integer>();
        // append the elements of each inner array to a single list
        for (int i = 0; i < data.size(); i++) {
            intList.addAll(JavaConversions.seqAsJavaList(data.apply(i)));
        }
        return intList;
    }
};

Register and call the UDF1 as below.

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.callUDF;
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;
import scala.collection.JavaConversions;
import scala.collection.mutable.WrappedArray;

// register the UDF
spark.udf().register("getValue", getValue, DataTypes.createArrayType(DataTypes.IntegerType));

// call the UDF
Dataset<Row> ds1 = ds.select(col("*"), callUDF("getValue", col("value")).as("udf-value"));
ds1.show();

Using the explode function
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;

// explode produces one row per inner array of the value column
Dataset<Row> ds2 = ds.select(explode(col("value")).as("explode-value"));
ds2.show(false);
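
For reference, the flattening that the UDF's loop performs can be sketched in plain Java, without Spark. This is only an illustration under assumptions: the class name FlattenSketch and the methods flatten and flattenDistinct are hypothetical, and the input mirrors the {1,2,3} and {1,5} sets from the question. The UDF keeps duplicates; the second method shows one possible way to drop them if a set-like result is wanted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper class, not part of Spark or of the original answer.
class FlattenSketch {

    // Concatenates the inner lists in order, keeping duplicates
    // (the same behaviour as the loop inside the UDF).
    static List<Integer> flatten(List<List<Integer>> nested) {
        List<Integer> flat = new ArrayList<Integer>();
        for (List<Integer> inner : nested) {
            flat.addAll(inner);
        }
        return flat;
    }

    // Same as flatten, but drops repeated values while preserving first-seen order.
    static List<Integer> flattenDistinct(List<List<Integer>> nested) {
        return new ArrayList<Integer>(new LinkedHashSet<Integer>(flatten(nested)));
    }

    public static void main(String[] args) {
        List<List<Integer>> nested =
                Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(1, 5));
        System.out.println(flatten(nested));         // [1, 2, 3, 1, 5]
        System.out.println(flattenDistinct(nested)); // [1, 2, 3, 5]
    }
}
```

If duplicates across the inner sets matter, the same LinkedHashSet step could be applied to intList inside the UDF before returning it.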