How to delete the first few rows of a DataFrame in Scala/Spark?


I have a DataFrame and I want to delete its first and second rows. How should I do this?

This is the input:

+-----+
|value|
+-----+
|    1|
|    4|
|    3|
|    5|
|    4|
|   18|
+-----+

This is the expected result:

+-----+
|value|
+-----+
|    3|
|    5|
|    4|
|   18|
+-----+

In my opinion, it does not make sense to speak of a first or second record unless you can define an ordering for the DataFrame. The order of the records in the output of show is arbitrary and depends on how the data is partitioned.

Suppose you have a column by which you can order the records. Then you can use window functions. Starting with this DataFrame:

+----+-----+
|year|value|
+----+-----+
|2007|    1|
|2008|    4|
|2009|    3|
|2010|    5|
|2011|    4|
|2012|   18|
+----+-----+

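You can do:

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.row_number
    import spark.implicits._  // enables the $"col" syntax; assumes a SparkSession named spark

    df
      .withColumn("rn", row_number().over(Window.orderBy($"year")))  // number the rows by year
      .where($"rn" > 2)  // keep everything after the first two rows
      .drop("rn")
      .show()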
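For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same window-function approach. The helper name dropFirstRows, the object name, the local master setting, and the temporary column name rn are my own assumptions for illustration and are not part of the original answer. It also assumes the DataFrame has no existing column called rn.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

object DropFirstRows {
  // Hypothetical helper: drops the first `n` rows of `df` when sorted by `orderCol`.
  // Assumes `df` has no column named "rn" already.
  def dropFirstRows(df: DataFrame, orderCol: String, n: Int): DataFrame = {
    // A window without partitionBy makes Spark pull all rows into a single
    // partition to number them, so this is best suited to small or medium data.
    val byOrder = Window.orderBy(col(orderCol))
    df.withColumn("rn", row_number().over(byOrder)) // 1-based row number
      .where(col("rn") > n)                         // skip the first n rows
      .drop("rn")                                   // remove the helper column
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimenting; a real job would configure this elsewhere.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-first-rows")
      .getOrCreate()
    import spark.implicits._

    // The sample data from the answer above.
    val df = Seq((2007, 1), (2008, 4), (2009, 3), (2010, 5), (2011, 4), (2012, 18))
      .toDF("year", "value")

    // Removes the rows for 2007 and 2008.
    dropFirstRows(df, "year", 2).show()

    spark.stop()
  }
}
```

Because the window has no partitionBy clause, Spark logs a warning about moving all data to one partition. If the data has a natural grouping column, adding Window.partitionBy would number the rows within each group instead, which changes the meaning to "drop the first n rows per group".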
