python - Why is Spark's show() function very slow?
I have
df.select("*").filter(df.itemid==itemid).show()
and it never terminates. If I instead do
print df.select("*").filter(df.itemid==itemid)
it prints in less than a second. Why is this?
That's because select() and filter() are only building up the execution instructions; they aren't doing anything with the data yet. Then, when you call show(), it actually executes those instructions. If it isn't terminating, I'd review the logs to see whether there are any errors or connection issues. Or maybe the dataset is still too large: try taking just 5 rows to see if that comes back quickly.
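To make the "instructions vs. execution" distinction concrete, here is a minimal pure-Python sketch of lazy evaluation. It is only a toy model: LazyFrame, rows_read and the sample data are hypothetical names invented for illustration, and this is not how Spark is implemented internally. It just shows why building and printing a query is instant while an action has to touch the data.

```python
import itertools


class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (hypothetical, not Spark)."""

    def __init__(self, rows, steps=()):
        self._rows = rows            # source data: a list of dicts
        self._steps = tuple(steps)   # recorded transformations, not yet applied
        self.rows_read = 0           # how many source rows an action has scanned

    def select(self, *cols):
        # Transformation: only records a projection step, reads nothing.
        def project(rows):
            for r in rows:
                yield r if cols == ("*",) else {c: r[c] for c in cols}
        return LazyFrame(self._rows, self._steps + (project,))

    def filter(self, predicate):
        # Transformation: only records a filtering step, reads nothing.
        def keep(rows):
            return (r for r in rows if predicate(r))
        return LazyFrame(self._rows, self._steps + (keep,))

    def __repr__(self):
        # Printing describes the pending plan; it does not run it.
        return f"LazyFrame[{len(self._steps)} pending step(s)]"

    def _scan(self):
        for r in self._rows:
            self.rows_read += 1
            yield r

    def take(self, n):
        # Action: runs the recorded steps, stopping once n rows are produced.
        stream = self._scan()
        for step in self._steps:
            stream = step(stream)
        return list(itertools.islice(stream, n))


rows = [{"itemid": i % 10, "value": i} for i in range(1000)]
query = LazyFrame(rows).select("*").filter(lambda r: r["itemid"] == 3)

print(query)            # cheap: nothing has been read yet
first_five = query.take(5)
print(first_five)       # the action is what actually scans the data
print(query.rows_read)  # only part of the source was scanned
```

In PySpark itself, the quick check suggested above would look something like df.filter(df.itemid == itemid).take(5), or .limit(5).show(), and df.explain() prints the plan without running it.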