python - How to efficiently use webdriver to scrape and store it into a dataframe?

July 15, 2015

I have a data frame with hundreds of URLs. I try to visit each URL with Selenium and store the source code in the corresponding series. However, it seems slow to start the web driver for each visit. I am thinking of starting the web driver before the apply function, but I don't know how to pass 'driver' to the apply function. Is there a better way to do this?

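from selenium import webdriver

def selenium_download(url):
    # A new PhantomJS driver is started for every URL (the slow part).
    driver = webdriver.PhantomJS()
    try:
        driver.set_page_load_timeout(10)
        driver.get(url)
        source_page = driver.page_source
    except Exception:
        # Any failure, including a page-load timeout, is recorded as 'time-out'.
        source_page = 'time-out'
    finally:
        # Always shut the driver down, whether or not the load succeeded.
        driver.quit()
    return source_page

def revisit_urls(df):
    # Store each page's source next to its URL.
    df['source_page'] = df['page_url'].apply(selenium_download)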
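One way this might be done is to make the driver a parameter of the download function, so that a single driver started beforehand can be reused for every URL. The following is only a minimal sketch under that assumption: the names `download_page` and `download_all` are hypothetical, and the function relies only on the driver having a `get` method and a `page_source` attribute, as Selenium drivers do.

```python
def download_page(url, driver, timeout_marker="time-out"):
    """Fetch one URL with an already-running driver (no start-up cost per call)."""
    try:
        driver.get(url)
        return driver.page_source
    except Exception:
        # Same fallback as the original function; the driver is left open
        # so that it can be reused for the next URL.
        return timeout_marker


def download_all(urls, driver):
    """Visit every URL in order with the same driver and return the page sources."""
    return [download_page(url, driver) for url in urls]
```

With pandas, extra keyword arguments given to `Series.apply` are forwarded to the function, so after creating the driver once and calling `driver.set_page_load_timeout(10)`, the column could probably be filled with `df['source_page'] = df['page_url'].apply(download_page, driver=driver)`, followed by a single `driver.quit()` at the end (ideally in a `finally` block). Whether a driver stays usable after a timed-out page load is not something this sketch checks.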
