python - Count URLs in console instead of progress bar
I currently run a progress bar as part of a web scraper, and it appears to be both (a) inaccurate and (b) slowing the process down.
    with click.progressbar(range(1000000)) as bar:
        for i in bar:
            pass
Is there an article or tutorial I could read to better understand printing progress to the console?
I want the program to scan each URL in a list and print its progress as it iterates through the list, along the lines of:
scanning url 1 of 30
scanning url 2 of 30
scanning url 3 of 30
If possible, keep it on the same line, but that is not essential.
The code is below. If anyone can assist with either training or reading material, it would be appreciated.
    import requests
    import csv
    from lxml import html

    url_list = [
        "https://www.realestate.com.au/property/1-1-goldsmith-st-elwood-vic-3184",
        "https://www.realestate.com.au/property/1-10-albion-rd-glen-iris-vic-3146",
        "https://www.realestate.com.au/property/1-109-sydney-rd-manly-nsw-2095",
        "https://www.realestate.com.au/property/1-1110-glen-huntly-rd-glen-huntly-vic-3163",
    ]

    # Python 2: the csv module expects the file to be opened in binary mode
    with open('test.csv', 'wb') as csv_file:
        writer = csv.writer(csv_file)
        for index, url in enumerate(url_list):
            page = requests.get(url)
            print 'scanning url....'
            # text2search is not defined in this snippet; presumably it is set elsewhere
            if text2search in page.text:
                tree = html.fromstring(page.content)
                # Each unpacking expects exactly one matching element on the page
                (title,) = (x.text_content() for x in tree.xpath('//title'))
                (price,) = (x.text_content() for x in tree.xpath('//div[@class="property-value__price"]'))
                (sold,) = (x.text_content().strip() for x in tree.xpath('//p[@class="property-value__agent"]'))
                writer.writerow([title, price, sold])
If you want to print an indicator other than a progress bar to show how far along you are, the easiest way is to use regular prints.
Since the code in the question is Python 2, I answered with Python 2 code, but since this question may come up for Python 3 users, I've added a section for them too.
A version for Python 2
The following should complement the code in the question:
    for index, url in enumerate(url_list):
        print 'scanning url #' + str(index+1) + ' of ' + str(len(url_list))
You can optionally add the URL you're scanning by using the url variable that the for loop generates.
Also, if you want each print to replace the last, you can add a comma (,) to the end of the print statement and a \r character to the beginning of the string:
    for index, url in enumerate(url_list):
        print '\rscanning url #' + str(index+1) + ' of ' + str(len(url_list)),
The comma prevents print from adding a newline character (\n) at the end, and the \r ("carriage return") at the beginning moves the cursor back to the start of the line, so the new text overwrites what was there before.
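One caveat is worth a small sketch: \r only moves the cursor, so if a message is shorter than the one before it, the tail of the longer message stays visible. The helper below is hypothetical (the name status_line and the width parameter are not from the question); it pads each message to a fixed width so leftovers get overwritten. It only builds strings, so it should behave the same in Python 2 and Python 3.

```python
def status_line(index, total, url='', width=60):
    """Return a status message prefixed with \\r and padded to `width` columns.

    `index` is the 1-based position, `total` the list length, and `url`
    an optional URL to show after the counter (all names are illustrative).
    """
    message = 'scanning url {} of {} {}'.format(index, total, url).rstrip()
    # ljust pads with spaces so a longer previous line is fully overwritten;
    # messages already longer than `width` are returned unpadded.
    return '\r' + message.ljust(width)
```

The returned string can be passed to whichever print form matches your Python version, for example inside the loop with index+1 and len(url_list) as the first two arguments.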
Differences in print between Python 2 and Python 3
It's important to note that print functions quite differently in Python 2 and Python 3. The 'Python 2' solution above will not work in Python 3.
For one thing, print in Python 3 is a function, not a keyword, so it has to be called like a function (i.e. print('print me!')). Secondly, adding a comma to the end will not prevent the output of a newline character. Normally, including a comma at the end has no visible effect, but what the interpreter is actually evaluating (a tuple containing a single None) can be seen when using the Python REPL. Instead, one must supply a named argument (named end) to the print function to override its default.
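To make the difference concrete, here is a small Python 3 sketch. It writes to an io.StringIO object instead of the console purely so the result can be inspected; the names captured and value are illustrative.

```python
import io

captured = io.StringIO()  # stand-in for the console so the output can be inspected

# Python 2 habit: a trailing comma. In Python 3 this only wraps print's
# return value (None) in a tuple; the newline is still written.
value = print('first', file=captured),

# Python 3 way: override the default end='\n' with an empty string.
print('second', end='', file=captured)

print(repr(value))                # (None,)
print(repr(captured.getvalue()))  # 'first\nsecond'
```

The newline after 'first' shows that the trailing comma changed nothing about the output, while end='' suppressed the newline after 'second'.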
A version for Python 3
Here's the Python 3 equivalent of the code supplied at the top of this answer:
    for index, url in enumerate(url_list):
        print('scanning url #' + str(index+1) + ' of ' + str(len(url_list)))
And if you want each print to reuse the same line, like the second example above:
    for index, url in enumerate(url_list):
        print('\rscanning url #' + str(index+1) + ' of ' + str(len(url_list)), end='')
In case you didn't read the above, please note that end='' overrides the print function's default action of adding a \n (newline) character to the end of each line and adds an empty string instead, and the \r (carriage return) character at the beginning of the string causes the output to return to the beginning of the line before the rest of the string is printed.
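Putting the pieces together, a minimal Python 3 sketch of a same-line counter might look like the following. The function name scan_with_progress and its handle and out parameters are assumptions made for illustration, not part of the question's code. It also adds two details the loops above leave out: flush=True, since text without a newline may otherwise sit in the output buffer, and a final newline so later output does not continue on the status line.

```python
import sys


def scan_with_progress(urls, handle=lambda url: None, out=sys.stdout):
    """Call handle(url) for each URL while rewriting a single status line."""
    total = len(urls)
    for index, url in enumerate(urls, start=1):
        # \r returns to the start of the line; end='' suppresses the newline.
        # flush=True pushes the text out immediately instead of buffering it.
        print('\rscanning url {} of {}'.format(index, total),
              end='', file=out, flush=True)
        handle(url)
    if total:
        # Finish the status line so whatever is printed next starts cleanly.
        print(file=out)
```

In the scraper from the question, the body of the for loop (fetching the page and writing the CSV row) would take the place of handle.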