scrapy - Graph of Extracted Links


Can anyone tell me if it's possible to run analytics on the links extracted by a crawler? I know there is an analytics API, but I can't quite figure out how to use it, and the docs are pretty scant.

I'm trying to troubleshoot why my crawler is extracting some links but not others. For example, I start a crawl on a home page that contains links to URLs containing the word "business", yet the following rule does not return any items.

rules = (
    Rule(LinkExtractor(allow=('business',)), callback='parse_item', follow=True),
)

It would be great if there were a way to log some sort of graph of the extracted links, but I cannot find a way to implement this.

I think the easier way to test the rule is to test the LinkExtractor object using scrapy shell. Assuming you're talking about CrawlSpider, I think there's no built-in way of doing that. Nonetheless, if you want to generate some sort of directed graph, you can subclass LinkExtractor and override the extract_links method to print the "graph edges", like:

import logging

from scrapy.linkextractors import LinkExtractor

logger = logging.getLogger('verboselinkextractor')

class VerboseLinkExtractor(LinkExtractor):
    def extract_links(self, response):
        links = super(VerboseLinkExtractor, self).extract_links(response)
        for link in links:
            logger.debug("{} ==> {}".format(response.url, link.url))  # or a simple print
        return links
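To test the extractor interactively, as suggested above, you can open scrapy shell against one of your pages and call extract_links on the shell's response object directly. This is a rough session sketch; the URL is a placeholder for your actual site:

    $ scrapy shell 'http://example.com/'
    >>> from scrapy.linkextractors import LinkExtractor
    >>> le = LinkExtractor(allow=('business',))
    >>> le.extract_links(response)

If the returned list is empty here, the problem is the extractor pattern (or the page markup) rather than the CrawlSpider rule wiring.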
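The override pattern itself is plain Python, so you can see it work without scrapy installed. The sketch below substitutes a minimal stub for scrapy's LinkExtractor and Link (both stubs are hypothetical stand-ins, not scrapy's real classes) and collects the logged edges into an edge list, which is the raw material for any directed-graph output:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger('verboselinkextractor')

# Hypothetical stand-ins for scrapy's Link and LinkExtractor,
# just to make the subclassing pattern runnable here.
class Link:
    def __init__(self, url):
        self.url = url

class LinkExtractor:
    def extract_links(self, response):
        return [Link(u) for u in response.link_urls]

class VerboseLinkExtractor(LinkExtractor):
    def extract_links(self, response):
        links = super(VerboseLinkExtractor, self).extract_links(response)
        for link in links:
            logger.debug("%s ==> %s", response.url, link.url)
        return links

# A fake response object standing in for scrapy's Response.
class FakeResponse:
    url = 'http://example.com/'
    link_urls = ['http://example.com/business/news', 'http://example.com/about']

links = VerboseLinkExtractor().extract_links(FakeResponse())
edges = [(FakeResponse.url, link.url) for link in links]
```

Each `(from_url, to_url)` pair in `edges` is one edge of the crawl graph; accumulating these across responses (for example into a file, or a networkx DiGraph) gives the graph the question asks for.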
