Python - findChildren() method is storing two of the same child rather than one


I am opening a webpage with urllib2, scraping it with BeautifulSoup, and trying to store specific text from within the webpage.

Before you see my code, here is a link to a screenshot of the HTML of the webpage, so you can understand the way I am using the find function in BeautifulSoup:

html webpage

And finally, here is the code I am using:

    from BeautifulSoup import BeautifulSoup
    import urllib2

    url = 'http://www.sciencekids.co.nz/sciencefacts/animals/bird.html'
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page.read())

    ul = soup.find('ul', {'class': 'style33'})
    children = ul.findChildren()
    for child in children:
        print child.text

And here is the output, where the problem lies:

    birds have feathers, wings, lay eggs and warm blooded.
    birds have feathers, wings, lay eggs and warm blooded.
    there around 10000 different species of birds worldwide.
    there around 10000 different species of birds worldwide.
    ostrich largest bird in world. lays largest eggs and has fastest maximum running speed (97 kph).
    ostrich largest bird in world. lays largest eggs and has fastest maximum running speed (97 kph).
    scientists believe birds evolved theropod dinosaurs.
    scientists believe birds evolved theropod dinosaurs.
    birds have hollow bones them fly.
    birds have hollow bones them fly.
    bird species intelligent enough create and use tools.
    bird species intelligent enough create and use tools.
    chicken common species of bird found in world.
    chicken common species of bird found in world.
    kiwis endangered, flightless birds live in new zealand. lay largest eggs relative body size of bird in world.
    kiwis endangered, flightless birds live in new zealand. lay largest eggs relative body size of bird in world.
    hummingbirds can fly backwards.
    hummingbirds can fly backwards.
    bee hummingbird smallest living bird in world, length of 5 cm (2 in).
    bee hummingbird smallest living bird in world, length of 5 cm (2 in).
    around 20% of bird species migrate long distances every year.
    around 20% of bird species migrate long distances every year.
    homing pigeons bred find way home long distances away and have been used thousands of years carry messages.
    homing pigeons bred find way home long distances away and have been used thousands of years carry messages.

Is there something I am using incorrectly and/or doing incorrectly in my code that is making there be two children where there should be one? It would be easy to write code that doesn't store duplicates of the same information, but I'd rather do it the right way so that I only get one of each string I'm looking for.

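children = ul.findChildren() is selecting both the <li> and the <p> within the <ul>. With no arguments, findChildren() searches recursively, so it returns every descendant tag rather than only the direct children, and the text of each <li> includes the text of the <p> nested inside it. Iterating over children is therefore causing you to print the text property of both of these elements. To fix this, change children = ul.findChildren() to children = ul.findChildren("p").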
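The difference can be seen in a small sketch. It makes some assumptions: it uses the newer bs4 package (BeautifulSoup 4) on Python 3 rather than the BeautifulSoup 3 import from the question, and the inline HTML is a hypothetical stand-in for the page's markup in which each <li> wraps a single <p>. In bs4, findChildren is kept as an older alias of find_all.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4, not the BS3 import used in the question

# Hypothetical stand-in for the real page: each <li> wraps one <p>.
html = """
<ul class="style33">
  <li><p>Hummingbirds can fly backwards.</p></li>
  <li><p>Birds have feathers.</p></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
ul = soup.find("ul", {"class": "style33"})

# No arguments: the search is recursive, so every descendant tag comes back,
# meaning each <li> is followed by the <p> nested inside it.
all_tags = [child.name for child in ul.findChildren()]

# Restricting the search to <p> tags, as suggested in the answer.
p_texts = [p.get_text() for p in ul.findChildren("p")]

# Alternative: keep only the direct <li> children of the <ul>.
li_texts = [li.get_text() for li in ul.findChildren("li", recursive=False)]

print(all_tags)
print(p_texts)
print(li_texts)
```

Either restriction yields one string per list item. Filtering on "p" depends on every item wrapping its text in a <p>, while recursive=False depends only on the <li> elements being direct children of the <ul>.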

