BeautifulSoup - Python: select a specific section from a very long div class output


I am pulling info from a website, and the output is long. How can I select the key part I am interested in and assign it to a new object?

Here's the part of the code I am using to pull the info:

    soup = bs(response.text, "html.parser")
    cartl = soup.find("div", {"class": "product-view"})
    cart_link = cartl.find_all("form")

This is the long output (I shortened it for this example; the full text it pulls runs to 100 lines):

    <form action="https://www.randomsite.com/checkout/cart/add/uenc/ahr0chm6ly93d3cudghlz29vzhdpbgxvdxquy29tl25pa2utywlylwpvcmrhbi0xmy1yzxryby1izy1oaxn0b3j5lw9mlwzsawdodc13agl0zs1tzxrhbgljlxnpbhzlci11bml2zxjzaxr5lxjlzc00mtq1nzqtmtazp19fx1njrd1v/product/92797/form_key/nblk6ie3lydwf0vh/" id="product_addtocart_form" method="post">
    <input name="form_key" type="hidden" value="nblk6ie3lydwf0vh"/>
    <div class="no-display">
    <input name="product" type="hidden" value="92797"/>
    <input id="related-products-field" name="related_product" type="hidden" value=""/>
    </div>

I want to take this and assign it to a new object: https://www.randomsite.com/checkout/cart/add/uenc/ahr0chm6ly93d3cudghlz29vzhdpbgxvdxquy29tl25pa2utywlylwpvcmrhbi0xmy1yzxryby1izy1oaxn0b3j5lw9mlwzsawdodc13agl0zs1tzxrhbgljlxnpbhzlci11bml2zxjzaxr5lxjlzc00mtq1nzqtmtazp19fx1njrd1v/product/92797/form_key/nblk6ie3lydwf0vh/

This is the new, updated code, based on the answer below:

    from bs4 import BeautifulSoup
    import requests

    session = requests.Session()
    endpoint = "https://randomsite.com/"
    response = session.get(endpoint)

    soup0 = BeautifulSoup(response.text, "html.parser")

    div = soup0.find("div", {"class": "product-view"})
    html = div.find("form")

    soup = BeautifulSoup(html, 'html.parser')
    form = soup.find('form', {'id': 'product_addtocart_form'})
    action = form['action']
    print(action)

This is the new error I am getting. Any idea where I'm going wrong?

    Traceback (most recent call last):
      File "test.py", line 16, in <module>
        soup = BeautifulSoup(html, 'html.parser')
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bs4/__init__.py", line 191, in __init__
        markup = markup.read()
    TypeError: 'NoneType' object is not callable

You can use BeautifulSoup's find method to get a reference to the <form> tag (optionally filtering on a particular id, in case there are multiple forms on the page). Then, treat the form object like a dictionary to pull out the action attribute.

Code

    from bs4 import BeautifulSoup

    html = '''
    <form action="https://www.randomsite.com/checkout/cart/add/uenc/ahr0chm6ly93d3cudghlz29vzhdpbgxvdxquy29tl25pa2utywlylwpvcmrhbi0xmy1yzxryby1izy1oaxn0b3j5lw9mlwzsawdodc13agl0zs1tzxrhbgljlxnpbhzlci11bml2zxjzaxr5lxjlzc00mtq1nzqtmtazp19fx1njrd1v/product/92797/form_key/nblk6ie3lydwf0vh/" id="product_addtocart_form" method="post">
    <input name="form_key" type="hidden" value="nblk6ie3lydwf0vh"/>
    <div class="no-display">
    <input name="product" type="hidden" value="92797"/>
    <input id="related-products-field" name="related_product" type="hidden" value=""/>
    </div>
    '''

    # Parse the markup, find the form by its id, and read its action attribute.
    soup = BeautifulSoup(html, 'html.parser')
    form = soup.find('form', {'id': 'product_addtocart_form'})
    action = form['action']
    print(action)

Output

https://www.randomsite.com/checkout/cart/add/uenc/ahr0chm6ly93d3cudghlz29vzhdpbgxvdxquy29tl25pa2utywlylwpvcmrhbi0xmy1yzxryby1izy1oaxn0b3j5lw9mlwzsawdodc13agl0zs1tzxrhbgljlxnpbhzlci11bml2zxjzaxr5lxjlzc00mtq1nzqtmtazp19fx1njrd1v/product/92797/form_key/nblk6ie3lydwf0vh/ 
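A note on the traceback in the updated question: find() returns a Tag object, not a string of HTML, so the result of div.find("form") is most likely what trips up the second BeautifulSoup(...) call. A Tag can be searched directly, so there should be no need to parse it again. Below is a minimal sketch of that idea. The helper name extract_form_action and the shortened sample markup are made up for illustration (the sample stands in for response.text so the snippet runs without a network call), and the None checks are an extra precaution rather than something the original code had.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_form_action(page_html: str,
                        form_id: str = "product_addtocart_form") -> Optional[str]:
    """Return the action URL of the form with the given id, or None if absent.

    Hypothetical helper, written for illustration only.
    """
    soup = BeautifulSoup(page_html, "html.parser")

    # find() returns a Tag (or None when nothing matches), not an HTML string.
    div = soup.find("div", {"class": "product-view"})
    if div is None:
        return None

    # Search inside the Tag directly instead of passing it to BeautifulSoup() again.
    form = div.find("form", {"id": form_id})
    if form is None:
        return None

    # Tag.get() returns None when the attribute is missing,
    # whereas form["action"] would raise KeyError.
    return form.get("action")


# Made-up, shortened stand-in for response.text (assumption: same page layout).
sample_html = """
<div class="product-view">
  <form action="https://www.randomsite.com/checkout/cart/add/product/92797/"
        id="product_addtocart_form" method="post">
    <input name="form_key" type="hidden" value="nblk6ie3lydwf0vh"/>
  </form>
</div>
"""

action = extract_form_action(sample_html)
print(action)
```

With the live page, the same function could be called as extract_form_action(response.text), assuming the page really contains a div with class product-view wrapping the form.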
