web scraping - Using requests with a simple form in Python


I'm trying to scrape example sentences for a specific French word using Python, but the page that Python fetches doesn't seem to have any results.

I've inspected the elements of the search box and the search button and included them as parameters. Perhaps I'm missing something?

http://www.online-languages.info/french/examples.php

    import requests
    from bs4 import BeautifulSoup

    word = 'manger'
    url = 'http://www.online-languages.info/french/examples.php'
    # Field names taken from the search box and search button on the page
    params = {'word': word, 'go': ''}

    response = requests.post(url, data=params)
    soup = BeautifulSoup(response.text, 'html5lib')
    print(soup.prettify())

Here's what I'm looking to get: [image not preserved]

Edit: Here is the output of the result. It appears the page may be using JavaScript. If that's the case, do I have to use a different library?

<!doctype html public "-//w3c//dtd xhtml 1.0 transitional//en" "http://www.w3.org/tr/xhtml1/dtd/xhtml1-transitional.dtd"> <html dir="ltr" lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">  <head>   <title>    french example sentences :: online-languages.info   </title>   <meta content="text/css" http-equiv="content-style-type"/>   <meta content="text/html; charset=utf-8" http-equiv="content-type"/>   <meta content="database containing thousands of example sentences. sentences important learning correct use of words." name="description"/>   <meta content="french language. french grammar. french vocabulary. tests. language certificate. verbs. french phrases. french pronunciation. e-learning. conversation." name="subject"/>   <meta content="french, french grammar, french dictionary, french vocabulary, french language, tests, french test, exam, fce, verbs, exercise, certificate, course, games" name="keywords"/>   <link href="../style.css" rel="stylesheet" type="text/css"/>  </head>  <body style="background-image:url(./img/bg2.jpg);">   <div align="center">    <table bgcolor="white" border="0" cellpadding="6" cellspacing="0" style="-moz-border-radius:20px;" width="1000">     <tbody>      <tr>       <td align="center" colspan="4">        <table border="0" cellspacing="0" width="100%">         <tbody>          <tr>           <td align="center" width="180">            <a href="../">             <img alt="online-languages.info" border="0" src="img/logo.png"/>            </a>           </td>           <td align="left" style="background: url('img/bg.png'); -moz-border-radius:20px; padding: 20px 20px 20px 20px; ">            <h1 style="color:#fff; font-size:20pt;">             french words in example sentences            </h1>            <h3 style="color:#fff; font-size:8pt; font-weight:normal;">             french language resources @             <a href="http://www.online-languages.info" style="color:white;">              online-languages.info             </a>    
        </h3>           </td>          </tr>         </tbody>        </table>       </td>      </tr>      <tr>       <td align="left" valign="top" width="180">        <table cellpadding="0" cellspacing="0" class="t2" width="180">         <tbody>          <tr>           <td>            <a class="arect" href="index.php">             home            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="grammar.php">             french grammar            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="phrases.php">             french phrases            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="vocabulary.php">             french vocabulary            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="trainer.php">             vocabulary trainer            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="picture-dictionary.php">             picture dictionary            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="dictionary.php">             french dictionary            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="flashcards.php">             flashcards            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="audio.php">             audio            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="video.php">             video            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="translator.php">             french translator            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="tests.php">             
french quizzes            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="examples.php">             examples of use            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="pronunciation.php">             french pronunciation            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="news.php">             news in french            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="applications.php">             language software            </a>           </td>          </tr>          <tr>           <td>            <a class="arect" href="mobile.php">             mobile phones            </a>           </td>          </tr>         </tbody>        </table>        <img alt="" border="0" height="0" src="http://whos.amung.us/swidget/fnhahzdo0ncz.gif" style="display:none;" width="0"/>       </td>       <td align="left" bgcolor="#ffffff" valign="top" width="90%">        <script type="text/javascript">         <!-- google_ad_client = "ca-pub-7058441231119392"; /* online-languages */ google_ad_slot = "3704078504"; google_ad_width = 728; google_ad_height = 90; //-->        </script>        <script src="http://pagead2.googlesyndication.com/pagead/show_ads.js" type="text/javascript">        </script>        <br/>        <br/>        <div align="justify">         <div id="content">          <iframe frameborder="0" height="650" src="http://www.dicts.info/examples.php?lang=french&amp;disa=1" width="95%">          </iframe>         </div>        </div>        <!-- cookieconsent2 silktide -->        <script type="text/javascript">         window.cookieconsent_options = { learnmore: 'more info', message: 'this website uses cookies personalize content , improve experience on our website.', link: 'https://www.google.com/policies/technologies/cookies/', theme: 'light-bottom' };        
</script>        <script src="https://s3.amazonaws.com/cc.silktide.com/cookieconsent.latest.min.js" type="text/javascript">        </script>        <noscript>         &lt;p&gt;we recommend enable javascript take full advantage of website.&lt;/p&gt;        </noscript>       </td>      </tr>     </tbody>    </table>    <br/>    <table width="700">     <tbody>      <tr>       <td align="center">        <a href="../english">         <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/anglictina"/>         <br/>         english        </a>       </td>       <td align="center">        <a href="../german">         <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/spanelstina"/>         <br/>         german        </a>       </td>       <td align="center">        <a href="../french">         <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/francouzstina"/>         <br/>         french        </a>       </td>       <td align="center">        <a href="../spanish">         <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/spanelstina"/>         <br/>         spanish        </a>       </td>       <td align="center">        <a href="../russian">         <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/rustina"/>         <br/>         russian        </a>       </td>       <td align="center">        <a href="../chinese">         <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/cinstina"/>         <br/>         chinese        </a>       </td>      </tr>     </tbody>    </table>    <br/>    <br/>    <table cellpadding="10" style="background:url(img/bgfoot.jpg);" width="100%">     <tbody>      <tr>  
     <td align="center">        <font color="#0000aa">         <a href="../licence.html">          licence         </a>         |         <a href="../licence.html">          terms of use         </a>         |         <a href="../licence.html#disclaimer">          disclaimer         </a>         |         <a href="../licence.html#privacy">          privacy policy         </a>         |         <a href="http://www.dicts.info/contact.php?s=online-languages">          contact         </a>        </font>        <br/>        copyright © 2007-2017, online-languages.info       </td>      </tr>     </tbody>    </table>   </div>   <script type="text/javascript">    var gajshost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3cscript src='" + gajshost + "google-analytics.com/ga.js' type='text/javascript'%3e%3c/script%3e"));   </script>   <script type="text/javascript">    try { var pagetracker = _gat._gettracker("ua-8795372-1"); pagetracker._trackpageview(); } catch(err) {}   </script>  </body> </html> 

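This works for me. Notice that I used the GET method and the URI referenced in the actual form on the page. The form is not served by online-languages.info itself: as the output above shows, it sits inside an iframe whose src points to http://www.dicts.info/examples.php, so that is the address to request.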

    import requests

    word = 'manger'
    # The form lives in the iframe, so query that URL directly
    url = 'http://www.dicts.info/examples.php'
    headers = {'referer': 'http://www.dicts.info/examples.php?disa=1&lang2=french&word=bon&go=search'}
    params = {'word': word, 'disa': '1', 'lang2': 'french'}

    # params= puts the values in the query string of a GET request
    response = requests.get(url, params=params, headers=headers)
    print(response.text)
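The output in the question explains why the original POST came back without results: the outer page only embeds an iframe. Below is a minimal sketch of pulling that iframe address out of a page. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages. The IframeFinder class and the sample_html string are illustrative names, not part of the original answer.

```python
from html.parser import HTMLParser


class IframeFinder(HTMLParser):
    """Collect the src attribute of every <iframe> in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; values are already unescaped
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


# Trimmed-down copy of the relevant part of the output shown in the question.
sample_html = """
<div id="content">
  <iframe frameborder="0" height="650"
          src="http://www.dicts.info/examples.php?lang=french&amp;disa=1"
          width="95%"></iframe>
</div>
"""

finder = IframeFinder()
finder.feed(sample_html)
print(finder.sources)
```

In a real run, response.text from the first request would take the place of sample_html, and each address found would be a candidate for the URL that actually serves the form.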

Update

It appears that the PHP page checks that an appropriate Referer header is sent with the request. Add one, as I did above (I edited the original code).
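To see exactly what the fix sends, the request can be assembled and inspected without touching the network. The sketch below uses urllib from the standard library as a stand-in for requests, an assumption made only so the example has no dependencies. The build_request helper is hypothetical, and the Referer value is the one from the answer.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://www.dicts.info/examples.php"


def build_request(word):
    """Build (but do not send) a GET request like the one in the answer."""
    query = urlencode({"word": word, "disa": "1", "lang2": "french"})
    # Referer copied from the answer; the server appears to check for it.
    headers = {
        "Referer": "http://www.dicts.info/examples.php"
                   "?disa=1&lang2=french&word=bon&go=search"
    }
    return Request(f"{BASE_URL}?{query}", headers=headers)


req = build_request("manger")
print(req.get_method())           # no body was attached, so this is a GET
print(req.full_url)               # search word ends up in the query string
print(req.get_header("Referer"))  # header the PHP page seems to require
```

Passing req to urllib.request.urlopen would send it. With requests, the same effect comes from the headers= argument used in the answer's code.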

