python - Web scraping: does a URL jump prevent authorization?
I've been trying to scrape a website (www.dearedu.com). Specifically, I've been having a tremendous amount of difficulty logging in, even after trying the answers to the authorization questions I could find on Stack Exchange.
Currently, I am using a requests session to log in with the following code:
    import cookielib  # Python 2 module; on Python 3 use http.cookiejar
    import requests

    cj = cookielib.CookieJar()
    mysession = requests.Session()
    mysession.headers.update({'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'})
    # Load the home page first to pick up any initial cookies
    home = mysession.get('http://www.dearedu.com/', cookies=cj)
    # Login form fields; myusername and mypassword are defined elsewhere
    payload = {'userid': myusername, 'pwd': mypassword, 'fmdo': 'login', 'dopost': 'login', 'keeptime': '604800', 'teshu': 't'}
    data = mysession.post('http://club.dearedu.com/member/index_do.php', data=payload)
When the code above is run with the correct password and username, I get the following HTML:
    <head>
    <title>第二教育网提示信息</title>
    <meta http-equiv="content-type" content="text/html; charset=gb2312" />
    <base target='_self'/>
    <style>div{line-height:160%;}</style></head>
    <body leftmargin='0' topmargin='0' bgcolor='#ffffff'>
    <center>
    <script>
    var pgo=0;
    function jumpurl(){
      if(pgo==0){ location='http://www.dearedu.com'; pgo=1; }
    }
    document.write("<br /><div style='width:450px;padding:0px;border:1px solid #dadada;'><div style='padding:6px;font-size:12px;border-bottom:1px solid #dadada;background:#dbeebd url(/plus/img/wbg.gif)';'><b>第二教育网提示信息!</b></div>");
    document.write("<div style='height:130px;font-size:10pt;background:#ffffff'><br />");
    document.write("成功登录,现在转向系统主页...");
    document.write("<br /><a href='http://www.dearedu.com'>如果你的浏览器没反应,请点击这里...</a><br/></div>");
    setTimeout('jumpurl()',1000);</script>
    </center>
    </body>
    </html>
The thing I do not understand is that the cookies and the status code I received indicate that I have logged in, but when I try to access the main page, it indicates that I have not.
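One way to narrow this down is to check which of the session's cookies would actually be sent to the main page. The login is posted to club.dearedu.com while the main page is on www.dearedu.com, so a cookie scoped only to the club host would not go along. That the cookies are scoped this way is only an assumption on my part, and the helper below (the name `cookies_sent_to` is made up for illustration) uses a simplified domain check that ignores path, secure flag, and expiry:

```python
def cookies_sent_to(jar, host):
    """List (name, domain) for cookies in `jar` whose domain covers `host`.

    `jar` can be any iterable of cookie objects with .name and .domain
    attributes, such as the `cookies` attribute of a requests session.
    Simplified matching: path, secure flag and expiry are not considered.
    """
    matched = []
    for cookie in jar:
        domain = cookie.domain.lstrip(".")
        # A cookie for "dearedu.com" covers "www.dearedu.com";
        # a cookie for "club.dearedu.com" does not.
        if host == domain or host.endswith("." + domain):
            matched.append((cookie.name, cookie.domain))
    return matched
```

Calling `cookies_sent_to(mysession.cookies, 'www.dearedu.com')` after the login POST and comparing it with the result for `'club.dearedu.com'` should show whether the login cookies ever reach the main page.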
If I had to take a guess, it has to do with the URL jumping. The website waits a second or so before redirecting to the main page.
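The jump in the returned page is done by JavaScript (a timer that assigns `location`), and requests does not execute scripts, so the session never follows it by itself. A minimal sketch of following it by hand is below. The function names and the regular expression are my own and only assume the `location='...'` form seen in the response above; whether following the jump is enough to stay logged in is not something I can confirm:

```python
import re

# Matches the target in a script line such as: location='http://www.dearedu.com';
JUMP_RE = re.compile(r"""location\s*=\s*['"]([^'"]+)['"]""")


def extract_jump_url(html):
    """Return the URL assigned to `location` in the page's script, or None."""
    match = JUMP_RE.search(html)
    return match.group(1) if match else None


def follow_js_jump(session, response):
    """Fetch the JavaScript jump target with the same session, if there is one.

    `session` is expected to be a requests.Session (any object with a
    .get(url) method works); `response` is the result of the login POST.
    """
    target = extract_jump_url(response.text)
    if target is None:
        return response  # no script jump found; nothing to follow
    return session.get(target)
```

With the code from the question this would be called as `page = follow_js_jump(mysession, data)`, reusing the same session so that whatever cookies the login set are sent along.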
Can anyone explain what is wrong and how to fix it? Thank you!
Edit:
"成功登录,现在转向系统主页" = "Logged in successfully, now redirecting to the homepage"
"如果你的浏览器没反应,请点击这里" = "If your browser does not respond, please click here"
I do not think the rest is relevant. Thanks!