Q: Why isn't my Python-based Selenium PhantomJS web crawler reaching the web?

Stack Overflow user

Asked 2013-08-06 15:01:50

Answers: 2 · Views: 759 · Followers: 0 · Votes: 1

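I'm trying to get this simple code to reach the Internet. I'm behind a proxy server, but I have already set the http_proxy, https_proxy, and no_proxy environment variables.

Python code: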

from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get('http://www.google.com')
driver.page_source

Output:

u'<html><head><title> Web Authentication Redirect</title><meta http-equiv="Cache-control" content="no-cache"><meta http-equiv="Pragma" content="no-cache"><meta http-equiv="Expires" content="-1"><meta http-equiv="refresh" content="1; URL=https://1.1.1.1/login.html?redirect=www.google.com/"></head><body>\n</body></html>'

Any ideas on how to get around this?

Also, I'm on Ubuntu 12.04 LTS.

python

proxy

selenium-webdriver

web-crawler

phantomjs

2 Answers

Stack Overflow user

Accepted answer

Posted 2013-08-06 15:23:32

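It looks like your network loads its own page first and then immediately redirects you to where you wanted to go. I think you just need to follow the redirect before getting the page source.

See "Getting the final destination of a JavaScript redirect on a website" for how to wait until Selenium has followed the redirect.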
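
As a rough illustration of waiting for the redirect before reading the page source, here is a minimal polling sketch. The helper name `wait_past_redirect` and its parameters are hypothetical, not part of Selenium; the title string is taken from the output shown in the question. It only assumes that `driver` exposes `title` and `page_source`, as a Selenium WebDriver does.

```python
import time


def wait_past_redirect(driver, interstitial_title="Web Authentication Redirect",
                       timeout=10.0, poll=0.5):
    """Poll until the page title no longer matches the interstitial page.

    `driver` is any object exposing `title` and `page_source`, such as a
    Selenium WebDriver. Returns the page source once the title has changed,
    or raises RuntimeError if the interstitial is still showing at timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        # The interstitial page is identified by its title only (an assumption).
        if interstitial_title not in driver.title:
            return driver.page_source
        if time.monotonic() >= deadline:
            raise RuntimeError(
                "still on the redirect page after %.1f s" % timeout)
        time.sleep(poll)
```

Calling `wait_past_redirect(driver)` right after `driver.get('http://www.google.com')` would return the source of whatever page the redirect lands on. If the network's login page never forwards to the requested site, the call raises after the timeout instead of returning the interstitial HTML.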

Votes: 0

Stack Overflow user

Posted 2013-08-06 15:12:02

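If the problem is that you land on one page and are then redirected (in other words, a waiting issue), you could try wait.until(ExpectedConditions.titleIs("Google")).

Note: this is Java code, but it shouldn't be too hard to convert. wait is an instance of WebDriverWait.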

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/18083797
