urllib3
Python HTTP library with thread-safe connection pooling, file post support, sanity friendly, and more.
urllib3 Documentation — urllib3 dev documentation
I'm working on a simple script that involves CAS, the jspring security check, redirection, etc. I would like to use Kenneth Reitz's Python requests because it's a great piece of work! However, CAS requires validation via SSL, so I have to get past that step first. I don't know what requests wants here. Where is this SSL certificate supposed to reside?
Traceback (most recent call last):
File "./test.py", line 24, in <module>
response = requests.get(url1, headers=headers)
File "build/bdist.linux-x86_64/egg/requests/api.py", line 52, in get
File "build/bdist.linux-x86_64/egg/requests/api.py", line 40, in request
File "build/bdist.linux-x86_64/egg/requests/sessions.py", line 209, in request
File "build/bdist.linux-x86_64/egg/requests/models.py", line 624, in send
File "build/bdist.linux-x86_64/egg/requests/models.py", line 300, in _build_response
File "build/bdist.linux-x86_64/egg/requests/models.py", line 611, in send
requests.exceptions.SSLError: [Errno 1] _ssl.c:503: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
Source: (StackOverflow)
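For the certificate verification failure above, it can help to see where Python itself looks for trusted CA certificates. The sketch below uses only the standard library `ssl` module (Python 3); the function names are illustrative and not part of requests. In requests, the `verify` argument accepts a path to a CA bundle file, for example `requests.get(url, verify='/path/to/ca-bundle.crt')`, where the path is a placeholder.

```python
import ssl


def default_ca_locations():
    """Report the CA file and directory locations known to the ssl module."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
    }


def verifying_context(ca_bundle=None):
    """Build a context that verifies server certificates.

    ca_bundle is an optional path to a PEM file of trusted CAs. When it is
    None, the platform's default trust store is loaded instead.
    """
    return ssl.create_default_context(cafile=ca_bundle)


if __name__ == "__main__":
    print(default_ca_locations())
```

If the server's certificate is signed by a private or institutional CA, passing that CA's PEM file as `ca_bundle` (or as `verify=` in requests) is usually what is missing.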
I am writing scripts in Python 2.6 using pyVmomi, and when I call one of the connection methods:
service_instance = connect.SmartConnect(host=args.ip,
                                        user=args.user,
                                        pwd=args.password)
I get the following warning:
/usr/lib/python2.6/site-packages/requests/packages/urllib3/connectionpool.py:734: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
InsecureRequestWarning)
What's interesting is that I do not have urllib3 installed with pip (but it's there in /usr/lib/python2.6/site-packages/requests/packages/urllib3/).
I have tried as suggested here
import urllib3
...
urllib3.disable_warnings()
but that didn't change anything.
Source: (StackOverflow)
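One likely reason `urllib3.disable_warnings()` had no effect in the pyVmomi question above: the path in the warning shows it coming from the copy of urllib3 bundled inside requests (`requests/packages/urllib3`), which is a different module from a top-level `urllib3`. A sketch that sidesteps the question of which copy is in use is to filter on the warning text with the standard `warnings` module. The function name is illustrative, and silencing the warning does not make the connection verified.

```python
import warnings


def silence_unverified_https_warnings():
    """Ignore warnings whose message starts with the InsecureRequestWarning text.

    The filter matches on the message rather than the warning class, so it
    applies whether the warning comes from a bundled or a standalone urllib3.
    """
    warnings.filterwarnings("ignore", message="Unverified HTTPS request")


if __name__ == "__main__":
    silence_unverified_https_warnings()
    # This warning is now suppressed; unrelated warnings still show up.
    warnings.warn("Unverified HTTPS request is being made.", UserWarning)
```

The call has to run before the connection is made, for example at the top of the script.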
I am trying to fetch some tweets in Google App Engine and run some analysis on those tweets.
Due to some issue in urllib3, I am facing the following error:
AttributeError: 'NoneType' object has no attribute 'wrap_socket'
The last three calls are:
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/packages/urllib3/connectionpool.py", line 304, in _make_request
self._validate_conn(conn)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/packages/urllib3/connectionpool.py", line 722, in _validate_conn
conn.connect()
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/packages/urllib3/connection.py", line 164, in connect
self.sock = ssl.wrap_socket(conn, self.key_file, self.cert_file)
AttributeError: 'NoneType' object has no attribute 'wrap_socket'
Traceback (most recent call last):
INFO 2014-08-24 10:37:05,800 connectionpool.py:695] Starting new HTTPS connection (1): api.twitter.com
ERROR 2014-08-24 10:37:06,175 webapp2.py:1528] 'NoneType' object has no attribute 'wrap_socket'
Traceback (most recent call last):
File "/Users/krishna/google-cloud-sdk/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.3/webapp2.py", line 1511, in __call__
rv = self.handle_exception(request, response, e)
File "/Users/krishna/google-cloud-sdk/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.3/webapp2.py", line 1505, in __call__
rv = self.router.dispatch(request, response)
File "/Users/krishna/google-cloud-sdk/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
return route.handler_adapter(request, response)
File "/Users/krishna/google-cloud-sdk/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.3/webapp2.py", line 1077, in __call__
return handler.dispatch()
File "/Users/krishna/google-cloud-sdk/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.3/webapp2.py", line 547, in dispatch
return self.handle_exception(e, self.app.debug)
File "/Users/krishna/google-cloud-sdk/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.3/webapp2.py", line 545, in dispatch
return method(*args, **kwargs)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/main.py", line 50, in post
tweetTextCotainer = THandler.getTweetsText()
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/main.py", line 82, in getTweetsText
access_token_secret = self.access_token_secret
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/TwitterSearch/TwitterSearch.py", line 63, in __init__
self.authenticate(verify)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/TwitterSearch/TwitterSearch.py", line 83, in authenticate
r = requests.get(self._base_url + self._verify_url, auth=self.__oauth, proxies=self.__proxy)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/sessions.py", line 574, in send
r = adapter.send(request, **kwargs)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/adapters.py", line 345, in send
timeout=timeout
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/packages/urllib3/connectionpool.py", line 516, in urlopen
body=body, headers=headers)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/packages/urllib3/connectionpool.py", line 304, in _make_request
self._validate_conn(conn)
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/packages/urllib3/connectionpool.py", line 722, in _validate_conn
conn.connect()
File "/Users/krishna/Documents/DATASCI/twitterapi/analyzetweets/requests/packages/urllib3/connection.py", line 164, in connect
self.sock = ssl.wrap_socket(conn, self.key_file, self.cert_file)
AttributeError: 'NoneType' object has no attribute 'wrap_socket'
Source: (StackOverflow)
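The message `'NoneType' object has no attribute 'wrap_socket'` means the name `ssl` was `None` when `connect()` ran. That is the usual result of a guarded import in an environment where the `ssl` module cannot be loaded; whether that is what happened in this App Engine setup is an assumption. A minimal sketch of the pattern, with a check that fails early and with a clearer message (the helper names are hypothetical):

```python
try:
    import ssl
except ImportError:  # for example, a sandbox that does not provide ssl
    ssl = None


def https_supported():
    """Return True when the ssl module was imported and offers SSLContext."""
    return ssl is not None and hasattr(ssl, "SSLContext")


def require_https_support():
    """Raise a descriptive error instead of a later AttributeError on None."""
    if not https_supported():
        raise RuntimeError(
            "The ssl module is unavailable in this runtime, "
            "so HTTPS connections cannot be opened."
        )


if __name__ == "__main__":
    require_https_support()
    print("ssl is available")
```

Calling `require_https_support()` once at startup turns a confusing traceback deep inside the HTTP library into an explicit configuration error.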
I'm trying to download an HTTPS page from my site hosted on Google App Engine with SNI.
No matter what library I use, I get the following error:
[Errno 8] _ssl.c:504: EOF occurred in violation of protocol
I've tried solving the error in many ways, including using the urllib3 openssl monkeypatch:
from urllib3.contrib import pyopenssl
pyopenssl.inject_into_urllib3()
But I always get the same error mentioned above.
Any ideas?
Source: (StackOverflow)
I'm trying to use the awesome Requests library on Google App Engine. I found a patch for urllib3, which requests relies on, that is compatible with App Engine. https://github.com/shazow/urllib3/issues/61
I can successfully
import requests
but then
response = requests.get('someurl')
fails with the following traceback. What's going on?
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/admin/__init__.py", line 317, in post
exec(compiled_code, globals())
File "<string>", line 6, in <module>
File "/Users/Rohan/Dropbox/MuktiTechnologiesINC/MuktiTechnologies/GAE/humanmictweet/GAE/libraries/requests/api.py", line 52, in get
return request('get', url, **kwargs)
File "/Users/Rohan/Dropbox/MuktiTechnologiesINC/MuktiTechnologies/GAE/humanmictweet/GAE/libraries/requests/api.py", line 40, in request
return s.request(method=method, url=url, **kwargs)
File "/Users/Rohan/Dropbox/MuktiTechnologiesINC/MuktiTechnologies/GAE/humanmictweet/GAE/libraries/requests/sessions.py", line 208, in request
r.send(prefetch=prefetch)
File "/Users/Rohan/Dropbox/MuktiTechnologiesINC/MuktiTechnologies/GAE/humanmictweet/GAE/libraries/requests/models.py", line 458, in send
self.auth = get_netrc_auth(url)
File "/Users/Rohan/Dropbox/MuktiTechnologiesINC/MuktiTechnologies/GAE/humanmictweet/GAE/libraries/requests/utils.py", line 43, in get_netrc_auth
for loc in locations:
File "/Users/Rohan/Dropbox/MuktiTechnologiesINC/MuktiTechnologies/GAE/humanmictweet/GAE/libraries/requests/utils.py", line 40, in <genexpr>
locations = (os.path.expanduser('~/{0}'.format(f)) for f in NETRC_FILES)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/posixpath.py", line 260, in expanduser
userhome = pwd.getpwuid(os.getuid()).pw_dir
AttributeError: 'module' object has no attribute 'getuid'
Source: (StackOverflow)
I got a problem on a Debian 8 system with python 2.7.9-2 amd64:
marius@pydev:/usr/lib/python2.7/dist-packages/urllib3/contrib$ pip search doo
Traceback (most recent call last):
File "/usr/bin/pip", line 9, in <module>
load_entry_point('pip==1.5.6', 'console_scripts', 'pip')()
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 356, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2476, in load_entry_point
return ep.load()
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2190, in load
['__name__'])
File "/usr/lib/python2.7/dist-packages/pip/__init__.py", line 74, in <module>
from pip.vcs import git, mercurial, subversion, bazaar # noqa
File "/usr/lib/python2.7/dist-packages/pip/vcs/mercurial.py", line 9, in <module>
from pip.download import path_to_url
File "/usr/lib/python2.7/dist-packages/pip/download.py", line 22, in <module>
import requests, six
File "/usr/local/lib/python2.7/dist-packages/requests/__init__.py", line 53, in <module>
from .packages.urllib3.contrib import pyopenssl
File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 73, in <module>
ssl.PROTOCOL_SSLv3: OpenSSL.SSL.SSLv3_METHOD,
AttributeError: 'module' object has no attribute 'PROTOCOL_SSLv3'
I checked into the lib and tried to patch /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/contrib/pyopenssl.py
from .. import connection
from .. import util
__all__ = ['inject_into_urllib3', 'extract_from_urllib3']
# SNI only *really* works if we can read the subjectAltName of certificates.
HAS_SNI = SUBJ_ALT_NAME_SUPPORT
# Map from urllib3 to PyOpenSSL compatible parameter-values.
_openssl_versions = {
    ssl.PROTOCOL_SSLv23: OpenSSL.SSL.SSLv23_METHOD,
    ssl.PROTOCOL_SSLv3: OpenSSL.SSL.SSLv3_METHOD,  # <-- the line that raises
    ssl.PROTOCOL_TLSv1: OpenSSL.SSL.TLSv1_METHOD,
}
_openssl_verify = {
    ssl.CERT_NONE: OpenSSL.SSL.VERIFY_NONE,
    ssl.CERT_OPTIONAL: OpenSSL.SSL.VERIFY_PEER,
    ssl.CERT_REQUIRED: OpenSSL.SSL.VERIFY_PEER
                       + OpenSSL.SSL.VERIFY_FAIL_IF_NO_PEER_CERT,
}
Could someone enlighten me how I can fix this? It would be super awesome if someone had a clue. I googled the issue and only found incomplete patches and it's messy. Probably a case for the bug tracker once this is fixed, too. I have this issue for all Python packages.
Source: (StackOverflow)
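The `AttributeError` above arises because the mapping references `ssl.PROTOCOL_SSLv3` unconditionally, and that constant is typically absent when Python is built against an OpenSSL without SSLv3 support. One way to make such a table tolerant is to include only the constants the running interpreter defines. This is a sketch using the standard library alone, not the patch applied upstream; `available_protocols` and `CANDIDATE_PROTOCOLS` are hypothetical names.

```python
import ssl

# Names of protocol constants that may or may not exist in this build.
CANDIDATE_PROTOCOLS = ("PROTOCOL_SSLv23", "PROTOCOL_SSLv3", "PROTOCOL_TLSv1")


def available_protocols(names=CANDIDATE_PROTOCOLS):
    """Map each constant name to its value, skipping names ssl lacks."""
    return {name: getattr(ssl, name) for name in names if hasattr(ssl, name)}


if __name__ == "__main__":
    for name in CANDIDATE_PROTOCOLS:
        status = "present" if name in available_protocols() else "missing"
        print(name, status)
```

The same `hasattr` guard can be applied when building a table like `_openssl_versions`, so a missing constant removes one entry instead of breaking the import.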
So I'm looking into urllib3 because it has connection pooling and is thread safe (so performance is better, especially for crawling), but the documentation is... minimal to say the least. urllib2 has build_opener so something like:
#!/usr/bin/python
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
But urllib3 has no build_opener method, so the only way I have figured out so far is to manually put it in the header:
#!/usr/bin/python
import urllib3
http_pool = urllib3.connection_from_url("http://example.com")
myheaders = {'Cookie':'some cookie data'}
r = http_pool.get_url("http://example.org/", headers=myheaders)
But I am hoping there is a better way and that one of you can tell me what it is. Also can someone tag this with "urllib3" please.
Source: (StackOverflow)
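urllib3 does not manage cookies on its own, so setting the `Cookie` header by hand, as in the question above, is essentially what is required. The standard library can at least take care of formatting and parsing. The sketch below is written for Python 3 (`http.cookies`); `cookie_header` and `parse_set_cookie` are hypothetical helper names.

```python
from http.cookies import SimpleCookie


def cookie_header(cookies):
    """Format a dict of cookie names and values as a Cookie header value."""
    jar = SimpleCookie()
    for name, value in cookies.items():
        jar[name] = value
    return "; ".join(
        "%s=%s" % (morsel.key, morsel.coded_value) for morsel in jar.values()
    )


def parse_set_cookie(header_value):
    """Extract cookie names and values from one Set-Cookie header value."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


if __name__ == "__main__":
    print(cookie_header({"session": "abc123"}))
    print(parse_set_cookie("sid=xyz; Path=/; HttpOnly"))
```

The resulting string can be passed as `headers={'Cookie': cookie_header(...)}` on each request, and cookies from a response can be fed back in after parsing its `Set-Cookie` header.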
When downloading a large file with python, I want to put a time limit not only for the connection process, but also for the download.
I am trying with the following python code:
import requests
r = requests.get('http://ipv4.download.thinkbroadband.com/1GB.zip', timeout = 0.5, prefetch = False)
print r.headers['content-length']
print len(r.raw.read())
This does not work (the download is not time limited), as correctly noted in the docs: https://requests.readthedocs.org/en/latest/user/quickstart/#timeouts
This would be great if it was possible:
r.raw.read(timeout = 10)
The question is: how can I put a time limit on the download?
Source: (StackOverflow)
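The `timeout` argument in the question above bounds how long each individual socket operation may wait, not the whole transfer. A hedged sketch of an overall limit is to consume the body in chunks and compare against a deadline between chunks. The helper below works on any iterable of byte chunks, for instance `r.iter_content(8192)` on a response requested with `stream=True` (the later name for `prefetch=False`); `read_with_deadline` and `DownloadTimeout` are hypothetical names. A read that stalls completely is still only bounded by the per-read socket timeout.

```python
import time


class DownloadTimeout(Exception):
    """Raised when the body takes longer than the allowed total time."""


def read_with_deadline(chunks, max_seconds):
    """Join byte chunks, aborting once max_seconds have elapsed in total.

    The deadline is checked after each chunk arrives, so the limit is
    enforced at chunk granularity.
    """
    deadline = time.monotonic() + max_seconds
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        if time.monotonic() > deadline:
            raise DownloadTimeout("download exceeded %s seconds" % max_seconds)
    return b"".join(parts)


if __name__ == "__main__":
    print(read_with_deadline([b"ab", b"cd"], max_seconds=5))
```

Combining this with a modest per-read `timeout` covers both cases: a server that sends slowly and a server that stops sending.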
Smart folks,
I would like to use the awesome requests module in my jython program. It installs and runs just fine in python but I cannot get it to install in jython. I have tried both Jython 2.7a2 and 2.7b1 on mac and ubuntu and get the same errors related to urllib3.
First I installed ez_setup.py as mentioned in "How can I use jython setup.py install?"
Then running easy_install from within the Jython bin directory results in an exception:
NameError: name 'CERT_NONE' is not defined
gautam-mbp:bin gautam$ ./easy_install requests
Searching for requests
Reading http://pypi.python.org/simple/requests/
Reading http://python-requests.org
Reading https://github.com/kennethreitz/requests
Best match: requests 1.1.0
Downloading http://pypi.python.org/packages/source/r/requests/requests-1.1.0.tar.gz#md5=a0158815af244c32041a3147ee09abf3
Processing requests-1.1.0.tar.gz
Running requests-1.1.0/setup.py -q bdist_egg --dist-dir /var/folders/jf/cb2pc45s7d94hd6sndysvyxw0000gn/T/easy_install-MnOao_/requests-1.1.0/egg-dist-tmp-E2Rkg1
Traceback (most recent call last):
File "./easy_install", line 7, in <module>
sys.exit(
File "/Users/gautam/jython27b1/Lib/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/easy_install.py", line 1712, in main
File "/Users/gautam/jython27b1/Lib/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/easy_install.py", line 1700, in with_ei_usage
File "/Users/gautam/jython27b1/Lib/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/easy_install.py", line 1712, in <lambda>
-----------lots of stack trace---------------
File "setup.py", line 6, in <module>
File "/var/folders/jf/cb2pc45s7d94hd6sndysvyxw0000gn/T/easy_install-MnOao_/requests-1.1.0/requests/__init__.py", line 52, in <module>
File "/var/folders/jf/cb2pc45s7d94hd6sndysvyxw0000gn/T/easy_install-MnOao_/requests-1.1.0/requests/utils.py", line 23, in <module>
File "/var/folders/jf/cb2pc45s7d94hd6sndysvyxw0000gn/T/easy_install-MnOao_/requests-1.1.0/requests/compat.py", line 7, in <module>
File "/var/folders/jf/cb2pc45s7d94hd6sndysvyxw0000gn/T/easy_install-MnOao_/requests-1.1.0/requests/packages/__init__.py", line 3, in <module>
File "/var/folders/jf/cb2pc45s7d94hd6sndysvyxw0000gn/T/easy_install-MnOao_/requests-1.1.0/requests/packages/urllib3/__init__.py", line 16, in <module>
File "/var/folders/jf/cb2pc45s7d94hd6sndysvyxw0000gn/T/easy_install-MnOao_/requests-1.1.0/requests/packages/urllib3/connectionpool.py", line 45, in <module>
File "/var/folders/jf/cb2pc45s7d94hd6sndysvyxw0000gn/T/easy_install-MnOao_/requests-1.1.0/requests/packages/urllib3/util.py", line 293, in <module>
NameError: name 'CERT_NONE' is not defined
This looks like a problem with urllib3 not working on Jython. I would appreciate help in getting requests (and urllib3) to work on Jython. The same error shows up on Ubuntu as well.
Thanks
Gautam
Source: (StackOverflow)
I'm trying to use Requests to create a robust way of consuming from Twitter's user streams. So far, I've produced the following basic working example:
"""
Example of connecting to the Twitter user stream using Requests.
"""
import sys
import json
import requests
from oauth_hook import OAuthHook
def userstream(access_token, access_token_secret, consumer_key, consumer_secret):
    oauth_hook = OAuthHook(access_token=access_token, access_token_secret=access_token_secret,
                           consumer_key=consumer_key, consumer_secret=consumer_secret,
                           header_auth=True)
    hooks = dict(pre_request=oauth_hook)
    config = dict(verbose=sys.stderr)
    client = requests.session(hooks=hooks, config=config)
    data = dict(delimited="length")
    r = client.post("https://userstream.twitter.com/2/user.json", data=data, prefetch=False)
    # TODO detect disconnection somehow
    # https://github.com/kennethreitz/requests/pull/200/files#L13R169
    # Use a timeout? http://pguides.net/python-tutorial/python-timeout-a-function/
    for chunk in r.iter_lines(chunk_size=1):
        if chunk and not chunk.isdigit():
            yield json.loads(chunk)
if __name__ == "__main__":
    import pprint
    import settings
    for obj in userstream(access_token=settings.ACCESS_TOKEN, access_token_secret=settings.ACCESS_TOKEN_SECRET, consumer_key=settings.CONSUMER_KEY, consumer_secret=settings.CONSUMER_SECRET):
        pprint.pprint(obj)
However, I need to be able to handle disconnections gracefully. Currently, when the stream disconnects, the above just hangs, and there are no exceptions raised.
What would be the best way to achieve this? Is there a way to detect this through the urllib3 connection pool? Should I use a timeout?
Source: (StackOverflow)
I would like to download a file over the HTTP protocol using urllib3.
I have managed to do this using the following code:
url = 'http://url_to_a_file'
connection_pool = urllib3.PoolManager()
resp = connection_pool.request('GET',url )
f = open(filename, 'wb')
f.write(resp.data)
f.close()
resp.release_conn()
But I was wondering what the proper way of doing this is.
For example, will it work well for big files, and if not, what can be done to make this code more robust and scalable?
Note: it is important to me to use the urllib3 library rather than, for example, urllib2, because I want my code to be thread safe.
Source: (StackOverflow)
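For large files, the snippet in the question above reads the whole body into memory through `resp.data`. A common alternative is to pass `preload_content=False` and copy the body in chunks. The sketch below assumes Python 3 and a recent urllib3; `download` and its `chunk_size` default are illustrative choices, and the HTTP status is not checked here.

```python
import urllib3


def download(http, url, filename, chunk_size=64 * 1024):
    """Stream the body of url into filename without buffering it all.

    http is a urllib3.PoolManager that the caller creates once and shares.
    """
    resp = http.request("GET", url, preload_content=False)
    try:
        with open(filename, "wb") as out:
            # stream() yields the body piece by piece as it is read.
            for chunk in resp.stream(chunk_size):
                out.write(chunk)
    finally:
        # Return the connection to the pool even if writing failed.
        resp.release_conn()
```

Passing the `PoolManager` in, instead of creating one per call, lets several threads share the same connection pool.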
I'm reading XML events with the requests library as shown in the code below. How do I raise a connection-lost error once the request has started? The server emulates HTTP push / long polling (http://en.wikipedia.org/wiki/Push_technology#Long_polling) and will not end the response by default.
If there is no new message after 10 minutes, the while loop should exit.
import requests
from time import time
if __name__ == '__main__':
    #: Set a default content-length
    content_length = 512
    try:
        requests_stream = requests.get('http://agent.mtconnect.org:80/sample?interval=0', stream=True, timeout=2)
        while True:
            start_time = time()
            #: Read three lines to determine the content-length
            for line in requests_stream.iter_lines(3, decode_unicode=None):
                if line.startswith('Content-length'):
                    content_length = int(''.join(x for x in line if x.isdigit()))
                    #: pause the generator
                    break
            #: Continue the generator and read the exact amount of the body.
            for xml in requests_stream.iter_content(content_length):
                print "Received XML document with content length of %s in %s seconds" % (len(xml), time() - start_time)
                break
    except requests.exceptions.RequestException as e:
        print('error: ', e)
The server push could be tested with curl via command line:
curl http://agent.mtconnect.org:80/sample\?interval\=0
Source: (StackOverflow)
I am using Python 2.7 64-bit on Windows 8. I have Requests version 2.3 installed. I am trying to run this import statement as part of adding retries to my code:
from requests.packages.urllib3.util import Retry
I have urllib3 installed also (I've just installed it now via Pip). I am getting the error message:
Traceback (most recent call last):
File "C:\Python27\counter.py", line 3, in <module>
from requests.packages.urllib3.util import Retry
ImportError: cannot import name Retry
Can anyone tell me why this is? Are there any other dependencies I am unaware of to run this line of code successfully?
Thanks
Source: (StackOverflow)
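`Retry` is a class from urllib3, so whether the import above works depends on the urllib3 version bundled with the installed requests; an older bundled copy would explain the `ImportError` (this is an inference, not something the traceback states). With a standalone urllib3 that provides the class, a retry policy can be sketched as below. The numbers and status codes are arbitrary examples, and the function names are hypothetical.

```python
import urllib3
from urllib3.util.retry import Retry


def build_retry_policy(total=3, backoff_factor=0.5):
    """Create a policy that also retries on a few 5xx status codes."""
    return Retry(
        total=total,
        backoff_factor=backoff_factor,
        status_forcelist=[502, 503, 504],
    )


def pool_with_retries(policy):
    """Create a PoolManager whose requests use the given retry policy."""
    return urllib3.PoolManager(retries=policy)


if __name__ == "__main__":
    http = pool_with_retries(build_retry_policy())
    print(type(http).__name__)
```

In requests versions whose bundled urllib3 includes `Retry`, the same kind of object can be handed to `requests.adapters.HTTPAdapter(max_retries=...)` and mounted on a session.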
I'm running Python 2.7.6 on an Ubuntu machine. When I run twill-sh (Twill is a browser used for testing websites) in my Terminal, I'm getting the following:
Traceback (most recent call last):
File "dep.py", line 2, in <module>
import twill.commands
File "/usr/local/lib/python2.7/dist-packages/twill/__init__.py", line 52, in <module>
from shell import TwillCommandLoop
File "/usr/local/lib/python2.7/dist-packages/twill/shell.py", line 9, in <module>
from twill import commands, parse, __version__
File "/usr/local/lib/python2.7/dist-packages/twill/commands.py", line 75, in <module>
browser = TwillBrowser()
File "/usr/local/lib/python2.7/dist-packages/twill/browser.py", line 31, in __init__
from requests.packages.urllib3 import connectionpool as cpl
ImportError: No module named packages.urllib3
However, I can import urllib in Python console just fine. What could be the reason?
Source: (StackOverflow)
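The `ImportError` above suggests that the installed requests does not expose a bundled `requests.packages.urllib3`, which can happen when a distribution ships requests without its vendored dependencies (an assumption about this particular setup). Code that must run in both situations can fall back to the standalone package. A sketch, assuming at least a standalone urllib3 is installed; `pool_for` is a hypothetical helper:

```python
try:
    # Layout used when requests carries its own copy of urllib3.
    from requests.packages.urllib3 import connectionpool
except ImportError:
    # Fall back to the separately installed package.
    from urllib3 import connectionpool


def pool_for(host, port=80):
    """Create a connection pool for host; no connection is opened yet."""
    return connectionpool.HTTPConnectionPool(host, port=port)


if __name__ == "__main__":
    print(pool_for("example.com"))
```

This only helps in code you control; for a third-party package such as Twill, the import lives in the package itself.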
What is the correct way to update the user agent information in urllib3?
How can I check that the user agent information was indeed changed and is being used?
For example:
user_agent = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0'}
http = urllib3.PoolManager(10, headers=user_agent)
r1 = http.request('GET', 'http://example.com/')
if r1.status == 200:
    with open('somefile','w+') as f:
        f.write(r1.data)
When I create a PoolManager as http, I inspected it with dir(http) and saw that http.headers was empty by default and was updated to the user agent info I specified, but is it being used? Is there any way to check without having to look at the Apache logs?
And this is what /var/log/apache2/access.log actually shows after trying to update the user agent:
>>> import urllib3
>>> user_agent = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0'}
>>> http = urllib3.PoolManager(2, headers=user_agent)
>>> r = http.request('GET','localhost')
>>> with open('/var/log/apache2/access.log','r') as f:
...     last_line = f.readlines()[-1]
...
>>> last_line
'127.0.0.1 - - [08/Dec/2014:20:42:04 -0500] "GET / HTTP/1.1" 200 461 "-" "-"\n'
Source: (StackOverflow)
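One way to check which User-Agent urllib3 really sends, without reading Apache logs, is to point it at a throwaway local server that echoes the header back. The sketch below uses the standard library's `http.server` for that; the handler and function names are made up for illustration, and it targets Python 3 with a recent urllib3.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3


class EchoUserAgent(BaseHTTPRequestHandler):
    """Reply to GET with the User-Agent header the client sent."""

    def do_GET(self):
        body = self.headers.get("User-Agent", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demonstration quiet


def observed_user_agent(user_agent):
    """Send one request with user_agent and return what the server saw."""
    server = HTTPServer(("127.0.0.1", 0), EchoUserAgent)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        http = urllib3.PoolManager(headers={"User-Agent": user_agent})
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        return http.request("GET", url).data.decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(observed_user_agent("Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0"))
```

If the returned string matches what was configured, the header is being sent; if not, the installed urllib3 version is the first thing to look at.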