EzDevInfo.com

pycurl

PycURL - Python interface to libcurl PycURL Home Page pycurl homepage

Create missing directories in ftplib storbinary

I was using pycurl to transfer files over ftp in python. I could create the missing directories automatically on my remote server using:

c.setopt(pycurl.FTP_CREATE_MISSING_DIRS, 1)

for some reasons, I have to switch to ftplib. But I don't know how to to the same here. Is there any option to add to storbinary function to do that? or I have to create the directories manually?


Source: (StackOverflow)

Logging in and using cookies in pycurl

I need to download a file that is on a password protected page. To get to the page manually I first have to authenticate via an ordinary login page. I want to use curl to fetch this page in script.
My script first logins. It appears to succeed--it returns a 200 from a PUT to /login. However, the fetch of the desired page fails, with a 500.

I am using a "cookie jar":

C.setopt(pycurl.COOKIEJAR, 'cookie.txt')

In verbose mode, I can see cookies being exchanged when I fetch the file I need. Now my question: Is there more to using a COOKIEJAR?


Source: (StackOverflow)

Advertisements

pip install pycurl - ssl backend error

I was trying to install pycurl in a virtualenv using pip and I got this error

ImportError: pycurl: libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other)

I read some documentation saying that "To fix this, you need to tell setup.py what SSL backend is used" (source) although I am not sure how to do this since I installed pycurl using pip.

How can I specify the SSL backend when installing pycurl with pip?

Thanks


Source: (StackOverflow)

Custom headers with pycurl

Can I send a custom header like "yaddayadda" to the server with the pycurl request?


Source: (StackOverflow)

PyCurl installed but not found

I've been trying to install pycurl in a virtualenv with easy_install, and it appears to install correctly:

(xxx) $ easy_install pycurl
Searching for pycurl
Reading http://pypi.python.org/simple/pycurl/
Reading http://pycurl.sourceforge.net/
Reading http://pycurl.sourceforge.net/download/
Best match: pycurl 7.19.0
Downloading http://pycurl.sourceforge.net/download/pycurl-7.19.0.tar.gz
Processing pycurl-7.19.0.tar.gz
Running pycurl-7.19.0/setup.py -q bdist_egg --dist-dir /tmp/easy_install-B2W9dY/pycurl-7.19.0/egg-dist-tmp-RmIsVr
Using curl-config (libcurl 7.19.7)
src/pycurl.c:85:4: warning: #warning "libcurl was compiled with SSL support, but configure could not determine which " "library was used; thus no SSL crypto locking callbacks will be set, which may " "cause random crashes on SSL requests"
src/pycurl.c: In function ‘do_multi_info_read’:
src/pycurl.c:2843: warning: call to ‘_curl_easy_getinfo_err_string’ declared with attribute warning: curl_easy_getinfo expects a pointer to char * for this info
src/pycurl.c: In function ‘multi_socket_callback’:
src/pycurl.c:2355: warning: call to ‘_curl_easy_getinfo_err_string’ declared with attribute warning: curl_easy_getinfo expects a pointer to char * for this info
In function ‘util_curl_unsetopt’,
    inlined from ‘do_curl_unsetopt’ at src/pycurl.c:1551:
src/pycurl.c:1476: warning: call to ‘_curl_easy_setopt_err_CURLSH’ declared with attribute warning: curl_easy_setopt expects a CURLSH* argument for this option
gcc (GCC) 4.4.4 20100726 (Red Hat 4.4.4-13)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

zip_safe flag not set; analyzing archive contents...
Adding pycurl 7.19.0 to easy-install.pth file

Installed /home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/pycurl-7.19.0-py2.7-linux-x86_64.egg
Processing dependencies for pycurl
Finished processing dependencies for pycurl

However, when trying to use, I get the following error:

(xxx) $ python
Python 2.7.1 (r271:86832, Sep 12 2011, 13:51:42) 
[GCC 4.4.4 20100726 (Red Hat 4.4.4-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycurl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named pycurl

Despite the fact that I can clearly see a pycurl-7.19.0-py2.7-linux-x86_64.egg in my ~/.virtualenvs/xxx/lib/python2.7/site-packages directory.

Any suggestions on how to solve this problem would be much appreciated!

Distribution is Amazon Linux (EC2, 64 bit), Python is 2.7, installed from source. curl, libcurl and libcurl-dev are all supplied by the Amazon yum repos and are at version 7.19.7-26.20.

I can't install pycurl package provided by the Amazon repos because the distro-supplied version of python is 2.6, and my project runs on 2.7.

It is not a path issue, the contents of sys.path is as follows:

>>> import sys; sys.path
['', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/pip-1.0.2-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/pygeoip-0.2.1-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/daryl_lib-1.8.4-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/xxx-1.8.4-py2.7.egg', '/home/ec2-user/dealutils-1.5.0-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/six-1.0.0-py2.7.egg', '/home/ec2-user/fm2c-1.3.0-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/affiliatewindow-1.3.0-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/icodes-1.3.0-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/supervisor-3.0a10-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/meld3-0.6.7-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/tornado-1.2-py2.7.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages/pycurl-7.19.0-py2.7-linux-x86_64.egg', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/site-packages', '/root', '/home/ec2-user/.virtualenvs/xxx/lib/python27.zip', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/plat-linux2', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/lib-tk', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/lib-old', '/home/ec2-user/.virtualenvs/xxx/lib/python2.7/lib-dynload', '/opt/python2.7/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg', '/opt/python2.7/lib/python2.7/site-packages/pip-1.0.2-py2.7.egg', '/opt/python2.7/lib/python2.7', '/opt/python2.7/lib/python2.7/plat-linux2', '/opt/python2.7/lib/python2.7/lib-tk', '/opt/python2.7/lib/python2.7/site-packages']

Source: (StackOverflow)

What good tutorials exist for learning pycURL? [closed]

I'm planning on building my own FTP client in Python for learning purposes. I'm planning on using PycURL but the documentation seems to be lacking.

What good tutorials are there for learning pycURL?


Source: (StackOverflow)

Getting HTML with Pycurl

I've been trying to retrieve a page of HTML using pycurl, so I can then parse it for relevant information using str.split and some for loops. I know Pycurl retrieves the HTML, since it prints it to the terminal, however, if I try to do something like

html = str(c.perform())  

The variable will just hold a string which says "None".

How can I use pycurl to get the html, or redirect whatever it sends to the console so it can be used as a string as described above?

Thanks a lot to anyone who has any suggestions!


Source: (StackOverflow)

Screen scraping with Python

Does Python have screen scraping libraries that offer JavaScript support?

I've been using pycurl for simple HTML requests, and Java's HtmlUnit for more complicated requests requiring JavaScript support.

Ideally I would like to be able to do everything from Python, but I haven't come across any libraries that would allow me to do it. Do they exist?


Source: (StackOverflow)

How to read the header with pycurl

How do I read the response headers returned from a PyCurl request?


Source: (StackOverflow)

Install pycurl on Mac OSX Snow Leopard [closed]

After hours of reading on the internet I found the easiest solution to install pycurl on mac os x. Make sure you have Xcode UNIX tool installed and run this command.

sudo env ARCHFLAGS="-arch x86_64" easy_install setuptools pycurl==7.16.2.1


Source: (StackOverflow)

ERROR "Extra data: line 2 column 1" when using pycurl with gzip stream

Thanks for reading.

Background: I am trying to read a streaming API feed that returns data in JSON format, and then storing this data to a pymongo collection. The streaming API requires a "Accept-Encoding" : "Gzip" header.

What's happening: Code fails on json.loads and outputs - Extra data: line 2 column 1 - line 4 column 1 (char 1891 - 5597) (Refer Error Log below)

This does NOT happen while parsing every JSON object - it happens at random.

My guess is I am encountering some weird JSON object after every "x" proper JSON objects.

I did reference how to use pycurl if requested data is sometimes gzipped, sometimes not? and Encoding error while deserializing a json object from Google but so far have been unsuccessful at resolving this error.

Could someone please help me out here?

Error Log: Note: The raw dump of the JSON object below is basically using the repr() method that prints the raw representation of the string without resolving CRLF/LF(s).


'{"id":"tag:search.twitter.com,2005:207958320747782146","objectType":"activity","actor":{"objectType":"person","id":"id:twitter.com:493653150","link":"http://www.twitter.com/Deathnews_7_24","displayName":"Death News 7/24","postedTime":"2012-02-16T01:30:12.000Z","image":"http://a0.twimg.com/profile_images/1834408513/deathnewstwittersquare_normal.jpg","summary":"Crashes, Murders, Suicides, Accidents, Crime and Naturals Death News From All Around World","links":[{"href":"http://www.facebook.com/DeathNews724","rel":"me"}],"friendsCount":56,"followersCount":14,"listedCount":1,"statusesCount":1029,"twitterTimeZone":null,"utcOffset":null,"preferredUsername":"Deathnews_7_24","languages":["tr"]},"verb":"post","postedTime":"2012-05-30T22:15:02.000Z","generator":{"displayName":"web","link":"http://twitter.com"},"provider":{"objectType":"service","displayName":"Twitter","link":"http://www.twitter.com"},"link":"http://twitter.com/Deathnews_7_24/statuses/207958320747782146","body":"Kathi Kamen Goldmark, Writers\xe2\x80\x99 Catalyst, Dies at 63 http://t.co/WBsNlNtA","object":{"objectType":"note","id":"object:search.twitter.com,2005:207958320747782146","summary":"Kathi Kamen Goldmark, Writers\xe2\x80\x99 Catalyst, Dies at 63 http://t.co/WBsNlNtA","link":"http://twitter.com/Deathnews_7_24/statuses/207958320747782146","postedTime":"2012-05-30T22:15:02.000Z"},"twitter_entities":{"urls":[{"display_url":"nytimes.com/2012/05/30/boo\xe2\x80\xa6","indices":[52,72],"expanded_url":"http://www.nytimes.com/2012/05/30/books/kathi-kamen-goldmark-writers-catalyst-dies-at-63.html","url":"http://t.co/WBsNlNtA"}],"hashtags":[],"user_mentions":[]},"gnip":{"language":{"value":"en"},"matching_rules":[{"value":"url_contains: nytimes.com","tag":null}],"klout_score":11,"urls":[{"url":"http://t.co/WBsNlNtA","expanded_url":"http://www.nytimes.com/2012/05/30/books/kathi-kamen-goldmark-writers-catalyst-dies-at-63.html?_r=1"}]}}\r\n{"id":"tag:search.twitter.com,2005:207958321003638785","objectType":"activity","actor":{"objectType":"person","id":"id:twitter.com:178760897","link":"http://www.twitter.com/Mobanu","displayName":"Donald Ochs","postedTime":"2010-08-15T16:33:56.000Z","image":"http://a0.twimg.com/profile_images/1493224811/small_mobany_Logo_normal.jpg","summary":"","links":[{"href":"http://www.mobanuweightloss.com","rel":"me"}],"friendsCount":10272,"followersCount":9698,"listedCount":30,"statusesCount":725,"twitterTimeZone":"Mountain Time (US & Canada)","utcOffset":"-25200","preferredUsername":"Mobanu","languages":["en"],"location":{"objectType":"place","displayName":"Crested Butte, Colorado"}},"verb":"post","postedTime":"2012-05-30T22:15:02.000Z","generator":{"displayName":"twitterfeed","link":"http://twitterfeed.com"},"provider":{"objectType":"service","displayName":"Twitter","link":"http://www.twitter.com"},"link":"http://twitter.com/Mobanu/statuses/207958321003638785","body":"Mobanu: Can Exercise Be Bad for You?: Researchers have found evidence that some people who exercise do worse on ... http://t.co/mTsQlNQO","object":{"objectType":"note","id":"object:search.twitter.com,2005:207958321003638785","summary":"Mobanu: Can Exercise Be Bad for You?: Researchers have found evidence that some people who exercise do worse on ... http://t.co/mTsQlNQO","link":"http://twitter.com/Mobanu/statuses/207958321003638785","postedTime":"2012-05-30T22:15:02.000Z"},"twitter_entities":{"urls":[{"display_url":"nyti.ms/KUmmMa","indices":[116,136],"expanded_url":"http://nyti.ms/KUmmMa","url":"http://t.co/mTsQlNQO"}],"hashtags":[],"user_mentions":[]},"gnip":{"language":{"value":"en"},"matching_rules":[{"value":"url_contains: nytimes.com","tag":null}],"klout_score":12,"urls":[{"url":"http://t.co/mTsQlNQO","expanded_url":"http://well.blogs.nytimes.com/2012/05/30/can-exercise-be-bad-for-you/?utm_medium=twitter&utm_source=twitterfeed"}]}}\r\n'
json exception: Extra data: line 2 column 1 - line 4 column 1 (char 1891 - 5597)

Header Output:


HTTP/1.1 200 OK

Content-Type: application/json; charset=UTF-8

Vary: Accept-Encoding

Date: Wed, 30 May 2012 22:14:48 UTC

Connection: close

Transfer-Encoding: chunked

Content-Encoding: gzip

get_stream.py:


#!/usr/bin/env python
import sys
import pycurl
import json
import pymongo

STREAM_URL = "https://stream.test.com:443/accounts/publishers/twitter/streams/track/Dev.json"
AUTH = "userid:passwd"

DB_HOST = "127.0.0.1"
DB_NAME = "stream_test"

class StreamReader:
    def __init__(self):
        try:
            self.count = 0
            self.buff = ""
            self.mongo = pymongo.Connection(DB_HOST)
            self.db = self.mongo[DB_NAME]
            self.raw_tweets = self.db["raw_tweets_gnip"]
            self.conn = pycurl.Curl()
            self.conn.setopt(pycurl.ENCODING, 'gzip')
            self.conn.setopt(pycurl.URL, STREAM_URL)
            self.conn.setopt(pycurl.USERPWD, AUTH)
            self.conn.setopt(pycurl.WRITEFUNCTION, self.on_receive)
            self.conn.setopt(pycurl.HEADERFUNCTION, self.header_rcvd)
            while True:
                self.conn.perform()
        except Exception as ex:
            print "error ocurred : %s" % str(ex)

    def header_rcvd(self, header_data):
        print header_data

    def on_receive(self, data):
        temp_data = data
        self.buff += data
        if data.endswith("\r\n") and self.buff.strip():
            try:
                tweet = json.loads(self.buff, encoding = 'UTF-8')
                self.buff = ""
                if tweet:
                    try:
                        self.raw_tweets.insert(tweet)
                    except Exception as insert_ex:
                        print "Error inserting tweet: %s" % str(insert_ex)
                    self.count += 1

                if self.count % 10 == 0:
                    print "inserted "+str(self.count)+" tweets"
            except Exception as json_ex:
                print "json exception: %s" % str(json_ex)
                print repr(temp_data)



stream = StreamReader()

Fixed Code:


def on_receive(self, data):
        self.buff += data
        if data.endswith("\r\n") and self.buff.strip():
           # NEW: Split the buff at \r\n to get a list of JSON objects and iterate over them
            json_obj = self.buff.split("\r\n")
            for obj in json_obj:
                if len(obj.strip()) > 0:
                    try:
                        tweet = json.loads(obj, encoding = 'UTF-8')
                    except Exception as json_ex:
                        print "JSON Exception occurred: %s" % str(json_ex)
                        continue

Source: (StackOverflow)

Python code like curl

in curl i do this:

curl -u email:password http://api.foursquare.com/v1/venue.json?vid=2393749

How i can do this same thing in python?


Source: (StackOverflow)

How upload folder by FTP using cURL?

I need to create FTP-uploader, i am using pycurl and python, but i dont know how to make folder with cURL on ftp's host. Help me please.


Source: (StackOverflow)

How to get HTTP status message in (py)curl?

spending some time studying pycurl and libcurl documentation, i still can't find a (simple) way, how to get HTTP status message (reason-phrase) in pycurl.

status code is easy:

import pycurl
import cStringIO

curl = pycurl.Curl()
buff = cStringIO.StringIO()
curl.setopt(pycurl.URL, 'http://example.org')
curl.setopt(pycurl.WRITEFUNCTION, buff.write)
curl.perform()

print "status code: %s" % curl.getinfo(pycurl.HTTP_CODE)
# -> 200

# print "status message: %s" % ???
# -> "OK"

Source: (StackOverflow)

How do i install pyCurl?

I am VERY new to python. I used libcurl with no problems and used pyCurl once in the past. Now i want to set it up on my machine and dev. However i have no idea how to do it. I rather not DL libcirl files and compile that along with pycurl, i want to know the simplest method. I have libcurl installed on my machine.

i'm on windows, i tried DLing the sources and use pycurl setup script, i had no luck.


Source: (StackOverflow)