EzDevInfo.com

spynner

Programmatic web browsing module with AJAX support for Python (spynner 2.19, Python Package Index)

How can I set GUI size of spynner python module?

I would like to set a particular size for the spynner GUI (fullscreen). How can I do that?

I know there's the Qt documentation, but I don't know any C++, and I would also need to mix C++ and Python.


Source: (StackOverflow)

Python 2.7 - Spynner - High RAM Usage Upon Page Refresh

I ran into trouble when I wrote a page-refresh script.

Here's the code:

import spynner
browser = spynner.Browser()

When I type

browser.load("http://stackoverflow.com")

a few times, the script uses a great deal of RAM.

I tried:

browser = spynner.Browser()
browser.load("http://stackoverflow.com")
browser.close()

but it does not help; the RAM usage stays the same.

So my question is: how can I load a page many times without exhausting my RAM?

Thanks in advance!
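
One possible workaround, sketched here under assumptions rather than as a confirmed fix: if the memory is held by the process that hosts the browser, running each load in a short-lived child process means the operating system reclaims that memory when the child exits. The child script below is a hypothetical placeholder that only echoes the URL; in real use it would create the spynner.Browser, load the page, and print whatever is needed. The names CHILD_CODE and load_in_child are made up for this illustration, and the sketch targets Python 3.7+ rather than the Python 2.7 of the question.

```python
import subprocess
import sys

# Hypothetical child script. In real use, the placeholder line would be
# replaced by: browser = spynner.Browser(); browser.load(url); ...
CHILD_CODE = r"""
import sys
url = sys.argv[1]
print("loaded " + url)  # placeholder for the real page load
"""


def load_in_child(url, timeout=60):
    """Run one page load in a short-lived process and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, url],
        capture_output=True,  # collect what the child prints
        text=True,            # decode output as str
        timeout=timeout,      # give up if the child hangs
        check=True,           # raise if the child exits with an error
    )
    return result.stdout.strip()


if __name__ == "__main__":
    for _ in range(3):
        print(load_in_child("http://stackoverflow.com"))
```

The trade-off is startup cost: every call pays for a new interpreter (and, in real use, a new browser), so this only makes sense when reloads are infrequent compared with that overhead.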


Source: (StackOverflow)

QtWebKit - Userscript/Javascript Injection

I've been doing testing work in Python using QtWebkit/Spynner. As QtWebKit has Javascript support just like Chrome's Webkit, is it possible to inject a userscript or a piece of javascript at the beginning of a page just like you would a regular user script in Chrome?

Hopefully a simple question for those with experience! Thanks in advance!


Source: (StackOverflow)

Click Javascript button python, spynner

I want to click a button that has no name using spynner. The button looks like this:

<li> <a onclick="save(); return false;" rel='nofollow' href="">
<img src="/pathtoimage" width="31" height="13" alt="Save Changes" border="0"></a>
</li>
</ul>

Do you have any idea? Please post some code. Any help is much appreciated!


Source: (StackOverflow)

Installing Spynner on Python on Windows XP

I have Python 2.7 running on Windows XP. I am trying to install Spynner as an alternative to Mechanize that supports Javascript. When I run easy_install spynner, I get an error while installing lxml:

Make sure the development packages of libxml2 and libxslt are installed

Where can I find those files? I found instructions for Linux but none for Windows. I also tried easy_install, but it could not find the packages.


Source: (StackOverflow)

How to run a POST request programmatically in python with a GUI ? (spynner, webkit...)

I have a web site with forms that I need to scrape. Instead of filling the flash forms, I would like to POST some keys/values to the URL that doesn't support GET requests.

I use spynner to interact with the site, and spynner can have a GUI, but my searches on Google, Stack Overflow, the spynner GitHub repository, and in the spynner module itself were unsuccessful.

If spynner can't do a POST request, maybe GTK or Qt + WebKit can? Any real-life code sample would be really appreciated.


Source: (StackOverflow)

Spynner install issues

Running Mac OS Lion

I was trying to do some scraping with Mechanize but I was having massive issues with javascript. So, after some browsing I decided to try out Spynner. I've tried to download it with both pip and easy_install but I get the same error each time:

Command /usr/bin/python -c "import setuptools;__file__='/private/tmp/pip-build-root/autopy/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-53vGzx-record/install-record.txt --single-version-externally-managed failed with error code 1 in /private/tmp/pip-build-root/autopy

Looking for some help (or maybe another suggestion rather than Spynner).

I have XCode installed and updated (along with command line tools).

I've made the links between gcc and gcc-4.2 (ln -s /usr/bin/gcc /usr/bin/gcc-4.21)

I have virtualenv installed. I only mention that because I've noticed that a lot of the same problems occur with its installation as well.


Source: (StackOverflow)

Installing Spynner on Heroku

I'm trying to use Python's spynner module in a Heroku app, but I'm having trouble getting it working on Heroku. I obviously have it in my requirements.txt and that's fine, even installing lxml with no problem, but I run into a problem with PyQt4. It's not listed in its dependencies, and if I manually put a statement in the requirements.txt like pyqt>=0.0.0, it does try to install it, but I always end up with the following error:

       Downloading/unpacking pyqt>=0.0.0 (from -r requirements.txt (line 12))
         Storing download in cache at /app/tmp/repo.git/.cache/pip_downloads/http%3A%2F%2Fwww.riverbankcomputing.com%2Fstatic%2FDownloads%2FPyQt4%2FPyQt-x11-gpl-4.9.4.tar.gz
         Running setup.py egg_info for package pyqt
           Traceback (most recent call last):
             File "<string>", line 14, in <module>
           IOError: [Errno 2] No such file or directory: '/tmp/build_11i9jfrncqat2/.heroku/venv/build/pyqt/setup.py'
           Complete output from command python setup.py egg_info:
           Traceback (most recent call last):

         File "<string>", line 14, in <module>

       IOError: [Errno 2] No such file or directory: '/tmp/build_11i9jfrncqat2/.heroku/venv/build/pyqt/setup.py'

       ----------------------------------------
       Command python setup.py egg_info failed with error code 1 in /tmp/build_11i9jfrncqat2/.heroku/venv/build/pyqt
       Storing complete log in /app/.pip/pip.log
 !     Heroku push rejected, failed to compile Python app

No matter what I do, even trying manual heroku run easy_install xxx with some PyQt4 distribution, it never works. Does anyone have any advice on how to get spynner running?


Source: (StackOverflow)

Monitoring for Program Crash

I'm currently working with the Python module Spynner to automate some web tasks. I've run into a problem, though: for some reason the process sometimes simply stops making progress. It freezes, yet Windows still reports it as responsive.

What I'd like to do is set up some form of monitor that checks whether this has happened and then restarts the process. I was thinking of monitoring the terminal output of the program: if it stops producing output for a certain amount of time, the monitor would kill the program and start it again.

I know how I'd kill the program and run it again, simply using os and subprocess, but I'm not sure how to set up the piece that detects when the terminal output has stopped for a specific amount of time.
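
The monitoring idea described above can be sketched with the standard library alone. This is a minimal illustration under assumptions, not a tested fix for Spynner: a helper thread forwards the child's output lines to a queue, and the main thread treats a queue timeout as a stall. Reading through a thread avoids relying on select for pipes, which is not available on Windows. The names run_with_watchdog and supervise are made up for this example, the code targets Python 3.7+, and the monitored program must flush its output (for a Python child, run it with -u), otherwise buffered output looks like silence.

```python
import queue
import subprocess
import sys
import threading


def _pump(stream, q):
    """Forward each line of the child's output to a queue, then signal EOF."""
    for line in iter(stream.readline, ""):
        q.put(line)
    q.put(None)  # sentinel: the child closed its output


def run_with_watchdog(cmd, silence_timeout):
    """Run cmd; return 'exited' on a normal end, 'stalled' if output stops."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    q = queue.Queue()
    threading.Thread(target=_pump, args=(proc.stdout, q), daemon=True).start()
    while True:
        try:
            line = q.get(timeout=silence_timeout)
        except queue.Empty:
            # No output for silence_timeout seconds: assume a freeze.
            proc.kill()
            proc.wait()
            return "stalled"
        if line is None:
            proc.wait()
            return "exited"
        sys.stdout.write(line)  # pass the child's output through


def supervise(cmd, silence_timeout, max_restarts):
    """Restart cmd whenever it goes silent; return the number of restarts."""
    restarts = 0
    while (run_with_watchdog(cmd, silence_timeout) == "stalled"
           and restarts < max_restarts):
        restarts += 1
    return restarts
```

A call such as supervise([sys.executable, "-u", "my_script.py"], silence_timeout=120, max_restarts=5) would restart the hypothetical my_script.py up to five times; the timeout should be longer than the longest legitimate pause in the program's output.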


Source: (StackOverflow)

Spynner: get html of second page after submitting form

I have just started using Spynner to scrape webpages and am not finding any good tutorials out there. I have here a simple example where I type a word into Google and then I want to see the resulting page.

But how do I go from clicking the button to actually getting the new page?

import spynner

def content_ready(browser):
    # 'gbqfba' is the id of the search button
    return 'gbqfba' in browser.html

b = spynner.Browser()
b.show()
b.load("http://www.google.com", wait_callback=content_ready)
b.wk_fill('input[name=q]', 'soup')
# b.browse() # Shows the word soup in the input box
with open("test.html", "w") as hf: # writes the initial page to a file
    hf.write(b.html.encode("utf-8"))
b.wk_click("#gbqfba") # Clicks the google search button (or so I think)

But now what? I'm not even sure that I have clicked the google search button, although it does have id=gbqfba. I have also tried just b.click("#gbqfba"). How do I get the search results?

I have tried just doing:

with open("test.html", "w") as hf: # writes the initial page to a file
    hf.write(b.html.encode("utf-8"))

but that still prints the initial page.


Source: (StackOverflow)

Python six library error while using spynner

I installed Python, pip and easy_install on my computer, then installed spynner with pip. The autopy installation failed, which I solved by using easy_install instead. After the installation I tried to use spynner, but it crashes with an error.

Here's what I've got:

import spynner
br = spynner.Browser()
br.load("http://www.google.com")

Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\spynner\browser.py", line 1674, in createRequest
    url = six.u(toString(request.url()))
  File "C:\Python27\lib\site-packages\six.py", line 589, in u
    return unicode(s.replace(r'\\', r'\\\\'), "unicode_escape")
TypeError: decoding Unicode is not supported

This is on Windows 7 Ultimate 64-bit with Python 2.7.8 64-bit.

I also tried 32-bit Python, but it gave me the same error. Can anyone solve this error?


Source: (StackOverflow)

Spynner module on python 3

I ran Python's 2to3.py on the spynner module. There then seemed to be an issue with QString on Python 3, so I modified browser.py in spynner with QString = str, as some users suggested. For a start I tried the following code:

import spynner
browser = spynner.Browser()
browser.set_proxy("http://username:password@host:3128")
browser.load("http://www.google.com/")

Now Python throws the following error:

File "G:\Python33\lib\site-packages\spynner\browser.py", line 1163, in runjs
js_has_runned_successfully = res.isValid() or res.isNull()
AttributeError: 'str' object has no attribute 'isValid'

res is defined in browser.py as

res = self.webframe.evaluateJavaScript(jscode)

Does spynner actually work on python 3?
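
The AttributeError suggests that evaluateJavaScript is handing back a plain str rather than a QVariant. That is consistent with PyQt4 under Python 3, where version 2 of the QVariant API is the default and values are converted to Python objects automatically. A minimal sketch of a check that tolerates both kinds of result follows; it is an illustration, not a patch taken from spynner. js_ran_successfully and FakeVariant are hypothetical names, and treating every plain Python value as success is an assumption of this sketch.

```python
def js_ran_successfully(res):
    """Mirror the check from the traceback, tolerating plain Python results."""
    if hasattr(res, "isValid"):
        # QVariant-like object: keep the original test unchanged.
        return res.isValid() or res.isNull()
    # Plain Python value (str, int, None, ...): there is no validity flag
    # to inspect, so this sketch simply treats the call as having run.
    return True


class FakeVariant(object):
    """Hypothetical stand-in for a QVariant, for demonstration only."""

    def __init__(self, valid, null):
        self._valid = valid
        self._null = null

    def isValid(self):
        return self._valid

    def isNull(self):
        return self._null


if __name__ == "__main__":
    print(js_ran_successfully("some string result"))
    print(js_ran_successfully(FakeVariant(valid=True, null=False)))
```

Even with a check like this in place, other parts of the module may still expect QVariant or QString objects, so this alone does not establish that the converted module works on Python 3.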


Source: (StackOverflow)

'str' object is not callable

I don't quite understand why I am getting this traceback:

Traceback (most recent call last):
  File "S:/Personal Folders/Andy/Python Projects/Salesforce BZ API/Automated Reports.py", line 15, in <module>
    parse = br.soup("find('div')")
  File "build\bdist.win32\egg\spynner\browser.py", line 409, in _get_soup
    return self._html_parser(self.html)
TypeError: 'str' object is not callable

Here is my code:

from __future__ import division
#from __future__ import unicode_literals
from __future__ import print_function
import spynner
from BeautifulSoup import BeautifulSoup

#Loading up Salesforce

br = spynner.Browser()
#br.debug_level = spynner.DEBUG
br.create_webview()
br.show()
br.set_html_parser("BeautifulSoup")
br.load("https://login.salesforce.com/")
parse = br.soup("find('div')")
print(parse)
br.browse()
br.close()
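
The traceback points at self._html_parser(self.html), so whatever was handed to set_html_parser gets called like a function. The sketch below reproduces that pattern with a hypothetical stand-in class (FakeBrowser is not spynner's code, and fake_parser is a placeholder for a real parser class); it shows that passing the string "BeautifulSoup" fails in exactly this way, while passing a callable works. Modelling soup as a property is an assumption inferred from the traceback, where the error is raised inside _get_soup before the ("find('div')") call is ever reached.

```python
class FakeBrowser(object):
    """Hypothetical stand-in mirroring only the call shown in the traceback."""

    def __init__(self, html):
        self.html = html
        self._html_parser = None

    def set_html_parser(self, parser):
        self._html_parser = parser

    def _get_soup(self):
        # Same shape as the failing line: the stored parser is *called*.
        return self._html_parser(self.html)

    soup = property(_get_soup)


def fake_parser(html):
    """Placeholder for a real parser class such as BeautifulSoup."""
    return {"parsed": html}


if __name__ == "__main__":
    br = FakeBrowser("<div>hi</div>")

    br.set_html_parser("BeautifulSoup")  # a string, as in the question
    try:
        br.soup
    except TypeError as exc:
        print("TypeError:", exc)

    br.set_html_parser(fake_parser)      # a callable instead of its name
    print(br.soup)
```

If spynner behaves like this stand-in, the question's code would pass the imported BeautifulSoup class itself rather than its name, and then call methods on the result, for example br.soup.find('div').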

Source: (StackOverflow)

jQuery/Python - Disable Dialog Boxes

I have a webpage that shows a dialog box whenever I leave it, asking if I'd "Really like to leave the page", with 'Leave' or 'Stay' options. This box is created by JavaScript.

I'm currently using Spynner for Python as the browser tool on this page. Spynner has the ability to inject JavaScript. This is what I came up with, but it doesn't seem to get the job done:

browser.runjs("""window.alert = function() {};""")

I've also tried alternatives such as the ones below, modelled on scripts I've previously injected into the page. However, I get a syntax error with them that I can't seem to put my finger on.

Injecting Script that DOES work:

browser.runjs("""jQuery(document.elementFromPoint(204,51)).click();""",debug=True)

Injecting Script that DOES NOT work, but need to get working:

browser.runjs("""jQuery(window.alert = function()) {};""")

Any help is greatly appreciated, thank you!

EDIT: Tried giving this a shot. Didn't work either. I'm a bit lost.

Ran This:

browser.runjs("""window.alert = function();""") 

Console said this:

Run Javascript code: window.alert = function(); 
Javascript console (undefined:1): SyntaxError: Parse error

Source: (StackOverflow)

download file over https query with python headless browser

I am trying to do web scraping in Python on a website (using spynner and BeautifulSoup). At some point I want to test a zip file download, triggered by the following HTTP request:

https://mywebsite.com/download?from=2011&to=2012

If explicitly used in a browser (chrome) this will trigger the download of a zip file with a given name. I have not been able to reproduce this behavior with my headless browser. I know it's not the right way to do it but using something like spynner:

from spynner import Browser
b = Browser()
b.load(webpage,wait_callback=wait_page_load, tries=3)
b.load_jquery(True)
...
output = b.load("https://website.com/download?from=2011&to=2012")
print b.html
>> ...

does not work of course (no zip file download). The last print statement shows I end up on an error page, with a java exception stack.

Is there a way to

  1. properly make the HTTP request without using the spynner load mechanism?
  2. capture the resulting zip file?
  3. download it with a chosen name?

Thanks for your help.

One last thing, which came up after some testing in Chrome with the debugger: I get the following warning when doing it in the browser:

Resource interpreted as Document but transferred with MIME type application/zip "https://mywebsite.com/download?from=2011&to=2012"

Edited:

Found out that the call made was:

https://mywebsite.com/download?from=10%2F18%2F2011&to=10%2F18%2F2012

which can be used in a browser and should be replaced by

https://mywebsite.com/download?from=10/18/2011&to=10/18/2012

which could not be used in Python, because the URL encoding would map %2F to %252F.
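
The %2F to %252F mapping is what happens when a value that is already percent-encoded gets encoded a second time, because the % sign itself is escaped as %25. A small sketch of the effect, and of building the query string from the raw dates so that it is encoded exactly once, is shown below. It uses Python 3's urllib.parse (in Python 2 the same functions live in the urllib module) and only the URL given in the question; it does not address how spynner itself encodes URLs.

```python
from urllib.parse import quote, unquote, urlencode

raw = "10/18/2011"

once = quote(raw, safe="")    # "/" becomes "%2F"
twice = quote(once, safe="")  # "%" becomes "%25", giving "%252F"

print(once)
print(twice)
print(unquote(once))          # decoding once restores the raw value

# Encode the raw values exactly once when building the query string.
query = urlencode({"from": "10/18/2011", "to": "10/18/2012"})
url = "https://mywebsite.com/download?" + query
print(url)
```

Whichever layer performs the encoding, the point is to hand it either the raw value or the already-encoded URL, never an encoded value to a function that will encode it again.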


Source: (StackOverflow)