EzDevInfo.com

Ghost.py

WebKit-based scriptable web browser for Python (Ghost.py 0.2.1 documentation).

How can I run casperjs javascript tests from Jenkins?

I've written some casperjs tests to test my Django application. If the Django application is started (on port 8000 for example), casperjs can be run as a separate process and access my running Django app.

My other tests are written with Django's (web)testing framework that sets up the test database with fixtures, and are run with ./manage.py test. With Django webtest, you don't need to start a separate Django webserver (doing requests and url routing is proxied/simulated).

Is there a way to run casperjs tests from within Django webtest, without starting a different webserver and having yet another test database?

I've seen ghost.py exists, but haven't tried it yet.


Source: (StackOverflow)

Ghost.py not finding PySide?

I'm trying to get started with the Ghost.py headless browser on a Mac. I installed Ghost.py and its dependencies using these links/commands:

  1. Qt 5.0.1 for Mac, has a GUI installer
  2. PySide 1.1.0, which requires Qt Version >= 4.7.4, has a GUI installer
  3. sudo pip install Ghost.py

I launched Python, and confirmed that I can import PySide. However, when I do from ghost import Ghost, it fails to find PySide:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ghost/__init__.py", line 1, in <module>
    from ghost import Ghost
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ghost/ghost.py", line 28, in <module>
    raise Exception("Ghost.py requires PySide or PyQt")
Exception: Ghost.py requires PySide or PyQt

Running import PySide; print PySide; shows that PySide is installed here on my system: /Library/Python/2.7/site-packages/PySide. So I appended that directory to PYTHONPATH like this:
export PYTHONPATH=$PYTHONPATH:/Library/Python/2.7/site-packages #for PySide.

However, Ghost.py still cannot find PySide.

How can I convince Ghost.py to find my installation of PySide?


Environment:

  • Mac OS X 10.7.5
  • Python 2.7
  • Qt 5.0.1
  • PySide 1.1.0
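
One way to narrow this down is to ask the exact interpreter that fails which binding modules it can actually import, and why not. The sketch below is only a diagnostic under assumptions: `probe_bindings` is a hypothetical helper, and the list of module names is a guess at what a WebKit-based library would need, not a statement of what Ghost.py imports internally.

```python
from __future__ import print_function

import importlib
import sys


def probe_bindings(names=("PySide", "PySide.QtWebKit", "PyQt4", "PyQt4.QtWebKit")):
    """Try to import each candidate module and record the outcome.

    Returns a dict mapping module name to (ok, detail), where detail is the
    module's file location on success or the ImportError text on failure.
    """
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            report[name] = (False, str(exc))
        else:
            report[name] = (True, getattr(module, "__file__", "<built-in>"))
    return report


if __name__ == "__main__":
    # Shows which Python is running; it may differ from the one pip installed into.
    print("interpreter:", sys.executable)
    for name, (ok, detail) in sorted(probe_bindings().items()):
        print(name, "OK" if ok else "FAILED", detail)
```

Running this with the same `python` command that raises the exception shows whether the failure is a missing package or a submodule that cannot be loaded.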

Source: (StackOverflow)

binding is not defined while using Ghost.py

I am writing a script and having trouble even getting a basic one to work. Ghost seems to not be importing properly. I keep getting the following error:

>>>from ghost import Ghost
  File "/Library/Python/2.7/site-packages/ghost/__init__.py", line 1, in <module>
from .ghost import Ghost, Error, TimeoutError
  File "/Library/Python/2.7/site-packages/ghost/ghost.py", line 23, in <module>
    if binding is None:
NameError: name 'binding' is not defined

Nothing special in the code:

 from ghost import Ghost
 ghost = Ghost()

I have both PySide and PyQt installed, and I installed Ghost with: sudo pip install ghost


Source: (StackOverflow)

Python.exe appcrash with Ghost.py [closed]

I'm using Ghost.py

    from ghost import Ghost
    url = "http://www.kiev.prom.ua"
    gh = Ghost()
    page, page_name = gh.create_page()
    page_resource = page.open(url, wait_onload_event=True)

When I run the above script, Python crashes:

Problem Event Name: APPCRASH 
   Application Name: python.exe 
   Application Version: 0.0.0.0 
   Application Timestamp: 4c303241 
   Name of the module with the error: python27.dll 
   Version of the module with the error: 2.7.5150.1013 
   The time stamp module with the error: 5237f3d5 
   Exception Code: c0000005 
   Exception Offset: 00107f7a 
   OS Version: 6.1.7601.2.1.0.256.1 
   Language Code: 1049 
   Additional Information 1: 0a9e 
   Additional information 2: 0a9e372d3b4ad19135b953a78882e789 
   Additional Information 3: 0a9e 
   Additional Information 4: 0a9e372d3b4ad19135b953a78882e789

How can I find the source of this problem?


Source: (StackOverflow)

Not able to install Ghost.py

I want to try Ghost.py. Its documentation says that installation requires PyQt or PySide. I installed pyqt4-dev-tools with apt-get install pyqt4-dev-tools on Kubuntu 13.10.

I am getting this error:

root@alok:~# pip install Ghost.py
Downloading/unpacking Ghost.py
  Could not find a version that satisfies the requirement Ghost.py (from versions: 0.1a, 0.1a2, 0.1a3, 0.1b, 0.1b2, 0.1b3)
Cleaning up...
No distributions matching the version for Ghost.py
Storing complete log in /root/.pip/pip.log

I have installed PySide too, but I still cannot install Ghost.py with pip install Ghost.py. Using pip is normally straightforward, so I cannot figure out what is wrong this time.
The output of /root/.pip/pip.log is available at http://paste.ubuntu.com/7189458/


Source: (StackOverflow)

Why isn't Ghost.py loading/running my Javascript?

Ghost.py is supposed to run JS: http://jeanphix.me/Ghost.py/

require.js gets fetched over http, but as far as I can tell it doesn't get run, since "js/main.built" never gets fetched and none of its specified JS files get loaded. It all works perfectly in a real browser.

In [51]: ghost = Ghost(wait_timeout=60)

In [52]: page, resources = ghost.open(url)

In [53]: [r.url for r in resources]
Out[53]: 
[PyQt4.QtCore.QUrl(u'https://example.com/#consume/283e6571bcecf34143cbd60f35e0464b'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/ui.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/colorpicker.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/selectize.default.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/datepicker.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/site.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/jquery.fileupload-ui.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/style.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/bootstrap.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/css/redactor.css'),
 PyQt4.QtCore.QUrl(u'https://example.com/js/lib/require.js')]

In [54]: ghost.con
ghost.confirm  ghost.content  

In [54]: ghost.content
Out[54]: u'<!DOCTYPE html><html lang="en"><head>\n    <meta charset="utf-8">\n    <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">\n    <title>Ving</title>\n\n    <link rel="stylesheet" rel='nofollow' href="css/ui.css">\n    <link rel="stylesheet" rel='nofollow' href="css/bootstrap.css">\n    <link rel="stylesheet" rel='nofollow' href="css/colorpicker.css">\n    <link rel="stylesheet" rel='nofollow' href="css/redactor.css">\n    <link rel="stylesheet" rel='nofollow' href="css/site.css">\n    <link rel="stylesheet" rel='nofollow' href="css/selectize.default.css">\n    <!-- My Bug fixes / overrides to work with real app -->\n    <link rel="stylesheet" rel='nofollow' href="css/style.css">\n    <link rel="stylesheet" rel='nofollow' href="css/datepicker.css">\n    <!-- Theme for uploader -->\n    <link rel="stylesheet" rel='nofollow' href="css/jquery.fileupload-ui.css">\n    <!-- Shims for IE support  -->\n    <!--[if lte IE 8]>\n      <script src="js/lib/html5shiv.js"></script>\n      <script src="js/lib/r2d3.min.js" charset="utf-8"></script>\n    <![endif]-->\n  </head>\n\n  <body>\n    <div id="wrapper">\n      <header id="header" class="container"></header>\n      <div id="content" class="container"></div>\n    </div>\n    <footer id="footer" class="container"></footer>\n    <div id="modal" class="modal modal-box hide fade"></div>\n    <script data-main="js/main.built" src="js/lib/require.js"></script>\n\n  \n\n</body></html>'

In [55]:

I also tried loading "https://mail.google.com/":

In [7]: url='https://mail.google.com/'

In [8]: ghost = Ghost(wait_timeout=60)

In [9]: page, resources = ghost.open(url)

In [10]: ghost.content
Out[10]: u'<html><head><meta http-equiv="Refresh" content="0;URL=https://mail.google.com/mail/"></head><body><script type="text/javascript" language="javascript"><!--\nlocation.replace("https://mail.google.com/mail/")\n--></script></body></html>'

In [11]: 

Source: (StackOverflow)

Screen scraping dynamic webpage in python with Ghost.py

from ghost import Ghost

ghost = Ghost()
# The URL must be passed as a quoted string.
page, rcs = ghost.open('https://soundcloud.com/passionpit/sets/favorites')
page, rcs = ghost.wait_for_page_loaded()
songs = ghost.evaluate("document.getElementsByClassName('soundTitle__title');")
print songs

I am attempting to use the above code to find all HTML elements on that page that have the class 'soundTitle__title'. However, right now my output is:

QFont::setPixelSize: Pixel size <= 0 (0)
({PyQt4.QtCore.QString(u'length'): 0.0}, [])

Can anyone help me see where my problem is? When I run document.getElementsByClassName('soundTitle__title') in my browser's console I get the output I expect. Why is the Python output different?

Or is there some way for me to use Ghost.py or another similar library to get the source of the page after the JavaScript has run (the source seen when inspecting an element with browser developer tools)?


Source: (StackOverflow)

Ghost.py - what does this stack trace mean? [closed]

How do I go about debugging this stack trace?

Traceback (most recent call last):
  File "<string>", line 73, in execInThread
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\netref.py", line 196, in __call__
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\netref.py", line 71, in syncreq
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 431, in sync_request
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 379, in serve
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 337, in _recv
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\channel.py", line 50, in recv
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\stream.py", line 166, in read
EOFError: [Errno 10054] An existing connection was forcibly closed by the remote host

The stack trace does not refer to a line in my code, which is too long to reproduce here. However, the key component other than the standard python library is Ghost.py.

Thanks!


Source: (StackOverflow)

Python: Using Ghost for dynamic webscraping

Trying to get the weather data from: http://metservice.com/maps-radar/local-observations/local-3-hourly-observations

I found an example here of how to use Ghost for scraping dynamic content, but I have not worked out how to handle the result.

Since Ghost seems to have issues when running in an interactive shell, I use

print(result)

to pipe output to file:

python getMetObservation.py > proper_result

This is my python code:

from ghost import Ghost
url = 'http://metservice.com/maps-radar/local-observations/local-3-hourly-observations'
gh = Ghost(wait_timeout=60)
page, resources = gh.open(url)
result, resources = gh.evaluate("document.getElementsByClassName('obs-content');")
print(result)

The file does contain what I am after, but it also contains a huge amount of information I do not want. It is also not clear how to use the variable result that evaluate returns. Inspecting ghost.py, the call seems to be handled by

self.main_frame.evaluateJavaScript("%s" % script)

in:

def evaluate(self, script):
    """Evaluates script in page frame.

    :param script: The script to evaluate.
    """
    return (
        self.main_frame.evaluateJavaScript("%s" % script),
        self._release_last_resources(),
    )

When I execute the command:

document.getElementsByClassName('obs-content');

in a Chromium console I get the correct response.

I am a beginner when it comes to Python but willing to learn. Also note that I am running this in a Python virtual environment under Ubuntu, in case it matters.
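
Since evaluate returns a (value, resources) tuple, as the quoted source shows, one option is to have the page serialise only the wanted data to a JSON string and decode that string in Python. The following is a sketch under assumptions, not a confirmed Ghost.py recipe: `OBS_SCRIPT` and `decode_js_json` are hypothetical names, and it assumes the binding hands a JavaScript string back as ordinary text (a QString may need converting to text first).

```python
import json

# JavaScript that keeps only the text of each matching element and
# serialises the resulting array, instead of returning raw DOM nodes.
OBS_SCRIPT = (
    "JSON.stringify(Array.prototype.map.call("
    "document.getElementsByClassName('obs-content'), "
    "function (el) { return el.textContent; }))"
)


def decode_js_json(value):
    """Turn the JSON text returned by the page into plain Python objects."""
    if value is None:
        # Nothing came back from the script.
        return []
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    return json.loads(value)


# Intended use with the code from the question (not executed here):
#     result, resources = gh.evaluate(OBS_SCRIPT)
#     for text in decode_js_json(result):
#         print(text)
```

Returning strings rather than a node collection keeps the result small and gives Python an ordinary list to iterate over.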


Source: (StackOverflow)

Is there a way to open the same page of ghost.py in browser?

So I'm opening some page with ghost.py

ghost.open('http://someUrl.com')
ghost.wait_for_page_loaded()     
ghost.show()
ghost.wait_for_text('some text here')

After that the user navigates somewhere else, so I don't know the URL or the action taken. Is there a way to capture the page, with its proxy, cookies and URL, and open the same page in another browser (Firefox, Chrome, etc.)?

Thank you.


Source: (StackOverflow)

python Ghost.py not opening popup

I am trying to click a link:

 ## Open the website 
 from ghost import Ghost
 ghost = Ghost(wait_timeout=40)
 ghost.show()     
 ghost.open('https://my.url.com/')
 ghost.wait_for_selector('#some')

 if ghost.click('#this click will open popup selector'):
     print "i m done click correctly"
 else:
     print "i fail to click"

 if ghost.wait_for_selector('#popup selector'):
     print "selector found"
 else:
     print "selector not found"

I am getting

i m done click correctly 
selector found

But the popup window does not appear.


Source: (StackOverflow)

If I open this URL in one of the popular browsers, will I probably get a HTTPS error?

I need to write a Python 3 script that answers the question in the title.

By "HTTPS error" I mean both the obvious error page advising user not to proceed and the errors visible in browser console, like “Blocked loading mixed active content”.

So far I tried Ghost.py, but it did not report any errors (with ignore_ssl_errors=False) while loading a page that caused “Blocked loading mixed active content”.

Is there a way to fix this in Ghost.py / PySide? Is there another tool I should use?

I would rather not use a tool like Selenium, which requires an actual browser, if there is another way.
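
One part of this check, spotting "mixed active content", can be approximated without a browser by scanning the HTML for insecure references. The sketch below is a rough static heuristic under stated assumptions: `MixedContentFinder` and `find_mixed_active_content` are hypothetical names, treating script, iframe and stylesheet link tags as "active" is an assumption, and resources injected by JavaScript at runtime are not seen at all. It does not validate certificates.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class MixedContentFinder(HTMLParser):
    """Collect script, iframe and stylesheet references that resolve to http://."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "iframe"):
            ref = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower().split():
            ref = attrs.get("href")
        else:
            ref = None
        if not ref:
            return
        # Resolve relative and protocol-relative references against the page URL.
        absolute = urljoin(self.page_url, ref)
        if urlparse(absolute).scheme == "http":
            self.insecure.append((tag, absolute))


def find_mixed_active_content(page_url, html):
    """Return (tag, url) pairs an HTTPS page would load over plain HTTP."""
    if urlparse(page_url).scheme != "https":
        return []  # Mixed content only applies to pages served over HTTPS.
    finder = MixedContentFinder(page_url)
    finder.feed(html)
    return finder.insecure


sample = (
    '<script src="http://cdn.example.com/a.js"></script>'
    '<script src="/b.js"></script>'
    '<link rel="stylesheet" href="http://example.com/c.css">'
    '<img src="http://example.com/p.png">'
)
print(find_mixed_active_content("https://example.com/", sample))
```

In the sample, the relative script and the image are not reported: the first resolves to HTTPS, and images are treated here as passive content.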


Source: (StackOverflow)

Ghost.py throwing errors in QPainter with default everything

I am trying to figure out why I get six errors per capture when I call ghost.capture() on a webpage. I am using Ghost.py and PySide to take full-screen browser captures.

The errors:

QT: QPainter::begin: Paint device returned engine == 0, type: 3
QT: QPainter::setRenderHint: Painter must be active to set rendering hints
QT: QPainter::setBrush: Painter not active
QT: QPainter::pen: Painter not active
QT: QPainter::setPen: Painter not active
QT: QPainter::end: Painter not active, aborted

Code:

from ghost import Ghost
url = "someurl"
dir = "somedir"
self.ghost = Ghost()
self.ghost.set_viewport_size(1920, 0)
self.ghost.open(url)
self.ghost.capture_to(dir)

I have searched online and couldn't find any simple solutions for Python. The problem doesn't arise 100% of the time, and I can't nail down exactly why it fails on some pages but not others. It might have to do with heavy page animations. Either way, shouldn't this still just take a screen capture?


Source: (StackOverflow)

QWaitCondition error when multiprocessing with python ghost.py

I'm using multiprocessing and ghost.py to crawl some data from the internet, but there are some errors:

2015-03-31T23:22:30 QT: QWaitCondition: Destroyed while threads are still waiting

This is some of my code:

    l.acquire()
    global ghost
    try:
        ghost = Ghost(wait_timeout=60)
        ghost.open(website) #download page
        ghost.wait_for_selector('#pagenum') #wait JS
        html = []
        #print u"\t\t the first page"
        html.append(ghost.content)
        pageSum = findPageSum(ghost.content)
        for i in xrange(pageSum-1): #crawl all pages
            #print u"\t\tthe"+ str(i+2) +"page"
            ghost.set_field_value('#pagenum', str(i+2)) 
            ghost.click('#page-go') 
            ghost.wait_for_text("<td>"+str(20*(i+1)+1)+"</td>")
            html.append(ghost.content)
        for i in html:
            souped(i)
        print  website, "\t\t OK!"
    except :
        pass
    l.release()

Other code:

    global _use_line
    q = Queue.Queue(0)
    for i in xrange(len(websitelist)):
        q.put((websitelist[i]))
    lock = Lock()

    while (not q.empty()):
        if (_use_line > 0):
            for i in range(_use_line):
                dl = q.get()
                _use_line -= 1
                print "_use_line: ", _use_line
                p = Process(target=download, args=(lock,dl))
                p.start()
        else:
            time.sleep(1)

Ghost.py uses PyQt or PySide, and I suspect this issue is caused by some local variable, but I don't know how to track it down.
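
For comparison, the hand-rolled `_use_line` counter and polling loop above could be expressed with a bounded process pool. This is only a sketch of that scheduling pattern, not a diagnosis of the QWaitCondition message: `download` here is a hypothetical placeholder that does no real crawling, and `crawl_all` and `max_processes` are made-up names.

```python
from __future__ import print_function

import multiprocessing


def download(website):
    """Placeholder worker.

    In a real crawl, the browser object would be created here, inside the
    child process, and only plain picklable data would be returned.
    """
    return (website, "OK")


def crawl_all(websites, max_processes=4):
    """Run download() over every site using at most max_processes workers."""
    pool = multiprocessing.Pool(processes=max_processes)
    try:
        # map() blocks until all sites are done and preserves input order.
        return pool.map(download, websites)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    sites = ["http://example.com/a", "http://example.com/b"]
    for site, status in crawl_all(sites, max_processes=2):
        print(site, status)
```

The pool takes over the bookkeeping of how many workers are running, so no shared counter, lock or sleep loop is needed in the parent.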


Source: (StackOverflow)

Possible to install PySide or PyQt on Heroku?

I am not able to install PySide and cannot figure out how to install PyQt on Heroku.

I need PySide in order to use Ghost.py.

Here is what I include in my requirements.txt:

Ghost.py==0.1b3    
PySide==1.2.2

And here is the error when pushing to Heroku:

Python architecture is 64bit

       error: Failed to find cmake. Please specify the path to cmake with --cmake parameter.

       ----------------------------------------
       Cleaning up...
       Command /app/.heroku/python/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_u30455/PySide/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-LqhYm6-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip_build_u30455/PySide
       Storing debug log for failure in /app/.pip/pip.log

 !     Push rejected, failed to compile Python app

Thanks for your help ahead of time!


Source: (StackOverflow)