Ghost.py
WebKit-based scriptable web browser for Python.
ghost.py — Ghost.py 0.2.1 documentation
I've written some casperjs tests to test my Django application. If the Django application is started (on port 8000 for example), casperjs can be run as a separate process and access my running Django app.
My other tests are written with Django's (web)testing framework, which sets up the test database with fixtures, and are run with ./manage.py test. With Django webtest, you don't need to start a separate Django webserver (requests and URL routing are proxied/simulated).
Is there a way to run casperjs tests from within Django webtest, without starting a different webserver and having yet another test database?
I've seen ghost.py exists, but haven't tried it yet.
Source: (StackOverflow)
I'm trying to get started with the Ghost.py headless browser on a Mac. I installed Ghost.py and its dependencies using these links/commands:
- Qt 5.0.1 for Mac, has a GUI installer
- PySide 1.1.0, which requires Qt Version >= 4.7.4, has a GUI installer
sudo pip install Ghost.py
I launched Python and confirmed that I can import PySide. However, when I do from ghost import Ghost, it fails to find PySide:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ghost/__init__.py", line 1, in <module>
    from ghost import Ghost
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ghost/ghost.py", line 28, in <module>
    raise Exception("Ghost.py requires PySide or PyQt")
Exception: Ghost.py requires PySide or PyQt
By doing import PySide; print PySide; it appears that PySide is installed here on my system: /Library/Python/2.7/site-packages/PySide. So I appended that directory to PYTHONPATH like this:
export PYTHONPATH=$PYTHONPATH:/Library/Python/2.7/site-packages #for PySide
However, Ghost.py still cannot find PySide.
How can I convince Ghost.py to find my installation of PySide?
Environment:
- Mac OS X 10.7.5
- Python 2.7
- Qt 5.0.1
- PySide 1.1.0
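One way to narrow this down is to check which interpreter is running and whether each Qt binding module actually imports there. The traceback shows ghost under /Library/Frameworks/Python.framework/..., while PySide sits under /Library/Python/2.7/site-packages, which suggests that two different Python installations may be involved. The following is a minimal diagnostic sketch. The helper name probe_modules is hypothetical, and the module names passed to it are an assumption about what Ghost.py tries to import.

```python
import importlib
import sys


def probe_modules(names):
    """Try to import each module and report success or the error message."""
    report = {}
    for name in names:
        try:
            importlib.import_module(name)
            report[name] = "ok"
        except Exception as exc:  # ImportError, or a failure inside a C extension loader
            report[name] = "%s: %s" % (type(exc).__name__, exc)
    return report


if __name__ == "__main__":
    print("interpreter: %s" % sys.executable)
    # Assumed candidates; adjust to whatever the installed ghost.py imports.
    candidates = ["PySide", "PySide.QtWebKit", "PyQt4"]
    for name, status in sorted(probe_modules(candidates).items()):
        print("%s -> %s" % (name, status))
```

If the top-level package imports but a submodule does not, the recorded error message is usually more informative than the generic exception raised by Ghost.py.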
Source: (StackOverflow)
I am writing a script and having trouble even getting a basic one to work. Ghost seems to not be importing properly. I keep getting the following error:
>>> from ghost import Ghost
  File "/Library/Python/2.7/site-packages/ghost/__init__.py", line 1, in <module>
    from .ghost import Ghost, Error, TimeoutError
  File "/Library/Python/2.7/site-packages/ghost/ghost.py", line 23, in <module>
    if binding is None:
NameError: name 'binding' is not defined
Nothing special in the code:
from ghost import Ghost
ghost = Ghost()
I have both PySide and PyQt installed, and I installed Ghost by running: sudo pip install ghost
Source: (StackOverflow)
I'm using Ghost.py
from ghost import Ghost
url = "http://www.kiev.prom.ua"
gh = Ghost()
page, page_name = gh.create_page()
page_resource = page.open(url, wait_onload_event=True)
When I run the above script, Python crashes:
Problem Event Name: APPCRASH
Application Name: python.exe
Application Version: 0.0.0.0
Application Timestamp: 4c303241
Name of the module with the error: python27.dll
Version of the module with the error: 2.7.5150.1013
The time stamp module with the error: 5237f3d5
Exception Code: c0000005
Exception Offset: 00107f7a
OS Version: 6.1.7601.2.1.0.256.1
Language Code: 1049
Additional Information 1: 0a9e
Additional information 2: 0a9e372d3b4ad19135b953a78882e789
Additional Information 3: 0a9e
Additional Information 4: 0a9e372d3b4ad19135b953a78882e789
How can I find the source of this problem?
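Exception code c0000005 is a Windows access violation, so the crash happens in native code rather than as a Python exception, and no traceback is printed. One option is the standard-library faulthandler module (Python 3.3 and later; for Python 2.7 a backport of the same name is, as far as I know, available on PyPI), which can dump the Python stack when the process hits a fatal error. A minimal sketch, where the helper name and the log path are placeholders:

```python
import faulthandler


def enable_crash_log(path):
    """Dump the Python traceback of all threads to `path` on a fatal error."""
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file  # the caller must keep this file object open


# Intended use, at the very top of the script:
#   crash_log = enable_crash_log("crash.log")
#   ... run the Ghost.py code, then inspect crash.log after a crash ...
```

The dump shows which Python line was executing when the native code failed, which may at least identify the Ghost.py call involved.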
Source: (StackOverflow)
I want to try Ghost.py.
Its documentation says that installation requires PyQt or PySide.
I have installed pyqt4-dev-tools using the command apt-get install pyqt4-dev-tools on my Kubuntu 13.10.
I am getting this error:
root@alok:~# pip install Ghost.py
Downloading/unpacking Ghost.py
Could not find a version that satisfies the requirement Ghost.py (from versions: 0.1a, 0.1a2, 0.1a3, 0.1b, 0.1b2, 0.1b3)
Cleaning up...
No distributions matching the version for Ghost.py
Storing complete log in /root/.pip/pip.log
I have installed PySide too, but I am still not able to install Ghost.py using pip install Ghost.py.
Using pip is usually so straightforward, but I am not able to figure out what is wrong this time.
The output of /root/.pip/pip.log is available at http://paste.ubuntu.com/7189458/
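Every version in the list pip printed (0.1a through 0.1b3) is an alpha or beta release. Newer pip versions (1.4 and later, to my knowledge) skip pre-releases unless told otherwise, so pip install --pre Ghost.py or an explicit pin such as pip install Ghost.py==0.1b3 may be worth trying. The sketch below only illustrates the classification with a simplified regular expression. It is not pip's actual version-parsing logic, and the function name is made up for this example.

```python
import re

# Simplified check: a trailing a/b/rc segment marks a pre-release.
# Real tools follow PEP 440, which covers more cases than this.
_PRE_RELEASE = re.compile(r"\d(a|b|rc)\d*$")


def is_prerelease(version):
    """Return True if the version string ends in an alpha/beta/rc marker."""
    return bool(_PRE_RELEASE.search(version))


# The versions pip reported for Ghost.py in the log above.
versions = ["0.1a", "0.1a2", "0.1a3", "0.1b", "0.1b2", "0.1b3"]
for v in versions:
    print("%s pre-release: %s" % (v, is_prerelease(v)))
```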
Source: (StackOverflow)
Ghost.py is supposed to run JS: http://jeanphix.me/Ghost.py/
require.js gets fetched over http, but as far as I can tell it doesn't get run, since "js/main.built" never gets fetched and none of its specified JS files get loaded. It all works perfectly in a real browser.
In [51]: ghost = Ghost(wait_timeout=60)
In [52]: page, resources = ghost.open(url)
In [53]: [r.url for r in resources]
Out[53]:
[PyQt4.QtCore.QUrl(u'https://example.com/#consume/283e6571bcecf34143cbd60f35e0464b'),
PyQt4.QtCore.QUrl(u'https://example.com/css/ui.css'),
PyQt4.QtCore.QUrl(u'https://example.com/css/colorpicker.css'),
PyQt4.QtCore.QUrl(u'https://example.com/css/selectize.default.css'),
PyQt4.QtCore.QUrl(u'https://example.com/css/datepicker.css'),
PyQt4.QtCore.QUrl(u'https://example.com/css/site.css'),
PyQt4.QtCore.QUrl(u'https://example.com/css/jquery.fileupload-ui.css'),
PyQt4.QtCore.QUrl(u'https://example.com/css/style.css'),
PyQt4.QtCore.QUrl(u'https://example.com/css/bootstrap.css'),
PyQt4.QtCore.QUrl(u'https://example.com/css/redactor.css'),
PyQt4.QtCore.QUrl(u'https://example.com/js/lib/require.js')]
In [54]: ghost.con
ghost.confirm ghost.content
In [54]: ghost.content
Out[54]: u'<!DOCTYPE html><html lang="en"><head>\n <meta charset="utf-8">\n <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">\n <title>Ving</title>\n\n <link rel="stylesheet" rel='nofollow' href="css/ui.css">\n <link rel="stylesheet" rel='nofollow' href="css/bootstrap.css">\n <link rel="stylesheet" rel='nofollow' href="css/colorpicker.css">\n <link rel="stylesheet" rel='nofollow' href="css/redactor.css">\n <link rel="stylesheet" rel='nofollow' href="css/site.css">\n <link rel="stylesheet" rel='nofollow' href="css/selectize.default.css">\n <!-- My Bug fixes / overrides to work with real app -->\n <link rel="stylesheet" rel='nofollow' href="css/style.css">\n <link rel="stylesheet" rel='nofollow' href="css/datepicker.css">\n <!-- Theme for uploader -->\n <link rel="stylesheet" rel='nofollow' href="css/jquery.fileupload-ui.css">\n <!-- Shims for IE support -->\n <!--[if lte IE 8]>\n <script src="js/lib/html5shiv.js"></script>\n <script src="js/lib/r2d3.min.js" charset="utf-8"></script>\n <![endif]-->\n </head>\n\n <body>\n <div id="wrapper">\n <header id="header" class="container"></header>\n <div id="content" class="container"></div>\n </div>\n <footer id="footer" class="container"></footer>\n <div id="modal" class="modal modal-box hide fade"></div>\n <script data-main="js/main.built" src="js/lib/require.js"></script>\n\n \n\n</body></html>'
In [55]:
I also tried loading "https://mail.google.com/":
In [7]: url='https://mail.google.com/'
In [8]: ghost = Ghost(wait_timeout=60)
In [9]: page, resources = ghost.open(url)
In [10]: ghost.content
Out[10]: u'<html><head><meta http-equiv="Refresh" content="0;URL=https://mail.google.com/mail/"></head><body><script type="text/javascript" language="javascript"><!--\nlocation.replace("https://mail.google.com/mail/")\n--></script></body></html>'
In [11]:
Source: (StackOverflow)
from ghost import Ghost
ghost = Ghost()
page, rcs = ghost.open('https://soundcloud.com/passionpit/sets/favorites')
page, rcs = ghost.wait_for_page_loaded()
songs = ghost.evaluate("document.getElementsByClassName('soundTitle__title');")
print songs
I am attempting to use the above code to find all HTML elements on the above page that have the class 'soundTitle__title'; however, as of right now my output is:
QFont::setPixelSize: Pixel size <= 0 (0)
({PyQt4.QtCore.QString(u'length'): 0.0}, [])
Can anyone help me see where my problem is? When I run document.getElementsByClassName('soundTitle__title') in my browser's console I get the output I expect. Why is the Python output different?
Or is there some way for me to use Ghost.py or another similar library to get the source of the page after the JavaScript has run (the source seen when inspecting an element with browser developer tools)?
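A likely factor here is that getElementsByClassName returns a live HTMLCollection, which does not convert cleanly into a Python value. The output above shows only its length property, and a length of 0 suggests the elements had not been rendered yet when the script ran. One approach is to have the JavaScript return plain strings instead. The sketch below only builds such a script. The helper name is hypothetical, and passing the result to ghost.evaluate after waiting for the elements is left as a comment because it has not been run against this page.

```python
import json


def text_of_class_script(class_name):
    """Build JavaScript that returns the text of every element with a class."""
    # json.dumps produces a safely quoted JavaScript string literal.
    quoted = json.dumps(class_name)
    return (
        "Array.prototype.map.call("
        "document.getElementsByClassName(%s), "
        "function (el) { return el.textContent; });" % quoted
    )


script = text_of_class_script("soundTitle__title")
print(script)
# Intended use (untested assumption):
#   ghost.wait_for_selector(".soundTitle__title")
#   titles, resources = ghost.evaluate(script)
```

Returning an array of strings keeps the value serializable, so the Python side should receive a list rather than a wrapper around a DOM object.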
Source: (StackOverflow)
How do I go about debugging this stack trace?
Traceback (most recent call last):
  File "<string>", line 73, in execInThread
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\netref.py", line 196, in __call__
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\netref.py", line 71, in syncreq
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 431, in sync_request
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 379, in serve
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\protocol.py", line 337, in _recv
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\channel.py", line 50, in recv
  File "C:\Program Files (x86)\PyScripter\Lib\rpyc.zip\rpyc\core\stream.py", line 166, in read
EOFError: [Errno 10054] An existing connection was forcibly closed by the remote host
The stack trace does not refer to a line in my code, which is too long to reproduce here. However, the key component other than the standard python library is Ghost.py.
Thanks!
Source: (StackOverflow)
Trying to get the weather data from: http://metservice.com/maps-radar/local-observations/local-3-hourly-observations
I did find an example here on how to use Ghost for scraping dynamic web content, but I have not found out how to handle the result.
Since Ghost seems to have issues when running in an interactive shell, I use print(result) and redirect the output to a file:
python getMetObservation.py > proper_result
This is my python code:
from ghost import Ghost
url = 'http://metservice.com/maps-radar/local-observations/local-3-hourly-observations'
gh = Ghost(wait_timeout=60)
page, resources = gh.open(url)
result, resources = gh.evaluate("document.getElementsByClassName('obs-content');")
print(result)
When examining the file, it does contain what I am after, but it also contains a huge amount of information I am not after.
It is also not clear how to use the variable result that evaluate returns.
Inspecting ghost.py, it seems to be handled by self.main_frame.evaluateJavaScript("%s" % script) in:
def evaluate(self, script):
    """Evaluates script in page frame.
    :param script: The script to evaluate.
    """
    return (
        self.main_frame.evaluateJavaScript("%s" % script),
        self._release_last_resources(),
    )
When I execute the command:
document.getElementsByClassName('obs-content');
in a Chromium console I get the correct response.
I am a beginner when it comes to python but willing to learn.
Also note that I am running this in a python virtual environment under Ubuntu if it matters.
Source: (StackOverflow)
So I'm opening some page with ghost.py
ghost.open('http://someUrl.com')
ghost.wait_for_page_loaded()
ghost.show()
ghost.wait_for_text('some text here')
After that, the user navigates somewhere else, so I don't know the URL or the action taken. Is there a way to capture the current page, along with its proxy, cookies, and URL, and open the same page in another browser (Firefox, Chrome, etc.)?
Thank you.
Source: (StackOverflow)
I am trying to click a link:
## Open the website
from ghost import Ghost
ghost = Ghost(wait_timeout=40)
ghost.show()
ghost.open('https://my.url.com/')
ghost.wait_for_selector('#some')
if ghost.click('#this click will open popup selector'):
    print "i m done click correctly"
else:
    print "i fail to click"
if ghost.wait_for_selector('#popup selector'):
    print "selector found"
else:
    print "selector not found"
I am getting
i m done click correctly
selector found
But the window is not popping up...
Source: (StackOverflow)
I need to write a Python 3 script that answers the question in the title.
By "HTTPS error" I mean both the obvious error page advising user not to proceed and the errors visible in browser console, like “Blocked loading mixed active content”.
So far I tried Ghost.py, but it did not report any errors (with ignore_ssl_errors=False) while loading a page that caused “Blocked loading mixed active content”.
Is there a way to fix this in Ghost.py / PySide? Is there another tool I should use?
I would rather not use a tool like Selenium, which requires an actual browser, if there is another way.
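Short of driving a browser, one partial check is static: take the HTML of an HTTPS page and look for scripts or iframes loaded over plain http://, which is the kind of resource browsers report as mixed active content. This misses anything injected later by JavaScript, so it is a rough first pass rather than a replacement for browser-level reporting. A minimal sketch using only the standard library, with class and function names invented for this example:

```python
from html.parser import HTMLParser

# Tags treated as "active" content here, mapped to their URL attribute.
# This list is a simplification chosen for the example.
ACTIVE_TAGS = {"script": "src", "iframe": "src"}


class MixedContentFinder(HTMLParser):
    """Collect (tag, url) pairs for active resources loaded over http://."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        wanted = ACTIVE_TAGS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value and value.lower().startswith("http://"):
                self.insecure.append((tag, value))


def find_mixed_active_content(html):
    """Return insecure active resources referenced in an HTML string."""
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure


sample = (
    '<script src="http://cdn.example.com/a.js"></script>'
    '<script src="https://cdn.example.com/b.js"></script>'
)
print(find_mixed_active_content(sample))
```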
Source: (StackOverflow)
I am trying to figure out why, when I ghost.capture() a webpage, I get 6 errors per capture. I am using Ghost.py and PySide to take full-screen browser captures.
Errors below:
QT: QPainter::begin: Paint device returned engine == 0, type: 3
QT: QPainter::setRenderHint: Painter must be active to set rendering hints
QT: QPainter::setBrush: Painter not active
QT: QPainter::pen: Painter not active
QT: QPainter::setPen: Painter not active
QT: QPainter::end: Painter not active, aborted
Code:
from ghost import Ghost
url = "someurl"
dir = "somedir"
self.ghost = Ghost()
self.ghost.set_viewport_size(1920, 0)
self.ghost.open(url)
self.ghost.capture_to(dir)
I have searched online and couldn't find any simple solutions for Python. The problem doesn't arise 100% of the time, and I can't nail down exactly why it fails on some pages but not others. It might have to do with heavy page animations? Either way, shouldn't this still just take a screencap?
Source: (StackOverflow)
I'm using multiprocessing and ghost.py to crawl some data from the internet, but there are some errors:
2015-03-31T23:22:30 QT: QWaitCondition: Destroyed while threads are still waiting
This is some of my code:
l.acquire()
global ghost
try:
    ghost = Ghost(wait_timeout=60)
    ghost.open(website) #download page
    ghost.wait_for_selector('#pagenum') #wait JS
    html = []
    #print u"\t\t the first page"
    html.append(ghost.content)
    pageSum = findPageSum(ghost.content)
    for i in xrange(pageSum-1): #crawl all pages
        #print u"\t\tthe"+ str(i+2) +"page"
        ghost.set_field_value('#pagenum', str(i+2))
        ghost.click('#page-go')
        ghost.wait_for_text("<td>"+str(20*(i+1)+1)+"</td>")
        html.append(ghost.content)
    for i in html:
        souped(i)
    print website, "\t\t OK!"
except :
    pass
l.release()
Other code:
global _use_line
q = Queue.Queue(0)
for i in xrange(len(websitelist)):
    q.put((websitelist[i]))
lock = Lock()
while (not q.empty()):
    if (_use_line > 0):
        for i in range(_use_line):
            dl = q.get()
            _use_line -= 1
            print "_use_line: ", _use_line
            p = Process(target=download, args=(lock,dl))
            p.start()
    else:
        time.sleep(1)
ghost.py uses PyQt or PySide, and I think this issue is caused by an error with some local variable, but I don't know how to find it.
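The bare except: pass in the worker discards every exception, which makes problems like this hard to trace. A small sketch of a wrapper that logs the full traceback and releases the lock through a with block follows. The names guarded and job are hypothetical, and threading.Lock stands in for the multiprocessing lock in the usage lines, since both support the with statement. This does not by itself explain the QWaitCondition message; it only makes hidden failures visible.

```python
import logging
import threading

log = logging.getLogger("crawler")


def guarded(lock, job, *args):
    """Run job(*args) while holding lock; log failures instead of hiding them."""
    with lock:  # the lock is released even if job raises
        try:
            return job(*args)
        except Exception:
            # Records the message plus the full traceback.
            log.exception("job failed for args %r", args)
            return None


logging.basicConfig(level=logging.ERROR)
lock = threading.Lock()  # stand-in for the multiprocessing Lock
print(guarded(lock, lambda site: site.upper(), "example"))
print(guarded(lock, lambda: 1 / 0))
```

The second call logs a ZeroDivisionError traceback and returns None, where the original code would have passed silently.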
Source: (StackOverflow)
I am not able to install PySide and cannot figure out how to install PyQt on Heroku.
I need PySide in order to use Ghost.py.
Here is what I include in my requirements.txt:
Ghost.py==0.1b3
PySide==1.2.2
And here is the error when pushing to Heroku:
Python architecture is 64bit
error: Failed to find cmake. Please specify the path to cmake with --cmake parameter.
----------------------------------------
Cleaning up...
Command /app/.heroku/python/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_u30455/PySide/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-LqhYm6-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip_build_u30455/PySide
Storing debug log for failure in /app/.pip/pip.log
! Push rejected, failed to compile Python app
Thanks for your help ahead of time!
Source: (StackOverflow)