EzDevInfo.com

phantomjs interview questions

Top phantomjs frequently asked interview questions

Save and render a webpage with PhantomJS and node.js

I'm looking for an example of requesting a webpage, waiting for the JavaScript to render (JavaScript modifies the DOM), and then grabbing the HTML of the page.

This should be a simple example with an obvious use case for PhantomJS. I can't find a decent example; the documentation seems to be all about command-line use.
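
A minimal PhantomJS-side sketch of that flow, assuming a fixed delay is enough for the page's scripts to finish (a guess, not a guarantee). `fetchRenderedHtml`, `createPage`, and the 2000 ms delay are illustrative names and values, not part of any library. A Node.js program could run such a script as a child process and read the HTML from its standard output.

```javascript
// Open `url`, wait `delayMs` for page scripts to modify the DOM, then pass
// the serialized HTML to `done(err, html)`. `createPage` returns a PhantomJS
// WebPage; it is injected so the helper can be tried with a stand-in object.
function fetchRenderedHtml(createPage, url, delayMs, done) {
  var page = createPage();
  page.open(url, function (status) {
    if (status !== 'success') {
      done(new Error('Failed to load ' + url), null);
      return;
    }
    // Assumption: a fixed delay covers the page's own rendering work.
    setTimeout(function () {
      done(null, page.content);
    }, delayMs);
  });
}

// Entry point, used only when run as: phantomjs script.js <url>
if (typeof phantom !== 'undefined') {
  var system = require('system');
  fetchRenderedHtml(
    function () { return require('webpage').create(); },
    system.args[1],
    2000,
    function (err, html) {
      console.log(err ? err.message : html);
      phantom.exit(err ? 1 : 0);
    }
  );
}
```

A fixed delay is the simplest option; pages that load content at unpredictable times need a condition to wait on instead.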


Source: (StackOverflow)

How do I debug my JavaScript that is being executed by Chutzpah/PhantomJS?

I am using Chutzpah to execute my JavaScript unit tests.

I reference the paths to my source files and, below them, have a series of tests. Test Explorer in Visual Studio lists my tests and I can execute them directly from the IDE, so everything seems to be working correctly.

However, I would like to step into the source code that is being executed when my tests are run.

Is this possible?


Source: (StackOverflow)

phantomjs not waiting for "full" page load

I'm using PhantomJS v1.4.1 to load some web pages. I don't have access to their server side; I'm just given links pointing to them. I'm using an obsolete version of PhantomJS because I need to support Adobe Flash on those web pages.

The problem is that many websites load their minor content asynchronously, so PhantomJS's onLoadFinished callback (the analogue of onLoad in HTML) fires too early, before everything has loaded. Can anyone suggest how I can wait for a web page to load fully so that I can take, for example, a screenshot with all dynamic content such as ads?
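
One way to approximate "fully loaded" is to watch network traffic and wait until it has been quiet for a while. The sketch below assumes the page's onResourceRequested and onResourceReceived callbacks are available in the PhantomJS version in use; `whenNetworkIdle`, the timings, and the output file name are illustrative, and on older releases the page may need to be created with the constructor that release provides.

```javascript
// Call `onIdle` once no request has been in flight for `quietMs` ms, or
// after `maxMs` ms at the latest. `page` is a PhantomJS WebPage.
function whenNetworkIdle(page, quietMs, maxMs, onIdle) {
  var pending = 0;
  var quietTimer = null;
  var finished = false;

  function finish() {
    if (finished) { return; }
    finished = true;
    clearTimeout(quietTimer);
    clearTimeout(hardTimer);
    onIdle();
  }

  // Hard cap, so a page that never goes quiet cannot stall the script.
  var hardTimer = setTimeout(finish, maxMs);

  page.onResourceRequested = function () {
    pending += 1;
    clearTimeout(quietTimer); // new traffic: cancel any countdown
  };

  page.onResourceReceived = function (response) {
    // A response is reported in stages; 'end' means it is complete.
    if (response.stage !== 'end') { return; }
    pending = Math.max(0, pending - 1);
    if (pending === 0 && !finished) {
      clearTimeout(quietTimer);
      quietTimer = setTimeout(finish, quietMs);
    }
  };
}

// Entry point, used only when run as: phantomjs script.js <url>
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  whenNetworkIdle(page, 1000, 30000, function () {
    page.render('screenshot.png'); // hypothetical output file
    phantom.exit();
  });
  page.open(require('system').args[1]);
}
```

The quiet period is a heuristic: content that a script requests after a long pause will still be missed, so the values need tuning per site.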


Source: (StackOverflow)

Casperjs/PhantomJs vs Selenium

We are using Selenium to automate our UI testing. Recently we have seen that the majority of our users use Chrome, so we wanted to know the pros and cons of using PhantomJS vs Selenium:

  • Is there any real advantage in terms of performance, e.g. time taken to execute the test cases?
  • When should one prefer PhantomJS over Selenium?

Source: (StackOverflow)

PhantomJS: getting "Killed: 9" for anything I'm trying

I just installed PhantomJS on Mac OS X Yosemite. Whenever I run /bin/phantomjs, with any parameter, I get Killed: 9. Any ideas?


Source: (StackOverflow)

PhantomJS; click an element

How do I click an element in PhantomJS?

page.evaluate(function() {
    document.getElementById('idButtonSpan').click();  
});

This gives me an error "undefined is not a function..."

If I instead

 return document.getElementById('idButtonSpan');

and then print it,

then it prints [object Object], so the element does exist.

The element acts as a button, but it's actually just a span element, not a submit input.

I was able to get this button click to work with Casper, but Casper had other limitations so I'm back to PhantomJS.
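
A commonly suggested workaround, sketched under the assumption that the WebKit build in this PhantomJS version does not give plain elements such as a span a click() method: build a click event by hand and dispatch it. `clickById` is an illustrative name, and passing arguments through page.evaluate assumes a PhantomJS release that supports it.

```javascript
// Runs inside the page context. It must be self-contained, because
// page.evaluate serializes the function before running it in the page.
// Returns true if the element was found and the event was dispatched.
function clickById(id) {
  var el = document.getElementById(id);
  if (!el) { return false; }
  var ev = document.createEvent('MouseEvent');
  // type, bubbles, cancelable, view, detail, screen/client coordinates,
  // modifier keys, button (0 = primary), relatedTarget
  ev.initMouseEvent('click', true, true, window, 0,
                    0, 0, 0, 0, false, false, false, false, 0, null);
  el.dispatchEvent(ev);
  return true;
}

// Under PhantomJS, with an already loaded `page`:
//   var clicked = page.evaluate(clickById, 'idButtonSpan');
```

The return value makes it easy to tell a missing element apart from a click that had no visible effect.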


Source: (StackOverflow)

VCRProxy: Record PhantomJS ajax calls with VCR inside Capybara

I have already done some research in this field, but didn't find a solution. I have a site where asynchronous AJAX calls are made to Facebook (using JSONP). I'm recording all my HTTP requests on the Ruby side with VCR, so I thought it would be cool to use this feature for AJAX calls as well.

So I played around a little and came up with a proxy attempt. I'm using PhantomJS as a headless browser and Poltergeist for the integration inside Capybara. Poltergeist is now configured to use a proxy like this:

  Capybara.register_driver :poltergeist_vcr do |app|
    options = {
      :phantomjs_options => [
        "--proxy=127.0.0.1:9100",
        "--proxy-type=http",
        "--ignore-ssl-errors=yes",
        "--web-security=no"
      ],
      :inspector => true
    }
    Capybara::Poltergeist::Driver.new(app, options)
  end
  Capybara.javascript_driver = :poltergeist_vcr

For testing purposes, I wrote a proxy server based on WEBrick that integrates VCR:

require 'io/wait'
require 'webrick'
require 'webrick/httpproxy'

require 'rubygems'
require 'vcr'

module WEBrick
  class VCRProxyServer < HTTPProxyServer
    def service(*args)
      VCR.use_cassette('proxied') { super(*args) }
    end
  end
end

VCR.configure do |c|
  c.stub_with :webmock
  c.cassette_library_dir = '.'
  c.default_cassette_options = { :record => :new_episodes }
  c.ignore_localhost = true
end

IP   = '127.0.0.1'
PORT = 9100

reader, writer = IO.pipe

@pid = fork do
  reader.close
  $stderr = writer
  server = WEBrick::VCRProxyServer.new(:BindAddress => IP, :Port => PORT)
  trap('INT') { server.shutdown }
  server.start
end

raise 'VCR Proxy did not start in 10 seconds' unless reader.wait(10)

This works well with every localhost call, and they are recorded correctly. The HTML, JS and CSS files are recorded by VCR. Then I enabled the c.ignore_localhost = true option, because it's useless (in my opinion) to record localhost calls.

Then I tried again, and found that the AJAX calls made on the page aren't recorded. Even worse, they don't work inside the tests anymore.

So, to come to the point, my question is: why are all calls to JS files on localhost recorded, but JSONP calls to external resources are not? It can't be the JSONP part, because it's a "normal" AJAX request. Or is there a bug in PhantomJS that prevents AJAX calls from being proxied? If so, how could we fix that?

If it's running, I want to integrate the start and stop procedure inside

------- UPDATE -------

I did some research and came to the following point: the proxy has some problems with HTTPS calls and binary data through HTTPS calls.

I started the server, and made some curl calls:

curl --proxy 127.0.0.1:9100 http://d3jgo56a5b0my0.cloudfront.net/images/v7/application/stories_view/icons/bug.png

This call gets recorded as it should. The request and response output from the proxy is

GET http://d3jgo56a5b0my0.cloudfront.net/images/v7/application/stories_view/icons/bug.png HTTP/1.1
User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
Host: d3jgo56a5b0my0.cloudfront.net
Accept: */*
Proxy-Connection: Keep-Alive

HTTP/1.1 200 OK 
Server: WEBrick/1.3.1 (Ruby/1.9.3/2012-10-12)
Date: Tue, 20 Nov 2012 10:13:10 GMT
Content-Length: 0
Connection: Keep-Alive

But this call doesn't get recorded; there must be some problem with HTTPS:

curl --proxy 127.0.0.1:9100 https://d3jgo56a5b0my0.cloudfront.net/images/v7/application/stories_view/icons/bug.png

The header output is:

CONNECT d3jgo56a5b0my0.cloudfront.net:443 HTTP/1.1
Host: d3jgo56a5b0my0.cloudfront.net:443
User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
Proxy-Connection: Keep-Alive

HTTP/1.1 200 OK 
Server: WEBrick/1.3.1 (Ruby/1.9.3/2012-10-12)
Date: Tue, 20 Nov 2012 10:15:48 GMT
Content-Length: 0
Connection: close

So I thought maybe the proxy can't handle HTTPS, but it can (since I get the output on the console after the cURL call). Then I thought maybe VCR can't mock HTTPS requests. But with this script, VCR does mock HTTPS requests when I don't use it inside the proxy:

require 'net/http'
require 'vcr'

VCR.configure do |c|
  c.hook_into :webmock
  c.cassette_library_dir = 'cassettes'
end

uri = URI("https://d3jgo56a5b0my0.cloudfront.net/images/v7/application/stories_view/icons/bug.png")

VCR.use_cassette('https', :record => :new_episodes) do
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  response = http.request_get(uri.path)
  puts response.body
end

So what is the problem? VCR handles HTTPS and the proxy handles HTTPS. Why don't they play together?


Source: (StackOverflow)

Headless Browser and scraping - solutions [closed]

I'm trying to put together a list of possible solutions for automated browser test suites and headless browser platforms capable of scraping.


BROWSER TESTING / SCRAPING:

  • Selenium - polyglot flagship in browser automation, bindings for Python, Ruby, JavaScript, C#, Haskell and more, IDE for Firefox (as an extension) for faster test deployment. Can act as a Server and has tons of features.

JAVASCRIPT

  • PhantomJS - JavaScript, headless testing with screen capture and automation, uses Webkit. As of version 1.8 Selenium's WebDriver API is implemented, so you can use any WebDriver binding and tests will be compatible with Selenium
  • SlimerJS - similar to PhantomJS, uses Gecko (Firefox) instead of WebKit
  • CasperJS - JavaScript, built on both PhantomJS and SlimerJS, has extra features
  • Ghost Driver - JavaScript implementation of the WebDriver Wire Protocol for PhantomJS.
  • new PhantomCSS - CSS regression testing. A CasperJS module for automating visual regression testing with PhantomJS and Resemble.js.
  • new WebdriverCSS - plugin for Webdriver.io for automating visual regression testing
  • new PhantomFlow - Describe and visualize user flows through tests. An experimental approach to Web user interface testing.
  • new trifleJS - ports the PhantomJS API to use the Internet Explorer engine.
  • new CasperJS IDE (commercial)

NODE.JS

  • Node-phantom - bridges the gap between PhantomJS and node.js
  • WebDriverJs - Selenium WebDriver bindings for node.js by Selenium Team
  • WD.js - node module for WebDriver/Selenium 2
  • yiewd - WD.js wrapper using latest Harmony generators! Get rid of the callback pyramid with yield
  • ZombieJs - Insanely fast, headless full-stack testing using node.js
  • NightwatchJs - Node JS based testing solution using Selenium Webdriver
  • Chimera - can do everything PhantomJS does, but in a full JS environment
  • Dalek.js - Automated cross browser testing with JavaScript through Selenium Webdriver
  • Webdriver.io - better implementation of WebDriver bindings with predefined 50+ actions
  • Nightmare - Electron bridge with a high-level API.
  • jsdom - Tailored towards web scraping. A very lightweight DOM implemented in Node.js, it supports pages with javascript.

WEB SCRAPING / MINING

  • Scrapy - Python, mainly a scraper/miner - fast, well documented, and can be linked with Django Dynamic Scraper for nice mining deployments, or Scrapy Cloud for PaaS (server-less) deployment; works in a terminal or as a stand-alone server process, can be used with Celery, built on top of Twisted
  • Snailer - node.js module, untested yet.
  • Node-Crawler - node.js module, untested yet.

ONLINE TOOLS


RELATED LINKS & RESOURCES

Questions:

  • Any pure Node.js solution or Node.js to PhantomJS/CasperJS module that actually works and is documented?

Answer: Chimera seems to go in that direction, check out Chimera

  • Other solutions capable of easier JavaScript injection than Selenium?

  • Do you know any pure ruby solutions?

Answer: Check out the list created by rjk with Ruby-based solutions

  • Do you know any related tech or solution?

Feel free to re-edit this question and add content as you wish! Thank you for your contributions!


Updates

  1. added SlimerJS to the list
  2. added Snailer and Node-Crawler and Node-phantom
  3. added Yiewd WebDriver wrapper
  4. added WebDriverJs and WD.js
  5. added Ghost Driver
  6. added Comparison of Webscraping software on Screen Scraper Blog
  7. added ZombieJs
  8. added Resemble.js and PhantomCSS and PhantomFlow, categorised and re-edited content
  9. 04.01.2014, added Chimera, answered 2 questions
  10. added NightWatchJs
  11. added DalekJS
  12. added WebdriverCSS
  13. added CasperBox
  14. added trifleJS
  15. added CasperJS IDE
  16. added Nightmare
  17. added jsdom
  18. added Online HTTP client, updated CasperBox (dead)

Source: (StackOverflow)

How can I setup & run PhantomJS on Ubuntu?

I set up PhantomJS and recorded it to video: http://www.dailymotion.com/video/xnizmh_1_webcam

Build instructions: http://code.google.com/p/phantomjs/wiki/BuildInstructions

Is there anything wrong in my setup?

After I set it up, I read the quick start tutorial and tried to run this command:

phantomjs hello.js 

It gives me a "command not found" error. How can I solve this problem?


Source: (StackOverflow)

How to submit a form using PhantomJS

I'm trying to use PhantomJS (what an awesome tool, btw!) to submit a form for a page that I have login credentials for, and then output the content of the destination page to stdout. I'm able to access the form and set its values successfully using PhantomJS, but I'm not quite sure what the right syntax is to submit the form and output the content of the subsequent page. What I have so far is:

var page = new WebPage();
var url = phantom.args[0];

page.open(url, function (status) {

  if (status !== 'success') {
      console.log('Unable to access network');
  } else {

    console.log(page.evaluate(function () {

      var arr = document.getElementsByClassName("login-form");
      var i;

      for (i=0; i < arr.length; i++) {

        if (arr[i].getAttribute('method') == "POST") {
          arr[i].elements["email"].value="mylogin@somedomain.com";
          arr[i].elements["password"].value="mypassword";

          // This part doesn't seem to work. It returns the content
          // of the current page, not the content of the page after 
          // the submit has been executed. Am I correctly instrumenting
          // the submit in Phantom?
          arr[i].submit();
          return document.querySelectorAll('html')[0].outerHTML;
        }

      }

      return "failed :-(";

    }));
  }

  phantom.exit();
});
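
One likely reason the snippet above prints the current page: submit() only starts the navigation, and page.evaluate returns before the destination page exists. A minimal sketch of one way around that, assuming the submit triggers an ordinary page load, is to read page.content from the next load-finished callback. `afterNextLoad` is a hypothetical helper name, and the entry point assumes the form markup from the question.

```javascript
// Run `action` (something that starts a navigation, such as a form submit),
// then call `done(status)` once, when the next load finishes.
function afterNextLoad(page, action, done) {
  var fired = false;
  page.onLoadFinished = function (status) {
    if (fired) { return; } // one-shot: ignore any later loads
    fired = true;
    done(status);
  };
  action();
}

// Entry point, used only when run as: phantomjs script.js <url>
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  page.open(require('system').args[1], function (status) {
    if (status !== 'success') {
      console.log('Unable to access network');
      phantom.exit(1);
      return;
    }
    afterNextLoad(page, function () {
      page.evaluate(function () {
        // Set the field values here as in the question, then submit.
        document.getElementsByClassName('login-form')[0].submit();
      });
    }, function () {
      console.log(page.content); // content of the destination page
      phantom.exit();
    });
  });
}
```

Note that phantom.exit() moves into the second callback; calling it right after the submit would end the script before the destination page arrives.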

Source: (StackOverflow)

PhantomJS create page from string

Is it possible to create a page from a string?

example:

html = '<html><body>blah blah blah</body></html>'

page.open(html, function(status) {
  // do something
});

I have already tried the above with no luck....

Also, I think it's worth mentioning that I'm using nodejs with phantomjs-node (https://github.com/sgentle/phantomjs-node)

Thanks!
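
For plain PhantomJS (not the Node bridge), a sketch of the usual approach: page.open expects a URL, so the markup is assigned to the page's content property instead. This assumes the assignment raises the normal load-finished callback; `loadHtmlString` is an illustrative name. With a bridge such as phantomjs-node, the property would have to be set through whatever setter that bridge exposes, which is not shown here.

```javascript
// Load `html` into `page` and call `onReady(status)` when it has loaded.
function loadHtmlString(page, html, onReady) {
  // Install the handler first so the load triggered below is not missed.
  page.onLoadFinished = function (status) {
    onReady(status);
  };
  page.content = html; // assigning content replaces the current document
}

// Entry point, used only when run as: phantomjs script.js
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  loadHtmlString(page, '<html><body>blah blah blah</body></html>', function () {
    console.log(page.evaluate(function () {
      return document.body.textContent;
    }));
    phantom.exit();
  });
}
```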


Source: (StackOverflow)

PhantomJS: specify User Agent when making a call

I am using PhantomJS to make calls to a web page, like this:

page.open('http://example.com', function (s) {
  console.log(page.content);
  phantom.exit();
});

I am using this in the context of Drupal Simpletests, which require me to set a special USERAGENT in order to use the test database instead of the real database. I would like to fetch the web page with a specific user agent. For example, in PHP with cURL, I can do this by setting CURLOPT_USERAGENT before making a cURL call.

Thanks!

Albert
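
A short sketch of the PhantomJS counterpart, using the page's settings.userAgent property, which has to be set before the page is opened. `openWithUserAgent` and the agent string are placeholders; the value Drupal actually expects is not shown here.

```javascript
// Set the User-Agent for `page`, then open `url`.
function openWithUserAgent(page, url, userAgent, callback) {
  page.settings.userAgent = userAgent; // must be set before page.open
  page.open(url, callback);
}

// Entry point, used only when run as: phantomjs script.js
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  // 'ExampleTestAgent/1.0' is a placeholder user agent string.
  openWithUserAgent(page, 'http://example.com', 'ExampleTestAgent/1.0', function () {
    console.log(page.content);
    phantom.exit();
  });
}
```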


Source: (StackOverflow)

phantomjs: command not found

I followed these instructions (except for copying the executable to my PATH because I cannot seem to find it and it does not seem necessary). Then I made a file called image_render.js in my public javascripts directory with

console.log('Hello, world!');
phantom.exit();

inside it, saved it, and ran phantomjs render_image.js in my terminal. However, my terminal does not recognize the command:

-bash: phantomjs: command not found

What have I done wrong?


Source: (StackOverflow)

Is it possible to use Selenium WebDriver to drive PhantomJS?

I'm going through the documentation for the Selenium WebDriver, and it can drive Chrome for example. I got thinking, wouldn't it be far more efficient to 'drive' PhantomJS?

Is there a way to use Selenium with PhantomJS?

My intended use would be web scraping: the sites I scrape are loaded with AJAX and lots of lovely JavaScript, and I'm thinking this setup could be a good replacement for the Scrapy Python framework that I'm currently working with.


Source: (StackOverflow)

Getting remote debugging set up with PhantomJS

I'm trying to set up remote debugging with PhantomJS, without much luck. I am following the instructions at https://github.com/ariya/phantomjs/wiki/Troubleshooting. I have a little program named debug.js:

var system  = require('system' ), fs = require('fs'), webpage = require('webpage');

(function(phantom){
    var page=webpage.create();

    function debugPage(){
        console.log("Refresh a second debugger-port page and open a second webkit inspector for the target page.");
        console.log("Letting this page continue will then trigger a break in the target page.");
        debugger; // pause here in first web browser tab for steps 5 & 6
        page.open(system.args[1]);
        page.evaluateAsync(function() {
            debugger; // step 7 will wait here in the second web browser tab
        });
    }
    debugPage();
}(phantom));

Now I run this from the command line:

$ phantomjs --remote-debugger-port=9001 --remote-debugger-autorun=yes debug.js my.xhtml

The console.log messages are now displayed in the shell window. I open a browser page to localhost:9001. It is at this point that the documentation says "get first web inspector for phantom context". However, I see only a single entry for about:blank. When I click on that, I get an inspector for the irrelevant about:blank page, with the URL http://localhost:9001/webkit/inspector/inspector.html?page=1. The documentation talks about executing __run(), but I can't seem to get to the page where I would do that; about:html seems to contain a __run() which is a no-op.

FWIW, I am using PhantomJS 1.9.1 under Windows 8.

What am I missing?


Source: (StackOverflow)