Squid interview questions
Top frequently asked Squid interview questions
April Fools' Day is coming up. I'd like to play a prank where we use an HTTP proxy to modify images as they pass through the proxy. I've been told that there are scripts/add-ons for ISA/Squid that can do this, but I haven't been able to find much on my own.
Ideally, we'd like to superimpose another image on top of every .gif/.jpeg/.png that comes through the proxy. The only problem is I have no idea how to do this!
If I had my choice I'd rather do it with ISA than Squid, but beggars can't be choosers. If you can think of another way I'm open to that too!
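One way to approach this with Squid is its url_rewrite_program directive: Squid hands each requested URL to an external helper on stdin, and the helper can reply with a replacement URL. Below is a minimal sketch of such a helper, assuming a hypothetical internal overlay service that does the actual image compositing (e.g. with ImageMagick); the service URL, the script path, and the helper reply format are assumptions, and the reply format differs between Squid versions.
#!/usr/bin/env python3
# Hypothetical Squid url_rewrite_program helper (sketch only).
# squid.conf would reference it with something like:
#   url_rewrite_program /usr/local/bin/prank_rewrite.py
# The one-URL-per-line reply used here is the classic 2.x/3.x helper
# protocol; recent Squid releases expect "OK rewrite-url=..." replies.
import sys
import urllib.parse

# Assumed internal service that fetches the original image and composites
# another image on top before returning the result.
OVERLAY_SERVICE = "http://pranks.internal/overlay?src="
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")

for line in sys.stdin:
    parts = line.split()
    if not parts:
        sys.stdout.write("\n")
        sys.stdout.flush()
        continue
    url = parts[0]
    path = urllib.parse.urlsplit(url).path.lower()
    if path.endswith(IMAGE_SUFFIXES) and not url.startswith(OVERLAY_SERVICE):
        # Point the browser at the overlay service instead of the original image.
        sys.stdout.write(OVERLAY_SERVICE + urllib.parse.quote(url, safe="") + "\n")
    else:
        # A blank line tells Squid to leave the URL unchanged.
        sys.stdout.write("\n")
    sys.stdout.flush()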
Source: (StackOverflow)
We need a web content accelerator for static images to sit in front of our Apache web front end servers
Our previous hosting partner used Tux with great success and I like the fact that it's part of Red Hat Linux, which we're using, but its last update was in 2006 and there seems to be little chance of future development. Our ISP recommends we use Squid in a reverse caching proxy role.
Any thoughts between Tux and Squid? Compatibility, reliability and future support are as important to us as performance.
Also, I read in other threads here about Varnish; anyone have any real-world experience of Varnish compared with Squid, and/or Tux, gained in high-traffic environments?
Cheers
Ian
UPDATE: We're testing Squid now. Using ab to pull the same image 10,000 times with a concurrency of 100, both Apache on its own and Squid/Apache burned through the requests very quickly. But Squid made only a single request to Apache for the image then served them all from RAM, whereas Apache alone had to fork a large number of workers in order to serve the images. It looks like Squid will work well in freeing up the Apache workers to handle dynamic pages.
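For anyone setting up a comparable test, a basic Squid reverse-proxy (accelerator) configuration is only a few lines. This is a sketch with placeholder hostnames and an assumed origin address; exact syntax varies a little between Squid 2.x and 3.x:
# Sketch of a basic accelerator; names and addresses are placeholders.
http_port 80 accel defaultsite=www.example.com
cache_peer 192.0.2.10 parent 80 0 no-query originserver name=apache_origin
acl our_sites dstdomain www.example.com
http_access allow our_sites
cache_peer_access apache_origin allow our_sites
cache_peer_access apache_origin deny all
The accel port receives the client traffic, and the cache_peer line names the Apache origin that cache misses are fetched from.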
Source: (StackOverflow)
I am running a Squid server on my Debian machine. I want to block some websites on my system, and I have followed all the procedures for this, but there is no result.
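For reference, blocking by destination domain in Squid is normally a dstdomain ACL plus a deny rule. A minimal sketch, with example domains only; note that http_access rules are evaluated top to bottom, so the deny must come before any broader allow rule:
acl blocked_sites dstdomain .facebook.com .youtube.com
http_access deny blocked_sites
# ... existing http_access allow rules follow ...
If the rules look right but nothing is blocked, the usual culprits are rule ordering, a configuration that was never reloaded (squid3 -k reconfigure), or clients that are not actually going through the proxy.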
Source: (StackOverflow)
Assume I have an open-source web server or proxy I can enhance, let's say Apache or Squid.
Is there a way to determine the time each client spends on a web page?
HTTP is of course stateless, so it's not trivial, but maybe someone has an idea on how to approach this problem?
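Since the proxy only sees individual requests, a common approximation is to post-process the access log and treat the gap between successive requests from the same client as time spent on the previous page, capped at some idle cutoff. A rough sketch, assuming Squid's default native access.log format and a made-up log path and cutoff:
# Rough dwell-time estimate from Squid's native access.log (sketch).
# Field order assumed: time elapsed client code/status bytes method URL ...
from collections import defaultdict

IDLE_CUTOFF = 30 * 60                     # gaps over 30 minutes start a new visit
LOG_PATH = "/var/log/squid/access.log"    # adjust to your installation

last_seen = {}                     # client -> (timestamp, url) of previous request
time_spent = defaultdict(float)    # (client, url) -> accumulated seconds

with open(LOG_PATH) as log:
    for line in log:
        fields = line.split()
        if len(fields) < 7:
            continue
        ts, client, url = float(fields[0]), fields[2], fields[6]
        if client in last_seen:
            prev_ts, prev_url = last_seen[client]
            gap = ts - prev_ts
            if 0 < gap < IDLE_CUTOFF:
                # Credit the gap to the page the client was previously on.
                time_spent[(client, prev_url)] += gap
        last_seen[client] = (ts, url)

for (client, url), seconds in sorted(time_spent.items()):
    print(f"{client}\t{url}\t{seconds:.0f}s")
It is only a heuristic: embedded resources, multiple tabs and client-side caching all distort the numbers, so filtering the log down to text/html responses first helps.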
Source: (StackOverflow)
Is there any way I can set up squid3 to reroute traffic through multiple other servers in order to get a random IP address for each request?
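Squid can spread outgoing requests across a pool of upstream (parent) proxies, which gets close to this. A sketch with placeholder peer names and ports; round-robin selection is deterministic rather than truly random, and each upstream proxy needs its own public IP address for the effect to show:
cache_peer proxy1.example.net parent 3128 0 round-robin no-query
cache_peer proxy2.example.net parent 3128 0 round-robin no-query
cache_peer proxy3.example.net parent 3128 0 round-robin no-query
never_direct allow all   # never contact origin servers directly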
Source: (StackOverflow)
I have a REST API where I would like to cache the JSON response of the index (GET /foo) and the read actions (GET /foo/1) to significantly increase the performance. When there is a POST or a PUT on a resource the cache entries for the index and read results need to be expired, so no old content is served.
Is this a scenario that's best done with a Reverse proxy like Squid / Varnish or would you choose memcache(d)?
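With either Squid or Varnish, a common pattern is to let GETs be cached normally and have the application send an explicit PURGE for the affected URLs after every write. A hedged Python sketch with a placeholder API host; Squid only honours PURGE if an ACL permits it (acl Purge method PURGE plus a matching http_access rule), and Varnish needs a small purge rule in its VCL:
# Sketch: explicitly purge cached representations after a write.
import requests

API_BASE = "http://api.example.com"   # public name, resolving to the reverse proxy

def purge(url: str) -> None:
    # Purging an object that is not currently cached typically returns 404,
    # which is harmless here.
    resp = requests.request("PURGE", url, timeout=5)
    if resp.status_code not in (200, 204, 404):
        resp.raise_for_status()

def update_foo(foo_id: int, payload: dict) -> None:
    requests.put(f"{API_BASE}/foo/{foo_id}", json=payload, timeout=5)
    # Invalidate both the member and the collection index.
    purge(f"{API_BASE}/foo/{foo_id}")
    purge(f"{API_BASE}/foo")
memcached, by contrast, sits behind the application rather than in front of it, so the two approaches solve slightly different problems and can be combined.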
Source: (StackOverflow)
I have a VPS, and I would like to configure Squid to support HTTPS proxying.
I have configured an HTTP proxy and it works, but it does not support HTTPS.
Question
- How do I configure an HTTPS proxy in squid3?
This is my squid.conf configuration:
acl manager proto cache_object
acl localhost src 127.0.0.1/32 ::1
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32 ::1
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
acl larang url_regex -i "/etc/squid3/larang.txt"
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access deny larang
http_access allow all
http_port 143 transparent
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
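A likely explanation for HTTPS failing with this configuration: browsers reach HTTPS sites through a proxy using the CONNECT method, and CONNECT only works on a regular forward-proxy port, not on a port marked transparent (which can only intercept plain HTTP). A sketch of the usual fix, with 3128 assumed as the listening port for explicitly configured clients:
http_port 3128              # normal forward-proxy port for configured browsers
http_port 143 transparent   # keep interception for plain HTTP if still wanted
# The existing rules already allow the HTTPS tunnel:
#   acl SSL_ports port 443
#   acl CONNECT method CONNECT
#   http_access deny CONNECT !SSL_ports
Intercepting HTTPS without configuring the clients is a different exercise entirely (https_port with ssl-bump and locally trusted certificates) and is usually not what is wanted here.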
Source: (StackOverflow)
As I understand it, Squid sends a 504 Gateway Timeout when a request times out. But what if the client has already received a 200 OK response header, i.e. the response body is being sent back with chunked encoding? For example: header "200 OK", then body parts "a", "b", "c". If the request timeout happens after the client has received "200 OK" and "a", what will Squid do at that point? Will it send a 504 Gateway Timeout back to the client? And if so, can the client actually receive that "504 Gateway Timeout" header, given that it has already received the "200 OK" header?
Source: (StackOverflow)
I would like to know whether it is possible to force LWP::UserAgent to accept an expired SSL certificate for a single, well-known server. The issue is slightly complicated by the Squid proxy in between.
I went as far as to set up a debugging environment like:
use warnings;
use strict;
use Carp;
use LWP::UserAgent;
use LWP::Debug qw(+);
use HTTP::Cookies;
my $proxy = 'http://proxy.example.net:8118';
my $cookie_jar = HTTP::Cookies->new( file => 'cookies.tmp' );
my $agent = LWP::UserAgent->new;
$agent->proxy( [ 'http' ], $proxy );
$agent->cookie_jar( $cookie_jar );
$ENV{HTTPS_PROXY} = $proxy;
$ENV{HTTPS_DEBUG} = 1;
$ENV{HTTPS_VERSION} = 3;
$ENV{HTTPS_CA_DIR} = '/etc/ssl/certs';
$ENV{HTTPS_CA_FILE} = '/etc/ssl/certs/ca-certificates.crt';
$agent->get( 'https://www.example.com/' );
exit;
Fortunately the issue was eventually fixed on the remote server before I was able to come up with my own solution, but I would like to be able to optionally circumvent the problem should it arise again (the underlying service had been disrupted for several hours before I was called into action).
I would favor a solution at the LWP::UserAgent level over one based on the underlying Crypt::SSLeay or OpenSSL implementation, if such a solution exists, since I prefer not to relax security for other, unrelated applications. Of course I am still looking for such a solution myself, in my copious free time.
Source: (StackOverflow)
I have a strange problem with squid3. It normally works and I can access most web sites through the proxy. However, some sites like
google.com
bing.com
just seem to get blocked, but not always. Restarting squid3 doesn't seem to help, and neither does clearing the /var/spool/squid3 (cache) directory.
If I login to the machine that squid3 is running on and
wget --no-proxy google.com
then there is no problem; however, if I wget through the proxy it never responds. Most other websites are accessible, including stackoverflow.com, which I am using through the proxy right at this moment. Any idea what might be special about google.com and bing.com such that squid3 treats them differently, and is there any setting in the squid3 conf file that might be related to such behaviour?
Source: (StackOverflow)
I am looking for an ideal (performance-effective and maintainable) place to store binary data. In my case these are images. I have to do some image processing, scale the images, and store them in a suitable place that can be accessed via a RESTful service.
From my research so far I have a few options, like:
- NoSql solution like MongoDB,GridFS
- Storing as files in a file system in a directory hierarchy and then using a web server to access the images by url
- Apache Jackrabbit Document repository
- Store in a cache, something like Memcached or a Squid proxy
Any thoughts on which one you would pick and why would be useful, or is there a better way to do it?
Source: (StackOverflow)
I have a Java application (not an applet) that needs to access a web service. Proxies for the web service have been generated with JAX-WS, and seem to work fine. In one scenario it needs to talk through a web proxy server (actually Squid 3.0), which is set to require NTLM authentication.
Running on Sun's JRE 1.6.0_14, everything works fine for accessing HTTP URLs, without requiring any changes: the built-in NTLM authenticator kicks in and it all works seamlessly. If, however, the web service URL is an HTTPS URL, the web service call fails deep inside Sun's code:
com.sun.xml.internal.ws.client.ClientTransportException: HTTP transport error: java.lang.NullPointerException
at com.sun.xml.internal.ws.transport.http.client.HttpClientTransport.getOutput(HttpClientTransport.java:121)
at com.sun.xml.internal.ws.transport.http.client.HttpTransportPipe.process(HttpTransportPipe.java:142)
at com.sun.xml.internal.ws.transport.http.client.HttpTransportPipe.processRequest(HttpTransportPipe.java:83)
at com.sun.xml.internal.ws.transport.DeferredTransportPipe.processRequest(DeferredTransportPipe.java:105)
at com.sun.xml.internal.ws.api.pipe.Fiber.__doRun(Fiber.java:587)
at com.sun.xml.internal.ws.api.pipe.Fiber._doRun(Fiber.java:546)
at com.sun.xml.internal.ws.api.pipe.Fiber.doRun(Fiber.java:531)
at com.sun.xml.internal.ws.api.pipe.Fiber.runSync(Fiber.java:428)
at com.sun.xml.internal.ws.client.Stub.process(Stub.java:211)
at com.sun.xml.internal.ws.client.sei.SEIStub.doProcess(SEIStub.java:124)
at com.sun.xml.internal.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:98)
at com.sun.xml.internal.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:78)
at com.sun.xml.internal.ws.client.sei.SEIStub.invoke(SEIStub.java:107)
... our web service call ...
Caused by: java.lang.NullPointerException
at sun.net.www.protocol.http.NTLMAuthentication.setHeaders(NTLMAuthentication.java:175)
at sun.net.www.protocol.http.HttpURLConnection.doTunneling(HttpURLConnection.java:1487)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:164)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:896)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:230)
at com.sun.xml.internal.ws.transport.http.client.HttpClientTransport.getOutput(HttpClientTransport.java:109)
... 16 more
Looking in Sun's bug database turns up a few exceptions in such classes, but all of them seem to have been fixed. Has anyone come across anything like this? Has anyone got this to work?
Source: (StackOverflow)
I've run into a strange problem sending an HTTP request from a Chrome extension I am developing (plain JavaScript). It's a POST request made with XMLHttpRequest (from background.js) with a URL like:
http://host.com/postform/upload
I also send this request from a normal webpage (not a Chrome extension), and the important bit is that if I open Developer Tools and check the Network tab for my request (selecting the raw headers display), I see this as the first header there:
POST /postform/upload HTTP/1.1
And it works fine until I enable the proxy instead of a direct connection. I use squid3 on my Ubuntu machine for this. Only one thing is different between the requests when using the proxy, and it makes the HTTP server return 404 Not Found, but only when using the proxy.
When I force Chrome to use my squid3-based proxy (I use a PAC script for this in my Chrome extension), my request does not work. I checked many times and did all I could to reduce any differences in the request body, and all I have left now is the first header.
It looks like this when the request is sent with the proxy active (from the Network tab of Developer Tools, opened for the background page):
POST http://host.com/postform/upload HTTP/1.1
I've tried using the chrome.webRequest.onBeforeSendHeaders API, but that did not help. I also tried removing the hostname from the URL in XMLHttpRequest.open, but that did not help either.
Yes, I am sending the correct Host and Origin headers in either case. Could this be a problem in my squid3 configuration, or what should I change in my JavaScript?
UPDATE
I realize now that Squid is not the problem in any way; the problem is that the POST request contains the FULL URI (http://...) instead of just the path. GET works fine. It's killing me.
I cannot use iframe workarounds. What's my problem?
Source: (StackOverflow)
I'm a newbie programmer building a startup that I (naturally) hope will create a large amount of traffic. I am hosting my Django project on dotcloud, which is on Amazon EC2. I have some streaming media (HTTP though, not RTMP), so the dotcloud guys recommended I go with a CDN. I am also using Amazon S3 for storage, and so decided to go with Amazon CloudFront as my CDN.
The time has come where I need to turn my attention to caching and I am lost and confused. I am completely new to the concept. The entire extent of my knowledge comes from a tutorial I just read (http://www.mnot.net/cache_docs/) and a confusing weekend spent consulting google. Most troubling of all is that I am not even sure what I need to do for my site.
What is the difference between a CDN and a proxy server?
Is it possible I might want to use a caching service (e.g. memcached, redis), a CDN (CloudFront), AND a proxy server (squid)?
Our site is DB driven and produces dynamically generated lists specific to user locations. Can such a site be cached? (The lists themselves are filterable via AJAX, so the URL might remain the same while producing largely different results. For instance, example.com/some_url/ might generate a list of 40 objects, but only 10 appearing on the page. By clicking on a filter, the user could end up with 10 different objects while still at /some_url/)
What are the best practices for a high traffic, rich content site?
How can I learn about this? Everywhere I look seems to take for granted some basics that I just don't have as a part of my own foundation yet.
I'm not certain I'm asking the right questions. Just feeling very lost. I've now built 95% of my entire site and thought I was just ironing out the details but caching seems like another major undertaking. Any guidance/advice/encouragement would be much appreciated!
Source: (StackOverflow)
I've got a Python web crawler and I want to distribute the download requests among many different proxy servers, probably running Squid (though I'm open to alternatives). For example, it could work in a round-robin fashion, where request1 goes to proxy1, request2 to proxy2, and eventually looping back around. Any idea how to set this up?
To make it harder, I'd also like to be able to dynamically change the list of available proxies, bring some down, and add others.
If it matters, IP addresses are assigned dynamically.
Thanks :)
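A minimal round-robin sketch using the requests library; the proxy addresses are placeholders, and the pool is guarded by a lock so it can be replaced at runtime as proxies come and go:
# Round-robin proxy rotation for a crawler (sketch).
import itertools
import threading
import requests

class ProxyPool:
    def __init__(self, proxies):
        self._lock = threading.Lock()
        self.replace_proxies(proxies)

    def replace_proxies(self, proxies):
        # Swap the whole pool atomically, e.g. when proxies are added or removed.
        with self._lock:
            self._cycle = itertools.cycle(list(proxies))

    def next_proxy(self):
        with self._lock:
            return next(self._cycle)

pool = ProxyPool([
    "http://proxy1.example.net:3128",
    "http://proxy2.example.net:3128",
])

def fetch(url):
    proxy = pool.next_proxy()
    return requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10)

# Example: successive calls alternate between the configured proxies.
# print(fetch("http://example.com/").status_code)
If the crawler's own IP addresses change, nothing above cares; if the proxy list changes, call replace_proxies() with the new list.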
Source: (StackOverflow)