
HAProxy interview questions

Top frequently asked HAProxy interview questions

HAProxy + WebSocket Disconnection

I am using HAProxy to send requests, on a subdomain, to a node.js app.

I am unable to get WebSockets to work. So far the client establishes a WebSocket connection, but it is disconnected very soon afterwards.

I am on Ubuntu. I have been using various versions of socket.io and node-websocket-server. The client is the latest version of either Safari or Chrome. The HAProxy version is 1.4.8.

Here is my haproxy.cfg:

global 
    maxconn 4096 
    pidfile /var/run/haproxy.pid 
    daemon 

defaults 
    mode http 

    maxconn 2000 

    option http-server-close
    option http-pretend-keepalive

    contimeout      5000
    clitimeout      50000
    srvtimeout      50000

frontend HTTP_PROXY
    bind *:80 

    timeout client  86400000

    #default server
    default_backend NGINX_SERVERS

    #node server
    acl host_node_sockettest hdr_beg(host) -i mysubdomain.mydomain

    use_backend NODE_SOCKETTEST_SERVERS if host_node_sockettest


backend NGINX_SERVERS
    server THIS_NGINX_SERVER 127.0.0.1:8081

backend NODE_SOCKETTEST_SERVERS
    timeout queue   5000
    timeout server  86400000

    server THIS_NODE_SERVER localhost:8180 maxconn 200 check

Any help is really appreciated. I've trawled the web and the mailing list but cannot get any of the suggested solutions to work.

(p.s. this could be for serverfault, but there are other HAProxy questions on S.O., so I have chosen to post here)

Thanks Ross
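
One commonly suggested direction for 1.4 (a sketch, not a confirmed fix): an upgraded WebSocket connection is a long-lived tunnel, so per-backend overrides that disable the connection-close options from the defaults section, combined with the long server timeout already present above, may keep the tunnel open. The option names are from the HAProxy 1.4 docs; everything else mirrors the config above:

backend NODE_SOCKETTEST_SERVERS
    # fall back to 1.4's default tunnel mode for this backend only,
    # so the upgraded connection is passed through untouched
    no option http-server-close
    no option http-pretend-keepalive
    timeout queue   5000
    timeout server  86400000

    server THIS_NODE_SERVER localhost:8180 maxconn 200 check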


Source: (StackOverflow)

Can I use GPL software binaries in a commercial environment? [closed]

I have a concern about using GPL v2 and GPL v3 licensed software in a commercial production environment. I would like to use HAProxy as a load-balancing solution. Is it safe with respect to copyleft? I won't modify anything in the source code, and the architecture of the system requires the use of a load balancer.

It will be embedded in a larger distributed system, so what we sell is the whole system. On another site, we will need to install the load balancer again and could mix it with something else. I think it's the term "distributing" that is confusing me.


Source: (StackOverflow)


Random "peer not authenticated" exceptions with Java SSLContextImpl$TLS10Context

I get connection failures that appear randomly when connecting to an HAProxy server using SSL. I have confirmed that these failures happen on JDK versions 1.7.0_21 and 1.7.0_25 but not with 1.7.0_04 or with 1.6.0_38.

The exception is

 Exception in thread "main" javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
    at sun.security.ssl.SSLSessionImpl.getPeerCertificates(SSLSessionImpl.java:397)
    at SSLTest2.main(SSLTest2.java:52)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

These failures only happen when using the TLS SSL context and not with the default context. The following code is run in a loop a thousand times and failures happen before the loop completes (about 2% of the connections fail):

SSLContext sslcontext = SSLContext.getInstance("TLS");
sslcontext.init(null, null, null);  // default key/trust managers and secure random
SSLSocketFactory factory = sslcontext.getSocketFactory();
SSLSocket socket = (SSLSocket) factory.createSocket("myserver", 443);

//socket.startHandshake();
SSLSession session = socket.getSession();  // triggers the handshake implicitly
session.getPeerCertificates();
socket.close();

If, however, I create the SSL context this way I have no connections failures on any of the Java versions I mentioned:

SSLSocketFactory factory = (SSLSocketFactory)SSLSocketFactory.getDefault();

The first way uses SSLContextImpl$TLS10Context and the latter uses SSLContextImpl$DefaultSSLContext. Looking at the code, I don't see any differences that would cause the exception to occur.

Why would I be getting the failures and what are the advantages/disadvantages of using the getDefault() call?

Note: The exceptions were first seen using the Apache HttpClient (version 4). This code is the smallest subset that reproduces the problem seen with HttpClient.

Here's the error I see when adding -Djavax.net.debug=ssl:

main, READ: TLSv1 Alert, length = 2
main, RECV TLSv1 ALERT:  fatal, bad_record_mac
%% Invalidated:  [Session-101, SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA]
main, called closeSocket()
main, handling exception: javax.net.ssl.SSLException: Received fatal alert: bad_record_mac
main, IOException in getSession():  javax.net.ssl.SSLException: Received fatal alert: bad_record_mac

Another piece of information is that the errors do not occur if I turn off Diffie-Hellman on the proxy server.
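
Since disabling Diffie-Hellman on the proxy avoids the failures, one server-side workaround is to restrict the proxy's cipher list so DHE suites are never negotiated. A minimal sketch, assuming the proxy is HAProxy 1.5 or later terminating SSL (the frontend name, certificate path and cipher list are illustrative, not from the original post):

frontend https_in
    # only RSA key-exchange suites are offered, so DHE is never negotiated
    bind *:443 ssl crt /etc/haproxy/cert.pem ciphers AES128-SHA:AES256-SHA
    default_backend app_servers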


Source: (StackOverflow)

Handling OPTIONS request in nginx

We're using HAProxy as a load balancer at the moment, and it regularly checks that the downstream boxes are alive using an OPTIONS request:

OPTIONS /index.html HTTP/1.0

I'm working on getting nginx set up as a reverse proxy with caching (using ncache). For some reason, nginx is returning a 405 when an OPTIONS request comes in:

192.168.1.10 - - [22/Oct/2008:16:36:21 -0700] "OPTIONS /index.html HTTP/1.0" 405 325 "-" "-" 192.168.1.10

When hitting the downstream webserver directly, I get a proper 200 response. My question is: how do you make nginx pass that response along to HAProxy, or how can I set the response in nginx.conf?
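
One workaround on the HAProxy side (a sketch; the backend and server names are illustrative) is to health-check with a method nginx answers natively, such as HEAD, instead of OPTIONS:

backend nginx_servers
    # nginx serves HEAD like GET, so the check sees a 200 instead of a 405
    option httpchk HEAD /index.html HTTP/1.0
    server web1 192.168.1.20:80 check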


Source: (StackOverflow)

HAProxy redirecting http to https (ssl)

I'm using HAProxy for load balancing and only want my site to support https. Thus, I'd like to redirect all requests on port 80 to port 443.

How would I do this?

Edit: We'd like to redirect to the same URL on https, preserving query params. Thus, http://foo.com/bar would redirect to https://foo.com/bar
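
A minimal sketch, assuming HAProxy 1.5 or later (which added "redirect scheme"); the redirect preserves the host, path and query string:

frontend http_in
    bind *:80
    # permanent redirect of any plain-HTTP request to the same URL over HTTPS
    redirect scheme https code 301 if !{ ssl_fc }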


Source: (StackOverflow)

Elastic Load Balancing in EC2 [closed]

It's been on the cards for a while, but now that Amazon has released Elastic Load Balancing (ELB), what are your thoughts on deploying this solution for a high-traffic web application?

Should we replace HAProxy or consider ELB as a complementary service in front of HAProxy?


Source: (StackOverflow)

Duplicate TCP traffic with a proxy

I need to send (duplicate) traffic from one machine (port) to two different machines (ports). I need to take care of TCP sessions as well.

In the beginning I used em-proxy (http://docs.engineyard.com/em-proxy.html), but it seems to me that the overhead is quite large (it goes over 50% of CPU). Then I installed haproxy (http://haproxy.1wt.eu/) and I managed to redirect traffic (not to duplicate it). The overhead is reasonable (less than 5%).

The problem is that I could not express the following in the haproxy config file:
- listen on a specific address:port, send whatever you receive to two different machines:ports, and discard the answers from one of them.

Em-proxy code for this is quite simple, but it seems to me that EventMachine generates a lot of overhead.

Before I dig into the haproxy code and try to change it (to duplicate traffic), I would like to know: is there something similar out there?

Thanks.


Source: (StackOverflow)

Any thoughts on RightScale and Scalr for dynamic EC2 instance management [closed]

I'm looking for a cost-effective tool for managing a web app on EC2. RightScale seems to be the big dog and charges for it. Scalr looks like a more cost-effective solution, but it's hard to find any real customer experiences.

The key aspects I'm looking for are a load balancer (HTTP and HTTPS) and a way to automatically bring additional web server capacity online as load increases, as well as terminate the instances when load falls off.

From what I can tell, lots of people are rolling their own stuff here. We're trying to release an app and don't really want to fight too many heavy sysadmin battles. Given the importance of performance etc., I'd be grateful to hear advice and experiences from the field.


Source: (StackOverflow)

HAProxy SSL configuration - install root and intermediate certificates

After much googling, I finally got my haproxy SSL to work. But now I have a problem: the root and intermediate certificates are not installed, so my SSL doesn't show the green bar.

My haproxy config

global
    maxconn     4096
    nbproc      1
    #debug
    daemon
    log         127.0.0.1    local0

defaults
    mode        http
    option      httplog
    log         global
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend unsecured
    bind 192.168.0.1:80
    timeout     client 86400000
    reqadd X-Forwarded-Proto:\ http
    default_backend www_backend

frontend secured
    mode http
    bind 192.168.0.1:443 ssl crt /etc/haproxy/cert.pem
    reqadd X-Forwarded-Proto:\ https
    default_backend www_backend

backend www_backend
    mode        http
    balance     roundrobin
    #cookie      SERVERID insert indirect nocache
    #option      forwardfor
    server      server1 192.168.0.2:80  weight 1 maxconn 1024 check
    server      server2 192.168.0.2:80  weight 1 maxconn 1024 check

192.168.0.1 is my load balancer IP. /etc/haproxy/cert.pem contains the private key and the domain certificate, e.g. www.domain.com.

There is another question about SSL configuration which includes a bundle.crt. When I contacted my SSL support, they told me I need to install the root and intermediate certificates.

According to the Comodo documentation, creating the bundle is as simple as merging their .crt files, which I did.

But when I try to change my haproxy config to

bind 192.168.0.1:443 ssl crt /etc/haproxy/cert.pem ca-file /path/to/bundle.crt

I'm getting an error that I can't use that config parameter on bind.

P.S. I'm using the 1.5-dev12 version. With the latest dev17 version I had problems even starting haproxy, as described in this post.
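
For reference, the usual resolution (a sketch): ca-file is for verifying client certificates, so it is the wrong knob here; instead, the crt PEM itself should carry the whole chain. A commonly used order is domain certificate, then intermediate(s), then the private key (file names are illustrative):

# build the PEM so the bind line serves the full chain:
#   cat www.domain.com.crt intermediate.crt www.domain.com.key > /etc/haproxy/cert.pem
bind 192.168.0.1:443 ssl crt /etc/haproxy/cert.pem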



Source: (StackOverflow)

How to configure HAProxy to send GET and POST HTTP requests to two different application servers

I am using a RESTful architecture. I have two application servers running. One should serve only GET requests and the other only POST requests. I want to configure HAProxy to route requests based on this condition. Please help me.
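
A minimal sketch of one way to do this with a method ACL (all names and addresses are illustrative):

frontend http_in
    bind *:80
    acl is_post method POST
    use_backend post_servers if is_post
    default_backend get_servers

backend get_servers
    server get1 10.0.0.1:8080 check

backend post_servers
    server post1 10.0.0.2:8080 check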


Source: (StackOverflow)

HAProxy redirect www to non-www

I'm currently using HAProxy to balance several express.js nodes. I know that it's possible to redirect using express.js, but I was hoping to do so with HAProxy.

I was wondering how I can do a permanent redirect from www.mysite.com to mysite.com?
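
"redirect prefix" keeps the original URI, so a permanent www-to-non-www redirect can be done entirely in HAProxy. A minimal sketch (the scheme and hostnames are illustrative):

frontend http_in
    bind *:80
    acl is_www hdr(host) -i www.mysite.com
    # 301 makes the redirect permanent; the path and query string are preserved
    redirect prefix http://mysite.com code 301 if is_www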


Source: (StackOverflow)

Load Balancing (HAProxy or other) - Sticky Sessions

I'm working on scaling out my app to multiple servers, and one requirement is that a client is always communicating with the same server (too much live data is used to allow bouncing between servers efficiently).

My current setup is a small server cluster (using Linode). I have a frontend node running HAProxy using "balance source" so that an IP is always pointed towards the same node.

I'm noticing that "balance source" does not produce a very even distribution. With my current test setup (2 backend servers), one server often has 3-4x as many connections when using a sample size of 80-100 source IPs.

Is there any way to achieve a more balanced distribution? Obviously sticky sessions prohibit a "perfect" balance, but a 40/60 split would be preferable to a 25/75 split.
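
If the traffic is plain HTTP and clients accept cookies, cookie-based stickiness usually distributes far more evenly than "balance source", because new clients are assigned round-robin rather than hashed by IP. A sketch (server names and addresses are illustrative):

backend app_servers
    balance roundrobin
    # HAProxy inserts a SERVERID cookie so each client sticks to one node
    cookie SERVERID insert indirect nocache
    server node1 10.0.0.1:80 cookie node1 check
    server node2 10.0.0.2:80 cookie node2 check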


Source: (StackOverflow)

HAProxy - URL Based routing with load balancing

I am new to HAProxy and I have a question about HAProxy configuration that will help me make a key decision about the right approach and the overall architecture.

I have 3 apps. Let's say app1, app2, app3.

Each app is differentiated by URL as follows:

www.example.com/app1/123 -> app1
www.example.com/app2/123 -> app2
www.example.com/app3/123 -> app3

I am planning to have 2 instances of each app in 2 different regions:

Region 1 - app1, app2, app3
Region 2 - app1, app2, app3

I see 2 methods to configure this but I am not sure which is the best practice here:

  • Method 1: Have HAProxy1 first differentiate the requests using the URL patterns. Requests from HAProxy1 would then be routed to another HAProxy server set up for each individual app (3 HAProxy servers in this case) for load balancing.

  • Method 2: Have one big HAProxy server which does both of the above. That is, configure it to segregate the requests by URL and then pass each request through an individual load-balancing section set up for each app.

I am not sure if Method 2 is supported in haproxy. Any ideas or suggestions are greatly appreciated. Please shed some light.
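
For what it's worth, Method 2 is supported: a single HAProxy instance can both split requests on the URL and load-balance each app across regions. A minimal sketch (names and addresses are illustrative):

frontend http_in
    bind *:80
    acl is_app1 path_beg /app1
    acl is_app2 path_beg /app2
    acl is_app3 path_beg /app3
    use_backend app1_pool if is_app1
    use_backend app2_pool if is_app2
    use_backend app3_pool if is_app3

backend app1_pool
    balance roundrobin
    server app1_region1 10.1.0.1:8080 check
    server app1_region2 10.2.0.1:8080 check
# app2_pool and app3_pool follow the same pattern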


Source: (StackOverflow)

Difference between global maxconn and server maxconn haproxy

I have a question about my haproxy config:

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log         127.0.0.1 syslog emerg
    maxconn     4000
    quiet
    user        haproxy
    group       haproxy
    daemon
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will 
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode        http
    log         global
    option      abortonclose
    option      dontlognull
    option      httpclose
    option      httplog
    option      forwardfor
    option      redispatch
    timeout connect 10000 # default 10 second time out if a backend is not found
    timeout client 300000 # 5 min timeout for client
    timeout server 300000 # 5 min timeout for server
    stats       enable

listen  http_proxy  localhost:81

    balance     roundrobin
    option      httpchk GET /empty.html
    server      server1 myip:80 maxconn 15 check inter 10000
    server      server2 myip:80 maxconn 15 check inter 10000

As you can see it is straightforward, but I am a bit confused about how the maxconn properties work.

There is the global one, and the maxconn on the server in the listen block. My thinking is this: the global one manages the total number of connections that haproxy, as a service, will queue or process at one time. If the number gets above that, does it kill the connection, or queue it in some Linux socket? I have no idea what happens if the number gets higher than 4000.

Then you have the server maxconn property set at 15. First off, I set that at 15 because my php-fpm, which this forwards to on a separate server, only has so many child processes it can use, so I make sure I am queuing the requests here instead of in php-fpm, which I think is faster.

But back on the subject: my theory about this number is that each server in this block will only be sent 15 connections at a time, and then the connections will wait for an open server. If I had cookies on, the connections would wait for the CORRECT open server. But I don't.

So questions are:

  1. What happens if the global connections get above 4000? Do they die? Or pool in Linux somehow?
  2. Are the global connections related to the server connections, other than the fact that you can't have a total number of server connections greater than the global value?
  3. When figuring out the global connections, shouldn't it be the number of connections added up in the server section, plus a certain percentage for queuing? Obviously you have other constraints on the connections, but really it is how many you want to send to the proxies?

Thank you in advance.
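
For reference, a sketch annotating how the two limits interact, as described in the HAProxy docs (the values are illustrative):

global
    maxconn 4000          # process-wide cap: excess connections are not killed,
                          # they wait in the kernel's TCP accept queue

defaults
    timeout queue 30s     # how long a request may wait for a free server slot
                          # before HAProxy gives up and returns a 503

listen http_proxy localhost:81
    balance roundrobin
    server server1 myip:80 maxconn 15   # per-server cap: excess requests queue
                                        # inside HAProxy until a slot frees up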


Source: (StackOverflow)

Load Balancing in Amazon EC2?

We've been fighting with HAProxy for a few days now in Amazon EC2; the experience has so far been great, but we're stuck on squeezing more performance out of the software load balancer. We're not exactly Linux networking whizzes (we're a .NET shop, normally), but we've so far held our own, attempting to set proper ulimits and inspecting kernel messages and tcpdumps for any irregularities. So far, though, we've reached a plateau of about 1,700 requests/sec, at which point client timeouts abound (we've been using and tweaking httperf for this purpose).

A coworker and I were listening to the most recent Stack Overflow podcast, in which the Reddit founders note that their entire site runs off one HAProxy node, and that it so far hasn't become a bottleneck. Ack! Either they're somehow not seeing that many concurrent requests, we're doing something horribly wrong, or the shared nature of EC2 is limiting the network stack of the EC2 instance (we're using a large instance type). Considering that both Joel and the Reddit founders agree that network will likely be the limiting factor, is it possible that's the limitation we're seeing?

Any thoughts are greatly appreciated!

Edit: It looks like the actual issue was not, in fact, with the load balancer node! The culprit was actually the nodes running httperf. As httperf builds and tears down a socket for each request, it spends a good amount of CPU time in the kernel. As we bumped the request rate higher, the TCP FIN timeout (60s by default) was keeping sockets around too long, and the ip_local_port_range default was too low for this usage scenario. Basically, after a few minutes of the client (httperf) node constantly creating and destroying new sockets, it ran out of unused ports, and subsequent 'requests' errored out at this stage, yielding low requests/sec numbers and a large number of errors.

We had also looked at nginx, but we've been working with RightScale, and they've got drop-in scripts for HAProxy. Oh, and we've got too tight a deadline [of course] to switch out components unless it proves absolutely necessary. Mercifully, being on AWS allows us to test out another setup using nginx in parallel (if warranted) and make the switch overnight later on.

This page describes each of the sysctl variables fairly well (ip_local_port_range and tcp_fin_timeout were the ones tuned in this case).
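
For reference, the two variables named above are typically tuned along these lines in /etc/sysctl.conf (the values shown are illustrative, not the poster's):

# widen the ephemeral port range available to the client
net.ipv4.ip_local_port_range = 1024 65535
# let FIN-WAIT-2 sockets be reclaimed sooner than the 60s default
net.ipv4.tcp_fin_timeout = 30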


Source: (StackOverflow)