Scalability interview questions
Top frequently asked scalability interview questions
Here is the only way I know to ask it at the moment. As I understand it, Scala uses the Java Virtual Machine; I thought JRuby did also. Twitter switched its middleware to Scala. Could they have done the same thing and used JRuby?
Could they have used JRuby from the start and avoided the scaling problems that caused them to move from Ruby to Scala in the first place? Do I not understand what JRuby is? I'm assuming that because JRuby runs on the JVM, it would have scaled where Ruby would not.
Does it all boil down to static versus dynamic typing, in this case?
Source: (StackOverflow)
How do you design/architect a scalable application? Any suggestions for books or websites that could help me understand how to scale out applications?
Source: (StackOverflow)
One approach to high scalability is to use network load balancing to split processing load between several servers.
One challenge with this approach arises when servers are state-aware, storing user state in a "session".
One solution to this problem is "sticky session" (aka "session affinity") where each user is assigned to a single server and his/her state data is contained on that server exclusively throughout the duration of the session.
What are the pros and cons of the "sticky session" approach? Do you use it, and if so, are you satisfied with it?
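For context, a minimal sketch of the mechanism, assuming a hypothetical backend pool and using the client IP as the affinity key (real balancers more often pin via a cookie):

```scala
// Sticky-routing sketch: hash a stable client identifier so the same
// client always lands on the same backend, where its session lives.
object StickyRouter {
  val backends = Vector("app1:8080", "app2:8080", "app3:8080") // hypothetical pool

  def route(clientIp: String): String = {
    val bucket = (clientIp.hashCode & Int.MaxValue) % backends.size // non-negative
    backends(bucket)
  }
}

// Every request from 10.0.0.7 goes to the same backend, so its
// in-memory session is always found:
// StickyRouter.route("10.0.0.7")
```

One con is already visible here: with naive modulo hashing, adding or removing a backend remaps most clients and strands their sessions, which is why consistent hashing or an external session store are common refinements.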
Source: (StackOverflow)
Do you know a good source to learn how to design SQL solutions?
Beyond the basic language syntax, I'm looking for something to help me understand:
- What tables to build and how to link them
- How to design for different scales (from a small client app to a huge distributed website)
- How to write effective / efficient / elegant SQL queries
Source: (StackOverflow)
Many Linux/Unix programming books and tutorials discuss the "Thundering Herd Problem", which happens when multiple threads or forked processes are blocked on a select() call, waiting for readability of a listening socket. When a connection comes in, all the threads and processes are woken up, but only one "wins" with a successful call to accept(); in the meantime, a lot of CPU time is wasted waking them all up for no reason.
I noticed a project that provides a "fix" for this problem in the Linux kernel, but it is a very old patch.
I think there are two variants: one where each process calls select() and then accept(), and one that just calls accept().
Do modern Unix/Linux kernels still have the Thundering Herd Problem in both these cases, or only in the "select() then accept()" version?
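For concreteness, here is a minimal sketch of the second variant, where every worker blocks directly in accept() on the same listening socket (the port and worker count are arbitrary). As far as I know, modern Linux wakes exactly one waiter per incoming connection for this pattern; it is the select()-then-accept() style where the herd historically stampeded:

```scala
import java.net.ServerSocket

// The "accept() in every worker" pattern from the question: several
// threads all blocked in accept() on one shared listening socket.
object AcceptPerWorker {
  def main(args: Array[String]): Unit = {
    val listener = new ServerSocket(9090) // arbitrary port
    for (id <- 1 to 4) {                  // arbitrary worker count
      new Thread(() => {
        while (true) {
          val sock = listener.accept() // kernel hands each connection to one waiter
          try sock.getOutputStream.write("hello\n".getBytes)
          finally sock.close()
        }
      }, s"worker-$id").start()
    }
  }
}
```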
Source: (StackOverflow)
I am referring to the algorithm that is used to give query suggestions when a user types a search term in Google.
I am mainly interested in how Google's algorithm is able to show:
1. Most important results (most likely queries rather than anything that matches)
2. Match substrings
3. Fuzzy matches
I know you could use a trie or a generalized trie to find matches, but it wouldn't meet the above requirements on its own...
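To make that gap concrete, here is a minimal weighted-trie sketch (the queries and popularity weights are made up). It handles ranked completion, i.e. requirement 1, but only for prefixes; nothing in it addresses substrings (2) or fuzzy matches (3), which is exactly why a plain trie falls short:

```scala
import scala.collection.mutable

// Weighted trie: ranked prefix completion only. Substring and fuzzy
// matching would need extra machinery on top of this.
class TrieNode {
  val children = mutable.Map[Char, TrieNode]()
  var weight: Long = 0 // > 0 marks a complete query; larger = more popular

  def insert(q: String, w: Long): Unit =
    if (q.isEmpty) weight = w
    else children.getOrElseUpdate(q.head, new TrieNode).insert(q.tail, w)

  // all completed queries under this node, with their weights
  private def collect(prefix: String): List[(String, Long)] = {
    val here = if (weight > 0) List(prefix -> weight) else Nil
    here ++ children.toList.flatMap { case (c, n) => n.collect(prefix + c) }
  }

  // top-k suggestions for a prefix, most popular first
  def suggest(prefix: String, k: Int): List[String] = {
    def walk(n: TrieNode, p: String): Option[TrieNode] =
      if (p.isEmpty) Some(n)
      else n.children.get(p.head).flatMap(walk(_, p.tail))
    walk(this, prefix).toList.flatMap(_.collect(prefix)).sortBy(-_._2).take(k).map(_._1)
  }
}

// val t = new TrieNode
// t.insert("scala", 90); t.insert("scalability", 70); t.insert("scale out", 40)
// t.suggest("sca", 2) // List("scala", "scalability")
```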
Source: (StackOverflow)
How does the actor model (in Akka) work when you need to perform I/O (i.e. a database operation)?
It is my understanding that a blocking operation will throw an exception (and essentially ruin all concurrency, due to the evented nature of Netty, which Akka uses). Hence I would have to use a Future or something similar; however, I don't understand the concurrency model.
- Can one actor process multiple messages simultaneously?
- If an actor makes a blocking call in a future (i.e. future.get()), does that block only the current actor's execution, or will it prevent execution on all actors until the blocking call has completed?
- If it blocks all execution, how does using a future assist concurrency (i.e. wouldn't invoking blocking calls in a future still amount to creating an actor and executing the blocking call)?
- What is the best way to deal with a multi-staged process (i.e. read from the database; call a blocking web service; read from the database; write to the database) where each step is dependent on the last?
The basic context is this:
- I'm using a WebSocket server which will maintain thousands of sessions.
- Each session has some state (e.g. authentication details);
- The Javascript client will send a JSON-RPC message to the server, which will pass it to the appropriate session actor, which will execute it and return a result.
- Execution of the RPC call will involve some I/O and blocking calls.
- There will be a large number of concurrent requests (each user will make a significant number of requests over the WebSocket connection, and there will be a lot of users).
Is there a better way to achieve this?
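A common pattern in classic (untyped) Akka addresses most of these questions at once: an actor processes exactly one message at a time, and blocking inside receive ties up a thread of the shared dispatcher, so blocking work goes onto a dedicated pool and its result is piped back to the actor as an ordinary message. Below is a sketch under those assumptions; SessionActor, Rpc, and queryDbAndCallService are made-up names:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.pipe

object BlockingIoSketch {
  // Dedicated pool so blocking calls cannot starve the actor dispatcher.
  val blockingEc: ExecutionContext =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(16))

  case class Rpc(id: Long)
  case class RpcResult(id: Long, payload: String)

  class SessionActor extends Actor {
    import context.dispatcher // execution context used by .map below

    def receive: Receive = {
      case Rpc(id) =>
        // Run the blocking steps off the actor's thread and pipe the
        // result back; the actor stays free to handle other messages.
        Future(queryDbAndCallService(id))(blockingEc)
          .map(RpcResult(id, _))
          .pipeTo(self)
      case RpcResult(id, payload) =>
        // Back inside the actor: safe to update session state,
        // because only one message is processed at a time.
        println(s"rpc $id finished: $payload")
    }

    // Stand-in for JDBC reads, a blocking web-service call, etc.
    private def queryDbAndCallService(id: Long): String = {
      Thread.sleep(50)
      s"row-$id"
    }
  }

  def main(args: Array[String]): Unit = {
    val system = ActorSystem("sketch")
    val session = system.actorOf(Props(new SessionActor), "session-1")
    (1L to 3L).foreach(id => session ! Rpc(id))
  }
}
```

Multi-stage flows (read, call a web service, read again, write) then become either a chain of such piped messages or a single for-comprehension over Futures on the blocking pool, with only the final result sent back to the actor.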
Source: (StackOverflow)
Alright, I've searched everywhere and I can't seem to find a detailed resource online for how to interpret the results from Apache's ab server benchmarking tool. I've run several tests with what I thought were drastically different parameters, but have seen very similar results (I have a hard time thinking that this means my site is scaling perfectly!). If there is a detailed resource someone could point me to, on how to understand the results from this test, or if someone feels like creating one here, I think that would be very useful to me and others.
Source: (StackOverflow)
There's a COTS (commercial off-the-shelf) application that I work on customizing, where a couple of pages take an extremely long time to load for certain distributions of data. (I'm talking approximately 3 minutes for a page to load in this instance... and the time is growing exponentially.)
Clearly this is unacceptable, but are there studies out there I can point to that establish what an acceptable response time is?
I'd like some good studies that discuss response time.
Source: (StackOverflow)
In a system I am currently working on, there is one process that loads a large amount of data into an array for sorting/aggregating/whatever. I know this process needs optimising for memory usage, but in the short term it just needs to work.
Given the amount of data loaded into the array, we keep hitting the memory limit. It has been increased several times, and I am wondering: is there a point where increasing it becomes generally a bad idea, or is it only a matter of how much RAM the machine has?
The machine has 2GB of RAM and the memory_limit is currently set at 1.5GB. We can easily add more RAM to the machine (and will anyway).
Have others encountered this kind of issue, and what were the solutions?
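The question is PHP-specific, but the usual way out is language-independent: stream the data and keep only running aggregates in memory, instead of raising memory_limit again. A minimal sketch of the idea in Scala (fetchRows is a stand-in for an unbuffered database cursor or a line-by-line file reader):

```scala
import scala.util.Random

// Aggregate a large dataset in fixed-size chunks instead of loading
// it all into one in-memory array.
object ChunkedAggregation {
  def fetchRows(): Iterator[Double] =
    Iterator.fill(10000000)(Random.nextDouble()) // hypothetical data source

  def main(args: Array[String]): Unit = {
    var count = 0L
    var sum = 0.0
    // grouped() materialises at most 100000 rows at a time
    fetchRows().grouped(100000).foreach { chunk =>
      count += chunk.size
      sum += chunk.sum
    }
    println(s"rows=$count mean=${sum / count}")
  }
}
```

In PHP the same shape is an unbuffered query loop that updates running totals, so peak memory stays bounded by the chunk size no matter how large the dataset grows.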
Source: (StackOverflow)
As someone in the world of HPC who came from the world of enterprise web development, I'm always curious to see how developers back in the "real world" are taking advantage of parallel computing. This is much more relevant now that all chips are going multicore, and it'll be even more relevant when there are thousands of cores on a chip instead of just a few.
My questions are:
1. How does this affect your software roadmap?
2. I'm particularly interested in real stories about how multicore is affecting different software domains, so specify what kind of development you do in your answer (e.g. server side, client-side apps, scientific computing, etc.).
3. What are you doing with your existing code to take advantage of multicore machines, and what challenges have you faced? Are you using OpenMP, Erlang, Haskell, CUDA, TBB, UPC or something else?
4. What do you plan to do as concurrency levels continue to increase, and how will you deal with hundreds or thousands of cores?
5. If your domain doesn't easily benefit from parallel computation, then explaining why is interesting, too.
Finally, I've framed this as a multicore question, but feel free to talk about other types of parallel computing. If you're porting part of your app to use MapReduce, or if MPI on large clusters is the paradigm for you, then definitely mention that, too.
Update: If you do answer #5, mention whether you think things will change if there get to be more cores (100, 1000, etc.) than you can feed with available memory bandwidth (seeing as how bandwidth per core keeps shrinking). Can you still use the remaining cores for your application?
Source: (StackOverflow)
This question is meant to serve as a list of databases and their configurations that the major web sites use and would be a great reference for anyone thinking of scaling their web site to the size of Twitter, Facebook or even Google.
Please keep your answers to a minimum and be sure to cite any sources used.
EDIT:
Also, please bold both the website name and the database for easier scanning.
Source: (StackOverflow)
For a complex web application that includes dynamic content and personalization, what is a good response time from the server (so excluding network latency and browser rendering time)? I'm thinking about sites like Facebook, Amazon, MyYahoo, etc. A related question is what is a good response time for a backend service?
Source: (StackOverflow)
[ ! ] Question is still open
PROBLEM:
WebRTC gives us peer-to-peer video/audio connections. It is perfect for p2p calls and hangouts. But what about broadcasting (one-to-many, for example, 1-to-10000)?
Let's say we have a broadcaster "B" and two attendees "A1", "A2". Of course it seems to be solvable: we just connect B with A1 and then B with A2, so B sends one video/audio stream directly to A1 and another stream to A2. B sends the streams twice.
Now let's imagine there are 10000 attendees: A1, A2, ..., A10000. That means B must send 10000 streams. Each stream is ~40 KB/s, which means B needs 400 MB/s (roughly 3.2 Gbit/s) of outgoing bandwidth to maintain this broadcast. Unacceptable.
ORIGINAL QUESTION (OBSOLETE)
Is it possible to solve this somehow, so that B sends only one stream to some server and attendees just pull the stream from that server? Yes, this means the outgoing bandwidth on that server must be high, but I can maintain it.
Or maybe this means ruining the whole WebRTC idea?
[ ! ] UP-TO-DATE QUESTION
- Solve CPU/Bandwidth - Is there a server-less solution (aka multicasting or something similar)?
- Solve CPU - Is it possible to encode the stream only once and send it to peers?
- Solve CPU/Bandwidth - Multicasting is definitely possible, but does it actually work in real life (latency, network instability)?
NOTES
Flash does not meet my needs, due to poor UX for end customers.
SOLUTION
26.05.2015 - At the moment there is no solution for scalable WebRTC broadcasting that avoids media servers entirely. There are server-side solutions, as well as hybrid (p2p + server-side, depending on conditions) solutions, on the market.
There are some promising technologies, though, like https://github.com/muaz-khan/WebRTC-Scalable-Broadcast but they still need to address these potential issues: latency, overall network connection stability, and their scaling formula (they are probably not infinitely scalable).
Source: (StackOverflow)
I have used PHP for a while now, and have used it well with CodeIgniter, which is a great framework. I am starting on a new personal project, and the last time I was considering what to use (PHP vs RoR) I chose PHP because of the scalability problems I heard RoR had, especially after reading what the Twitter devs had to say about it. Is scalability still an issue in RoR, or have there been improvements?
I would like to learn a new language, and RoR seems interesting. PHP gets the job done, but as everyone knows its syntax and organization are fugly, and it feels like one big hack.
Source: (StackOverflow)