rate-limiting interview questions
Top frequently asked rate-limiting interview questions
I want to use the Twitter Search API, version 1.1.
There is a limit of 450 requests per application.
But I have a doubt about this rate limiting. I thought it means we can make 450 requests in every 15-minute request window.
But I read (though I am not sure exactly what I read) something like: by default it returns 15 statuses per search query, but if you request more statuses in a single request, the limit is counted based on the number of statuses.
Do they have a rate limit only per 15-minute window, or do they have a rate limit per day too?
So I couldn't understand how exactly it works. Can anyone help me with this?
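For context, a quick way to see how the window is being counted is to inspect the rate-limit headers Twitter returns with every API v1.1 response. A minimal sketch in Python, assuming the requests library and a placeholder bearer token:

import requests

# Placeholder token; substitute real credentials.
headers = {"Authorization": "Bearer YOUR_BEARER_TOKEN"}
resp = requests.get(
    "https://api.twitter.com/1.1/search/tweets.json",
    params={"q": "python", "count": 100},
    headers=headers,
)

# API v1.1 responses carry per-window rate-limit headers.
print(resp.headers.get("x-rate-limit-limit"))      # requests allowed per 15-minute window
print(resp.headers.get("x-rate-limit-remaining"))  # requests left in the current window
print(resp.headers.get("x-rate-limit-reset"))      # epoch time at which the window resets

Watching x-rate-limit-remaining decrease by one per request, regardless of count, would settle whether the limit is counted per request or per status.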
Source: (StackOverflow)
Suppose I have a Java server application which implements a few web services, and suppose also that I can authenticate application users.
Now I would like to add some per-user limits on service usage: e.g. a rate limit (requests per second), a maximum request size, etc.
Is there any "ready-to-use" library in Java to do that?
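Guava's RateLimiter is one frequently mentioned Java option for the requests-per-second part. The token-bucket idea such limiters implement is sketched below in Python for brevity; the class and parameters are illustrative, not any particular library's API:

import time
import threading

class TokenBucket(object):
    # Allows `rate` calls per second on average, with bursts up to `capacity`.
    def __init__(self, rate, capacity):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.last = time.time()
        self.lock = threading.Lock()

    def allow(self):
        with self.lock:
            now = time.time()
            # Refill tokens in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

# Per-user usage: keep one bucket per authenticated user.
limiter = TokenBucket(rate=5, capacity=10)
if not limiter.allow():
    pass  # reject the call, e.g. with HTTP 429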
Source: (StackOverflow)
Anyone have any idea how I might go about this? I'm having a pretty hard time finding information online. The best I found is the curbit gem, but I can only think of how to implement that at the application level.
Source: (StackOverflow)
I am looking for a way to rate-limit an HTTP API. For nginx there is already a module, HttpLimitReqModule, that supports this feature. But according to the documentation, this module only supports per-second and per-minute limits. Is there any solution for per-hour or per-day limits?
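One common workaround when the web server can't express the window is a fixed-window counter in the application (or in a shared store such as Redis when there are multiple workers). A minimal in-process sketch, with illustrative limit and window values:

import time
from collections import defaultdict

WINDOW = 3600   # seconds; use 86400 for a per-day limit
LIMIT = 1000    # requests allowed per client per window
counters = defaultdict(int)

def allowed(client_ip):
    # All requests from one client in the same hour share one bucket.
    # (A real deployment would also purge stale window keys.)
    window_key = (client_ip, int(time.time()) // WINDOW)
    counters[window_key] += 1
    return counters[window_key] <= LIMIT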
Source: (StackOverflow)
To avoid abuse I'd like to add rate limiting to the REST API in our Rails application. After doing a bit of research, it looks like the best practice is to move this responsibility into the web server rather than checking for it in the application itself. Unfortunately this can't be done in my case, as I'm hosting the application on Heroku and so have no control over the web server setup.
What should be done in this case to stop abuse of the API?
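When the web server is out of reach, the usual move is Rack middleware (rack-attack and rack-throttle are the commonly cited gems). The check such middleware performs is a sliding-window count per client; a sketch of that check in Python, with illustrative limits:

import time
from collections import defaultdict, deque

MAX_REQUESTS = 100   # illustrative: 100 requests...
WINDOW = 60          # ...per rolling 60-second window, per client

request_log = defaultdict(deque)  # client id -> timestamps of recent requests

def allowed(client_id):
    now = time.time()
    log = request_log[client_id]
    # Discard timestamps that have slid out of the window.
    while log and log[0] <= now - WINDOW:
        log.popleft()
    if len(log) >= MAX_REQUESTS:
        return False  # respond with HTTP 429 Too Many Requests
    log.append(now)
    return True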
Source: (StackOverflow)
One of the Additional HTTP Status Codes (RFC 6585) is 429 Too Many Requests.
Where can I find examples of HTTP/REST API rate-limiting response headers that are useful with this HTTP response status?
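There is no single standardized set of header names, but a widely used convention pairs 429 with the standard Retry-After header plus X-RateLimit-* counters, along the lines of:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 3600
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1372700873

{"message": "API rate limit exceeded"}

Retry-After comes from the HTTP spec itself; the X-RateLimit-* names are a de facto convention, and some APIs use RateLimit-* or x-rate-limit-* variants instead.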
Source: (StackOverflow)
Given a number of threads, I want to limit the rate of calls to the worker function to, say, one per second.
My idea was to keep track of the last time a call was made across all threads and compare this to the current time in each thread. Then, if current_time - last_time < rate, I let the thread sleep for a bit. Something is wrong with my implementation - I presume I may have gotten the wrong idea about how locks work.
My code:
from Queue import Queue
from threading import Thread, Lock
import time

num_worker_threads = 2
rate = 1
q = Queue()
lock = Lock()
last_time = [time.time()]

def do_work(i, idx):
    # Do work here; print is just a dummy.
    print('Thread: {0}, Item: {1}, Time: {2}'.format(i, idx, time.time()))

def worker(i):
    while True:
        lock.acquire()
        current_time = time.time()
        interval = current_time - last_time[0]
        last_time[0] = current_time
        if interval < rate:
            time.sleep(rate - interval)
        lock.release()
        item = q.get()
        do_work(i, item)
        q.task_done()

for i in range(num_worker_threads):
    t = Thread(target=worker, args=[i])
    t.daemon = True
    t.start()

for item in xrange(10):
    q.put(item)

q.join()
I was expecting to see one call per second to do_work; however, I mostly get 2 calls at the same time (one for each thread), followed by a one-second pause. What is wrong?
OK, an edit. The advice to simply throttle the rate at which items are put in the queue was good; however, I remembered that I had to take care of the case in which items are re-added to the queue by the workers. Canonical examples: pagination, or back-off-and-retry in network tasks. I came up with the following. I guess that for actual network tasks the eventlet/gevent libraries may be easier on resources, but this is just an example. It basically uses a priority queue to pile up the requests, and uses an extra thread to shovel items from the pile to the actual task queue at an even rate. I simulated re-insertion into the pile by the workers; re-inserted items are then treated first.
import time
import random
from Queue import Queue, PriorityQueue
from threading import Thread

rate = 0.1

def worker(q, q_pile, idx):
    while True:
        item = q.get()
        print("Thread: {0} processed: {1}".format(idx, item[1]))
        if random.random() > 0.3:
            print("Thread: {0} reinserting item: {1}".format(idx, item[1]))
            q_pile.put((-1 * time.time(), item[1]))
        q.task_done()

def schedule(q_pile, q):
    while True:
        if not q_pile.empty():
            print("Items on pile: {0}".format(q_pile.qsize()))
            q.put(q_pile.get())
            q_pile.task_done()
        time.sleep(rate)

def main():
    q_pile = PriorityQueue()
    q = Queue()
    for i in range(5):
        t = Thread(target=worker, args=[q, q_pile, i])
        t.daemon = True
        t.start()
    t_schedule = Thread(target=schedule, args=[q_pile, q])
    t_schedule.daemon = True
    t_schedule.start()
    [q_pile.put((-1 * time.time(), i)) for i in range(10)]
    q_pile.join()
    q.join()

if __name__ == '__main__':
    main()
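For what it's worth, the usual diagnosis of the original version is that last_time[0] is overwritten before the sleep, so the stored timestamp never reflects when a thread actually proceeded. A minimal fix, reusing the question's names, is to sleep first and record the time afterwards while still holding the lock:

def worker(i):
    while True:
        lock.acquire()
        interval = time.time() - last_time[0]
        if interval < rate:
            time.sleep(rate - interval)
        # Record the moment this thread is actually released, so the
        # next thread measures its gap from here, not from lock entry.
        last_time[0] = time.time()
        lock.release()
        item = q.get()
        do_work(i, item)
        q.task_done()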
Source: (StackOverflow)
I've run into rate limiting from Amazon EMR a few times via the boto API, with the following:
boto.exception.EmrResponseError: EmrResponseError: 400 Bad Request
<ErrorResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31">
<Error>
<Type>Sender</Type>
<Code>Throttling</Code>
<Message>Rate exceeded</Message>
</Error>
<RequestId>69d74a63-7de3-11e0-aafc-2b540b1e5f42</RequestId>
</ErrorResponse>
The operation is a one-time request for the state of a jobflow, so there shouldn't be any rate limiting involved. Has anyone else run into this issue? Also, there doesn't seem to be much documentation on EC2 and EMR throttling/rate limiting...
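These throttling limits are typically applied per AWS account, so other callers or tools sharing the account can trip them even for a one-off call. The standard remedy is to retry with exponential backoff; a sketch below, where the error_code attribute follows boto's BotoServerError and the describe call in the comment is hypothetical usage:

import time
from boto.exception import EmrResponseError

def with_backoff(call, max_retries=5):
    # Retry `call` on EMR throttling errors, doubling the wait each time.
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return call()
        except EmrResponseError, e:  # Python 2, matching the boto usage above
            if e.error_code != 'Throttling' or attempt == max_retries - 1:
                raise
            time.sleep(delay)
            delay *= 2

# Hypothetical usage, given an EMR connection and a jobflow id:
# jobflow = with_backoff(lambda: conn.describe_jobflow(jobflow_id))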
Source: (StackOverflow)
I'm trying to get the number of followers of each follower of a specific account (with the goal of finding the most influential followers). I'm using Tweepy in Python, but I am running into the API rate limits: I can only get the number of followers for 5 followers before I am cut off. The account I'm looking at has about 2000 followers. Is there any way to get around this?
My code snippet is:

ids = api.followers_ids(account_name)
for id in ids:
    more = api.followers_ids(id)
    print len(more)
Thanks
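Two things that may help, sketched below for Tweepy 3.x: batch the ids through lookup_users (up to 100 users per call, so roughly 20 calls for 2000 followers instead of 2000) and read followers_count off each returned user object; wait_on_rate_limit makes Tweepy sleep through any window it still hits. The auth handler and account_name are assumed from the question's setup:

import tweepy

# auth is an already-configured tweepy auth handler, as in the question's setup.
api = tweepy.API(auth, wait_on_rate_limit=True)  # sleep until the window resets

ids = api.followers_ids(account_name)
counts = {}
for i in range(0, len(ids), 100):
    # One API call fetches up to 100 user objects, each with followers_count.
    for user in api.lookup_users(user_ids=ids[i:i + 100]):
        counts[user.screen_name] = user.followers_count

# Most influential followers first.
print(sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:10])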
Source: (StackOverflow)
I am using rsyslog (rsyslog-7.4.7-7.el7_0.x86_64) on CentOS 7 (CentOS Linux release 7.1.1503 (Core)). We have some applications on it which use the syslog framework for logging. We have a lot of logs; at peak, there can be up to 50,000 logs in one second.
Our system was previously running on CentOS 6.2 (with rsyslog 5.8) and we never observed any drops. After doing some searching, we found that there is rate limiting. We are getting messages like "imjournal: begin to drop messages due to rate-limiting" in /var/log/messages, and then "imjournal: 130886 messages lost due to rate-limiting". We tried different ways to disable or tune it, without success. We tried the following.
1) Changes in /etc/rsyslog.conf
$ModLoad imjournal # provides access to the systemd journal
$imjournalRatelimitInterval 1
$imjournalRatelimitBurst 50000
Some other info from rsyslog.conf follows; we didn't change anything here:
$OmitLocalLogging on
$IMJournalStateFile imjournal.state
We also saw that there is some rate limiting with imuxsock, but we understand that it won't be used when OmitLocalLogging is on.
2) Changes in /etc/systemd/journald.conf
Storage=auto
RateLimitInterval=1s
RateLimitBurst=100000
Our application has modules in Java (using SLF4J and log4j) and modules in C/C++ (using the syslog() call). For the C/C++ modules we are missing DEBUG logs most of the time, but the DEBUG logs of the Java modules always seem to be fine.
The version of systemd is "systemd-208-20.el7.x86_64". The application and rsyslogd are on the same machine.
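One more setting worth noting: with an interval of 1 second and a burst of 50000, anything beyond 50000 messages in a single second is still dropped. The imjournal documentation describes a rate-limit interval of 0 as disabling the limiter entirely, i.e. something like:

$ModLoad imjournal
# 0 disables imjournal rate limiting altogether; alternatively keep an
# interval and raise the burst well above the observed 50000/s peak.
$imjournalRatelimitInterval 0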
Source: (StackOverflow)
I'm using rack-throttle as a rate-limiting engine in my Rails 3 application. I've created my own class, based on Rack::Throttle::Interval, to define custom rate-limiting logic. I'm checking whether the request is made to an exact controller and an exact action. This works fine if I make a GET request; however, if I send a POST request, I get some problems.
class CustomLimiter < Rack::Throttle::Interval
  def allowed?(request)
    path_info = Rails.application.routes.recognize_path request.url rescue path_info = {}
    if path_info[:controller] == "some_controller" and path_info[:action] == "some_action"
      super
    else
      true
    end
  end
end
Here are my controller actions:

def question
  # user is redirected here
end

def check_answer
  # some logic to check the answer
  redirect_to question_path
end
My routes:
get "questions" => "application#question", :as => "question"
post "check_answer" => "application#check_answer", :as => "check_answer"
EDIT:
The problem is that POST requests are coming to the application, so the method allowed? is called. But when I call Rails.application.routes.recognize_path, I get a Route set not finalized exception. How can I prevent a user from sending a lot of POST requests to the exact action of the exact controller with the help of rack-throttle?
The middleware is added in application.rb
class Application < Rails::Application
  # Set up rate limiting
  config.require "ip_limiter"
  config.require "ip_user_agent_limiter"
  config.middleware.use IpLimiter, :min => 0.2
  config.middleware.use IpUserAgentLimiter, :min => 2
end
Both IpLimiter and IpUserAgentLimiter are derived from CustomLimiter.
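One way around the Route set not finalized problem is to not consult the router inside allowed? at all, and instead match the request method and path directly. The shape of that check, sketched in Python since the idea is middleware-generic (the path and interval here are illustrative):

import time

MIN_INTERVAL = 2.0   # seconds between accepted POSTs (illustrative)
_last = [0.0]

def allowed(method, path):
    # Match the route by method and path, without asking the router.
    if method != 'POST' or path != '/check_answer':
        return True   # leave every other request untouched
    now = time.time()
    if now - _last[0] < MIN_INTERVAL:
        return False  # too soon: throttle this POST
    _last[0] = now
    return True

In Rack terms this corresponds to checking request.post? and request.path inside allowed? instead of calling recognize_path.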
Source: (StackOverflow)
I have a request-promise function that makes a request to an API. I'm rate-limited by this API and I keep getting the error message:
Exceeded 2 calls per second for api client. Reduce request rates to resume uninterrupted service.
I'm running a couple of Promise.each loops in parallel, which is causing the issue; if I run just one instance of Promise.each, everything runs fine. These Promise.each calls all lead to the same function containing a request-promise call. I want to wrap this function with another queue function and set the interval to 500 milliseconds, so that requests aren't made one right after another, or in parallel, but spaced at that interval, on a queue. The thing is, I still need these promises to get their contents, even if it takes a rather long time to get a response.
Is there anything that will do this for me? Something I can wrap a function in so that it responds at a set interval, not in parallel, and doesn't fire calls one right after another?
Update: Perhaps it does need to be promise-specific. I tried to use underscore's throttle function:
var debug = require("debug")("throttle")
var _ = require("underscore")
var request = require("request-promise")

function requestSite(){
    debug("request started")
    function throttleRequest(){
        return request({
            "url": "https://www.google.com"
        }).then(function(response){
            debug("request finished")
        })
    }
    return _.throttle(throttleRequest, 100)
}

requestSite()
requestSite()
requestSite()
And all I got back was this:
$ DEBUG=* node throttle.js
throttle request started +0ms
throttle request started +2ms
throttle request started +0ms
Source: (StackOverflow)
Hi, I am using ColdFusion to call the last.fm API, using a cfc bundle sourced from here.
I am concerned about going over the request limit, which is 5 requests per originating IP address per second, averaged over a 5-minute period.
The cfc bundle has a central component which calls all the other components, which are split up into sections like "artist", "track", etc. This central component, "lastFmApi.cfc", is initiated in my application and persisted for the lifespan of the application:
// Application.cfc example
<cffunction name="onApplicationStart">
    <cfset var apiKey = '[your api key here]' />
    <cfset var apiSecret = '[your api secret here]' />
    <cfset application.lastFm = CreateObject('component', 'org.FrankFusion.lastFm.lastFmApi').init(apiKey, apiSecret) />
</cffunction>
Now if I want to call the API through a handler/controller, for example my artist handler, I can do this:
<cffunction name="artistPage" cache="5 mins">
    <cfset qAlbums = application.lastFm.user.getArtist(url.artistName) />
</cffunction>
I am a bit confused about caching. I am caching each call to the API in this handler for 5 minutes, but does this make any difference? Each time someone hits a new artist page, won't that still count as a fresh hit against the API?
I am wondering how best to tackle this.
Thanks
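On the caching question: the 5-minute cache does help for repeat visits to the same artist, but every distinct artist is still a fresh API hit, so the client should also pace its own outgoing calls. A language-agnostic sketch of that pacing in Python, where the 0.2-second floor corresponds to last.fm's 5 requests per second:

import time
import threading

MIN_INTERVAL = 0.2   # 5 calls per second
_lock = threading.Lock()
_last_call = [0.0]

def paced(fn, *args):
    # Run fn(*args), but never more often than once per MIN_INTERVAL.
    with _lock:
        wait = MIN_INTERVAL - (time.time() - _last_call[0])
        if wait > 0:
            time.sleep(wait)
        _last_call[0] = time.time()
    return fn(*args)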
Source: (StackOverflow)
How can a function rate-limit its calls? The calls should not be discarded if too frequent, but rather queued up and spaced out in time, X milliseconds apart. I've looked at throttle and debounce, but they discard calls instead of queuing them up to be run in the future.
Is there any better solution than a queue with a process() method set on an X-millisecond interval? Are there standard implementations of this in JS frameworks? I've looked at underscore.js so far - nothing.
Source: (StackOverflow)