rate-limiting interview questions
Top frequently asked rate-limiting interview questions
I want to use the Twitter Search API, version 1.1.
There is a limit of 450 requests per application.
But I have a doubt about this rate limiting. I thought it means we can make 450 requests in every 15-minute request window.
But I read (though I am not sure exactly what I read) something like: by default it returns 15 statuses per search query, but if you request more statuses in a single request, the limit is counted based on the number of statuses.
Do they have a rate limit only per 15-minute window, or do they have a rate limit per day too?
So I couldn't understand how exactly it works. Can anyone help me with this?
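For context, a quick way to see how the window is being counted is to inspect the rate-limit headers Twitter returns with every API v1.1 response. A minimal sketch in Python, assuming the requests library and a placeholder bearer token:

import requests

# Placeholder token; substitute real credentials.
headers = {"Authorization": "Bearer YOUR_BEARER_TOKEN"}
resp = requests.get(
    "https://api.twitter.com/1.1/search/tweets.json",
    params={"q": "python", "count": 100},
    headers=headers,
)

# API v1.1 responses carry per-window rate-limit headers.
print(resp.headers.get("x-rate-limit-limit"))      # requests allowed per 15-minute window
print(resp.headers.get("x-rate-limit-remaining"))  # requests left in the current window
print(resp.headers.get("x-rate-limit-reset"))      # epoch time at which the window resets

Watching x-rate-limit-remaining decrease by one per request, regardless of count, would settle whether the limit is counted per request or per status.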
Source: (StackOverflow)
Suppose I have a Java server application which implements a few web services, and suppose also that I can authenticate application users.
Now I would like to add some per-user limits on service usage: e.g. a rate limit (requests per second), a maximum request size, etc.
Is there any "ready-to-use" library in Java to do that?
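Guava's RateLimiter is one frequently mentioned Java option for the requests-per-second part. The token-bucket idea such limiters implement is sketched below in Python for brevity; the class and parameters are illustrative, not any particular library's API:

import time
import threading

class TokenBucket(object):
    # Allows `rate` calls per second on average, with bursts up to `capacity`.
    def __init__(self, rate, capacity):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.last = time.time()
        self.lock = threading.Lock()

    def allow(self):
        with self.lock:
            now = time.time()
            # Refill tokens in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

# Per-user usage: keep one bucket per authenticated user.
limiter = TokenBucket(rate=5, capacity=10)
if not limiter.allow():
    pass  # reject the call, e.g. with HTTP 429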
Source: (StackOverflow)
Anyone have any idea how I might go about this? I'm having a pretty hard time finding information online. The best I found is the curbit gem, but I can only think of how to implement that at the application level.
Source: (StackOverflow)
I am looking for a way to rate-limit an HTTP API. For nginx there is already a module, HttpLimitReqModule, that supports this feature. But according to the documentation, this module only supports per-second and per-minute limits. Is there any solution for per-hour or per-day limits?
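One common workaround when the web server can't express the window is a fixed-window counter in the application (or in a shared store such as Redis when there are multiple workers). A minimal in-process sketch, with illustrative limit and window values:

import time
from collections import defaultdict

WINDOW = 3600   # seconds; use 86400 for a per-day limit
LIMIT = 1000    # requests allowed per client per window
counters = defaultdict(int)

def allowed(client_ip):
    # All requests from one client in the same hour share one bucket.
    # (A real deployment would also purge stale window keys.)
    window_key = (client_ip, int(time.time()) // WINDOW)
    counters[window_key] += 1
    return counters[window_key] <= LIMIT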
Source: (StackOverflow)
To avoid abuse I'd like to add rate limiting to the REST API in our Rails application. After doing a bit of research, it looks like the best practice is to move this responsibility into the web server rather than checking for it in the application itself. Unfortunately this can't be done in my case, as I'm hosting the application on Heroku and so have no control over the web server setup.
What should be done in this case to stop abuse of the API?
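When the web server is out of reach, the usual move is Rack middleware (rack-attack and rack-throttle are the commonly cited gems). The check such middleware performs is a sliding-window count per client; a sketch of that check in Python, with illustrative limits:

import time
from collections import defaultdict, deque

MAX_REQUESTS = 100   # illustrative: 100 requests...
WINDOW = 60          # ...per rolling 60-second window, per client

request_log = defaultdict(deque)  # client id -> timestamps of recent requests

def allowed(client_id):
    now = time.time()
    log = request_log[client_id]
    # Discard timestamps that have slid out of the window.
    while log and log[0] <= now - WINDOW:
        log.popleft()
    if len(log) >= MAX_REQUESTS:
        return False  # respond with HTTP 429 Too Many Requests
    log.append(now)
    return True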
Source: (StackOverflow)
One of the Additional HTTP Status Codes (RFC 6585) is 429 Too Many Requests.
Where can I find examples of HTTP/REST API rate-limiting response headers that are useful with this HTTP response status?
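There is no single standardized set of header names, but a widely used convention pairs 429 with the standard Retry-After header plus X-RateLimit-* counters, along the lines of:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 3600
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1372700873

{"message": "API rate limit exceeded"}

Retry-After comes from the HTTP spec itself; the X-RateLimit-* names are a de facto convention, and some APIs use RateLimit-* or x-rate-limit-* variants instead.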
Source: (StackOverflow)
Given a number of threads, I want to limit the rate of calls to the worker function to, say, one per second.
My idea was to keep track of the last time a call was made across all threads and compare this to the current time in each thread. Then, if current_time - last_time < rate, I let the thread sleep for a bit. Something is wrong with my implementation - I presume I may have gotten the wrong idea about how locks work.
My code:
from Queue import Queue
from threading import Thread, Lock
import time

num_worker_threads = 2
rate = 1
q = Queue()
lock = Lock()
last_time = [time.time()]

def do_work(i, idx):
    # Do work here; print is just a dummy.
    print('Thread: {0}, Item: {1}, Time: {2}'.format(i, idx, time.time()))

def worker(i):
    while True:
        lock.acquire()
        current_time = time.time()
        interval = current_time - last_time[0]
        last_time[0] = current_time
        if interval < rate:
            time.sleep(rate - interval)
        lock.release()
        item = q.get()
        do_work(i, item)
        q.task_done()

for i in range(num_worker_threads):
    t = Thread(target=worker, args=[i])
    t.daemon = True
    t.start()

for item in xrange(10):
    q.put(item)

q.join()
I was expecting to see one call per second to do_work; however, I mostly get 2 calls at the same time (one for each thread), followed by a one-second pause. What is wrong?
OK, an edit. The advice to simply throttle the rate at which items are put in the queue was good; however, I remembered that I had to take care of the case in which items are re-added to the queue by the workers. Canonical examples: pagination, or back-off-and-retry in network tasks. I came up with the following. I guess that for actual network tasks the eventlet/gevent libraries may be easier on resources, but this is just an example. It basically uses a priority queue to pile up the requests, and uses an extra thread to shovel items from the pile to the actual task queue at an even rate. I simulated re-insertion into the pile by the workers; re-inserted items are then treated first.
import time
import random
from Queue import Queue, PriorityQueue
from threading import Thread

rate = 0.1

def worker(q, q_pile, idx):
    while True:
        item = q.get()
        print("Thread: {0} processed: {1}".format(idx, item[1]))
        if random.random() > 0.3:
            print("Thread: {0} reinserting item: {1}".format(idx, item[1]))
            q_pile.put((-1 * time.time(), item[1]))
        q.task_done()

def schedule(q_pile, q):
    while True:
        if not q_pile.empty():
            print("Items on pile: {0}".format(q_pile.qsize()))
            q.put(q_pile.get())
            q_pile.task_done()
        time.sleep(rate)

def main():
    q_pile = PriorityQueue()
    q = Queue()
    for i in range(5):
        t = Thread(target=worker, args=[q, q_pile, i])
        t.daemon = True
        t.start()
    t_schedule = Thread(target=schedule, args=[q_pile, q])
    t_schedule.daemon = True
    t_schedule.start()
    [q_pile.put((-1 * time.time(), i)) for i in range(10)]
    q_pile.join()
    q.join()

if __name__ == '__main__':
    main()
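For what it's worth, the usual diagnosis of the original version is that last_time[0] is overwritten before the sleep, so the stored timestamp never reflects when a thread actually proceeded. A minimal fix, reusing the question's names, is to sleep first and record the time afterwards while still holding the lock:

def worker(i):
    while True:
        lock.acquire()
        interval = time.time() - last_time[0]
        if interval < rate:
            time.sleep(rate - interval)
        # Record the moment this thread is actually released, so the
        # next thread measures its gap from here, not from lock entry.
        last_time[0] = time.time()
        lock.release()
        item = q.get()
        do_work(i, item)
        q.task_done()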
Source: (StackOverflow)
I've run into rate limiting from Amazon EMR a few times via the boto API, with the following:
boto.exception.EmrResponseError: EmrResponseError: 400 Bad Request
<ErrorResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31">
<Error>
<Type>Sender</Type>
<Code>Throttling</Code>
<Message>Rate exceeded</Message>
</Error>
<RequestId>69d74a63-7de3-11e0-aafc-2b540b1e5f42</RequestId>
</ErrorResponse>
The operation is a one-time request for the state of a jobflow, so there shouldn't be any rate limiting involved. Has anyone else run into this issue? Also, there doesn't seem to be much documentation on EC2 and EMR throttling/rate limiting...
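These throttling limits are typically applied per AWS account, so other callers or tools sharing the account can trip them even for a one-off call. The standard remedy is to retry with exponential backoff; a sketch below, where the error_code attribute follows boto's BotoServerError and the describe call in the comment is hypothetical usage:

import time
from boto.exception import EmrResponseError

def with_backoff(call, max_retries=5):
    # Retry `call` on EMR throttling errors, doubling the wait each time.
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return call()
        except EmrResponseError, e:  # Python 2, matching the boto usage above
            if e.error_code != 'Throttling' or attempt == max_retries - 1:
                raise
            time.sleep(delay)
            delay *= 2

# Hypothetical usage, given an EMR connection and a jobflow id:
# jobflow = with_backoff(lambda: conn.describe_jobflow(jobflow_id))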
Source: (StackOverflow)
I'm trying to get the number of followers of each follower of a specific account (with the goal of finding the most influential followers). I'm using Tweepy in Python, but I am running into the API rate limits: I can only get the number of followers for 5 followers before I am cut off. The account I'm looking at has about 2000 followers. Is there any way to get around this?
My code snippet is:

ids = api.followers_ids(account_name)
for id in ids:
    more = api.followers_ids(id)
    print len(more)
Thanks
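Two things that may help, sketched below for Tweepy 3.x: batch the ids through lookup_users (up to 100 users per call, so roughly 20 calls for 2000 followers instead of 2000) and read followers_count off each returned user object; wait_on_rate_limit makes Tweepy sleep through any window it still hits. The auth handler and account_name are assumed from the question's setup:

import tweepy

# auth is an already-configured tweepy auth handler, as in the question's setup.
api = tweepy.API(auth, wait_on_rate_limit=True)  # sleep until the window resets

ids = api.followers_ids(account_name)
counts = {}
for i in range(0, len(ids), 100):
    # One API call fetches up to 100 user objects, each with followers_count.
    for user in api.lookup_users(user_ids=ids[i:i + 100]):
        counts[user.screen_name] = user.followers_count

# Most influential followers first.
print(sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:10])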
Source: (StackOverflow)
I am using rsyslog (rsyslog-7.4.7-7.el7_0.x86_64) on CentOS 7 (CentOS Linux release 7.1.1503 (Core)). We have some applications on it which use the syslog framework for logging. We have a lot of logs; at peak, there can be up to 50,000 logs in one second.
Our system was previously running on CentOS 6.2 (with rsyslog 5.8) and we never observed any drops. After doing some searching, we found that there is rate limiting. We are getting messages like "imjournal: begin to drop messages due to rate-limiting" in /var/log/messages, and then "imjournal: 130886 messages lost due to rate-limiting". We tried different ways to disable or tune it, without success. We tried the following.
1) Changes in /etc/rsyslog.conf
$ModLoad imjournal # provides access to the systemd journal
$imjournalRatelimitInterval 1
$imjournalRatelimitBurst 50000
Some other info from rsyslog.conf follows; we didn't change anything here:
$OmitLocalLogging on
$IMJournalStateFile imjournal.state
We also saw that there is some rate limiting with imuxsock, but we understand that it won't be used when OmitLocalLogging is on.
2) Changes in /etc/systemd/journald.conf
Storage=auto
RateLimitInterval=1s
RateLimitBurst=100000
Our application has modules in Java (using SLF4J and log4j) and modules in C/C++ (using the syslog() call). For the C/C++ modules we are missing DEBUG logs most of the time, but the DEBUG logs of the Java modules always seem to be fine.
The version of systemd is "systemd-208-20.el7.x86_64". The application and rsyslogd are on the same machine.
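One more setting worth noting: with an interval of 1 second and a burst of 50000, anything beyond 50000 messages in a single second is still dropped. The imjournal documentation describes a rate-limit interval of 0 as disabling the limiter entirely, i.e. something like:

$ModLoad imjournal
# 0 disables imjournal rate limiting altogether; alternatively keep an
# interval and raise the burst well above the observed 50000/s peak.
$imjournalRatelimitInterval 0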
Source: (StackOverflow)
I'm using rack-throttle as a rate-limiting engine in my Rails 3 application. I've created my own class, based on Rack::Throttle::Interval, to define custom rate-limiting logic. I'm checking whether the request is made to an exact controller and an exact action. This works fine if I make a GET request; however, if I send a POST request, I get some problems.
class CustomLimiter < Rack::Throttle::Interval
  def allowed?(request)
    path_info = Rails.application.routes.recognize_path request.url rescue path_info = {}
    if path_info[:controller] == "some_controller" and path_info[:action] == "some_action"
      super
    else
      true
    end
  end
end
Here are my controller actions:

def question
  # user is redirected here
end

def check_answer
  # some logic to check the answer
  redirect_to question_path
end
My routes:
get "questions" => "application#question", :as => "question"
post "check_answer" => "application#check_answer", :as => "check_answer"
EDIT:
The problem is that POST requests are coming to the application, so the method allowed? is called. But when I call Rails.application.routes.recognize_path, I get a Route set not finalized exception. How can I prevent a user from sending a lot of POST requests to the exact action of the exact controller with the help of rack-throttle?
The middleware is added in application.rb
class Application < Rails::Application
  # Set up rate limiting
  config.require "ip_limiter"
  config.require "ip_user_agent_limiter"
  config.middleware.use IpLimiter, :min => 0.2
  config.middleware.use IpUserAgentLimiter, :min => 2
end
Both IpLimiter and IpUserAgentLimiter are derived from CustomLimiter.
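One way around the Route set not finalized problem is to not consult the router inside allowed? at all, and instead match the request method and path directly. The shape of that check, sketched in Python since the idea is middleware-generic (the path and interval here are illustrative):

import time

MIN_INTERVAL = 2.0   # seconds between accepted POSTs (illustrative)
_last = [0.0]

def allowed(method, path):
    # Match the route by method and path, without asking the router.
    if method != 'POST' or path != '/check_answer':
        return True   # leave every other request untouched
    now = time.time()
    if now - _last[0] < MIN_INTERVAL:
        return False  # too soon: throttle this POST
    _last[0] = now
    return True

In Rack terms this corresponds to checking request.post? and request.path inside allowed? instead of calling recognize_path.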
Source: (StackOverflow)
I have a request-promise function that makes a request to an API. I'm rate-limited by this API and I keep getting the error message:
Exceeded 2 calls per second for api client. Reduce request rates to resume uninterrupted service.
I'm running a couple of Promise.each loops in parallel, which is causing the issue; if I run just one instance of Promise.each, everything runs fine. These Promise.each calls all lead to the same function containing a request-promise call. I want to wrap this function with another queue function and set the interval to 500 milliseconds, so that requests aren't made one right after another, or in parallel, but spaced at that interval, on a queue. The thing is, I still need these promises to get their contents, even if it takes a rather long time to get a response.
Is there anything that will do this for me? Something I can wrap a function in so that it responds at a set interval, not in parallel, and doesn't fire calls one right after another?
Update: Perhaps it does need to be promise-specific. I tried to use underscore's throttle function:
var debug = require("debug")("throttle")
var _ = require("underscore")
var request = require("request-promise")

function requestSite(){
    debug("request started")
    function throttleRequest(){
        return request({
            "url": "https://www.google.com"
        }).then(function(response){
            debug("request finished")
        })
    }
    return _.throttle(throttleRequest, 100)
}

requestSite()
requestSite()
requestSite()
And all I got back was this:
$ DEBUG=* node throttle.js
throttle request started +0ms
throttle request started +2ms
throttle request started +0ms
Source: (StackOverflow)
Hi, I am using ColdFusion to call the last.fm API, using a cfc bundle sourced from here.
I am concerned about going over the request limit, which is 5 requests per originating IP address per second, averaged over a 5-minute period.
The cfc bundle has a central component which calls all the other components, which are split up into sections like "artist", "track", etc. This central component, "lastFmApi.cfc", is initiated in my application and persisted for the lifespan of the application:
// Application.cfc example
<cffunction name="onApplicationStart">
    <cfset var apiKey = '[your api key here]' />
    <cfset var apiSecret = '[your api secret here]' />
    <cfset application.lastFm = CreateObject('component', 'org.FrankFusion.lastFm.lastFmApi').init(apiKey, apiSecret) />
</cffunction>
Now if I want to call the API through a handler/controller, for example my artist handler, I can do this:
<cffunction name="artistPage" cache="5 mins">
    <cfset qAlbums = application.lastFm.user.getArtist(url.artistName) />
</cffunction>
I am a bit confused about caching. I am caching each call to the API in this handler for 5 minutes, but does this make any difference? Each time someone hits a new artist page, won't that still count as a fresh hit against the API?
I am wondering how best to tackle this.
Thanks
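On the caching question: the 5-minute cache does help for repeat visits to the same artist, but every distinct artist is still a fresh API hit, so the client should also pace its own outgoing calls. A language-agnostic sketch of that pacing in Python, where the 0.2-second floor corresponds to last.fm's 5 requests per second:

import time
import threading

MIN_INTERVAL = 0.2   # 5 calls per second
_lock = threading.Lock()
_last_call = [0.0]

def paced(fn, *args):
    # Run fn(*args), but never more often than once per MIN_INTERVAL.
    with _lock:
        wait = MIN_INTERVAL - (time.time() - _last_call[0])
        if wait > 0:
            time.sleep(wait)
        _last_call[0] = time.time()
    return fn(*args)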
Source: (StackOverflow)
How can a function rate-limit its calls? The calls should not be discarded if too frequent, but rather queued up and spaced out in time, X milliseconds apart. I've looked at throttle and debounce, but they discard calls instead of queuing them up to be run in the future.
Is there any better solution than a queue with a process() method set on an X-millisecond interval? Are there standard implementations of this in JS frameworks? I've looked at underscore.js so far - nothing.
Source: (StackOverflow)