Amazon CloudFront interview questions
Top Amazon CloudFront frequently asked interview questions
I'm using Amazon's CloudFront to serve static files of my web apps.
Is there no way to tell a CloudFront distribution that it needs to refresh its files, or to point out a single file that should be refreshed?
Amazon recommends that you version your files, like logo_1.gif, logo_2.gif and so on, as a workaround for this problem, but that seems like a pretty stupid solution. Is there absolutely no other way?
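For reference, CloudFront also exposes an invalidation API for exactly this case. A minimal sketch with boto3 (the distribution ID and path below are placeholders):

import time
import boto3  # assumes boto3 is installed and AWS credentials are configured

cloudfront = boto3.client("cloudfront")

# Invalidate a single file (wildcard paths like /images/* also work).
response = cloudfront.create_invalidation(
    DistributionId="E1ABCDEXAMPLE",  # placeholder distribution ID
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/images/logo.gif"]},
        "CallerReference": str(time.time()),  # must be unique per request
    },
)
print(response["Invalidation"]["Id"])  # use this ID to track progress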
Source: (StackOverflow)
What are the advantages of using Akamai vs. CloudFront? From what I've read, Akamai seems to be more expensive, but they seem to have a larger network for their CDN. CloudFront, on the other hand, is newer, and Amazon even used Akamai for their e-commerce site when CloudFront was launched in 2008. This might have changed since then, which wouldn't surprise me.
I like CloudFront because my application will be hosted on AWS, so there might be significant benefits from using CloudFront rather than Akamai. CloudFront seems to be better documented too, and their API is easily accessible, whereas Akamai's isn't. I'm hoping to get the pros and cons of choosing Akamai vs. CloudFront. Thanks in advance!
Source: (StackOverflow)
I'm trying to set up 301 redirects on S3 using objects, as referenced here: http://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html. I've been having some problems and can't seem to figure out what I'm doing wrong.
What I get is a blank page (0 byte file) as if the 'Website Redirect Location' metadata value is not set.
What am I doing wrong?
Also, does this work on AWS CloudFront?
My S3 Console Setup
A couple of things to note:
- I have this set up for hosting a static site.
- I'm using SSL/HTTPS with my own certificate uploaded and set on the CloudFront distribution.
- All the pages seem to work except the redirecting objects.
- I've tried setting up routing rules, but they didn't seem to work in CloudFront.
- I'm trying to access the redirects both through the CloudFront URL and the S3 URL (https://s3.amazonaws.com/{bucket}/users/sign_in).
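For reference, a minimal boto3 sketch of creating a redirect object (bucket name and target URL are placeholders). Note that S3 only performs these redirects through the bucket's website endpoint; the REST endpoint used above (https://s3.amazonaws.com/{bucket}/...) merely stores the header, and the website endpoint does not support HTTPS:

import boto3  # assumes configured AWS credentials

s3 = boto3.client("s3")

# A zero-byte object whose only job is to redirect.
s3.put_object(
    Bucket="my-bucket",  # placeholder bucket name
    Key="users/sign_in",
    Body=b"",
    WebsiteRedirectLocation="https://example.com/users/sign_in",
)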
Source: (StackOverflow)
I have an iOS app where I upload objects to an Amazon S3 bucket and want to retrieve them from a CloudFront distribution.
I am using CloudFront with a private distribution for my Amazon S3 bucket, and when I generate a signed URL it does not work: in Safari it returns "AccessDenied" (twice) and some random alphanumeric string. The signed URL I just generated (expiration date in 24 hours) -- it should be expired by now.
I read the following site to get all my security credentials in place. I have also set up a private distribution with my S3 bucket by reading the documentation, and I have set up the trusted signers, which is basically just my account.
I have used code from this site to generate the signed URL
But again, I have had absolutely no luck; when I put the link in Safari it returns "AccessDenied" (twice) and some random alphanumeric string. Why? Is there any step I am not following?
Thanks for any help! I appreciate it; this is important for me, as I need to create an app where CloudFront would be very important for speed and international distribution...
Thanks everyone for upvoting. I figured out my problem: I didn't follow the step of creating an origin access identity. Now it works like a charm. Woohoo!
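Since the fix was the origin access identity, here is a sketch of creating one with boto3 and granting it read access (the comment and bucket name are placeholders):

import json
import time
import boto3  # assumes configured AWS credentials

cloudfront = boto3.client("cloudfront")
s3 = boto3.client("s3")

# Create the origin access identity (OAI) that CloudFront uses
# to read from the otherwise-private bucket.
oai = cloudfront.create_cloud_front_origin_access_identity(
    CloudFrontOriginAccessIdentityConfig={
        "CallerReference": str(time.time()),  # must be unique
        "Comment": "OAI for my private distribution",  # placeholder
    }
)
canonical_user = oai["CloudFrontOriginAccessIdentity"]["S3CanonicalUserId"]

# Grant the OAI read access to the bucket ("my-bucket" is a placeholder).
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"CanonicalUser": canonical_user},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-bucket/*",
    }],
}
s3.put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(policy))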
Source: (StackOverflow)
I want to serve my compressed CSS/JS from CloudFront (they live on S3), but am unable to work out how to do it via the compressor settings. In settings.py I have the following:
COMPRESS_OFFLINE = True
COMPRESS_URL = 'http://static.example.com/' #same as STATIC_URL, so unnecessary, just here for simplicity
COMPRESS_STORAGE = 'my_example_dir.storage.CachedS3BotoStorage' #subclass suggested in [docs][1]
COMPRESS_OUTPUT_DIR = 'compressed_static'
COMPRESS_ROOT = '/home/dotcloud/current/static/' #location of static files on server
Despite the COMPRESS_URL, my files are being read from my S3 bucket:
<link rel="stylesheet" href="https://example.s3.amazonaws.com/compressed_static/css/e0684a1d5c25.css?Signature=blahblahblah;Expires=farfuture;AWSAccessKeyId=blahblahblah" type="text/css" />
I guess the issue is I want to write the file to S3, but read it from CloudFront. Is this possible?
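One approach that has worked for others, sketched below under the assumption of django-storages and its AWS_S3_CUSTOM_DOMAIN setting (the domain is a placeholder): write to S3 but hand out the CloudFront hostname, and keep a local copy so offline compression can read files back.

# settings.py additions -- serve from CloudFront while writing to S3
AWS_S3_CUSTOM_DOMAIN = "static.example.com"  # your CloudFront CNAME (placeholder)
AWS_QUERYSTRING_AUTH = False  # drop the Signature/Expires query params

# my_example_dir/storage.py -- the subclass the compressor docs suggest,
# extended to keep a local copy of each saved file
from django.core.files.storage import get_storage_class
from storages.backends.s3boto import S3BotoStorage

class CachedS3BotoStorage(S3BotoStorage):
    def __init__(self, *args, **kwargs):
        super(CachedS3BotoStorage, self).__init__(*args, **kwargs)
        self.local_storage = get_storage_class(
            "compressor.storage.CompressorFileStorage")()

    def save(self, name, content):
        name = super(CachedS3BotoStorage, self).save(name, content)
        self.local_storage._save(name, content)  # keep a local copy
        return name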
Source: (StackOverflow)
We've set up an AngularJS application on CloudFront, with all asset files in S3 storage and served via CloudFront for SSL and performance.
We have an identical setup to the one described in this guide:
https://rossfairbanks.com/2015/01/30/integrating-angular-s3-cloudfront.html
We used the same post to create our own setup, and it works in all browsers except Safari.
In Safari, when visiting a URL directly on a given path or refreshing any sub-page, ui-router redirects the user to the landing page.
For staging, though, we have the same setup but running directly on S3, without CloudFront in the middle and hence no SSL. Yet it seems to work even in Safari.
So the problem seems to be an issue with CloudFront and Safari specifically.
Can anyone advise on what could be the cause, and how we can solve it?
UPDATE: This issue might be related to this WebKit bug: https://bugs.webkit.org/show_bug.cgi?id=24175
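One workaround often used for deep links behind CloudFront (a sketch, and not necessarily a fix for the Safari/WebKit bug above): skip S3 redirect rules entirely and have CloudFront return index.html for unknown paths via custom error responses. The fragment below shows the relevant piece of a DistributionConfig in the shape boto3's update_distribution expects:

# S3 answers 403/404 for deep links; map both onto the SPA shell
# so ui-router can handle the route client-side.
custom_error_responses = {
    "Quantity": 2,
    "Items": [
        {
            "ErrorCode": 403,
            "ResponsePagePath": "/index.html",
            "ResponseCode": "200",
            "ErrorCachingMinTTL": 300,
        },
        {
            "ErrorCode": 404,
            "ResponsePagePath": "/index.html",
            "ResponseCode": "200",
            "ErrorCachingMinTTL": 300,
        },
    ],
}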
Source: (StackOverflow)
I manage a secured PHP/MySQL web app with extensive jQuery use. Today, a strange error popped up in our app's logs:
JS Error: Error loading script:
https://d15gt9gwxw5wu0.cloudfront.net/js/_MY_WEB_APP_DOMAIN_/r.js
We are not using Amazon's CloudFront CDN in our app. When I go to the URL that failed to load, these are the only contents:
if(typeof _GPL.ri=='function'&&!_GPL.isIE6){_GPL.ri('_GPL_r')}_GPL.rl=true;
The user's user agent string is:
Mozilla/5.0 (Windows NT 6.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1
Please note: I am not the user who triggered this error. It was one of our thousands of users who triggered it. I do not have control over the client machine.
Does anyone know what's going on here? Is this some sort of XSS attack?
** Update **
It appears I'm not the only one who has discovered this anomaly on their website. I found this report of the exact same behavior, which seems to indicate the code is harmless, but there are still no answers as to where it came from.
In addition, I found this pastebin with similar code that appears to be some sort of advertising script. Again, not terribly helpful.
** Update 2 **
More context: The webapp uses several third party jQuery plugins but no third party analytics of any kind. All scripts are hosted on our own server, and an audit of all our code provides no matches for "cloudfront".
This app has been in production for about 4 years, and this is the first and only instance of any activity like this. It has not happened before or since, so I doubt I'll be able to reproduce it.
What I'm interested in is if this is some sort of attack. If it is, I want to know how to plug the hole it's trying to exploit if it's not plugged already.
Source: (StackOverflow)
I'm a newbie programmer building a startup that I (naturally) hope will create a large amount of traffic. I am hosting my Django project on dotCloud, which is on Amazon EC2. I have some streaming media (HTTP though, not RTMP), so the dotCloud guys recommended I go with a CDN. I am also using Amazon S3 for storage, and so decided to go with Amazon CloudFront as my CDN.
The time has come for me to turn my attention to caching, and I am lost and confused. I am completely new to the concept. The entire extent of my knowledge comes from a tutorial I just read (http://www.mnot.net/cache_docs/) and a confusing weekend spent consulting Google. Most troubling of all is that I am not even sure what I need to do for my site.
What is the difference between a CDN and a proxy server?
Is it possible I might want to use a caching service (e.g. memcached, redis), a CDN (CloudFront), AND a proxy server (squid)?
Our site is DB-driven and produces dynamically generated lists specific to user locations. Can such a site be cached? (See the sketch after these questions. The lists themselves are filterable via AJAX, so the URL might remain the same while producing largely different results. For instance, example.com/some_url/ might generate a list of 40 objects, with only 10 appearing on the page. By clicking on a filter, the user could end up with 10 different objects while still at /some_url/.)
What are the best practices for a high traffic, rich content site?
How can I learn about this? Everywhere I look seems to take for granted some basics that I just don't have as a part of my own foundation yet.
I'm not certain I'm asking the right questions. Just feeling very lost. I've now built 95% of my entire site and thought I was just ironing out the details but caching seems like another major undertaking. Any guidance/advice/encouragement would be much appreciated!
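On the "can dynamic lists be cached?" question, one pattern is to cache the expensive per-location query rather than the whole page, so AJAX filters can work on top of it. A minimal Django sketch (SomeModel and the key scheme are placeholders):

from django.core.cache import cache

def get_objects_for_location(location_id):
    # Cache the full per-location list; AJAX filtering then narrows
    # it down without hitting the database again.
    key = "object-list-%s" % location_id  # hypothetical key scheme
    objects = cache.get(key)
    if objects is None:
        # SomeModel is a stand-in for your actual model.
        objects = list(SomeModel.objects.filter(location_id=location_id))
        cache.set(key, objects, 60 * 15)  # keep for 15 minutes
    return objects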
Source: (StackOverflow)
I've set up a distribution, but I'm a bit confused about the purpose of the CNAME that can be set up in CloudFront. Assuming my assigned CloudFront domain is d27fwrff25jcfdafa.cloudfront.net, I can assign the "nice" CNAME static.example.com using the AWS Management Console.
I don't understand why I'd want to do this, though. Why wouldn't I just create the CNAME in my site's DNS records and point it directly at d27fwrff25jcfdafa.cloudfront.net, instead of creating the CNAME in CloudFront? This is what I've done and it works perfectly, but I don't like not understanding stuff.
Alternatively, if I only created the CNAME using the Management Console, wouldn't I then need to set my nameservers to Amazon's so the CNAME could be resolved correctly? I can't find any mention of that step in the documentation, so I guess I must be missing something!
Thanks for any help,
Paul.
Source: (StackOverflow)
Short version: How do I make signed URLs "on-demand" to mimic Nginx's X-Accel-Redirect behavior (i.e. protecting downloads) with Amazon CloudFront/S3 using Python.
I've got a Django server up and running with an Nginx front-end. I've been getting hammered with requests and recently had to run it as a Tornado WSGI application to prevent it from crashing in FastCGI mode.
Now I'm having an issue with my server getting bogged down (i.e. most of its bandwidth is being used up) due to too many requests for media. I've been looking into CDNs, and I believe Amazon CloudFront/S3 would be the proper solution for me.
I've been using Nginx's X-Accel-Redirect header to protect the files from unauthorized downloading, but I don't have that ability with CloudFront/S3. However, they do offer signed URLs. I'm no Python expert by far and definitely don't know how to create a signed URL properly, so I was hoping someone would have a link on how to make these URLs "on-demand", or would be willing to explain it here; it would be greatly appreciated.
Also, is this the proper solution, even? I'm not too familiar with CDNs, is there a CDN that would be better suited for this?
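A sketch of the on-demand flow with botocore's CloudFrontSigner (the key pair ID, key path, and domain are placeholders); the Django view plays the role X-Accel-Redirect did, checking authorization and handing back a short-lived URL:

from datetime import datetime, timedelta

import rsa  # pip install rsa
from botocore.signers import CloudFrontSigner
from django.http import HttpResponseForbidden, HttpResponseRedirect

KEY_PAIR_ID = "APKAEXAMPLE"  # placeholder CloudFront key pair ID

with open("/path/to/private_key.pem", "rb") as f:  # placeholder path
    PRIVATE_KEY = rsa.PrivateKey.load_pkcs1(f.read())

def rsa_signer(message):
    # CloudFront expects an RSA-SHA1 signature of the policy.
    return rsa.sign(message, PRIVATE_KEY, "SHA-1")

signer = CloudFrontSigner(KEY_PAIR_ID, rsa_signer)

def download(request, path):
    # The auth check that X-Accel-Redirect used to gate.
    if not request.user.is_authenticated():  # method pre-Django 1.10; property later
        return HttpResponseForbidden()
    url = signer.generate_presigned_url(
        "https://d1234example.cloudfront.net/%s" % path,  # placeholder domain
        date_less_than=datetime.utcnow() + timedelta(minutes=10),
    )
    return HttpResponseRedirect(url)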
Source: (StackOverflow)
Let me start out with a quick introduction to the architecture of a system I'm considering migrating to S3+Cloudfront.
We have a number of entities ordered in a tree. The leaves of the tree have a number of resources (JPG images, to be specific), usually on the order of 20-5000, with an average of ~200. Each resource has a unique URL that is served through our colo setup today.
I could just transfer all of these resources to S3, setup Cloudfront on top of that and be done. If only I didn't have to protect the resources.
Most entities are public (that is, ~99%); the rest are protected in one of many ways (login, IP, time, etc.). Once an entity is protected, all of its resources must be protected too, and can only be accessed after a valid authorization has been performed.
I could solve this by creating two S3 buckets - one private and one public. For the private content I'd generate signed CloudFront URLs after the user was authorized. However, the state of an entity might change from public to private arbitrarily, and vice versa. An admin of the system might change an entity at any level of the entity tree, causing a cascading change throughout the tree. One change might affect ~20k entities, each with ~200 resources - 4 million resources in total.
I could run a service in the background monitoring for state changes, but that would be cumbersome, and changing the ACLs of 4 million S3 items would take considerable time, and while that's happening we'll either have unprotected private content, or public content that we'd have to generate signed URLs for.
Another possibility would be to make all resources private by default. On each and every request made to an entity, we would generate a custom policy granting access, for that specific user, to all resources contained in the entity (by using wildcard URLs in the custom policy). This would require the creation of a policy for each visitor, per entity - that wouldn't be a problem, though. However, it would mean that our users can't cache anything any longer, as the URL will change for each new session. While not a problem for private content, it would suck for us to ditch all caching for the ~99% of entities that are public.
Yet another option would be to keep all content private and use the above approach for private entities. For public entities we could generate a single custom policy, per public entity, that all users would share. If we set a lifetime of 6 hours and made sure to generate a new policy after 5 hours, a user would be ensured a policy lifetime of at least one hour. This has the advantage of enabling caching for up to 6 hours, while allowing private content to, possibly, remain accessible for up to 6 hours after a state change. This would be acceptable, but I'm not sure it's worth it (I'm trying to work out the cache hit ratio of current requests). Obviously we could tweak the 5/6-hour border to enable longer/shorter caching at the cost of longer/shorter exposure of private entities.
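For the wildcard idea: CloudFront custom policies do accept wildcards in the Resource URL, so one policy can cover every resource in an entity. A sketch (domain and entity path are placeholders), reusing a CloudFrontSigner built as in the signed-URL question earlier:

import json
import time

# One policy for everything under a single entity.
expires = int(time.time()) + 6 * 3600  # the 6-hour lifetime discussed above
policy = json.dumps({
    "Statement": [{
        "Resource": "https://d1234example.cloudfront.net/entities/42/*",
        "Condition": {"DateLessThan": {"AWS:EpochTime": expires}},
    }],
})

# Each resource URL is then signed with that shared policy:
# url = signer.generate_presigned_url(
#     "https://d1234example.cloudfront.net/entities/42/photo_0001.jpg",
#     policy=policy)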
Has anyone deployed a similar solution? Any AWS features I'm overlooking that might be of use? Any comments in general?
Source: (StackOverflow)
When designing an iOS app that will interact with AWS (e.g. S3, CloudFront, etc.), what are the pros and cons of managing the access to these services on the client vs. on the server?
By "managing the access", I mean things like uploading private content to S3, or downloading private content via CloudFront.
Of course, whichever side handles the access will need to store the AWS access key and access secret. Security is one of the concerns.
I am equally interested in the impact of this design choice on the performance and the flexibility of either implementation.
Lastly, is there an argument for implementing a hybrid approach where both client and server interact directly with AWS, or does the implementation usually go with either one or the other, but not both?
Source: (StackOverflow)
Our current plan for a site is to use Amazon's CloudFront service as a CDN for asset files such as CSS, JavaScript, images, and any other static files.
We currently have 1 bucket in S3 that contains all of these static files. The files are separated into different folders depending on what they are: "Scripts" are JS files, "Images" are images, etc., yadda yadda yadda.
So, what I didn't realize from the start was that once you deploy a bucket from S3 to a CloudFront distribution, subsequent updates to the bucket won't automatically deploy to that same distribution. So, it looks as if you have to redeploy the bucket to another CloudFront instance every time you have a static file update.
That's fine for images, because we can easily make sure that if there is a change to an image, then we just create a new image. But, that's difficult to do for CSS and JS.
So, that gets me to the Best Practice questions:
- Is it best practice to create another CloudFront distribution for every production deployment? The problem here is that it causes trouble with CNAME records.
- Is it best practice to NOT warehouse CSS and JS in CloudFront because of the nature of those files and their need to be easily modified? It seems like the answer to this would be NO, because that's the purpose of a CDN.
- Is there some other method with CloudFront that I don't know about?
Source: (StackOverflow)
I use Amazon CloudFront to host all my site's images and videos, to serve them faster to my users, who are pretty scattered across the globe. I also apply pretty aggressive forward caching to the elements hosted on CloudFront, setting Cache-Control to public, max-age=7776000.
I've recently discovered, to my annoyance, that third-party sites are hotlinking to my CloudFront server to display images on their own pages, without authorization.
I've configured .htaccess to prevent hotlinking on my own server, but haven't found a way of doing this on CloudFront, which doesn't seem to support the feature natively. And, annoyingly, Amazon's bucket policies, which could be used to prevent hotlinking, have effect only on S3; they have no effect on CloudFront distributions [link]. If you want to take advantage of the policies, you have to serve your content from S3 directly.
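For the record, this is roughly the kind of referer-based bucket policy meant above (bucket name and referer are placeholders); as noted, it only takes effect when serving straight from S3, not through CloudFront:

import json
import boto3  # assumes configured AWS credentials

# aws:Referer-based hotlink protection; effective on direct S3 access only.
policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowMySiteOnly",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-bucket/*",  # placeholder bucket
        "Condition": {"StringLike": {"aws:Referer": ["http://example.com/*"]}},
    }],
})
boto3.client("s3").put_bucket_policy(Bucket="my-bucket", Policy=policy)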
Scouring my server logs for hotlinkers and manually changing the file names isn't really a realistic option, although I've been doing this to end the most blatant offenses.
Any suggestions would be welcome.
Source: (StackOverflow)
I've tried many, many different configurations, files, encoding, browsers, etc..., but this is the simplest example that demonstrates the problem I am having.
If you paste the URL for the Video.js sample video into Firefox 8.0.1, the video plays inline:
http://video-js.zencoder.com/oceans-clip.webm
If I take that same video and upload it to my S3 bucket, it triggers a download instead:
https://s3.amazonaws.com/turingvideos/oceans-clip.webm
(Permissions are read for everyone on the file and bucket)
So, let's try CloudFront:
http://d2yat6m71lu23b.cloudfront.net/oceans-clip.webm (triggers a download)
And CloudFront streaming:
http://strzsu4h2ax96.cloudfront.net/oceans-clip.webm (infinite spinner)
The same basic things happen when using an HTML video tag as well. It works fine from Zencoder, borked on anything other than a local disk read.
So, what magic is Zencoder managing that is completely out of my reach with S3/CloudFront? I'm completely stumped.
Edit:
Setting the content type to "video/webm" and the content disposition to "inline" did the trick. Thanks for the quick response, guys.
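For completeness, a boto3 sketch of applying that fix to an already-uploaded object (the bucket and key come from the question above); S3 metadata can't be edited in place, so the object is copied onto itself with REPLACE:

import boto3  # assumes configured AWS credentials

s3 = boto3.client("s3")

s3.copy_object(
    Bucket="turingvideos",
    Key="oceans-clip.webm",
    CopySource={"Bucket": "turingvideos", "Key": "oceans-clip.webm"},
    ContentType="video/webm",
    ContentDisposition="inline",
    MetadataDirective="REPLACE",  # replace metadata rather than copy it
)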
Source: (StackOverflow)