Once again to cut costs, I planned to use AWS CloudFront. What follows is the painful story.
Original file name on S3: Battle Of The Saints I.apk
Downloadable URL on S3:
https://s3-ap-southeast-1.amazonaws.com/sudops.com/Battle+Of+The+Saints+I.apk
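I noticed that S3 automatically converted each space " " in the URL into a plus sign "+". Nothing wrong so far; the URL above downloads fine.
With CloudFront in front, the expected CloudFront download URL is: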
http://sudops.com/Battle%20Of%20The%20Saints%20I.apk
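No problem here either; the file downloads.
However, the S3 log and the CloudFront log showed a lot of 403 errors. The URLs actually being requested had changed and picked up a lot of junk characters. Could different browsers be causing this?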
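The two URL forms seen so far differ only in how the space is written: "+" in the S3 URL and "%20" in the CloudFront URL. As a small illustration (not something from the original setup), Python's standard `urllib.parse` produces both forms from the same file name:

```python
from urllib.parse import quote, quote_plus

# The object name exactly as it is stored on S3.
key = "Battle Of The Saints I.apk"

# Percent-encoding: each space becomes %20 (the form in the CloudFront URL).
percent_form = quote(key)

# Form-style encoding: each space becomes + (the form in the S3 URL).
plus_form = quote_plus(key)

print(percent_form)  # Battle%20Of%20The%20Saints%20I.apk
print(plus_form)     # Battle+Of+The+Saints+I.apk
```

Both strings refer to the same object here only because S3 accepts "+" as a space in this URL, as observed above; other servers may treat "+" as a literal plus sign.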
S3 log:
[10/Jun/2014:10:09:22 +0000] 54.239.196.63 14E072F930D3F2CD REST.GET.OBJECT Battle%25252520Of%25252520The%25252520Saints%25252520I.apk "GET Battle%252520Of%252520The%252520Saints%252520I.apk HTTP/1.1" 403 AccessDenied 231 - 15 - "-" "Amazon CloudFront" -
CloudFront log:
[10/Jun/2014:08:03:14 +0000] 216.137.54.149 A1709466D371E8D3 REST.GET.OBJECT Battle%252520Of%252520The%252520Saints%252520I.apk "GET Battle%2520Of%2520The%2520Saints%2520I.apk HTTP/1.1" 403 AccessDenied 231 - 16 - "-" "Amazon CloudFront" -
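Good grief. A perfectly good URL ended up encoded four times over, and one little space " " was turned into "%25252520". No wonder it returns 403 and cannot be accessed.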
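To see how a single space grows into "%25252520", here is a minimal sketch using Python's standard `urllib.parse`. The helper names `encode_n` and `count_encoding_layers` are my own, made up for this illustration; they are not part of any AWS tooling.

```python
from urllib.parse import quote, unquote


def encode_n(text, n):
    """Percent-encode text n times in a row (hypothetical helper)."""
    for _ in range(n):
        text = quote(text, safe="")
    return text


def count_encoding_layers(text, limit=10):
    """Decode repeatedly until the text stops changing.

    Returns (number_of_layers_removed, fully_decoded_text).
    """
    layers = 0
    while layers < limit:
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
        layers += 1
    return layers, text


# Each round turns the "%" from the previous round into "%25".
for n in range(1, 5):
    print(n, encode_n(" ", n))

# The object key as recorded in the S3 log line above.
logged_key = "Battle%25252520Of%25252520The%25252520Saints%25252520I.apk"
layers, original = count_encoding_layers(logged_key)
print(layers, original)
```

The loop prints %20, %2520, %252520 and %25252520 in turn, and the logged key decodes back to the original file name after four rounds.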
http://sudops.com/Battle%25252520Of%252520The%25252520Saints%25252520I.apk
<Error>
  <Code>AccessDenied</Code>
  <Message>Access Denied</Message>
  <RequestId>86905FA0B9C543E9</RequestId>
  <HostId>qAfOvgYqKNl+33vzVykSSmWoBkRBOjpe06YssRMrw3h+9be4U+0lYOvMRseg4+XT</HostId>
</Error>
According to the AWS forums, the URL has to be encoded twice beforehand, and then it can be accessed normally:
If your ecommerce platform is deliberately breaking the URL encoding required for transmission of prohibited characters in a URL, you will need to double-URL encode the filenames first, so the ecommerce "solution" decodes them to read 41%2BPYwYkt1L.jpg after it's done its single decoding.
See: https://forums.aws.amazon.com/message.jspa?messageID=277276
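A rough sketch of the double encoding the forum post describes, in Python. The raw file name "41+PYwYkt1L.jpg" is my assumption, inferred from the "41%2BPYwYkt1L.jpg" in the quote; the forum post does not state it.

```python
from urllib.parse import quote, unquote

# Assumed raw file name (inferred from the forum quote, not confirmed).
raw_name = "41+PYwYkt1L.jpg"

# First encoding: "+" becomes %2B, the form that must reach the server.
encoded_once = quote(raw_name, safe="")

# Second encoding: protects the "%" itself, so it becomes %25.
encoded_twice = quote(encoded_once, safe="")

# A platform that decodes exactly once ends up with the single-encoded form.
after_platform_decode = unquote(encoded_twice)

print(encoded_once)           # 41%2BPYwYkt1L.jpg
print(encoded_twice)          # 41%252BPYwYkt1L.jpg
print(after_platform_decode)  # 41%2BPYwYkt1L.jpg
```

This only helps when you know exactly how many times something in between will decode the URL, which is the part I could not pin down for CloudFront.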
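But how do I explain the URL being encoded four times once AWS CloudFront is involved? Does CloudFront have multiple cache tiers, with data passing between regions adding more rounds of encoding? For example, with S3 in Singapore, the Singapore CloudFront processes the URL once and then the US-region CloudFront processes it again. That makes no sense!
So the real fix is to keep the URLs on S3 clean and avoid unnecessary encoding and decoding.
Bookmarking this one, a handy URL encoder/decoder: http://meyerweb.com/eric/tools/dencoder/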
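One way to do that is to rename files before uploading so that the name contains nothing that needs encoding. Below is a minimal sketch; `safe_key` is a hypothetical helper of mine, and the allowed character set is a conservative choice for this example, not an official S3 rule.

```python
import re
from urllib.parse import quote


def safe_key(name):
    """Return a file name that survives URL encoding unchanged (hypothetical helper)."""
    # Collapse each run of whitespace into a single underscore.
    name = re.sub(r"\s+", "_", name.strip())
    # Drop anything outside letters, digits, dot, underscore, hyphen and slash.
    return re.sub(r"[^A-Za-z0-9._/-]", "", name)


cleaned = safe_key("Battle Of The Saints I.apk")
print(cleaned)  # Battle_Of_The_Saints_I.apk

# Encoding the cleaned name is a no-op, so repeated encoding cannot mangle it.
print(quote(cleaned) == cleaned)  # True
```

With a name like this, it no longer matters how many times something along the way encodes or decodes the URL.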