Amazon S3 (Simple Storage Service) is an object storage service from Amazon Web Services (AWS). It is accessed through web service interfaces (REST, SOAP, and BitTorrent) and provides a secure way to store files.
But getting to S3 from an EC2 instance that does not allow direct access to the internet can be difficult. First, you have to configure internet gateways and NAT gateways and manage the route tables so that the EC2 instance can reach public resources. Then, you need to send “signed” requests to access non-public data in S3 buckets.
There’s an easier way.
The solution I detail below uses Citrix ADC as a secure S3 proxy that converts unauthenticated requests from an EC2 instance into authenticated (signed) requests, without editing the routes for the EC2 instance.
Authenticating Requests (AWS Signature version 4)
The basic storage units of Amazon S3 are objects, which are organized into buckets. Buckets and objects can be created, listed, and retrieved using REST APIs. A bucket can be made public or accessible only to particular users. If a bucket is not public, HTTP requests to it need to be signed.
The following diagram illustrates the process of computing the signature:
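In outline, Signature Version 4 builds a canonical request, hashes it into a string to sign, derives a signing key from the secret key, and computes an HMAC-SHA256 over the string to sign. The sketch below covers only the last two steps. The secret key, date, and string to sign are placeholders chosen for illustration, not real credentials or a real request.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Chain four HMACs: date, then region, then service, then 'aws4_request'."""
    k_date = hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = hmac_sha256(k_date, region)
    k_service = hmac_sha256(k_region, service)
    return hmac_sha256(k_service, "aws4_request")


def sign_string(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder inputs, for illustration only
key = derive_signing_key("EXAMPLE-SECRET", "20240101", "eu-central-1", "s3")
signature = sign_string(key, "example string to sign")
```

Because the date and region are part of the key derivation, a signing key is only valid for one day, one region, and one service.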
A sample GET request for a file, Test.txt, from an S3 bucket, TestBucket, looks like the following:
GET /TestBucket/Test.txt HTTP/1.1
User-Agent: curl/7.13.1 (x86_64-unknown-freebsd6.3) libcurl/7.13.1 OpenSSL/1.0.1p zlib/1.2.3
Host: 11.12.13.14
Accept: */*
The S3 Proxy needs to rewrite the request transparently to the following:
GET /TestBucket/Test.txt HTTP/1.1
User-Agent: <user-agent>
Host: <hostname>
Accept: */*
x-amz-content-sha256: <hashed content>
Authorization: AWS4-HMAC-SHA256 Credential=<AccessID>/<Date>/<Region>/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=<calculated signature>
x-amz-date:<date>
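Two of the added values can be produced without any credentials. For a GET request with no body, x-amz-content-sha256 is the SHA-256 hash of the empty string, and x-amz-date is the current UTC time in ISO 8601 basic format. A minimal sketch follows; the fixed timestamp is an assumption used only to make the output reproducible.

```python
import datetime
import hashlib


def empty_payload_hash() -> str:
    """SHA-256 of an empty body, as sent in x-amz-content-sha256 for a GET."""
    return hashlib.sha256(b"").hexdigest()


def amz_timestamps(now: datetime.datetime):
    """Return (x-amz-date value, date stamp used in the credential scope)."""
    return now.strftime("%Y%m%dT%H%M%SZ"), now.strftime("%Y%m%d")


# A fixed UTC time, for illustration; a real proxy would use the current time
fixed = datetime.datetime(2024, 1, 2, 3, 4, 5, tzinfo=datetime.timezone.utc)
amz_date, date_stamp = amz_timestamps(fixed)
print(empty_payload_hash())
print(amz_date, date_stamp)  # 20240102T030405Z 20240102
```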
Signing the HTTP request using Citrix ADC
This solution uses a Python script to construct the Authorization header. Citrix ADC invokes the script through an HTTP callout, and the script can run on any external machine that the ADC can reach.
Once Citrix ADC receives the response from the Python script, it inserts all the headers (x-amz-content-sha256, Authorization, and x-amz-date) using an HTTP rewrite policy.
Citrix ADC also replaces the hostname with the S3 domain name.
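Judging from the rewrite action and the script in the Annexure, a single header insert is enough to add all three headers: the callout response starts with the payload hash and then carries the other two headers as complete lines separated by CRLF. The sketch below only simulates that string handling in Python. The function names and sample values are hypothetical, and it does not model how Citrix ADC processes the value internally.

```python
def build_callout_body(payload_hash: str, authorization: str, amz_date: str) -> str:
    """Mimic the text the signing script returns to the HTTP callout."""
    return (payload_hash + "\r\n"
            + "Authorization: " + authorization + "\r\n"
            + "x-amz-date: " + amz_date + "\r\n")


def header_lines_after_insert(callout_body: str) -> list:
    """Show the header lines that result from inserting
    'x-amz-content-sha256: <callout body>' into the request."""
    raw = "x-amz-content-sha256: " + callout_body
    return [line for line in raw.split("\r\n") if line]


# Placeholder values, for illustration only
body = build_callout_body("<hash>", "AWS4-HMAC-SHA256 Credential=<...>", "20240102T030405Z")
lines = header_lines_after_insert(body)
names = [line.split(":", 1)[0] for line in lines]
print(names)  # ['x-amz-content-sha256', 'Authorization', 'x-amz-date']
```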
Configuring the Python Server
- Create a user on AWS and generate an access key for that user [You can also use an existing user]
- Update the ‘bucket policy’ for the S3 bucket so that only this user can access the bucket [Check Annexure]
- Copy the Python script [Annexure] to a Linux server in the same VPC [Any server reachable from the VPX can be used]
- SSH to the Linux server
- Edit the Python script to update the host and region
- Set the environment variables for the access key and secret key
export ACCESS_KEY=<access_key>
export SECRET_KEY=<secret_key>
- Run the Python script in the background (for example, with nohup, so that it keeps running after you disconnect)
- Close the SSH session
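Before configuring Citrix ADC, it can be worth checking that the signing server answers. The sketch below builds the same kind of request the HTTP callout sends: a GET for a path ending in .py, with the original request URL in a header named url. The server address 10.0.0.5 and the helper names are assumptions for illustration; substitute the address of your own Python server.

```python
import urllib.request


def build_callout_request(server: str, object_path: str, port: int = 8000,
                          stem: str = "/calc_signature_v4.py") -> urllib.request.Request:
    """Build a GET request that imitates the Citrix ADC HTTP callout."""
    req = urllib.request.Request("http://%s:%d%s" % (server, port, stem))
    # The signing script reads the object path from the 'url' header
    req.add_header("url", object_path)
    return req


def fetch_signing_headers(server: str, object_path: str) -> str:
    """Send the request and return the response body as text."""
    request = build_callout_request(server, object_path)
    with urllib.request.urlopen(request, timeout=5) as resp:
        return resp.read().decode("utf-8")


# Hypothetical server address; this line only builds the request, it does not send it
req = build_callout_request("10.0.0.5", "/TestBucket/Test.txt")
print(req.full_url)
```

Calling fetch_signing_headers("10.0.0.5", "/TestBucket/Test.txt") from a machine that can reach the server should return the payload hash followed by the Authorization and x-amz-date lines.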
Configuring the Citrix ADC
- Configure a domain-based service group and bind it to an SSL LB vserver (HTTP can also be used)
add nameserver <nameserver>
add server server_s3 <S3 domain name>
add servicegroup sg_s3_aws SSL
bind servicegroup sg_s3_aws server_s3 443
add lb vserver vs_s3 SSL <VIP> 443
bind lb vserver vs_s3 sg_s3_aws
add ssl certkey key1 -cert <cert> -key <key>
bind ssl vserver vs_s3 -certkeyName key1
- Configure an HTTP callout to send traffic to the Python server
add ns variable var1 -type text(64000) -scope global
add policy httpCallout hc1 -IPAddress <Python Server> -port 8000 -returnType TEXT -hostExpr HTTP.REQ.HOSTNAME -urlStemExpr "\"calc_signature_v4.py\"" -headers url(HTTP.REQ.URL) -scheme http -resultExpr "HTTP.RES.BODY(2000)"
add ns assignment assign_var1 -variable "$var1" -set "sys.http_callout(hc1)"
- Configure rewrite policies to insert the HTTP headers and replace the Host header
add rewrite action rw_act_insert_s3_header insert_http_header x-amz-content-sha256 "$var1"
add rewrite action rw_act_hostname replace HTTP.REQ.HOSTNAME <S3 domain name>
add rewrite policy rw_pol_assign_var1 TRUE assign_var1
add rewrite policy rw_pol_insert_s3_header TRUE rw_act_insert_s3_header
add rewrite policy rw_pol_replace_hostname TRUE rw_act_hostname
bind lb vserver vs_s3 -policyName rw_pol_assign_var1 -priority 10 -gotoPriorityExpression NEXT -type REQUEST
bind lb vserver vs_s3 -policyName rw_pol_insert_s3_header -priority 20 -gotoPriorityExpression NEXT -type REQUEST
bind lb vserver vs_s3 -policyName rw_pol_replace_hostname -priority 30 -gotoPriorityExpression END -type REQUEST
- Configure integrated caching to improve performance [OPTIONAL]
add cache contentGroup cache_cg1 -relExpiry 300
add cache policy cache_pol_s3 -rule TRUE -action CACHE -storeInGroup cache_cg1 -undefAction NOCACHE
bind cache global cache_pol_s3 -priority 100 -gotoPriorityExpression END -type REQ_OVERRIDE
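The -relExpiry 300 setting gives every cached object a lifetime of 300 seconds from the moment it is stored. The following sketch illustrates that idea with a small in-memory cache. It is a conceptual model with hypothetical names, not a description of how integrated caching is implemented in Citrix ADC.

```python
import time


class RelativeExpiryCache:
    """Keep each entry for a fixed number of seconds after it is stored."""

    def __init__(self, rel_expiry_seconds, clock=time.monotonic):
        self._ttl = rel_expiry_seconds
        self._clock = clock
        self._store = {}

    def put(self, url, body):
        # Record the absolute expiry time together with the cached body
        self._store[url] = (self._clock() + self._ttl, body)

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the next request goes to the origin
            del self._store[url]
            return None
        return body


# A fake clock makes the expiry behavior easy to observe
now = [0]
cache = RelativeExpiryCache(300, clock=lambda: now[0])
cache.put("/TestBucket/Test.txt", b"file contents")
```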
Note: The server-side subnet needs internet access. For details, see https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Internet_Gateway.html.
Annexure — Complete Citrix ADC Configuration
enable ns feature LB REWRITE IC
enable ns mode USNIP
set cache parameter -memLimit 1000
add nameserver <nameserver>
add server server_s3 s3.eu-central-1.amazonaws.com
add servicegroup sg_s3_aws SSL
bind servicegroup sg_s3_aws server_s3 443
add lb vserver vs_s3 SSL <VIP> 443
bind lb vserver vs_s3 sg_s3_aws
add ssl certkey key1 -cert <cert> -key <key>
bind ssl vserver vs_s3 -certkeyName key1
add ns variable var1 -type text(64000) -scope global
add policy httpCallout hc1 -IPAddress <Python Server> -port 8000 -returnType TEXT -hostExpr HTTP.REQ.HOSTNAME -urlStemExpr "\"calc_signature.py\"" -headers url(HTTP.REQ.URL) -scheme http -resultExpr "HTTP.RES.BODY(2000)"
add ns assignment assign_var1 -variable "$var1" -set "sys.http_callout(hc1)"
add rewrite action rw_act_insert_s3_header insert_http_header x-amz-content-sha256 "$var1"
add rewrite action rw_act_hostname replace HTTP.REQ.HOSTNAME "\"s3.eu-central-1.amazonaws.com\""
add rewrite policy rw_pol_assign_var1 TRUE assign_var1
add rewrite policy rw_pol_insert_s3_header TRUE rw_act_insert_s3_header
add rewrite policy rw_pol_replace_hostname TRUE rw_act_hostname
bind lb vserver vs_s3 -policyName rw_pol_assign_var1 -priority 10 -gotoPriorityExpression NEXT -type REQUEST
bind lb vserver vs_s3 -policyName rw_pol_insert_s3_header -priority 20 -gotoPriorityExpression NEXT -type REQUEST
bind lb vserver vs_s3 -policyName rw_pol_replace_hostname -priority 30 -gotoPriorityExpression END -type REQUEST
add cache contentGroup cache_cg1 -relExpiry 300
add cache policy cache_pol_s3 -rule TRUE -action CACHE -storeInGroup cache_cg1 -undefAction NOCACHE
bind cache global cache_pol_s3 -priority 100 -gotoPriorityExpression END -type REQ_OVERRIDE
Python Script
# Python 3 version of the script (BaseHTTPServer is named http.server in Python 3)
from http.server import BaseHTTPRequestHandler, HTTPServer
import datetime
import hashlib
import hmac
import os
import sys

# ************* EDIT VALUES AS PER S3 REGION USED *************
method = 'GET'
service = 's3'
host = 's3.eu-central-1.amazonaws.com'
region = 'eu-central-1'
# ************************** END ******************************

# Read the credentials from the environment; get() returns None when a variable is unset
access_key = os.environ.get('ACCESS_KEY')
secret_key = os.environ.get('SECRET_KEY')
if access_key is None or secret_key is None:
    print('No access key is available.')
    sys.exit(1)


# Create custom HTTPRequestHandler class
class KodeFunHTTPRequestHandler(BaseHTTPRequestHandler):
    # Handle GET command
    def do_GET(self):
        try:
            if not self.path.endswith('.py'):
                self.send_error(404, 'file not found')
                return
            # Get the original request URL passed by the HTTP callout
            url1 = self.headers['url']
            if url1 is None:
                self.send_error(400, 'missing url header')
                return
            # Calculate headers
            head1 = calc_header(url1)
            # Send code 200 response
            self.send_response(200)
            # Send header first
            self.send_header('Content-type', 'text/plain')
            self.end_headers()
            # Send content to client (wfile expects bytes)
            self.wfile.write(head1.encode('utf-8'))
        except IOError:
            self.send_error(404, 'file not found')


def sign(key, msg):
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()


def getSignatureKey(key, dateStamp, regionName, serviceName):
    kDate = sign(('AWS4' + key).encode('utf-8'), dateStamp)
    kRegion = sign(kDate, regionName)
    kService = sign(kRegion, serviceName)
    kSigning = sign(kService, 'aws4_request')
    return kSigning


def calc_header(canonical_uri):
    t = datetime.datetime.now(datetime.timezone.utc)
    amzdate = t.strftime('%Y%m%dT%H%M%SZ')
    datestamp = t.strftime('%Y%m%d')  # Date w/o time, used in credential scope
    # The query string is not signed; only requests without one are handled
    canonical_querystring = ''
    # GET requests have no body, so hash the empty string
    payload_hash = hashlib.sha256(''.encode('utf-8')).hexdigest()
    canonical_headers = 'host:' + host + '\n' + 'x-amz-content-sha256:' + payload_hash \
        + '\n' + 'x-amz-date:' + amzdate + '\n'
    signed_headers = 'host;x-amz-content-sha256;x-amz-date'
    canonical_request = method + '\n' + canonical_uri + '\n' + canonical_querystring \
        + '\n' + canonical_headers + '\n' + signed_headers + '\n' + payload_hash
    algorithm = 'AWS4-HMAC-SHA256'
    credential_scope = datestamp + '/' + region + '/' + service + '/' + 'aws4_request'
    string_to_sign = algorithm + '\n' + amzdate + '\n' + credential_scope + '\n' \
        + hashlib.sha256(canonical_request.encode('utf-8')).hexdigest()
    signing_key = getSignatureKey(secret_key, datestamp, region, service)
    signature = hmac.new(signing_key, string_to_sign.encode('utf-8'), hashlib.sha256).hexdigest()
    authorization_header = algorithm + ' ' + 'Credential=' + access_key + '/' + credential_scope \
        + ',' + 'SignedHeaders=' + signed_headers + ',' + 'Signature=' + signature
    # The first line is the value of x-amz-content-sha256; the remaining
    # lines are complete headers that Citrix ADC inserts along with it
    header = payload_hash + '\r\n' + 'Authorization: ' + authorization_header + '\r\n' \
        + 'x-amz-date: ' + amzdate + '\r\n'
    return header


def run():
    # IP and port of the server; the HTTP callout is configured for port 8000.
    # Listen on all interfaces so that Citrix ADC can reach the server
    # (127.0.0.1 would accept connections from the local machine only).
    server_address = ('0.0.0.0', 8000)
    httpd = HTTPServer(server_address, KodeFunHTTPRequestHandler)
    httpd.serve_forever()


if __name__ == '__main__':
    run()
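The script signs every request with an empty canonical query string, so a request that carries query parameters would likely fail signature validation if the URL passed by the callout includes them. One possible extension, sketched here under that assumption, is to split the URL into its path and a canonical query string (parameters URI-encoded and sorted by name) before signing. The helper name split_canonical is hypothetical, and the sketch leaves the path itself unencoded.

```python
from urllib.parse import parse_qsl, quote, urlsplit


def split_canonical(url: str):
    """Return (path, canonical query string) for a request URL."""
    parts = urlsplit(url)
    # Decode the parameters, keeping those with empty values
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Re-encode everything except the unreserved characters, then sort by name
    encoded = sorted((quote(k, safe="-_.~"), quote(v, safe="-_.~")) for k, v in pairs)
    canonical_query = "&".join(k + "=" + v for k, v in encoded)
    return parts.path, canonical_query


print(split_canonical("/TestBucket/Test.txt"))
print(split_canonical("/TestBucket?prefix=a%20b&max-keys=2"))
```

The two return values would replace canonical_uri and canonical_querystring in calc_header. The rest of the signing logic stays the same.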
Sample Bucket Policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account-id>:user/<user-name>"
      },
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::<bucket-name>/*"
    }
  ]
}
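If you manage several buckets, generating the policy avoids hand-editing the ARNs. The sketch below fills in the same policy from three values. The account ID, user name, and bucket name shown are placeholders, and build_bucket_policy is a hypothetical helper, not part of any AWS library.

```python
import json


def build_bucket_policy(account_id: str, user_name: str, bucket_name: str) -> dict:
    """Return a policy that allows one IAM user to access objects in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::%s:user/%s" % (account_id, user_name)
                },
                "Action": "s3:*",
                "Resource": "arn:aws:s3:::%s/*" % bucket_name,
            }
        ],
    }


# Placeholder values, for illustration only
policy_json = json.dumps(
    build_bucket_policy("123456789012", "s3-proxy-user", "TestBucket"), indent=2
)
print(policy_json)
```

Because the proxy only issues GET requests, narrowing the action from s3:* to s3:GetObject may be sufficient for this setup.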
Accessing S3 from an EC2 instance doesn’t have to be complicated. Using Citrix ADC as a secure S3 proxy lets instances without direct internet access retrieve private S3 objects without signing the requests themselves.
Learn more about Citrix ADC on the product documentation page.