In the first part of this series, we looked at the services we could use, installed AWS SAM, and created the first application. This part is about working out what we need to store, and making some changes to the initial template.yaml to reflect that.
My first objective was to detect changes to the list of top films. To do this, I needed a Lambda function that would retrieve the Top 250 films from IMDB. I also decided to store the current Top 250 in a table, because having it stored somewhere would let me detect when it had changed the next time the Lambda function ran. DynamoDB seemed like the best place for this: it is a serverless solution, so I only pay for what I use.
So I started with a Lambda function that reads the films from IMDB and writes them to a DynamoDB table, whose name is provided as an environment variable.
There is one thing I needed to do first, and that is to register with IMDB (imdb-api.com) to get API access. This is free, but I needed it so I could get the URL, with my API key, to plug into the code below.
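To make sure the key works before wiring it into Lambda, a quick local check along these lines is enough (k_xxxxxxxx is a placeholder for your own key, and only the fields we use later are printed):

import requests

# Placeholder key: swap k_xxxxxxxx for the key you receive when you register
url = "https://imdb-api.com/en/API/Top250Movies/k_xxxxxxxx"

response = requests.get(url)
items = response.json()["items"]

# Print a few entries to confirm the key works and the JSON has the fields we need
for film in items[:5]:
    print(film["title"], film["year"], film["imDbRating"])

With that confirmed, here is the function itself: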
import boto3
import json
import os

import requests

print('Loading function')

# Clients are created outside of the handler so they are reused across invocations
region_name = os.environ['REGION_NAME']
dynamo = boto3.resource('dynamodb', region_name=region_name)

table_name = os.environ['TABLE_NAME']
table = dynamo.Table(table_name)


def respond(err, res=None):
    return {
        'statusCode': '400' if err else '200',
        'body': str(err) if err else json.dumps(res),
        'headers': {
            'Content-Type': 'application/json',
        },
    }


def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))

    url = "https://imdb-api.com/en/API/Top250Movies/k_rrXXXXkx"
    payload = {}
    headers = {}

    response = requests.request("GET", url, headers=headers, data=payload)
    print(response.text.encode('utf8'))

    # Each entry in 'items' is one film from the Top 250
    items_json = json.loads(response.text)['items']
    for i in items_json:
        # print(i['title'], ',', i['year'], ',', i['imDbRating'])
        table.put_item(
            Item={
                'title': str(i['title']),
                'year': str(i['year']),
                'rating': str(i['imDbRating']),
                'filmID': str(i['id']),
                'image': str(i['image'])
            }
        )

    scan_result = "table updated"
    return respond(None, res=scan_result)
We then need to save our Python code in a file, GetTop250.py, and put it into a subfolder called "src". The src folder is referenced in the template.yaml, which we will get to in a minute.
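If you want to sanity-check the handler before deploying, a rough local test looks something like this. It assumes requests is pip-installed on your machine, your AWS credentials are configured, the table already exists, and that it is run from inside the src folder; it will call the live API and write 250 items, and the region and table name shown are just placeholders:

import os

# Placeholder values purely for a local run; SAM injects the real ones at deploy time
os.environ["REGION_NAME"] = "eu-west-1"
os.environ["TABLE_NAME"] = "my-top250-table"

# Import after setting the variables, because the module reads them at import time
import GetTop250

# The handler ignores the event contents, so an empty event is fine
print(GetTop250.lambda_handler({}, None))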
One issue I faced straight away was that "requests" isn't a standard library module, so it isn't available in the Lambda Python runtime. The workaround is to use a layer that contains the module. The easiest way to build one is with Docker, which you should have installed before installing SAM.
So create a file and give it the following contents:
FROM amazonlinux:2.0.20191016.0
RUN yum install -y python39 && \
    yum install -y python3-pip && \
    yum install -y zip && \
    yum clean all
RUN python3.9 -m pip install --upgrade pip && \
    python3.9 -m pip install virtualenv
Save it as requests.Dockerfile (that is the file name the build command below expects). Then create a Docker image using that Dockerfile:
docker build -f "requests.Dockerfile" -t requests:latest .
Then run the image and connect to a container:
docker run -it --name requests requests:latest bash
Inside the container, install the module into a folder called python (the directory layout Lambda expects for a Python layer):
pip install requests -t ./python
Then zip up the python folder inside the container, and copy the ZIP out to your PC (the docker cp is run from your own machine, not inside the container):
zip -r requests.zip ./python/
docker cp requests:requests.zip ./Desktop/
You should now have a ZIP file that contains the module, and we can use it as a layer for AWS Lambda by uploading it:
aws lambda publish-layer-version --layer-name "requests" --description "Lambda Layer for Requests" \
--zip-file "fileb://requests.zip" --compatible-runtimes "python3.9"
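For completeness, the same publish step can also be done from Python with boto3 rather than the CLI; this is just a sketch of the equivalent call:

import boto3

lambda_client = boto3.client("lambda")

# Publish the ZIP we copied out of the container as a new layer version
with open("requests.zip", "rb") as zip_file:
    response = lambda_client.publish_layer_version(
        LayerName="requests",
        Description="Lambda Layer for Requests",
        Content={"ZipFile": zip_file.read()},
        CompatibleRuntimes=["python3.9"],
    )

print(response["LayerVersionArn"])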
One thing we still don't have is our DynamoDB table, so that is the next step, and thankfully it is a little easier: we can put it straight into our template.yaml. (Note that the template below also defines the layer as a RequestsLayer resource with ContentUri: dependencies/, so the extracted python folder needs to end up inside a dependencies/ folder in the project for SAM to package it.)
AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Description: Top 250 from IMDB.
Resources:
  GetTop250Function:
    Type: 'AWS::Serverless::Function'
    Properties:
      Handler: GetTop250.lambda_handler
      Runtime: python3.9
      Layers:
        - !Ref RequestsLayer
      CodeUri: src
      Description: Grab the Top 250 from IMDB and load into DynamoDB.
      MemorySize: 128
      Timeout: 60
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref Table
      Environment:
        Variables:
          TABLE_NAME: !Ref Table
          REGION_NAME: !Ref AWS::Region
      AutoPublishAlias: live
      DeploymentPreference:
        #Type: Canary10Percent10Minutes
        Type: AllAtOnce
  Table:
    Type: AWS::Serverless::SimpleTable
    Properties:
      PrimaryKey:
        Name: title
        Type: String
  RequestsLayer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      LayerName: sam-app-dependency
      Description: 'Contains requests'
      ContentUri: dependencies/
      CompatibleArchitectures:
        - x86_64
      CompatibleRuntimes:
        - python3.9
        - python3.8
        - python3.7
        - python3.6
      RetentionPolicy: Delete
Outputs:
  myTableName:
    Description: 'Name of the DynamoDB Table'
    Value: !Ref Table
    Export:
      Name: !Sub "${AWS::StackName}-Table"
One of the key elements of the file for DynamoDB is here:
Table:
  Type: AWS::Serverless::SimpleTable
  Properties:
    PrimaryKey:
      Name: title
      Type: String
This is where we tell SAM to create the table with a primary key called "title", of type string. We will of course want other attributes too (and we will use them), but as DynamoDB is not a SQL database we don't need to declare them up front; they are simply written with each item, as the example below shows.
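To illustrate that, here is a rough sketch of adding an attribute the table has never seen; the table name is a placeholder for whatever CloudFormation generates, and "watched" is an invented attribute purely for this example:

import boto3

dynamo = boto3.resource("dynamodb")
table = dynamo.Table("sam-app-Table-XXXXXXXX")  # placeholder: use the generated table name

# "watched" was never declared anywhere; DynamoDB simply stores it on this item
table.update_item(
    Key={"title": "The Shawshank Redemption"},
    UpdateExpression="SET watched = :w",
    ExpressionAttributeValues={":w": True},
)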
Additionally, we need to give the function a CRUD policy for the table, and set a couple of environment variables:
Policies:
  - DynamoDBCrudPolicy:
      TableName: !Ref Table
Environment:
  Variables:
    TABLE_NAME: !Ref Table
    REGION_NAME: !Ref AWS::Region
You can see that we have also added the section for the layer we created, and lastly we output the name of the DynamoDB table from the CloudFormation template. This is because we will want to use the table elsewhere, without having to worry about things like generated names.
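Once the stack is deployed (which we will do next), that export can be resolved from code instead of hard-coding the generated table name. A minimal sketch, assuming the stack was called sam-app (and ignoring pagination of the export list):

import boto3

cfn = boto3.client("cloudformation")

# Look up the exported value by the name we gave it in the Outputs section
stack_name = "sam-app"  # placeholder: whatever you name the stack in sam deploy --guided
export_name = f"{stack_name}-Table"

exports = cfn.list_exports()["Exports"]
table_name = next(e["Value"] for e in exports if e["Name"] == export_name)
print(table_name)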
So, now, how do we create it in AWS? We can use SAM.
sam deploy --guided
That will walk you through the prompts to create it all in AWS, via CloudFormation.
Phew. I know it is a lot, but Part II is done. We have a table to store the Top 250, and we have a Lambda function, with a layer, that goes out to retrieve the Top 250 and writes it to the table. The next step is a function that will check for updates: basically comparing what is on the website with what is in the table.