Sample Application - New Films in IMDB Top 250 - Part III

Published on 15 September 2022

On to part three! In the first part I talked about what the sample application should achieve and how I could go about it, and then went through the installation of AWS SAM (the AWS Serverless Application Model). In Part II, I used SAM to create the first Lambda function, which gets the contents of the IMDB Top 250, alongside a DynamoDB table. Part III covers another Lambda function, which will compare the current IMDB Top 250 with what is stored in our DynamoDB table and look for changes. I will also write the code to notify users via SES.

Similar to the previous function, I will use a Lambda function (Serverless!) so that it can run on demand. This of course means there is no need for an EC2 instance. The Lambda function will also be created using SAM.

As before, I will need to create the SAM app to wrap this.

    PS C:\Users\me\SAM\films\sam-app2> sam init

Our template file is going to be simpler, because we are not going to include DynamoDB here; we already have it defined in our previous app.

In this case, the Lambda function code will need to go to IMDB, get the list, and store it in a Python list (remember to use the URL that I got from registering in the last part). It will then step through the list, and check for any film released in the past couple of years to see if it is already in the table. I did this for several reasons, but mainly to reduce requests against the table as I wouldn't expect a film to get added years after release. It is possible though I guess!
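
To make the year filter concrete, here is a minimal sketch of that step on its own. The function name `recent_films`, the `window` parameter and the sample titles are made up for illustration; only the `year` field name comes from the API response used in this series.

```python
import datetime


def recent_films(items, current_year=None, window=1):
    """Return films released in current_year or up to `window` years before it."""
    if current_year is None:
        current_year = datetime.date.today().year
    earliest = current_year - window
    recent = []
    for film in items:
        try:
            film_year = int(film["year"])
        except (KeyError, TypeError, ValueError):
            continue  # skip entries without a usable year
        if earliest <= film_year <= current_year:
            recent.append(film)
    return recent


# Hypothetical sample data, shaped like the 'items' list from the API
sample = [
    {"title": "Film A", "year": "2022"},
    {"title": "Film B", "year": "2021"},
    {"title": "Film C", "year": "1994"},
]
print([f["title"] for f in recent_films(sample, current_year=2022)])
```

Comparing the years as integers avoids the string conversions, and passing the year in as an argument makes the filter easy to test.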

If the code finds a film that isn't in the DynamoDB table, it builds a herestring to go into an email body, and then builds an email, ready to send. We don't have any email address to send to at the moment, but we will deal with that later.
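
As a rough sketch of building that email body, the helper below returns an HTML version and a plain-text version. The name `build_email_bodies` is hypothetical; the field names (`title`, `year`, `imDbRating`, `image`) are the ones used in the function later in this post. Escaping the values with `html.escape` is my own addition, on the assumption that titles can contain characters such as `&`.

```python
import html


def build_email_bodies(film):
    """Return (html_body, text_body) for a newly added film."""
    # Escape values before placing them in HTML (also escapes quotes)
    title = html.escape(str(film["title"]))
    year = html.escape(str(film["year"]))
    rating = html.escape(str(film["imDbRating"]))
    image = html.escape(str(film["image"]))

    html_body = (
        "<p>A new film has been added to the IMDB Top 250 list!</p>"
        f"<p>The film that has been added is: {title}, released {year}, "
        f"and with an average rating of {rating}.</p>"
        f'<img src="{image}">'
    )
    # The plain-text part carries the same message without any markup
    text_body = (
        "A new film has been added to the IMDB Top 250 list!\n\n"
        f"The film that has been added is: {film['title']}, "
        f"released {film['year']}, "
        f"and with an average rating of {film['imDbRating']}.\n"
    )
    return html_body, text_body
```

Keeping the two bodies separate means mail clients that only show plain text do not display raw tags.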

Now in this case, we are using SES to send the emails. We need to configure that first, by verifying the email address that we are going to send from, and we will also need to ensure that the addresses we send to are verified. To use SES, we access it from the boto3 client.

    ses = boto3.client('ses', region_name=region_name)

First, we need to check if there are any new films. What we will do is check if the film was released this year (or the previous year), and if so, look it up in our DynamoDB table to check our existing records. If the film is not there, we have a new entry, so we need to send out a notification!

The code to do the check looks like this:

        for i in items_json:

            if (str(i['year']) == year) or (str(i['year']) == str(int(year)-1)):
                #Film was released this year so check if it is in the table
                print(year)
                print(i)
                response = table.get_item(Key={
                        'title': i['title']
                    })
                if 'Item' in response:
                    #item exists
                    print("Item exists, nothing to do.")
                else:
                    print("Item is new! We need to do something with it")
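
The same lookup can be pulled into a small function that returns the new films, which makes it possible to try the logic without touching AWS. This is only a sketch: `find_new_films` and `FakeTable` are hypothetical names, and `FakeTable` is a stand-in that mimics only the part of the `get_item` response the check relies on (an `Item` key when the record exists).

```python
def find_new_films(films, table):
    """Return the films whose title is not already stored in the table."""
    new_films = []
    for film in films:
        response = table.get_item(Key={"title": film["title"]})
        if "Item" not in response:
            new_films.append(film)
    return new_films


class FakeTable:
    """Stand-in for a boto3 DynamoDB Table, for local testing only."""

    def __init__(self, titles):
        self._titles = set(titles)

    def get_item(self, Key):
        # Like DynamoDB, only include 'Item' when the key was found
        if Key["title"] in self._titles:
            return {"Item": {"title": Key["title"]}}
        return {}


stored = FakeTable(["Film A"])
candidates = [{"title": "Film A"}, {"title": "Film B"}]
print(find_new_films(candidates, stored))
```

In the Lambda function itself, the real `table` object would be passed in place of `FakeTable`.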

At that point, we need to send a notification to the email addresses configured in SES. To get the addresses from SES, we will use code like this:

    # Get the list of identities
    identities = ses.list_identities(
        IdentityType='EmailAddress'
    )

We can then step through each "identity" and ensure it is verified:

    # For each identity, check if it is verified
    for i in identities['Identities']:
        verified = ses.get_identity_verification_attributes(
            Identities=[i]
        )
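
The response from `get_identity_verification_attributes` holds a `VerificationAttributes` dictionary keyed by identity, each with a `VerificationStatus`. A minimal sketch of picking out the verified addresses from such a response is shown below; the helper name `verified_addresses` and the example.com addresses are made up for illustration.

```python
def verified_addresses(verification_response):
    """Return the identities whose VerificationStatus is exactly 'Success'."""
    attributes = verification_response.get("VerificationAttributes", {})
    return sorted(
        identity
        for identity, details in attributes.items()
        if details.get("VerificationStatus") == "Success"
    )


# Hypothetical response, shaped like the one SES returns
sample_response = {
    "VerificationAttributes": {
        "alice@example.com": {"VerificationStatus": "Success"},
        "bob@example.com": {"VerificationStatus": "Pending"},
    }
}
print(verified_addresses(sample_response))
```

Since the `Identities` parameter takes a list, several addresses can be checked in one call rather than one call per address, and the result passed through a helper like this.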

You have to be very careful using SES, to ensure that you are not sending spam. Amazon take it very seriously. At a minimum, you should be making sure any email address you send to has been verified, meaning that someone opted in, by clicking on a link that they received in an email.

We then use ses.send_email to send the actual email:

    response = ses.send_email(
        ..
    )
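
The arguments to `ses.send_email` are a nested dictionary, so it can help to build them in one place. The helper below is a sketch under that assumption; `build_send_email_args` is a hypothetical name, while the keys (`Source`, `Destination`, `Message`, `Subject`, `Body`, `Html`, `Text`, `Charset`, `Data`) are the ones used in the full function.

```python
def build_send_email_args(sender, recipient, subject, html_body, text_body,
                          charset="UTF-8"):
    """Build the keyword arguments for an SES send_email call."""
    return {
        "Source": sender,
        "Destination": {"ToAddresses": [recipient]},
        "Message": {
            "Subject": {"Charset": charset, "Data": subject},
            "Body": {
                "Html": {"Charset": charset, "Data": html_body},
                "Text": {"Charset": charset, "Data": text_body},
            },
        },
    }


args = build_send_email_args(
    sender="films@heywoodonline.com",
    recipient="someone@example.com",  # hypothetical recipient
    subject="New Top 250 film!",
    html_body="<p>A new film has been added.</p>",
    text_body="A new film has been added.",
)
print(args["Destination"])
```

With a real client, the call would then be `ses.send_email(**args)`.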

You can see the full code for the function below.

    import boto3
    import json
    import os
    import requests
    import datetime


    print('Loading function')
    # client created outside of the handler
    region_name = os.environ['REGION_NAME']
    dynamo = boto3.resource('dynamodb', region_name=region_name)
    ses = boto3.client('ses', region_name=region_name)
    table_name = os.environ['TABLE_NAME']
    table = dynamo.Table(table_name)

    # Get the current year
    currentDateTime = datetime.datetime.now()
    date = currentDateTime.date()
    year = date.strftime("%Y")


    def respond(err, res=None):
        # Python 3 exceptions have no .message attribute, so use str(err)
        return {
            'statusCode': '400' if err else '200',
            'body': str(err) if err else json.dumps(res),
            'headers': {
                'Content-Type': 'application/json',
            },
        }


    def lambda_handler(event, context):
        print("Received event: " + json.dumps(event, indent=2))

        url = "https://imdb-api.com/en/API/Top250Movies/k_rXXXXXkx"
        payload = {}
        headers= {}
        response = requests.request("GET", url, headers=headers, data = payload)
        print(response.text.encode('utf8'))
        items_json = json.loads(response.text)['items']

        # Check if the film is already in the table
        # We are only checking films released over the past couple of years as it is
        # unlikely that an older film will make it.
        for i in items_json:

            if (str(i['year']) == year) or (str(i['year']) == str(int(year)-1)):
                #Film was released this year so check if it is in the table
                print(year)
                print(i)
                response = table.get_item(Key={
                        'title': i['title']
                    })
                if 'Item' in response:
                    #item exists
                    print("Item exists, nothing to do.")
                else:
                    print("Item is new! We need to do something with it")
                    # Define the email to send
                    CHARSET = 'UTF-8'
                    SENDER = 'films@heywoodonline.com'
                    email_body = '''
                    A new film has been added to the IMDB Top 250 list! 
                    
                    The film that has been added is: {}, released {}, and with an average rating of {}.<br><br>
                    <img src={}>'''.format(i['title'],i['year'],i['imDbRating'],i['image'])
                    
                    # Get the list of identities
                    identities = ses.list_identities(
                        IdentityType='EmailAddress'
                    )

                    # For each identity, check if it is verified
                    # (named "identity" so it does not shadow the film loop variable "i")
                    for identity in identities['Identities']:
                        verified = ses.get_identity_verification_attributes(
                            Identities=[identity]
                        )

                        # If it has been verified, we can send an email
                        status = verified['VerificationAttributes'].get(identity, {}).get('VerificationStatus')
                        if status == 'Success':
                            recipient = identity
                            response = ses.send_email(
                                Destination={
                                    'ToAddresses': [
                                        recipient,
                                    ],
                                },
                                Message={
                                    'Body': {
                                        'Html': {
                                            'Charset': CHARSET,
                                            'Data': email_body,
                                        },
                                        'Text': {
                                            'Charset': CHARSET,
                                            'Data': email_body,
                                        },
                                    },
                                    'Subject': {
                                        'Charset': CHARSET,
                                        'Data': 'New Top 250 film!',
                                    },
                                },
                                Source=SENDER,
                            )

        scan_result = "table updated"
        return respond(None, res=scan_result)

That completes our checkupdates.py file. The other piece we need is our template.yaml file. This is a bit smaller than the one for our other app, but we need to include permissions for SES, and we will also refer to our DynamoDB table via an imported value (!ImportValue).

    # SAM file
    AWSTemplateFormatVersion: '2010-09-09'
    Transform: 'AWS::Serverless-2016-10-31'
    Description: Check IMDB Top 250 for Updates.
    Resources:
      CheckUpdatesFunction:
        Type: 'AWS::Serverless::Function'
        Properties:
          Handler: CheckUpdates.lambda_handler
          Runtime: python3.9  
          Layers:
            - !Ref bs4Layer
          CodeUri: src
          Description: Check IMDB Top 250 for updates.
          MemorySize: 128
          Timeout: 60
          Policies:
            - SESCrudPolicy: 
                IdentityName: 'heywoodonline.com'
            - DynamoDBCrudPolicy:
                TableName: !ImportValue sam-app-Table
            - Version: '2012-10-17'
              Statement:
                - Effect: Allow
                  Action:
                    - 'ses:SendEmail'
                    - 'ses:VerifyEmailIdentity'
                    - 'ses:ListIdentities'
                    - 'ses:GetIdentityVerificationAttributes'
                  Resource: '*'
          Environment:
            Variables:
              TABLE_NAME: !ImportValue sam-app-Table 
              REGION_NAME: !Ref AWS::Region
    #      Events:
    #        HelloWorldSAMAPI:
    #          Type: Api
    #          Properties:
    #            Path: /hello
    #            Method: GET
          AutoPublishAlias: live
          DeploymentPreference:
            #Type: Canary10Percent10Minutes
            Type: AllAtOnce
      bs4Layer:
        Type: AWS::Serverless::LayerVersion
        Properties:
          LayerName: sam-app-dependency
          Description: 'Contains bs4, requests and lxml '
          ContentUri: dependencies/
          CompatibleArchitectures:
            - x86_64
          CompatibleRuntimes:
            - python3.9
            - python3.8
            - python3.7
            - python3.6
          RetentionPolicy: Delete

As with our previous SAM app, we save it into the src folder, and from there we can build and deploy, as we did with our previous component.

    sam deploy --guided

That will walk you through the prompts to create it all in AWS, via CloudFormation.

The next part (IV) will look at bringing it all together with the basic webpage to front it.
