Creating Reddit posts with Lambda - Part I

Published on 28 April 2023

Overview

If you want to create a subreddit, you might want to publish posts to it automatically. What better way to do it than with serverless, and with AWS Lambda in particular?

In this post we will create a function that checks Twitter for updates and then posts the most recent tweet to Reddit. We can then use CloudWatch Events (now Amazon EventBridge) to schedule it.

Prerequisites

Firstly though, we need to establish our credentials for programmatically accessing Reddit. To do that, log in to Reddit using your normal account, and then go to their Apps page.

Then click "Create an app" at the bottom.

Reddit will then give you your client ID and secret, which you will need to put in your Lambda function.

reddit-1

We will also need the "praw" and "twitter" Python modules. We can add these to the function as a layer, which we need to create first. I created a layer in a previous post using Docker locally, but you can actually do it in Cloud9 really easily, and it is probably a lot faster if you don't have access to a Docker installation.

To do this, just create a new Cloud9 environment (call it anything, but accept the defaults), start it, and then run the following:

        pip install praw -t ./python
        pip install twitter -t ./python
        zip -r praw.zip ./python 
        aws lambda publish-layer-version --layer-name "praw" --description "Lambda Layer for praw" --zip-file "fileb://praw.zip" --compatible-runtimes "python3.10"

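You should then get metadata confirming that it uploaded, and you can then delete your Cloud9 environment. Using Cloud9 for this is just so quick and easy.

Then you need to create a new Lambda function. When you do, give it a name, choose Python as the runtime (3.10, to match the layer's compatible runtime), and choose "Create function". Once it is created, add the layer we just published to the function so that the modules can be imported.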

Then we can look at the code.

Code

Firstly, you will need to import the modules we are going to use:

        import os, json, praw
        from twitter import Twitter, OAuth

Then we need to set the constants. They are read from environment variables using the os module, which means that you will need to set those variables in the "Configuration" section of your Lambda function.

reddit-2

Once you have set these, you can use the following code to pull them through:

        # Constants
        TWITTER_CONSUMER_KEY = os.environ['twitter_consumer_key']
        TWITTER_CONSUMER_SECRET = os.environ['twitter_consumer_secret']
        TWITTER_ACCESS_TOKEN_KEY = os.environ['twitter_access_token_key']
        TWITTER_ACCESS_TOKEN_SECRET = os.environ['twitter_access_token_secret']
        REDDIT_USER = os.environ['reddit_user']
        REDDIT_PASS = os.environ['reddit_pass']
        REDDIT_CLIENT_ID = os.environ['reddit_clientid']
        REDDIT_SECRET = os.environ['reddit_secret']
        TWITTER_ACCOUNT = "twitter"
        SUBREDDIT = "test"

We have the credentials to connect to Twitter, then the ones to connect to Reddit, and then the name of the Twitter account we are checking for posts, and the name of the subreddit that we will post it to.
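A missing variable makes `os.environ[...]` raise a bare `KeyError` when the function loads, which can be hard to read in the logs. As a minimal sketch, a small helper could check all of the names in one go and report every missing one at once. The name `load_settings` is hypothetical and not part of the code in this post:

```python
import os

# Names of the environment variables used in this post
REQUIRED_VARS = [
    "twitter_consumer_key",
    "twitter_consumer_secret",
    "twitter_access_token_key",
    "twitter_access_token_secret",
    "reddit_user",
    "reddit_pass",
    "reddit_clientid",
    "reddit_secret",
]


def load_settings(environ=os.environ):
    """Return the required settings as a dict, or raise listing every missing name."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Passing the mapping in as an argument keeps the helper easy to try locally with a plain dict instead of the real environment.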

You can then connect to Twitter and Reddit:

        # Connect to twitter
        t = Twitter(auth=OAuth(TWITTER_ACCESS_TOKEN_KEY, TWITTER_ACCESS_TOKEN_SECRET, TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET))

        # Connect to reddit
        reddit = praw.Reddit(
            client_id=REDDIT_CLIENT_ID,
            client_secret=REDDIT_SECRET,
            user_agent="my script",
            password=REDDIT_PASS,
            username=REDDIT_USER,
            )
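Reddit's API rules ask for a unique, descriptive user agent in the form `<platform>:<app ID>:<version string> (by /u/<reddit username>)`, so something more specific than "my script" is probably worth using. A small sketch of building one follows; the helper name and all of the values are made up for illustration:

```python
def build_user_agent(platform, app_id, version, reddit_username):
    """Build a user agent string in the format suggested by Reddit's API rules."""
    return f"{platform}:{app_id}:{version} (by /u/{reddit_username})"


# Hypothetical values, for illustration only
user_agent = build_user_agent("aws-lambda", "tweet-to-reddit", "v0.1", "example_user")
```

The resulting string can be passed as the `user_agent` argument to `praw.Reddit`.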

Now we are ready to write our handler. First we will get the most recent tweet and check that it carries a link, so that we post an actual tweet and not just a reply to a different tweet. Then we can post it to Reddit:

        def lambda_handler(event, context):
            # Get the most recent tweet
            tweet = t.statuses.user_timeline(screen_name=TWITTER_ACCOUNT, count=1)

            # Only post tweets that carry a link; response stays None otherwise
            response = None
            urls = tweet[0]['entities']['urls'] if tweet else []
            if urls and urls[0]['expanded_url']:
                response = reddit.subreddit(SUBREDDIT).submit(tweet[0]['text'], url=urls[0]['expanded_url'])

We check the response, and if good, we return a 200:

            if response:
                # response is the new submission, so we can report its ID
                poststatus = 'Successful submission. The ID is ' + response.id
                return {
                    'statusCode': 200,
                    'body': json.dumps(poststatus)
                }
            else:
                return {
                    'statusCode': 400,
                    'body': json.dumps('Something broke..')
                }
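The link check in the handler is easier to test away from Lambda if it lives in a pure function. A minimal sketch, assuming the timeline is a list of dicts with the `text` and `entities` / `urls` / `expanded_url` fields used above (`pick_link_post` is a hypothetical name, and the sample data is made up):

```python
def pick_link_post(timeline):
    """Return (title, url) for the newest tweet if it carries a link, else None."""
    if not timeline:
        return None
    tweet = timeline[0]
    urls = tweet.get("entities", {}).get("urls", [])
    if not urls or not urls[0].get("expanded_url"):
        return None
    return tweet["text"], urls[0]["expanded_url"]


# Sample data shaped like the fields the handler reads (values are made up)
with_link = [{"text": "New post", "entities": {"urls": [{"expanded_url": "https://example.com/post"}]}}]
without_link = [{"text": "just a reply", "entities": {"urls": []}}]

print(pick_link_post(with_link))
print(pick_link_post(without_link))
```

The handler could then call this function and only submit to Reddit when it returns a value.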

Scheduling

Lastly, we need to schedule it to run. While something event-driven would be nice, we don't have an event source yet, so we use Amazon EventBridge (formerly CloudWatch Events) to trigger it on a schedule. To do that:

  1. Open CloudWatch
  2. Go to Rules
  3. Add a rule, entering a name, and choose a rule type of "Schedule"
  4. Select a type of "Recurring Schedule"
  5. Enter a cron expression (once every 24 hours or similar)
  6. Select a Lambda function as the target
  7. Choose Next, and finish the schedule.
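The same schedule can also be set up from code rather than the console. The following is only a sketch, assuming a boto3 EventBridge client (`boto3.client("events")`) is passed in; the function name, rule name and target ID are hypothetical:

```python
def schedule_daily(events_client, rule_name, function_arn):
    """Create (or update) a rule that fires once a day and targets the function."""
    rule = events_client.put_rule(
        Name=rule_name,
        # A rate expression; a cron form such as "cron(0 9 * * ? *)" also works
        ScheduleExpression="rate(1 day)",
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "reddit-poster" is an arbitrary target ID chosen for this sketch
        Targets=[{"Id": "reddit-poster", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

When the rule is created this way rather than through the console, the function also needs a resource-based permission that allows `events.amazonaws.com` to invoke it, which can be added with `add_permission` on the Lambda client.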

One thing you will also probably want to consider is disabling retries.
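Scheduled rules invoke Lambda asynchronously, and Lambda retries a failed asynchronous invocation up to twice by default, so a run that fails part-way could end up being attempted again. As a sketch, assuming a boto3 Lambda client (`boto3.client("lambda")`) is passed in, the retry count can be set to zero like this (the helper name is hypothetical):

```python
def disable_async_retries(lambda_client, function_name):
    """Set the retry count for asynchronous invocations of the function to zero."""
    return lambda_client.put_function_event_invoke_config(
        FunctionName=function_name,
        MaximumRetryAttempts=0,
    )
```

The same setting is available in the console under the function's asynchronous invocation configuration.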

In the next part, we will look at putting the data we take from Twitter into a database so we can then do some more advanced things with it.
