Making a Reddit + Twitter Bot

Hi there Pythonistas. I hope you are all fine. In this post I am going to show you how to make a Reddit + Twitter bot. The bot copies post titles and URLs from any subreddit you choose and posts them to Twitter, keeping the 140-character limit in mind.

First, let me tell you what Reddit is. Reddit is a social link-sharing site where good links are upvoted and bad links are downvoted. So let's start.

Required external libraries

  • Praw - (Reddit API wrapper in Python) pip install praw
  • Requests - (HTTP library) pip install requests
  • Tweepy - (Twitter Python API) pip install tweepy

First step - Register Yourself

First of all you will have to register an app at http://dev.twitter.com/apps. After registering, copy the following:

  • access token
  • access token secret
  • consumer key
  • consumer secret

You will have to edit the permissions for your app under the settings tab and grant your application read and write permission. So now we are ready to move on.

Required imports

Now let's start writing our script. First of all we have to import the required libraries and set up some basic variables:

import praw
import json
import requests
import tweepy
import time

access_token = 'YOUR ACCESS TOKEN HERE'
access_token_secret = 'YOUR ACCESS TOKEN SECRET HERE'
consumer_key = 'YOUR CONSUMER KEY HERE'
consumer_secret = 'YOUR CONSUMER SECRET HERE'
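
If you would rather not keep the keys in the script itself, one option is to read them from environment variables. This is only a minimal sketch and not part of the original bot: the variable names (TWITTER_ACCESS_TOKEN and so on) and the load_credentials helper are assumptions made up for this example.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
CREDENTIAL_VARS = {
    'access_token': 'TWITTER_ACCESS_TOKEN',
    'access_token_secret': 'TWITTER_ACCESS_TOKEN_SECRET',
    'consumer_key': 'TWITTER_CONSUMER_KEY',
    'consumer_secret': 'TWITTER_CONSUMER_SECRET',
}

def load_credentials(environ=os.environ):
    """Return a dict with the four credentials, or raise if one is missing."""
    credentials = {}
    missing = []
    for name, env_var in sorted(CREDENTIAL_VARS.items()):
        value = environ.get(env_var)
        if value:
            credentials[name] = value
        else:
            missing.append(env_var)
    if missing:
        raise RuntimeError('missing environment variables: ' + ', '.join(missing))
    return credentials
```

With something like this in place, consumer_key would become load_credentials()['consumer_key'], and likewise for the other three values.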

Initiating connection with Reddit

Now we have to initiate a connection with Reddit. Let's define a function to do just that.

def setup_connection_reddit(subreddit):
    print "[bot] setting up connection with Reddit"
    r = praw.Reddit('yasoob_python reddit twitter bot '
                'monitoring %s' %(subreddit)) 
    subreddit = r.get_subreddit(subreddit)
    return subreddit

This function connects to Reddit, gets the subreddit of our choice, and returns it for us to work with.

Next we have to define a function that gets the list of posts and their URLs from Reddit. Let's do that as well.

from collections import OrderedDict

def tweet_creator(subreddit_info):
    # OrderedDict keeps the titles in the same order as post_ids,
    # so tweeter() can safely zip the two together
    post_dict = OrderedDict()
    post_ids = []
    print "[bot] Getting posts from Reddit"
    for submission in subreddit_info.get_hot(limit=20):
        # strip_title function is defined later
        post_dict[strip_title(submission.title)] = submission.url
        post_ids.append(submission.id)
    print "[bot] Generating short link using goo.gl"
    mini_post_dict = OrderedDict()
    for post_title, post_link in post_dict.items():
        # the shorten function is defined later
        short_link = shorten(post_link)
        mini_post_dict[post_title] = short_link
    return mini_post_dict, post_ids

First we declare a dictionary to hold the post titles and links, and then a list to hold the unique ID of every post that we grab. The list is used to track which posts we have already grabbed.

After that we loop over the posts and add values to the dictionary and the list. If you use Twitter frequently, you know how ugly long links look, so we use goo.gl to generate short links. That is the next thing the function does: it loops over post_dict, makes a short link for every link, and stores it in a new dictionary, mini_post_dict.
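
One thing to watch with a dictionary keyed by title is that two posts sharing the same title collapse into one entry, while post_ids still holds both IDs. A possible alternative, sketched here under my own assumptions, keeps each post as a (title, link, id) tuple in a list. The collect_posts name and the FakeSubmission stand-in are hypothetical; they only rely on the title, url and id attributes that the function above already reads from each submission.

```python
from collections import namedtuple

# Stand-in for a Reddit submission, used only to make this sketch runnable.
FakeSubmission = namedtuple('FakeSubmission', ['title', 'url', 'id'])

def collect_posts(submissions):
    """Return a list of (title, link, post_id) tuples in the order received."""
    posts = []
    for submission in submissions:
        posts.append((submission.title, submission.url, submission.id))
    return posts

sample = [
    FakeSubmission('Same title', 'http://example.com/a', '1mb4y4'),
    FakeSubmission('Same title', 'http://example.com/b', '1mb867'),
]
for title, link, post_id in collect_posts(sample):
    print('%s %s %s' % (post_id, title, link))
```

Because every tuple carries its own ID, there is nothing to zip together later and nothing to fall out of step.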

Now let's define the function that actually shortens the links for us. Here it is:

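def shorten(url):
    # goo.gl expects a JSON body of the form {"longUrl": "..."}
    headers = {'content-type': 'application/json'}
    payload = {"longUrl": url}
    api_url = "https://www.googleapis.com/urlshortener/v1/url"
    r = requests.post(api_url, data=json.dumps(payload), headers=headers)
    # the short link comes back under the 'id' key of the JSON reply
    link = json.loads(r.text)['id']
    return link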

The above function builds a header and a payload and sends them to Google, which returns the short link as JSON.
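
If the service replies with an error, the JSON will not contain an 'id' key and the lookup in shorten raises a KeyError. A minimal sketch of a more forgiving parser follows. Note that extract_short_link is a hypothetical helper, and falling back to the long URL is just one possible choice.

```python
import json

def extract_short_link(response_text, long_url):
    """Return the short link from the JSON reply, or long_url if it is absent."""
    try:
        data = json.loads(response_text)
    except ValueError:
        # The body was not valid JSON at all.
        return long_url
    if isinstance(data, dict) and 'id' in data:
        return data['id']
    return long_url
```

Inside shorten you could pass it r.text together with the original link, so a failed request degrades to tweeting the long link instead of crashing the bot.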

Twitter hates more than 140 characters

If you use Twitter regularly then I am sure you know that Twitter does not allow tweets longer than 140 characters. To deal with that, let's define a function that truncates long titles.

def strip_title(title):
    if len(title) < 94:
        return title
    else:
        return title[:93] + "..."

We pass a title to the above function and it checks whether the title is shorter than 94 characters. If the title is 94 characters or longer, the function keeps the first 93 characters and appends three dots to the end.
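
To see why 93 is a safe cut-off, it helps to add up the worst case. The sketch below assumes a 20-character short link like the http://goo.gl/sF8Xgm one in the sample output further down; the constant names are made up for this example.

```python
HASHTAGS = " #Python #reddit #bot"    # 21 characters, including the leading space
SHORT_LINK = "http://goo.gl/sF8Xgm"   # 20 characters (assumed typical length)

def strip_title(title):
    if len(title) < 94:
        return title
    else:
        return title[:93] + "..."

longest_title = strip_title("x" * 200)  # 93 characters plus three dots
tweet = longest_title + " " + SHORT_LINK + HASHTAGS
print(len(longest_title))  # 96
print(len(tweet))          # 138
```

So the longest possible tweet is 96 + 1 + 20 + 21 = 138 characters, which leaves two characters of slack under the limit.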

Dealing with duplicate posts

So now our final script is starting to take shape. There is one thing that we have to keep in mind: no one likes duplicate posts, so we have to make sure that we do not post the same tweets over and over again.

To tackle this issue we are going to make a file named posted_posts.txt. Whenever we tweet a post from Reddit we will add its ID to this file, and before posting to Twitter we will check whether the post with this ID has already been posted.

Let's define two more functions. The first one writes an ID to the file and the second one checks whether a post has already been posted.

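def add_id_to_file(post_id):
    # append the ID on its own line
    with open('posted_posts.txt', 'a') as id_file:
        id_file.write(str(post_id) + "\n")

def duplicate_check(post_id):
    # returns 1 if post_id is already in the file, otherwise 0
    found = 0
    with open('posted_posts.txt', 'r') as id_file:
        for line in id_file:
            # compare whole lines so one ID cannot match inside another
            if line.strip() == post_id:
                found = 1
    return found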
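
The check above reopens the file for every post. A possible variation, sketched here with hypothetical names (load_posted_ids, is_duplicate, and an optional path argument), reads the file once into a set and compares whole IDs. It also treats a missing file as "nothing posted yet", which is my own assumption about the desired behavior.

```python
import os

def load_posted_ids(path='posted_posts.txt'):
    """Return the set of IDs stored one per line in path (empty if no file)."""
    if not os.path.exists(path):
        return set()
    with open(path, 'r') as id_file:
        return set(line.strip() for line in id_file if line.strip())

def is_duplicate(post_id, posted_ids):
    """True if post_id exactly matches an ID that was already posted."""
    return post_id in posted_ids
```

You would call load_posted_ids() once before the posting loop and then ask is_duplicate(post_id, posted_ids) for each post.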

Make a function for twitter will ya

So now let's write one of our main functions. This function is actually going to post to Twitter.

def tweeter(post_dict, post_ids):
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)
    for post, post_id in zip(post_dict, post_ids):
        found = duplicate_check(post_id)
        if found == 0:
            print "[bot] Posting this link on twitter"
            print post+" "+post_dict[post]+" #Python #reddit #bot"
            api.update_status(post+" "+post_dict[post]+" #Python #reddit #bot")
            add_id_to_file(post_id)
            time.sleep(30)
        else:
            print "[bot] Already posted"

First we set up a connection with Twitter using the credentials we defined in the beginning. After that we loop over post_dict and post_ids together and check each post for duplicates. If a post has not been posted before, we post it and add its ID to the posted_posts.txt file. After posting we wait for 30 seconds so that we do not spam Twitter with tweets.

Where's the main function bud

So let's define our last function. This function coordinates all the other functions. Here is the code:

def main():
    subreddit = setup_connection_reddit('python')
    post_dict, post_ids = tweet_creator(subreddit)
    tweeter(post_dict, post_ids)

Now we are ready. Just add this little snippet at the end as well:

if __name__ == '__main__':
    main()

This checks whether the script is executed directly or imported. The main() function runs only when the script is executed directly.

Complete code

Here is the complete script:

import praw
import json
import requests
import tweepy
import time

access_token = 'YOUR ACCESS TOKEN HERE'
access_token_secret = 'YOUR ACCESS TOKEN SECRET HERE'
consumer_key = 'YOUR CONSUMER KEY HERE'
consumer_secret = 'YOUR CONSUMER SECRET HERE'

def strip_title(title):
    if len(title) < 94:
        return title
    else:
        return title[:93] + "..."

from collections import OrderedDict

def tweet_creator(subreddit_info):
    # OrderedDict keeps the titles in the same order as post_ids
    post_dict = OrderedDict()
    post_ids = []
    print "[bot] Getting posts from Reddit"
    for submission in subreddit_info.get_hot(limit=20):
        post_dict[strip_title(submission.title)] = submission.url
        post_ids.append(submission.id)
    print "[bot] Generating short link using goo.gl"
    mini_post_dict = OrderedDict()
    for post_title, post_link in post_dict.items():
        short_link = shorten(post_link)
        mini_post_dict[post_title] = short_link
    return mini_post_dict, post_ids

def setup_connection_reddit(subreddit):
    print "[bot] setting up connection with Reddit"
    r = praw.Reddit('yasoob_python reddit twitter bot '
                'monitoring %s' %(subreddit)) 
    subreddit = r.get_subreddit(subreddit)
    return subreddit

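def shorten(url):
    # goo.gl expects a JSON body of the form {"longUrl": "..."}
    headers = {'content-type': 'application/json'}
    payload = {"longUrl": url}
    api_url = "https://www.googleapis.com/urlshortener/v1/url"
    r = requests.post(api_url, data=json.dumps(payload), headers=headers)
    # the short link comes back under the 'id' key of the JSON reply
    link = json.loads(r.text)['id']
    return link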

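def duplicate_check(post_id):
    # returns 1 if post_id is already in the file, otherwise 0
    found = 0
    with open('posted_posts.txt', 'r') as id_file:
        for line in id_file:
            # compare whole lines so one ID cannot match inside another
            if line.strip() == post_id:
                found = 1
    return found

def add_id_to_file(post_id):
    # append the ID on its own line
    with open('posted_posts.txt', 'a') as id_file:
        id_file.write(str(post_id) + "\n")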

def main():
    subreddit = setup_connection_reddit('python')
    post_dict, post_ids = tweet_creator(subreddit)
    tweeter(post_dict, post_ids)

def tweeter(post_dict, post_ids):
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)
    for post, post_id in zip(post_dict, post_ids):
        found = duplicate_check(post_id)
        if found == 0:
            print "[bot] Posting this link on twitter"
            print post+" "+post_dict[post]+" #Python #reddit #bot"
            api.update_status(post+" "+post_dict[post]+" #Python #reddit #bot")
            add_id_to_file(post_id)
            time.sleep(30)
        else:
            print "[bot] Already posted" 

if __name__ == '__main__':
    main()

Save this file as reddit_bot.py, create an empty file named posted_posts.txt next to it, and then run the script from the terminal. Your output will look something like this:

yasoob@yasoob:~/Desktop$ python reddit_bot.py
[bot] setting up connection with Reddit
[bot] Getting posts from Reddit
[bot] Generating short link using goo.gl
[bot] Posting this link on twitter
Miloslav Trmač, -1 for Structured Logging http://goo.gl/sF8Xgm #Python #reddit #bot

And after some time your posted_posts.txt file will look something like this:

1mb4y4
1mb867
1mb4hl
1mbh3t
1mbni0
1m9bod
1mbhpt
1mbhnc
1mbcp2
1m9d2t
1maeio
1m9bi5
1m8tgr
1m86e4
1ma5r5
1m8fud
1mdh1t
1mbst4

Goodbye

I hope you enjoyed today's post as much as I enjoyed writing it. I hope to see you in the future with some more tutorials. Do follow my blog to give me some support and get regular updates. Goodbye till next time.

✍️ Comments

Stuart

Again, awesome stuff mate. Love real world applicable examples to learn from.

alekz.p

Syntax highlighter?

Yasoob
In reply to alekz.p

Hi there Alekz.p. Thanks for replying. The problem is that this blog is currently on wordpress. I can only use gist to use syntax highlighting but by embedding gist in the current theme the code font becomes TOO big so I left it. I don’t want to change the theme because there is no other free good looking theme like this. In near future I am going to shift it to python preferably Pelican. Lets see what becomes of this blog in the future.

motivation

Hello there, You’ve done an incredible job. I will definitely digg it and personally recommend to my friends. I’m sure they will be benefited from this website.

pirater un compte de facebook gratuit

Hey, I think your site might be having browser compatibility issues. When I look at your blog site in Opera, it looks fine but when opening in Internet Explorer, it has some overlapping. I just wanted to give you a quick heads up! Other then that, excellent blog!


I’m a reader of this blog for awhile now and it always delivers. I am hoping to see more posts in the style of your last one.

André

Hi, very nice post!!! But I’m getting the following error:

----------- error --------------------------------
[bot] setting up connection with Reddit
[bot] Getting posts from Reddit
[bot] Generating short link using goo.gl
[bot] Posting this link on twitter
Python + FFMPEG : read and write any Audio/Video format with just a few lines of code. http://goo.gl/uxpUF7 #Python #reddit #bot
Traceback (most recent call last):
  File "C:/Users/Karma/Dropbox/ANDRE - AULAS/INFORMATICA/INE 5201/2013 - 1/PYTHON CODES/reddit_bot.py", line 114, in <module>
    main()
  File "C:/Users/Karma/Dropbox/ANDRE - AULAS/INFORMATICA/INE 5201/2013 - 1/PYTHON CODES/reddit_bot.py", line 96, in main
    tweeter(post_dict, post_ids)
  File "C:/Users/Karma/Dropbox/ANDRE - AULAS/INFORMATICA/INE 5201/2013 - 1/PYTHON CODES/reddit_bot.py", line 107, in tweeter
    api.update_status(post+" "+post_dict[post]+" #Python #reddit #bot")
  File "C:\Python33\lib\site-packages\tweepy-1.4-py3.3.egg\tweepy\binder.py", line 153, in _call
    raise TweepError(error_msg)
tweepy.error.TweepError: Twitter error response: status code = 401
-----------------------------------------------------

I already check my tokens and all…

Any clue ??

Thanks a ton!!!

Jaxom

Hey, this is great, and it works perfectly!

Though, I must ask, is there a way to use this with multiple subreddits?

Randy Olson (@randal_olson)

Hi, great post! It saved me some time writing my own Twitter bot. A couple things:

  1. Can you please post this code to www.github.com as a code repository with a license? As-is, without a license, we legally cannot use, modify, nor share your code. The added advantage to putting it on GitHub is that we can contribute code to this bot in the future and make it better for everyone.

  2. I have a bug fix to suggest. Currently in the function tweet_creator(subreddit_info), the bot gathers all the IDs to tweet, gets the goo.gl shortlink, then later down the pipeline checks if the ID has already been tweeted. This is inefficient on multiple levels, but also bad because it generates goo.gl shortlinks for posts that may end up not getting submitted (if the bot has already submitted them). As a fix, I suggest moving the duplicate check code up to the for loop that originally gathers the submissions in tweet_creator(subreddit_info).

Cheers,

r

Yasoob
In reply to Randy Olson (@randal_olson)

I will make the repository soon. When I make it I will let you know and BTW thanks for taking interest in contributing.

Jake Camp

Hey there, how would I change the things the bot is searching on reddit?

Yasoob
In reply to Jake Camp

It just opens a subreddit and posts the links it finds there. If you want to use any other subreddit just replace ‘python’ in this line to the name of any other subreddit.

subreddit = setup_connection_reddit('python')

alive vitamins

Everyone loves what you guys are usually up too. This kind of clever work and coverage! Keep up the very good works guys I’ve incorporated you guys to my personal blogroll.

n3uplas

Hi, this is great stuff and it works.

But what if I wanted the links on twitter to point to the Reddit post itself, instead to the submission URL?

c3325422

Hey, not sure if this is checked as it’s a little old but I’m having a couple of errors. Is anyone able to help? If so, I understand that I’m using a newer version (3.4.3) and have updated the “print” commands to include ( ).. but I’m just not sure what I’m missing here. Error screenshot here: http://i.imgur.com/5yUlV4i.png I’m new to python but not to programming. Any help or guidance would be appreciated.

Yasoob
In reply to c3325422

By looking at the error message I am sure that the link shortening service is not working correctly. The JSON response does not include the ‘id’ node. You can debug it from there yourself or let me know and I can try to free up some of my time and make it whole again. :)
