Monday, March 16, 2015

Twitter bot Python implementation with the birdy Twitter API


Here is a small Python implementation of a Twitter bot using birdy, an awesome Twitter API client for Python.
I really like this library: it's simple, flexible and powerful, just what you need to build a simple Twitter bot at your convenience.

I won't go into much detail about the code, but here is what it does. The bot, which is more of a proof of concept, parses a specific page (here, the latest-news links on the Politika site) and posts each one to Twitter as a link.

It's simple: the class urlHTMLParser parses the page, and whenever it finds an <a> tag it checks the link and collects the ones related to the latest news (titles and links). Then, using the birdy Twitter API, we simply tweet them at the interval defined below.

Before testing the code, you have to obtain CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN and ACCESS_TOKEN_SECRET.
You can obtain this information here.
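
If you'd rather not paste the four values into the script itself, one option is to read them from environment variables. This is only a sketch: the variable names and the load_credentials helper are my own invention, nothing that birdy or Twitter requires.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
CREDENTIAL_VARS = (
    'TWITTER_CONSUMER_KEY',
    'TWITTER_CONSUMER_SECRET',
    'TWITTER_ACCESS_TOKEN',
    'TWITTER_ACCESS_TOKEN_SECRET',
)


def load_credentials(environ=os.environ):
    """Return the four credentials as a tuple, or raise if any is missing."""
    missing = [name for name in CREDENTIAL_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return tuple(environ[name] for name in CREDENTIAL_VARS)
```

The tuple is in the same order the script passes the values to the client, so it could be unpacked as UserClient(*load_credentials()).
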
Note: as you'll see, most of the code is for parsing. If you only want to send some predefined tweet, it would look like this:

try:
    tweet = 'this is my simple tweet'

    # sending tweet ...
    response = client.api['statuses/update'].post(status=tweet)

    print('Tweet has been sent.')
except TwitterApiError as e:
    print(e)

And here is the full code, which does some parsing and then tweets about it. Check it out.

#!/usr/bin/python
from birdy.twitter import UserClient
from birdy.twitter import TwitterApiError
try:
    from HTMLParser import HTMLParser  # Python 2
except ImportError:
    from html.parser import HTMLParser  # Python 3
import requests
import re
import time

class urlHTMLParser(HTMLParser):
    """Collects the latest-news links (href and title) from <a> tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        # per-instance list; a class-level list would be shared by all parsers
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        dic = dict(attrs)
        href = dic.get('href')
        title = dic.get('title')
        # keep only links that have a title and point to the latest-news section
        if href and title and re.search(r'najnovije-vesti', href, re.IGNORECASE):
            self.links.append({'href': href, 'title': title})

    def getLinks(self):
        return self.links

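parser = urlHTMLParser()

# replace the placeholders with your own credentials
CONSUMER_KEY = 'YOUR_CONSUMER_KEY'
CONSUMER_SECRET = 'YOUR_CONSUMER_SECRET'
ACCESS_TOKEN = 'YOUR_ACCESS_TOKEN'
ACCESS_TOKEN_SECRET = 'YOUR_ACCESS_TOKEN_SECRET'

client = UserClient(CONSUMER_KEY,
                    CONSUMER_SECRET,
                    ACCESS_TOKEN,
                    ACCESS_TOKEN_SECRET)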

#response = client.api['users/show'].get(screen_name ='rdjordje')
#print response.data['description']
#print response.resource_url
#print response.headers


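# politika request page - najnovije vesti
# catch request errors (proxy errors included) and stop,
# because req would be undefined further down
try:
    req = requests.get('http://www.politika.rs/vesti/najnovije-vesti/index.1.sr.html')
except requests.exceptions.RequestException as e:
    print(e)
    raise SystemExit(1)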


if req.status_code == 200:
    print('Status code: [OK]')
    print('Parsing source page ...')
    parser.feed(req.text)
    parser.close()
    print('Parsing complete, total latest news found: [%d]' % len(parser.getLinks()))
    for l in parser.getLinks():
        try:
            print('Sending tweet...')
            print('Tweet : ')
            tweet = l['title'] + ' - ' + l['href'] + ' #kratkinjuz'
            print(tweet)

            # sending tweet ...
            response = client.api['statuses/update'].post(status=tweet)

            print('Tweet has been sent.')
        except TwitterApiError as e:
            print(e)

        # sleep for 5 minutes
        print('Waiting for 5 minute[s]...')
        time.sleep(60 * 5)
        


OK, this is the full code, doing something (relatively) useful. If you modify the code a little bit, you can parse any relevant pages you want and post their content to Twitter.
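
To give an idea of what "modify the code a little bit" could mean, here is a rough sketch of a parser that takes the link pattern as an argument instead of hard-coding it. The class name PatternLinkParser and the sample HTML are made up for illustration, and this version is written for Python 3 (on Python 2 the import would be from HTMLParser import HTMLParser).

```python
import re
from html.parser import HTMLParser


class PatternLinkParser(HTMLParser):
    """Collect <a> tags that have a title and whose href matches a pattern.

    Hypothetical generalisation of urlHTMLParser from the post.
    """

    def __init__(self, href_pattern):
        HTMLParser.__init__(self)
        self.href_re = re.compile(href_pattern, re.IGNORECASE)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        attributes = dict(attrs)
        href = attributes.get('href')
        title = attributes.get('title')
        if href and title and self.href_re.search(href):
            self.links.append({'href': href, 'title': title})


# Made-up sample markup, just to show which links get collected.
sample = (
    '<a href="/vesti/najnovije-vesti/1.html" title="First story">x</a>'
    '<a href="/sport/2.html" title="Sport story">y</a>'
    '<a href="/vesti/najnovije-vesti/3.html">no title</a>'
)
parser = PatternLinkParser(r'najnovije-vesti')
parser.feed(sample)
parser.close()
print(parser.links)
```

Only the first link is kept here: the second one does not match the pattern and the third one has no title.
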

A nice improvement would be to run the script periodically from crontab instead of using time.sleep.
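
With cron, every run starts from scratch, so the script has to remember which links it has already tweeted. Here is a minimal sketch of that bookkeeping, assuming a small JSON state file. The file name, the helper names and the crontab line in the comment are all illustrative and not part of the script above.

```python
import json
import os

STATE_FILE = 'posted_links.json'  # hypothetical path for the bookkeeping file


def load_posted(path=STATE_FILE):
    """Return the set of hrefs that earlier runs already tweeted."""
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return set(json.load(handle))


def save_posted(posted, path=STATE_FILE):
    """Persist the set of tweeted hrefs for the next run."""
    with open(path, 'w') as handle:
        json.dump(sorted(posted), handle)


def next_unposted(links, posted):
    """Return the first link dict whose href is not in `posted`, or None."""
    for link in links:
        if link['href'] not in posted:
            return link
    return None


# Example crontab entry (assumed script path), one run every 5 minutes:
# */5 * * * * /usr/bin/python /path/to/twitter_bot.py
```

Each run would then parse the page, pick next_unposted(parser.getLinks(), load_posted()), tweet it, add its href to the set and call save_posted, so the 5 minute pause comes from cron instead of time.sleep.
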

Another nice idea would be to write a script that checks the last activity of all your followers and takes action accordingly.
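
As a starting point, the "last activity" check could look roughly like the sketch below (Python 3). It does not fetch anything. It assumes you already have a list of follower objects shaped like the user objects of Twitter's v1.1 API, where each user has a screen_name and, if a latest tweet is visible, a status with a created_at timestamp. The function name, the 30 day threshold and the sample data are made up.

```python
from datetime import datetime, timedelta, timezone

# Timestamp format assumed from Twitter's v1.1 API,
# e.g. 'Mon Mar 16 10:00:00 +0000 2015'
TWITTER_TIME_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def inactive_followers(users, now, max_idle_days=30):
    """Return screen names whose latest tweet is older than max_idle_days.

    Users without a visible latest tweet are treated as inactive.
    """
    cutoff = now - timedelta(days=max_idle_days)
    idle = []
    for user in users:
        status = user.get('status')
        if not status:
            idle.append(user['screen_name'])
            continue
        last_tweet = datetime.strptime(status['created_at'], TWITTER_TIME_FORMAT)
        if last_tweet < cutoff:
            idle.append(user['screen_name'])
    return idle


# Made-up follower data for illustration.
followers = [
    {'screen_name': 'active_user',
     'status': {'created_at': 'Sun Mar 15 09:30:00 +0000 2015'}},
    {'screen_name': 'quiet_user',
     'status': {'created_at': 'Thu Jan 01 08:00:00 +0000 2015'}},
    {'screen_name': 'silent_user'},
]
now = datetime(2015, 3, 16, 12, 0, 0, tzinfo=timezone.utc)
print(inactive_followers(followers, now))
```

What "take action" means is up to you; the function only returns the names, so unfollowing, messaging or just logging can be decided separately.
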
