Thursday, July 26, 2012

Looking ahead

As the previous post suggests, I've been thinking about Twitter bots today, and in particular how to post lengthy text in 140-character chunks. A diversion was to think about how to "read ahead" - I wanted to be able to say "if the next chunk of text will take me over the 140-character limit, don't read it in". But I can't do that without reading it in (I could of course read the whole file into an array, then use pop() and append() to treat it as a stack, but I want to avoid having to read in the whole file if possible). A solution is below:
#!/usr/bin/python

from tempfile import mkstemp
import shutil
import os

ifpath = "a.txt"

fh, ofpath = mkstemp()
ofile = os.fdopen(fh, "w")  # reuse the descriptor mkstemp already opened
ifile = open(ifpath)

lines = []
length = 0

while True:
    previous = ifile.tell()
    line = ifile.readline()
    if not line:  # end of file - without this check, readline() keeps
        print lines  # returning "" and the loop never terminates
        break
    if length + len(line) < 140:
        lines.append(line)
        length += len(line)
    else:
        ifile.seek(previous)
        print lines
        break

for line in ifile:
    ofile.write(line)

ifile.close()
ofile.close()
os.remove(ifpath)
shutil.move(ofpath, ifpath)
This opens a.txt for reading (as ifile), and creates a temporary file (ofile). There's then an infinite loop - each time round, I store the current file location as previous, read in a new line, and check to see if that would take it over the limit. If not, I add the new line to the stored array. If it will go over the limit, I use ifile.seek() to go back to the file location before I read in the new line, and then print out the lines stored so far. All that remains is to write the unread lines (including the one I read in and decided not to use) to a temporary file, remove the original a.txt, and replace it with the temporary one.
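The same idea can be wrapped up as a reusable function. This is a Python 3 sketch (the original above is Python 2), and the name read_chunk is my own, not from the original script:

```python
import os
import shutil
from tempfile import mkstemp

def read_chunk(path, limit=140):
    """Read whole lines from path until the next line would push the
    running total past limit, then rewrite path so that it holds only
    the unread lines. Returns the list of lines read."""
    lines = []
    length = 0
    with open(path) as ifile:
        while True:
            previous = ifile.tell()   # remember where this line starts
            line = ifile.readline()
            if not line:              # end of file
                break
            if length + len(line) < limit:
                lines.append(line)
                length += len(line)
            else:
                ifile.seek(previous)  # "push back" the line we can't use
                break
        # copy the unread remainder to a temporary file
        fh, tmppath = mkstemp()
        with os.fdopen(fh, "w") as ofile:
            for rest in ifile:
                ofile.write(rest)
    shutil.move(tmppath, path)        # replace the original file
    return lines
```

Calling read_chunk("a.txt") repeatedly then hands back successive sub-140-character chunks, with a.txt shrinking each time.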

Wednesday, July 25, 2012

Twitter Updating

Revised version of the code I originally wrote about here. The main difference is that I've included a counter so it doesn't post more than two tweets every time it is run (and since it only runs once an hour, this shouldn't make it too antisocial even when it hasn't run for a while and there are a lot of items in the RSS feed).
#!/usr/bin/python

import tweepy
import feedparser
import urllib
import urllib2

url = "http://www2.le.ac.uk/departments/engineering/news-and-events/blog/RSS"

oauth_file = open("access_token.txt", "r")
oauth_token = oauth_file.readline().rstrip()
oauth_token_secret = oauth_file.readline().rstrip()
oauth_file.close()

consumer_file = open("consumer_keys.txt", "r")
consumer_key = consumer_file.readline().rstrip()
consumer_secret = consumer_file.readline().rstrip()
consumer_file.close()

bitly_file = open("bitly.txt", "r")
bitly_username = bitly_file.readline().rstrip()
bitly_apikey = bitly_file.readline().rstrip()
bitly_file.close()

bitly_base = "http://api.bit.ly/v3/shorten?"
bitly_data = {
    "login" : bitly_username,
    "apiKey" : bitly_apikey,
    "format" : "txt",
    "longUrl" : ""
    }

already_done = []
done_file = open("done.txt", "r")
for line in done_file:
    already_done.append(line.rstrip())
done_file.close()

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(oauth_token, oauth_token_secret)

api = tweepy.API(auth)
feed = feedparser.parse(url)

count = 0
for item in feed["items"]:
    link  = item["link"]  # renamed from url, which shadowed the feed URL above
    title = item["title"]
    if link not in already_done and count < 2:
        bitly_data["longUrl"] = link
        to_shorten = bitly_base + urllib.urlencode(bitly_data)
        # the txt-format response ends with a newline, so strip it
        result = urllib2.urlopen(to_shorten).read().rstrip()
        api.update_status(title + " : " + result)
        already_done.append(link)
        count += 1

done_file = open("done.txt", "w")
for url in already_done:
    done_file.write(url + "\n")
done_file.close()
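The "post at most two new items" logic in the loop above can be sketched as a standalone function, which makes it easy to test without touching the network. The name select_new is my own; the real script tweets each item as a side effect rather than returning them:

```python
def select_new(items, already_done, limit=2):
    """Return at most limit feed items whose link has not been seen
    before. items is a list of dicts with "link" and "title" keys, the
    shape feedparser produces; already_done is mutated in place so the
    caller can write it straight back to done.txt."""
    picked = []
    for item in items:
        if item["link"] not in already_done and len(picked) < limit:
            picked.append(item)
            already_done.append(item["link"])  # mark as done immediately
    return picked
```

Because already_done grows as items are picked, a link that appears twice in the same feed is only ever posted once.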