URL Fetch API, MiniDom (Google App Engine)
Fetching data with the URL Fetch API is simple (especially if one trusts that the source is up and will respond within GAE's time limits):
from google.appengine.api import urlfetch
from xml.dom import minidom

def parse(url):
    r = urlfetch.fetch(url)
    # Only parse successful responses; any other status returns None.
    if r.status_code == 200:
        return minidom.parseString(r.content)
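For the less trusting, here is a minimal sketch of a more defensive variant. The name safe_parse and its fetch parameter are hypothetical: the fetch function is passed in so the sketch also runs outside App Engine, and on GAE it would be urlfetch.fetch, which accepts a deadline keyword in seconds. The broad except is a placeholder; catching urlfetch.Error would probably be the narrower choice on GAE.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def safe_parse(url, fetch, deadline=10):
    """Return a DOM for url, or None if the fetch or the parse fails.

    `fetch` is injected (hypothetical design) so this runs anywhere;
    on GAE it would be urlfetch.fetch.
    """
    try:
        r = fetch(url, deadline=deadline)
    except Exception:
        # Assumption: on GAE, urlfetch.Error would be the narrower catch.
        return None
    if r.status_code != 200:
        return None
    try:
        return minidom.parseString(r.content)
    except ExpatError:
        # The response body was not well-formed XML.
        return None
```

On App Engine the call would look like safe_parse(URL, urlfetch.fetch); the caller then only has to check for None.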
Accessing the resulting DOM with MiniDom is just as simple. Here the source is an Atom feed:
import time

dom = parse(URL)
for entry in dom.getElementsByTagName('entry'):
    try:
        published = entry.getElementsByTagName('published')[0].firstChild.data
        published = time.strftime('%a, %d %b', time.strptime(published, '%Y-%m-%dT%H:%M:%SZ'))
    except (IndexError, ValueError):
        # No <published> element, or a timestamp in another format.
        pass
    …
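To try the date handling without a live feed, here is a small self-contained sketch that runs the same loop over an inline Atom document. SAMPLE and published_dates are made up for illustration. Note that catching several exception types needs a parenthesised tuple, and that an empty published element has no firstChild, hence the extra AttributeError.

```python
import time
from xml.dom import minidom

# Hypothetical sample feed: one good entry, one bad date, one without a date.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><published>2010-03-14T09:30:00Z</published></entry>
  <entry><published>not a date</published></entry>
  <entry></entry>
</feed>"""


def published_dates(dom):
    """Collect formatted dates, skipping entries that cannot be parsed."""
    dates = []
    for entry in dom.getElementsByTagName('entry'):
        try:
            raw = entry.getElementsByTagName('published')[0].firstChild.data
            parsed = time.strptime(raw, '%Y-%m-%dT%H:%M:%SZ')
        except (IndexError, ValueError, AttributeError):
            # Missing element, unexpected format, or empty element.
            continue
        dates.append(time.strftime('%a, %d %b', parsed))
    return dates


dates = published_dates(minidom.parseString(SAMPLE))
```

Only the first entry survives here; the other two are skipped rather than aborting the whole loop.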