Tag Archives: xml

Jaiku Planet Venus Filter

Just started exploring Jaiku and, coincidentally, Planet Venus. One of the cool things about Jaiku is that it aggregates your other web presences (like your blog, Twitter, del.icio.us, and Flickr posts) and integrates them into your Jaiku presence stream. The downside is that it's not as good at aggregation as Planet Venus, and if you use Planet Venus to create an aggregation of your web presences and you include Jaiku, you get annoying duplication.

So, I'm not much of a Python programmer, but I wrote this Planet Venus filter that looks at each entry, and if it detects that it's a Jaiku presence update, it only includes it if it originated via Jaiku. In other words, it filters out all the duplicates.

If you understood any of that, you may find this helpful. If not, never mind.


"""
For jaiku presence entries, only retain entries that originate from jaiku
(as opposed to grabbed via web feeds)
"""
import sys, xml.dom.minidom
entry = xml.dom.minidom.parse(sys.stdin).documentElement
entry_id = entry.getElementsByTagName('id')[0].firstChild.data
for node in entry.getElementsByTagName('link'):
    if node.getAttribute('rel') == 'alternate':
        entry_link = node.getAttribute('href')
        break
if entry_id.find('jaiku.com/presence') > 0 and (entry_id != entry_link):
    sys.exit(1)
print entry.toxml('utf-8')

(Updated to fix a bug on line 11.)
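
If you'd like to poke at the keep-or-drop rule without piping entries through stdin, here is a rough Python 3 restatement of it as a plain function. It is only a sketch: the name `keep_entry` is made up for illustration, it uses `in` instead of `find(...) > 0`, and it treats an entry with no alternate link as "not from Jaiku", which is an assumption the filter above doesn't spell out.

```python
import xml.dom.minidom


def keep_entry(entry_xml):
    """Return True if the entry should stay in the aggregated feed.

    keep_entry is a hypothetical helper name, not part of Planet Venus.
    """
    entry = xml.dom.minidom.parseString(entry_xml).documentElement
    entry_id = entry.getElementsByTagName("id")[0].firstChild.data

    # Assumption: with no rel="alternate" link, entry_link stays None,
    # so a Jaiku presence entry without one is treated as a duplicate.
    entry_link = None
    for node in entry.getElementsByTagName("link"):
        if node.getAttribute("rel") == "alternate":
            entry_link = node.getAttribute("href")
            break

    is_jaiku_presence = "jaiku.com/presence" in entry_id
    # Drop presence entries whose link points somewhere other than
    # the presence item itself (that is, items Jaiku pulled in from a feed).
    return not (is_jaiku_presence and entry_id != entry_link)
```

Feed it one serialized Atom entry at a time; anything it returns False for is a Jaiku copy of something the planet already gets from the original feed.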

Leash and Feedparser

Maybe I'll write up something a little more formal in the future. For now, I just want to publish this in case it's useful to someone.

Les Orchard posted a blurb saying that he was looking for a PHP class to perform HTTP requests with conditional GET support. Well, a while ago I was looking for that, too. Because I was working on a replacement for Magpie RSS (see below), I decided to use Snoopy as my HTTP client. I then wrote a brief extension, Leash, to provide a cache-enabled front end to Snoopy. Leash automatically caches the HTTP results, the time of the request, and the Last-Modified and ETag HTTP headers. When you request a page you've previously requested, Leash first checks whether the cached copy is older than the maximum cache age you've specified (or the default of 1 hour); if the cache is too old, Leash performs a conditional GET. The latest version of Leash (which I bundle with Snoopy) is in my Subversion repository.
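
For anyone who wants the cache-then-conditional-GET flow in a nutshell, here is a minimal sketch. It is in Python rather than PHP, and it is not Leash's code: the names `CachedClient` and `http_fetch`, the in-memory dictionary cache, and the injectable `fetch` and `clock` parameters are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def http_fetch(url, headers):
    """Perform a GET and return (status, body, response_headers)."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read(), dict(response.headers)
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib surfaces "Not Modified" as an error
            return 304, b"", dict(err.headers)
        raise


class CachedClient:
    """Hypothetical cache-enabled front end to an HTTP fetch function."""

    def __init__(self, max_age=3600, fetch=http_fetch, clock=time.time):
        self.max_age = max_age  # seconds; 3600 mirrors the 1 hour default
        self.fetch = fetch
        self.clock = clock
        self.cache = {}  # url -> body, fetched_at, etag, last_modified

    def get(self, url):
        entry = self.cache.get(url)
        now = self.clock()
        if entry and now - entry["fetched_at"] <= self.max_age:
            return entry["body"]  # fresh enough: no request at all

        # Stale or missing: send validators if we have any.
        headers = {}
        if entry and entry["etag"]:
            headers["If-None-Match"] = entry["etag"]
        if entry and entry["last_modified"]:
            headers["If-Modified-Since"] = entry["last_modified"]

        status, body, response_headers = self.fetch(url, headers)
        if status == 304 and entry:
            entry["fetched_at"] = now  # unchanged: reuse the cached body
            return entry["body"]

        lowered = {k.lower(): v for k, v in response_headers.items()}
        self.cache[url] = {
            "body": body,
            "fetched_at": now,
            "etag": lowered.get("etag"),
            "last_modified": lowered.get("last-modified"),
        }
        return body
```

The point of the two-stage check is that a fresh cache costs nothing, and a stale one costs only a small 304 response when the page hasn't changed.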

Also in that repository is my replacement for Magpie RSS. I always liked Magpie, but it didn't quite work for me and I also wanted an OPML parser. So I wrote one. Actually, first I wrote a generic PHP XML parser. Then I wrote the OPML parser and Feed parser.
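
To give a feel for what an OPML parser has to do, here is a small sketch that turns nested `outline` elements into nested dictionaries. It is Python, not the PHP code in the repository, and the function name `parse_opml` and the output keys are assumptions picked for illustration; it leans on the standard library's ElementTree rather than a hand-written XML parser.

```python
import xml.etree.ElementTree as ET


def parse_opml(text):
    """Return the OPML body as a list of nested dicts (hypothetical shape)."""
    root = ET.fromstring(text)
    body = root.find("body")

    def walk(element):
        # Each <outline> may carry a feed URL and may contain child outlines.
        return [
            {
                "text": outline.get("text", ""),
                "xml_url": outline.get("xmlUrl"),
                "children": walk(outline),
            }
            for outline in element.findall("outline")
        ]

    return walk(body) if body is not None else []
```

A folder comes back as an entry whose `xml_url` is None and whose `children` list holds the feeds inside it.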

Sorry, but I currently don't have time for documentation. Or support. That probably makes this of very limited utility to all but the most daring. If you're a PHP junkie, you'll probably be able to peruse the code and get the gist. And here's an example of how I'm using it to help manage my podcasts.

Optimal OPML Browser Update v. 0.4

I've completed a major rewrite of my OPML browser, Optimal.

The changes were too numerous for me to document them all. Highlights are:

  • Object-oriented reimplementation, making it more portable to other applications
  • Same code may be used as a WordPress plugin -- replaces the OPML Renderer plugin for WordPress
  • RSS items now include descriptions, so you can browse the feeds without subscribing or visiting the home page
  • New widget generator to generate HTML to include Optimal on your web site

More... | Download

Optimal OPML Browser Update v. 0.4pre1

I'm still working on a significant rewrite of my OPML browser, Optimal, but I've decided to release the current working version in the meantime because it addresses a couple of significant usability comments I've had.

Specifically:

  1. There are now links to expand/collapse all nodes, and
  2. There is a new query string parameter, depth, which allows you to specify the initial expansion state.
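
One way to picture the depth parameter is sketched below in Python. This is not Optimal's code: the function names, the default of 1, and the rule that a node starts expanded when its level is less than depth are all assumptions made for illustration.

```python
from urllib.parse import parse_qs


def initial_depth(query_string, default=1):
    """Read depth from a query string, falling back to an assumed default."""
    values = parse_qs(query_string).get("depth")
    try:
        return max(0, int(values[0])) if values else default
    except ValueError:
        return default  # non-numeric depth: ignore it


def mark_expanded(nodes, depth, level=0):
    """Copy an outline tree, flagging nodes above the depth cutoff as expanded."""
    return [
        {
            "text": node["text"],
            "expanded": level < depth,
            "children": mark_expanded(node.get("children", []), depth, level + 1),
        }
        for node in nodes
    ]
```

Under these assumptions, depth=0 would start with everything collapsed, and a large value would behave like an expand-all link.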

Optimal Screen Cap 0.4pre1