Tag Archives: programming

Pull Requests: Assign or Request Review?

If you're feeling confused about GitHub's new review requests feature (as opposed to the existing ability to assign people to issues and pull requests) here's my take.

When review requests were first rolled out, they were not very useful because there was no way to see a list of pull requests you had been requested to review. Thankfully, that deficiency has been fixed!

Now, review requests unleash a great way to manage workflow on pull requests.

When you create a pull request, you add a review request for one or more reviewers. In addition, assign the pull request to the reviewer to signal that you are awaiting their action.

When the reviewer finishes their review, they should either merge the pull request (LGTM!) or assign it back to you (or someone else) for action -- respond to a comment, answer a question, revise code, etc.

Then when you respond to feedback, you assign the pull request back to the reviewer, and the cycle repeats.

There is additional explicit signaling in the review request workflow that you can use, too.

So, improve your pull request workflow by:

History v. State

I want to reprogram the way I think about the state of my data models.

Think of a blog post. Before I publish it, it's unpublished. After I publish it, it's published. If I unpublish it, it's unpublished again. Maybe I edit it and republish it. Published again.

I (and a lot of programmers, I think) tend to think of a thing like a blog post as having a state. Maybe in the MySQL database, blog posts are in a posts table having a state or status column. It's an ENUM type, probably.

But state is really just whatever the most recent change represents. Like git -- state is like HEAD. And too often, I don't think about saving the history of states when I should.

So, resolved: consider whether any new data model having a state or status should instead (or also) have a history.

Easily prune your ssh known_hosts file

At some point, you've probably seen this message when you try to log in to one of your servers:

ssh failure message

This is really common when you have Amazon EC2 instances behind Elastic IPs because the IP address stays the same (and probably the hostname, too), but as new instances replace old instances, the new instances' ssh keys are probably different.

But if you look carefully, you'll see that the failure message tells you how to resolve this problem:

Offending RSA key in /Users/[username]/.ssh/known_hosts:5

That means that line no. 5 of the known_hosts file contains the problematic key. So, assuming that you are sure this is NOT in fact a security breach, you can remove that line.

It's a bit of a pain-in-the-butt to manually edit this file, though. You can use sed to do it easily, but if you're like me and you don't use sed all the time, you need to look at the man pages every time you want to use it. That's why I wrote this quick bash script to do it automatically.

Drop that in your PATH and make it executable. Then you can simply type ssh-purge-host 5 to remove line 5 from your known_hosts file.

Hope that's useful!

A Gotcha Using Node.js + Request In a Daemon

I have a Node.js program running as a daemon on a Linux VPS. Periodically, it polls a list of URLs using request. When it first starts, everything runs smoothly. But after running for a while, it starts getting 400 errors, and the longer it runs, the more URLs return 400 errors.

I could not understand what was going on. My code was basically structured like this:

Given that code, we know the req object is initialized with each function call. So, how could this script degrade over time?

Well, I finally tracked it down: COOKIES!

Yup, request has cookies enabled by default. So, I think what was happening was that cookies were being set (presumably, top domain-level cookies having the same name at different URLs or subdomains on the same domain) but the values in request's cookie jar were not being returned properly. That means the remote host was getting invalid cookies -- hence the 400 response for a "Bad Request."

I haven't yet spent the time to figure out if this is a bug in request. It's on my TODO list.

In the meantime, I've disabled cookies in the req object:
var req = { url: url, timeout: options.timeout, jar: false };

It's now working as expected.

Notes On Creating A Multi-user Feed Aggregator

Some time ago, I answered another user's question on Stack Overflow about database design for a multi-user feed aggregator. I also received an email from a developer asking for additional input, which I shared. But I thought I should put my response here, as well, for posterity's sake if nothing else. Note that my comments here assume MySQL as the database but should apply to any SQL database.

Basically, my emailer asked what to do about the fact that the posts table will get huge very quickly if we have multiple users and a row for each post for each user. It's actually a pretty basic relational database scenario, but if you start your project as a single user application and later decide it's going to be multi-user, you may not realize that you probably need to completely redesign your database.

So I've posted the schemae I use for my multi-user feed aggregator (a private project):

I've also posted a sql command that you can run in a cron job to remove read posts that are more than 14 days old:

As an aside, I'm amazed by how many people are writing new aggregators. Is it a common programming class exercise to write a feed aggregator or something?

Fun with River2

I decided to install Dave Winer's River2 to supplement my usual feed reading. Now that I can access it via its smart use of Dropbox, it should be good for feeds that I don't feel like I need to see every headline.

One of the things I love about River2 is that it's an app that runs in the OPML Editor, which means that it is endlessly hackable and (apropos to this post) you can fix your own bugs.

So here's a bug report. And fix. (Actually, it could be a workaround for a bug in another application, as I explain below).

  1. What I was doing: From the Tools > River2 > Pages menu, I selected a page to view (any one, it's the same bug no matter which page).
  2. What I expected to happen: I expected the selected page to open in my default web browser, Pale Moon (a Windows-optimized build of Firefox)
  3. What actually happened: Nothing. Not even an error dialog.

I immediately suspected that the problem was the communication between the OPML Editor and the Pale Moon browser. After all, there was a major bug for the longest time in Firefox's DDE implementation that required a workaround.

Bottom line: the OPML Editor's DDE implementation expects that the DDE service name is the same as the name of the executable with the filename suffix removed. So, for Excel, the service name is "excel," and for Firefox it's "firefox." But the service name is determined by the application, and the Pale Moon developers decided that its service name would be "Pale Moon," not "palemoon." A simple patch to system.verbs.builtins.webBrowser.openURL resolves the problem.

if string.lower (id) contains "palemoon" { // 2/11/11; 12:09:06 AM by DJM
 ddeName = "Pale Moon";
  return (webBrowser.callBrowser (ddeName, "WWW_OpenURL", s+",,0,0,,,,"))}

The function webBrowser.callBrowser expects ddeName to be the name of the executable, from which it attempts to remove the ".exe" suffix. Luckily, if the function is passed any string without an ".exe" suffix, it just accepts the passed string as the DDE service name.

Here's the full context:


That ",,0,0,,,," nonsense is part of the DDE message that Pale Moon expects:

Pale Moon DDE

Delete Empty Folders

I recently found that I had a lot of empty folders in my MP3 folder after a wayward ripping session. So I whipped up this quick DOS one-liner to remove all empty folders.

From a command prompt, just change to the folder containing all the empty folders and enter the following:

FOR /f "tokens=*" %G IN ('dir /ad /b /s') DO rd /q "%G"

The command "rd /q" will be executed on every folder, but "rd" only deletes empty folders -- "rd" does not delete non-empty folders.

XMPP vCard Python Script

Couldn't find a script to update my Jabber/XMPP vCard photo (a/k/a avatar), so I wrote one. It requires xmpppy (a/k/a python-xmpp). It should work with gTalk, but I have not tested it.

Credit to pastebin for some code snippets.

Hope this saves someone some time and effort.

'''vcard.py - Update your XMPP vcard photo with the image you provide

Usage: vcard.py image_file jid password

from xmpp import JID, Client, Iq, Presence, NS_VERSION, NS_VCARD
import sys
import os
import time
from base64 import encode, decode
from hashlib import sha1

    print >>sys.stderr, __doc__

NS_VCARD_UPDATE = 'vcard-temp:x:update'
NS_NICK = 'http://jabber.org/protocol/nick'

def hash_img(img):
    return sha1(img).hexdigest()

def base64_img(img):
    return img.encode('base64')

def get_img(file):
        fh = open(file, 'rb')
        img = fh.read()
        return img
    except Exception, e:
        print >>sys.stderr, e

def get_mime_type(file):
        ext = file[-4:]
        if ext == '.png':
            mime_type = 'image/png'
        elif ext == '.gif':
            mime_type = 'image/gif'
        elif ext == '.jpg' or ext == '.jpeg':
            mime_type = 'image/jpeg'
            raise ValueError, "Wrong mime-type detected. Check file suffix."
    except ValueError, e:
        print >>sys.stderr, e
    return mime_type

def send_vcard(conn, base64_img, mime_type, nick):
    iq_vcard = Iq(typ='set')
    vcard = iq_vcard.addChild(name='vCard', namespace=NS_VCARD)
    vcard.addChild(name='NICKNAME', payload=[nick])
    photo = vcard.addChild(name='PHOTO')
    photo.setTagData(tag='TYPE', val=mime_type)
    photo.setTagData(tag='BINVAL', val=base64_img)

def send_presence(conn, status, hash1, nick):
    presence = Presence(status = status, show = 'xa', priority = '-1')

if __name__ == '__main__':
    img = get_img(file)
    if not conn:
        raise Exception, 'failed to start connection'
    if not auth:
        raise Exception, 'could not authenticate'
    send_vcard(cl, base64_img(img), get_mime_type(file), j.getNode())
    send_presence(cl, 'Updated vCard Image', hash_img(img), j.getNode())

On Bootstrapping

On Friday, Dave Winer released a terrific thought-piece-of-a-podcast on how journalists need to learn about bootstraps. In his most recent podcast with NYU's Jay Rosen, he and Jay discussed the topic, as well, but I want to focus on bootstrapping, the metaphor.

bootstrapDave offered the well-worn phrase "haul yourself up by your bootstraps" as the mental image we should have when we use the bootstrapping metaphor. Imagine that you're wearing your boots, you grab your bootstraps and pull on them. Well, the best outcome I can imagine is that you'd fail to accomplish anything. If you could accomplish anything, I think all you'd do is pull your feet out from under yourself. But the phrase is supposed to connote (I think) strength by self-determination and self-motivation. That's why MBA-types say they're going to "bootstrap" their start-ups when what they really mean is that their start-ups will be self-funded at the outset.

But I recall learning that before bootstraps became merely decorative, they actually served a useful purpose: namely, to strap your boots to the top of heavy items so you could carry them. Maybe it's apocryphal, but here's MY mental image of bootstrapping: a person on a horse, laden with a pack on its rump, and a heavy wooden storage box strapped to each of the rider's boots.

And that more closely matches what bootstrapping means to me: taking advantage of what's already up-and-running and using that existing momentum to get something else moving.