Thumbnails from WebThumb


Josh Eichorn is a name I’ve known for a while, mostly due to his HTML_Ajax efforts a couple of years ago, so it was no surprise when I found out he was behind the pretty neat web service called WebThumb. Although a tad rough around the edges (and obviously a PHP app), it’s a cool idea, very cheap in comparison to similar services, and free for the first 100 requests per month… or something like that.

OK wait, let’s be honest: it’s not that great a service. I mean, when I started writing this 20 seconds ago I was planning on singing WebThumb’s praises at the top of my lungs, but something just doesn’t feel right. Looking at the site while trying to remember the URL, I realized it feels pretty half-assed. Almost web2.0 or something (as if that was a good thing). And honestly, it’s an abysmally slow service to boot. I’d never wait 30+ seconds for a generated image from a pay-to-play service. Bottom line though: it works. Plus, Ross Poulton already did the hard part for me.

When I saw how easy it would be to add thumbnail headers to my link roll, I simply couldn’t resist. The first thing I did was take Ross’s code and extend it to support WebThumb’s excerpt feature, because I knew I didn’t want a full screenshot. Instead, I’m grabbing a small pseudo-random portion of the image to use as a slightly abstract thumbnail. Here’s the function to add into Ross’s previous example:

def get_excerpt(url, output_path, offset=(0, 0), size=(400, 100)):
    request = """
    <webthumb>
        <apikey>%s</apikey>
        <request>
            <url>%s</url>
            <excerpt>
            <x>%d</x>
            <y>%d</y>
            <width>%d</width>
            <height>%d</height>
            </excerpt>
        </request>
    </webthumb>
    """ % (WEBTHUMB_APIKEY, url, offset[0], offset[1], size[0], size[1])

    h = httplib.HTTPConnection(WEBTHUMB_HOST)
    h.request("GET", WEBTHUMB_URI, request)
    response = h.getresponse()

    # Named content_type to avoid shadowing the built-in type().
    content_type = response.getheader('Content-Type', 'text/plain')
    body = response.read()
    h.close()
    if content_type == 'text/xml':
        # This is defined as 'success' by the API. text/plain is failure.
        doc = xml.dom.minidom.parseString(body)

        # Defaults in case the response contains no <job> element.
        wait = 0
        key = ""
        for node in doc.getElementsByTagName("job"):
            wait = node.getAttribute('estimate')
            for node2 in node.childNodes:
                if node2.nodeType == Node.TEXT_NODE:
                    key = node2.data
        if not key:
            # No job key means there is nothing to fetch.
            return False

        # We're given an approx time by the webthumb server,
        # we shouldn't request the thumbnail again within this
        # time.
        time.sleep(int(wait or 0))

        request = """
        <webthumb>
            <apikey>%s</apikey>
            <fetch>
                <job>%s</job>
                <size>excerpt</size>
            </fetch>
        </webthumb>
        """ % (WEBTHUMB_APIKEY, key)

        h = httplib.HTTPConnection(WEBTHUMB_HOST)
        h.request("GET", WEBTHUMB_URI, request)
        response = h.getresponse()
        try:
            os.unlink(output_path)
        except OSError:
            # Nothing to remove if the file doesn't exist yet.
            pass
        img = open(output_path, "wb")
        img.write(response.read())
        img.close()
        h.close()
        return True
    else:
        return False
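
As an aside on the "pseudo-random portion" idea: the offset passed to get_excerpt has to come from somewhere. Here's a rough sketch of one way to pick it, hashing the URL so the same link always gets the same crop. The pseudo_random_offset name and the 1024x768 page_size are my own assumptions for illustration, not anything WebThumb documents.

```python
import hashlib


def pseudo_random_offset(url, size=(190, 65), page_size=(1024, 768)):
    """Derive a repeatable (x, y) excerpt offset from the URL.

    page_size is an assumed capture size, not a documented WebThumb value.
    """
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    # Largest offsets that keep the excerpt inside the assumed page.
    max_x = max(page_size[0] - size[0], 0)
    max_y = max(page_size[1] - size[1], 0)
    # Use separate halves of the digest so x and y vary independently.
    x = int(digest[:16], 16) % (max_x + 1)
    y = int(digest[16:], 16) % (max_y + 1)
    return (x, y)
```

The result can be passed straight in as the offset argument of get_excerpt. Hashing instead of calling random means re-saving a link won't change its thumbnail.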

Run with no kwargs, get_excerpt should return a 400x100 pixel image starting at the top-left corner. I added the following custom save method to my Link model to grab the image from WebThumb the first time you save a link:

def save(self):
    if not self.thumbnail:
        from webthumb import get_excerpt
        from django.conf import settings
        from django.template.defaultfilters import slugify
        from os import path

        # Compare names with ==; "is" tests identity and only matches by accident.
        upload_to = [field.upload_to for field in Link._meta.fields if field.name == 'thumbnail'][0]
        self.thumbnail = '%s/%s.jpg' % (upload_to, slugify(self.url))
        get_excerpt(self.url, path.join(settings.MEDIA_ROOT, self.thumbnail), (20, 30), (190, 65))
    super(Link, self).save()

Note the hackish way I determine the upload_to value for the thumbnail field (which is an ImageField btw). Pretty funny. Anyway, that’s enough to get you up and running, but it has plenty of room for improvement (error checking, anyone?).
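
On the error checking front, a first step might be pulling the job parsing out of get_excerpt and making it fail loudly. This is only a sketch: WebThumbError, parse_job and DEFAULT_WAIT are hypothetical names of mine, and the only things it assumes about the response are what the code above already reads (a text/xml content type, and a job element with an estimate attribute and the key as its text).

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# Fallback wait in seconds when the estimate is missing (arbitrary assumption).
DEFAULT_WAIT = 30


class WebThumbError(Exception):
    """Raised when a WebThumb response can't be used (hypothetical name)."""


def parse_job(content_type, body):
    """Return (job_key, wait_seconds) from a job submission response."""
    # The code above treats anything other than text/xml as a failure.
    if content_type.split(";")[0].strip() != "text/xml":
        raise WebThumbError("request failed: %s" % body)
    try:
        doc = xml.dom.minidom.parseString(body)
    except ExpatError as exc:
        raise WebThumbError("unparseable response: %s" % exc)

    jobs = doc.getElementsByTagName("job")
    if not jobs:
        raise WebThumbError("no job element in response")
    job = jobs[0]

    # The job key is the text content of the job element.
    key = "".join(
        n.data for n in job.childNodes if n.nodeType == n.TEXT_NODE
    ).strip()
    if not key:
        raise WebThumbError("job element has no key")

    # Fall back to a default if the estimate is missing or not a number.
    try:
        wait = int(job.getAttribute("estimate"))
    except ValueError:
        wait = DEFAULT_WAIT
    return key, wait
```

get_excerpt could then call parse_job(response.getheader('Content-Type', 'text/plain'), body) and let the save method decide what to do with a WebThumbError, instead of silently ending up with no thumbnail.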
