Dealing with horrid APIs so you don't have to
My band has music on SoundCloud, photos on Flickr, and videos on Vimeo, and we want to feature them on http://rawfunkmaharishi.uk/. Up until now, this has been managed by curating, by hand (or very shonky scripts), bits of YAML to feed into Jekyll, but this gets old quickly, especially when you run into things like SoundCloud's decision to only expose the track ID deep inside the embeddable iframe code.
But this is dumb. It's 2015 and everything has an API, so let's build a robot to do this stuff properly!
The great metadata shift
Up until now, the Hand-Crafted YAML (which sounds like a thing you may be able to buy at Boxpark) approach has allowed me to be a bit lax with the metadata for our media - some of it's been stored on the various services, some purely in my YAML. In order to make this robot universal, I've had to fill in all the metadata at the places where the files live, which feels like the Right Thing anyway.
Moving the hacks upstream
Moving the metadata is not 100% foolproof, however: for example, we have photos on our Flickr account which were not taken by us, but by our friend Kim. The Flickr API has no way of knowing this, so I've added a tag to those pictures which looks like photographer:kim, and I'm looking for and extracting that in this gem. Similarly, for the SoundCloud music, I'd like to tag each track with a recording location (and now an engineer's name) but this is not supported, so I'm nailing those into the Description field as
Am I going to regret these decisions? Almost certainly.
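To make the tag hack concrete, here is a minimal sketch of pulling a photographer out of a list of tags. The photographer:<name> convention is the one described above; the method name, the fallback argument and the assumption that tags arrive as plain strings are all mine, not necessarily how the gem does it.

```ruby
# Hypothetical helper: find a "photographer:<name>" tag and return the name.
# Returns the fallback when no such tag exists or the name part is empty.
def photographer_from_tags(tags, fallback = nil)
  tag = tags.find { |t| t.start_with?('photographer:') }
  return fallback if tag.nil?

  # Split on the first colon only, so a name containing a colon survives.
  name = tag.split(':', 2).last.strip
  name.empty? ? fallback : name
end

puts photographer_from_tags(%w[gig london photographer:kim])
puts photographer_from_tags(%w[gig london], 'sam')
```

The same idea (a small, greppable marker that a parser can pick out) would apply to whatever ends up in the SoundCloud Description field.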
gem install purdie
git clone https://github.com/rawfunkmaharishi/purdie/
cd purdie
bundle
rake
rake install
You need to create a _sources directory in your Jekyll project, containing files with one URL per line, like this:
It also resolves sets/albums on all of the supported services, so this kind of thing will work:
Purdie maps each input file onto an output file, replacing any extension with .yaml, something like:
Mixing up different services in the same input file makes no sense to Purdie. Don't do this.
If a URL appears multiple times in a resolved list, only the first appearance will be propagated to the output file.
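The file-mapping and de-duplication rules are simple enough to sketch in a few lines. This is only an illustration under my own assumptions: the helper names are hypothetical, and the input file names are made up.

```ruby
# Hypothetical helper: map an input file onto its output file name by
# replacing any extension with .yaml (a file with no extension just gains one).
def output_name(input_path)
  "#{File.basename(input_path, '.*')}.yaml"
end

# Hypothetical helper: drop blank lines and repeated URLs.
# Array#uniq keeps the first occurrence, which matches the rule above.
def unique_urls(lines)
  lines.map(&:strip).reject(&:empty?).uniq
end

puts output_name('_sources/flickr.txt')
puts output_name('_sources/vimeo')
puts unique_urls(["https://example.com/a\n", "https://example.com/b\n", "https://example.com/a\n"])
```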
You also need a .env file with the relevant credentials in it:
FLICKR_API_KEY: this_a_key
FLICKR_SECRET: this_a_secret
SOUNDCLOUD_CLIENT_ID: this_a_client_id
VIMEO_BEARER_TOKEN: this_is_bearer_token
And then you can run purdie fetch (fetch is the default task, in fact currently the only task, so a bare purdie will work) and it will dump out YAML files into
flickr.yaml
pictures.yaml
soundcloud.yaml
vimeo.yaml
ready for Jekyll to consume.
You can supply your own
_config/purdie.yaml file to specify a
# Flickr photos are happy to have a null title
default_title: Raw Funk Maharishi

# Map Flickr users to better names
photographer_lookups:
  pikesley: sam

# Specify output files per-service
services:
  Flickr:
    output_file: "_outfiles/photos.yaml"
(see this for some other things you can tweak)
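For a feel of how settings like default_title and photographer_lookups might be applied, here is a rough sketch. The two keys come from the config example; the helper names and the exact fallback behaviour are assumptions on my part, not a description of the gem's internals.

```ruby
require 'yaml'

# A cut-down config, parsed from a string so the sketch is self-contained.
CONFIG = YAML.safe_load(<<~EOS)
  default_title: Raw Funk Maharishi
  photographer_lookups:
    pikesley: sam
EOS

# Hypothetical helper: use the default title when the service gives us nothing.
def title_for(raw_title, config)
  (raw_title.nil? || raw_title.strip.empty?) ? config['default_title'] : raw_title
end

# Hypothetical helper: swap a username for a friendlier name if one is mapped,
# otherwise pass the username through unchanged.
def photographer_for(username, config)
  config.fetch('photographer_lookups', {}).fetch(username, username)
end

puts title_for(nil, CONFIG)
puts photographer_for('pikesley', CONFIG)
```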
Tread carefully for now, because my metadata hacks aren't fully documented, and I may have inadvertently nailed-in some Raw Funk Maharishi-specific stuff (although I've tried hard not to).
There's no reason I couldn't support other services. There's some introspection magic at the heart of all of this which means that as long as each service is represented by a class that:
- includes the
- sports a ::matcher class method which returns a string which will pick a URL out of an input file, and
- has a #distill method which takes a URL representing an item on the service and returns a hash of metadata, see e.g.
- and optionally a ::resolve class method which takes a set or album URL for the service and returns a list of URLs for individual items
then this should all Just Work. There's definitely a blog post in this, because Ruby introspection and metaprogramming are just mind-bogglingly powerful (and dangerous).
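As a taste of the kind of introspection involved, here is a minimal sketch of a module that remembers every class that includes it, and then picks a service class by its matcher string. Everything named here (Ingester, FakeTube, service_for, the example URL) is invented for illustration; only the ::matcher, #distill and ::resolve shape comes from the list above.

```ruby
# Hypothetical module: keeps a registry of the classes that include it.
module Ingester
  # Ruby calls this hook whenever a class does `include Ingester`.
  def self.included(base)
    registry << base
  end

  def self.registry
    @registry ||= []
  end

  # Find the first registered class whose matcher string appears in the URL.
  def self.service_for(url)
    registry.find { |klass| url.include?(klass.matcher) }
  end
end

# A made-up service, following the contract described above.
class FakeTube
  include Ingester

  def self.matcher
    'faketube.example'
  end

  # Takes an item URL, returns a hash of metadata (canned here, no network).
  def distill(url)
    { 'url' => url, 'title' => 'Untitled' }
  end

  # Optional: expand a set URL into item URLs (canned here too).
  def self.resolve(set_url)
    ["#{set_url}/1", "#{set_url}/2"]
  end
end

url = 'https://faketube.example/videos/42'
service = Ingester.service_for(url)
p service.new.distill(url) if service
p service.resolve('https://faketube.example/sets/7') if service.respond_to?(:resolve)
```

The point of the respond_to? check is that ::resolve stays optional: a service class without it still registers and distills single items.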
And of course, known issues are here.
Because Bernard Purdie is even more amazing than Ruby introspection.