Thomas Reynolds, Creator of Middleman / Technical Director at Instrument • Wikia, Friday, Apr 1, 2016
Thomas Reynolds is the Technical Director at famed Portland digital agency Instrument. He’s also the creator of Middleman, one of the most widely used static site generators for large enterprise sites like MailChimp, Sequoia Capital and Vox Media.
With the release of v4.0 of Middleman, Reynolds did a major overhaul to the tool, which had been stable at version 3.0 for almost three years. Here, he discusses his reasons for the update, and walks through the changes.
Full transcript is below
Chris Bach - Hey everyone, welcome to the fifth edition of Static Web Tech Meetup. Thanks so much for coming. We’re really excited today because we convinced Thomas Reynolds to fly down from Oregon where he spends most of his time being the technical director at one of the most famous digital agencies in the world right now, Instrument. But he is also the creator of Middleman, a static site generator and the weapon of choice for some really big enterprise solutions. And I won’t talk more about it because that’s what he’s here to do. There’s a new version out and he’ll go through it, and we’ll talk about the stack. And we’ll talk about- well he’ll take a Q&A afterwards. Really excited thank you so much for coming, and Thomas Reynolds.
Thomas Reynolds - Alright. Hi everybody. You hear me okay? Alright lovely. Alright so I thought I’d start out with a show of hands. Who actually has used Middleman? Lovely, alright. So most of this talk I’m gonna go kinda a little bit into the weeds on some of the new features. If you haven’t used Middleman or if I just completely like, talk right past you because I’ve been looking at this for way too long, just flag me down immediately, don’t let me go too far. I want this all to make sense. And with that, let’s do it. So I’m Thomas. My Twitter handle, @tdreyno and Award Winning Fjords is where I infrequently write inflammatory things, but I’m getting better about it over time. And of course, I am the creator of Middleman. Middleman as you know is a static site generator. It’s written in Ruby and it’s been around for a really, really, really long time. So people seem to like it and it’s been pretty great.
Over the last year and a half or two I’ve kind of shepherded the rest of the company into using Middleman without imposing too much on them, and it’s been really great to see people make really huge and amazing projects without my influence at all. So some of those projects you may not know are actually static sites. Google Design is a project we did for them last year. It is where they house all their material design documentation and is also a medium-like publishing platform where they can put articles about design and videos and stuff like that. So this whole thing, it was actually using a beta Middleman before version four came out, which is really nice of them to let us do that. We recently relaunched Stumptown Coffee from Portland, Oregon if you’re familiar at all. Also, again we’re able to pair something which was an e-commerce site on one side through an API with static for most of the front end, which makes the site super fast.
One of our other teams at Instrument, which I don’t work with directly, also just relaunched a site for eBay’s new mobile app over the holidays. Again straight static site, super fast, it’s on all the CDNs. It’s actually in something like 40 something languages, which was a piece of cake and I’m really proud of this site. I also got to do the front end for this site too so it’s really great. And we also just redid Sequoia Capital’s site, which is really, really beautiful now, and again super fast. This is one of the more complicated builds and we’re slowly writing some technical information about it because it’s pretty cool how it pairs with other non-static platforms for a little bit extra interactivity.
So what is a static site generator? You can skip this slide. It’s a thing you jam data in it and out the other side comes HTML. And hopefully it does that fast, and it allows you to host your site anywhere you want in the world. And not have to worry about security problems or, you know, caching problems. Well, there are still caching problems. So since there’s a thousand of these things now, everyone is making one. Pick your language, there’s probably four or five static site generators being built in it now.
So what makes Middleman different? For one, we’re really old. I wrote the first commit to Middleman seven years ago. I was basically spending my days before Instrument coding a lot of email templates and that is not very fun so I automated it and basically built some horrible pipeline that started with ImageReady, if anyone remembers that. Scripting ImageReady, running it through a Ruby pipeline, and ending up with emails coming out the other side.
As I went through my career I basically just carted this pile of Ruby scripts around behind me, and eventually I gave it a name and put it on RubyGems. And that was seven years ago, which I am very proud of because open source fatigue is real, and trying to stick with it for all that time has been really fun and has also been super rewarding to now see, like so many people using it that I would never even expect. And a lot of people I, you know, sites and stuff I actually use are built using Middleman so that’s really, really great.
Another thing about Middleman is version three, which was the current stable version before version four obviously, was out and stable for three years. That’s huge. It’s a simple tool, and I like to work with simple tools. So like I said data in, HTML out. There’s no real reason to keep reinventing the wheel every single year. There’s a lot of hype, and I’ll go into some reasons why I actually did reinvent a couple wheels for version four, but for the most part I’m proud of the fact that you’re able to build your company on this. You’re able to build your agency on this, and it’s just going to keep working, hopefully. And it’s not going to be the hype train every single year. That said, I may switch over to like, a React versioning thing and just jump to version 22 or 23. Depending on where Jekyll ends up I’ll just go a version higher like do the PlayStation approach.
And yeah, so over the course of this process, there’s been 1,174 closed issues. And that’s a badge of pride, not a badge of shame. There’s just- software has a lot of bugs. All software.
Another pretty differentiating fact about Middleman or at least the way I approach using Middleman is that it’s code first. A lot of systems…you know, they take configuration in the form of JSON or YAML or something like that, or they take configuration in the form of convention. So if you put your files in the right place they come out in a place that you would expect. Or if you are able to make the world’s largest JSON file you can kind of organize things into a system that makes sense to you.
I don’t agree with this. I actually think that, you know, it doesn’t matter how proficient you are with programming, you should write a couple of for loops, and you’re going to get way more power out of those for loops than trying to, I don’t know. I don’t know if anyone has been using Webpack recently, but Webpack config is this crazy set of like special characters inside of strings to make pipelines and it’s pretty fun.
I don’t believe in that, so Middleman is code first. That means when we want to make new features we don’t just expose extensions and new features. We also expose the APIs that we build to use those features to everyone.
So Middleman version four. So after three years of stability I got tired of looking at the Ruby code base that I had been looking at for four years previous. So I started taking some of the things that I learned as a developer in that time and trying to clean up the code base and make it a little easier for other people to work in it.
So basically just started picking features off, pick low hanging fruit, bad code, the code smell, got RuboCop in there to basically standardize all the Ruby into a single style guide, and then just started cleaning things up.
We went ahead and removed any functionality, with the help of a couple of code contributors, that was duplicated. So there used to be this way you could create a proxy, which is like our way of generating a file from a template. There used to be like two ways to do that. We removed one and now just the proxy method remains. So we did that in a bunch of places. Just rip stuff out that was duplicated and slightly different.
We also took out a bunch of features that you’ve never heard of such as Asciidoctor. Asciidoctor is gone, I’m sorry. It is now in its own extension. And then I just started using it at work. So we started using it on projects, and iterating until finally it got stable again. A lot of, pretty much all the code was touched, but we kept the original test suite for the stable version. Test suites aren’t magic, there are new bugs and regressions definitely introduced, but for the most part it should work pretty much the same as it always has with some new features and stability on top.
Alright, so I’m gonna run through a litany of new features. So we added the concept of environments, basically stolen straight from Rails. This is, you know, formerly you could develop on a site or you could build a site and that was basically it. That’s not how we actually work so we added environments so you can have staging, you can have QA, you can have production, and you can fine-tune your config for your site in each one of these environments. Then, when you want to build for one of these targets, you just pass the environment flag there with the “dash e” and it’ll load up that specific config.
So this, you know, config can get really gnarly with all these if statements and environmental variables and this tries to make that way, way simpler. So our continuous integration server uses this. Our automated tests use this, and it makes it really nice to have a whole bunch of intermediate things.
You could also use this for content. So if you wanna have like, your environment could be version one and version two. You could build them separately and maybe they have different data sources on the way or maybe, you know, change some URLs as you migrate from versions to other versions.
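The environment idea can be sketched in a few lines of plain Ruby. This is an illustration only, not Middleman's implementation: the `SiteConfig` class, the `load_config` helper, and the `api_host` and `minify` settings are all hypothetical names chosen for the example.

```ruby
# Hypothetical stand-in for a config object; not Middleman's real class.
class SiteConfig
  attr_reader :settings

  def initialize(environment)
    @environment = environment.to_sym
    @settings = {}
  end

  # Set a value that applies to every environment unless overridden later.
  def set(key, value)
    @settings[key] = value
  end

  # Run the block only when its name matches the active environment.
  def configure(name, &block)
    instance_eval(&block) if name.to_sym == @environment
  end
end

# Build a config for the named environment (the thing a "-e" flag would select).
def load_config(environment)
  config = SiteConfig.new(environment)
  config.set :api_host, "localhost:4567"
  config.set :minify, false

  config.configure :staging do
    set :api_host, "staging.example.com"
  end

  config.configure :production do
    set :api_host, "www.example.com"
    set :minify, true
  end

  config
end

puts load_config(:staging).settings.inspect
```

Keeping the overrides in named blocks is what replaces the pile of if statements: each environment reads as one unit, and anything not overridden falls through to the shared defaults.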
We went ahead and we moved our new project templates to GitHub. They used to be in RubyGems which meant you had to do a lot of work in Ruby to install them and then get them in your path, and then find out where they are. So now they’re just Git repos that get git-cloned down. This is basically how NPM and Bundler work anyway. So everyone’s pretty much okay with their libraries just shelling out to the git command.
So now if you’ve got a template for a reusable project that you want to make just throw it up on GitHub, make it a public project. You can just pass the GitHub shorthand account name, repo name and it will pull it down for you.
There are also- they’re not just static files anymore. So they can be evaluated as Thor tasks which is kinda like a Rake task. It has a little bit extra undergrounded stuff to it. Middleman uses it for its whole command line for historical reasons, so that’s why we’re using it here. And basically it allows you to script new projects. The current default new Middleman project now will ask you things like “Hey are you using Sass? Do you want to use Compass? Do you wanna use LiveReload?” It will build that config for you based on those preferences rather than just doing a one size fits all. So that is available now to anyone who wants to customize their own templates.
We were running into a problem where people would write a loop inside of their config based on some external piece of data so they would say, “I have a data file that’s got a list of people in it and for each person I want to make a new person, well you know, personname.HTML.” And then they go back and they edit the data file and then they wouldn’t get a new person HTML out the other side. And that was really confusing to people, but the way it actually worked was you have a config file, we parse it when you start the thing, and then we just throw our hands up. And if you change the conditions of what information went into that config file, you have to restart the server to make it actually see those changes.
So in version four we re-architected a bunch of stuff in the core into this thing called collections, and basically what it means is that external sources, such as the files you have on disk or the data you have in your data folder, are now these kind of live objects. Anytime they change they send an event, and it looks into your config and if it sees that you are ever using “people” from “people.YAML” and you’ve touched the file “people”, it will rerun the inside of your for loops. This should just basically mean you don’t have to restart your server as often, but it does let you do really weird things like you can implement categorization or paging, or… a lot of like the blog functionality by just writing plain Ruby in your config. Say, you know, “look at the site map, tell me all the pages that are CSS, look in them, see which ones reference some URL, group those together and concatenate them.” You can do that now and it’ll be up to date throughout the whole process. A lot of people don’t need this functionality, but if you are having to restart your server over and over again it’s pretty welcome.
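A rough sketch of the "live object" idea, in plain Ruby rather than Middleman's own collections code. The `LiveSource` class and its `each_change` and `update` methods are hypothetical; the point is only that the loop body is stored and re-run whenever the underlying data changes, instead of running once at startup.

```ruby
# Hypothetical live data source: holds a value and notifies listeners on change.
class LiveSource
  def initialize(value)
    @value = value
    @listeners = []
  end

  # Register a block and run it once immediately with the current value.
  def each_change(&block)
    @listeners << block
    block.call(@value)
  end

  # Replace the value (for example after a data file was edited)
  # and re-run every registered block with the new value.
  def update(value)
    @value = value
    @listeners.each { |listener| listener.call(@value) }
  end
end

people = LiveSource.new(["ada", "grace"])
pages = []

# The "loop in config": recomputed from scratch whenever the source changes.
people.each_change do |names|
  pages.replace(names.map { |name| "/people/#{name}.html" })
end

# Simulate editing the data file: a third person is added.
people.update(["ada", "grace", "linus"])
puts pages
```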
As of about a week ago, someone had a really good idea, which is “Why can’t I set all these config values from the command line because my continuous integration server would really like that.” So this is in the current release now. Any config that Middleman can do can be passed as a command line argument.
We’ve been working through a bunch of stuff that relates to localization, which Middleman is quite good at compared to some other systems. And one thing we had a problem with was basically how do you refer to the thing that generates a file in a bunch of different locales. So you have like, a page which is the about page, and you wanna say like, “Okay link me to the about page in Spanish.” Or like “Link me to the about page in the default language.” And so what we ended up coming up with is an optional system that allows you to add a unique ID to any object in the system and then link directly to it using the normal template link helper. So whenever you’re building proxies or using the page command, you can pass an ID that becomes a unique key and then you can use that whenever you are building URLs. So it makes it really easy to like loop over all the locales and just generate a link for every language in every locale.
Got a little ambitious with the way URLs are handled, which works 95% of the time. Basically, previously we were taking the kind of Rails approach where we have these Ruby helpers that are like link_to or image_tag, or something that will generate a path to an asset. And then when we wanted to do a feature like add a hash to the end of every single file as we, you know, for caching reasons, we would hack, basically, into that helper and rewrite its output with different string content.
Our continuous integration server has these super deeply nested paths so this lets us build these builds that are portable without people having to remember to go up three directories or four directories depending on where the files are.
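The ID-based linking can be pictured as a lookup table keyed by ID and locale. The sketch below is plain Ruby with hypothetical names (`PageRegistry`, `register`, `url_for`), and the fallback to the default locale is an assumption made for the example, not something the talk specifies.

```ruby
# Hypothetical registry mapping [id, locale] to a generated URL.
class PageRegistry
  def initialize(default_locale)
    @default_locale = default_locale
    @pages = {}
  end

  # Record a generated page under a stable id for one locale.
  def register(id, locale, url)
    @pages[[id, locale]] = url
  end

  # Look up a page by id; fall back to the default locale if the
  # requested locale has no page (assumed behaviour for this sketch).
  def url_for(id, locale = @default_locale)
    @pages.fetch([id, locale]) { @pages.fetch([id, @default_locale]) }
  end
end

registry = PageRegistry.new(:en)
registry.register(:about, :en, "/about.html")
registry.register(:about, :es, "/es/acerca.html")

# One link per locale, all derived from the same id.
links = [:en, :es].map { |locale| registry.url_for(:about, locale) }
puts links
```

Templates then only need to know the ID `:about`, never the localized file name.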
As part of our refactor, we had some bad programming in the old version. Where basically, Ruby at the time was really a fan of singletons so we had this shared context that whenever you started Middleman it put it all on this one place and everyone can look into it. And that worked pretty well. But it wasn’t multi-threaded, we couldn’t run in parallel, and basically if you happened to have a variable the same name as one of my functions you would override it. And then we had all these bug reports of like, why did I override this setting from some other file in a loop? So we went in, we gutted that. Every single executable piece of user code now has a like a nice little sandbox that it lives in.
As a side effect you now have to add methods to these contexts if you’re an extension developer. So if you want to add a new command that is available in config.rb you can say expose to config. Or if you want something to show up in a template, you can use the helpers method as currently or you can, in your extension expose specific methods to the template. So this is actually a backwards incompatibility with a lot of the current extensions. It shouldn’t necessarily touch that much user code, but there it is.
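The sandboxing described here can be sketched as follows. This is plain Ruby under stated assumptions: `ConfigContext`, `App`, `expose`, and `run_config` are hypothetical names, not the extension API itself. User code is evaluated inside a throwaway object that only has the methods an extension explicitly handed to it.

```ruby
# Hypothetical sandbox: each piece of user code gets a fresh context object
# that only knows the methods explicitly exposed to it.
class ConfigContext
  def initialize(exposed)
    exposed.each do |name, callable|
      define_singleton_method(name) { |*args| callable.call(*args) }
    end
  end
end

class App
  attr_reader :ignored

  def initialize
    @ignored = []
    @exposed = {}
  end

  # An extension opts a method in; nothing else leaks into user code.
  def expose(name, &block)
    @exposed[name] = block
  end

  # Evaluate user code inside a throwaway context, not inside the app itself.
  def run_config(&user_code)
    ConfigContext.new(@exposed).instance_eval(&user_code)
  end
end

app = App.new
app.expose(:ignore) { |path| app.ignored << path }

app.run_config do
  ignored = "a local variable with a clashing name" # does not touch app state
  ignore "drafts/*"
end

puts app.ignored.inspect
```

Because the user's block never runs with the application as `self`, a clashing variable or method name in user code cannot overwrite application state, which is the class of bug the shared singleton used to cause.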
Pro is, now we can run all of our builds parallel. This is a kind of like a 30, 40% speed boost. This is on by default in v4, and it’s working pretty well. A lot of it is still I/O bound, and just Ruby slowness in general, but this is a freebie.
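A minimal sketch of building pages in parallel, using a queue and a small pool of Ruby threads. The talk does not say whether Middleman uses threads or processes, so treat the threading choice, the `build_in_parallel` helper, and the `render_page` stand-in as assumptions. In standard Ruby, threads mostly help with the I/O-bound part of a build.

```ruby
# Hypothetical stand-in for real template rendering.
def render_page(path)
  "<html><body>#{path}</body></html>"
end

# Render every path on a pool of worker threads pulling from a shared queue.
def build_in_parallel(paths, workers: 4)
  queue = Queue.new
  paths.each { |path| queue << path }
  results = {}
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        path = begin
          queue.pop(true) # non-blocking; raises ThreadError once the queue is empty
        rescue ThreadError
          break
        end
        html = render_page(path)
        lock.synchronize { results[path] = html } # guard the shared hash
      end
    end
  end

  threads.each(&:join)
  results
end

output = build_in_parallel(["/index.html", "/about.html", "/contact.html"])
puts output.keys.sort
```

This only works safely once pages no longer share a mutable global context, which is why the sandboxing refactor had to come first.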
So Middleman has this source folder, and that’s where you put all of your contents. I mean whenever a file changes it updates the thing, so you can go ahead and LiveReload or you can refresh, and you’ll get the updated contents.
Kind of abstracted that a bit, and now you can say my Middleman source is a combination of any arbitrary number of folders that exist anywhere. So you could say source is a combo of my actual source folder and maybe these Bower files over here, and maybe some shared, mounted Dropbox disk over here. And it’ll just layer them all on top of each other and lets you build one site out of all those things. So that lets you have shared libraries that you may wanna have, like on a whole bunch of different sites. You don’t wanna copy them in or use GitHub submodules. You can just point them all at a shared directory on your server and it’ll work just like the normal source folder.
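Layering several folders into one logical source tree can be sketched like this. The `layered_files` helper is hypothetical, and the rule that earlier directories win on a name clash is an assumption for the example; the talk does not say how Middleman orders or prioritizes its sources.

```ruby
require "tmpdir"
require "fileutils"

# Merge several directories into one logical source tree.
# Earlier directories win when the same relative path exists in more than one.
def layered_files(directories)
  merged = {}
  directories.each do |dir|
    Dir.glob("**/*", base: dir).sort.each do |relative|
      full = File.join(dir, relative)
      next unless File.file?(full)
      merged[relative] ||= full
    end
  end
  merged
end

# Demo with two temporary folders: a site's own source and a shared library.
Dir.mktmpdir do |root|
  source = File.join(root, "source")
  shared = File.join(root, "shared")
  FileUtils.mkdir_p([File.join(source, "css"), File.join(shared, "css")])
  File.write(File.join(source, "index.html"), "home")
  File.write(File.join(source, "css", "site.css"), "local styles")
  File.write(File.join(shared, "css", "site.css"), "shared styles")
  File.write(File.join(shared, "css", "reset.css"), "shared reset")

  layered_files([source, shared]).each do |relative, full|
    puts "#{relative} <- #{File.read(full)}"
  end
end
```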
Finally as part of like, the general clean up that we did, I went through and added Contracts which is a library in Ruby that adds kind of like a design by contract typing to Ruby. So whereas formerly it was just a complete, you know, free for all, Ruby is really permissive, and tries to coerce types back and forth without even asking you. Now every public Middleman method has a type signature, and so if you’re writing an extension against it, or if you are even writing code in your config.rb and you pass a string where we expected a function, you’ll get a nice error message that tells you “Hey, we expected a function, maybe you should fix that up.”
So this has been really, really great. I did a talk on it, which is on YouTube here. I’ll make sure these slides are available for everyone later. But it’s been really great to find all the bugs in your code you didn’t even know you had. I think within like a couple hours of turning it on I had already found a whole bunch of even internal method calls that were future bugs waiting to happen because of some type coercion stuff.
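To show the kind of check being described, here is a tiny hand-rolled version in plain Ruby. It deliberately does not use the Contracts library, whose real syntax is richer; the `TypeChecked` module, its `contract` method, and the `Sitemap#register` example are hypothetical.

```ruby
# Hypothetical mini "contract" helper. It only checks argument classes
# at call time and raises a readable error on a mismatch.
module TypeChecked
  def contract(method_name, *expected_types)
    original = instance_method(method_name)
    define_method(method_name) do |*args|
      expected_types.each_with_index do |type, index|
        next if args[index].is_a?(type)
        raise ArgumentError,
              "#{method_name}: argument #{index + 1} expected #{type}, got #{args[index].class}"
      end
      original.bind(self).call(*args)
    end
  end
end

class Sitemap
  extend TypeChecked

  def initialize
    @manipulators = []
  end

  # Store a named callback and return how many are registered.
  def register(name, callback)
    @manipulators << [name, callback]
    @manipulators.size
  end
  contract :register, Symbol, Proc
end

sitemap = Sitemap.new
sitemap.register(:proxies, ->(resources) { resources })

begin
  sitemap.register(:proxies, "not a function")
rescue ArgumentError => error
  puts error.message
end
```

The value is in failing at the call site with a message naming the expected type, rather than much later when the string is finally called like a function.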
So yeah. Middleman 4 is out. It’s 4.1.4 as of today. I don’t have very many plans for the next version. I’m just going to keep using it at work, and as I come into new problems I’m going to continue to try to solve them. So we’re gonna aim to get back to that two to three years of stability with these new features, and let people keep building their stuff on it.
We still, as of now, do not have an official release of Sprockets support for version 4. That is the project which is the Rails asset pipeline. It’s a rough library to work with, and I don’t use it anymore so I stopped working on it. It’s not in there yet, but we have a new maintainer for it. It looks like MailChimp’s gonna be taking over as maintainer so we should have a full, polished up asset pipeline support for version 4.
That’s probably the biggest regression if you’re using just like, out of the box Middleman, and expecting it to work like Rails would. I don’t use it that way so it was really hard to keep the motivation up to support what is a really painful library.
And then, yeah. Basically what we can do to make it better is everyone else can help me out a little bit. Open source is super hard and I only have this one Mac laptop, and I don’t know how things are working on Ubuntu or every single new version of Windows. So keep submitting bug reports. Version 4 has been out for about three months now. We’ve nipped most of the super obvious regressions in the bud, but if you see some or if you try to upgrade your extension or your app to it, reach out. Super responsive.
And then we have the community forums and the Middleman site. So if you know what you’re doing, if you’re already at the forefront of static, there’s a lot of new people and there’s a lot of designers dipping their toes in who just need help figuring out the basics of what they’re looking at.
I made this slide because I was getting super excited that I was about to get to a million downloads. But between this afternoon and now we actually went over that. I was gonna beg you all to open up your laptops and give me a couple of hits, but I don’t need it! We’re good.
[Audience Member] Alright.
[Thomas] So, yeah. You know, outside of Middleman and more for like, this group, it’s actually really exciting to have groups like this right now, and it’s really exciting to be able to do a large portion of my work as static development which is really great. So I think we’re moving towards something, and I don’t entirely know what that new stack or platform is, but it looks kinda like this. So I commit to GitHub, I get website. And so like right now, we’re using Netlify a lot for that. Makes that pipeline super simple. We also have some internal tools for that, but hopefully it will be more things like this that just, you know, take out all those build steps for you.
We’re still trying to solve the problem that WordPress I guess solved which is, clients really love to have a nice interface to put their data into. So we have these YAML files, we have these JSON files, they’re not super fun to look at, and our clients, or at least my clients aren’t going to go into GitHub and make pull requests for it. We’re trying to solve the problem of how do we do that.
So right now I’m using Contentful a lot. We’re also using some horrifying Squarespace scraping for some other projects to get data out of Squarespace and into YAML. So it’s really super pretty for our clients, and then we don’t have to go there. But I think that’s like a completely open question still is how do we, you know, they don’t care they just want, like I said, git in, website out. They just wanna form field in, website out. Those are the two problems I think we need to solve. And trying to see if we can come up with something that looks as good as Squarespace, but it’s actually like a joy to work with, it really works with static.
That’s it. Thanks guys.
[Host] Let’s do some questions.
[Thomas] Do it. Yeah, so the question is whether incremental builds which just regenerate the files that need to change on demand are probably something that’s gonna be necessary in the future? I think, yeah probably.
Like, right now we’re just too agnostic because basically we’re saying like “Hey we’ll shell out to Node.js and if it gives us a string back we’re gonna make that a template.” Or like, you wanna use Go templates, whatever. So there’s not like anything that lets us- Everything does its own dependency management differently too. So like Liquid template dependencies and partials are completely different from ERB ones, different from HAML ones.
The ecosystem is a little too broad to do that. That said, it would be probably pretty simple to just, instead of using our built in partial method, make your own, like tracking partial. And then you can actually build a look up of what file you’re in and where you end up on the way out.
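The "tracking partial" idea could look roughly like the sketch below. It is plain Ruby with hypothetical names (`TrackingRenderer`, `partial`, `pages_using`), not an existing Middleman feature: the partial helper records which page asked for which partial, so a changed partial can be mapped back to the pages that need rebuilding.

```ruby
# Hypothetical renderer that records which partials each page pulls in.
class TrackingRenderer
  def initialize(partials)
    @partials = partials # partial name => content
    @used_by = Hash.new { |hash, key| hash[key] = [] }
    @current_page = nil
  end

  # Render a page; the block calls partial(name) wherever one is needed.
  def render(page)
    @current_page = page
    yield self
  ensure
    @current_page = nil
  end

  # Like a normal partial helper, but notes the page -> partial dependency.
  def partial(name)
    @used_by[name] << @current_page unless @used_by[name].include?(@current_page)
    @partials.fetch(name)
  end

  # Pages that must be rebuilt when the given partial changes.
  def pages_using(name)
    @used_by.fetch(name, [])
  end
end

renderer = TrackingRenderer.new("header" => "<h1>Site</h1>", "footer" => "<p>bye</p>")
renderer.render("/index.html") { |r| r.partial("header") + r.partial("footer") }
renderer.render("/about.html") { |r| r.partial("header") }

puts renderer.pages_using("footer")
```

A lookup like this covers partials only; changes to layouts, locale strings, or data files would still need their own tracking or a full rebuild.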
So I think it’s definitely super useful. It’s not the most common use case, but you know, patches welcome basically is I think my answer on that one. It would definitely improve things a lot. You’re still not gonna be able to get out from under things like localization or string changes or layout changes that are gonna basically have to cause an entire rebuild of the site. But for tiny little HTML fixes on the page, that would save you a bunch of time.
The question was basically whether something like Contentful or a hosted CMS like what are the flaws currently in the current ecosystem with that? Contentful is completely agnostic. You just upload your data model and you put content into fields and you can get that back out later to build your site out of. I think that approach is always gonna be not the smoothest because it’s just like you’ll get like these random building blocks, but you don’t have a nice design through line.
Something like Squarespace is super nice because it’s like this is for blogging. And then the fields are things that you do in blogging, and they have a nice UX so that it works the way you expect it. So I actually think, you know, you have this Contentful thing which is like, data stored on the internet. And then you have Middleman which processes it. I think there’s probably another layer, which is these like, domain specific tools. So there’s something like, you know, Contentful, but like a blog editor interface between the two. So rather than using their form fields you use something that looks like WordPress.
Or, you know, if you have really specific contents publishing or like things where you’re using a lot of images and galleries, there would be some other piece of application or software that would, you know, give you nice UI and UX over what is basically just an arbitrary data source. I’m basically hoping the space gets so crowded that like every single niche has a product that tries to fill it.
Yeah so the question was basically, whether generic build tools like Gulp and Grunt and Rake - like whether they served the same purpose as like a static site generator? And I think it’s kinda the same answer to the previous question like, they’re the baseline tool and then you need domain specific tools on top of that. So you’re always gonna have- You know, you’re gonna be working with Sass, and whether you wanna write that same damn Gulp task every single time to do the same thing, you can do that, but eventually someone is going to package it up. And that’s just what we call a static site generator. Like, you can build that- There are plenty that are actually just built entirely- I think Brunch is one that’s I think a lot of just Gulp or Grunt plugins, I don’t remember.
And then, I think Octopress is another good example. Octopress is like built on top of Jekyll so it’s another line down. So you have the build tool, and then you have the thing that just makes websites, and then you have the thing that makes blogging websites. So I think we’ll just keep trying to make a nice pyramid of technologies and have more and more specific ones at the top.
You can also build, if you have the tools in place, you can use another piece of routing software like Apache or something to route people between two versions of the same page. But, as it’s static there’s not really a lot of control you have once it’s been built out.
So yeah, the statement was version four’s internals are using a lot more immutable data structures, and that is correct. We’re using the Hamster library, which is some persistent data structures for Ruby. I just want to play with them. And hopefully everything comes out the same the other side. People were in their code. They were looking at a data object and then they were changing a value and wondering why it didn’t save to the original source file. And so now it yells at you. I guess. So I think that’s really the only problem that the immutable structures solve. Hopefully people are not mutating their data too heavily, but it happens.
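The "now it yells at you" behaviour can be illustrated with plain frozen objects. Note the swap: Hamster provides persistent data structures, while this sketch only uses Ruby's built-in `freeze`, which is a simpler technique that gives the same visible effect of refusing in-place edits. The `deep_freeze` helper is a hypothetical name.

```ruby
# Recursively freeze nested hashes, arrays and strings so that accidental
# mutation raises instead of silently changing an in-memory copy.
def deep_freeze(value)
  case value
  when Hash
    value.each do |key, item|
      deep_freeze(key)
      deep_freeze(item)
    end
  when Array
    value.each { |item| deep_freeze(item) }
  end
  value.freeze
end

data = deep_freeze({ "people" => [{ "name" => "Ada" }] })

begin
  data["people"].first["name"] = "Grace"
rescue RuntimeError => error # FrozenError is a RuntimeError subclass
  puts "refused: #{error.class}"
end
```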
So the question was have I tried the Netlify CMS? I’ve looked at it, and I’m excited to play with it more. I don’t have any projects that would need it now, and our stack is really spun up with Contentful for a lot of stuff. But, yeah I wanna see more tools.
Yeah, so the question was basically, are clients asking for static or are they asking for the features that static provides and we now have a solution for them? Does that sound about right?
[Man] Yeah, just like if it’s easier to pitch like, for arguments of performance, are they more receptive to this than before? Like arguments that concerns the infrastructure rather than just the design-
[Thomas] So for the most part, no. No one cares until like, you sign a contract and you actually get to speak to someone technical. It’s basically people who sell marketing sites don’t know the difference. They’re paying for you to make a site that’s stable and fast and secure. And how you do that is more for you to work out on the side.
Middleman’s been better for us just to have a single pipeline for everyone that’s able to learn the same kind of tool. We still do Gulp and Grunt, and Webpack and Node and React and all these things, but if it’s just gonna be like, knock out what is a traditional website, then it’s really great to have a shared platform for that.
All the stuff on the side is stuff that I have to care about. I don’t want the server to go down. I don’t want to get hacked. You know, I want to be fast, I don’t want to have to pay for this cloud server all the time. So I get all those benefits without even having to like, think about it because I just deploy to the CDN and go to bed. It will never crash and it’ll never be hacked. It’s been great for me as a technical director to not have to worry, but the clients don’t notice as much.
Yeah, so the question is, what drives the pace of new releases for Middleman? Basically, yeah. In my conception it’s pretty much feature complete and has been since that 3.0 release. It provides the things you would get from competitors. So all the stuff that’s on top of this is niche features I need to do in my day job, and, you know, kind of power features that a small subset of people use. But, for the most part you know going from taking templates- Probably the biggest thing it has that it’s had forever is localization. That a lot of these frameworks don’t have. So the fact that it does the normal thing plus localization and gets you a nice, like fast site up the other end. Like, that’s kind of enough for me. That’s like one tool’s job. So if there are more features that can be glommed on I tend to want them to live in another step in the pipeline, or in like, an external feature.
Like, for Sequoia we actually did build a whole bunch of custom extensions that added additional functionality outside of the core. So for me as long as the extension API is solid and it’s working for people, then I’m okay with those kind of weird things going there.
Vox has stuff where they run- A lot of their CMS stuff is run through shared Google Docs that Middleman slurps in and builds with. And like, so they host their own Google integration plugins for Middleman. That doesn’t ever need to come into core, but like, they’re able to go ahead do that kind of stuff. Same thing with Contentful. I haven’t talked to the tech people on their side at all. Like, they just have a Middleman plugin that works pretty well.
So yeah, as far as like who’s involved with it. It is pretty much me. I have a couple of other people at work who- What happens with open source is when you are working on it you care very much. And so you submit a ton of bugs, submit a ton of patches, and then when you are not working on it, you just go off to your life, unless you’re the maintainer.
So people come in and out. So I have a bunch of like really great maintainers, but when they’re in between projects it’s probably just me. And honestly, you know, I’d say maybe 90 minutes of every day is simply bug triaging, and support and pointing people to the correct documentation.
So it’s a large investment of time like from a community management standpoint. So I’m trying to encourage more people. I’m trying to have these teams that are now using it in a large capacity, and empower them to make their own extensions. To handle specific use cases.
[Host] Alright, are there any more questions?
Thanks everyone. Awesome.
[Host] Thanks so much. Thank you Thomas Reynolds. That was an awesome presentation.
Alright. Thanks everybody.
So yeah this is our third… static site generator presentation I think out of the five meetups we’ve had. I mean that’s a very central build tool for this new approach and this new stack. Actually the biggest public directory for static site generators is called Staticgen.com, and that’s built in Middleman so there’s that. We maintain that. And, well yeah. Once again, thanks so much for coming. That was great. That was a great Q&A session. Looking forward to a- I guess it will be a few years before we see 5.0 right?
But you guys also in your work and the agency bring out some really awesome sites. I think my favorite is probably Sequoia right now. I mean for that type of site that just tells such a beautiful story and it’s super performant, so that’s a real joy. But yeah, I’m looking so much forward to seeing your upcoming projects and pushing the envelope for what can be done for this new stack. So thanks for that. Thanks for coming, and thank you everyone for coming. And see you next time.