Building our site: From Django & Wordpress to a static generator (Part I)

Tue, Sep 20, 2016

Authors

Alan Descoins

Chief Executive Officer (CEO)

We recently announced the redesign of our website and blog. So far, it has been a great success. The site is a lot faster, SEO is better than ever (signaled by the growth of organic traffic), user bounce rates are down, and both the number of pages visited and the duration of sessions went up. Hurray! :-)

However, the change was much deeper than just a visual revamp. We decided to revise the tech stack that has been powering the site for more than 6 years. It was a good opportunity to create the site we always wanted, and learn in the process.

In this post and the next ones, we will talk about the process we went through and try to answer some questions:

What did we have before?
Why did we change to a static site?
Why did we pick Hugo as our site generator?
What things did we learn about Hugo in the process? How things were done, some tips & maybe a few hacks.
How did we use Hugo together with Webpack for our asset bundling?
How did we migrate and not lose SEO? The URLs have changed; how did we handle the redirections?
How is this deployed?
How did we automate the deployment so a single git push origin master is enough?

In what follows, we will focus on the first 3 questions.

Our stack until 2016: Django and Wordpress

Historically, we have been users of the Django framework for the development of web applications. We like it a lot overall: it is easy for beginners to learn, comes with batteries included and allows for a fast development cycle. In the past couple of years, ongoing development in the JavaScript community has pushed Django to the backend (thank you, React.js and others!). However, the fact that Django templates are not widely used anymore for complex web applications does not mean that the framework has lost its usefulness. For small applications with limited interactivity in the frontend, templates might still make sense.

Every iteration of our website was built with Django and its server-side templates. We had models for most of the content (such as team members, customers, testimonials, etc.), so it could be edited through the Django Admin interface.

How our site looked back in 2012.

Our stack was:

Gunicorn as the WSGI server.
Nginx in front of gunicorn, and to serve static assets.
Varnish as a cache in front of nginx. Not that it was strictly necessary, but we wanted to be prepared in case we made the news and got many concurrent visitors (it was hosted on a small Amazon EC2 instance!).
PostgreSQL as our database of choice.

Our first blog was built using a Django app called Zinnia. It did what it was supposed to, but required us to invest developer time to do basically anything. For the second iteration of our blog (in 2014), we switched to Wordpress since it was much better supported by the community and had plugins available for nearly everything.

The drawbacks

We had this running for several years, but there were several things that never left us quite satisfied with this stack:

Consequences of having servers:
- The need to configure alerts to detect downtime before our customers do :-)
- The need to apply security patches and upgrades.
- The need to ssh in and see what was going on if the site was down for some reason. Only a handful of developers in the team had permissions to do this and they were generally busy with other tasks.
The overhead of Wordpress:
- Needing to have separate users and logins in order to post.
- Keeping plugins up to date. Fixing something that broke was painful.
- Sometimes formatting was lost by simply switching from WYSIWYG to text-only and back :-/. Sadly, this meant that some of our code samples had errors in them.
- Keeping the theme working when plugins were updated.
- Security vulnerabilities. You can never trust the plugins.

There is a saying that the shoemaker's son always goes barefoot. We have configured servers hundreds of times, and set up automated alarms with CloudWatch, Datadog, Pingdom and plenty of others. When you are serious about an app/site, they are a must.

The fact is:

We wanted to avoid servers altogether and not need to configure any alarms for anything related to our website/blog.

We have mostly static content; we are not really an interactive application, so what is the real advantage of using Django for the backend?

A developer's dream: go static

I could imagine a scenario in which our site is just a bunch of .html files in a folder somewhere, just like in the good old days. Could we build a 100% static website?

Static site generators are not new. Jekyll is the most popular one, being the de-facto generator in the GitHub Pages service. There are many others, written in several languages.

The idea really appealed to all the developers in the team: being able to write in Markdown, commit to git (you get version control for free!) and have some system that auto-deploys when stuff is pushed to a certain branch. Could we convince the few non-techies that we have that this was the way to go? ;)

Requirements for the new site

Everything was making sense, so it was time to start working. However, there was this fear that in the middle of the development of the new site and blog, we would find out that some feature was unsupported or very difficult to achieve with the static site generator of our choice. If this happened, we would either end up migrating our code to another generator (adding to our time and costs) or, with great sadness, we would have to reconsider our decision to go static.

To avoid the unfortunate scenario, the smart thing to do was to write down the requirements for the new site and blog. This way we could detect potentially problematic stuff in advance.

URL

The site should be hosted under https://tryolabs.com, and the blog at https://tryolabs.com/blog/.

In our previous site, the blog lived on a separate blog subdomain. Using the same domain as the main site instead of a subdomain is (arguably) better for SEO.

We want to have pretty URLs and nothing "nasty" ending in .html.
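Pretty URLs on a static site usually come from writing each page as an index.html inside its own folder, so the web server can serve the folder path directly. Here is a minimal Python sketch of that mapping. The function name `output_path` and the `public` root folder are assumptions made for illustration, not the API of any particular generator.

```python
from pathlib import PurePosixPath


def output_path(url_path: str, root: str = "public") -> str:
    """Map a pretty URL such as /blog/tags/website/ to the file that backs it."""
    # Drop empty pieces so leading and trailing slashes do not matter.
    segments = [s for s in url_path.split("/") if s]
    # Each page becomes <folders>/index.html, so no public URL ends in .html.
    return str(PurePosixPath(root, *segments, "index.html"))


print(output_path("/blog/tags/website/"))
print(output_path("/"))
```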

Sections and content

The main website should have sections such as Our Work, Team, What We Do, etc. No big deal, but for building the site we have to keep in mind that some of the sections in the website make reference to the blog:

The "From our blog" section on the home page shows the latest 2 blog posts.
Each team member's info in the team page shows the last blog post authored by the team member.

Blog

Apart from the standard style of blog with the main page listing all posts in chronological order and paginated, we had the following requirements:

Posts have one category and can have several tags assigned to them.
We have pages to list all posts with a certain tag or category.
- Each of these pages is paginated, just like the blog home page.
Posts have an author assigned to them.
- Each author has their own page with a picture and mini biography.
- Each of these pages has the list of posts by said author, paginated.
Blog should have Search capabilities.
Every URL of the blog is nested under the /blog prefix. For example:
- /blog/categories/machine-learning/
- /blog/tags/website/
- /blog/authors/alan-descoins/page/2/
- etc.
The URL of a blog post must be something like /blog/YYYY/MM/dd/post-slug/. If possible, we would like to keep the format of the URLs of the Wordpress site.
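To make the post URL format concrete, here is a small Python sketch that builds a /blog/YYYY/MM/dd/post-slug/ path from a publication date and a title. The helper names `slugify` and `post_url` are hypothetical, and the slug rule (lowercase alphanumeric runs joined by hyphens) is an assumption. A real migration would have to reproduce whatever slugs the old site already used.

```python
import re
from datetime import date


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def post_url(published: date, title: str) -> str:
    """Build a /blog/YYYY/MM/dd/post-slug/ URL with zero-padded month and day."""
    return f"/blog/{published:%Y/%m/%d}/{slugify(title)}/"


print(post_url(date(2016, 9, 20), "Building our site: Part I"))
```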

Choosing the static site generator

Having seen several statically-generated blogs (like, every single site in GitHub Pages) we knew many things could be done already. What we were not so sure about were the following requirements:

Tags and categories for posts.
Category and tag pages. Paginated.
Author pages with mini bio and list of posts. Paginated.
Search functionality.

It was time to dig a little into how we would handle these.

Evaluating Jekyll

Because of its big ecosystem, Jekyll is where we started looking.

We looked for sites generated with Jekyll to see if we could find something similar enough to what we were going to build. From our findings, pagination seemed to be the biggest problem, in particular pagination by author/tag/category. With Jekyll you can do tags and categories, but in the sites we found these were single pages that listed everything, instead of separate pages per tag/category with pagination support.

We were glad to find out that the StackExchange blog is open source and built with Jekyll, so we looked there. It made clear that Jekyll can pull off some pretty complex sites, and this one seemed very close to what we needed. It has tag and author pages with pagination! How did they pull it off? With the 400+ lines of Ruby code of the custom pagination plugin they built. We didn't like this: modifying this plugin to adapt it to our reality was a bit too much.

It looked like Jekyll might not be the best fit. Maybe there was some other static generator with built-in support for this?

Trying out Hugo

When I first found out about Hugo, I was impressed and hoped that some day I would find an excuse to try it out. We are talking about a static generator that is shipped as a single executable (built in Go), which needs no runtimes and no plugins, in contrast to Jekyll, which depends on the Ruby ecosystem. It is cross-platform, fast (~1 ms build time per page), and has live reload for easy development. We hoped the features would be on a par with this.

We decided we had to give Hugo a go (pun intended).

The first thing we noticed was that, contrary to Jekyll, Hugo has native support for the notion of taxonomies. These taxonomies give us a way to classify content however we like. For example, we can add a taxonomy for a series of related posts.

The default taxonomies are categories and tags, which is just what we needed. Could we also build a taxonomy for authors and have posts grouped by author? It turns out this is trivial to do. Provided the correct templates exist, Hugo will automatically create pages for all the content under each taxonomy, and these will be paginated. This killed requirements 1 and 2 (tags and categories for posts, and their paginated pages).
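Conceptually, a taxonomy groups posts by the terms listed in their metadata, and each term then gets its own paginated listing. The Python sketch below illustrates that idea only; it is not how Hugo is implemented. The post dictionaries, the `group_by_term` and `paginate` helpers, and the page size are all assumptions.

```python
from collections import defaultdict


def group_by_term(posts, taxonomy):
    """Return {term: [post titles]} for one taxonomy key, e.g. "tags" or "authors"."""
    terms = defaultdict(list)
    for post in posts:
        for term in post.get(taxonomy, []):
            terms[term].append(post["title"])
    return dict(terms)


def paginate(items, per_page):
    """Split a list into consecutive pages of at most per_page items."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


# Hypothetical posts with three taxonomies: categories, tags and authors.
posts = [
    {"title": "Post A", "categories": ["website"], "tags": ["hugo", "seo"], "authors": ["alan-descoins"]},
    {"title": "Post B", "categories": ["website"], "tags": ["hugo"], "authors": ["alan-descoins"]},
    {"title": "Post C", "categories": ["machine-learning"], "tags": ["seo"], "authors": ["alan-descoins"]},
]

by_author = group_by_term(posts, "authors")
for number, page in enumerate(paginate(by_author["alan-descoins"], per_page=2), start=1):
    print("page", number, page)
```

Adding a new taxonomy in this sketch is just a matter of grouping by another key, which mirrors why an author taxonomy needed no special machinery.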

For requirement 3, we needed to store the author pages with a profile picture, a small biography, and some other data. Reading the documentation, this looked like a good fit for data files, where we can have each author as a TOML file. There is even a GitHub issue open as of Hugo 0.16 in order to standardize this, so we should have even better support in the future. OK, requirement 3 was officially killed.

There was only one thing left before actually starting an implementation: how do we handle the Search functionality in static sites? We found there were two options:

Use some search-as-a-service provider. The most common ones are Algolia and Google Custom Search.
Implement our own indexing and Search using a client-side full-text search library, such as lunr or elasticlunr.

We would leave this choice for later; at this stage we only needed to know whether it was possible. The answer was yes: requirement 4 was not an issue anymore.
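To show why the second option is feasible without a server, here is a minimal Python sketch of the core idea behind client-side full-text search: build an inverted index from words to pages at build time, then answer queries by intersecting the matching sets. This is only an illustration of the technique. Real libraries add things like stemming and relevance ranking, and would run in the browser in JavaScript. The sample posts and the `build_index` and `search` helpers are hypothetical.

```python
import re
from collections import defaultdict


def build_index(posts):
    """Map each lowercase word to the set of post URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in posts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # A post matches only if it appears in the set of every query word.
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


# Hypothetical post URLs and text.
posts = {
    "/blog/a/": "Static sites with Hugo",
    "/blog/b/": "Deploying Django with Gunicorn",
}
index = build_index(posts)
print(search(index, "hugo static"))
print(search(index, "with"))
```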

Conclusion

In hindsight, the decision to use Hugo has proved very successful. The entire site builds in under 400 ms, and every developer in the office can have Hugo properly installed within a few seconds for local development.

Thank you to spf13 and all the people who built Hugo. You have created an amazing tool!

In the next post we will dig deeper into how the site is actually generated, and how we solved some of the problems that came up. We will present a modern workflow for a static site, using Hugo and Webpack. If you don't want to miss it, make sure you subscribe to our newsletter.
