Sphinx to Nikola
As you can likely tell, my website looks different now. For a very long time I have been using Sphinx with a custom theme for my personal website. It has served me rather well, but over time I have been pushing it in directions it was not really designed for. First and foremost it is a documentation generator. As such it has a hierarchical structure and does not support blog posts or RSS feeds. Some of my content is “timeless”, like the study material, but other things slowly become outdated and fit better into a blog structure. There are extensions to Sphinx that try to add these features, but I decided to move to Nikola instead.
In this post I will describe how I have made the transition and what was needed to get the content moved over.
There is a nice overview of static website generators that lists many of them. There are many to choose from, but I wanted to stick with one that is fairly popular and won't have its support dropped soon. It should also be mature, as I want to keep using it for a while. In the past I had already looked at Pelican and Tinkerer but found them rather cumbersome. I had also looked at Nikola before. What put me off was that it only had two levels of hierarchy for the pages. This seemed very limiting at the time, but now I am much more relaxed about it and just made the move.
I am looking forward to writing blog posts more easily, with a tool that is better suited to the task, so that I can concentrate solely on the content.
The conversion took me three days. I had 194 posts and 34 pages that needed to be converted.
Structure and navigation
Sphinx has a tree of documents that can be arbitrarily deep. One defines an index page and uses the toctree directive to add child pages to it. That is what I did: I had a hidden toctree which included all the structural pages like “Programming”, “Computer” or “Studies”. There I would then have additional toctree directives that include all the other files.
This way Sphinx does not really have a site navigation but rather a document structure. For a manual to be rendered to HTML, PDF or EPUB, this is just great; for a personal website it is a bit clunky. The top navigation was a real hack in the template. I accessed the complete structure tree and let it render the top level. I had to adapt the CSS classes manually such that the current one would render as active.
Nikola approaches this by having posts and pages. The posts are the blog posts; each has one category and multiple tags, and they are ordered chronologically. The navigation I can set up manually however I want. By just pointing to the URLs of the category indices I get a list of posts for each category, without any hacks.
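A manual navigation along these lines could look roughly like this in conf.py. Take it as a sketch under assumptions: NAVIGATION_LINKS is the Nikola setting for the menu, but the category names are simply borrowed from my old structural pages, and the URL paths are guesses that depend on how the category indices and pretty URLs are configured.

```python
# conf.py (excerpt): a minimal sketch with assumed category URLs.
DEFAULT_LANG = "en"

# Each entry is a (URL, label) pair. The paths below are assumptions;
# the real ones depend on CATEGORY_PATH, CATEGORY_PREFIX and PRETTY_URLS.
NAVIGATION_LINKS = {
    DEFAULT_LANG: (
        ("/categories/cat_programming/", "Programming"),
        ("/categories/cat_computer/", "Computer"),
        ("/categories/cat_studies/", "Studies"),
    ),
}
```

The actual category URLs are easiest to read off the generated output directory after a build.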
Sphinx did not care about the location of the source files; I just had to link them together manually for the overall tree structure. With Nikola I have the directories pages and posts. The posts all get thrown into the blog structure, and the pages are compiled but I have to link to them manually. The blog posts themselves have metadata which contains their category and tags. I use the YAML metadata and it looks like this:
---
title: Derivation of the Euler-Lagrange-Equation
date: 2013-06-12 11:27+0200
category: Science
tags: Physics
---
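For a larger batch of posts, a header like this could also be generated by a small script. The following is only a sketch: the function name front_matter and its parameters are made up for illustration, and it assumes that none of the values contain characters that YAML would need quoted.

```python
def front_matter(title, date, category, tags):
    """Return a YAML metadata block for one post (hypothetical helper).

    Assumes plain values; a title containing ': ' or a leading quote
    would have to be quoted before it is valid YAML.
    """
    lines = [
        "---",
        f"title: {title}",
        f"date: {date}",
        f"category: {category}",
        # Several tags are written as one comma-separated string.
        f"tags: {', '.join(tags)}",
        "---",
    ]
    return "\n".join(lines) + "\n"


header = front_matter(
    "Derivation of the Euler-Lagrange-Equation",
    "2013-06-12 11:27+0200",
    "Science",
    ["Physics"],
)
```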
Sphinx uses the image and figure directives and the download role to tie files into the document. The images can reside anywhere, but I just kept them in the same directory as the text. My travel report about China was located at travel/2019-06-china/index.rst and the images are all in the same directory. This makes it easy for me to see all the files that belong to one post.
Nikola rather wants to have these in a separate top-level directory images. This can easily be changed by adding the posts directory to the image folders in the conf.py:
IMAGE_FOLDERS = {'images': 'images', 'posts': 'posts'}
And as the posts can reside anywhere, I can just have the article file at posts/wuhan-beijing-china-2019/main.md and put all the images into that directory. This way I have all the pictures together with the article and do not need to worry about directory paths when including the images in the source with Markdown's ![]().
The same can be done with the files as well. One just has to disable the separate copying of sources; otherwise two rules copy the same original files and clash.
COPY_SOURCES = False
FILES_FOLDERS = {'files': '', 'pages': 'pages'}
This allows me to keep the PDF documents from my studies close to their pages as well. With both of these in place I can keep the directory structure that I had with Sphinx, and the URLs change only slightly. Also the posts can still be called index.md, such that the file name stays the same.
Theme
I did not quite like the default bootblog4 theme. There is the bootstrap4 theme which is more suitable for a mixed blog and site like mine. But the colors and fonts are the same as on every other Bootstrap website, so I wanted something a bit different.
I was very happy to learn that there is the concept of Bootswatch, which allows me to just exchange the subtheme without having to change anything myself. So I was able to install a different subtheme via this command:
nikola subtheme -s flatly
And in the configuration file I needed to switch the theme:
THEME = "custom"
Convert reStructuredText to Markdown
Nikola supports different markup languages. I have been using reStructuredText for my old website and I could have just continued using that. But I have been using Markdown for everything else (technical reports, R notebooks, code documentation, personal diary) and therefore wanted to make the jump at some point. Nikola uses Pandoc for the Markdown conversion, so a lot of features beyond original Markdown are supported as well. The ones that are still missing are these:
- Figure with caption
- Citations
Even custom directives are supported with Markdown, which is quite nice.
To convert the reStructuredText files to Markdown I just used Pandoc:
pandoc --atx-headers --columns=79 index.rst -o index.md
A few things needed to be fixed by hand afterwards, for instance the classes on the fenced code blocks or unescaping a few ' and " symbols.
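With this many files, the Pandoc call above lends itself to a small loop. The sketch below is one way to do that, not necessarily how it has to be done: the function names are made up, the flags are exactly the ones from the command above, and it assumes that each Markdown file should sit next to its reStructuredText source.

```python
import subprocess
from pathlib import Path


def pandoc_command(rst_path):
    """Build the Pandoc call for one file, with the same flags as above.

    Note: newer Pandoc releases replaced --atx-headers with
    --markdown-headings=atx.
    """
    md_path = rst_path.with_suffix(".md")
    return [
        "pandoc", "--atx-headers", "--columns=79",
        str(rst_path), "-o", str(md_path),
    ]


def convert_tree(root, dry_run=True):
    """Convert every .rst file below root; only print the calls if dry_run."""
    commands = [pandoc_command(p) for p in sorted(Path(root).rglob("*.rst"))]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True aborts on the first file that Pandoc rejects.
            subprocess.run(command, check=True)
    return commands
```

Calling convert_tree("posts") only prints the planned calls; passing dry_run=False actually runs Pandoc.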
The R-Markdown articles come out as Markdown. Previously I had to convert them to reStructuredText, now I can just leave them as they are. That makes it even more convenient as I can have a single R-Markdown file which contains the whole post.
For the captions I tried to enable the implicit_figures option, but that somehow did not work properly. I also tried to use the figureAltCaption extension, but that did not do the trick either. So I just used regular expressions to transform the Markdown code into the HTML code that would come out anyway. In Vim I used the following hardly readable snippet for the conversion:
:%s#\v\!\[([^]]+)\]\(([^)]+)\)#<figure> <img src="\2" /> <figcaption>\1</figcaption> </figure>#
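The same substitution can be written a bit more readably in Python. This is a sketch of an equivalent, with made-up names; unlike the Vim command without the g flag, it replaces every image on a line, and like the Vim pattern it skips images with an empty alt text.

```python
import re

# ![caption](path): group 1 is the caption, group 2 the image path.
IMAGE = re.compile(r"!\[([^\]]+)\]\(([^)]+)\)")
FIGURE = r'<figure> <img src="\2" /> <figcaption>\1</figcaption> </figure>'


def images_to_figures(markdown):
    """Replace Markdown images with HTML figures that carry a caption."""
    return IMAGE.sub(FIGURE, markdown)


html = images_to_figures("![A caption](pic.jpg)")
```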
References and URLs
All my internal references used the Sphinx doc role, which is not supported by Nikola. So I needed to go through all of them and update the links.
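To get an overview of what needs updating, the old reStructuredText sources can be scanned for these references. The sketch below is an illustration with made-up function names; it assumes the two usual forms of the role, with and without an explicit link text, and it looks at the original .rst files rather than the converted Markdown.

```python
import re
from pathlib import Path

# Matches :doc:`target` as well as :doc:`Link text <target>`.
DOC_ROLE = re.compile(r":doc:`(?:[^`<]*<)?([^`>]+)>?`")


def doc_targets(rst_text):
    """Return the targets of all doc references in a piece of text."""
    return [target.strip() for target in DOC_ROLE.findall(rst_text)]


def report(root):
    """Print every doc reference below root, one 'file: target' per line."""
    for path in sorted(Path(root).rglob("*.rst")):
        for target in doc_targets(path.read_text(encoding="utf-8")):
            print(f"{path}: {target}")


sample = "See :doc:`../travel/index` and :doc:`my trip <travel/2019-06-china/index>`."
targets = doc_targets(sample)
```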
Also the URLs of all pages have changed. As I know where they have been and where I moved them, I did not want to embarrass myself with broken bookmarks and external links. In my .htaccess there are already a bunch of redirections, now I just add another set of them.
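Such a set of redirections can be generated from a simple mapping of old to new paths. The following sketch assumes Apache's mod_alias Redirect directive; the function name is made up, and both paths in the example are hypothetical guesses at how the old Sphinx output and the new Nikola output might be laid out.

```python
def redirect_lines(moves):
    """Turn {old_path: new_path} into permanent Apache Redirect rules."""
    return [f"Redirect 301 {old} {new}" for old, new in sorted(moves.items())]


# Hypothetical example paths; the real ones come from the two site layouts.
moves = {
    "/travel/2019-06-china/index.html": "/posts/wuhan-beijing-china-2019/main/",
}
rules = redirect_lines(moves)
```

The resulting lines can then be appended to the existing .htaccess.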