Log Analysis with GoAccess
Several years ago I used Matomo to analyze the traffic on my website. This is a PHP application that uses either JavaScript or a pixel image to track the users' actions on the website. It gave me a list of the most popular pages and also told me how people ended up on the “page not found” page. With that I could improve articles that were already popular and fix broken redirections.
Matomo received fairly frequent updates, which I had to install for security reasons. I did not want to have unpatched PHP software on my web server. My website was static and otherwise zero-maintenance at that time, so this annoyed me. At times I updated Matomo more often than I actually looked at the data. Then the GDPR came along and I ditched Matomo because I was not entirely sure what sort of privacy statement I would need on my personal website.
This left a gap that I have thought about filling. I wanted something that did not require me to update software on the server regularly. So I thought about Google Analytics, as I would just have to add their little bit of tracking JavaScript. But then all the data would go to Google, and it would be even worse than Matomo regarding the GDPR. My web host provides logs processed with Webalizer, but the reports look terribly old-fashioned. Luckily I have found GoAccess, and I can download the logs via FTP.
Now I can just have a look at the logs myself with a command-line tool or export the report as HTML. This is an example of some recent logs after uploading the new Nikola-based website:
I can also look at all the URLs that give a 404 by opening that panel. This quickly shows me that a few redirects are still missing. I have therefore added the following (and more) to my .htaccess file:
Redirect permanent /studies/ /pages/studies/
Redirect permanent /studies/bsc_physics/index.html /pages/studies/
Redirect permanent /studies/index.html /pages/studies/
Redirect permanent /studies/msc_physics/index.html /pages/studies/
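For the curious, the idea behind the 404 panel can be approximated in a few lines of Python. This is only a rough sketch and not how GoAccess itself works. It assumes the logs are in the Apache "combined" format, which may not match what your host delivers. The function name count_not_found and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern if the
# host writes its logs in a different format.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def count_not_found(lines):
    """Count how often each requested path was answered with a 404."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts


# Made-up sample lines for illustration, not real log data.
sample = [
    '203.0.113.5 - - [01/Jan/2020:10:00:00 +0000] "GET /studies/ HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [01/Jan/2020:10:01:00 +0000] "GET /studies/index.html HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Jan/2020:10:02:00 +0000] "GET /studies/ HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Jan/2020:10:03:00 +0000] "GET /pages/studies/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

# Print the missing paths, most frequently requested first.
for path, hits in count_not_found(sample).most_common():
    print(hits, path)
```

To run this against a real log, one would pass an open file object instead of the sample list. Lines that do not match the assumed format are silently skipped.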
I like this tool because it gives me the information that I need without burdening the users. Visitors are not tracked with cookies, so you don't have to worry about being spied on. You also don't have to wait for additional JavaScript to load, so the site should stay really fast. The only thing that I know is the user agent, and that does not tell me much if you are just using Firefox or Google Chrome.