Hi, I am Martin Ueding, a physicist (Dr., M.Sc.), machine learning researcher and software developer.

Although I am German, most of the content on this website is in English, as both science and programming are communicated mostly in English. Lately I have written a lot about traffic policy and cycling, which is in German.

From 2011 on, I studied physics at the University of Bonn. I finished my Bachelor's degree in 2014 and my Master's thesis in 2017. I finished my dissertation in 2020. See the studies section for the study-related material.

At the age of 13 I started programming with C. Then I looked into HTML and CSS, started to use PHP and then MySQL. Looking for a way to write software with a nice user interface, I came to Java. Then I looked into more languages like Bash, Python, JavaScript, GNU Octave, VimScript and Fish. From then on, I tried to do most things in either Python 3 or C++11, so that I only have two languages, both of which I know well. For my work I now use R and have also picked up the Wolfram Language; personally I have looked into Haskell. See my portfolio.

My most popular project is thinkpad-scripts, which I wrote to get all the screen and digitizer features working effortlessly on my ThinkPad. It is a collection of Python modules that take care of docking and screen rotation.

For several years now, I have been almost paperless. The main challenges are papers I get from other people and handwritten notes. The former can be handled with a scanner; the latter was more interesting. I have owned a Wacom tablet since long before I started going paperless, but I never had good software for note taking. Since I did not find any at first, I wrote jscribble. After I was almost done with that, I discovered Xournal, which I now use most of the time.

You can also find me on other platforms:

On Twitter one can often see people who have more to say than the character limit allows in a single tweet. They usually then reply to their own tweet and create a thread by doing so. One can feel that the medium isn't made for short articles. This is why I like my blog: I can just write as much text as I want. I can include more than four images. And I can structure the paragraphs as I want. People can share the URL as one tweet, or send me an e-mail discussing the whole post.

In order to take the rough edges off Twitter threads, there are multiple apps that unroll them. This usually looks like this:

The application replies with a link to an unrolled thread. One can then read it on a website where each tweet is turned into a paragraph. But one has to reply to the original tweet, and people often don't check whether anyone else has used the app already. So one finds many more such posts:

# Lacking Uncertainty Estimations in Natural Language Processing Papers

My university background is physics, which is an empirical science. All measurements or derived quantities must be quoted with an error estimate. This should ideally include both a statistical and a systematic error. If you take a look at the paper from my thesis, you will find this table:

It contains the comparison of a certain quantity (pion scattering length) from various other publications and our result. There are uncertainties given, either as a combined value or as separate statistical and systematic ones. One result by the ETMC even has asymmetric systematic errors quoted. You can look at these numbers and figure out whether the result from that work lies within the error budget of the other works. The raw table isn't as meaningful as a plot, but you could create a plot from the table. And one can see that our result has a one-standard-deviation confidence interval of $[-0.0567, -0.0395]$, so it easily encompasses all previous work. It also has the largest error of all results, so it is the least precise addition to the field. This is okay as it was just an auxiliary result and we weren't aiming for precision there.
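The check described here, whether another result lies within the quoted interval, can be sketched in a few lines of Python. This is only an illustration under assumptions: the central value and error are simply the midpoint and half-width of the interval quoted above, the `Measurement` class is a name made up for this sketch, and `other_result` is a hypothetical placeholder, not a number from any publication.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Measurement:
    """A value with a symmetric one-standard-deviation error."""

    value: float
    error: float

    def interval(self) -> Tuple[float, float]:
        # One-sigma interval: value minus and plus the error.
        return (self.value - self.error, self.value + self.error)

    def contains(self, other: float) -> bool:
        # True if `other` lies inside the one-sigma interval.
        lower, upper = self.interval()
        return lower <= other <= upper


# Midpoint and half-width of the interval [-0.0567, -0.0395] from the text.
ours = Measurement(value=-0.0481, error=0.0086)

# Hypothetical placeholder, not taken from any publication.
other_result = -0.0440

print(ours.interval())
print(ours.contains(other_result))
```

Asymmetric errors or separate statistical and systematic errors would need a richer data structure; this sketch folds everything into one symmetric error.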

In my new field, natural language processing (NLP), I cannot say the same thing. There are no error estimates whatsoever! And it really annoys me, because you cannot really derive any conclusion from the data. Take for instance the paper on BERT. They show a table where they compare two variants of the BERT model with other language models in various tasks:

# Paper Recycling Containers on Siegburger Str.

Until now, three waste paper containers stood on Siegburger Straße next to the Integrierte Gesamtschule Bonn-Beuel. Map from OpenStreetMap.org:

However, there is a segregated foot and cycle path there (Zeichen 241), so drivers are not allowed to park there. Of course they regularly did so anyway, even though there are two lanes for car traffic. So time and again I ran into situations like this one.

The driver even had the nerve to explain to me that he was not allowed to stop on the roadway at all and was therefore allowed to park on the cycle path. After all, he had to park somewhere to dispose of his waste paper. I said goodbye with “You will be getting mail!” and rode on.

On Twitter I discussed this with people and found out that one is indeed not allowed to stop on the roadway.

I keep having the same discussions on the topic of traffic. So I have now simply collected a few points as an aid for arguing.

# Obsidian Markdown

A while ago I wrote about note-taking software. At the time the most promising candidates were Joplin and just flat Markdown files. A colleague has told me about Obsidian. It is an awesome note-taking application as it satisfies all the needs that I have: It uses plain Markdown files to store the notes, so there is no siloization into an opaque database. It still provides a file explorer to quickly switch between notes. It also has some gadgets like a link graph, but I haven't used these yet.

Another great feature is HTML import from the clipboard, which makes copying snippets from websites much easier. It also supports split windows, which can serve as editor, preview and outline panes in any combination. It also manages attachments by copying them into the notes directory. And one can have multiple note directories, which are called “vaults”.