Posts about Science

I am a physicist, so naturally I also have things to share in this area. Here you can find articles about physics, but also about mathematics and statistics. Sometimes I also look at financial matters, and those posts occasionally end up in this category as well.

Local Erosion with Cars

I've read an interesting book by Hermann Knoflacher, a professor of traffic planning from Vienna. In “Virus Auto” he describes how the car makes cities ugly and not worth living in. He gives a bunch of examples of little towns that get connected to an interstate. The people living there are told that it will create new jobs, bring wealth and so on. The reality is different, however: the interstate connection will likely drain the little town.

People have been travelling for about the same amount of time per day for centuries. This means that the effective range is what you can reach within around 45 minutes (one way). If you are on foot, you can only reach the grocery store in your town, which will therefore be able to sell to all the people in the town. But once you have a car, you can also reach the one in the next, slightly larger town. And this is where the trouble begins. Say the other supermarket is larger, has a better selection and slightly lower prices. You might start to shop there because it is so convenient with the car. Soon other people will do the same, and the local supermarket will see its customer base plummet. It then has to raise its prices, which drives even more customers away.

Building a connection to the interstate increases the effective range of the people even further. This may sound like a good idea, but the downside is that it will slowly kill everything local.

I've set up a simple model. There is a little town and a slightly larger city. Both have a supermarket. The people from the city will always buy in their own supermarket. The people from the town might shop locally or at the city supermarket, depending on the cost of going there. The fixed costs of each supermarket are distributed over all of its customers. If many people go to the city supermarket, its prices will fall, whereas the town supermarket will need to increase its prices.
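To make the price mechanism concrete, here is a minimal sketch of how fixed costs spread over fewer customers push the town price up. The function name `basket_prices` and all numbers (population sizes, cost of goods, fixed costs) are hypothetical values chosen for illustration; they are not the parameters of the actual model.

```python
def basket_prices(share_in_city, town_pop=1_000, city_pop=10_000,
                  goods_cost=50.0, town_fixed=20_000.0, city_fixed=100_000.0):
    """Return (town_price, city_price) for one basket of goods.

    share_in_city is the fraction of town residents who shop in the city.
    All default values are made-up illustration numbers.
    """
    town_customers = town_pop * (1 - share_in_city)
    city_customers = city_pop + town_pop * share_in_city
    # Each customer pays for the goods plus an equal share of the fixed costs.
    if town_customers > 0:
        town_price = goods_cost + town_fixed / town_customers
    else:
        town_price = float("inf")  # no customers left to carry the fixed costs
    city_price = goods_cost + city_fixed / city_customers
    return town_price, city_price


for share in (0.0, 0.25, 0.5, 0.75):
    town_price, city_price = basket_prices(share)
    print(f"{share:4.0%} shop in the city: town {town_price:6.2f}, city {city_price:6.2f}")
```

With these made-up numbers the town price climbs quickly as customers leave, while the city price drops only slightly because the extra customers are few compared to the city population.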

Read more…

Writing a PhD Thesis

In September 2017 I finished my Master's degree in Physics. I was offered a PhD position by my supervisor and gratefully accepted the opportunity. During the master's thesis I started to work with lattice simulations and supercomputer programming, and I was getting into it. Although I already knew that I did not want to pursue a full career in science, I still wanted to do a bit more research in physics before leaving for industry.

The thing is that in Germany usually only about 25 % of the employees at institutes have permanent positions. These are full-time professors or people who administer machine shops, computer clusters or something else, often with research only as a part-time activity. The majority of positions are temporary contracts. The rationale seems to be that research benefits from the exchange of ideas, and if people move between institutions, knowledge is spread. This completely ignores the fact that these are people who eventually want to start a family and the like. One usually does not get a consecutive contract at the same institution and has to move, often to somewhere else within the EU. The only realistic chance for a permanent position is to combine research with something permanent; we have IT administrators who are part-time administrators and part-time researchers or teachers. But then it is not really a career in research, it is merely something in academia. All these things set my long-term route, but I did not want to leave research at that moment. So I took the opportunity to do research for another three years.

At the beginning of a PhD the topic is not clear-cut. My advisor had a few ideas, and I mostly started by working with my valued fellow PhD student Markus on his project. There I helped to refactor a C++ code which did tensor contractions. Over time I learned more of the code and had more and more ideas for improving it. Together we worked on it a lot; it was a really great time. I also helped to improve the analysis that he needed for his data. Over time it became our analysis; I wrote most of it in the early days. He explained some of the mathematical theory behind it, and I implemented a bunch of statistical transformations. On some days we sat there until very late to make some plots really readable, pretty and informative. As the project came to a conclusion, he finished up his dissertation and eventually handed it in. I was super happy for him to finish, and also sad to see him go and move to a different city.

Read more…

Road Noise Resonator Between Houses

When a house stands with its front parallel to the street and another house stands opposite, also parallel, the result is a wonderful resonator. In the picture the two gray blocks are the houses, and the red one is a car on the roadway between them.

The distance between the house walls is roughly 15 m. On the first upper floor the windows are at a height of perhaps 5 m. This gives an angle of 33° from the roadway directly to the window. The distance that the sound travels is then 9.0 m. From the other direction, with a reflection off the opposite house wall, the angle of elevation is only 13°. The total distance for the sound is then 23.0 m.

So we have a path difference of 14 m. With a speed of sound of 330 m/s the resonance frequencies are then multiples of 23.5 Hz. In a sound spectrum one should then see interference lines like the ones that occur in the double-slit experiment.
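The geometry can be checked with a few lines of Python. This is only a sketch under assumptions that the numbers suggest but the text does not state: the car is treated as a point source at ground level in the middle of the street, and the reflected path is computed with a mirror image of the car behind the opposite wall. The function name `resonator_estimate` is made up for this example.

```python
import math


def resonator_estimate(wall_distance=15.0, window_height=5.0, speed_of_sound=330.0):
    """Estimate both sound paths and the base frequency from the path difference."""
    half_width = wall_distance / 2  # assumption: car in the middle of the street
    direct = math.hypot(half_width, window_height)
    direct_angle = math.degrees(math.atan2(window_height, half_width))
    # Mirror the car at the opposite wall to get the length of the reflected path.
    mirrored_distance = half_width + wall_distance
    reflected = math.hypot(mirrored_distance, window_height)
    reflected_angle = math.degrees(math.atan2(window_height, mirrored_distance))
    base_frequency = speed_of_sound / (reflected - direct)
    return direct, direct_angle, reflected, reflected_angle, base_frequency


direct, direct_angle, reflected, reflected_angle, base_frequency = resonator_estimate()
print(f"direct path:    {direct:.1f} m at {direct_angle:.0f} degrees")
print(f"reflected path: {reflected:.1f} m at {reflected_angle:.0f} degrees")
print(f"path difference {reflected - direct:.1f} m, base frequency {base_frequency:.1f} Hz")
```

Changing `wall_distance` or `window_height` shows how sensitive the estimated frequency is to the rough measurements.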

With the Android app Spectroid I then simply recorded the sound spectrum at the window while a car drove past. Time runs upwards: the bottom is old, the top is new. The frequencies are plotted horizontally, with the low frequencies on the left and the high ones on the right. The brighter a spot is, the more strongly that frequency was present at that moment.

In the ellipse one can see how it first gets louder and then quieter again: the car approaches and drives away. And then there is also a significant contribution at 43 Hz, roughly double the coarsely estimated resonance frequency. It is also limited to the time during which the car was exactly between the houses.

So here one can nicely observe the interference of waves in a resonant cavity between two parallel houses. The effect can also be perceived without spectral analysis: there is an unpleasant booming when a car drives past.

CO₂ Footprint of my PhD Thesis

As part of my Master's and PhD theses I have used a lot of computer time on supercomputers in Jülich, Stuttgart and Bologna as well as on the cluster in Bonn. I want to estimate the amount of CO₂ that this has released.

It is a bit hard to say exactly how many core hours I have used, as I have also used data that already existed. Let's take 5 million core hours (5 Mh) to pick a number. On JUWELS, whose nodes have dual Intel Xeon Platinum 8168 CPUs with 48 cores in total, that is around 100,000 node hours (100 kh). Each of the CPUs has a TDP of 205 W. Then there is network, file system and backup, so perhaps 750 W per node? And then there is cooling, which takes roughly the same again on top, so 1.5 kW per node. That makes 150 MWh of electricity used. In Germany it seems that we have to assume 0.4 kg of CO₂ per kWh. This would then give a little over 60 t of CO₂.
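The chain of estimates is easy to redo with different guesses. The following sketch simply multiplies the numbers from the paragraph above; the function name `co2_tonnes` and the parameter names are made up for this example, and the inputs remain rough guesses.

```python
def co2_tonnes(core_hours=5e6, cores_per_node=48, node_power_kw=0.75,
               cooling_factor=2.0, co2_kg_per_kwh=0.4):
    """Rough CO2 estimate in tonnes for a given number of core hours."""
    node_hours = core_hours / cores_per_node
    # Cooling is assumed to take about as much power as the node itself.
    energy_kwh = node_hours * node_power_kw * cooling_factor
    return energy_kwh * co2_kg_per_kwh / 1000.0


print(f"{co2_tonnes():.1f} t of CO2")
```

Without rounding the node hours down to 100 kh, the estimate comes out at about 62.5 t, which is where the "little over 60 t" comes from.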

Read more…

Clustering Recorded Routes

I record a bunch of my activities with Strava. Some of them are novel routes that I try out and have only done once. The others are routes that I do more than once. The thing that I am missing on Strava is a comparison of similar routes. It has segments, but I would have to make my whole commute one segment in order to see how I fare on it.

So what I would like to try here is to use a clustering algorithm to automatically identify clusters of similar rides. I would also like to find rides that have the same start and end point, but different routes in between. In my machine learning book I read that there are clustering algorithms, so this is the project that I would like to apply them to.

Strava has an ecosystem of a lot of apps, so I had a look but could not find what I was looking for. So I want to program this myself in Python. One can export the data from Strava and obtain a ZIP file with all the GPX files corresponding to one's activities.
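As a possible first step, the track points could be read straight from the ZIP file with the Python standard library. This is a minimal sketch, not the code from the post: the function name `read_gpx_tracks` is made up, and it assumes that the archive contains plain `.gpx` files whose track points are standard GPX `trkpt` elements with `lat` and `lon` attributes. The exact layout of the Strava export is not described here, so anything else in the archive is simply skipped.

```python
import xml.etree.ElementTree as ET
import zipfile


def read_gpx_tracks(zip_source):
    """Map each GPX file name in the archive to a list of (lat, lon) tuples.

    zip_source may be a path or a file-like object.
    """
    tracks = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".gpx"):
                continue  # skip everything that is not a plain GPX file
            with archive.open(name) as handle:
                root = ET.parse(handle).getroot()
            # Match 'trkpt' regardless of the XML namespace prefix.
            tracks[name] = [
                (float(element.get("lat")), float(element.get("lon")))
                for element in root.iter()
                if element.tag == "trkpt" or element.tag.endswith("}trkpt")
            ]
    return tracks
```

The resulting coordinate lists would still need to be brought into a comparable form, for instance by resampling, before a clustering algorithm can work with them.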

Read more…

Are Clothespins Worth Using?

I've been using clothespins all along. I know other people who do as well, and some who never use them. While discussing this over dinner, it seemed that there are two stances that people take:

  1. Pins are not worth using at all. Without them the clothing dries just as fast, or at most insignificantly slower. The time spent handling the pins is not made up for by the laundry drying sooner.

  2. Pins clearly must make a difference, as the clothing then hangs in just two layers instead of four.

Well, I am clearly on the second team. But this is a hypothesis that one can test and possibly falsify. So apply the scientific method! As a setup I took four pieces of underwear and two t-shirts. Then I hung half of them on the drying rack with pins, the other half just folded over the line. Every now and then I measured their weight with a kitchen scale.

Read more…

VAT Reduction and the Changed Base Value

Because of the VAT reduction, ALDI currently gives 3 % off everything. Mediamarkt also sometimes ran promotions with a 19 % discount under the slogan “Mediamarkt schenkt die Mehrwertsteuer” (“Mediamarkt waives the VAT”). The interesting thing is that with these discounts the prices are actually lowered even further than necessary.

Let the net price be $N$; then the gross price $B$ at a VAT rate $m$ is given by $B = N \cdot (1 + m)$. Normally $m = 0.19$ and therefore we have $B = 1.19 \cdot N$. To waive the VAT, one has to multiply the gross price by a factor of $1/1.19 \approx 0.8403361$. That is a discount of $1 - 0.8403361 \approx 0.1596639$, so just under 16 %. But if Mediamarkt only gave customers a 16 % discount, many would probably be outraged. So there is another 3 % of discount for everyone who is not so good at percentage calculations.
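The same calculation in a few lines of Python, as a sketch. The function names are made up for this example. The second function covers the ALDI case and assumes that the reduction in question is the temporary German one from 19 % to 16 %; the text itself does not state the new rate.

```python
def discount_to_waive_vat(vat_rate):
    """Discount on the gross price that brings it down to the net price."""
    return 1 - 1 / (1 + vat_rate)


def discount_for_vat_change(old_rate, new_rate):
    """Discount on the old gross price that exactly passes on a VAT change."""
    return 1 - (1 + new_rate) / (1 + old_rate)


print(f"waive 19 % VAT: {discount_to_waive_vat(0.19):.2%} discount needed")
# Assumption: the VAT reduction is from 19 % to 16 %.
print(f"19 % to 16 %:   {discount_for_vat_change(0.19, 0.16):.2%} discount needed")
```

Under that assumption a discount of about 2.5 % would already pass on the full reduction, so 3 % off is slightly more than necessary, just like the 19 % at Mediamarkt.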

Read more…

Number Sequence Questions Tried with Deep Learning

As part of IQ tests there are these horrible number sequence tests. I hate them with a passion because they are mathematically ill-defined problems. A super simple one would be to take 1, 3, 5, 7, 9 and ask for the next number. One could find this very easy and say that these are the odd numbers and therefore the next number should be 11. But searching The On-Line Encyclopedia of Integer Sequences (OEIS) for that exact sequence gives 521 different results! Here are the first ten of them:

| Sequence | Prediction |
|---|---|
| The odd numbers: $a(n) = 2n + 1$. | 11 |
| Binary palindromes: numbers whose binary expansion is palindromic. | 15 |
| Josephus problem: $a(2n) = 2a(n)-1, a(2n+1) = 2a(n)+1$. | 11 |
| Numerators in canonical bijection from positive integers to positive rationals ≤ 1 | 11 |
| a(n) = largest base-2 palindrome m <= 2n+1 such that every base-2 digit of m is <= the corresponding digit of 2n+1; m is written in base 10. | 9 |
| Fractalization of (1 + floor(n/2)) | 8 or larger |
| Self numbers or Colombian numbers (numbers that are not of the form m + sum of digits of m for any m) | 20 |
| Numbers that are palindromic in bases 2 and 10. | 33 |
| Numbers that contain odd digits only. | 11 |
| Number of n-th generation triangles in the tiling of the hyperbolic plane by triangles with angles | 12 |

So there must be an additional hidden constraint in the problem statement. Somehow they want the person to find the simplest rule that explains the sequence and then use that to predict the next number. But nobody ever defined what “simple” means in this context. If one had a formal definition of the allowed sequence patterns, then these problems would be solvable. As they stand, I deem these problems utterly pointless.
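The ambiguity is easy to reproduce. The sketch below implements three of the rules from the OEIS results and checks which of them agree with 1, 3, 5, 7, 9 and what they predict next. The function names are my own, and the generators start at 1 rather than at the offset that the OEIS entries use.

```python
from itertools import count, islice


def is_palindrome(text):
    return text == text[::-1]


def odd_numbers():
    return (2 * n + 1 for n in count())


def binary_palindromes():
    return (n for n in count(1) if is_palindrome(format(n, "b")))


def palindromes_in_base_2_and_10():
    return (n for n in binary_palindromes() if is_palindrome(str(n)))


observed = [1, 3, 5, 7, 9]
predictions = {}
for rule in (odd_numbers, binary_palindromes, palindromes_in_base_2_and_10):
    terms = list(islice(rule(), len(observed) + 1))
    if terms[:-1] == observed:  # the rule reproduces the given numbers
        predictions[rule.__name__] = terms[-1]

print(predictions)
```

All three rules reproduce the five given numbers, yet they continue with 11, 15 and 33 respectively.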

Since I am exploring machine learning with Keras, I wondered whether one could solve this class of problems using these techniques. First I would have to acquire a bunch of these sequence patterns, then generate a bunch of training data and eventually try to train different networks with them. Finally I'd evaluate how well they perform.

Read more…

Default Standard Deviation Estimators in Python NumPy and R

I recently noticed by accident that the default standard deviation implementations in R and NumPy (Python) do not give the same results. In R we have this:

> x <- 1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> sd(x)
[1] 3.02765

And in Python the following:

>>> import numpy as np
>>> x = np.arange(1, 11)
>>> x
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
>>> np.std(x)
2.8722813232690143

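So why does one get 3.03 and the other 2.87? The difference is that R divides the sum of squared deviations by n − 1 (Bessel's correction, which makes the variance estimate unbiased), whereas NumPy by default divides by n, which gives the biased estimator. See this Wikipedia article for the details.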
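The two conventions differ only in the divisor, which a small pure-Python sketch makes explicit. The helper `std` below is written just for this illustration; its `ddof` argument mirrors the NumPy parameter of the same name, so `np.std(x, ddof=1)` selects the n − 1 divisor that R uses.

```python
import math


def std(values, ddof=0):
    """Standard deviation with divisor n - ddof (hypothetical helper)."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((value - mean) ** 2 for value in values)
    return math.sqrt(sum_of_squares / (n - ddof))


x = list(range(1, 11))
print(f"divisor n     (NumPy default): {std(x, ddof=0):.5f}")
print(f"divisor n - 1 (R's sd):        {std(x, ddof=1):.5f}")
```

For large samples the difference becomes negligible, but for ten values it is already visible in the second digit.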

Read more…

Card Trick Explained with Combinatorics

Don't ask me why my mind works that way, but for some reason I recalled a card trick that a neighbor kid showed me when I was little. At the time I found it impressive that such things can even work. Today I could no longer recall how the trick works for the performer, only how it appears to the audience.

The general idea is this: You have regular playing cards and take a selection of 20 unique ones. Then they are paired up and the pairs are shown only to the audience. Each audience member picks one such pair and remembers it without telling the performer. Then the performer blindly stacks all those pairs and lays out the cards in a seemingly weird pattern with four rows and five columns. Each audience member indicates the row (or rows) in which their pair is located. The performer then tells them which cards they have picked.

As the performer has not necessarily seen the pairs beforehand, he does not know which card belongs to which other card. Knowing only the row or rows seems like too little information. But then the solution just hit me while I continued to look at the trees outside: There are 10 pairs. And there are 4 possibilities to choose a single row and 6 possibilities to choose two different rows. So one only needs to make sure that each of these combinations belongs to exactly one pair.
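The counting argument can be verified by enumerating all row combinations. This is just a sketch of the combinatorics, not of the actual layout pattern the performer uses; the variable names are made up.

```python
from collections import Counter
from itertools import combinations_with_replacement

rows = range(4)

# Every way to place the two cards of a pair: both in one row or in two different rows.
placements = list(combinations_with_replacement(rows, 2))
same_row = [p for p in placements if p[0] == p[1]]
two_rows = [p for p in placements if p[0] != p[1]]
print(len(same_row), "single-row placements,", len(two_rows), "two-row placements")

# If each placement is used by exactly one pair, count the cards that land in each row.
cards_per_row = Counter(row for placement in placements for row in placement)
print(dict(cards_per_row))
```

Using every combination exactly once also fills each of the four rows with exactly five cards, which matches the layout with four rows and five columns.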

So let us go through it from the performer's perspective. The actual printing on the cards does not matter for us; we just need to know that the cards are paired up. I indicate this with the same fill color.

Read more…