Posts about Science (old posts, page 4)

I am a physicist, so naturally I also have things to share in this area. Here you can find articles about physics, but also about mathematics and statistics. Sometimes I also look at financial matters, and these occasionally end up in this category as well.

Default Standard Deviation Estimators in Python NumPy and R

I recently noticed by accident that the default standard deviation implementations in R and NumPy (Python) do not give the same results. In R we have this:

> x <- 1:10
> x
[1]  1  2  3  4  5  6  7  8  9 10
> sd(x)
[1] 3.02765


And in Python the following:

>>> import numpy as np
>>> x = np.arange(1, 11)
>>> x
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
>>> np.std(x)
2.8722813232690143


So why does one get 3.02 and the other 2.87? The difference is that R divides the sum of squared deviations by $n - 1$ (the estimator based on the unbiased sample variance), whereas NumPy by default divides by $n$ (the biased estimator). See this Wikipedia article for the details.
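To reproduce the R value in NumPy, one can pass the `ddof` ("delta degrees of freedom") argument of `np.std`. The following is a small sketch using the same data as above; the variable names are only illustrative.

```python
import numpy as np

x = np.arange(1, 11)

# NumPy default: ddof=0, the sum of squared deviations is divided by n.
std_divide_by_n = np.std(x)

# ddof=1 divides by n - 1 instead, which matches the default of sd() in R.
std_divide_by_n_minus_1 = np.std(x, ddof=1)

print(std_divide_by_n)           # approximately 2.8723
print(std_divide_by_n_minus_1)   # approximately 3.0277
```

For large samples the two values converge, since the ratio between them is $\sqrt{n / (n - 1)}$.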

Number Sequence Questions Tried with Deep Learning

As part of IQ tests there are these horrible number sequence tests. I hate them with a passion because they are mathematically ill-defined problems. A super simple one would be to take 1, 3, 5, 7, 9 and ask for the next number. One could find this very easy and say that this sequence is the odd numbers and therefore the next number should be 11. But searching the On-Line Encyclopedia of Integer Sequences (OEIS) for that exact sequence gives 521 different results! Here are the first ten of them:

| Sequence | Prediction |
|---|---|
| The odd numbers: $a(n) = 2n + 1$. | 11 |
| Binary palindromes: numbers whose binary expansion is palindromic. | 15 |
| Josephus problem: $a(2n) = 2a(n)-1, a(2n+1) = 2a(n)+1$. | 11 |
| Numerators in canonical bijection from positive integers to positive rationals ≤ 1 | 11 |
| a(n) = largest base-2 palindrome m <= 2n+1 such that every base-2 digit of m is <= the corresponding digit of 2n+1; m is written in base 10. | 9 |
| Fractalization of (1 + floor(n/2)) | 8 or larger |
| Self numbers or Colombian numbers (numbers that are not of the form m + sum of digits of m for any m) | 20 |
| Numbers that are palindromic in bases 2 and 10. | 33 |
| Numbers that contain odd digits only. | 11 |
| Number of n-th generation triangles in the tiling of the hyperbolic plane by triangles with angles | 12 |

So there must be an additional hidden constraint in the problem statement. Somehow they want the person to find the simplest sequence that explains the series and then use that to predict the next number. But nobody ever defined what “simple” means in this context. If one had a formal definition of the allowed sequence patterns, then these problems would be solvable. As they stand, I deem these problems utterly pointless.
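The ambiguity can be checked directly for a few of the rules listed by the OEIS. The sketch below implements four of them as simple predicates (the function names are made up for this illustration), confirms that each rule produces 1, 3, 5, 7, 9 as its first five positive terms, and collects the sixth term.

```python
def first_terms(predicate, count):
    """Return the first `count` positive integers that satisfy `predicate`."""
    terms = []
    candidate = 1
    while len(terms) < count:
        if predicate(candidate):
            terms.append(candidate)
        candidate += 1
    return terms


def is_odd(n):
    return n % 2 == 1


def is_binary_palindrome(n):
    bits = format(n, "b")
    return bits == bits[::-1]


def has_only_odd_digits(n):
    return all(digit in "13579" for digit in str(n))


def is_palindrome_in_base_2_and_10(n):
    return is_binary_palindrome(n) and str(n) == str(n)[::-1]


rules = {
    "odd numbers": is_odd,
    "binary palindromes": is_binary_palindrome,
    "odd digits only": has_only_odd_digits,
    "palindromic in bases 2 and 10": is_palindrome_in_base_2_and_10,
}

predictions = {}
for name, predicate in rules.items():
    terms = first_terms(predicate, 6)
    # Every rule has to reproduce the given start of the sequence.
    if terms[:5] == [1, 3, 5, 7, 9]:
        predictions[name] = terms[5]

print(predictions)
# {'odd numbers': 11, 'binary palindromes': 15,
#  'odd digits only': 11, 'palindromic in bases 2 and 10': 33}
```

All four rules agree on the five given numbers and still disagree on the sixth, which is exactly the problem with the question.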

Since I am exploring machine learning with Keras, I wondered whether one could solve this class of problems using these techniques. First I would have to acquire a bunch of these sequence patterns, then generate a bunch of training data and eventually try to train different networks with them. Finally I'd evaluate how well they perform.
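As a rough idea of what the data generation step could look like, here is a hypothetical sketch that is not taken from an actual implementation. It assumes only two pattern families (arithmetic and geometric sequences) with arbitrarily chosen parameter ranges; each training example is a prefix of five numbers together with the next number as the label.

```python
import random


def arithmetic(start, step, length):
    return [start + step * n for n in range(length)]


def geometric(start, ratio, length):
    return [start * ratio**n for n in range(length)]


def make_examples(num_examples, prefix_length=5, seed=0):
    """Generate (prefix, next_number) pairs from randomly drawn patterns."""
    rng = random.Random(seed)
    examples = []
    for _ in range(num_examples):
        start = rng.randint(1, 9)
        # The two pattern families and their parameter ranges are assumptions.
        if rng.random() < 0.5:
            terms = arithmetic(start, rng.randint(1, 9), prefix_length + 1)
        else:
            terms = geometric(start, rng.randint(2, 3), prefix_length + 1)
        examples.append((terms[:-1], terms[-1]))
    return examples


examples = make_examples(1000)
print(examples[0])
```

A network trained on such data can at best learn the pattern families it was shown, which mirrors the point about needing a formal definition of the allowed patterns.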

Card Trick Explained with Combinatorics

Don't ask me why my mind works that way, but for some reason I recalled a card trick that a neighbor kid showed me when I was little. At the time I found it impressive that such things can even work. Today I could not really recall how the trick works from the performer's side, only how it appears to the audience.

The general idea is this: You have regular playing cards and take a selection of 20 unique ones. Then they get paired up and shown to the audience alone. Each audience member picks one such pair and remembers them without telling the performer. Then the performer blindly stacks all those pairs and puts down the cards in a seemingly weird pattern with four rows and five columns. Each audience member indicates the rows (or row) that their pair is located at. The performer then tells them which cards they have picked.

As the performer has not necessarily seen the pairs beforehand, he does not know which card belongs to which other card. Knowing only the row or rows seems like too little information. But then the solution just hit me while I continued to look at the trees outside: There are 10 pairs. And there are 4 possibilities to choose a single row and 6 possibilities to choose two different rows. So one only needs to make sure that each combination of rows is used by exactly one pair.
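The counting argument can be verified with a few lines of Python. This is only a sketch of the combinatorics, not of the actual laying pattern: it enumerates the row combinations and checks that assigning one pair to each of them fills every row with exactly five cards.

```python
from collections import Counter
from itertools import combinations

NUM_ROWS = 4

# Both cards of the pair in the same row: 4 possibilities.
same_row = [(row, row) for row in range(NUM_ROWS)]

# The two cards in two different rows: "4 choose 2" = 6 possibilities.
two_rows = list(combinations(range(NUM_ROWS), 2))

placements = same_row + two_rows
print(len(placements))  # 10, one placement for each of the 10 pairs

# Count how many cards end up in each row if every placement is used once.
cards_per_row = Counter(row for placement in placements for row in placement)
print(dict(cards_per_row))  # {0: 5, 1: 5, 2: 5, 3: 5}
```

Each row receives two cards from its own same-row pair and one card from each of the three two-row pairs it takes part in, which gives the five columns.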

So let us go through it from the performer's perspective. The actual printing on the cards does not matter for us; we just need to know that the cards are paired up. I indicate this with the same fill color.

Fit Range Determination with Machine Learning

One of the most tedious and error-prone things in my work in Lattice QCD is the manual choice of fit ranges. While reading up on Keras, deep neural networks and machine learning and how experimental the whole field is, I thought about just trying the fit range selection with deep learning.

We have correlation functions $C(t)$ which behave as $\sum_n A_n \exp(-E_n t)$ plus noise. The $E_n$ are the energies of the states $n$, the $A_n$ are the respective amplitudes. We are interested in extracting the smallest of the $E_n$, the ground state energy. We use the fact that for sufficiently large times $t$ the term with the smallest energy dominates the expression. Without loss of generality we say $E_0 < E_1 < \ldots$ and formally write $$\lim_{t \to \infty} C(t) = A_0 \exp(-E_0 t) \,.$$

By taking the effective mass as defined by $$m_\text{eff}(t) = \log\left(\frac{C(t)}{C(t+1)}\right)$$ we get $m_\text{eff}(t) \sim E_0$ in the region of large $t$. There are more subtleties involved (back-propagation, thermal states), which we will ignore here. The effective mass is expected to be constant in some region of the data where $t$ is sufficiently large such that the higher states have decayed; yet the exponentially decaying signal-to-noise ratio is still sufficiently good. An example for such an effective mass is the following.
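To illustrate the plateau, here is a small sketch with synthetic data. It assumes a noise-free correlator with only two states and made-up amplitudes and energies, and it uses the sign convention $\log(C(t)/C(t+1))$, for which a decaying exponential $A_0 \exp(-E_0 t)$ gives exactly $+E_0$.

```python
import numpy as np


def effective_mass(correlator):
    """Compute log(C(t) / C(t+1)) for all t where C(t+1) is available."""
    correlator = np.asarray(correlator, dtype=float)
    return np.log(correlator[:-1] / correlator[1:])


# Made-up parameters: ground state E_0 = 0.3 and one excited state E_1 = 0.8.
amplitudes = np.array([1.0, 0.5])
energies = np.array([0.3, 0.8])

t = np.arange(20)
# C(t) = sum_n A_n exp(-E_n t), without any noise.
correlator = (amplitudes * np.exp(-np.outer(t, energies))).sum(axis=1)

m_eff = effective_mass(correlator)
print(m_eff[:3])   # clearly above 0.3, the excited state still contributes
print(m_eff[-3:])  # very close to 0.3, the plateau
```

With real data the noise grows at large $t$, so the usable plateau is bounded from both sides, and choosing those bounds is the fit range problem.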

Simple Captcha with Deep Neural Network

The other day I had to fill in a captcha on some website. Most sites today use Google's reCAPTCHA. It shows little image tiles and asks you to classify them. They use this to train a neural network to classify situations for autonomous driving. Writing a program to solve this captcha would require obscene amounts of data to train a neural network. And if that already existed, autonomous cars would be here already.

The captcha on that website, however, was of the old and simple kind:

It is just six digits (and always six digits), the concentric circles and some pepper noise. These kinds of captchas are outdated because one can solve them with machine learning. And as I am currently working through “Deep Learning with Python” by François Chollet and was looking for a practice project, this captcha came as inspiration at just the right moment.

Physics in Star Trek: Enterprise

I've always enjoyed the science fiction genre, and there are many books and shows available. I especially like works where the physics is credible. The Enceladus series by Brandon Q. Morris is such a work. The Expanse also seems pretty great in that regard.

Recently I watched Star Trek: Enterprise and loved the plots, the characters and their development, the recurring arch enemies and the generally uplifting spirit. But from the physics side I had to chuckle quite often. Some people just take the science to be fictitious and don't bother; but I prefer credible science fiction and complain a lot.

First off: Why does something always explode on the bridge when they get hit? On a navy warship the bridge is exposed, so that could happen there as well. But they have a CIC, which is a bunker inside the ship. Enterprise does not have a window on the bridge, so why is it located at the edge of the hull? In The Expanse, the MCRN Donnager seems to have a combined bridge and CIC well protected inside the ship. In the fight nothing explodes in the CIC. And even the railgun hit is unspectacular, as it should be.

Risk Analysis for Risk

In 2011 I was on vacation with friends and we played a lot of Risk. Somehow we ended up having fights of hundreds of armies against each other. Since every roll of the dice can only eliminate up to three armies, you need a lot of rounds until a battle is settled. While my friends were occupied in a battle of 200 against 150, I used my freshly acquired Python skills to write a program to do the dice rolling, risk-auto-dice.

The program does exactly what the players do until one player runs out of armies:

• The attacker rolls up to three dice, the defender up to two. The number of dice cannot exceed the number of armies. In the game a player can choose to use fewer dice; the program uses the maximum number.

• The results are ordered descending and paired up. For every pair which is unequal, the person with the lower number loses one unit.
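The two steps above can be sketched as follows. This is a reconstruction for illustration and not the code of risk-auto-dice; the function names are made up. One assumption goes beyond the description: on a tie the attacker loses the unit, as in the standard Risk rules.

```python
import random


def roll_round(attackers, defenders, rng):
    """Play one round of dice and return the remaining armies."""
    attack_dice = sorted(
        (rng.randint(1, 6) for _ in range(min(3, attackers))), reverse=True
    )
    defend_dice = sorted(
        (rng.randint(1, 6) for _ in range(min(2, defenders))), reverse=True
    )
    # zip pairs the highest dice with each other and drops unpaired dice.
    for attack, defend in zip(attack_dice, defend_dice):
        if attack > defend:
            defenders -= 1
        else:
            # Assumption: ties go to the defender (standard Risk rules).
            attackers -= 1
    return attackers, defenders


def battle(attackers, defenders, seed=None):
    """Roll until one side has no armies left."""
    rng = random.Random(seed)
    rounds = 0
    while attackers > 0 and defenders > 0:
        attackers, defenders = roll_round(attackers, defenders, rng)
        rounds += 1
    return attackers, defenders, rounds


print(battle(200, 150, seed=1))
```

Running `battle` many times with different seeds gives an estimate of the winning probability for a given starting position.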

Three And Five Liters

There is this classic riddle where you are given two containers, one has a capacity of three liters and the other five liters. Your task is to extract exactly four liters. There are no further markings on the containers.

Measuring two liters is quite straightforward: Fill the five liter container, then transfer to the three liter container until the latter is full. Then there will be two liters remaining in the five liter container. Obtaining one or four liters takes more steps. I wondered what results are possible with this setup. Can you achieve any amount of liquid in both of the containers?
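One way to explore this question is a breadth-first search over all states that can be reached from two empty containers. The sketch below assumes the usual three kinds of moves (fill a container completely, empty it completely, or pour from one into the other until the source is empty or the target is full); the function names are made up for this illustration.

```python
from collections import deque


def successors(state, capacities):
    """Return all states reachable from `state` with a single move."""
    result = []
    for source in range(2):
        target = 1 - source

        filled = list(state)
        filled[source] = capacities[source]
        result.append(tuple(filled))

        emptied = list(state)
        emptied[source] = 0
        result.append(tuple(emptied))

        # Pour until the source is empty or the target is full.
        amount = min(state[source], capacities[target] - state[target])
        poured = list(state)
        poured[source] -= amount
        poured[target] += amount
        result.append(tuple(poured))
    return result


def reachable_states(capacities=(3, 5)):
    """Breadth-first search starting from two empty containers."""
    start = (0, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for successor in successors(state, capacities):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


states = reachable_states()
print(sorted(states))
print(sorted({three for three, _ in states}))  # [0, 1, 2, 3]
print(sorted({five for _, five in states}))    # [0, 1, 2, 3, 4, 5]
```

Each state is written as (liters in the three liter container, liters in the five liter container). Every integer amount shows up in each container, but only in states where at least one container is completely empty or completely full.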

Physics in »Mass Effect«

I am currently playing the Mass Effect trilogy. In contrast to other science fiction universes, I frequently stumble over the physics here.

Good examples of credible science fiction are:

Stargate

Here the technology comes from the Ancients, who originally lived on Earth and left their technology behind. The Stargate can establish and stabilize a wormhole. According to my understanding of general relativity (GR), that is not really ruled out.

The spaceships cover large distances in hyperspace, in which our four-dimensional spacetime is a surface. In this higher-dimensional space there are shortcuts because ordinary spacetime is curved there. Depending on how exactly this is done, it does not yet contradict current physics; string theory has more dimensions as well.

The Expanse

Ultimately this can be achieved with »just« engineering. I value the work of engineers a lot; what I mean is that this series does not need any new physics. There is only a highly efficient drive and a serum that lets humans withstand accelerations of 40 times Earth's gravity. Overall that is very plausible.

In Mass Effect, however, there are a few things that really made me laugh.

Half a Cube

In lattice QCD you have a cubic lattice. It has certain symmetries:

• Rotation around an axis perpendicular to a face. This goes with 90, 180 and 270 degrees.
• Rotation around an axis along a face diagonal. This goes with 180 degrees.
• Rotation around an axis along the volume diagonal. This goes with 120 and 240 degrees.
• There is also the inversion symmetry.

Taking all of them together you will get 48 elements, the octahedral symmetry group.

The question then was how the symmetries break down when we make a volume diagonal a special direction. In other words: What symmetries does a cube have when it is cut perpendicular to a volume diagonal? We know from group theory that it only has the rotations with 120 degrees left, but can one see that visually?
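Before looking at it visually, the counting can be checked numerically. The sketch below is an illustration under one assumption about the representation: every symmetry of the cube is written as a signed permutation of the three coordinate axes. It counts all elements, the proper rotations (determinant +1), and those proper rotations that leave the volume diagonal (1, 1, 1) fixed.

```python
from itertools import permutations, product


def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def determinant(perm, signs):
    """Determinant of the signed permutation matrix given by perm and signs."""
    return permutation_sign(perm) * signs[0] * signs[1] * signs[2]


def apply(perm, signs, vector):
    """Permute the components of `vector` and flip signs where requested."""
    return tuple(sign * vector[index] for index, sign in zip(perm, signs))


# Every symmetry permutes the three axes and may flip each of them.
full_group = [
    (perm, signs)
    for perm in permutations(range(3))
    for signs in product((1, -1), repeat=3)
]
rotations = [g for g in full_group if determinant(*g) == 1]

diagonal = (1, 1, 1)
diagonal_rotations = [g for g in rotations if apply(*g, diagonal) == diagonal]

print(len(full_group))          # 48
print(len(rotations))           # 24
print(len(diagonal_rotations))  # 3
```

The three remaining proper rotations are the identity and the two cyclic permutations of the axes, which are the rotations by 120 and 240 degrees around the volume diagonal.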