When a neural network is trained, one expects that it will at least produce the correct results for the inputs it was trained on. For inputs not seen during training, one hopes that it will generalize and produce coherent output. But when shown something completely unexpected, the network can produce bizarre results.

Just the other day I was training a neural network to understand the behavior of people waiting in line in a store. The results it produced for unseen inputs were all right, but something weird happened when the lights went out at the end of the day. There was nobody in line at that time, of course, but with the lights off, the images that reached the neural network were completely different from anything it had seen. So all of a sudden the network was reporting a full line, even though no one was there.

I don’t know anything about the methods employed by Google Translate, but I suspect it might have similar problems. There’s a subreddit, TranslateGate, which collects examples of weird translations produced by Google’s tool. You feed it utter garbage, and it often produces coherent garbage. Part of the reason must be that the tool has no way to refuse an input: it always produces some output, so garbage in still means something out. And I assume that something is coherent because the networks were trained to produce coherent output.
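Here’s a minimal sketch of that structural point in Python, nothing like Google’s actual system: a greedy decoder walking a hard-coded table of toy language-model preferences (the NEXT table and greedy_decode are invented for illustration). Since decoding has no reject option, even a meaningless source sentence yields whatever fluent continuation the decoder’s prior prefers:

    # Toy bigram preferences, standing in for what a decoder's language
    # model learns from its training data. Purely illustrative.
    NEXT = {
        "<s>":  "it",
        "it":   "is",
        "is":   "the",
        "the":  "end",
        "end":  "of",
        "of":   "ages",
        "ages": "<eos>",
    }

    def greedy_decode(source: str) -> str:
        # The garbage source is ignored entirely, mimicking the case where
        # the encoder signal carries no usable information for the decoder.
        out, token = [], "<s>"
        while True:
            token = NEXT[token]  # there is always some most-likely next token
            if token == "<eos>":
                return " ".join(out)
            out.append(token)

    print(greedy_decode("ga gu ga gu ga gu"))  # -> it is the end of ages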

It’s weird anyway, though. If you type “dog” 20 times (separated by spaces) and tell it to translate from Igbo to English, it produces this output:

Doomsday Clock is three minutes at twelve We are experiencing characters and a dramatic developments in the world, which indicate that we are increasingly approaching the end times and Jesus’ return

Chilling, huh? What’s happening must be something like this: for some reason the neural network starts producing something about the apocalypse, and then it just keeps going. But you can produce other weird things too, and they’re not always eschatological. Translating sequences of “ga gu” of varying lengths from Somali to English, we get this (a reproduction sketch follows the tables):

Length  Translation
1       ga gu
2       go to bed
3       go to bed
4       keep your car
5       Get your car safely
6       Keep your car safely
7       your caregiver
8–9     your personal safety record
10–11   your visit to your home country
12      your child's health care needs
13–14   your child's day care home
15      your visit to the United States
16      the effects of the disease in the early stages of life
17–18   the effects of the disease in the early stages of the epidemic
19–20   the impact of the floods on the road to snow

And from Swahili to English:

Length  Translation
1       no
2       do not go
3       it is not too late
4       there is no end of it
5–6     it is not too late
7       it is not too late for the day
8–9     It is the end of this world
10–11   it is the end of the age of the end of the ages
12      the end of this world is near the end of the ages
13      the end of the world is near the end of the ages of the ages
14      it is the end of the age of the end of the ages of the ages
15–16   at the end of the age of the greatness of the kingdom of heaven.
17      at the end of the age of the greatness of the greatness of the kingdom of heaven.
18      the end of the age of the greatness of the greatness of the kingdom of heaven.
19      at the end of the great tribulation of the greatness of the greatness of the kingdom of heaven.
20      at the end of the age of the end of the age of the greatness of the kingdom of heaven.
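If you want to reproduce these tables, a loop like the one below would do it. This is a sketch assuming the unofficial googletrans Python package (not an official Google API, and its interface has changed across releases; this follows the classic 3.x style), so treat it as illustrative:

    from googletrans import Translator  # unofficial package; API varies by version

    translator = Translator()
    for n in range(1, 21):
        text = " ".join(["ga gu"] * n)
        # src="so" is Somali; use src="sw" for the Swahili table
        result = translator.translate(text, src="so", dest="en")
        print(f"{n:2d}  {result.text}")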

This doesn’t work with more prominent languages. It could be that for these languages the neural network simply didn’t have that much material to be trained with. It could also be that the results reflect the training material: immigration, health, and safety for Somali; religion for Swahili.