Luc Gommans/ blog

661 MiB of mobile data used across 4 years, and other Nokia statistics

Written on 2021-05-19

As a teen I had a Nokia 6230i. You know: color screen, voice recognition, GPRS data, proprietary earbud connector, no-screwdriver-needed replaceable cover in case you sat on it and the screen cover cracked -- or just in case you wanted a red cover instead of a silver one.

It kept some statistics that I found interesting. I think I used the phone from roughly 2006--2010.

It would start GPRS only when needed, like when opening the browser. This browser had no CSS support, let alone JavaScript. But I could still control my desktop at home through LogMeIn because it just showed a picture of the screen and depending on which part of the picture you clicked, it would click there on the desktop. This must have used either a lot of tiny images with individual links, or image maps (remember those? This HTML course is where I learned about them).
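
How LogMeIn actually did this is not something the post can confirm, but the server-side image map variant is easy to sketch. With the HTML `ismap` attribute on an image inside a link, the browser appends the click position to the URL as `?x,y`. The function names and the screen sizes below are hypothetical, purely for illustration:

```python
from typing import Tuple

def parse_ismap_query(query: str) -> Tuple[int, int]:
    """Parse the '?x,y' query string a browser sends for an <img ismap> click."""
    x_str, y_str = query.lstrip("?").split(",")
    return int(x_str), int(y_str)

def to_desktop(click: Tuple[int, int],
               image_size: Tuple[int, int],
               desktop_size: Tuple[int, int]) -> Tuple[int, int]:
    """Scale a click on the downsized screenshot to desktop coordinates."""
    (x, y), (img_w, img_h), (desk_w, desk_h) = click, image_size, desktop_size
    return x * desk_w // img_w, y * desk_h // img_h

# Hypothetical sizes: a 128x96 screenshot of a 1280x960 desktop.
click = parse_ismap_query("?64,48")
print(to_desktop(click, (128, 96), (1280, 960)))
```

The server would then move the mouse to the scaled position, click, and send back a fresh screenshot.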

Now for the statistics:

A number of speech labels were defined, mainly for family members. I sound like a little girl in them. They might be my oldest recordings; I certainly don't know of any older ones, and these are non-exportable. In case you don't know what I'm talking about: the phone let you record a short clip for each contact, for example "call mom", and with a hotkey you could then have it listen and recognize what you said.

I'm curious how it worked. There's probably a patent about it; this was a 2006 bigcorp after all. Surely this device couldn't do machine learning, so maybe it did some frequency analysis with FFT? Or some simple ML after all? I don't know whether a small neural net gives proportionally worse results than a big one, or whether the relationship is logarithmic and a very small net basically just generates random results. The feature definitely worked better than random: the detection was clearly functional, even if it wasn't good at ignoring noise and the match needed to be fairly precise. And you couldn't record more than one clip per contact, so no variations! Be exact, human!

Update: FFT seems to indeed be the answer. Thanks to Alex Hajnal for sharing his experience coding voice recognition on an Atari ST!
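
A toy sketch of what spectrum-based matching could look like, in Python. This is not the phone's actual algorithm, which remains unknown here. It only illustrates the general idea: turn each stored clip and the new utterance into a frequency spectrum, then pick the stored clip whose spectrum is most similar. To stay dependency-free it uses a naive DFT, which yields the same values an FFT would, just more slowly. All names (`magnitude_spectrum`, `best_match`, `tone`) and the synthetic "clips" are made up for the example.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes; an FFT computes the same values faster."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, s in enumerate(samples)))
        for k in range(n // 2)  # the upper half mirrors the lower for real input
    ]

def cosine_similarity(a, b):
    """Similarity of two spectra, ignoring overall loudness."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def best_match(utterance, templates):
    """Return the label whose stored clip has the most similar spectrum."""
    spectrum = magnitude_spectrum(utterance)
    return max(
        templates,
        key=lambda label: cosine_similarity(
            spectrum, magnitude_spectrum(templates[label])),
    )

def tone(freq, n=64):
    """Stand-in for a recorded clip: a pure sine wave of n samples."""
    return [math.sin(2 * math.pi * freq * t / n) for t in range(n)]

# One stored clip per contact, as on the phone.
templates = {"call mom": tone(5), "call dad": tone(12)}

# The same "word" again, with a bit of interference added.
noisy = [s + 0.1 * math.sin(2 * math.pi * 20 * t / 64)
         for t, s in enumerate(tone(5))]
print(best_match(noisy, templates))  # prints "call mom"
```

Real speech changes over time, so a practical matcher would presumably compare a sequence of short-frame spectra rather than one spectrum for the whole clip. That would also fit the observation that the match needed to be fairly precise.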