As a teen I had a Nokia 6230i. You know: color screen, voice recognition, GPRS data, proprietary earbud connector, no-screwdriver-needed replaceable cover in case you sat on it and the screen cover cracked -- or just in case you wanted a red cover instead of a silver one.
It kept some statistics that I found interesting. I think I used the phone from roughly 2006--2010.
Now for the statistics:
In total I downloaded 569 934 801 B and uploaded 124 195 005 B. Yep, it shows exact byte values. The total active time was 866 hours, 23 minutes, 18 seconds. I probably paid something like 500 euros for all of this (10 euros per month across four years).
Average download speed: 182 bytes per second. Note that this also includes idle time, like while reading webpages. But I can tell you that a lot of that time was also spent waiting for pages to load, with a maximum speed of some 8 kilobytes per second! Most pages were small and just text, sure, but the round-trip time alone...
Money spent: 58 cents per hour of active time, or 92 cents per mebibyte downloaded. And this was with truly unlimited data for €10/month, mind you.
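For anyone who wants to re-check the arithmetic, here is a small Python sketch that derives the figures above from the phone's raw counters. The 500 euro total is my rough estimate rather than a measured value, and the variable names are just for illustration. Note that the 92 cents figure only works out if you count downloaded bytes alone.

```python
# Raw counters as shown by the phone.
DOWNLOADED_BYTES = 569_934_801
UPLOADED_BYTES = 124_195_005
ACTIVE_SECONDS = 866 * 3600 + 23 * 60 + 18  # 866 h 23 min 18 s

# Rough estimate (about 10 euros per month across four years), not a measurement.
TOTAL_COST_EUR = 500

MEBIBYTE = 1024 * 1024

# Average download speed over the whole active time, idle time included.
avg_download_bytes_per_s = DOWNLOADED_BYTES / ACTIVE_SECONDS

# Cost per hour of active connection time.
eur_per_hour = TOTAL_COST_EUR / (ACTIVE_SECONDS / 3600)

# Cost per mebibyte, counting downloads only and counting both directions.
eur_per_mib_down = TOTAL_COST_EUR / (DOWNLOADED_BYTES / MEBIBYTE)
eur_per_mib_total = TOTAL_COST_EUR / ((DOWNLOADED_BYTES + UPLOADED_BYTES) / MEBIBYTE)

print(f"average download speed: {avg_download_bytes_per_s:.1f} B/s")
print(f"cost per hour: {eur_per_hour * 100:.0f} cents")
print(f"cost per MiB downloaded: {eur_per_mib_down * 100:.0f} cents")
print(f"cost per MiB in both directions: {eur_per_mib_total * 100:.0f} cents")
```

Counting uploads as well makes the price per mebibyte come out somewhat lower, since the same money is spread over more data.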
I called for 13.37 hours with the device, of which 5.353 hours were incoming and 8.012 hours were outgoing calls.
A number of speech labels were defined, mainly for family members. I sound like a little girl. They might be my oldest recordings; I certainly don't know of any older ones, and these are non-exportable. In case you don't know what I'm talking about: the phone allowed you to record a short clip for each contact, for example "call mom", and with a hotkey you could then have it listen to and recognize what you said.
I'm curious how it worked. There's probably a patent about it; this was a 2006 bigcorp after all. Surely this device couldn't do machine learning, so maybe it did some frequency analysis with FFT? Or some simple ML after all? I don't know whether a small neural net gives proportionally worse results than a big one, or whether the relationship is logarithmic and a very small net basically just generates random results. The feature definitely worked better than random; the detection was clearly functional, even if it wasn't good at ignoring noise and the match needed to be fairly precise. And you couldn't record more than one clip per contact, so no variations! Be exact, human!
Update: FFT seems to indeed be the answer. Thanks to Alex Hajnal for sharing his experience coding voice recognition on an Atari ST!
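To make the idea a bit more concrete, here is a toy Python sketch of matching by frequency content. This is purely my guess at the general approach, not what the Nokia firmware (or Alex's Atari ST code) actually did. It assumes all clips have the same length, uses pure tones as stand-ins for recorded speech, and computes a naive DFT with the standard library instead of a real FFT (same values, just slower). All function names here are made up for illustration.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitudes of the lower half of a naive DFT (an FFT yields the same values)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2)
    ]


def cosine_similarity(a, b):
    """Similarity of two spectra: 1.0 means same shape, 0.0 means nothing in common."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(utterance, templates):
    """Return the label whose stored clip has the most similar spectrum."""
    spectrum = magnitude_spectrum(utterance)
    scores = {
        label: cosine_similarity(spectrum, magnitude_spectrum(clip))
        for label, clip in templates.items()
    }
    return max(scores, key=scores.get)


def tone(freq_bin, n=64):
    """A pure sine wave, standing in for a recorded clip."""
    return [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


# One stored clip per contact, just like on the phone.
templates = {"call mom": tone(5), "call dad": tone(12)}

# The spoken command: the "call mom" clip plus a little interference.
noisy = [s + 0.1 * math.sin(2 * math.pi * 20 * t / 64) for t, s in enumerate(tone(5))]

print(best_match(noisy, templates))
```

A real implementation would presumably split the audio into short frames and tolerate differences in timing, which this sketch ignores entirely. With only one template per contact, it is easy to see why the match had to be fairly exact.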