Don't forget to mention https://en.wikipedia.org/wiki/Apple_Lisa
and its successor Classic Mac OS.
Don't forget to mention that the entire graphics subsystem (QuickDraw) was _entirely_ written in assembler because their Pascal couldn't produce fast enough code.
Yeah, but in (IT) prehistoric times this was also more worthwhile, since compilers were less capable and processors imposed stiff penalties on suboptimal code. As said in the Oberon/Modula-2 thread, in 386/486 times I even did the date/time routines in assembler (though I have to mention that my main app at the time was a log-processing one, with a date/time stamp on every line).
The bulk of the gains usually came from just two things: avoiding multiplication and division (e.g. using shifts, AAM, or strength reduction that replaces a multiply with repeated addition) and manual register allocation.
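To make the multiply/divide avoidance concrete, here is a minimal sketch in C rather than assembler. The function names (`mul10_shift`, `div16_shift`, `first_column_sum`) are made up for illustration and are not from the routines mentioned above; a modern compiler will typically do these transformations on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 10 without a MUL: 10*x == 8*x + 2*x. */
static uint32_t mul10_shift(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* Unsigned divide by 16 without a DIV: a right shift by 4. */
static uint32_t div16_shift(uint32_t x)
{
    return x >> 4;
}

/* Strength reduction in a loop: sum the first pixel of every row.
   The naive form indexes data[y * stride], one multiply per row.
   Here a running offset is kept and stride is added each iteration. */
static uint32_t first_column_sum(const uint8_t *data, size_t stride, size_t rows)
{
    assert(data != NULL || rows == 0);
    uint32_t sum = 0;
    size_t offset = 0;              /* always equals y * stride */
    for (size_t y = 0; y < rows; ++y) {
        sum += data[offset];
        offset += stride;           /* addition replaces the multiply */
    }
    return sum;
}
```

The same idea applies to any value that changes by a fixed step per iteration: carry it along and add the step instead of recomputing it with a multiply.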
For graphics, however, assembler is still the norm. Most of my image conversions are done in SSE, in a simple form: I don't try to optimize too much, I just write the core transformation in straightforward SSE.
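As a rough idea of what "core transformation in straightforward SSE" can look like, here is a minimal sketch using SSE2 intrinsics from C instead of hand-written assembler. It is not one of the conversions mentioned above: `brighten_u8` is a hypothetical helper that adds a constant to every 8-bit pixel with saturation, and the scalar fallback is an assumption added so the code also builds where SSE2 is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: add `delta` to each 8-bit pixel, clamping at 255. */
static void brighten_u8(uint8_t *pixels, size_t count, uint8_t delta)
{
    assert(pixels != NULL || count == 0);
    size_t i = 0;
#if defined(__SSE2__)
    /* Broadcast delta into all 16 byte lanes. */
    const __m128i vdelta = _mm_set1_epi8((char)delta);
    /* Process 16 pixels per iteration with unaligned loads and stores. */
    for (; i + 16 <= count; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(pixels + i));
        v = _mm_adds_epu8(v, vdelta);        /* unsigned saturating add */
        _mm_storeu_si128((__m128i *)(pixels + i), v);
    }
#endif
    /* Scalar tail (or the whole buffer when SSE2 is not available). */
    for (; i < count; ++i) {
        unsigned sum = (unsigned)pixels[i] + delta;
        pixels[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

The vector loop is deliberately unoptimized: no alignment handling, no unrolling, no prefetching. Just the core operation, with a scalar loop for whatever is left over.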
E.g. a primitive for 8-bit image rotate