Greetings!  Just a brief progress note: prefetch alone took the complex
double precision (transpose) case from 100 MFLOPS to 230 MFLOPS.
Hopefully I'll soon be able to contribute something for all the Level 2
precisions on the PIII.

