Re: Altivec matmul kernel (attachment)
Just back from travel, will take a while to catch up . . .
>Here's a question for the group:
>Altivec fp instructions execute in one of two modes:
>In "Java" mode, denormalized results are handled correctly, and
>multiply-add instructions have a 5-cycle latency.
>In "non-Java" mode, denormalized results may not be handled correctly,
>and multiply-add instructions have a 4-cycle latency. All other
>computations are IEEE compliant. My matmul kernel gets about 150-200
>Mflop speed bump (1650 to 1850, roughly) when going from Java mode to
>Should I let the user handle Java vs. non-Java mode, or should I turn
>off Java mode explicitly? (The submitted version doesn't touch the Java
I strongly believe that by default you should be IEEE compliant. Lack of
compliance is why I don't furnish 3DNow! prebuilts, and why by default I build
the Athlon kernels without it. People using numerical libraries really
need to be able to count on correct behavior . . .
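
To make the risk concrete, here is a small portable C sketch of what losing
denormals can cost. It assumes that "not handled correctly" means the usual
flush-to-zero behavior, which is my understanding of non-Java mode but is not
stated in the question above. The helper names are made up for illustration,
and the flush is emulated in software, so this runs on any IEEE machine:

```c
#include <float.h>

/* Emulated flush-to-zero (assumed non-Java behavior): any subnormal
 * value is replaced by a zero of the same sign. */
static float flush_to_zero(float x)
{
    int subnormal = (x != 0.0f) && (x > -FLT_MIN) && (x < FLT_MIN);
    return subnormal ? x * 0.0f : x;   /* x * 0.0f keeps the sign of x */
}

/* a - b as an IEEE-compliant unit computes it (gradual underflow). */
static float diff_ieee(float a, float b)
{
    volatile float d = a - b;   /* volatile: force rounding to float */
    return d;
}

/* a - b as a flush-to-zero unit would deliver it (emulated). */
static float diff_flushed(float a, float b)
{
    return flush_to_zero(diff_ieee(a, b));
}
```

With a = 1.5f * FLT_MIN and b = FLT_MIN, the two inputs are different, and
diff_ieee(a, b) returns the nonzero denormal 0.5f * FLT_MIN, while
diff_flushed(a, b) returns zero. Code that relies on "a - b == 0 only if
a == b" is fine under IEEE rules and can quietly break under flushing.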
So, I would like to see it default to Java mode, with a special flag being
required to get the performance boost associated with the non-IEEE stuff.
I think the default kernel should explicitly turn on the IEEE compliance;
we can provide directions or setup help for getting one that doesn't have it.
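
For what it's worth, a rough sketch of what "explicitly turn on" could look
like. As far as I know, the mode is selected by the NJ (non-Java) bit of the
VSCR, which is bit 15 counting from the most significant bit of the 32-bit
register, and vec_mfvscr/vec_mtvscr from altivec.h read and write it. The
function names below are hypothetical, the hardware part assumes a
big-endian target where the VSCR lands in the last word of the vector, and
none of it has been run on real hardware, so treat it as a starting point
only:

```c
#include <stdint.h>

/* NJ ("non-Java") bit of the 32-bit VSCR: bit 15 counting from the MSB.
 * Bit clear = Java mode (denormals handled), bit set = non-Java mode. */
#define VSCR_NJ_MASK 0x00010000u

/* Pure helpers on a 32-bit copy of the VSCR; portable to any host. */
static uint32_t vscr_java_mode(uint32_t vscr)     { return vscr & ~VSCR_NJ_MASK; }
static uint32_t vscr_non_java_mode(uint32_t vscr) { return vscr |  VSCR_NJ_MASK; }

#if defined(__ALTIVEC__) && defined(__BIG_ENDIAN__)
#include <altivec.h>

/* Hypothetical kernel prologue: force Java mode before any vector fp.
 * Assumes the VSCR occupies the low-order word (element 3, big-endian). */
static void kernel_force_java_mode(void)
{
    union { vector unsigned int v; uint32_t w[4]; } u;
    u.v = (vector unsigned int)vec_mfvscr();  /* read current VSCR */
    u.w[3] = vscr_java_mode(u.w[3]);          /* clear NJ, keep other bits */
    vec_mtvscr(u.v);                          /* write it back */
}
#endif
```

A build that wants the faster mode would use vscr_non_java_mode in the same
place, behind whatever special flag we settle on. A careful kernel might also
save the caller's VSCR on entry and restore it on exit.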
A drop of roughly 10% in performance is an acceptable price for getting the
correct answer . . .