Several of the Level 1 BLAS depend strongly on the speed of real absolute
value for their performance.  This operation should be a 1-cycle bit-level
operation (mask off the sign bit), but ANSI C supports bit operations on
integer types only, so ATLAS cannot use it directly.  I made something work
with a bunch of casts, but by the time all that was done, it was slower than
an if.  ATLAS must therefore substitute an if of some sort, which implies a
branch, which in turn implies poor performance.
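
To make the sign-bit idea concrete, here is a minimal sketch in portable C.
It assumes double is a 64-bit IEEE 754 value with the sign in the top bit and
that C99's <stdint.h> is available.  The name my_dabs_mask is hypothetical and
not part of ATLAS, and this is not necessarily the cast-based version
mentioned above.  Whether a compiler reduces it to a single AND instruction
depends on the compiler and target, so it would need timing against the if.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of ATLAS: absolute value by clearing the
 * sign bit.  Assumes a 64-bit IEEE 754 double (sign in bit 63). */
static double my_dabs_mask(double x)
{
    uint64_t bits;

    memcpy(&bits, &x, sizeof bits);   /* copy the double's bytes into an integer */
    bits &= ~(UINT64_C(1) << 63);     /* mask off the sign bit */
    memcpy(&x, &bits, sizeof x);      /* copy the masked bytes back */
    return x;
}
```

One difference from a comparison-based macro: the mask turns -0.0 into +0.0,
while a test of the form (x) >= 0.0 returns -0.0 unchanged.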

My guess is that there are system-dependent ways to make fabs() one cycle
nonetheless, and I'm hoping some of you know or can easily discover them.
So I'd like to ask anyone who can figure out how to do fabs() without an if
to post to the list.  The solution can be as nonportable as you want;
I figure in-line assembler may be required, but hopefully it can be wrapped
in a C macro.  Here's an example macro for double precision:

   #define ATL_dabs(x) ( (x) >= 0.0 ? (x) : -(x) )

If anyone can do it without the if, I think we can speed up quite a few
routines . . .

Any pointers appreciated,