
ATLAS Developer release 3.3.0 available

I originally sent this mail more than 4 hours ago, but my mail account at
UT has not been working all day, and the timings I mailed from home 4 hours
later have already shown up on the archive.  This will probably result in
duplication when UT gets its mail act together, but I include my ATLAS
3.3.0 announcement below.



I have finally got a new developer release out the door.  The main thing
is that it has Camm & Peter's SSE2 stuff in it, so we can clock in at
around 2Gflop on the P4 for DGEMM.  Peter may note that I am using NB=80
for single precision, rather than the asymptotically better NB of 112.
Even at N=3000, NB=112 is only 3% better than NB=80 for the full GEMM,
while NB=80 gets twice the performance for N=20-200 . . .
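
One way to get a feel for the NB tradeoff is to count how much of a square
GEMM lands in full NB x NB x NB blocks for a given N.  The sketch below is an
illustration only, not ATLAS code: the function name full_block_fraction is
hypothetical, and it assumes that work outside full blocks runs in slower
cleanup code, which is one plausible reason (not stated above) that the
smaller NB does better at small N.

```python
def full_block_fraction(n, nb):
    """Fraction of the n*n*n multiply-adds of a square GEMM that fall
    inside complete nb x nb x nb blocks (hypothetical helper)."""
    # Extent along each dimension that is covered by complete blocks.
    full = (n // nb) * nb
    return (full ** 3) / (n ** 3)


if __name__ == "__main__":
    # Compare the two blocking factors discussed above at a few sizes.
    for n in (20, 100, 200, 3000):
        print(n, full_block_fraction(n, 80), full_block_fraction(n, 112))
```

Under this simple model, N=100 has about half of its work in full blocks with
NB=80 and none with NB=112, while at N=3000 both blocking factors keep more
than 90% of the work in full blocks.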

The new developer release represents everything that has been submitted,
with the exception of the parallel make functionality, which should be
in the next one.  

This release also includes some code from me, speeding up small case real
LU and Cholesky, and some improvements to complex TRSM.  All this goodness
is available at:

By the way, my mail is presently not working, so don't be too surprised if
you send me mail and don't hear back immediately: I will have to scope the
developer release to make sure this email goes out . . .