
Re: subarch builds



Camm,

>Hi Clint!  Well, it looks like I've got a basic subarch package
>going.  For i386, for example, subarch specific builds are under
>/usr/lib/{xmm,3dnowext}, etc, which ldso will check first if the cpu
>is right.  
>
>This gives an install on a 'portable' /usr file system, but I'd like
>to see if I could reduce the disk usage somewhat.  Right now I just
>build everything for each subarch.
>
>Is there a way to separate out the arch specific kernels into a
>separate so, and keep the rest of the atlas library centralized?
>I.e. something like libatlas_kernel with all the specific goodies in
>it, and libatlas for the driver/generic routines?

I'm afraid not.  ATLAS was designed for speed, pretty much at the expense
of everything else.  A lot of things that would otherwise be runtime queries
are instead compile-time constants because of the obsession with performance.
For instance, one of the most important arch-dependent parameters is the
blocking factor, NB.  This guy is a compile-time constant throughout the
library, which saves time (a lot of loops can be fully unrolled by the
compiler, no need to do lookups, etc), but it means the entire Level 3 BLAS
is tied to the parameter . . .
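
To make the NB point concrete, here is a minimal sketch (not actual ATLAS
source) of the same block multiply written twice: once with the block size
as a compile-time constant, and once with it passed in at run time.  The
value NB = 4 and the function names block_mm_fixed and block_mm_runtime are
made up for illustration.  With the constant, every loop bound is known to
the compiler, so it is free to unroll; with the runtime version it has to
keep the general loops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical blocking factor.  In a tuned build this would be fixed
   per architecture at compile time; 4 is only a placeholder here. */
#ifndef NB
#define NB 4
#endif

/* C += A * B for one NB x NB block (row-major).  All trip counts are
   compile-time constants, so the compiler may unroll the loops fully. */
static void block_mm_fixed(const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < NB; i++) {
        for (size_t j = 0; j < NB; j++) {
            double sum = C[i * NB + j];
            for (size_t k = 0; k < NB; k++)
                sum += A[i * NB + k] * B[k * NB + j];
            C[i * NB + j] = sum;
        }
    }
}

/* Same computation with the block size supplied at run time.  The code
   is shareable across architectures, but the loop bounds are opaque to
   the compiler. */
static void block_mm_runtime(size_t nb, const double *A, const double *B,
                             double *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < nb; i++) {
        for (size_t j = 0; j < nb; j++) {
            double sum = C[i * nb + j];
            for (size_t k = 0; k < nb; k++)
                sum += A[i * nb + k] * B[k * nb + j];
            C[i * nb + j] = sum;
        }
    }
}
```

Both routines compute the same result; the difference is only in what the
compiler knows.  Once a constant like this is baked into every Level 3
routine, there is no small "kernel" object left to swap out per subarch.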

The only tuning that I can really think of that is well isolated in this
way is CacheEdge (a parameter whose setting varies with L2 cache size),
which is referenced in only a few routines, leaving the bulk of the library
unchanged.  This means you could easily have the best libs for a 256K and a
512K machine share 99% of their code, but I don't see PII/P4 and so on
being so easily merged.  There are several archs with significant overlap
(PIII/PII), but these overlaps are idiosyncratic, and I'm not sure it would
be worth spending the time tracking them down . . .
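
As a rough illustration of what "well isolated" buys you, here is a sketch
of a tuning value that is visible in exactly one small routine, with the
rest of the code going through it.  This is not how ATLAS itself uses
CacheEdge; the names CACHE_EDGE_BYTES, cache_edge_bytes and choose_k_panel,
and the panel-sizing rule, are all hypothetical.  The point is only that
the 256K and 512K builds would differ in one definition.

```c
#include <stddef.h>

/* Hypothetical per-subarch value.  Only this definition would change
   between a 256K build and a 512K build. */
#ifndef CACHE_EDGE_BYTES
#define CACHE_EDGE_BYTES (256 * 1024)
#endif

/* The one routine that sees the tuning value directly. */
size_t cache_edge_bytes(void)
{
    return CACHE_EDGE_BYTES;
}

/* Shared code: choose how many rows of the K dimension to process per
   pass so that a kb x n panel of doubles fits within the cache edge.
   The result is clamped to the range [1, k]. */
size_t choose_k_panel(size_t k, size_t n)
{
    size_t max_elems = cache_edge_bytes() / sizeof(double);
    size_t kb = (n == 0) ? k : max_elems / n;

    if (kb == 0)
        kb = 1;
    if (kb > k)
        kb = k;
    return kb;
}
```

Building with -DCACHE_EDGE_BYTES=524288 would give the 512K variant while
everything except cache_edge_bytes stays byte-for-byte the same, which is
the kind of sharing that NB does not allow.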

Let me know if I answered the question you asked :)
Clint