
Re: SSE-enabled level 2



Greetings!  Good to hear everything worked out.  A few questions:

1) There still seems to be some noticeable hit for symv wrt gemv, most
   probably due to the very different data access patterns, as I
   understand it.  Is there a way around this?
2) Any idea of what a new SSE sgemm based sgemv would do?  Gemm based
   routines won out in the original atlas, if memory serves.
3) From what I can tell, these codes are all memory bandwidth bound.
   Any idea of how to calculate a memory bandwidth peak for a given
   machine?  Both sgemv and dgemv seemed to max out at ~ 1 GB/s on our
   100 MHz bus, corresponding to a transfer rate of 10 bytes per clock,
   which seems a bit of an odd value.
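
For what it's worth, the arithmetic behind question 3 can be sketched as
below.  This is only a back-of-the-envelope model, not a measurement:
the function names are made up for illustration, the 8-byte (64-bit) bus
width is an assumption about the machine, and the GEMV estimate assumes
the table figures are MFLOPS for a real (s or d) GEMV with 2 flops per
matrix element and every element read from memory exactly once, with
vector traffic ignored.

```python
def bytes_per_clock(bandwidth_bytes_per_s, bus_clock_hz):
    """Average bytes moved per bus clock at a given sustained bandwidth."""
    return bandwidth_bytes_per_s / bus_clock_hz


def bus_peak_bandwidth(bus_width_bytes, bus_clock_hz, transfers_per_clock=1):
    """Nominal bus peak: width * clock * transfers per clock.

    The width and transfers-per-clock values are assumptions the caller
    must supply for the machine in question.
    """
    return bus_width_bytes * bus_clock_hz * transfers_per_clock


def gemv_bandwidth(mflops, elem_bytes):
    """Rough matrix traffic (bytes/s) implied by a real GEMV rate.

    Assumes 2 flops per matrix element and one memory read per element;
    x and y traffic is ignored.
    """
    return mflops * 1e6 / 2.0 * elem_bytes


if __name__ == "__main__":
    # ~1 GB/s on a 100 MHz bus, the figures quoted above.
    print(bytes_per_clock(1e9, 100e6))        # 10.0 bytes per clock

    # Nominal peak for an assumed 64-bit bus at 100 MHz.
    print(bus_peak_bandwidth(8, 100e6))

    # Matrix traffic implied by the quoted dAT+C and sAT+C GEMV rates,
    # if those table entries are MFLOPS.
    print(gemv_bandwidth(131.5, 8))
    print(gemv_bandwidth(222.9, 4))
```

Under these assumptions a 64-bit, 100 MHz bus would nominally move 8
bytes per clock, so comparing a measured rate against
bus_peak_bandwidth() is one crude way to see how close a kernel gets.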

Take care,

R Clint Whaley <rwhaley@cs.utk.edu> writes:

> Guys,
> 
> I just got Camm's SSE-enabled Level 2 incorporated.  I include timings on my
> 500 MHz PIII (256K on-chip L2) laptop below.
> 
> Cheers,
> Clint
> 
> GH   : Greg Henry's Intel BLAS
> ATL  : ATLAS without Camm's kernels
> AT+C : ATLAS with Camm's SSE-enabled GEMV/GER
> 
> M=N=500 Trans=N, alpha=1.0, beta=1.0
> 
>                                 HEMV                 GERU    HER   HER2   GERC
>                          GEMV   SYMV   TRMV   TRSV    GER    SYR   SYR2
>                         =====  =====  =====  =====  =====  =====  =====  =====
> dGH       PIII500  500  107.7   51.2   49.3   91.0   49.7   26.2   50.5
> dATL      PIII500  500  105.4  138.8   98.3   93.3   43.8   41.6   80.5
> dAT+C     PIII500  500  131.5  120.0   98.5   94.7   43.8   41.6   78.5
> 
> sGH       PIII500  500  139.1   64.5   68.5  113.0   51.1   50.1   92.3
> sATL      PIII500  500  139.6  153.1  128.6  117.4   75.1   73.7  111.6
> sAT+C     PIII500  500  222.9  159.6  166.9  161.7   91.2   88.4  171.5
> 
> zGH       PIII500  500  137.9  125.0   78.0   59.3   74.9   62.3  107.4   73.4
> zATL      PIII500  500  138.1  208.9  142.2  134.8   75.1   73.9  115.0   74.9
> zAT+C     PIII500  500  224.9  184.4  198.9  185.3  115.7  110.2  140.5  115.7
> 
> cGH       PIII500  500  194.4  188.6  104.7   97.0   99.1   99.5  169.9   98.4
> cATL      PIII500  500  230.3  232.2  212.3  197.8  100.8  100.0  145.4  100.6
> cAT+C     PIII500  500  421.7  417.7  353.1  328.8  217.7  209.9  269.1  218.3
> 
> 

-- 
Camm Maguire			     			camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah