
single precision results: holy GFLOP, batman!



Here are some results for single precision.  Frankly, the numbers are
awe-inspiring.  The idea of getting a 3Gflop LU on a PC kind of makes
my head spin.  It makes you rethink how big a problem needs to be
before you parallelize it: this baby could solve a 10K sLU (a single
precision LU of order 10,000) in something like 3 minutes . . .
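
A rough check of that estimate, as a minimal sketch: it assumes the usual
leading-order operation count of (2/3)*N^3 flops for an LU factorization and
a sustained rate of exactly 3 Gflop/s.  The function names are made up for
illustration and are not part of any library.

```python
def lu_flops(n):
    """Leading-order flop count for LU of an n x n matrix (assumed: 2/3 * n^3)."""
    return 2.0 * n**3 / 3.0


def lu_seconds(n, gflops):
    """Estimated wall time in seconds at a sustained rate of `gflops` Gflop/s."""
    return lu_flops(n) / (gflops * 1e9)


if __name__ == "__main__":
    secs = lu_seconds(10_000, 3.0)
    print(f"N=10000 at 3 Gflop/s: {secs:.0f} s ({secs / 60:.1f} min)")
```

At a flat 3 Gflop/s this works out to a bit under four minutes; it shrinks
toward three minutes if the rate keeps climbing with N.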

I compare the 1.5GHz P4 using SSE with our 1GHz Athlon using 3DNow!.  This
is not a fair comparison, in that SSE is IEEE compliant, and 3DNow! is not
(i.e., even if the Athlon won for performance, I'd still recommend the P4
since it gets the right answer as well).  Also, remember that the Athlon is
not using the best memory (SDRAM, versus Rambus on the P4).

All that said, the amazing thing is that the P4 is *more* than 1.5 times faster
than the Athlon (1.5 is how much faster its clock is, obviously).
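
The claim can be checked against the N=1000 and N=3000 figures from the
tables in this message.  This is only an illustrative sketch: the dictionary
layout and the `speedup` helper are made up here, and the numbers are copied
by hand from the sMM and sLU rows.

```python
# Clock ratio between the two machines: 1.5GHz P4 vs 1GHz Athlon.
CLOCK_RATIO = 1.5 / 1.0

# (kernel, N) -> (Athlon MFLOPS, P4 MFLOPS), copied from the tables.
MFLOPS = {
    ("sMM", 1000): (2298.9, 3703.7),
    ("sMM", 3000): (2364.3, 3658.5),
    ("sLU", 1000): (1480.4, 2467.3),
    ("sLU", 3000): (1904.3, 3118.8),
}


def speedup(ath, p4):
    """How many times faster the P4 is than the Athlon."""
    return p4 / ath


if __name__ == "__main__":
    for (kernel, n), (ath, p4) in MFLOPS.items():
        s = speedup(ath, p4)
        # Dividing by the clock ratio gives the per-clock advantage.
        print(f"{kernel} N={n}: P4/ATH = {s:.2f}, per clock = {s / CLOCK_RATIO:.2f}")
```

All four ratios come out above 1.5, so the P4 is ahead even after
normalizing for clock speed.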

Wow,
Clint

ATH : 1GHz Athlon, SDRAM                    $1269
P4  : 1.5GHz Pentium 4, Rambus              $2109

MFLOPS by problem size N
(sMM = single precision matrix multiply, sLU = single precision LU):

             100    200    300    400    500    600    700    800    900   1000
           =====  =====  =====  =====  =====  =====  =====  =====  =====  =====
ATH  sMM  1860.5 2162.2 2160.0 2133.3 2205.9 2160.0 2286.7 2275.6 2278.1 2298.9
P4   sMM  2500.0 3674.1 3240.0 3584.0 3571.4 3600.0 3811.1 3657.1 3645.0 3703.7

ATH  sLU   556.0  824.1  983.3 1151.0 1223.7 1348.3 1384.4 1420.9 1471.5 1480.4
P4   sLU   606.5 1153.8 1529.5 1703.5 1808.9 2054.6 2284.2 2200.1 2428.0 2467.3


            1200   1400   1600   1800   2000   2200   2400   2600   2800   3000
           =====  =====  =====  =====  =====  =====  =====  =====  =====  =====
ATH  sMM  2304.0 2325.4 2320.7 2328.1 2332.4 2330.0 2315.6 2349.7 2351.6 2364.3
P4   sMM  3676.6 3683.2 3673.5 3679.5 3661.3 3665.4 3671.7 3657.9 3670.9 3658.5

ATH  sLU  1599.0 1618.0 1727.5 1719.6 1819.6 1787.5 1887.9 1865.3 1961.2 1904.3
P4   sLU  2616.5 2688.8 2785.1 2922.1 2913.3 2994.2 3040.6 2074.5 3093.2 3118.8

Level 3 BLAS, single precision, N=500 (MFLOPS):

                          GEMM   SYMM   SYRK  SYR2K   TRMM   TRSM
                         =====  =====  =====  =====  =====  =====
ATH      s500           2173.9 2272.7 1789.3 2381.0 2000.0 1351.4
P4       s500           3571.4 3125.0 2636.8 3333.3 2941.2 2500.0