
Re: simple question about ATL_mmJIK.c


>NBmm0(MB, NB, KB, ATL_rone, pA, KB, pB, KB, beta, pC, ldpc);
>pA += NBNB;
>pB += NBNB;
>I assume that what's going on here is that pA and pB are pointers to some
>starting offset within the matrices A and B.  When NBmm0 is
>called, the kernel will be executed, and the sub region of size NBxNB of
>matrices A and B will be in the L1 cache, which the kernel then operates
>on at a higher speed than if the data weren't in cache.  After returning,
>the pointers are advanced by NB*NB entries, and the process will repeat
>with another chunk of a matrix.  So, my simple questions are:

Actually, they are not already in the cache, but A, at least, will be kept
there for this operation.  ATLAS/doc/atlas_over.ps describes this in more
detail.

>1. NBNB I assume is the number of *entries* in the sub matrix, so NB
>squared.  If I wanted to get the number of bytes this takes up, then this
>would be NB*NB*sizeof(TYPE).  Is this correct?

>2. It seems like instead of processing the sub matrices like you normally
>handle a 2D array, that ATLAS is mapping the 2D space into just a strictly
>linear space.  Is this true?

Yes.  This is the block-major format discussed on pages 7-8 of
ATLAS/doc/atlas_contrib.ps.  This note, along with the provided
ATLAS/doc/atlas_over.ps, exists to explain at least some of these ideas.
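
As a rough illustration of the block-major idea (a sketch only, not
ATLAS's actual copy routine): the code below copies a column-major matrix
into a workspace where each NB x NB block occupies NB*NB contiguous
entries, so stepping a pointer by NB*NB moves to the next block.  The
function names, the fixed NB of 4, the use of double for TYPE, and the
ordering of entries inside a block are all assumptions made for the
example; the layout ATLAS really uses is the one described in the docs
above.

```c
#include <stddef.h>

#define NB 4  /* hypothetical block size; ATLAS chooses its own at install */

/*
 * Copy an M x N column-major matrix A (leading dimension lda) into
 * block-major storage.  Each NB x NB block is written as NB*NB
 * contiguous entries (column-major inside the block), and blocks are
 * emitted going down each block column in turn.
 * Assumes M and N are multiples of NB to keep the sketch short.
 */
static void copy_to_block_major(size_t M, size_t N, const double *A,
                                size_t lda, double *blk)
{
    for (size_t jb = 0; jb < N; jb += NB)          /* block column */
        for (size_t ib = 0; ib < M; ib += NB)      /* block row    */
            for (size_t j = 0; j < NB; j++)        /* column in block */
                for (size_t i = 0; i < NB; i++)    /* row in block    */
                    *blk++ = A[(ib + i) + (jb + j) * lda];
}

/* Bytes occupied by one block: entries per block times the entry size. */
static size_t block_bytes(void)
{
    return (size_t)NB * NB * sizeof(double);
}
```

With this layout, a pointer to one block plus NB*NB (the NBNB in the
quoted code) points at the start of the next block, which is why the
2D matrix looks like a strictly linear space once it has been copied.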