Re: simple question about ATL_mmJIK.c
>NBmm0(MB, NB, KB, ATL_rone, pA, KB, pB, KB, beta, pC, ldpc);
>pA += NBNB;
>pB += NBNB;
>I assume that what's going on here is that pA and pB are pointers to some
>starting offset within the matrices A and B. When NBmm0 is
>called, the kernel will be executed, and the sub region of size NBxNB of
>matrices A and B will be in the L1 cache, which the kernel then operates
>on at a higher speed than if the data weren't in cache. After returning,
>the pointers are advanced by NB*NB entries, and the process will repeat
>with another chunk of a matrix. So, my simple questions are:
Actually, they are not already in the cache, but A, at least, will be kept
there for this operation. ATLAS/doc/atlas_over.ps describes this in more
detail.
>1. NBNB I assume is the number of *entries* in the sub matrix, so NB
>squared. If I wanted to get the number of bytes this takes up, then this
>would be NB*NB*sizeof(TYPE). Is this correct?
>2. It seems like instead of processing the sub matrices like you normally
>handle a 2D array, that ATLAS is mapping the 2D space into just a strictly
>linear space. Is this true?
Yes. This is the block-major format discussed on pages 7-8 of
ATLAS/doc/atlas_contrib.ps. That note, along with the provided
ATLAS/doc/atlas_over.ps, exists to explain at least some of these ideas.
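
To make the block-major idea concrete, here is a minimal sketch in C. It is
not ATLAS's actual copy routine: the names pack_block_major, block_ptr and
block_bytes are hypothetical, TYPE is assumed to be a real type (double
here), the dimensions are assumed to be multiples of NB, and the
column-major ordering inside each block is an assumption made only for
illustration. It shows a column-major matrix being copied so that each
NB x NB block is contiguous, after which stepping a pointer by NB*NB
entries moves to the next block, and one block occupies
NB*NB*sizeof(TYPE) bytes.

```c
#include <assert.h>
#include <stddef.h>

typedef double TYPE;  /* assumption: a real element type */

/* Bytes occupied by one nb x nb block of TYPE entries. */
size_t block_bytes(size_t nb)
{
    return nb * nb * sizeof(TYPE);
}

/*
 * Copy a column-major M x K matrix A (leading dimension lda) into
 * block-major storage: a sequence of nb x nb blocks, each contiguous.
 * Blocks are emitted row panel by row panel, walking along K inside
 * each panel; entries inside a block are column-major (illustrative
 * choice, not taken from the ATLAS sources).
 */
void pack_block_major(const TYPE *A, size_t lda, size_t M, size_t K,
                      size_t nb, TYPE *out)
{
    assert(nb > 0 && M % nb == 0 && K % nb == 0);
    for (size_t ib = 0; ib < M; ib += nb)          /* block row    */
        for (size_t kb = 0; kb < K; kb += nb)      /* block column */
            for (size_t j = 0; j < nb; j++)        /* column in block */
                for (size_t i = 0; i < nb; i++)    /* row in block    */
                    *out++ = A[(ib + i) + (kb + j) * lda];
}

/* Pointer to block number b: the same step as "pA += NBNB", taken b times. */
const TYPE *block_ptr(const TYPE *packed, size_t nb, size_t b)
{
    return packed + b * nb * nb;
}
```

With the data laid out this way, a kernel handed block_ptr(packed, nb, b)
reads nb*nb consecutive entries rather than nb separate column segments
that are lda entries apart.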