The algorithms for vectorization outlined above can be used on parallel computers. For instance, variables on a wavefront can be processed in parallel, by dividing the wavefront over processors. More radical approaches for increasing the parallelism in incomplete factorizations are based on a renumbering of the problem variables. For instance, on rectangular domains one could start numbering the variables from all four corners simultaneously, thereby creating four simultaneous wavefronts, and therefore four-fold parallelism (see Dongarra, et al. , Van der Vorst ) . The most extreme case is the red/black ordering (or for more general matrices the multi-color ordering) which gives the absolute minimum number of sequential steps.
Multi-coloring is also an attractive method for vector computers. Since points of one color are uncoupled, they can be processed as one vector; see Doi , Melhem , and Poole and Ortega .
However, for such ordering strategies there is usually a trade-off between the degree of parallelism and the resulting number of iterations. The reason for this is that a different ordering may give rise to a different error matrix, in particular the norm of the error matrix may vary considerably between orderings. See experimental results by Duff and Meurant  and a partial explanation of them by Eijkhout .