Modeling LU factorization (Application)
Conducted by: Emmanuel Jeannot
Description: Modeling the LU factorization is an important challenge: it helps to understand the scalability of the algorithm and lets the user compute both the best block size and the optimal processor grid size. Several prior works have addressed it. The ScaLAPACK Users' Guide proposes a very crude model, used only to show the scalability of ScaLAPACK. It does not take the processor grid shape into account, although the shape is well known to have a great impact on performance. Domas et al. propose a very fine-grained model with more than 30 environment parameters and use it to compute the optimal grid size. However, they do not show how to compute these parameters. The authors of LAPACK Working Note 43 propose a simple model with only three environment parameters (the bandwidth and latency of the network and the flop rate of the processor). However, our experiments show that the values of these parameters depend on the phase of the algorithm. For instance, a better flop rate is achieved when updating the trailing sub-matrix than when updating the panel. It is important to note that none of the above models takes into account the architecture of SMP clusters, where processors are grouped on multi-processor nodes.
- Nodes involved: 100
- Sites involved: 1
- CPU bound: yes
- Memory bound: yes
Tools used: ScaLAPACK, MPI
Results: We have proposed a model that has three environment parameters per subroutine. Because it takes the grid size into account, it is able to derive the optimal grid shape in most cases. Moreover, we propose a simple and fast method to compute the model parameters. Finally, we show how to extend our model to multiprocessor clusters.
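To illustrate how a cost model can be used to choose a grid shape, here is a minimal Python sketch. It is not the model developed in this experiment, nor the model of any of the works cited above. It is a toy latency/bandwidth/flop-rate estimate for a right-looking block LU, with a separate flop time for the panel and for the trailing update to reflect the phase-dependent flop rate noted in the description. The function names, the cost terms, and all parameter values are assumptions made for illustration only.

```python
import math


def grid_shapes(p):
    """Return every (P, Q) process grid with P * Q == p."""
    return [(rows, p // rows) for rows in range(1, p + 1) if p % rows == 0]


def predicted_time(n, nb, P, Q, alpha, beta, gamma_panel, gamma_update):
    """Toy estimate (seconds) of a right-looking block LU on a P x Q grid.

    alpha: latency per message (s), beta: transfer time per matrix element (s),
    gamma_panel / gamma_update: time per flop (s) in each phase.
    """
    total = 0.0
    for k in range(0, n, nb):
        m = n - k                 # order of the sub-matrix still to factor
        b = min(nb, m)            # width of the current panel
        # Panel factorization: about m * b^2 flops shared by one grid column.
        total += m * b * b * gamma_panel / P
        # Pivot search: one reduction over the P rows for each panel column.
        total += b * math.log2(P) * alpha
        # Broadcast the panel along grid rows, the U block row along columns.
        total += math.log2(Q) * (alpha + beta * (m / P) * b)
        total += math.log2(P) * (alpha + beta * (m / Q) * b)
        # Trailing update: about 2 * (m - b)^2 * b flops shared by all processes.
        total += 2 * (m - b) ** 2 * b * gamma_update / (P * Q)
    return total


def best_grid(n, nb, p, **params):
    """Pick the grid shape with the smallest predicted time."""
    return min(grid_shapes(p),
               key=lambda g: predicted_time(n, nb, g[0], g[1], **params))


# Hypothetical parameter values, chosen only to exercise the sketch.
params = dict(alpha=5e-5, beta=8e-8, gamma_panel=1e-9, gamma_update=4e-10)
P, Q = best_grid(n=20000, nb=64, p=100, **params)
print(P, Q, predicted_time(20000, 64, P, Q, **params))
```

In practice the parameters would have to be measured on the target platform, for example by timing each subroutine separately. The grid selected here depends entirely on the assumed values and cost terms.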
Shared by: Emmanuel Jeannot
Last update: 2007-03-08 18:36:28