As my readers know, I have built a CUDA machine on my desktop for a few bucks to have lattice QCD at home. There are a couple of reasons to write this post, and the most important of them is that Pedro Bicudo and Nuno Cardoso have got their paper published in an archival journal (see here). They produced a very good code, to run on a CUDA machine, for SU(2) lattice QCD (download link), which I have got up and running on my computer. They are working on the SU(3) version, which is almost ready, and I hope to say more about it in the near future. Currently, I am porting the MILC code for the computation of the gluon propagator to my machine, to run on the configurations I am able to generate with Nuno and Pedro's code. This MILC code fits my needs quite well and is very well written. The task will take me some time, and unfortunately I do not have much of it.
Presently, Nuno and Pedro's code runs perfectly on my machine (see my preceding post here). There was no problem in the code: I had simply missed a compiler option needed to make the GPUs communicate through the MPI library. Once I corrected this, everything ran like a charm. From a hardware standpoint, I was unable to get the machine working reliably with three cards, and the reason was simply overheating: a chip on the motherboard ended up beneath one of the video cards, resulting in erratic chipset behavior. At one point Windows 7 even saw a floppy disc when I have none! So I decided to work with just two cards, and now the system is stable, works perfectly, and Windows 7 always sees all four GPUs.
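For readers trying something similar, a build for a mixed CUDA/MPI code typically looks like the sketch below. The file names and paths are my own illustration, not those of Nuno and Pedro's package; the essential point is compiling the device code with nvcc and linking through the MPI compiler wrapper so the GPUs can exchange data via MPI.

```shell
# Hypothetical build sketch for a CUDA + MPI code (names are illustrative).
# Compile the CUDA kernels with nvcc:
nvcc -c lattice_su2.cu -o lattice_su2.o
# Link the host code through the MPI wrapper, pulling in the CUDA runtime:
mpicxx host_driver.cpp lattice_su2.o -L/usr/local/cuda/lib64 -lcudart -o su2_mpi
# Run one MPI rank per GPU, e.g. two ranks for two GPUs:
mpirun -np 2 ./su2_mpi
```

Forgetting the MPI wrapper (or the equivalent include and library flags) at the link step is exactly the kind of omission that makes the GPUs appear unable to communicate.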
Nuno sent me an updated version of their code, which I will get running as soon as possible. Of course, I expect this porting to be as smooth as before, taking just a few minutes of my time. I suggested that they keep their site up to date with the latest version of the code, as it is evolving continuously.
Another important reason to write this post is that I am migrating from my old GeForce 9800 GX2 cards to a couple of the latest GeForce 580 GTX cards with the Fermi architecture. This will cost less than a thousand euros and will give me about 3 Tflops in single precision and 1 Tflops in double precision, with more RAM per GPU. The ambition is to upgrade my CUDA machine to the kind of computational capability that, in 2007, produced a breakthrough in lattice studies of the propagators of Yang-Mills theory. The main idea is to have code for both Yang-Mills and scalar field theories running under CUDA, comparing their quantum behavior in the infrared limit, an idea pioneered quite recently by Rafael Frigori (see here). Rafael showed through lattice computations that my mapping theorem (see here and references therein) also holds in 2+1 dimensions.
The GeForce 580 GTX cards I bought are from MSI (see here). They are overclocked with respect to the standard product and come at a very convenient price. I should say that my hardware is already stable and I am able to produce software right now, but this upgrade will take me to the Fermi architecture, opening up the possibility of double precision on CUDA. I hope to report here in the near future about this new architecture and its advantages.
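One practical detail worth noting (the file name below is hypothetical): double precision on Fermi requires targeting compute capability 2.0 at compile time. With the CUDA toolkits of this era, nvcc defaults to an older target and silently demotes doubles to floats.

```shell
# Target the Fermi architecture (compute capability 2.0) so that
# double-precision arithmetic runs natively on the GPU:
nvcc -arch=sm_20 propagator.cu -o propagator
```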
Nuno Cardoso & Pedro Bicudo (2010). SU(2) Lattice Gauge Theory Simulations on Fermi GPUs. J. Comput. Phys. 230:3998-4010, 2011. arXiv: 1010.4834v2
Rafael B. Frigori (2009). Screening masses in quenched (2+1)d Yang-Mills theory: universality from dynamics? Nuclear Physics B 833(1-2), 2010, 17-27. arXiv: 0912.2871v2
Marco Frasca (2010). Mapping theorem and Green functions in Yang-Mills theory. PoS(FacesQCD)039, 2011. arXiv: 1011.3643v3