It was twenty years ago today . . .


With these beautiful words, Paul Ginsparg, the founder of arXiv, opens a recollection paper. It is worth reading, as the history it recounts spans years that exactly overlap the computer revolution that changed our lives. Through these new information tools, Paul also changed the way researchers approach scientific communication. This revolution has not stopped yet: all the journals to which I submit my papers have a link to arXiv for direct upload of the preprint. This change has also had a great impact on the way these same journals present themselves to authors, readers and referees at their websites.

For my readers, I would just like to point out how relevant all this was for our community in Grisha Perelman's case. I think you are all well aware that Perelman never published his papers in a journal: you can find both of them on arXiv. Those preprints were worth a Fields Medal and a Millennium Prize. Not bad, I should say, for a couple of unpublished papers. Indeed, it is common for a paper to be widely discussed well before its publication, and often a preprint becomes a talking point in the community without ever seeing the light of publication. It is quite common for those of us doing research to console colleagues complaining about the harsh peer-review process by saying that arXiv exists today, and that is enough to make your work widely known.

I have been a submitter since 1994, almost from the very start, and I hope that the line of successes of this idea will never end.

Finally, to show how useful arXiv is for our community, I would like to point out a couple of papers for your summer reading. The first one is by R. Aouane, V. Bornyakov, E.-M. Ilgenfritz, V. Mitrjushkin, M. Müller-Preussker and A. Sternbeck. My readers should know that these researchers always do fine work and obtain important results in their lattice computations. The same happens here, where they study the gluon and ghost propagators at finite temperature in the Landau gauge. Their conclusion about Gribov copies is really striking, and it comforts my general view on this matter (see here): Gribov copies are not essential, not even when the temperature is raised. Besides, they discuss the question of a proper order parameter to identify the phase transition that we know exists in this case.

The next paper is authored by Tereza Mendes, Axel Maas and Stefan Olejnik (see here). The idea of this work is to consider a gauge, the λ-gauge, with a free parameter interpolating between different gauges, in order to see how smooth the transition is and how the propagators change along the way. They reach a volume of 70^4, but Tereza told me that the errors are still too large for a clean comparison with smaller volumes. In any case, this is a route worth pursuing, and I am curious about how the interpolated propagator behaves in the deep infrared on larger lattices.

Discussions on Higgs identification are still very much alive (you can see here). Take a look and enjoy!

Paul Ginsparg (2011). It was twenty years ago today … arXiv: 1108.2700v1

R. Aouane, V. Bornyakov, E.-M. Ilgenfritz, V. Mitrjushkin, M. Müller-Preussker, & A. Sternbeck (2011). Landau gauge gluon and ghost propagators at finite temperature from quenched lattice QCD. arXiv: 1108.1735v1

Axel Maas, Tereza Mendes, & Stefan Olejnik (2011). Yang-Mills Theory in lambda-Gauges. arXiv: 1108.2621v1

CUDA: Lattice QCD on a Personal Computer


At the conference “The many faces of QCD” (see here, here and here) I had the opportunity to talk with people doing lattice computations at large computing facilities. They told me that this kind of activity implies the use of large computers, user queues (as these resources are generally shared) and months of computation before seeing the results. Today the situation is changing for the better, thanks to an important technological shift. Indeed, it is well known that graphics cards are built around graphics processing units (GPUs) made of many computational cores working in parallel. Each core performs only very simple computational tasks but, thanks to the parallel architecture, very complex operations can be reduced to a set of such small tasks that the hardware executes in an exceptionally short time. This is the reason why, on a PC equipped with such an architecture, very complex video output can be obtained with exceptionally good performance.
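The decompose-and-combine idea just described can be sketched in plain C++ (a CPU illustration of mine, with threads standing in for GPU cores; it is not actual CUDA code): one big operation, here a long sum, is split into many small independent tasks whose partial results are then combined.

```cpp
// CPU sketch of the GPU idea: one big operation (a long sum) is split
// into many small independent tasks, each handled by its own worker,
// and the partial results are combined at the end.
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

double parallel_sum(const std::vector<double>& data, unsigned tasks) {
    std::vector<double> partial(tasks, 0.0);
    std::vector<std::thread> workers;
    const std::size_t chunk = data.size() / tasks;

    for (unsigned t = 0; t < tasks; ++t) {
        workers.emplace_back([&, t] {
            const std::size_t lo = t * chunk;
            const std::size_t hi = (t == tasks - 1) ? data.size() : lo + chunk;
            // Each "core" does a very simple job: sum its own slice.
            partial[t] = std::accumulate(data.begin() + lo, data.begin() + hi, 0.0);
        });
    }
    for (auto& w : workers) w.join();

    // Combine the small results into the answer to the big problem.
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}
```

On a real GPU the workers would be thousands of lightweight threads launched by a kernel rather than a handful of OS threads, but the divide-and-combine pattern is the same.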

People at Nvidia had the idea of using these cores purely for floating-point operations, putting them to work on scientific computations. This is how CUDA (Compute Unified Device Architecture) was born. So the first Tesla cards, without graphics output but with GPUs, were produced, and the development toolkit was made freely available. Nvidia made parallel computation available to the masses: just by mounting a graphics card with CUDA architecture, anybody can have a desktop computer with Teraflops performance!

As soon as I became aware of the existence of CUDA, I decided to jump on this bandwagon, which opened up for me the opportunity to do lattice QCD at home. So I upgraded my home PC with a couple of 9800 GX2 cards (2 GPUs each, with 512 MB of DDR3 RAM per GPU) having CUDA compute capability 1.1. This means that these cards can do single-precision computations at about 1 Tflops each, so my PC can reach a performance of 2 Tflops, but with no double precision. I also changed my motherboard to an Nvidia 790i Ultra, which supports 3-way SLI mode, and upgraded the power supply to 1 kW (Cooler Master Silent Gold). I added 4 GB of DDR3 RAM and kept my CPU, an Intel Core 2 Duo E8500 running at 3.16 GHz per core. The interesting point about this configuration is that I bought the three Nvidia cards on eBay as used items at a very low cost. So I was in business for very few bucks!

Before this upgrade, my machine had Windows XP Home 32-bit installed. This operating system could only address 3 GB of RAM, and 1 GB of that was taken by the two graphics cards. This turned out to be a serious drawback for the whole project. In a moment I will explain what I did to overcome it.

The next important step was to obtain CUDA code for QCD. The point is that CUDA technology is spreading rapidly through the academic environment, and a lot of code is available. Initially I thought of the MILC code. There is CUDA code available for it, and the people of the MILC Collaboration were very helpful. But this code is built for Linux, and I was not able to get that operating system up and running on my platform. Besides, I would have needed a lot of time to make all this code work for me, so I had to give up despite myself. Meanwhile, a couple of papers by Pedro Bicudo and Nuno Cardoso appeared (see here and here). Pedro was a nice companion at the conference “The many faces of QCD”, where I had the opportunity to meet him. He was not aware that I had asked his student Nuno for the source code. Nuno was kind enough to give me the link, and I downloaded the code. This was a sound starting point for the work on my platform. The code was written for CUDA from the start and so is well optimized. Pedro told me that the optimization phase cost them a lot of work, while putting down the initial code was relatively easy. They worked on a Linux platform, so he was surprised when I told him that I intended to port their code to Microsoft Windows. But this is my home PC, which all my family uses, and my attempt to install Ubuntu 64-bit had already been a failure that cost me the use of a Windows installation disk to remove the dual boot.

Then, during my Christmas holidays, when I had a lot of time, I started to port Pedro and Nuno's code to Windows XP Home. It was very easy: their code, written entirely in C++, needed just the insertion of a define. So, setting the path in a DOS box and using nvcc with Visual Studio 2008 (the only compiler Nvidia supports under Windows so far), I obtained a running code, but with a glitch: it would only run on my CPU. The reason was that I did not have enough memory under Windows XP 32-bit to complete the compilation of the code for the graphics cards. Indeed, Nvidia's ptxas compiler stopped with an error, and I was not able to get the code running on the graphics cards of my computer. After this step, successful in some respects, I wrote to Pedro and Nuno, informing them of my success in porting the code, at least running on my CPU under Windows. The code was written so well that very little was needed to port it! Pedro told me that something had to be changed in my machine: mostly, the graphics cards should be more powerful. I am aware of this shortcoming, but my budget was not so good at the time. This is surely my next upgrade (a couple of 580 GTX cards with Fermi architecture, which support double precision).
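I do not reproduce their actual define here, but the flavor of such a one-define porting fix can be sketched as follows (a hypothetical example of mine: the names and the missing function are my own illustration, not the change made in their code). A typical stumbling block when moving Linux C++ code to Windows is that MSVC lacks the POSIX drand48():

```cpp
// Hypothetical illustration of a one-define porting fix (NOT the actual
// change made in Pedro and Nuno's code): MSVC has no POSIX drand48(),
// so under _WIN32 one maps it onto the standard rand().
#include <cstdlib>

#ifdef _WIN32
static inline double drand48_compat() {
    // Uniform deviate in [0, 1) built from rand(); crude but compatible.
    return static_cast<double>(std::rand()) / (static_cast<double>(RAND_MAX) + 1.0);
}
#define drand48 drand48_compat
#endif

// Code shared between Linux and Windows can now call drand48() everywhere,
// e.g. for the accept/reject step of a Metropolis update.
inline double uniform01() { return drand48(); }
```

The point is that a well-isolated platform dependence reduces a whole port to a few lines guarded by a preprocessor symbol, which is why the port went so smoothly.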

Having experienced these memory problems, the next step was to move to a 64-bit operating system in order to use all my 4 GB of RAM. So, on another disk of my computer, I installed Windows 7 Ultimate 64-bit. In this case too, the porting of Pedro and Nuno's code was very smooth. In a DOS box I got their code up and running again, but this time on my graphics cards and not just on the CPU. As I find the time, I will do some computations of observables of SU(2) QCD, probing the limits of my machine. But this result is from yesterday, and I need more time to do some physics.
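To give a flavor of what such observables involve, here is a minimal CPU-side sketch (my own plain-C++ illustration, unrelated to the actual GPU code) of the standard representation used in SU(2) lattice codes: a group element U = a0·I + i(a1,a2,a3)·σ is stored as four reals with unit norm, and link products reduce to quaternion-like multiplications.

```cpp
// Minimal CPU sketch of SU(2) arithmetic as used in lattice codes:
// U = a0*I + i*(a1*sx + a2*sy + a3*sz), stored as four reals of unit norm.
// Then det U = a0^2 + a1^2 + a2^2 + a3^2 and Tr U = 2*a0.
#include <cmath>

struct SU2 {
    double a0, a1, a2, a3;
};

// Product of two SU(2) elements in this representation
// (quaternion-like multiplication for the U = a0 + i a.sigma convention).
inline SU2 mul(const SU2& u, const SU2& v) {
    return {
        u.a0*v.a0 - u.a1*v.a1 - u.a2*v.a2 - u.a3*v.a3,
        u.a0*v.a1 + u.a1*v.a0 - u.a2*v.a3 + u.a3*v.a2,
        u.a0*v.a2 + u.a2*v.a0 - u.a3*v.a1 + u.a1*v.a3,
        u.a0*v.a3 + u.a3*v.a0 - u.a1*v.a2 + u.a2*v.a1
    };
}

// Unit determinant is the SU(2) condition; useful as a sanity check.
inline double det(const SU2& u) {
    return u.a0*u.a0 + u.a1*u.a1 + u.a2*u.a2 + u.a3*u.a3;
}

// Re Tr U, the basic building block of plaquette-type observables.
inline double retrace(const SU2& u) { return 2.0 * u.a0; }
```

For the elementary plaquette one multiplies the four links around a unit square with mul() and takes retrace(); the storage as four reals instead of a full complex 2x2 matrix is one reason SU(2) fits so comfortably on memory-starved GPUs.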

Pedro informed me that they are working on SU(3), which is more difficult. Meanwhile, I have to thank him and his student Nuno very much for the very good job they did and for allowing me to have lattice QCD successfully working on my computer at home. I hope this will be a good starting point for other people doing this kind of research.

Update: Pedro has authorized me to post the link to download the code. Here it is. Thank you again, Pedro!

Nuno Cardoso, & Pedro Bicudo (2010). Lattice SU(2) on GPU's. arXiv: 1010.1486v1

Nuno Cardoso, & Pedro Bicudo (2010). SU(2) Lattice Gauge Theory Simulations on Fermi GPUs. J. Comput. Phys. 230:3998-4010, 2011. arXiv: 1010.4834v2
