
Simple porting of RegCM using reserve_smp_nodes

by Riccardo Di Meo last modified 2008-07-18 18:15

This page describes the porting of a tightly coupled, MPI-enabled application with real-time feedback and user interaction.

Overview

Here we present a porting of the climatological application RegCM, with an approach very similar to the one used in the matmul example presented earlier.

RegCM is an application for climatological modelling, developed at the International Centre for Theoretical Physics, which can exploit multiple processors using MPI.

However, the code is very tightly coupled, which means that running it with normal MPI over TCP/IP brings very little improvement to the computation: to gain any benefit from the parallelism, it must be executed on SMP/multi-core machines (or on a cluster with a very fast interconnect, with MPI over Myrinet or MPI over InfiniBand).

Before the development of reserve_smp_nodes, this would have kept RegCM from running on the grid at all (or at least would have greatly limited the effectiveness of any porting). We will now show how to run RegCM simulations using up to 8 nodes, with a relatively simple Python script.

Please note that, although the script aims to show some nice tricks that can be used to port applications effectively, improvements can still be made: the advanced user will surely be able to discern which approaches need enhancement, and in which ways.

The application

We will not dwell on the details of setting up a RegCM simulation (for more insight, check the documentation at the RegCM homepage); instead, we will start from a fully configured regcm package, consisting of:

  • A RegCM executable, compiled statically for a specific simulation and linked against MPICH with the shmem device (for this purpose, we will use a test binary compiled for 2-processor simulations).
  • The relocatable MPICH package with the shmem device, which is just a normal precompiled MPICH archive with small changes to allow for an easier installation on the grid (you can find it here).
  • A tar-gzipped package with the input files required to run the simulation. Our example will be called input.tar.gz and contains the following files:
    • DOMAIN.CTL
    • DOMAIN.INFO
    • ICBC1994050100
    • ICBC1994050100.CTL
    • regcm.in

The porting

We have again used Python's SimpleXMLRPCServer module to create a server, and the xmlrpclib module to create a client, in order to provide the user with an interactive system.
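
The server/client pairing can be sketched as follows. This is only a minimal illustration, not the code of the actual regcm_xmlrpc scripts: it uses the Python 3 names of the two modules (xmlrpc.server and xmlrpc.client, the successors of SimpleXMLRPCServer and xmlrpclib), and the ask_location method, the password check and the loopback address are assumptions made for the example.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

PASSWORD = "foo"  # hypothetical shared secret, as passed on the command line


def ask_location(password, what):
    """Hypothetical RPC method: a real server would prompt the user here."""
    if password != PASSWORD:
        raise ValueError("wrong password")
    # A fixed answer stands in for the interactive prompt.
    return "file:%s.tar.gz" % what


def start_server(host="127.0.0.1", port=0):
    """Start an XML-RPC server in a background thread (port 0 = any free port)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(ask_location)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_server()
host, port = server.server_address
client = xmlrpc.client.ServerProxy("http://%s:%d" % (host, port))
answer = client.ask_location(PASSWORD, "input")
print(answer)
server.shutdown()
server.server_close()
```

In this sketch the client asks and the server answers, which mirrors the work-flow described here: the job running on the grid contacts the user's machine, never the other way round.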

As soon as the client hits the grid (after a successful job reservation), the user is asked, on the server, for the locations of:

  • the MPICH relocatable package
  • the regcm binary
  • the tar package with the input files
  • the output package (to be written as soon as it is created)

Each of these can be provided as one of two kinds of URLs:

  • file: type locations, for files residing on the user's machine (no grid storage used)
  • gsiftp: type locations, for files to be downloaded from or uploaded to an appropriate SE
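
As an illustration of how the two kinds of locations could be told apart, here is a small sketch. The function name classify_location is hypothetical and the actual scripts may well parse the locations differently.

```python
from urllib.parse import urlparse


def classify_location(location):
    """Return ('file', local_path) or ('gsiftp', full_url) for a user-supplied location."""
    parsed = urlparse(location)
    if parsed.scheme == "file":
        # e.g. "file:input.tar.gz" -> local path "input.tar.gz"
        return ("file", parsed.path)
    if parsed.scheme == "gsiftp":
        # The full URL is kept, since a GridFTP client needs host and path.
        return ("gsiftp", location)
    raise ValueError("unsupported location: %r" % location)


print(classify_location("file:input.tar.gz"))
print(classify_location("gsiftp://se-01.grid.sissa.it/tmp/regcm_2_shmem"))
```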

The user can mix the two approaches: for example, the binary and the MPICH package can be given as gsiftp URLs (saving local bandwidth, since they are likely to be used multiple times), while the input and output packages can be given as file URLs on the user's computer. This spares the tedious step of uploading them to and downloading them from the SE, which is unnecessary anyway, since these are "one-shot" files.
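
For file: locations, the data has to travel between the user's machine and the WN inside XML-RPC messages. One way this could be done, assuming (this is not confirmed by the scripts themselves) that the payload is wrapped in the standard Binary type of the XML-RPC library, is sketched below with placeholder data and no network involved.

```python
import xmlrpc.client


def pack_file(data):
    """Wrap raw bytes so they can travel inside an XML-RPC message."""
    return xmlrpc.client.Binary(data)


payload = b"\x1f\x8b placeholder for the tar.gz content"  # not a real archive
# Encode a method response carrying the file, as a server would send it.
wire = xmlrpc.client.dumps((pack_file(payload),), methodresponse=True)
# Decode it on the receiving side, asking for plain bytes objects.
(received,), _ = xmlrpc.client.loads(wire, use_builtin_types=True)
print("Got %d bytes" % len(received))
```

Since the whole file is encoded in a single message, this approach is best suited to relatively small packages such as the ones used in this example.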

After this step, the client will display a quick measure of the transmission speed of the xmlrpc messages and offer the user the opportunity to get the output of the simulation displayed in real time.
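
The measurement and the resulting advice can be sketched as follows. The function names, the number of repetitions and the decision rule are assumptions for illustration: the idea is simply that real-time output is worthwhile only if an XML-RPC call completes faster than the program produces new output.

```python
import time


def estimate_call_time(call, repeats=5):
    """Average wall-clock seconds taken by call() over a few repetitions."""
    start = time.perf_counter()
    for _ in range(repeats):
        call()
    return (time.perf_counter() - start) / repeats


def realtime_output_is_sensible(call_seconds, seconds_between_writes):
    """Real-time output keeps up only if a call is faster than the program's writes."""
    return call_seconds < seconds_between_writes


# A no-op stands in for a real XML-RPC round trip.
overhead = estimate_call_time(lambda: None)
print("estimated overhead: %.6f seconds per call" % overhead)
```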

At that point, the simulation will start on the WN and, after the conclusion, the output will be tarred and saved where specified.
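
The final packing step could look like the following sketch, which uses Python's standard tarfile module. The directory layout and file content are placeholders created in a temporary directory, not the real RegCM output.

```python
import os
import tarfile
import tempfile


def tar_output(directory, archive_path):
    """Pack a directory into a gzip-compressed tar archive and return the member names."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(directory, arcname=os.path.basename(directory))
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()


# Build a tiny stand-in for the simulation's output directory.
workdir = tempfile.mkdtemp()
outdir = os.path.join(workdir, "output")
os.mkdir(outdir)
with open(os.path.join(outdir, "OUT_HEAD"), "wb") as handle:
    handle.write(b"placeholder")

names = tar_output(outdir, os.path.join(workdir, "output.tar.gz"))
print(names)
```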

The client and the server are available, respectively, at the following addresses:

http://www.ictp.it/~dimeo/regcm_shmem/regcm_xmlrpc
http://www.ictp.it/~dimeo/regcm_shmem/regcm_xmlrpc_server

Putting aside some limits in the user interface and in the work-flow, a noteworthy limitation of this porting is that it doesn't provide a way to checkpoint the simulation, whose duration is therefore limited to the maximum running time allowed by the queue (usually 48 hours).

How the simulation is started

As suggested in the previous section, we will use a Storage Element to store the RegCM binary and the relocatable MPICH package:

$ globus-url-copy file:`pwd`/relocatable_mpich_shmem.tar.gz  gsiftp://se-01.grid.sissa.it/tmp/relocatable_mpich_shmem.tar.gz
$ globus-url-copy file:`pwd`/regcm_2_shmem  gsiftp://se-01.grid.sissa.it/tmp/regcm_2_shmem

The first operation is to start the server on a host with inbound connectivity; for simplicity, we will assume that the input package (input.tar.gz) resides there:

$ ls
input.tar.gz  regcm_xmlrpc_server
$ ./regcm_xmlrpc_server
Wrong number of arguments:
        ./regcm_xmlrpc_server <password> <listening port>
$ ./regcm_xmlrpc_server foo 23000
     (nothing spectacular happens...)

The server is started with a password, to protect it from malicious access, and with a listening port inside the GLOBUS_PORT_RANGE.
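
A quick sanity check of the two server arguments could be sketched as below. The function names are hypothetical, the range "20000,25000" is an invented example, and the "min,max" format is an assumption borrowed from the way Globus expresses its GLOBUS_TCP_PORT_RANGE variable; check the values actually configured at your site.

```python
def port_in_range(port, port_range):
    """Check a port against a 'min,max' range string."""
    low, high = (int(value) for value in port_range.split(","))
    return low <= port <= high


def parse_server_args(argv):
    """Parse '<password> <listening port>', the two arguments the server expects."""
    if len(argv) != 2:
        raise SystemExit("Wrong number of arguments")
    password, port = argv[0], int(argv[1])
    return password, port


password, port = parse_server_args(["foo", "23000"])
# "20000,25000" is a made-up range used only for this example.
print(port_in_range(port, "20000,25000"))
```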

Then reserve_smp_nodes is used to gather the required resources on the grid, launching the regcm_xmlrpc script with the proper arguments.

It is worth noting that we are using the CE:

grid0.fe.infn.it:2119/jobmanager-lcgpbs-grid

which doesn't have MPI support at all! This is possible only because we are using reserve_smp_nodes and setting up the MPI environment on the fly.

$ ./regcm_xmlrpc
Wrong number of arguments:
        ./regcm_xmlrpc <password> <host> <port>
$ ./reserve_smp_nodes -r grid0.fe.infn.it:2119/jobmanager-lcgpbs-grid  -j 8 -N 2 -p 23001 -T 500 -F ./regcm_xmlrpc -O "foo egrid-ui.egrid.it 23000"
Starting to receive...
All jobs correctly submitted!
** New connection established from 193.206.188.12:47200
   + Hostname received: grid17.fe.infn.it
** New connection established from 193.206.188.12:47201
   + Hostname received: grid17.fe.infn.it
Fitting a 2 job for my 2CPUs node
After '__best_matches' i have 1 tasks and 0 cpu free on the node

Got a task for 2 proc to fit in the node
Script './regcm_xmlrpc' sent.
Closing socket grid17.fe.infn.it (193.206.188.12:47201)
Closing socket grid17.fe.infn.it (193.206.188.12:47200)
All tasks have been assigned!
Out of the receiving cycle
Closing the remaining resources
1 tasks submitted for execution
$

At this point, we should see some activity on the server:

$ ./regcm_xmlrpc_server foo 23000
Starting...
Now you will have to enter the location
of a mpich relocatable package (in tar.gz format).
Enter the location (gsiftp/file):

Now we have to enter the required information and wait for the client on the grid to download the data it needs, either from the SE or from our computer:

$ ./regcm_xmlrpc_server foo 23000
Starting...
Now you will have to enter the location
of a mpich relocatable package (in tar.gz format).
Enter the location (gsiftp/file):  gsiftp://se-01.grid.sissa.it/tmp/relocatable_mpich_shmem.tar.gz
         (... pause to allow the WN to retrieve the package)
gsiftp://se-01.grid.sissa.it/tmp/relocatable_mpich_shmem.tar.gz retrieved as relocatable_mpich_shmem.tar.gz
**** mpich_smp correctly relocated
Enter the location of the regcm executable.
Enter the location (gsiftp/file):  gsiftp://se-01.grid.sissa.it/tmp/regcm_2_shmem
         (... pause to allow the WN to retrieve the binary)
gsiftp://se-01.grid.sissa.it/tmp/regcm_2_shmem retrieved as regcm_2_shmem
Setting the permissions
Enter the location of the input data (in tar.gz format)
Enter the location (gsiftp/file):  file:input.tar.gz
   (... another pause: now the clients retrieves the input from the user's machine)
Retrieving from your machine input.tar.gz
Got 4102648 bytes
Dumping the data on input.tar.gz
input.tar.gz retrieved and saved.
Where do you want the output to be saved?
Enter the location (gsiftp/file):  file:output.tar.gz
                (... small pause to check if the location is available)
Trying to access the location file:output.tar.gz...
Destination tested. Ready to start.
Creating the output directory
About  to execute regcm_2_shmem...
I estimate that an XMLRPC call is taking ~ 1.6" to execute (network overhead)
If you think that your program will write on the stdout/err faster,
i strongly recommend you to answer 'no' to the next option.
Do you want to use the real time output (yes/[no])?:  yes
 process 0 of 2
 RESTARTPARAM READ IN
 TIMEPARAM READ IN
 OUTPARAM READ IN
 PHYSICSPARAM READ IN
 SUBEXPARAM READ IN
 GRELLPARAM READ IN
  dtau =  37.5 75.
 NREC =  289276
 IDATE1, IDATE2, dtmin, ktaur =  1994050100 1994050300 2.5 0
 READING HEADER FILE
 DIMS 34 48 18
 DOMAIN 60000. 45.39 13.48 45.39 13.48 0.7155668
 PROJLAMCON
 SIGMA 0. 0.05 0.1 0.16 0.23 0.31 0.39 0.47 0.55 0.63 0.71 0.78 0.84 0.89 0.93 0.96 0.98 0.99 1.
 PTOP 5.
 OUTPUT 1 1
 ***** mdate =  1994050100
  input/output parameters
  ifsave =  T  savfrq =  48.  iftape =  T  tapfrq =  6.  ifprt  =  F  prtfrq =  12.  kxout  =  6  jxsex  =  40  radisp =  6.  batfrq =  3.  nslice =  120  ifchem =  F  chemfrq = 6.

  physical parameterizations
  iboudy =  5  icup =  2  igcc = 2  ipptls =  1  iocnflx =  2  ipgf =  0  lakemod =  0  ichem = 0

  model parameters
  radfrq =  30.  abatm =  600.  abemh =  18.  dt =  150.

  ncld =  1

 HT
 HTSD
 SATBRT
 XLAT
 XLONG
 MSFX
 MSFD
 F
 SNOWC

 ***************************************************
 ***************************************************
 **** RegCM IS BEING RUN ON THE FOLLOWING GRID: ****
 ****     Map Projection: LAMCON                ****
 ****     IX= 34  JX= 48  KX= 18              ****
 ****     PTOP= 5.  DX= 60000.        ****
 ****     CLAT=  45.39  CLON= 13.48     ****
 ***************************************************

       (.......)

 OUT-history written date =  1.9940503E+9
 BATS variables written at  1994050300 0.
 Writing rad fields at ktau =  1152 1994050300
 OPENING NEW SAV FILE: SAV.1994050300
 restart written date =  1.9940503E+9


 ***** restart file for next run is written at time     =       0.00 minutes, ktau =    1152 in year 1994
  *** new max DATE will be  1994060200
output/
output/SRF.1994050100
output/OUT_HEAD
output/RAD.1994050100.ctl
output/ATM.1994050100
output/ATM.1994050100.ctl
output/SRF.1994050100.ctl
output/SAV.1994050300
output/RAD.1994050100
output/OUT_HEAD.CTL
regcm_2_shmem terminated: program has taken 234.668567"
Tarring the output...
Saving to output.tar.gz
   ( ... pause to save the output on your local directory...)
Sent 16862175 bytes
output.tar.gz sent and saved in output.tar.gz.
Data correctly saved
Program terminated: shut down the server
or run another simulation

Congratulations: you have executed a RegCM simulation on the grid, and the output should now be in the server directory:

     (....)
Saving to output.tar.gz
Sent 16862175 bytes
output.tar.gz sent and saved in output.tar.gz.
Data correctly saved
Program terminated: shut down the server
or run another simulation    (Ctrl+c)
Interrupted by the user
$ ls -l
total 20508
-rw-------  1 dimeo dimeo  4102648 Aug 17  2007 input.tar.gz
-rw-rw-r--  1 dimeo dimeo 16862175 Jul 18 18:48 output.tar.gz
-rwx------  1 dimeo dimeo     3466 Sep  7  2007 regcm_xmlrpc_server
$ tar tvzf output.tar.gz
drwxr-xr-x euindia018/euindia 0 2008-07-18 18:48:30 output/
-rw-r--r-- euindia018/euindia 2702592 2008-07-18 18:48:31 output/SRF.1994050100
-rw-r--r-- euindia018/euindia   64768 2008-07-18 18:44:45 output/OUT_HEAD
-rw-r--r-- euindia018/euindia    1062 2008-07-18 18:44:45 output/RAD.1994050100.ctl
-rw-r--r-- euindia018/euindia 5988096 2008-07-18 18:48:30 output/ATM.1994050100
-rw-r--r-- euindia018/euindia     879 2008-07-18 18:44:45 output/ATM.1994050100.ctl
-rw-r--r-- euindia018/euindia    1669 2008-07-18 18:44:45 output/SRF.1994050100.ctl
-rw-r--r-- euindia018/euindia 10412768 2008-07-18 18:48:30 output/SAV.1994050300
-rw-r--r-- euindia018/euindia  4292352 2008-07-18 18:48:31 output/RAD.1994050100
-rw-r--r-- euindia018/euindia      744 2008-07-18 18:44:45 output/OUT_HEAD.CTL

Conclusions

Using the reserve_smp_nodes tool, we have been able to submit a pre-configured, pre-compiled RegCM simulation and run it on the grid.

Not only have we been able to execute a tightly coupled application, using the shared memory flavor of MPICH, but we have also been able to do it on a cluster which doesn't support parallel execution at all (since it doesn't provide MPICH support).

Though the example provided here is not particularly refined in terms of usability (and limits the maximum execution time to about 48 hours, since no checkpoint mechanism is provided), it gives some clues about how a work-flow with real-time feedback and user interaction can be provided (as well as the ability to use either the user's computer or an SE for data transfer).
