CCG Meeting - cp, jh, jm, js
New power outage schedule, May 15-16 and June 12.
Bazaar annex is fully functional. Both Joshes are running code on it.
See MT for JoshH's MPI config file.
No graphing tool yet, maybe this weekend. JoshM.
Some progress on mdrun/MPI, starting parallel runs. JoshM.
All of us (particularly JM and CP) need to be more regular about using
MT.
The new testing users are set up on the cluster, f-at-c and b-and-t-g.
Passwords are the same as the switches.
Fix rid-lookup.php and install in cgi-bin, link to it and the new tool
at c.e.e/html/resources (all of those files are in CVS) and in the
Links section in MT. JoshM.
Villin/Urea molecule failure. Charlie.
Implement the logic and code to catch a child failure, recover, and restart
with one fewer process in v0.5 of F-at-C. JoshH.
We'll need to set up Bugzilla before too long.
Reading list for 1/x, sqrt(x), and vector processing. Charlie.
As I was waiting for some GROMACS code to finish compiling, I took the liberty of finishing the imaging of the new bazaar nodes (b16-20). They are all running now, waiting eagerly for work.
Since you need a special lam-bhost.conf file to work with these nodes, I have created one that we can all use. Instructions on how to use it are at the top of the file.
http://cluster.earlham.edu/home/joshh/src/lam-mpi/bazaar-annex.conf
I added the Project1012 Molecule to CVS. Here is the link if you want to view the files:
http://cluster.earlham.edu/project/b-and-t-gromacs/tests/etc/orignal-molecules/project1012/
It is a bit smaller than the last molecule that we received, but it will 'stretch' to 100 processes without error [in one set of testing].
I have typed up a protocol outline for v0.5 which is a bit more detailed than what we have on the board now. There are some notes at the bottom which we should address when we meet next. Here is the link:
http://cluster.earlham.edu/home/joshh/dev/fac/protocal.txt
Here is a link to the current framework:
http://cluster.earlham.edu/home/joshh/dev/fac/
I placed the "villin and URE in 6 A cubic box in water" molecule set in the b-and-t-gromacs CVS repository with the other molecules. It is under the directory villin-urea or via the softlink 'urea'.
I have been doing some testing with this molecule, and cannot seem to get it to span more than 3 processes without major failure [i.e. application death]. This is a very large molecule, and if we can use at most 3 processes with it I am interested in finding out why. This is one of those questions that we need to answer in order to produce some stable F@C code. grompp is able to split it fine, but mdrun chokes. I am playing around with the other versions of GROMACS to see if there is any difference; specifically, I am interested in testing with 3.2.1.
I have installed the latest stable release of GROMACS (3.2.1) on the clusters. I also installed GROMACS 3.1.4 on cairo. For both versions I installed a Baseline and an Optimal Config.
So on both clusters we have the following versions of GROMACS with both Baseline and Optimal configurations:
MPI_Comm_spawn sequentially gives processes to the configured processors. For our application we would like the user to define exactly which machines the program will run on. MPI_Info can name a file that specifies those nodes. The file needs to look something like:
< MPI_Info File >
n1 -np 1 nanny
n4 -np 1 nanny
n5 -np 1 nanny
< /MPI_Info File >
To shield the user from this level of detail, I have created a function that generates the Nanny and Child MPI_Info files from a hostfile.conf containing a comma-separated list of nodes, named in the lamnodes format used above. For example:
< hostfile.conf >
n1,n4,n5
< /hostfile.conf >
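The conversion from hostfile.conf to the MPI_Info file can be sketched roughly as follows. This is only an illustration, not the function described above: the name build_schema and its signature are hypothetical, and it assumes node names are separated by commas with no surrounding whitespace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not the function from the F@C source): expand a
 * comma-separated node list such as "n1,n4,n5" into one
 * "<node> -np 1 <program>" line per node.
 * Returns the number of lines written, or -1 if `out` is too small. */
int build_schema(const char *nodes, const char *program,
                 char *out, size_t out_size)
{
    size_t used = 0;
    int count = 0;
    const char *p = nodes;

    assert(nodes != NULL && program != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    out[0] = '\0';

    while (*p != '\0') {
        /* Length of the next node name, up to a comma, newline, or end. */
        size_t len = strcspn(p, ",\r\n");

        if (len > 0) {
            int n = snprintf(out + used, out_size - used,
                             "%.*s -np 1 %s\n", (int)len, p, program);
            if (n < 0 || (size_t)n >= out_size - used)
                return -1;          /* output would have been truncated */
            used += (size_t)n;
            count++;
        }
        p += len;
        if (*p != '\0')
            p++;                    /* skip the separator character */
    }
    return count;
}
```

Calling build_schema("n1,n4,n5", "nanny", buf, sizeof buf) fills buf with the three nanny lines shown in the MPI_Info file example; the same call with a different program name would produce the Child file.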
It is vital to note that MPI_Comm_spawn and MPI_Info are part of the MPI-2 standard, but many implementations do not support it fully. LAM-MPI is one of the few that support both MPI_Comm_spawn and MPI_Info. However, LAM-MPI does not currently implement some additional functions for intercommunicators. Among these are MPI_Bcast and other functions that send/receive/reduce messages globally to a group. Also, there is no function [that I have found thus far] to poll the size of a group via an intercommunicator. The workaround here is to have the first member of the Child group (rank == 0) send its group size to the mother over the intercommunicator. The man pages provide useful information about each of these functions and their limitations.
cflow is installed on bazaar in /cluster/bazaar/bin/cflow
and on cairo in /cluster/cairo/bin/cflow
The RPMs failed silently, so I had to grab the source. A PPC version seems to be hard to locate, but I am still looking. *Just after posting I found a diff for ppc and installed cflow on cairo*
After looking up usage documents online, it seems that cflow is difficult to run on larger programs due to instability. I have had no success after 40 minutes of playing with it on GROMACS, but the given examples and smaller C programs work fine.
documentation:
http://www.opengroup.org/onlinepubs/007904975/utilities/cflow.html
http://www.freealter.org/doc_distrib/cflow-2.0/#sect6
In order to have separate Mother, Nanny, and Child processes that play in the MPI sandbox, we will need to use MPI_Comm_spawn to launch our Child and Nanny binaries from the mother. I have been playing around with this and produced a basic framework that uses this functionality. You can play with the files; they are located here:
http://cluster.earlham.edu/home/joshh/dev/discover/dev/spawn/
Some points to mention before running to make a huge sandcastle:
1. There is a difference between intracommunicators (standard usage of MPI_COMM_WORLD) and intercommunicators. Intracommunicators are used to speak with the members of your own tribe (the Children that are spawned are in a tribe all of their own, and the mother is in a separate tribe). Intercommunicators allow tribes to talk to each other. So the Children need to know the handle (MPI_Comm mother) to reach the mother, and the mother needs to know the handle (MPI_Comm everyone) to reach the children.
2. The Children are able to call MPI_COMM_WORLD directly and it will allow the children to talk amongst themselves without talking to the Mother. If they want to talk to the mother they need to use the mother MPI_Comm 'channel'.
We should be able to spawn Nannies and Children as separate tribes, and join them as necessary. After creating one or more intercommunicators, it is possible to join them into one big, happy intracommunicator.
To start the program you only need to start the mother, and pass it one node to start from. MPI_Comm_spawn wakes up the additional processors.
$ mpirun n0 mother
The installation of cflowd requires the arts and GNU flex libraries. There seems to be some problem with arts communicating with flex at the moment. I'll take a closer look asap.
It seems that cflowd analyzes flow files from network communication, in a format developed by Cisco. Is that what we were looking for?
I cleaned up the Capability Discovery code a bit.
Here is a table of the future runs that I would like to do to answer the question:
For a given molecule, what is the optimal number of processes, taking into consideration SMP vs uniprocessor machines running both x86 and PPC hardware [Bazaar and Cairo respectively]?
The file is here:
chart.html
Here are some notes about reading SMP vs Uniprocessor runs in the Database.
SMP Collection
 cpus | nodes | processes |           label           | molecule | cluster_name |     finish_time
------+-------+-----------+---------------------------+----------+--------------+---------------------
    2 |     1 |         2 | Gromacs-SMP-Optimal-3.2.0 | villin   | bazaar       | 2004-02-24 21:52:37
    4 |     2 |         4 | Gromacs-SMP-Optimal-3.2.0 | villin   | bazaar       | 2004-03-02 16:24:51
    6 |     3 |         6 | Gromacs-SMP-Optimal-3.2.0 | villin   | bazaar       | 2004-03-02 16:10:04
    8 |     4 |         8 | Gromacs-SMP-Optimal-3.2.0 | villin   | bazaar       | 2004-03-02 15:58:00
    2 |     1 |         2 | Gromacs-SMP-Optimal-3.2.0 | villin   | cairo        | 2004-02-24 20:44:34
    4 |     2 |         4 | Gromacs-SMP-Optimal-3.2.0 | villin   | cairo        | 2004-03-02 20:59:57
    6 |     3 |         6 | Gromacs-SMP-Optimal-3.2.0 | villin   | cairo        | 2004-03-02 20:51:44
    8 |     4 |         8 | Gromacs-SMP-Optimal-3.2.0 | villin   | cairo        | 2004-03-02 20:45:20
Uniprocessor Collection
 cpus | nodes | processes |                label                | molecule | cluster_name |     finish_time
------+-------+-----------+-------------------------------------+----------+--------------+---------------------
    2 |     2 |         2 | Gromacs-Optimal-Configuration-3.2.0 | villin   | bazaar       | 2004-01-20 16:28:06
    4 |     4 |         4 | Gromacs-Optimal-Configuration-3.2.0 | villin   | bazaar       | 2004-01-20 16:06:36
    6 |     6 |         6 | Gromacs-Optimal-Configuration-3.2.0 | villin   | bazaar       | 2004-01-20 15:50:56
    8 |     8 |         8 | Gromacs-Optimal-Configuration-3.2.0 | villin   | bazaar       | 2004-01-20 15:37:34
    2 |     2 |         2 | Gromacs-Optimal-Configuration-3.2.0 | villin   | cairo        | 2004-01-17 20:40:55
    4 |     4 |         4 | Gromacs-Optimal-Configuration-3.2.0 | villin   | cairo        | 2004-01-17 20:27:23
    6 |     6 |         6 | Gromacs-Optimal-Configuration-3.2.0 | villin   | cairo        | 2004-01-17 20:18:47
    8 |     8 |         8 | Gromacs-Optimal-Configuration-3.2.0 | villin   | cairo        | 2004-01-17 20:11:57
The difference is in the combination of nodes and processes. In the uniprocessor runs nodes = processes; in the SMP (dual CPU) runs nodes = processes/(cpus per node), i.e. nodes = processes/2.
Note that in these runs cpus = processes, but this may not be so in the future. It holds only because we tested with one process per cpu; we may find that running more than one process per cpu is the optimal configuration.
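The relationship between the nodes and processes columns can be written down as a small function. This is a sketch only: the name nodes_needed and the procs_per_cpu parameter are hypothetical, and rounding up when the processes do not divide evenly is an assumption, since the runs above only cover even splits.

```c
#include <assert.h>

/* Hypothetical helper: number of nodes needed to host `processes` ranks
 * when each node has `cpus_per_node` CPUs and each CPU runs
 * `procs_per_cpu` processes. Rounds up (an assumption) so that a partly
 * filled node still counts as one node. */
int nodes_needed(int processes, int cpus_per_node, int procs_per_cpu)
{
    int per_node;

    assert(processes > 0 && cpus_per_node > 0 && procs_per_cpu > 0);
    per_node = cpus_per_node * procs_per_cpu;
    return (processes + per_node - 1) / per_node;
}
```

With one process per cpu, nodes_needed(8, 2, 1) gives the 4 nodes of the 8-process SMP rows and nodes_needed(8, 1, 1) gives the 8 nodes of the uniprocessor rows.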
I am going to set up some runs on Cairo and Bazaar to fill out our table.
As near as I can tell, the SIGTRAP error is an interaction between MPI and the AltiVec code, although that doesn’t sound reasonable on the surface of it. I made a copy of discovery.c sans all the MPI calls and it works fine.
I think we will have to trace down exactly where in inl1130() the trap is occurring in order to ultimately fix this. One approach would be to have our signal handler tell us where it was invoked from. Another would be to put some code in to pause stresscpu() long enough for us to attach with gdb before the offending call is made.
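A minimal sketch of the first approach, assuming a POSIX system where sigaction() with SA_SIGINFO is available. The function and variable names here are hypothetical and are not taken from discovery.c. For a trap raised by the hardware, si_addr is normally the address of the offending instruction, which can then be looked up in gdb or with addr2line; for a signal sent by raise() or kill() it carries no useful address.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Values recorded by the handler. Storing a pointer from a handler is not
 * strictly guaranteed to be atomic, but is adequate for a debugging aid. */
static volatile sig_atomic_t trap_seen = 0;  /* last signal number caught */
static void * volatile trap_addr = NULL;     /* si_addr reported for it   */

/* Hypothetical handler: remember which signal arrived and where. */
static void trap_handler(int sig, siginfo_t *info, void *context)
{
    (void)context;
    trap_addr = info->si_addr;
    trap_seen = sig;
}

/* Install the handler for SIGTRAP. Returns 0 on success, -1 on error. */
int install_trap_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = trap_handler;
    sa.sa_flags = SA_SIGINFO;       /* ask for the siginfo_t argument */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTRAP, &sa, NULL);
}

/* Accessors for code running outside the handler. */
int last_trap_signal(void)    { return trap_seen; }
void *last_trap_address(void) { return trap_addr; }
```

The idea would be to call install_trap_handler() before the stress code starts and print last_trap_address() once the handler has fired.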
I have finished the mother/nanny Capability Discovery Code. I have an MPI version and a singleton version (separate mother and nanny programs).
These are located here:
MPI Version:
http://cluster.earlham.edu/home/joshh/dev/discover/mpi/
Singleton Version:
http://cluster.earlham.edu/home/joshh/dev/discover/single/
I have a version for the x86 SSE enabled Linux systems, PPC Altivec Linux systems, and a generic Linux version. Currently the Altivec MPI Version is broken due to a Trap/Breakpoint error that I am struggling to track down.
In the directories above there is a create.pl script, which will build the appropriate collection of source for the system you specify on the command line. I need to make this into a configure script in the future.