June 30, 2004
Update - Josh H - June 30, 2004
Worked On/Working On
- Weather Duck
- Sent follow up email to support folks about sound.
- Created PHP page to allow user to adjust the graph to the last N hours.
- GROMACS Port PVM: All levels finished.
- GROMACS Port MPICH: Ran singular NxNxN tests, but working on how to get the SMP installation set up properly. It is currently stalling when the program finishes.
- GROMACS Port MPICH2: Working on installation. This may be a flop.
- GROMACS Port MP_Lite: Notes are now in blog
Progressing with installation. Need to run GDB to see what is causing the core dump
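The "last N hours" adjustment for the Weather Duck graph comes down to a cutoff filter on the timestamps. The page itself is PHP; this is only a minimal Python sketch of the idea, assuming the readings are available as (timestamp, value) pairs. The function name and data layout are hypothetical, not taken from the actual script.

```python
from datetime import datetime, timedelta

def last_n_hours(readings, hours, now=None):
    """Keep only the (timestamp, value) readings from the last `hours` hours."""
    if now is None:
        now = datetime.now()
    cutoff = now - timedelta(hours=hours)
    return [(t, v) for (t, v) in readings if t >= cutoff]

# Made-up data: one reading every 10 hours going back 90 hours.
now = datetime(2004, 6, 30, 17, 0)
readings = [(now - timedelta(hours=h), 70.0 + h) for h in range(0, 100, 10)]

# Only the readings from the last 48 hours would be handed to the plot.
recent = last_n_hours(readings, 48, now=now)
print(len(recent))
```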
To do
- B-and-T-GROMACS Paper
- F@C development
- Start merging GROMACS mdrun with framework. Keep notes on any changes and what needed to be extracted.
Posted by hursejo at
05:36 PM
Literature Search Notes
A short list of key phrases: vectorization, vector implementation, vector extension.
Places to start from:
Craig Hunter's (of NASA) paper on evaluation of PowerMac G4 systems
the FFTW papers
the GROMACS manual
the GROMACS papers
the documentation for the gcc compiler concerning vector capabilities
related topic/information:
hpc.sourceforge.net has information about memory and cache usage.
Posted by schaejo at
03:30 PM
MP_Lite Notes
MP_Lite is a subset of the MPI-1 standard. Here are some of my notes from porting GROMACS from MPI to MP_Lite:
- A shared filesystem is needed since there is no daemon running on the nodes.
- Uses ssh to communicate by default
- M-VIA and VIA compatible
- I am testing initially with tcp settings, but I found this in the README:
For workstations, type 'make tcp' and link libmplite.a into your code.
If you are sure you won't pass messages larger than the TCP buffer
size, you can use the synchronous version by doing 'make tcp_sync'
which may increase performance by a few %. The TCP buffer size is reported
in the .nodeX log files after each run.
- FFTW needs at least the following functions:
- MPI_Comm_dup
- MPI_Alltoall
- MPI_Alltoallv
- MPI_Issend
which are not included in the MP_Lite library. So we have to compile FFTW without MPI support.
- Load balancing of processes is fairly straightforward since the hosts passed to the command line are used in the order you give them:
- Command Line: mprun -np 7 -hosts c0 c1 c1 c2 c3 c3 c0 prog
- c0 2 processes
- c1 2 processes
- c2 1 process
- c3 2 processes
Note that they are numbered by the order in which they were given on the command line.
- 0 -- c0
- 1 -- c1
- 2 -- c1
- 3 -- c2
- 4 -- c3
- 5 -- c3
- 6 -- c0
So if we are using a ring structure, we would want to group our nodes together. With GROMACS the ring is based on the MPI numbering of the nodes, so the optimal version of the command line should be as follows:
- Command Line: mprun -np 7 -hosts c0 c0 c1 c1 c2 c3 c3 prog
- 0 -- c0
- 1 -- c0
- 2 -- c1
- 3 -- c1
- 4 -- c2
- 5 -- c3
- 6 -- c3
- More than one process can be on a single node. MP_Lite uses a different port per process for communication on a single node.
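The numbering rule above (process i lands on the i-th host named on the command line) and the regrouping of hosts can be sketched in a few lines of Python. This is only an illustration of the bookkeeping; the helper names are made up here and nothing in it calls MP_Lite or mprun.

```python
from collections import Counter

def rank_map(hosts):
    """Process number i runs on the i-th host given on the command line."""
    return dict(enumerate(hosts))

def group_hosts(hosts):
    """Reorder the host list so that processes on the same node get
    consecutive numbers, keeping nodes in order of first appearance."""
    counts = Counter(hosts)
    grouped = []
    for host in dict.fromkeys(hosts):  # unique hosts, first-seen order
        grouped.extend([host] * counts[host])
    return grouped

original = "c0 c1 c1 c2 c3 c3 c0".split()
grouped = group_hosts(original)

print(" ".join(grouped))  # c0 c0 c1 c1 c2 c3 c3
for rank, host in rank_map(grouped).items():
    print(rank, "--", host)
```

The grouped list can then be pasted after -hosts in the mprun command line shown above.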
Posted by hursejo at
10:47 AM
June 29, 2004
notes on gromacs source
I have posted most of my notes on the GROMACS source that I have been grep'ing around in, in the CVS directory for numerical methods.
Here, hopefully, is a link to it: /cluster/project/numerical-methods/doc/schaejo_gromacs_notes.txt
Posted by schaejo at
04:48 PM
Meeting Notes
General Notes
- JoshM: Should look into MPI's native abilities for Load Balancing, Checkpointing, Recovery. Reading MPI books and articles.
- Load Balancing
Are we able to specify exactly the number of processes per node in the configuration? For example:
- Node 0: 2 processes
- Node 1: 5 processes
- Node 2: 3 processes
- Node 3: 1 process
- Node 4: 7 processes
Control processes within the code, or control over the distribution pattern?
- Checkpointing: Native to MPI or LAM specific or other?
- Checkpointing interval should be adjustable. Able to flex from 1 hr to 5 min or something like that. Should be a signal between nanny and child.
First-order solution: we have a frequency for the mdp file dump, which should appear on the mother. Nanny does all local file collections.
- Minimal GROMACS with grompp and mdrun.
- Add scientific core to Framework. Initially just frame, then GROMACS. Child merged with mdrun. Document changes.
Posted by hursejo at
03:11 PM
June 28, 2004
Meeting Notes
persons present: dawit, john, both joshs, charlie
Plumbing
- Switch to 1/3 plumbing and 2/3 other.
- LAM on cairo - do a bproc search to find problem. Check lam list for other installation errors.
- Weatherduck - dropdown for time periods.
- No word from weatherduck people in regards to sound.
- MP graphing tool works.
- DVC and PQC are now tools instead of consoles.
- Dawit to test bazaar images on bazaar annex.
- PBS scare. Someone on plumbing remove.
- Charlie got a recommendation for Debian on the clusters. Consider for future.
- Install yellowdog 3.0.1 on cairo.
- Get athena working (F@C testground?)
- Fishy lag on cluster, hopper, admin. Smells like a DNS problem. Check it out.
- Kill mt image wishlist entry.
- Joshm and Dawit will look at newly updated plumbing list.
Numerical Methods
- Charlie will update reading list.
- John found many cool things while grep'ing through gromacs.
- Goal is to pull out the inner loops in gromacs for the benchmarking kernel.
- New goodies in the gromacs-overview cvs project.
- Look at pros and cons of loop unrolling vs cache efficiency.
F@C
b-and-t
- Poster not yet printed. Thoughts of submitting poster to Kinko's web interface. Joshh will see if we can get a proof electronically. If not, charlie will pick it up.
Cluster Move
- Move weather duck to the new cluster room.
- Possibly move sometime next week.
Misc.
- 11a-1p meeting Thursday (lunch included). Normal meeting time next Monday.
- on cluster.earlham.edu: split resources page into tools, resources, and monitoring. Maybe a combination of all (problem with a label for this). Put links to the reading, plumbing, conferences, summerplan, docs.
Posted by mccoyjo at
02:19 PM
Sunday update
After more grep'ing around gromacs I have found some more useful things. I found the print statement for the tail end of the .log files, and I am working on unraveling that thread. I have not found any of the other numerical routines that are employed yet (I found the most basic one a while ago). So I guess I will just keep looking.
Posted by schaejo at
09:29 AM
Update - Josh H - June 27, 2004
Worked On/Working On
- Molecule Testing Tool: Moved to CVS cgi-bin/mtt
- Weather Duck
- Sent email to support folks about sound. Waiting for response.
- Added source code to CVS in generic/src
- Should I display only the last X hours of reported data in the WeatherDuck Graph? I can see it getting crowded, and adjusting the script to display only the last 48/96/... hours is fairly trivial.
- GROMACS Port PVM: All levels finished. Waiting on Graph of data.
- GROMACS Port MP_Lite: Working on configuring GROMACS. Keep getting core dumps, and I am trying to figure out exactly why. Also producing my notes on MP_Lite in the blog.
- GROMACS Port MPICH: Ready to run on cairo once Charlie is finished with it.
To do
- B-and-T-GROMACS Paper
- F@C development
- Port GROMACS to:
- MPICH2
Posted by hursejo at
07:21 AM
June 27, 2004
update Josh McCoy
Cluster Admin
- LAM-MPI 7.0.6 - The install went smoothly on bazaar. There is a compilation error on cairo. I am in the process of an in-depth check.
- I am still messing with system imager on athena.
- Bazaar annex as a testbed for bazaar image: I would like to do this if it does not step on others' toes.
- The last option for a 2.6 kernel on cairo has failed. The mess has been cleaned up on c15. Clean install of yellowdog will happen in short order.
- Added a cluster wishlist entry in the Cluster Admin category. Add anything you would like to have done before the change over to the new images. This includes changes to admin, hopper, and any software we use. It would be nice to record any changes made.
Data Visualization
- Spent some time cleaning DVC source.
- Made first cut of a graphing script for MP implementation on ps vs parallel architecture. It still needs some cleaning and a coat of wax. Check it out at the Data Visualization Console.
- Switched the DVC to full output for both debugging and checking out the latest information added.
b-and-t gromacs
- Read through the CVS documents and joshh's notes. More to come at 10am tomorrow morning.
numerical methods
- Added gromacs figs and cflows to the gromacs-overview project.
Posted by mccoyjo at
11:23 PM
June 24, 2004
Meeting Notes - June 24, 2004
Numerical Methods
- John reported that K&R is going well, and he is feeling more comfortable reading C code. Has not looked too deep into the profiling numbers reported by GROMACS. Has started to look at the inner loops and lookup tables in the GROMACS source, and is keeping notes on which files and functions are used for the extraction phase of the benchmarking kernel.
- Need resources on Assembly programming on x86 and Altivec.
- Need to think about how to actually develop the benchmarking kernel from the information we have. Esp. for folks who have minimal experience with C code.
F@C
- Moving forward. Josh H reported on the conversation Charlie, Prof. Pande, and Josh H had on Tuesday night. Our goal is to have a working version by the first week in August.
- Molecule testing tool is current with last Monday's notes. Will move to CVS repository /cluster/cgi-bin.
B and T GROMACS Paper
- Meeting Monday at 10 am to talk about paper. Review all documents produced thus far (Josh H's doc, and Charlie's notes in /cluster/project/b-and-t-gromacs/reposts specifically the ones with -buffer or -notes extensions in the directory that one cannot see via the web, MT entries). Where to publish?
- GROMACS PVM Port: Waiting for Visualization tool
- GROMACS MP-Lite Port: In development; may have a lead by harnessing PVM development.
- GROMACS MPICH Port: Compiled and working on the scheduler script.
- /cluster/project/gromacs-overview to review for GROMACS notes.
- Should we develop a document that has a short description of each C file in GROMACS along with descriptions of which variables/functions mean what? Use our limited knowledge to start, then open the 'Help Document' Project up to the GROMACS developers community.
Plumbing
- Bazaar is ready. Need wish list for Zero'ith node, and need environment for testing zero'ith node. Suggested using bazaar annex as a separate cluster to test the bazaar image.
- Athena: Josh M is learning SystemImager. Athena needs a bit of network work (NIS, etc.) before it is ready to image.
- Cairo: YD 2.6 kernel is not building at all. There is a core group of people developing the 2.6 kernel for PPC, but nothing is working yet.
We have the latest and greatest, so leave it as is and wait for the stable 2.6 kernel to emerge.
The latest version of YD is a minor fix for G5 security hole, and RPM upgrades. Josh M is going to go through the Change log once more to see if there is anything that we care about. This would be an upgrade from 3.0 to 3.0.1.
If there is nothing then Josh M will do a fresh install on c15 to clean out the cruft in the current image. He will do the same for the c0 Image.
- No progress on Data visualization, but now that the load of the images is lightening Josh M will move to developing the PVM visual for Josh H.
- PostGreSQL is being backed-up on admin via the backup script. Josh H fixed a ssh key exchange between admin and hopper for root.
- LAM-MPI is not upgraded due to FTP troubles on host site. If it is not up in a few days Josh M will send mail to Users list.
General
- Charlie is using Cairo instead of Bazaar for the workshop.
- DNS lookup problem may extend beyond bazaar to hopper and admin as noted by Josh M. Should look into this while developing the image. Charlie can take a look at this next week since it requires a bit of advanced knowledge of named and DNS.
- Josh M will park the cflow and xfig files for GROMACS that he has in /cluster/project/gromacs-overview
- Noted that we should all strive to publish our update notes the night before the meeting, and send notifications when we post our bi-weekly entries.
Cluster Move
- WeatherDuck e-mail to support was sent. No word yet. If no word in a couple days, Josh H will ping again.
- Josh H will put the WeatherDuck source code into CVS into repository /cluster/generic/src/
- Do we have a date when we will do the move? Mid-July, if not sooner?
Conferences and Presentations
- LinuxFest is corresponding with us. We may need to alter our approach a bit, but it should work. Josh H has been forwarding all correspondence to the listserv as it comes in.
Posted by hursejo at
02:30 PM
update Josh McCoy
The plumbing is going well. I hope to have things wrapped up by the end of the work day. I still need to play with system imager on athena and do the a0-a11 flop.
I am quite ready to move on to the data visualization tool and to doing some science.
Posted by mccoyjo at
09:40 AM
Wednesday Update
I chose to spend the majority of yesterday reading K&R rather than GROMACS code. Today I am planning on finishing the chapter on structures and then searching the code.
Are bitwise logical operators important? I didn't quite understand that part, but wasn't sure it was worth the time to digest fully.
On the meta-level, I am getting kind of frustrated. I am three and a half weeks in, and I don't feel like I have done much. This doesn't need to be a meeting item, but I would appreciate any topical words of wisdom, or mental judo tricks to deal with it.
Posted by schaejo at
08:09 AM
Update - Josh H - June 23, 2004
Worked On/Working On
- Gave John a tour of CVS and grep.
- Added pg_dump of PostGreSQL DB to the admin backup script. I also fixed some ssh keys on admin for root.
- I have been communicating with the Ohio LinuxFest organizers WRT the b-and-t-gromacs presentation.
- Molecule Testing Tool: Per our conversation on Monday, I made a series of changes allowing for editing/deleting/updating of tests and molecules.
- Weather Duck: Sent email to support folks. Waiting for response.
- GROMACS Port PVM: All levels finished. Waiting on Graph of data to make judgement on performance. Once the best has been determined then the rest of the parallel structures will be run on cairo.
- GROMACS Port MP_Lite: Working on configuring GROMACS. Also producing my notes on MP_Lite in the blog.
- GROMACS Port MPICH: Working on an installation.
- F@C Development: Talked with Prof. Pande and Charlie, and have some notes that we will review soon. Goal date: 4-6 weeks from now [First week in August]
To do
- B-and-T-GROMACS Paper
- F@C development
- Port GROMACS to:
- MPICH2
Posted by hursejo at
07:07 AM
June 21, 2004
Meeting Notes - June 21, 2004
Numerical Methods
- JH gave JS a grep tour. The cairo runs are done, results are in the table in JS's home directory. JM has the directory ready, he needs to check with JH for instructions for how to do the CVS chicken swing, that needs to be documented in MT under Cluster Admin. The distribution of time was roughly the same under x86 as PPC. We still haven't identified a proper subset of MDP parameters to use to identify particular methods used with a given molecule.
F@C
- JH's molecule testing tool looks good. Location should be clear that it's a file system reference, URL manufactured from that. Need a delete mechanism, both molecule and tests and just tests. Edit the tests row.
- On Tuesday evening JH and CP will talk with Vijay Pande. Before that JH and CP will design an MD agnostic architecture.
B and T GROMACS Paper
- MP_Lite might be a flop. There is a dependency between GROMACS with MPI and FFTW with MPI. FFTW uses more complex MPI instructions, ones that MP_Lite doesn't implement. JH thinks he can work through this.
- Now that c14 is available JH will start on MPICH.
Plumbing
- Imaging - JM says that bazaar good to go, athena not quite complete (NIS, mounting, c3 tools, etc.), ppc in progress with Yellow Dog (CD download is taking forever). Latest version of Yellow Dog isn't that different from what we are running now (3.0 vs 3.0.1), JM will try 2.6 kernel under 3.0 and see if it works. JM will try SystemImager out on athena.
- NIS is running on admin now, /cluster seems to be mounted permanently on admin.
- JM will install the latest LAM-MPI on bazaar and cairo preserving the old version.
- Database backup - JH will add a pgdump to backup script following the same even/odd directory pattern.
- JM will try to get some of the graphing work done while waiting on imaging, etc.
- Charlie still needs to update the wish list for the new image.
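The pg_dump addition to the backup script with an even/odd directory pattern could look roughly like the sketch below. It only builds the command and does not run it. How the real script decides between "even" and "odd" is not stated in these notes, so the day-parity rule, the directory layout, and the database name here are all assumptions made for illustration.

```python
import os
from datetime import date, timedelta

def backup_slot(day):
    """Alternate between two slots so yesterday's dump is never overwritten.
    Assumption: parity of the day's ordinal number decides the slot."""
    return "even" if day.toordinal() % 2 == 0 else "odd"

def pg_dump_command(dbname, backup_root, day):
    """Build (but do not execute) a pg_dump call that writes into the slot."""
    out_file = os.path.join(backup_root, backup_slot(day), dbname + ".sql")
    # pg_dump -f FILE DBNAME writes a plain SQL dump of DBNAME to FILE.
    return ["pg_dump", "-f", out_file, dbname]

today = date(2004, 6, 21)
# "/backup/postgres" and "clusterdb" are placeholder names, not the real ones.
cmd = pg_dump_command("clusterdb", "/backup/postgres", today)
print(" ".join(cmd))
```

Using the ordinal day rather than the day of the month keeps the two slots alternating across month boundaries such as the 31st followed by the 1st.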
General
- We should all be getting in the habit of making MT entries when we have a substantive item, e.g. exactly what code is responsible for the LJ + Coul(WW) entry in the results from GROMACS, or how the PVM implementation of GROMACS compares with LAM-MPI. Remember to use appropriate blessed keywords with each entry. These nuggets will form the basis for our posters and papers.
- For the workshop CP will use bazaar. 14 users (user0-user13), ssh setup, GROMACS, FFTW, subdirectory called villin with the base files set to 500 steps.
Cluster Move
- Charlie will ask (again) about lighted switches, key core, and vibration on the panel. BillB is going to move the next ethernet port from the East wall to the closet and remove the 110v.
- WeatherDuck - Working fine other than sound, JH's graphs look good. JH will check with them about swapping it for another unit. JH will put it in cvs, the link to it is already in the references section of cluster.earlham.edu.
Conferences and Presentations
- Consider SIAM Computational Science and Engineering 05, submission deadline is August 11. F@C (poster), B and T GROMACS (paper), numerical methods for MD (poster). CSE05.
- LinuxFest has started to review presentations, no word yet.
Posted by charliep at
08:06 PM
Update Monday morning
Sorry I should have done this last night.
I have moved the contents of the table to an html format, but I don't know how to look at it. It is probably still pretty ugly, and I have no idea what the spacings will look like, but it is in close to the right format.
I would like to do that grep and find tour today. I tried to figure some stuff out on my own, but I did not turn over any useful results.
I am starting the cairo runs this morning, because I know how to do that part.
Posted by schaejo at
09:41 AM
Meeting Notes - June 17, 2004
Numerical Methods
- Problems with molecule organization, we need a better structure. Identify the subset of mdp parameters that we care about to distinguish molecules and methods. JM will create /cluster/project/numerical-methods and the associated CVS stuff. src and doc subdirectories for now. JS will use one of the tables from the b and t gromacs poster to move his chart to that form and park it in CVS in that directory. JS will start with LJ + Coul(WW) and figure out exactly what code that represents. JH will work with JS to give him a tour of the relevant tools.
F@C
- Josh is working on a tool for tracking molecules and results. On Tuesday evening JH and CP will talk with Vijay Pande.
B and T GROMACS Paper
- First cut of mp_lite seems to be working. PVM port is running. JH has notes for mp_lite setup that he'll publish in MT with appropriate blessed keywords. JH, JM, and CP will meet on Monday the 28th
Cluster Move
- The cluster closet looks like it's ready as of about 12:30p today. Charlie will ask about lighted switches, key core, and vibration on the panel. BillB is going to move the next ethernet port from the East wall to the closet and remove the 110v.
- WeatherDuck - Working fine other than sound, JH's graphs look good. JH will check with them about swapping it for another unit. JH will put it in cvs, the link to it is already in the references section of cluster.earlham.edu.
Plumbing
- Gentoo on PPC - CDs did not work in the lab. New cut of cd failed as well, are those Macs disabled from booting from CD? I don't want us to spend our whole summer doing plumbing! JM will install latest and greatest Yellow Dog with a 2.6 kernel on c15 and see how it goes.
- Gentoo on x86 - bazaar is ready to go other than the additions necessary for b0 and a problem with NIS. Try file attribute of immutable to see where yp.conf is being deleted. Charlie still needs to update the wish list for the new image.
- JM will try to get the images for both bazaar and cairo done by the middle of next week.
- The mount of /cluster to admin should be made permanent. NIS setup on admin. JM.
- Long-term item - Scripts to monitor WeatherDuck and notify and then shutdown as appropriate. Organize and web publish all our code for this (HIP?).
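The long-term monitoring item (notify, then shut down as appropriate) is essentially a two-threshold decision on each WeatherDuck reading. A minimal Python sketch follows; the threshold values and the function name are placeholders chosen for illustration, and the actual notification and shutdown mechanisms are left out.

```python
def action_for(temp_f, notify_at=85.0, shutdown_at=95.0):
    """Map one temperature reading to an action.
    Both thresholds are hypothetical defaults, not agreed-upon values."""
    if temp_f >= shutdown_at:
        return "shutdown"
    if temp_f >= notify_at:
        return "notify"
    return "ok"

# Made-up readings: normal, warm enough to send mail, hot enough to power off.
for reading in (72.0, 88.5, 97.0):
    print(reading, action_for(reading))
```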
General
- Support for workshop next week - Charlie will need 10 users, user1-user10, setup with rc scripts that setup the appropriate paths for GROMACS, LAM-MPI, etc. on cairo.
- JH will move cp's extra phone patch to the seminar room.
- Meetings next week - Monday plumbing and paper, Thursday F@C and numerical methods.
- JM latter half of next week data visualization tool and preset query tool development.
- We should all be getting in the habit of making MT entries when we have a substantive item, e.g. exactly what code is responsible for the LJ + Coul(WW) entry in the results from GROMACS, or how the PVM implementation of GROMACS compares with LAM-MPI. Remember to use appropriate blessed keywords with each entry. These nuggets will form the basis for our posters and papers.
Posted by charliep at
04:27 AM
June 20, 2004
update Josh McCoy
There isn't much to update as I have just returned from a weekend trip. I will continue to work on the images and other plumbing until the middle of the week.
Posted by mccoyjo at
11:41 PM
Update - Josh H - June 20, 2004
Worked On/Working On
- Weather Duck: I updated the code to make it a bit more robust, and checked the sound. It seems to work exactly as it should, but I cannot verify the dB since I don't have anything that produces a given dB that I can measure with it. Some informal testing showed that it was working fine, but not nearly as sensitive as it once was.
- GROMACS Port PVM: Version 0, 1, and 2 are finished. Version 3 should be finished Monday morning. Next to run is the MP-Lite port.
- GROMACS Port MP-Lite: I am compiling the first version using the MPI bindings and default TCP window sizes. Once the PVM runs are finished then I can fully test my configure scripts, and make sure that I have everything wired up correctly.
Versions:
- Version 0: (mp_interface = MP-Lite) Use MPI Bindings and little to no changes in the GROMACS code.
- Version 1: (mp_interface = MP-Litev1) Use make tcp_sync to see how much of a performance gain we can get from the synchronous TCP version.
- Version 2: (mp_interface = MP-Litev2) Possibly convert GROMACS source from MPI to MP-Lite syntax to see if there is a performance benefit from using the native calls instead of the wrappers. This is unlikely, but possible.
- Version 3: (mp_interface = MP-Litev3) Use the suggested rmem_max and wmem_max sizes in the MP-Lite/README
- I am working on a tool to catalog information about molecules. Specifically fail modes, and contributed information. I am currently working on the web interface to this tool. I have created 2 new tables in the database, and their schemas have been appended to the end of the db-objects.sql file.
To do
- B-and-T-GROMACS Paper
- F@C development
- Port GROMACS to:
- MPICH
- MPICH2
Posted by hursejo at
10:20 PM
June 17, 2004
Update - Josh H - June 16, 2004
Sorry I forgot to post last night.
Nothing really to report. I am running the PVM tests, and waiting for them to finish. While waiting I am investigating MP-Lite.
Posted by hursejo at
10:21 AM
wednesday update
I made very little progress Wednesday. I expanded/tidied up the table, and JoshM and I found a copy of K&R in the Library. I am almost finished with Chapter one of the Numerical Methods book.
Question: How am I supposed to find these sub-routines in GROMACS? Charlie's comment about working backwards from the print statement made some sense, but JoshM showed me what a mess the GROMACS source is and I feel I could use a little more guidance on this project. I don't want it done for me, but I do want to know some intelligent ways of doing it.
Posted by schaejo at
09:22 AM
June 15, 2004
Meeting Notes - June 15, 2004
- Postmaster was not running on hopper. Josh M will check to see if hopper has a shutdown script for postgres.
- John is making progress in the Numerical Methods book. Charlie suggested some places to focus
- Image. athena and bazaar are mostly/completely fine. Gentoo boot cd is flaky on PPC. Going to test in the Mac Lab to see if that helps. Boots, but does not play well with /dev. If all else fails then try to port 2.6 kernel to latest YellowDog release.
- John has a chart of methods/molecules/dominant subroutine/flops.
- John and Dawit should be starting to learn C so they can move through the GROMACS source they care about. Specifically how these numbers are calculated. This will be the first step in creating a subset of GROMACS to focus on, and tune. This way we do not. How stats are done? What they represent? Develop Benchmark Kernel.
- Molecule repository cleanup. What is project1012
- PVM is running. Hopefully by the end of the room all of the PVM runs will be finished. Once Josh M gets the graph I need I can visualize this performance improvement/loss.
- Is the cluster room ready? Power Outlets/HVAC installed/hole filled in/Cookies?
- Kinkos stuff should be moving with Josh and Charlie back to Richmond.
Posted by hursejo at
01:33 PM
update Josh McCoy
Due to very poor internet connections and loss of electricity (both at home and at EC), I have basically been catching up on the reading list. I did have a brief chance to mess with ypbind on the bazaar gentoo image. I have also subscribed to several gentoo mailing lists (ppc-user, ppc-dev, cluster, osx) and have taken a good look at all of them. Unfortunately, traffic is very light (so light that the digests are weekly or even monthly in some cases) and many posts go without responses.
The next few days will entail finishing the cluster images and using system imager to distribute them. Hopefully all goes well.
Posted by mccoyjo at
09:39 AM
June 14, 2004
Sunday Update
I have finished the table of molecules. I am pretty sure that I only needed to run each once, because while the results may have varied from run to run, the number of calls to each sub-routine did not. I am not 100% sure of this, but I have gotten the same results every time I have run the shorter molecules.
I will entertain myself today with more reading, and hope that Charlie outlined the chapters for the Numerical Methods book.
Posted by schaejo at
09:02 AM
June 13, 2004
Update - Josh H - June 13, 2004
Worked/Working on
- Found this interesting command from the lam list:
laminfo -param rpi all | grep priority
It should display the default RPI module ranking. It must be a new feature (integrated in versions later than 7.0.2) since it does not quite work with our current setup.
- GROMACS PVM Port: Finished cleaning, and it has been running tests. I have version 1 running now. Once it finishes then I will run version 2 on the same dataset to see if there are any performance gains. There may be a version 3 to test the performance loss/gain of the following command:
pvm_setopt(PvmRoute, PvmRouteDirect);
Which was in the original code, and preserved in versions 1 and 2.
- Version 0: (mp_interface = PVM) Original Version from GROMACS site.
- Version 1: (mp_interface = PVMv1) Modified to allow multiple slaves per node in configuration.
- Version 2: (mp_interface = PVMv2) Made all sends PvmDataRaw since we do not use an endian-heterogeneous cluster. This will likely give us a performance boost, at the cost of portability to heterogeneous environments.
- Version 3: Possibility (mp_interface = PVMv3) Test the performance result due to the removal of the explicit PvmRouteDirect declaration.
- Fixed Arthur Vining Davis logo in Poster, and placed files on JumpDrive
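The portability cost mentioned for Version 2 comes from byte order: raw sends skip the conversion to a common wire format, so the receiver must use the same byte order as the sender. The Python sketch below illustrates only that underlying issue using the standard struct module. It does not use PVM at all, and the value is arbitrary.

```python
import struct

value = 1.5

# The same double laid out in big-endian and little-endian byte order.
big = struct.pack(">d", value)
little = struct.pack("<d", value)

# The two layouts are byte-for-byte reversals of each other.
same_bytes_reversed = (big == little[::-1])

# A little-endian reader handed big-endian bytes with no conversion
# decodes a different number, which is the heterogeneous-cluster risk.
misread = struct.unpack("<d", big)[0]

# Reading with the matching byte order recovers the original value.
roundtrip = struct.unpack(">d", big)[0]

print(same_bytes_reversed, roundtrip == value, misread == value)
```

On a cluster where every node shares one byte order, the conversion step buys nothing, which is the reasoning behind switching the sends to PvmDataRaw.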
To Do:
- B-and-T-GROMACS Paper
- Port GROMACS to:
Posted by hursejo at
12:48 PM
June 10, 2004
Meeting Notes - June 10, 2004
- John is running simulations, using files in /cluster/project/molecules noting the soft-links present in that directory.
- Bring Numerical Methods Book from Ranch.
- Move notes:
- Athena Image is ready.
- Bazaar is coming along. NIS and c3-tools need to be installed.
- Cairo will not install Gentoo. can't go to YellowDog easily since it does not support the 2.6 Kernel. JoshM will
- Send mail to Bill Birum about north wall of Recompute space and moving the boxes. Contact security about key core.
- Send donut requests to Charlie for Sunday morning.
- Josh H will shutdown clusters around 8ish Friday.
- O'Reilly Essential System Admin Book to add to reading list.
- Charlie is noticing a difference between running a molecule from a fresh boot and from a longer uptime system on Bazaar with regards to failure. He will keep monitoring this situation.
- PVM port of GROMACS has reached stage 2 of 3. After cleaning things up and testing a bit, Josh H will post a tarball of the new source to the GROMACS site.
Posted by hursejo at
01:44 PM
update Josh McCoy
The last few days have been long sessions of plumbing mostly consisting of installing gentoo on various nodes. Here is a progress report.
Athena - a11 - Dawit seems to have control of the image here. I'm not sure of the details, but progress is being made.
Bazaar - b20 - Things are going well on bazaar. The image is almost ready save the following items: nis, c3tools, a dns issue, and getting dhcpd to run on boot.
Cairo - c15 - There is an issue about booting from the gentoo cd that we are unable to solve (dawit, skylar, and I). The odd thing is that the process begins, you get to pick your temporary kernel, and then the errors start to roll in. It seems the /dev filesystem is not initializing properly. I have tried everything I can think of (including a substantial amount of time trying to track down information) to no avail. I'd be happy to give a tour of the problem tomorrow during our meeting.
I would like to note that gentoo has very little automation in regards to installation. It boots to an environment on the cd and says have fun. This has made the install both more time consuming and more worthwhile. Considering my previous sys admin experience, I've jumped into the deep end of the pool and seem to be swimming.
Posted by mccoyjo at
09:36 AM
Wednesday Update
I have been doing runs on node b16 and recording the most significant sub-routines to a file. I didn't get all of them (or truncated versions of all of them), but I did 8 of them. Two came out looking very similar, and I don't really know why.
I finished reading chapters one and three from the GROMACS manual. I don't really know how much of that was expected to stick, because not much of it did. If it is worth spending time on, I could go back through and try to unpack it into a more readable form.
Since I lack the courage to wade into the GROMACS source code that JoshM showed me the location of, I will spend tomorrow morning revisiting the HPC book and making notes to myself. If I finish that, I will start on a numerical analysis book that Tim McLarnen gave me.
What's next?
Posted by schaejo at
12:46 AM
Update - dawit
I got the short intro on how to run gromacs from Josh.
I now have a working gentoo image on athena, node 11 (159.28.231.32). The network is configured statically since there is still the dhcp thing to work on. But you guys can test the system, ssh is enabled.
Posted by bekelda at
12:08 AM
|
Comments (68)
June 09, 2004
Update - Josh H - June 9, 2004
Worked On/Working On
- WeatherDuck Monitoring: I have a working setup running on admin. You can access the graph, which is updated every 10 min. I did not add the ability to automatically shut down the cluster if the temp is over X or any functionality like that. At the moment I am just polling the device, gathering data, then pushing that data through gnuplot.
The Sound metric is dB.
- GROMACS PVM Port: It is coming along. I have broken the sequential nature of the previous port, and am now cleaning things up. It should now work with any collection of runs that we wish. I have not tested to see if the dependency upon the number of nodes in pvmd is still there. In theory it should not be, since it was only a dependency because of the sequential numbering scheme in pvmd when you order the nodes that you add correctly and have it match the size of your ring (which was very shaky ground from the beginning).
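One way to keep the ring independent of how pvmd numbers things, sketched in Python rather than in the actual C port: look up each task's position in the list of task ids instead of doing arithmetic on the ids themselves. The function name and the sample ids are made up for illustration.

```python
def ring_neighbors(tids, my_tid):
    """Return (left, right) neighbor task ids for my_tid in a ring.

    tids is the ordered list of task ids as handed back when the tasks
    were started. The values are treated as opaque, never as 0..N-1.
    """
    position = tids.index(my_tid)  # raises ValueError if my_tid is unknown
    size = len(tids)
    left = tids[(position - 1) % size]
    right = tids[(position + 1) % size]
    return left, right


# Hypothetical, non-sequential task ids standing in for real PVM tids.
example_tids = [262146, 524290, 786434, 262147]
left, right = ring_neighbors(example_tids, 262146)
```

Because only list positions are used, the ring size follows from the number of tasks actually started, not from the order in which nodes were added.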
To do
- Thursday Morning: Fix Arthur Vining Davis logo in Poster
- B-and-T-GROMACS Paper
- Port GROMACS to:
- PVM
- MPICH
- MPICH2
- MP-Lite
Posted by hursejo at
04:45 PM
|
Comments (73)
June 08, 2004
Editing Picoseconds
Hi All,
Charlie mentioned last Thursday that he had edited the number of picoseconds that the simulation ran through. How did he do that? I ask because many of the molecules sitting in the folder JoshM pointed me towards have default run times of 1000.0ps, which would take days. I would really like to chop that down to 10-20ps so that I could put together my table of "the most flop consuming subroutines" with a wide selection of molecules.
Thanks.
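As far as the GROMACS manual describes it, the run length is not stored directly: it is the product of `nsteps` and `dt` (in ps) in the .mdp parameter file, so shortening a run means lowering `nsteps` and re-running grompp. A minimal sketch of the arithmetic, with a hypothetical helper name and an assumed 0.002 ps time step (check the actual `dt` in each molecule's .mdp file):

```python
def nsteps_for(run_ps, dt_ps=0.002):
    """Number of MD steps needed to simulate run_ps picoseconds.

    dt_ps is the integration time step in ps. The 0.002 default is an
    assumption for illustration, not a value read from our files.
    """
    return round(run_ps / dt_ps)


short_run = nsteps_for(10)      # steps for a 10 ps run
default_run = nsteps_for(1000)  # steps for the 1000 ps default
```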
Posted by schaejo at
01:47 PM
June 07, 2004
Meeting Notes - June 7, 2004
CCG Meeting - cp, jh, jm, db, js
New cluster closet - JoshH will take a tour and make sure things are on-track for us to move Sunday. Things appear to be moving along, JohnW says it should be ready. Kevan is working on moving ports to the CS subnet, at least enough for our immediate needs.
WeatherDuck - JoshH has it working in a basic mode, he and JoshM will keep working on this. Charlie will order the H2O monitor for the WeatherDuck.
Plumbing
- Distros - Dawit and Skylar tried Gentoo this weekend, didn't go well but it seems like they know why. Stick with Gentoo for all three. Use lilo for the image, SystemImager works with it better than grub.
- DVQ and PQC - Minor changes, more to come after the move this weekend.
- Consider warewulf and oscar after the move.
- During the new image build process keep a good list of everything that is different from the base image. In theory it is just routing and configuring the second network interface.
- When we have 2.6 running on cairo let's test SMP on c0.
Numerical methods - John and Dawit will get a tour from JoshM on using ssh, GROMACS, etc.
GROMACS/PVM - JoshH is making progress, he isn't convinced that the person that wrote it did the work of using task ids correctly.
B and T GROMACS - JoshH waiting on feedback on the prose, likely this won't happen soon. We should in the near term look at JoshH's list of potential outlets (May 23 MT entry) and make a decision about this.
Next meeting items
- All - review pertinent section of the plan
- Plumbing - plan for workshop support
Future
- Data visualization console
- Preset query console
- Grant tour
- Conference tour
- Scheduler presentation and conversation, read JoshH's entry in preparation
- FFTW/GROMACS/MPI diagram update, use GROMACS manual as source?
- Modular F@C - make a plan
Posted by charliep at
01:38 PM
|
Comments (79)
Update - Josh H - June 6, 2004
Sorry for the late update, I was without net connection for the past 48 hours.
I have been working on the PVM GROMACS code, making it a bit more robust. Hopefully I will be finished with it by the end of the day Monday.
Posted by hursejo at
09:23 AM
|
Comments (78)
Update - Dawit
I have been working on plumbing all weekend. A CD burn of a Gentoo install failed for the G4-SMP kernel (hence why c15 has been taking a break for the weekend). A Gentoo install failed halfway on athena when trying to chroot, which happens to be an error on my part in thinking that the subarchitecture is an i686. Hopefully this same image will work for the bazaar node. I am taking node 20 from the annex.
Skylar and I are also waiting on hopper to finish downloading packages with jigdo for Debian. It appears it has 2.6 kernel support.
Posted by bekelda at
02:58 AM
|
Comments (53)
update Josh McCoy 6 June 2004
- Updated the GROMACS software overview with information on running single/multiple process(es). Added a few sample commands to show how to run villin.
- Fixed the titles of the DVC and PQC and cleaned up/documented source.
- I should have a working graphing script for JoshH tomorrow.
- Read Hassan's system imager docs. It tells of a special case for the 0th nodes. I'm guessing there aren't images for the head nodes.
- I'm hoping to have the gentoo installations in basic working order (NIS, NFS, networked properly) Monday. Dawit and I will have to spend a majority of our time tomorrow on this task (side note: I plan on coming to campus Tuesday and Wednesday to help ensure the new images go over smoothly).
- Took a look at OSCAR. The project seems interesting. It may be worth attempting to install on the annex.
- I had a hard time finding Warewulf (even after searching the Beowulf mailing list archives). Can someone send me a link?
- After spending some time looking at gentoo, I found a list of gentoo supported kernels. I didn't see any specific references to NAPI or low-latency kernels.
- As far as I can tell, Linux kernel 2.6 has NAPI support in that many of the drivers included with the kernel support NAPI. The official change logs/readmes can be found here: http://www.kernel.org/pub/linux/kernel/v2.6/
- Read more of the first AltiVec reading.
Posted by mccoyjo at
01:19 AM
|
Comments (42)
June 06, 2004
John's Sunday update
I have moved my eyes across all the text in the HPC book. I will revisit it in the near future to make notes for myself. I intend to find and read the "so you want to run GROMACS" document tonight or tomorrow. I will no doubt have questions involving that process, but those questions would not be an efficient use of meeting time. If one of the two Joshes could help me out, that would be great. I know my understanding of remote computing, via ssh or whatever, is lacking. But I do not know if that is my biggest problem.
Posted by schaejo at
10:17 PM
June 03, 2004
Meeting Notes - June 3, 2004
CCG Meeting - cp, jh, jm, db, js
Check out clusterworld.com's urban legend article. If you aren't a slashdot.org reader check that out too.
The b-and-t gromacs prose that Charlie was going to find is already in CVS in the -notes files.
Plumbing
- distro research, so far Josh thinks either Gentoo or SuSE (out of SuSE, Fedora, Gentoo, Debian, RedHat). He'll check out OSCAR and Warewulf. Gentoo and SuSE have recent kernels and frequent updates. Gentoo supports lots of kernels, do any of them purport to be low latency? Gentoo supports x86, PPC, and SPARC. D and J will build three images, a, b, and c; NIS and NFS, and will let us know when we can test it.
- When you have a minute check out IPMI.
- Charlie will check with John Walker about 24H HVAC
- 0th nodes are custom - cexec (should be on all nodes), routing, dhcrelay. Check local SystemImager for others and update as necessary.
- WordPress after the images are ready.
Data Visualization Console - JoshH's bug couldn't be recreated. Use "DVC" and "PQC" for the titles. This was cut short, more to follow.
Running GROMACS - JoshM will update the document and make sure that a single processor config is supported. John and Dawit will learn how to run it on bazaar. Study the output from mdrun, particularly mega-flop accounting, and start grokking it.
Sub-meetings - Mon for plumbing and b-and-t gromacs paper. Thu for F@C and numerical methods. 1p most weeks, 7p or thereabouts when Charlie is out of town. Try to stick to 1/2 hour per topic. All of us attend each meeting (except during vacations, etc.)
Readings are going ok, answered questions. Starvation will occur sometime next week, Charlie will update the list.
Brief review of the summer-2004 plan. During each of the upcoming sub-meetings review the pertinent section.
Communication - We talked at lunch about various approaches to this, Charlie will write them up and circulate.
GROMACS and message passing libraries - JoshH still plugging along, limited results and a direction to pursue WRT node numbering.
Next meeting items
- All - review pertinent section of the plan
- Plumbing - plan for workshop support
Future
- Data visualization console
- Preset query console
- Grant tour
- Conference tour
- Scheduler presentation and conversation, read JoshH's entry in preparation
- FFTW/GROMACS/MPI diagram update, use GROMACS manual as source?
- Modular F@C - make a plan
Posted by charliep at
04:02 PM
|
Comments (34)
Linux Distributions for x86
| Distribution | Latest Stable Kernel | Notes |
| Red Hat 9.0 | 2.4.26 | Dead, and 2.6 kernel upgrade options look sketchy. |
| Fedora Core 2 | 2.6 | Rumors of instability persist. |
| Gentoo 2004.1 | 2.6 | Updated often, support for many kernels (hurd, windows compatibility kernel). |
| SUSE 9.1 | 2.6 | Updated often and has an extremely large package library. It is becoming less "free-friendly". |
| Debian 3.0 | 2.2.20 | Can be upgraded to 2.6 kernel, but it certainly is not a Debian branded stable release. |
As it looks now, SUSE and Gentoo look the most promising, followed by Debian. Red Hat 9.0 seems like a really bad idea. Fedora could be a real wild card. It has good update times and the kernel we need, but it is still considered flaky.
Posted by mccoyjo at
11:46 AM
|
Comments (41)
Update - Josh H - June 2, 2004
Working on/Worked on
- Fixed dhcrelay on cairo. All nodes are now using DHCP to configure eth1, eth0 is disabled on startup by default.
- Porting GROMACS to:
- PVM: I have been able to run a molecule. I have a script that is running our tests on cairo. It has completed the NxNxN or Eq. A, and will be working on Eq. B-D for the next couple of days.
I need a new Graph in the DVC to display PVM and LAM-MPI runs on the same graph. So I can show the performance difference between PVM, LAM-MPI, MPICH, MPICH2, and MP-Lite on the same graph for a single molecule. Can we adapt the 2-D level graph or should we start anew? I am leaning towards the latter.
Currently the scheduler uses the DB to store results, but does not use the option_profile table due to sheer laziness of the developer.
To Do list
- Fix Arthur Vining Davis logo in Poster
- B-and-T-GROMACS paper
- Port GROMACS to:
- Modify DB schema and scheduler to contain [success|fail], where did it fail, how did it fail.
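A rough sketch of what the extra result columns might look like, using Python's built-in sqlite3 as a stand-in for our real database. The table name `run_result` and every column name are hypothetical, not the current schema.

```python
import sqlite3


def make_results_table(conn):
    """Create a hypothetical results table with success/fail bookkeeping."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS run_result (
            id          INTEGER PRIMARY KEY,
            molecule    TEXT NOT NULL,
            library     TEXT NOT NULL,
            status      TEXT NOT NULL CHECK (status IN ('success', 'fail')),
            fail_stage  TEXT,  -- where did it fail (NULL on success)
            fail_detail TEXT   -- how did it fail (NULL on success)
        )
        """
    )


def record_run(conn, molecule, library, status, fail_stage=None, fail_detail=None):
    """Insert one run outcome. The CHECK constraint rejects unknown statuses."""
    conn.execute(
        "INSERT INTO run_result (molecule, library, status, fail_stage, fail_detail) "
        "VALUES (?, ?, ?, ?, ?)",
        (molecule, library, status, fail_stage, fail_detail),
    )
    conn.commit()
```

Keeping "where" and "how" as separate nullable columns would let the scheduler log a failed run without inventing values for successful ones.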
Posted by hursejo at
07:27 AM
|
Comments (15)
update Josh McCoy 2 June 2004
- Installed Debian 3.0 ("Woody") on the athena golden client.
- Did research on x86 Linux distros. Expect an mt entry tomorrow morning after I get to work.
- Took a preliminary look at rdist.
- Gave a good look at the docs for the latest version of gnuplot to assess how much the graphing scripts need to change to accommodate a version upgrade.
- Display tool updates:
- Changed table schema to use a sequence as the key.
- Fixed the update vs add issue (PQC)
- Removed preset query deletion box (PQC).
- Removed associated graphs column from tables(PQC).
- To do for display tool:
- Take a closer look at the bug Joshh brought to light.
- Merge delete, add/update scripts with PQC. They are quickly converging.
- Enable column selection box.
- Update names/descriptions of the graph buttons.
- Make 2d Graph work.
Posted by mccoyjo at
01:46 AM
|
Comments (68)
June 02, 2004
Update-dawit
Josh and I have installed Debian on one of the athena nodes and we will be testing it to make sure it is ready to be a golden image.
I am still playing with the systemimager problem that I had with the old RH7.3 athena image. The problem seems to be that the dhcp packets that hopper sends do not get delivered to the nodes when sysimager does a dhcpbroadcast. I restarted dhcrelay on a0 on both interfaces and ipforwarding has been enabled. Does anyone have any similar past experiences with dhcrelay?
Another issue that might affect it is the line ddns-update-style in dhcpd.conf (I tried both interim and ad-hoc but it complains that it does not recognize it).
Posted by bekelda at
11:04 PM
|
Comments (63)
Wednesday Update
Um... Here goes.
I am on page 67 of the HPC book, and understood most of it. I will probably need to re-read some parts (i.e. chapter 3). I have read the first AltiVec document, but I will read it again to gather specific questions for tomorrow.
Posted by schaejo at
06:47 PM
June 01, 2004
Meeting Notes - June 1, 2004
CCG Meeting - cp, jh, jm, db, js
Schedule - 40 hrs/week, Mon and Thu working on the second floor of Dennis between 9a-5p ish. Don't forget the Sunday and Wednesday evening updates.
Move - Check to make sure that HVAC is on 24x7x365. We have a WeatherDuck that we'll use for monitoring the temperature and shutting down the clusters if necessary. Cairo moves as a unit (minus admin and bazaar), bazaar we'll take 1/2 of the machines out. Shutdown Friday night (JoshH), move Sunday morning at 7a (Charlie will bring the donuts). Clean ENI lab and shuttle remainder of ancillary gear later in June.
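A sketch of how the shutdown decision might look, assuming the WeatherDuck readings arrive as a list of recent temperatures. The threshold, the three-reading rule, and the function name are all assumptions for illustration, not a decided policy.

```python
def should_shut_down(recent_temps, limit, needed=3):
    """True only if the last `needed` readings are all above `limit`.

    Requiring several consecutive readings keeps a single bad sample
    from taking the clusters down.
    """
    if len(recent_temps) < needed:
        return False
    return all(temp > limit for temp in recent_temps[-needed:])


# Hypothetical readings, oldest first, in whatever unit the device reports.
readings = [30, 36, 37, 38]
decision = should_shut_down(readings, limit=35)
```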
New images - see plumbing document for the details. JoshH will be running tests until just before the move. JoshM and Dawit can use bazaar annex and c15 for testing, plan to upgrade to new images on Thursday June 17th. What to do on x86? Gentoo, RedHat9, ROCKS, SuSE, Debian, others? Use Debian for athena, not RedHat.
Workshops - during the weeks of June 20-26 and August 8-14 Charlie will be using the clusters for students in the parallel and distributed workshops. Partition each, part for workshops and part for CCG.
Grants - probably to a couple of private/corporate foundations, maybe NSF, maybe Keck. Consider further in a couple of weeks.
Reading - Dawit and John need to both read HPC, timeshare efficiently.
Sub-meeting plan - Before too long it will be more efficient to have regular b-and-t gromacs, f@c, numerical methods, plumbing.
Data Visualization Console (DVC) - Key value for table, makes editing possible. Add orderby. If using a preset query only appropriate graph types should be displayed. Graph = 3D parallel architecture and levels with PS/day; 2D Graph = 2D molecules and layers with PS/day; 2D graph (one molecule) = molecule and layers with PS/day (currently broken); Text Dump = Text Dump; Parallel Architecture Graph = 2D parallel architecture with PS/day (either one molecule or many). For some graphs reasonable to choose one molecule or more than one. For others only one or more than one is sensible, label appropriately. Finish column select functionality. Link to this in the Resource section of cluster.earlham.edu.
Preset Query Console (PQC) - Drop entry box for Delete Preset Query, add orderby. Link to this in the Resource section of cluster.earlham.edu.
Communication - We talked at lunch about various approaches to this, Charlie will write them up and circulate.
GROMACS and message passing libraries - JoshH still plugging along, some problems with initial PVM on cairo (eg it crashed the cluster). Use native communications for PVM.
WordPress - J&D will install it on admin for us to testdrive, easy to migrate MT to WordPress but coming back may be hard.
Thursday meeting
- Which x86 distro?
- Review summer plan
- Conference and presentation tour
- Sub-meeting plan
- B and T GROMACS - review JoshH's draft, charlie to find old prose
- Data visualization console
- Preset query console
Future Meeting
- Scheduler presentation and conversation, read JoshH's entry in preparation
- FFTW/GROMACS/MPI diagram update, use GROMACS manual as source?
- Modular F@C - make a plan
Posted by charliep at
08:00 PM
|
Comments (73)
update Josh McCoy 31 May 2004
- Fixed the display tool
- Fixed quote issues
- Added accidental deletion prevention
- Cleaned up code and added some documentation
Display tool locations
- Read over summer plan. It looks good.
- More progress was made in Leach. I started the first altivec reading.
- Gave Dawit the rundown on installing LAM.
- Read Charlie's grant mt entry. The possibilities look interesting. I am looking forward to getting more information.
Posted by mccoyjo at
01:27 AM
|
Comments (56)