Thursday, July 23, 2009

Karajan and GridFTP

I owe big thanks to Mihael Hategan for some help here.

Karajan is a workflow system that is part of the Java CoG Kit and also powers the Swift system. It ships with version 4.1.5 of the CoG Kit, but that release is out of date, so it is better to get the code from the SourceForge SVN repository:

View it: http://cogkit.svn.sourceforge.net/viewvc/cogkit/trunk/current/src/cog/

Get it: svn co https://cogkit.svn.sourceforge.net/svnroot/cogkit/trunk/current/src/cog

Compile with ant. Here's an example script that lists the files on a remote host over GridFTP and prints each file's name and metadata to your screen. Run it with $COG_INSTALL_PATH/bin/cog-workflow gridftpExample.xml:

<project>
  <include file="cogkit.xml"/>

  <set name="l">
    <file:list dir="." host="gridftp-hg.ncsa.teragrid.org" provider="gridftp"/>
  </set>
  <for name="file" in="{l}">
    <set names="type,permissions,size,modified">
      <file:info file="{file}"/>
    </set>
    <print message="{file} Type:{type} Permissions:{permissions} Size:{size} Modified:{modified}"/>
  </for>
</project>


Mihael actually added some of these fields at my request (they are part of the underlying code).

Karajan also has its own scripting language that is much less awkward than XML for expressing the scripts:

import("sys.k")
import("task.k")

element(niceType, [type]
  if (
    type == FILETYPE:FILE, "File"
    type == FILETYPE:DIRECTORY, "Directory"
    type == FILETYPE:SOFTLINK, "Softlink"
    type == FILETYPE:DEVICE, "Device"
    "Unknown"
  )
)

files := file:list(dir=".",host="gridftp-hg.ncsa.teragrid.org",provider="GT2")

for(f, files,
  [type, perm, size, modified] := file:info(f)
  print("name: {f}, type: ", niceType(type),
    " ({type}), size: {size}, modified: {modified}, permissions: {perm}")
)

Tuesday, July 07, 2009

Blogger, Google, and Bing

I noticed my inflammatory post on how to use Gaussian on the TeraGrid (http://communitygrids.blogspot.com/2009/07/running-gaussian-on-big-red.html) is the #3 match if you google "gaussian teragrid", but it doesn't make the Top 50 if you use Bing. I suppose Google is weighting the search results to direct you to Blogger/Blogspot content.

Monday, July 06, 2009

Running Gaussian on TeraGrid

These are notes for running Gaussian serially on the TeraGrid. I'll look at IU's Big Red and NCSA's Abe, Mercury, and Cobalt. Surprisingly, this wasn't documented anywhere that I could find with Google. I had to resort to the Help Desk, which was helpful, but having to ask at all misses the point. And of course each machine requires a different incantation. You may wonder why the TeraGrid doesn't make its environments more consistent for these common applications. I suggest posting hypotheses on this subject as comments.

IU's Big Red
Gaussian is a famous piece of quantum chemistry software. I'll assume you have an input file, your_input.inp.

1. Request to be added to the Gaussian group. I found this through non-standard routes (thanks to Ray Sheppard) but try contacting the TeraGrid help desk. You have to do this before you can proceed.

1.5 Add "+gaussian" to your .soft file. For more on SoftEnv, see http://www.teragrid.org/userinfo/jobs/environment.php

My .soft file looks like this:
more ~/.soft
#
# This is the .soft file.
# It is used to customize your environment by setting up environment
# variables such as PATH and MANPATH.
# To learn what can be in this file, use 'man softenv'.
#
#
@bigred
@teragrid-basic
@globus-4.0
@teragrid-dev
+gaussian
----------------------
Either type "resoft" or just exit and log in again.
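If you edit .soft files on several machines, the step above can be scripted. Here is a minimal bash sketch, assuming the file lives at ~/.soft and ends with a newline; the function name add_soft_key and its arguments are my own invention, not part of SoftEnv.

```shell
#!/bin/bash
# Append a SoftEnv key (for example "+gaussian") to a .soft file unless
# an identical line is already there. The function name and the optional
# second argument are hypothetical helpers for this sketch.
add_soft_key() {
    local key="$1"
    local soft_file="${2:-$HOME/.soft}"

    # Create the file if it is missing so grep has something to read.
    touch "$soft_file"

    # -x matches whole lines only, -F treats the key as a literal string,
    # and -- stops option parsing so a leading "+" or "-" is safe.
    if ! grep -qxF -- "$key" "$soft_file"; then
        printf '%s\n' "$key" >> "$soft_file"
    fi
}

# Example: add_soft_key +gaussian
```

Running it twice leaves a single "+gaussian" line, so it is safe to call from a setup script. You still need to run "resoft" or log in again afterwards.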

2. Make a LoadLeveler script like the one below:
---------------
# @ output=stdout.txt
# @ error=stderr.txt
# @ wall_clock_limit=5:00:00
# @ account_no=YOUR_ACCOUNT_NUMBER
# @ queue=default

export g03root=/N/soft/linux-sles9-ppc64/gaussian/g03-d.02
. $g03root/g03/bsd/g03.profile


g03 $HOME/your_input.inp $HOME/your_output.out
---------------

A real script would stage files in and out of scratch space (Gaussian creates very large files while it runs), but I intend to automate this through Globus + Condor-G. Also, this script submits only serial jobs.
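For what it's worth, the scratch staging mentioned above might look roughly like the bash sketch below. Everything in it is an assumption on my part: the function name run_in_scratch, the SCRATCH_BASE variable, and the /tmp default (the real scratch path differs on each machine). It is not what Globus or Condor-G would do for you.

```shell
#!/bin/bash
# Run a command in a private scratch directory: copy the input in, run,
# copy the output back, and clean up. All names here are hypothetical;
# SCRATCH_BASE defaults to /tmp only so the sketch runs anywhere.
run_in_scratch() {
    local cmd="$1" input="$2" output="$3"
    local scratch_base="${SCRATCH_BASE:-/tmp}"
    local workdir status

    workdir=$(mktemp -d "$scratch_base/g03job.XXXXXX") || return 1
    cp "$input" "$workdir/job.inp" || { rm -rf "$workdir"; return 1; }

    # Run inside the scratch directory so files the command writes to its
    # working directory land there. The subshell keeps our own cwd intact.
    ( cd "$workdir" && "$cmd" job.inp job.out )
    status=$?

    # Copy back whatever output exists, even after a failed run.
    if [ -f "$workdir/job.out" ]; then
        cp "$workdir/job.out" "$output"
    fi
    rm -rf "$workdir"
    return "$status"
}

# Example inside a batch script:
#   run_in_scratch g03 "$HOME/your_input.inp" "$HOME/your_output.out"
```

The function returns the exit status of the command it ran, so the batch script can still tell whether the job failed.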

3. Submit with llsubmit, track with llq. The IU Knowledge Base has a nice comparison of LoadLeveler and PBS commands and directives here: http://kb.iu.edu/data/axpz.html
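To make the LoadLeveler/PBS correspondence concrete, here is a small bash sketch that prints the same four settings (stdout file, stderr file, account, wall clock limit) in either dialect. It uses only directives that appear in the scripts in this post; the function name job_header and its argument order are made up for illustration.

```shell
#!/bin/bash
# Print scheduler directives for a serial job in LoadLeveler or PBS form.
# Usage: job_header loadleveler|pbs ACCOUNT WALLTIME
# The function itself is a hypothetical helper, not a TeraGrid tool.
job_header() {
    local scheduler="$1" account="$2" walltime="$3"

    case "$scheduler" in
        loadleveler)
            printf '# @ output=stdout.txt\n'
            printf '# @ error=stderr.txt\n'
            printf '# @ wall_clock_limit=%s\n' "$walltime"
            printf '# @ account_no=%s\n' "$account"
            ;;
        pbs)
            printf '#PBS -o stdout.txt\n'
            printf '#PBS -e stderr.txt\n'
            printf '#PBS -A %s\n' "$account"
            printf '#PBS -l walltime=%s\n' "$walltime"
            ;;
        *)
            echo "unknown scheduler: $scheduler" >&2
            return 1
            ;;
    esac
}

# Example: job_header pbs YOUR_ACCOUNT_NUMBER 05:00:00 > gaussTest.pbs
```

You would still append the machine-specific Gaussian setup and the g03 line yourself, since those differ on every system.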

NCSA's Abe

Unlike the other machines, you definitely do not want "+gaussian" in your .soft file on Abe. This will produce the following error:

g03: error while loading shared libraries: /usr/local/intel/mkl/10.0.3.020/lib/em64t/libmkl.so: invalid ELF header

Make a PBS script like the one below. Again this is only a serial job.
[mpierce@honest2 ~]$ more gaussTest.pbs
#PBS -o stdout.txt
#PBS -e stderr.txt
#PBS -A YOUR_ACCOUNT_NUMBER
#PBS -q normal
#PBS -l walltime=05:00:00

soft add +intel-mkl
setenv g03root /usr/apps/chemistry/gaussian/g03
source $g03root/g03/bsd/g03.login
g03 $HOME/your_input.inp $HOME/yourOutput.out
----------------------

Alternatively, I could have added intel-mkl to my .soft file. I had to get the NCSA Help Desk to tell me this (thanks go to John Estabrook @ NCSA for a quick and accurate response to my ticket). They pointed me in the right direction, but why not just have this documented?

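NCSA also has a command-line tool, /usr/local/bin/qg03, that will make and submit a better PBS script than the one above.

To submit through Globus instead, point globusrun at a wrapper script that takes the input and output file paths as arguments: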

globusrun -o -r grid-abe.ncsa.teragrid.org/jobmanager-pbs '&(executable=/u/ncsa/mpierce/gaussian.abe.ncsa.sh)(arguments=/u/ncsa/mpierce/input.in /u/ncsa/mpierce/mepjunk12.out)(project=YOUR-TG-ACCOUNT)(queue=debug)(host_types=himem)(host_xcount=1)(xcount=8)'

or

globusrun -o -r grid-abe.ncsa.teragrid.org/jobmanager-pbs '&(executable=/u/ncsa/mpierce/gaussian.abe.ncsa.sh)(arguments=/u/ncsa/mpierce/input.in /u/ncsa/mpierce/mepjunk12.out)(project=YOUR-TG-ACCOUNT)(queue=debug)(count=1)(hostCount=8)(minMemory=16000)'

I think these are equivalent.
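Those RSL strings are painful to type, so here is a bash sketch that assembles one from its parts. The function name build_rsl and its argument order are my own; it only concatenates the clauses shown above and does no quoting or escaping, so it assumes none of the values contain parentheses.

```shell
#!/bin/bash
# Build an RSL string of the form used in the globusrun commands above.
# Usage: build_rsl EXECUTABLE "ARG1 ARG2" PROJECT QUEUE ["(key=value)" ...]
# Hypothetical helper: extra clauses are appended verbatim, unescaped.
build_rsl() {
    if [ "$#" -lt 4 ]; then
        echo "usage: build_rsl EXECUTABLE ARGUMENTS PROJECT QUEUE [CLAUSE...]" >&2
        return 1
    fi

    local executable="$1" arguments="$2" project="$3" queue="$4"
    shift 4

    local rsl="&(executable=$executable)(arguments=$arguments)(project=$project)(queue=$queue)"
    local clause
    for clause in "$@"; do
        rsl="$rsl$clause"
    done
    printf '%s\n' "$rsl"
}

# Example:
#   globusrun -o -r grid-abe.ncsa.teragrid.org/jobmanager-pbs \
#     "$(build_rsl /u/ncsa/mpierce/gaussian.abe.ncsa.sh \
#        "/u/ncsa/mpierce/input.in /u/ncsa/mpierce/mepjunk12.out" \
#        YOUR-TG-ACCOUNT debug "(count=1)" "(hostCount=8)" "(minMemory=16000)")"
```

Keeping the resource clauses as separate trailing arguments makes it easy to swap between the two variants above without retyping the rest.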

NCSA's Mercury

This still works, but who knows how long it will be before the system/application admins break it. As with Abe, it is best to at least use the qg03 command (it should be in your path) to generate a sample PBS script.

1. Add "+gaussian" to your .soft file (in $HOME) as above.

2. Make a PBS script like the one below. Again this is only a serial job.
[mpierce@honest2 ~]$ more gaussTest.pbs
#PBS -o stdout.txt
#PBS -e stderr.txt
#PBS -A YOUR_ACCOUNT_NUMBER
#PBS -l walltime=05:00:00

g03 $HOME/your_input.inp $HOME/yourOutput.out
----------------------

3. To invoke this with Globus clients or equivalent, you should use a command like

globusrun -o -r grid-abe.ncsa.teragrid.org/jobmanager-pbs '&(executable=/u/ncsa/mpierce/gaussian.cobalt.ncsa.sh)(arguments=/u/ncsa/mpierce/input.in /u/ncsa/mpierce/mepjunk12.out)(project=YOUR_ACCOUNT)(queue=debug)(minMemory=1600)'

or

globusrun -o -r grid-abe.ncsa.teragrid.org/jobmanager-pbs '&(executable=/u/ncsa/mpierce/gaussian.mercury.ncsa.sh)(arguments=/u/ncsa/mpierce/input.in /u/ncsa/mpierce/mepjunk12.out)(project=YOUR_ACCOUNT)(queue=debug)(host_types=himem)(host_xcount=1)(xcount=1)'

NCSA's Cobalt
This has changed, so the previous instructions (identical to those for running Gaussian on Mercury above) no longer work. *DELETED TIRADE*. Setting +gaussian in your .soft file no longer seems to work. Use the following minimal scriptlet:

#!/bin/csh
setenv g03root /usr/apps/chemistry/gaussian/G03/pp5_e01
source $g03root/g03/bsd/g03.login
$g03root/g03/g03 $1 $2

You can invoke this with a globusrun command like the ones above.

PSC's Pople
You have to fill out a PDF form and mail or fax it to PSC to get access to Gaussian. I did not pursue this.

Friday, July 03, 2009

Error in gsi-ssh and ssh on Mac 10.5: percent_expand: NULL replacement

I found this error while trying to log onto the TeraGrid ("Tara Grid") with gsissh (google "teragrid single sign on" for context).

percent_expand: NULL replacement

Apparently this is a well-known bug in the ssh that ships with Mac OS X 10.5. The workaround is to append "-i ~/.ssh/id_rsa" or "-i ~/.ssh/id_dsa" to the command line, like so:

gsissh login-hg.ncsa.teragrid.org -i ~/.ssh/id_rsa

I got this from http://www.nabble.com/ssh%3A-percent_expand%3A-NULL-replacement-to13439207.html
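If you don't remember which key type you have, a tiny bash sketch like the one below can pick the file for the -i workaround. The function name pick_identity is hypothetical, and I'm assuming the keys live under ~/.ssh with the default names id_rsa and id_dsa.

```shell
#!/bin/bash
# Print the first existing private key among id_rsa and id_dsa in the
# given directory (default ~/.ssh). Returns non-zero if neither exists.
# Hypothetical helper for the "-i" workaround described above.
pick_identity() {
    local ssh_dir="${1:-$HOME/.ssh}"
    local name

    for name in id_rsa id_dsa; do
        if [ -f "$ssh_dir/$name" ]; then
            printf '%s\n' "$ssh_dir/$name"
            return 0
        fi
    done
    return 1
}

# Example:
#   gsissh login-hg.ncsa.teragrid.org -i "$(pick_identity)"
```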