A Login Debugging Tool for PlanetLab
Another "Co" tool from the CoDeeN project
CoTest is a login debugging tool for PlanetLab. If you are having problems logging into a node, you can run CoTest to see what various data sources think about the node in question. The output is meant to be human-friendly. This tool gets its inspiration from Neil Spring's "why" script.
To use CoTest, you download it, compile it, and run it from the command line. You give it your slice name and a list of nodes, and for each node it reports any problems the node appears to be experiencing. It currently pulls data from two sources, CoMon and CoTop. It uses a fairly simple process to determine what problems might be occurring, and while it's not perfect, it should be reasonably accurate.
Follow these steps to install CoTest on
your own system:
CoTest is a fairly simple program to
use. Simply type
./cotest slice_name node.name
where slice_name is the name of your slice, and
node.name is the fully-qualified name of the
PlanetLab node that you are trying to test. For example, if our
slice is princeton_codeen, and our node is
planetlab-1.cs.princeton.edu, we would type
./cotest princeton_codeen planetlab-1.cs.princeton.edu
We may see results like the following:
cotest version 0.8a - see http://codeen.cs.princeton.edu/cotest for updates
Status: current version 0.8a released Apr 28,2006
Using server: summer.cs.princeton.edu
planetlab-1.cs.princeton.edu: both CoMon and CoTop think node is dead
In this case, both tools think the node is dead, so the node
is likely to be dead at the moment. We get different
information if we choose other nodes. If we run
./cotest princeton_codeen planetlab1.it.uts.edu.au
we may see something that looks like the following:
cotest version 0.8a - see http://codeen.cs.princeton.edu/cotest for updates
Status: current version 0.8a released Apr 28,2006
Using server: summer.cs.princeton.edu
planetlab1.it.uts.edu.au: just fyi - clock drift 18510 secs
planetlab1.it.uts.edu.au: slice princeton_codeen exists
In this case, CoTest is telling us that our slice exists on the node and that, other than the node's clock being out of sync (by 18510 seconds, a little over five hours), the node looks normal.
CoTest tries to simplify a number of conditions monitored by the tools that it uses. The messages that it can produce include the following, although some of them may not be in use at the moment because the PSEPR tool is not being polled:
CoTest is designed to work in conjunction with our other tools. If you specify your slice name via the MQ_SLICE environment variable, you will not need to specify it on the command line.
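As a minimal sketch of the environment-variable form, the following sets MQ_SLICE once and then passes only a node name. The slice and node names are the ones used in the earlier example; the `-x` guard is an addition of this sketch so that the snippet does nothing if cotest has not been built in the current directory.

```shell
# Set the slice name once for this shell session.
export MQ_SLICE=princeton_codeen

# With MQ_SLICE set, only the node name should be needed on the command line.
# The guard is not part of CoTest; it only skips the call if ./cotest is absent.
if [ -x ./cotest ]; then
    ./cotest planetlab-1.cs.princeton.edu
fi
```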
You may want to run CoTest on multiple nodes. You have two options: list all of the nodes on the command line, or list none of them. If no nodes are given on the command line, CoTest reads them from standard input. If you are typing them manually, press Ctrl-D to end the input.
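The two ways of testing several nodes might look like the sketch below. The file name nodes.txt is hypothetical, and the one-node-per-line layout is an assumption, since the exact input format on standard input is not spelled out here.

```shell
# nodes.txt is a hypothetical file; one node name per line is an assumption.
cat > nodes.txt <<'EOF'
planetlab-1.cs.princeton.edu
planetlab1.it.uts.edu.au
EOF

# The guard is not part of CoTest; it only skips the calls if ./cotest is absent.
if [ -x ./cotest ]; then
    # Option 1: every node named on the command line.
    ./cotest princeton_codeen planetlab-1.cs.princeton.edu planetlab1.it.uts.edu.au

    # Option 2: no nodes on the command line, so they are read from standard input.
    ./cotest princeton_codeen < nodes.txt
fi
```

Redirecting a file into standard input avoids typing the names by hand, and the end of the file takes the place of Ctrl-D.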
Finally, CoTest keeps some cached information in the directory cotest_cache in your home directory. PEPR alert information is cached for one hour, and CoMon node information is cached for five minutes. If you believe these to be stale, you may manually delete this directory.
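Clearing the cache by hand could be done roughly as follows. This assumes the directory is named exactly cotest_cache and sits directly under $HOME, as described above; adjust the path if your installation differs.

```shell
# Assumed location of the cache, based on the description above.
CACHE_DIR="$HOME/cotest_cache"

# Remove the cache only if it exists; presumably it is rebuilt on the next run.
if [ -d "$CACHE_DIR" ]; then
    rm -rf "$CACHE_DIR"
fi
```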
CoTest is currently in beta testing. Please let us know if you encounter anything that seems strange or confusing. Likewise, feel free to suggest any improvements.
Vivek Pai, with help from KyoungSoo Park and input from lots of others. We may collectively be contacted at princeton_codeen at slices.planet-lab.org
We would like to thank Neil Spring for his original "why" script, which motivated this effort. We would also like to thank Mic Bowman and the folks at Intel Oregon for PEPR, which we use as a data source.