Build Your Own Rocks Cluster
You should have the following parts:
- 4 single-processor Athlon PCs
- 5 network cards (already installed)
- 4 network cables
- 1 fast ethernet switch and its power adapter
- 1 keyboard
- 1 mouse
- 1 monitor
- 1 power strip
- 5 power cables
- 3 Rocks CD-ROMs (from www.rocksclusters.org)
- 1 TCB Cluster CD-ROM (NAMD examples and binaries)
- 3 floppy disks
Part 1: Frontend (Master Node) Installation
- Plug the monitor into the power strip and turn it on.
- Find the machine with two network cards; this is the master node,
and should be the leftmost computer.
- Plug the master node into the power strip and connect the monitor,
keyboard, and mouse.
- Power on the master node and insert the Rocks Kernel CD into the
CD-ROM drive, then press the reset switch.
If you wait too long and your machine starts booting off of the
hard drive, just press the reset button to make it boot from the CD-ROM.
If your machine still insists on booting from the hard drive you may
need to modify its BIOS settings.
- As soon as the boot menu appears, type "frontend" to boot the
frontend installation. If you wait too long without pressing any
keys, it will instead attempt a compute-node installation. If this
happens, simply restart the machine and try again.
- The first screen to appear should list the available "rolls,"
with the first four already selected (ganglia, kernel,
hpc, base). Select "sge" in addition to these and select
"OK."
"SGE" or the "Sun Grid Engine" is what allows
us to queue jobs on the cluster.
- The installer will ask you if you have another roll CD-ROM to load.
Choose "Yes" to process another CD.
- Insert OS Disc 1 into the drive and press Enter.
- Repeat for OS Disc 2.
- Once all three CDs are loaded, select "No" in order
to continue. The current CD will be ejected, and you might have
to wait several minutes for the next step as the computer
silently processes.
- You should be presented with various fields to input. The only
required field is "Fully Qualified Hostname." Enter the full
host and domain name of the master node as given to you by the
instructor. The other fields can be left at their defaults or
replaced with any arbitrary data you'd like. This is completely
optional, and only used internally. Select "OK" when done.
- You will then be prompted whether you want the installer to
autopartition your drive, or whether you'd rather set up the
partitions yourself using Disk Druid. Select "Autopartition."
- Next you'll configure both ethernet cards of your frontend.
"Eth0" is used for the private network, and its settings can be
left default. "Eth1" is for the outside network, and you should
input the IP Address given to you by your instructor. The
default netmask should be fine. Select "OK" when done
with each interface.
- Enter these settings:
Gateway: 130.126.120.1
Primary DNS: 130.126.120.32
Secondary DNS: 130.126.120.33
Tertiary DNS: 130.126.116.194
and select "OK" to continue.
Note: these values are specific to our network. If you
want to set up your own cluster later on, you'll have to get
these addresses from your local sysadmin (which might be
you!).
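Under the hood, these values end up in ordinary Linux config files on the frontend. Assuming the addresses above, /etc/resolv.conf should look roughly like this after installation (the search line will reflect your own domain):

```
search your.domain.edu
nameserver 130.126.120.32
nameserver 130.126.120.33
nameserver 130.126.116.194
```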
- On the next screen, select "America/Chicago" as the
timezone, and change the network timeserver to
"timehost.ks.uiuc.edu."
Select "OK" to continue.
- You will then be prompted for your root password. This can be
any password agreed upon by your group for this exercise, but
should be extremely secure for a live server in the real world.
Enter this password twice (and write it down somewhere) and select
"OK" to continue.
- Now begins the copying process. The installer will prompt you
for CD changes as necessary, and copy their contents to a temporary
location on the hard drive. You should only have to provide each
CD once.
- When the three CDs have been copied over, the installer will
merge the rolls on the hard drive and begin the actual install.
While this is happening, feel free to start setting up the rest
of the nodes. Each node needs a power cable connected to your
power strip and a network cable going from their only* network
card to the mini switch.
When done, the system will automatically reboot.
*Note: your master node has two ethernet cards. Connect
the Intel card to the mini switch and the 3COM card to our larger
central switch, using the long network cable provided.
Part 2: Configuring the Frontend
- The first time you log in as root, you'll be prompted about
setting up ssh keys. Press Enter three times to accept the
default location and enter (and confirm) a blank password for
the key pair.
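Pressing Enter through those prompts is equivalent to running ssh-keygen non-interactively with an empty passphrase. A sketch, writing to a temporary directory here so it doesn't clobber any real keys:

```shell
# Generate an RSA key pair with an empty passphrase (-N ""),
# much as the first root login does for /root/.ssh/id_rsa.
dir=$(mktemp -d)
ssh-keygen -q -t rsa -N "" -f "$dir/id_rsa"
ls "$dir"    # id_rsa (private key), id_rsa.pub (public key)
```

The blank passphrase is what lets the frontend ssh to the slave nodes later without prompting.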
- The normal Rocks distribution is lacking a single library needed
for NAMD, a program we'll use later. To fix this, follow these
steps:
cd /home/install/site-profiles/4.0.0/nodes
wget http://www.ks.uiuc.edu/~beacham/extend-compute.xml
cd /home/install
rpm -i rocks-dist/lan/i386/RedHat/RPMS/compat-libstd*
rocks-dist dist
This will install the "compat-libstdc++-33" package and rebuild
the Rocks distribution so it automatically installs on all slave
nodes. A script that automates this step is also available at
http://www.ks.uiuc.edu/~beacham/namdrocks.sh.
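For reference, the extend-compute.xml file downloaded above is a small XML fragment telling the Rocks installer to add packages to every compute node's kickstart. Its core is roughly this (a sketch, not the exact file contents):

```
<?xml version="1.0" standalone="no"?>
<kickstart>
  <!-- Extra package to install on every compute node -->
  <package>compat-libstdc++-33</package>
</kickstart>
```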
- Once that finishes, create a normal user for your group to use (you
don't always want to use root) by running "useradd
username." This will set up a default account with
a blank password. Run "passwd username" to set
the password for the account.
- Test this account by trying to log in as the new user on a different
terminal: Press Ctrl+Alt+F2 to switch to tty2, log in
as the new user, exit, then Ctrl+Alt+F1 to switch back to
tty1.
- Back at your root login, run "insert-ethers" to start
detecting any nodes that boot.
- Press Enter to select "Compute" as the type of node to listen
for.
- Make sure your slave nodes are connected and have one of the
Rocks kernel CDs (the one you first booted with) inserted.
Boot them all now, and you'll see them appear on the screen
as they connect to the master node. The CD only needs to be
in the drive a matter of seconds in order to load the PXE boot
system (for network booting). Most modern motherboards have
PXE support built in, making even the CDs unnecessary, but these
specific computers are just old enough to require them. Make sure
the CDs are then removed, since the
installation program will automatically reboot the computers
afterwards (and we don't want them installing in an infinite
loop). You can also use disposable floppies instead of CDs,
which can easily be generated at http://www.rom-o-matic.net.
The downside of using floppy disks is that they can usually only
hold one set of network card drivers at a time, so if you have
many slaves all using different network cards, you'll have to
make many unique boot disks. A CD, however, can hold all of the
network card drivers supported by Linux.
- Each slave entry should contain an empty "( )", which will be replaced
by "(*)" when the node properly requests its kickstart file.
When all of the nodes have done this, press F10 to exit the
insert-ethers program.
- Each node will do a full installation and then reboot.
What's happening: When each slave boots, it accesses the network
and searches for the master node. The master then sends an entire Linux
distribution over to the slave to be installed. This makes the slaves
easier to maintain, since you can easily swap nodes in and out with
replacements, which are automatically installed and configured without
any help from you.
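Once the slaves finish installing, the passwordless ssh keys set up earlier let root run commands on them directly. A sketch, assuming the Rocks default node names compute-0-0 through compute-0-2 for your three slaves (Rocks also ships a cluster-fork helper that does this for you):

```shell
# Run the same command on each compute node over ssh.
# The "|| echo" keeps the loop going if a node is still installing.
CMD="uptime"
for n in 0 1 2; do
    host="compute-0-$n"
    echo "== $host =="
    ssh -o BatchMode=yes -o ConnectTimeout=3 "$host" "$CMD" \
        || echo "(no response from $host)"
done
```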
Part 3: Checking Your Cluster Status
Rocks doesn't use bproc like Clustermatic does, so you can't use
bpstat, but it has an incredibly powerful web interface.
We'll configure and boot into X so that we can properly view this.
- While logged in as root, run "system-config-display" in order
to initially set up your X config file.
- Using the mouse (finally!), set the resolution to 1024x768
and the color depth to "Millions of Colors," and click
"OK."
- Back at the root prompt, run "startx" to actually start
your X session.
- Once X boots, start Firefox by navigating to "Applications >
Internet > Firefox Web Browser." Its homepage should default
to "http://localhost/," so it should already display the Rocks
web interface.
- Feel free to explore the menus of the web interface. You can
generate graphs, check the health of your cluster, and view
job submissions to the SGE queue. Most of these entries will
be blank now, but you can watch the "Cluster Status (Ganglia)"
page to see when your nodes are fully up. A listing of each
individual node and its status is at the bottom of the page.
- For the rest of the activities, you can either stay in X and use
a virtual terminal (Right-click on the Desktop, "Open Terminal"),
or log out of X (Actions > Log Out) to return back to your
console session.
See Also
Rocks web site (www.rocksclusters.org)