
SSCC News for November 2003

Inside this issue...

Printer Changes in 2470 and Computer Room
A New Option for Filtering Spam Email
Condor Usage Hits Milestone
New Server Status Web Page
New SSCC Publications
Tip of the Month: Printing Troublesome PDF Files


Printer Changes in 2470 and Computer Room

During next Wednesday's 7-9 a.m. downtime, we will replace the two HP printers in the 2470 terminal room with a Lexmark printer like the one in the I/O room. One of the HP printers will then replace the printer inside the Computer Room.

After Wednesday's changes, you will need to delete and recreate any Windows print shares you currently have defined for SSCC2470 or SSCC_COMP before you can print to these printers, because they require different print drivers. Deleting and recreating a print share takes only a minute. Instructions are provided in SSCC Publication, Setting Up Network Printers in Windows.

The printer in the Computer Room is intended for people who need to print special documents that require assistance from the Operator (such as final copies of dissertations on special paper) or who want more privacy than the other public self-serve printers provide. Operators remove the printout from this printer and file it in the rack in the I/O room. This does not guarantee confidentiality, but it does reduce the likelihood of others viewing your output.


A New Option for Filtering Spam Email

More and more spam is being sent to anyone with an email account, and more people are getting fed up with it and want a relatively easy solution. SpamBouncer, the program SSCC has offered for some time now, does a good job with most spam but has a problem with false positives (legitimate email misidentified as spam). For this reason, SSCC has been looking at another program called SpamAssassin. Aside from sounding tougher, this program has a Bayesian component which learns what you consider spam and "ham" (legitimate email), and modifies its filters accordingly (see http://www.paulgraham.com/spam.html for a discussion of how Bayesian spam filters work). Even during this learning process you will see fewer or no false positives, though a bit more spam may sneak into your inbox. Once the Bayesian filter kicks in, the results are nearly perfect.
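To give a feel for what "learning" means here, the following is a minimal sketch of a naive Bayes text classifier in Python. It is a toy written for this newsletter, not SpamAssassin's actual algorithm or code: the class name ToyBayesFilter, its methods, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


class ToyBayesFilter:
    """Toy naive Bayes filter (hypothetical; not how SpamAssassin is implemented)."""

    def __init__(self):
        # Per-label word frequencies and message counts learned so far.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Record the words of one message the user has labeled 'spam' or 'ham'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate the probability (0 to 1) that a message is spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Prior: how common this label is, with add-one smoothing.
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            score = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Smoothed likelihood, so unseen words never give a zero.
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_score[label] = score
        # Turn the two log scores into a single probability of spam.
        return 1.0 / (1.0 + math.exp(log_score["ham"] - log_score["spam"]))
```

Before any training, every message scores exactly 0.5, which is why a filter of this kind needs a learning period. After it has been shown a few messages of each type, messages made of words it has seen mostly in spam score above 0.5, and messages made of words it has seen mostly in legitimate mail score below 0.5.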

SpamAssassin, like SpamBouncer, moves your spam email to a mailbox called "spam" rather than deleting anything. You will still want to check that mailbox periodically, especially during the learning period, to make sure nothing was misfiled as spam. SpamAssassin can use the same .nobounce file that SpamBouncer does to list the addresses whose mail you want to be absolutely sure gets through, so you will not have to set up a new file.

Setting up SpamAssassin is quick and easy, and instructions are provided in SSCC Publication, Filtering Your Email with SpamAssassin. Please contact the SSCC consultant if you have any questions about the program, setting it up, or changing your settings.


Condor Usage Hits Milestone

In the last couple of weeks we hit a milestone: demand for cycles on Condor exceeded the CPUs it has available. SSCC staff consider this good news and a sign of Condor's success. For the uninitiated, Condor is a powerful batch processing system, developed at UW-Madison's Computer Science Department, that automatically schedules, checkpoints and allocates computations on a set of Linux computers (currently 13 CPUs). We plan to add more CPUs next year using Capital Exercise funds. Meanwhile, we've made a significant change in how Condor runs.

First, an explanation of what happened: Condor is designed for maximum throughput. Having all its CPUs busy all the time is one of its design goals. So Condor puts no limits on how many jobs a user can submit, and will run as many as it can. But it also has a priority system to make sure everyone gets their fair share of Condor's resources. Priority depends on recent usage; if you use Condor a lot, your priority will be lower than that of someone who hasn't used Condor for a while. If Condor gets a job and there are no available CPUs, it will preempt a job belonging to a user with lower priority.

Here's where the problem arose: the designers of Condor were mostly interested in running programs they had written themselves in C/C++, FORTRAN, etc. If these user-written programs are compiled using the Condor libraries and thus run in the "standard universe," they can be "checkpointed." If they are interrupted, they save all their work and can pick up right where they left off, even on a different computer. But other programs, like R or Stata, run in the "vanilla universe" and cannot be checkpointed. If they are preempted, they have to start over from the beginning. So for a standard universe job, being preempted is an inconvenience. For a vanilla universe job, it is a disaster.

Condor's priority system works extremely well if most of the jobs are in the standard universe. But here at the SSCC most jobs are vanilla. We've seen the consequences for the first time in the last couple of weeks. One user would start a long job, and their priority would gradually drop because they were using Condor resources. Then another user would submit a job, and since no CPUs were available, that job would preempt the first user's job. But then the second user's priority would drop while the first user's priority rose. Eventually the first user would regain the higher priority and preempt the second user's job, only to be preempted yet again when their own priority dropped. In theory, with long jobs, you could have a situation where nothing ever got to run long enough to actually finish, and some of you have come pretty close to just that.
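The cycle described above can be reproduced with a small simulation. This is a minimal sketch under simplifying assumptions, not Condor's real scheduler: there are two users (A and B) sharing one CPU, each with one vanilla job, and the priority model (usage that grows by 1 per tick of running, decays by 10% per tick, and triggers preemption when the gap exceeds a threshold) is invented for illustration. The function name simulate and all of its parameters are hypothetical.

```python
def simulate(job_length, preemption, max_ticks=500, decay=0.9, threshold=3.0):
    """Toy model: users A and B each have one vanilla job and share one CPU.

    Returns the tick at which each user's job finished, or None if it
    never finished within max_ticks.
    """
    usage = {"A": 0.0, "B": 0.0}      # recent usage; more usage = worse priority
    progress = {"A": 0, "B": 0}       # uninterrupted ticks of work done so far
    finished_at = {"A": None, "B": None}
    running = "A"                     # user A submits first and gets the CPU

    for tick in range(1, max_ticks + 1):
        unfinished = [u for u in usage if finished_at[u] is None]
        if not unfinished:
            break
        waiting = [u for u in unfinished if u != running]
        if running is None:
            # The CPU is free, so the next unfinished job simply starts.
            running = waiting[0]
        elif preemption and waiting:
            challenger = waiting[0]
            # Preempt once the running user has used much more than the waiter.
            if usage[running] - usage[challenger] > threshold:
                progress[running] = 0  # a vanilla job loses all of its work
                running = challenger
        progress[running] += 1
        usage[running] += 1.0
        for user in usage:
            usage[user] *= decay       # everyone's past usage fades over time
        if progress[running] == job_length:
            finished_at[running] = tick
            running = None
    return finished_at
```

With these made-up numbers, simulate(20, preemption=True) reports that neither job ever finishes, because each one is interrupted and restarted before it can complete 20 ticks of work. With simulate(20, preemption=False), job A finishes at tick 20 and job B at tick 40. Short jobs are unaffected: the thrashing only appears when jobs run longer than the interval between preemptions.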

To prevent this, we have turned off job preemption completely. No job will ever be interrupted once it starts running. This solves the problem but raises another: one user could easily tie up the entire Condor flock for as long as their jobs take to run. We are looking for alternative ways to allocate Condor resources fairly, but for now we must depend on your courtesy to your fellow Condor users. We ask that you limit yourself to no more than about six jobs in the Condor queue at any one time. With 13 CPUs in the flock, it will then take at least three users to tie up the entire flock. We are also hoping to add some new and faster servers to the Condor flock, but of course that depends on the money becoming available.
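The arithmetic behind the six-job guideline is a simple ceiling division, sketched below in Python. The function name min_users_to_fill is hypothetical, and the sketch assumes each job occupies exactly one CPU.

```python
import math


def min_users_to_fill(total_cpus, jobs_per_user):
    """Smallest number of users needed to occupy every CPU,
    assuming one CPU per job and a per-user cap on jobs."""
    return math.ceil(total_cpus / jobs_per_user)


# 13 CPUs with a six-job limit: two users cover only 12 CPUs, so it takes three.
users_needed = min_users_to_fill(13, 6)
```

If the flock grows, the same calculation shows how the guideline scales. For example, under the same assumptions a 24-CPU flock with a six-job limit would take four users to fill.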

For more information on Condor, see SSCC Publication, An Introduction to Condor.


New Server Status Web Page

The server status web page has been updated. It now includes the Windows Terminal Servers, and provides more useful information about all the servers and how busy they are. If you use the less busy servers, not only will your jobs run faster, but you will also help balance the load between servers so that everyone's jobs can run faster.


New SSCC Publications

Writing a CD Using Ahead Nero

CDs are an excellent way to back up data or transfer files from place to place. This publication will tell you how.

Configuring Outlook Express to Read SSCC Email

If Outlook Express is your preferred email program, this publication will tell you how to set it up for SSCC email.

Configuring Email Programs to Read SSCC Email

We have complete instructions for configuring Eudora and now Outlook Express to read SSCC email. And of course PINE and Squirrel need no configuration. But if you prefer a different email program, this publication has all the settings you need.


Tip of the Month: Printing Troublesome PDF Files

Sometimes PDF files do not print correctly. If you receive only the yellow banner page and perhaps one blank page, and the rest of your document does not print, try one of the PostScript queues available for the printers in the I/O room or the 2470 terminal room: SSCC4411_PS or SSCC2470_PS. Instructions for adding printer queues are provided in SSCC Publication, Setting Up Network Printers in Windows.



© 2003 University of Wisconsin Social Science Computing Cooperative