Linstat is the SSCC's primary Linux computing cluster. Linstat combines familiar statistical software like Stata, SAS, R, and Matlab with the power of Linux, making it ideal for jobs that require more memory or computing time than Winstat can provide. Linstat also gives you access to the SSCC's HTCondor flock, where you can run multiple jobs at the same time.
Learning to run jobs on a Linux server is probably easier than you think. If you're new to Linux, be sure to read the section Getting Started on Linstat. Veteran Linux users can stop reading when they reach that point, but should read the sections before it, which describe some of the unique features of Linstat.
To log in to Linstat you'll use your SSCC username (typed in lower case) and password. If you've forgotten your password, you can reset it here.
If you are outside the United States please read Connecting to Linstat from Outside the United States.
How you'll connect to Linstat depends on what kind of computer you're connecting from:
If your computer runs Windows, we suggest you connect using a program called X-Win32 (though there are many fine alternatives). X-Win32 is already installed and configured on Winstat, so one option is to log in to Winstat and run X-Win32 there. Alternatively, you can download and install a pre-configured version of X-Win32 from the SSCC web site. Simply download the installation file and then double-click on it.
You'll be asked to log in because X-Win32 is only licensed for UW faculty, staff, and students. Just give your usual SSCC username and password.
When you run X-Win32 it will place an icon in the lower right corner of your screen.
Click on the icon once and choose Linstat.
For more details, including how to set up a connection to a particular Linstat server, see Connecting to SSCC Linux Computers using X-Win32.
If you are not on the UW-Madison campus you must establish a VPN connection to campus before using X-Win32.
Macs and Linux computers have client programs for connecting to Linux servers installed by default. Simply start a Terminal program (on a Mac it will be found under Applications, Utilities) and then type:
ssh -Y username@linstat.ssc.wisc.edu
username should be replaced by your SSCC username. If your username on your computer is the same as your SSCC username, you can leave it out (ssh -Y linstat.ssc.wisc.edu). If you are plugged into the wired network in the Sewell Social Sciences Building you can leave out the domain (ssh -Y linstat).
For more details, including how to connect to a particular Linstat server, see Connecting to Linstat from a Mac.
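The login command described above can be sketched as a small shell helper. This is only an illustration: linstat_ssh_cmd is a hypothetical function name and JDoe a made-up username, and the helper prints the command rather than running it.

```shell
# Hypothetical helper (not an SSCC tool): print the ssh command for Linstat.
# $1 = SSCC username; leave it out if it matches your local username.
linstat_ssh_cmd() {
    # SSCC usernames are typed in lower case, so lowercase whatever was given.
    user=$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')
    if [ -n "$user" ]; then
        echo "ssh -Y ${user}@linstat.ssc.wisc.edu"
    else
        # No username given: ssh falls back to your local username.
        echo "ssh -Y linstat.ssc.wisc.edu"
    fi
}

linstat_ssh_cmd JDoe   # made-up username, printed in lower case
linstat_ssh_cmd        # same username on both machines
```

Copy the printed line into your terminal to connect.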
When you connect to Linstat, you'll be directed to the least busy of the five Linstat servers (linstat1, linstat2, linstat3, linstat4, and linstat5) automatically. This will spread users among the five servers and help avoid situations where one server is much busier than another.
If you are running a long job and need to connect to the same server again to monitor it, log in to Linstat and then type ssh server, where server should be replaced by the name of the server where you started the job. Be sure to note which server you're on when you start a long job. Most people have the server name in their prompt, but if you don't you can find out which server you're using by typing printenv HOST. It's also possible to connect to a specific server directly—the links in the previous section have instructions.
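One way to remember which server a long job is on is to write it down when you start the job. The sketch below assumes a hypothetical helper called note_job_server; it reads HOST as described above and falls back to uname -n in case HOST is not set in your shell. The demonstration writes to a temporary file, but in practice you would probably pick a file in your home directory.

```shell
# Hypothetical helper: append a line recording where a job was started.
# $1 = a label for the job, $2 = the notes file to append to.
note_job_server() {
    job="$1"
    logfile="$2"
    # HOST is what "printenv HOST" shows; uname -n is a portable fallback.
    server="${HOST:-$(uname -n)}"
    echo "$(date '+%Y-%m-%d %H:%M') ${job} started on ${server}" >> "$logfile"
}

# Demonstration using a temporary notes file.
notes=$(mktemp)
note_job_server mydofile "$notes"
cat "$notes"
```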
The Linstat servers have 320GB of RAM (except for Linstat5, which has 384GB), but we've limited the amount any one person can use to 64GB. This helps prevent the servers from running out of RAM, which causes performance and stability problems. Exceptions can be made, so if you need more than 64GB of RAM contact the Help Desk.
/ramdisk is a special "directory" that is actually stored in RAM, making it extremely fast. The maximum size of /ramdisk is 32GB, and any files that are not in use will be deleted after one hour. /ramdisk can be very helpful for programs that spend a lot of time reading and writing temporary files.
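A script that writes temporary files might use /ramdisk along these lines. This is a minimal sketch: the myjob prefix and the cleanup_scratch function are made-up names, and the fallback to /tmp is an assumption added so the script also works on machines without /ramdisk.

```shell
# Use /ramdisk when it exists and is writable; otherwise fall back to /tmp.
if [ -d /ramdisk ] && [ -w /ramdisk ]; then
    scratch_base=/ramdisk
else
    scratch_base="${TMPDIR:-/tmp}"
fi

# Create a private scratch directory ("myjob" is a hypothetical prefix).
scratch=$(mktemp -d "${scratch_base}/myjob.XXXXXX")
echo "Temporary files go in: ${scratch}"

# Call this when the job is done so the space is freed right away.
cleanup_scratch() {
    rm -rf "$scratch"
}
```

Since idle files in /ramdisk are deleted after an hour, copy anything you want to keep back to your home or project directory before the job ends.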
We have a small number of Stata MP16 licenses, which are ideal for running computationally intensive do files. Do files run using the stata and condor_stata commands will be run using Stata MP16, though some HTCondor servers only have 8 cores and Stata MP16 will automatically adapt accordingly.
On Linstat, the default directory where SAS stores temporary data sets (the WORK library) is /ramdisk. This increases the speed of data-intensive programs significantly. It also prevents them from slowing down the entire server due to disk I/O bottlenecks.
If you need more than 32GB of temporary space, change the WORK directory to /tmp. You can do so by adding the -work option to your SAS command:
sas -work /tmp myprogram
You'll then be able to use up to 243GB of space (or as much of it as is available at the time). For more details see Running Large SAS Jobs on Linstat.
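The choice between the default WORK location and /tmp can be sketched as follows. sas_command is a hypothetical helper that only prints the command to run, and the space estimate in whole gigabytes is something you supply yourself.

```shell
# Hypothetical helper: print the SAS command suited to a job's temporary space.
# $1 = program name, $2 = estimated temporary space needed, in whole GB.
sas_command() {
    program="$1"
    needed_gb="$2"
    if [ "$needed_gb" -gt 32 ]; then
        # Too big for /ramdisk (32GB maximum), so put WORK in /tmp.
        echo "sas -work /tmp ${program}"
    else
        # The default WORK library in /ramdisk is large enough.
        echo "sas ${program}"
    fi
}

sas_command myprogram 20
sas_command myprogram 100

# See how much space is currently free in /tmp (sizes in kilobytes).
df -k /tmp
```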
The SSCC's HTCondor flock contains CPUs and is ideal for running very long statistical programs or for running multiple jobs at the same time. HTCondor can run Stata, SAS, Matlab, R, and Mplus jobs as well as user-written programs. We've written scripts that make submitting jobs to HTCondor very easy; see An Introduction to HTCondor for instructions. (You can also submit Stata jobs to the HTCondor flock via the web.)
Due to licensing restrictions, Mplus is only installed on Linstat1, Linstat2, and Linstat3, and may only run one job at a time on each server. Because of the unusual way Mplus launches additional terminal sessions you'll need to stay logged in the entire time the program is running. Running Mplus on Linstat has more details.
Linux can be intimidating because it just waits for you to type commands without giving you any menus or icons to suggest what you can do. But if all you want to do is run jobs, you can get by with just a couple of Linux commands. Here's how:
If you're on Winstat or a Windows PC that logs into the SSCC's PRIMO domain, your Linux home directory is available as the Z: drive, and Linux project directories are the V: drive. They're also available from Macs—see Using SSCC Network Disk Space from Macs. This means you can write your program, manage your files, etc. using the tools you're familiar with and still put the programs and related files on the Linux file system so Linstat can run them.
Put all the files relating to a given project in a single folder (or directory in Linux terminology), then write your programs on the assumption that that folder will be your working directory (e.g. a Stata program should say use datafile, not use z:\research\datafile). If you're only working on a single project then just declare Z: itself that project's "directory."
When you log into Linux, your "current working directory" (where you "are" in the file system) starts out as your home directory—what Windows calls Z:. If that's where your project's files are, you can skip directly to running your job. Otherwise you'll need to go to your project's directory using the cd ("change directory") command. If your project's directory is on your Z: drive, type:
cd myProject
Where myProject should be replaced by the name you gave your project's directory.
If your project's directory is inside an official Linux project directory on the V: drive, type:
A few more points on the Linux file system:
The command to run your program will depend on the program you want to use. Here are some of the most popular:
You can start Stata's graphical user interface by typing xstata or xstata-mp. You can also run a do file called mydofile.do in batch mode by typing:
stata -b do mydofile
Alternatively you can submit it to HTCondor with:
If you run mydofile.do in batch mode or on HTCondor, Stata will automatically log its output in mydofile.log.
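Because batch output goes to the log rather than the screen, it can help to check the log when the job finishes. The sketch below uses two hypothetical helpers. It assumes Stata's usual convention of ending a failed command with a return code line such as r(601); so treat the pattern as an assumption to verify against your own logs.

```shell
# Hypothetical helper: name of the log Stata writes for a given do file.
stata_log_name() {
    echo "$(basename "$1" .do).log"
}

# Hypothetical helper: report whether a log contains a Stata return code line.
# Assumes errors show up as lines like "r(601);" (an assumption, see above).
check_stata_log() {
    if grep -Eq '^r\([0-9]+\);' "$1"; then
        echo "error code found in $1"
        return 1
    fi
    echo "no error codes found in $1"
}

stata_log_name mydofile.do
```

After a batch run you could then type check_stata_log mydofile.log to see whether the do file ran to completion.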
You can start SAS's graphical user interface by typing sas, though it's somewhat clunkier than the Windows version. You can also run a program called myprogram.sas in batch mode by typing:
sas myprogram
To run R, simply type R. It does not have a graphical user interface but the commands are the same as in Windows R.
To run an R program in batch mode, type:
R < myprogram.R > myprogram.log
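A slightly more defensive version of the batch command is sketched below. It rests on two points that are not from this article: a stock R installation refuses to read a script from redirected input unless --save, --no-save, or --vanilla is given (whether Linstat's R is configured differently is not stated here), and adding 2>&1 sends error messages to the log along with the regular output. r_log_name and r_batch are hypothetical helper names.

```shell
# Hypothetical helper: log file name for an R program (myprogram.R -> myprogram.log).
r_log_name() {
    echo "${1%.R}.log"
}

# Hypothetical helper: run an R program in batch mode, logging output and errors.
r_batch() {
    program="$1"
    log=$(r_log_name "$program")
    if ! command -v R >/dev/null 2>&1; then
        echo "R not found on this machine" >&2
        return 127
    fi
    # --no-save: do not write the workspace to .RData when the script ends.
    R --no-save < "$program" > "$log" 2>&1
}

r_log_name myprogram.R
```

With the functions defined, r_batch myprogram.R runs the program and leaves everything in myprogram.log.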
To submit myprogram.R to HTCondor and save the output to myprogram.log, type:
condor_R myprogram.R myprogram.log
If your job uses multiple processors, type:
condormp_R program.R program.log &
You can start Matlab's graphical user interface by typing matlab. To submit myprogram.m to HTCondor and save its output in myprogram.log, type:
condor_matlab myprogram.m myprogram.log
If your job uses multiple processors, type:
condormp_matlab program.m program.log &
To run an Mplus job, log into Linstat1, Linstat2, or Linstat3, and type:
where myprogram.inp should be replaced by the name of the Mplus program (the .inp file) you want to run.
Linstat has many other programs available (see our software database). See the documentation of the program you're interested in for details on how to run it.
If your job will run for a long time, put it "in the background" by adding an ampersand (&) to the end of the command. For example:
stata -b do mydofile &
This will allow you to do other things on Linstat while the job is running, or log off without interrupting the job. Jobs submitted to HTCondor are essentially "in the background" already.
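To see how a background job behaves, here is a short sketch in which sleep 1 stands in for a real long-running command such as stata -b do mydofile. The variable names are made up for the example.

```shell
# "sleep 1" is a stand-in for a long job such as: stata -b do mydofile
sleep 1 &

# $! holds the process ID of the most recent background job.
job_pid=$!
echo "Job is running in the background with PID ${job_pid}"

# The prompt is free, so other commands can run in the meantime.
echo "Doing other work while the job runs..."

# Optionally wait for the job to finish and pick up its exit status.
wait "$job_pid"
job_status=$?
echo "Job finished with exit status ${job_status}"
```

In everyday use you would usually skip the wait and simply check back later.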
For more information see Managing Jobs on Linstat.
While this will get you started, there are several other SSCC Knowledge Base articles you can read to become a more flexible and efficient Linstat user. Managing Jobs on Linstat will teach you how to monitor and manage jobs while they run. An Introduction to HTCondor will teach you more about the SSCC's HTCondor flock and how to use it. Finally, if you really want to make yourself at home in Linux, read the SSCC's Getting Started in Linux. For a full list of articles, visit the Linux section of our Knowledge Base. SSCC staff will also be happy to answer any questions you have about using Linstat and help you solve any problems you run into—just contact the Help Desk.
Last Revised: 3/20/2017