Silo Expansion for SMPH

Latest Updates

  • All SMPH researchers are now welcome to request SSCC accounts by filling out the new account request form. Faculty should request their accounts first; they can then sponsor accounts for others. Be sure to keep your local IT support in the loop.
  • Using Silo now includes instructions for using WinSilo, LinSilo, and CondorSilo.
  • We're reaching out to the data providers identified by our early adopters to arrange automated ways to move data into Silo. But for now, contact the Help Desk and tell us where your data are, and we'll find a way to get them into Silo.
  • CondorSilo, the HTCondor pool in the Silo environment, is now ready for use.
  • NetLogo can now run in headless mode on LinSilo.
  • Almost all of the requested research software has been installed on the new servers—see the Known Issues for exceptions.
  • Silo users may now install R packages from CRAN or Bioconductor, Python packages from PyPI, and Stata packages from SSC. Other packages will be installed by SSCC staff.
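
As an illustration of the PyPI case above, here is a minimal Python sketch that installs a package only when it is not already importable. The helper names (`is_installed`, `pip_install_command`, `ensure_package`) are hypothetical and not part of any SSCC tooling, and the sketch assumes `pip` is available to the interpreter you are running. Note that a package's import name can differ from its PyPI name, so both are passed explicitly.

```python
import importlib.util
import subprocess
import sys
from typing import List


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package: str) -> List[str]:
    """Build a pip command tied to the interpreter that is running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def ensure_package(module_name: str, package: str) -> None:
    """Install `package` from PyPI only if `module_name` is not already importable."""
    if not is_installed(module_name):
        # check=True raises CalledProcessError if pip exits with an error.
        subprocess.run(pip_install_command(package), check=True)
```

For example, `ensure_package("cutadapt", "cutadapt")` would do nothing if Cutadapt is already present and would otherwise call pip. Running pip through `sys.executable` keeps the install tied to the same Python that will later import the package, which matters when more than one Python version is installed.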

Known Issues

  • Spyder and Jupyter Notebook work on LinSilo, but the windows they create do not have the usual controls like the 'x' to close them. What's more, running them strips those controls from terminal windows and makes those terminal windows incapable of taking keyboard input (while the Spyder and Jupyter Notebook windows work just fine). Restarting X-Win32 fixes the problem. We are investigating.
  • The following software is not yet available:
    • gingko
    • Genome MuSiC
  • QIIME 2 must be installed by the user in a Conda environment.

Project Overview

The School of Medicine and Public Health (SMPH) has joined the Social Science Computing Cooperative (SSCC) to give SMPH researchers the resources to analyze big data covered by HIPAA. The collaboration uses SSCC's existing Silo secure computing environment and DoIT's RestrictedDrive secure data storage. SMPH has provided the funding for a major expansion of Silo.

SSCC is hiring several new staff members to support this expansion, including a Biomedical Research Computing Facilitator who will work directly with researchers to help them use the new resources.

SSCC is providing the computing component of the expansion:

  • 3 Interactive Linux servers (44 cores, 384 GB RAM each)
  • 2 High-Performance Linux servers (80 cores, 768 GB RAM each)
  • 9 HTCondor servers (44 cores, 384 GB RAM each)
  • 4 Windows servers (32 cores, 384 GB RAM each)
  • 768 total cores at launch

SMPH has budgeted to purchase a similar quantity of additional servers during the first year of the collaboration, with the specific servers chosen according to identified needs. These will likely include servers with GPUs.

DoIT is providing the data storage component of the expansion:

  • 400 TB initially
  • 2 PB expansion in year 1
  • 10 PB expansion in year 5
  • “Pay as you go” use of DoIT's Isilon infrastructure

Software requested thus far:

R/RStudio with the following packages:

  • bedr
  • coloc
  • corrplot
  • dplyr
  • ggplot2
  • gtools
  • MatrixEQTL
  • mediation
  • PEER
  • Biobase
  • BiocManager
  • ChIPseeker
  • DESeq2
  • edgeR
  • GenomicRanges
  • GO.db
  • iBMQ
  • WGCNA
  • pROC
  • robustbase
  • tidyverse
  • UpSetR
  • impute
  • limma
  • preprocessCore
  • qvalue
  • snpStats

Python 2.7 & 3.7, Anaconda Distribution, with the following packages:

  • Cutadapt
  • MACS2
  • Clipper
  • HTSeq

Other Software:

  • MATLAB
  • Stata
  • SAS
  • Julia
  • Java
  • NetLogo
  • Git
  • FreeSurfer
  • FSL
  • SPM
  • Samtools
  • bedtools
  • GATK
  • gingko (running locally)
  • Trim Galore!
  • TopHat
  • STAR
  • RSEM
  • Genome MuSiC
  • VarScan
  • PLINK
  • htslib
  • Eagle2
  • IMPUTE2
  • VCFtools
  • SnpEff
  • QTLtools
  • QIIME 2
  • FastQC
  • Picard
  • Burrows-Wheeler Aligner (BWA)
  • Bowtie2

If you are an SMPH researcher who anticipates using Silo, you can contact the SSCC Help Desk to request additional software.