Silo Expansion for SMPH

Latest Updates:

  • Almost all of the requested research software has been installed on the new servers.
  • We're reaching out to the data providers identified by our early adopters to arrange automated ways to move data into Silo.
  • Silo users may now install R packages from CRAN, Python packages from PyPI, and Stata packages from SSC. We're working on giving the same access to Bioconductor. Other packages will be installed by SSCC staff.
  • The first early adopters from SMPH will be given access shortly. If you are an SMPH researcher who is interested in being an early adopter, first speak with your local IT support within SMPH, and then contact the SSCC Help Desk. Instructions for early adopters are in Welcome to the Expanded Silo!
  • We have switched Silo's multi-factor authentication to Duo. You'll use the same app or token to log in to Silo as you do to log in with your NetID.
  • Servers are scheduled to go into production at the end of March!

Known Issues

  • HTCondor is not yet available
  • Spyder and Jupyter Notebook work on LinSilo, but the windows they create do not have the usual controls (like the 'x' to close them). What's more, running them strips those controls from terminal windows and makes them incapable of taking keyboard input. Restarting X-Win32 fixes the problem. We are investigating.
  • The following software is not yet available:
    • Ginkgo
    • STAR
    • Genome MuSiC
    • VarScan
    • QIIME 2

Project Overview

The School of Medicine and Public Health (SMPH) has joined the Social Science Computing Cooperative (SSCC) to give SMPH researchers the resources to analyze large data sets covered by HIPAA. The collaboration builds on SSCC's existing Silo secure computing environment and DoIT's RestrictedDrive secure data storage. SMPH has provided the funding for a significant expansion of Silo.

SSCC is hiring several new staff members to support this expansion, including a Biomedical Research Computing Facilitator who will work directly with researchers to help them use the new resources.

SSCC is providing the computing component of the expansion:

  • 3 Interactive Linux servers (40 cores, 384 GB RAM each)
  • 2 High-Performance Linux servers (80 cores, 768 GB RAM each)
  • 9 HTCondor servers (40 cores, 384 GB RAM each)
  • 4 Windows servers (32 cores, 384 GB RAM each)
  • 768 total cores at launch
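
As a quick check of the figures above, the launch totals can be recomputed from the per-server numbers. This is only an illustrative sketch: the counts, cores, and RAM values are copied from the list, while the variable names and the tuple layout are assumptions made for the example, not an SSCC inventory format.

```python
# Server specs copied from the list above.
# The tuple layout and names are illustrative only.
servers = [
    # (description, count, cores per server, RAM in GB per server)
    ("Interactive Linux", 3, 40, 384),
    ("High-Performance Linux", 2, 80, 768),
    ("HTCondor", 9, 40, 384),
    ("Windows", 4, 32, 384),
]

# Multiply each per-server figure by the number of servers, then sum.
total_cores = sum(count * cores for _, count, cores, _ in servers)
total_ram_gb = sum(count * ram for _, count, _, ram in servers)

for name, count, cores, ram in servers:
    print(f"{name}: {count * cores} cores, {count * ram} GB RAM")

print(f"Total: {total_cores} cores, {total_ram_gb} GB RAM")
```

The core total comes to 768, matching the figure in the list; the same arithmetic gives 7,680 GB of RAM across all servers at launch.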

SMPH has budgeted to purchase a similar quantity of additional servers during the first year of the collaboration. Which servers are purchased will depend on identified needs, and they will likely include servers with GPUs.

DoIT is providing the data storage component of the expansion:

  • 400 TB initially
  • 2 PB expansion in year 1
  • 10 PB expansion in year 5
  • “Pay as you go” use of DoIT's Isilon infrastructure

Software requested thus far:

R/RStudio with the following packages:

  • bedr
  • coloc
  • corrplot
  • dplyr
  • ggplot2
  • gtools
  • MatrixEQTL
  • mediation
  • PEER
  • Biobase
  • BiocManager
  • ChIPseeker
  • DESeq2
  • edgeR
  • GenomicRanges
  • GO.db
  • iBMQ
  • WGCNA
  • pROC
  • robustbase
  • tidyverse
  • UpSetR
  • impute
  • limma
  • preprocessCore
  • qvalue
  • snpStats

Python 2.7 & 3.7, Anaconda Distribution, with the following packages:

  • Cutadapt
  • MACS2
  • Clipper
  • HTSeq

Other Software:

  • MATLAB
  • Stata
  • SAS
  • Julia
  • Java
  • NetLogo
  • Git
  • FreeSurfer
  • FSL
  • SPM
  • Samtools
  • bedtools
  • GATK
  • Ginkgo (running locally)
  • Trim Galore!
  • TopHat
  • STAR
  • RSEM
  • Genome MuSiC
  • VarScan
  • PLINK
  • htslib
  • Eagle2
  • IMPUTE2
  • VCFtools
  • SnpEff
  • QTLtools
  • QIIME 2
  • FastQC
  • Picard
  • Burrows-Wheeler Aligner (BWA)
  • Bowtie2

If you are an SMPH researcher who anticipates using Silo, you can contact the SSCC Help Desk to request additional software.