Computational Environments

Within the Information Commons data science platform, there are two main computational environments:

  • IC AWS: The UCSF Information Commons AWS Cluster is an Apache Spark computing cluster on AWS that is pre-configured for data science research.
  • IC Wynton: The Wynton-based IC App Server is an on-premises computational environment that supports interactive data science workflows requiring GPU-based parallel processing and/or PHI compliance.

The table below lists the key features of the Information Commons and some other computational environments available at UCSF.

 

| Feature | IC AWS Shared Cluster | IC AWS SEC* | IC Wynton App Server (new) | RAE Premium | Wynton HPC | UCSF AWS SEC |
| --- | --- | --- | --- | --- | --- | --- |
| Configuration | Shared auto-scalable Spark EMR cluster (CPU) | On-demand auto-scalable Spark EMR cluster | Powerful on-premise app server, access to Wynton HPC (GPU); Spark, Dask | On-prem server with custom CPU, RAM configuration | Scalable on-prem HPC cluster (GPU) | Scalable, secure, customizable HPC cluster on UCSF Enterprise AWS cloud |
| PHI Support | | ✔︎ | ✔︎ | ✔︎ | ✔︎ | ✔︎ |
| Storage for user data** | ✔︎ | ✔︎ | ✔︎ | ✔︎ | ✔︎ | ✔︎ |
| De-identified UCSF research data (No IRB): EHR (structured) | ✔︎ | ✔︎ | ✔︎ | ✔︎* | | |
| De-identified UCSF research data (No IRB): Clinical Notes | ✔︎ | ✔︎ | ✔︎ | ✔︎* | | |
| De-identified UCSF research data (No IRB): Radiology Images (new) | | | ✔︎ | | | |
| Interactive Tools | Hue, Jupyter, RStudio | Hue, Jupyter, RStudio | Jupyter | Azure Data Studio, SAS, RStudio, Spyder, MATLAB, Jupyter, SPSS, STATA | | |
| UCSF Enterprise GitHub for Collaboration | | ✔︎ | ✔︎ | ✔︎ | ✔︎ | |

* IC AWS will be moving to UCSF Secure Enterprise AWS in the near future, which will enable PHI support.
** Storage for user data is PHI-compliant in environments marked with "PHI Support".

As shown in the table above, both Information Commons environments provide:

  • Jupyter environment for interactive data science
  • Availability of structured, de-identified electronic health records data (DeID CDW and OMOP)

Learn more about a specific environment