Within the Information Commons data science platform, there are two main computational environments:
- IC AWS: The UCSF Information Commons AWS Cluster is an Apache Spark computing cluster on AWS that is pre-configured for data science research.
- IC Wynton: The Wynton-based IC App Server is an on-premises computational environment for interactive data science workflows that require parallel processing (GPU), PHI compliance, or both.
The table below lists the key features of the Information Commons and some other computational environments available at UCSF.
| | IC AWS (shared) | IC AWS (on-demand) | New! IC Wynton | RAE Premium | Wynton HPC | UCSF AWS SEC |
|---|---|---|---|---|---|---|
| Configuration | Shared auto-scalable Spark EMR cluster (CPU) | On-demand auto-scalable Spark EMR cluster | Powerful on-premise app server, access to Wynton HPC (GPU); Spark, Dask | On-prem server with custom CPU, RAM configuration | Scalable on-prem HPC cluster (GPU) | Scalable, secure, customizable HPC cluster on UCSF Enterprise AWS cloud |
| PHI Support | ✔︎ | ✔︎ | ✔︎ | ✔︎ | ✔︎ | |
| Storage for user data** | ✔︎ | ✔︎ | ✔︎ | ✔︎ | ✔︎ | ✔︎ |
| Access to de-identified UCSF research data assets (No IRB) | | | | | | |
| EHR (structured) | ✔︎ | ✔︎ | ✔︎ | ✔︎* | | |
| Clinical Notes | ✔︎ | ✔︎ | ✔︎ | ✔︎* | | |
| New! Radiology Images | ✔︎ | | | | | |
| Interactive Tools | Hue, Jupyter, RStudio | Hue, Jupyter, RStudio | Jupyter | Azure Data Studio, SAS, RStudio, Spyder, MATLAB, Jupyter, SPSS, STATA | | |
| UCSF Enterprise GitHub for Collaboration | ✔︎ | ✔︎ | ✔︎ | ✔︎ | | |
\* IC AWS will be moving to UCSF Secure Enterprise AWS in the near future, which will enable PHI support.

\*\* Storage for user data is PHI-compliant in environments marked with "PHI Support".
As the table above shows, both Information Commons environments provide:
- Jupyter environment for interactive data science
- Availability of structured, de-identified electronic health records data (DeID CDW and OMOP)
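
The Jupyter overlap noted above can be checked against the "Interactive Tools" row of the table. The following is a minimal sketch, not an official UCSF utility: the dictionary `INTERACTIVE_TOOLS` and the helpers `environments_with` and `shared_tools` are hypothetical names introduced here for illustration, and the data is only what that table row lists (the two IC AWS columns list the same tools, so they are merged into one entry).

```python
# Interactive tools per environment, transcribed from the "Interactive Tools"
# row of the comparison table. All names here are illustrative assumptions,
# not part of any UCSF API. Environments with no tools listed are omitted.
INTERACTIVE_TOOLS = {
    "IC AWS": {"Hue", "Jupyter", "RStudio"},
    "IC Wynton": {"Jupyter"},
    "RAE Premium": {
        "Azure Data Studio", "SAS", "RStudio", "Spyder",
        "MATLAB", "Jupyter", "SPSS", "STATA",
    },
}


def environments_with(tool):
    """Return the sorted names of environments whose tool list includes `tool`."""
    return sorted(name for name, tools in INTERACTIVE_TOOLS.items() if tool in tools)


def shared_tools(*names):
    """Return the sorted tools that every named environment offers."""
    return sorted(set.intersection(*(INTERACTIVE_TOOLS[n] for n in names)))


if __name__ == "__main__":
    print(environments_with("Jupyter"))
    print(shared_tools("IC AWS", "IC Wynton"))
```

Keeping the row as a plain dictionary of sets makes overlap questions a one-line set intersection; the entries would need to be updated by hand if the table changes.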