Early Results with Longleaf and Pine
As part of our pre-general-release work, we have worked with several groups whose workloads fit the profile for which we designed Longleaf and Pine.
For example, we ported a pipeline used in an ongoing Epidemiology project, which has been running on KillDevil. The first step, which estimates the variance component of a linear mixed model, takes a little less than 15 minutes on Longleaf, compared with 45 minutes on KillDevil. That is about three times faster, and it isn’t the most data-intensive part; it’s just the first part. It will be interesting to see the gains for the later steps.
Here are some initial comments from a few of our early users.
Mark Psioda, Biostatistics:
“My research is focused on the problem of Bayesian clinical trial design. Bayesian approaches seek to incorporate external information into the design of new trials in order to decrease their size and/or bring about their conclusion more quickly. This translates to getting effective new therapies out to patients more quickly. One must evaluate “how much” information can be incorporated into the design through a simulation-based process that would literally take months and months on a single computer. Through my use of the Longleaf computing cluster, I am able to turn months into days and sometimes even hours by distributing computation across hundreds of compute nodes. Without Longleaf, my research simply would not be practical.”
Cassie Spracklen from Karen Mohlke’s lab (Genetics) wrote:
“Before, I had to run a specific analysis within EPACTS (called Firth) that is very memory intensive. On the bigmem queue, I could only run 2-4 chromosomes at a time (taking 2-3 days each), and needed to run over 120 jobs. You helped me get this accomplished faster by running the jobs on Longleaf, allowing it to finish faster and without clogging up the bigmem queue on KillDevil.”
Rob Lampe, Marchetti Lab (Marine Science):
“Our work can require taking 400 million RNA sequences from the environment and turning it into data we can interpret. Longleaf gives us the computational power to make this work possible in a short amount of time.”
Jessica Bullins, Gilmore Lab (Neuroscience):
“I am a Neuroscience PhD student working in John Gilmore’s lab. The focus of this lab is to investigate early brain development from birth into childhood. I am specifically using Longleaf to carry out computationally intense statistical analyses to relate brain white matter data derived from the analysis of Magnetic Resonance Images (MRIs) to children’s cognitive scores. The goal of this project (my dissertation work) is to understand how early brain structure supports cognition. My dissertation work would not be possible without the cluster computing resources provided by UNC, and particularly Longleaf as it supports the version of Matlab necessary for the statistical codes I use.”