Optimizing UltraScan Job Scheduling with Deep Learning-Based Performance Prediction

Aaron Householder, Cliff Zou, Emre Brookes

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

High-Performance Computing (HPC) resources are crucial for computationally intensive applications, yet efficient job scheduling remains a challenge due to inaccurate user-provided runtime estimates. UltraScan, software for analyzing analytical ultracentrifugation experiments, relies on queue-managed HPC resources where execution times vary significantly based on input parameters. To improve scheduling efficiency and resource utilization, we propose a deep learning-based approach for predicting UltraScan job run times. Unlike previous heuristic-based and regression models, our study employs deep neural networks trained on historical execution records, utilizing hyperparameter tuning and grid search to optimize predictive accuracy. However, feature selection methods such as LASSO regression, XGBoost, and Random Forest did not improve runtime prediction, suggesting that execution time is influenced by complex high-dimensional interactions rather than individual feature importance. Instead, deep learning models demonstrated better performance by capturing implicit patterns within the data. To further refine predictions, we introduce a Z-score-based outlier filtering strategy that adaptively adjusts acceptance thresholds, mitigating the impact of extreme cases on runtime estimation. Our results indicate that deep learning models, combined with outlier handling, provide a scalable approach for improving HPC scheduling, though challenges remain in reducing long-tail prediction errors. This study represents the first large-scale application of deep learning for UltraScan performance prediction.
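
The abstract names a Z-score-based outlier filter but does not give its details. As a rough illustration only, the sketch below shows one common form of Z-score filtering on a list of job runtimes. The function name `zscore_filter`, the `threshold` and `max_rounds` parameters, and the choice to recompute the mean and standard deviation after each round are assumptions made for this example. They are not the authors' implementation, and the adaptive threshold adjustment described in the paper may work differently.

```python
from statistics import mean, pstdev


def zscore_filter(runtimes, threshold=3.0, max_rounds=10):
    """Iteratively drop values whose |z-score| exceeds `threshold`.

    Hypothetical helper: the statistics are recomputed after every
    round, so extreme values removed early no longer inflate the
    standard deviation used to judge the remaining values.
    """
    kept = list(runtimes)
    for _ in range(max_rounds):
        if len(kept) < 2:
            break
        mu = mean(kept)
        sigma = pstdev(kept)  # population standard deviation
        if sigma == 0:
            break  # all remaining values are identical
        survivors = [x for x in kept if abs(x - mu) / sigma <= threshold]
        if len(survivors) == len(kept):
            break  # nothing removed this round, so the set is stable
        kept = survivors
    return kept


# Illustrative runtimes in seconds (made-up values, not from the paper).
sample = [10, 12, 11, 13, 9, 10, 12, 11, 500]
print(zscore_filter(sample, threshold=2.0))
# prints [10, 12, 11, 13, 9, 10, 12, 11]
```

With small samples the largest attainable population Z-score is bounded by the square root of n - 1, so a threshold of 3.0 would never reject anything in a nine-element list. That is why the usage line passes a lower threshold.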

Original language: English
Title of host publication: PEARC 2025 - Practice and Experience in Advanced Research Computing 2025
Subtitle of host publication: The Power of Collaboration
Publisher: Association for Computing Machinery, Inc
Pages: 1-8
ISBN (Electronic): 9798400713989
State: Published - Jul 18 2025
Event: 2025 Practice and Experience in Advanced Research Computing, PEARC 2025 - Columbus, United States
Duration: Jul 20 2025 to Jul 24 2025

Publication series

Name: PEARC 2025 - Practice and Experience in Advanced Research Computing 2025: The Power of Collaboration

Conference

Conference: 2025 Practice and Experience in Advanced Research Computing, PEARC 2025
Country/Territory: United States
City: Columbus
Period: 07/20/25 to 07/24/25

Keywords

  • High-Performance Computing (HPC)
  • Job Runtime Prediction
  • Machine Learning for HPC
  • Performance Prediction
