Improving Scheduling using Job Runtime Predictions
I worked with Prof. Denis Trystram on studying the impact of job runtime estimates on the performance of non-clairvoyant scheduling algorithms. More specifically, we considered the problem of scheduling a set of independent jobs whose processing times are unknown. User-supplied runtime estimates are known to be highly inaccurate, yet scheduling policies such as backfilling and shortest job first rely on them to schedule jobs. In this work, we explored the use of machine learning methods to provide better estimates of execution times using user history and job characteristics. Instead of estimating the exact value of each job's runtime, we assumed that jobs fall into two categories, short and long, and predicted the category of each job. We used several classification models as well as simpler schemes, such as predicting from the class of previously submitted jobs alone. We evaluated our models on several full workload traces.
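The simpler scheme mentioned above, predicting a job's class from previously submitted jobs alone, can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the one-hour cutoff, the per-user grouping, the default class for a user's first job, and all function names are assumptions made for the example.

```python
# Hypothetical cutoff between short and long jobs, in seconds.
# The actual threshold used in the work is not stated; this value is an assumption.
SHORT_THRESHOLD_S = 3600


def label(runtime_s, threshold_s=SHORT_THRESHOLD_S):
    """Return 'short' or 'long' for an observed runtime."""
    return "short" if runtime_s <= threshold_s else "long"


def predict_from_history(jobs, default="short", threshold_s=SHORT_THRESHOLD_S):
    """Predict each job's class as the class of the same user's previous job.

    jobs: iterable of (user, runtime_s) pairs in submission order.
    Returns a list of (predicted, actual) class pairs.
    """
    last_class = {}  # user -> class of that user's most recent job
    results = []
    for user, runtime_s in jobs:
        # A user's first job has no history, so fall back to the default class.
        predicted = last_class.get(user, default)
        actual = label(runtime_s, threshold_s)
        results.append((predicted, actual))
        # Simplification: assumes the previous job has finished, so its true
        # class is known by the time the next job is submitted.
        last_class[user] = actual
    return results


def accuracy(results):
    """Fraction of jobs whose predicted class matches the actual class."""
    if not results:
        return 0.0
    return sum(p == a for p, a in results) / len(results)


if __name__ == "__main__":
    # Toy trace for illustration only; not real workload data.
    trace = [("alice", 100), ("alice", 200), ("bob", 5000),
             ("alice", 7200), ("bob", 6000)]
    print(accuracy(predict_from_history(trace)))
```

A baseline of this kind is useful mainly as a reference point: a classification model that uses job characteristics should be expected to beat it before its extra complexity is justified.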