There have been many efforts on scheduling mechanisms for parallel jobs in clusters. FCFS is the basic but widely used batch scheduling algorithm. Backfilling, which was developed in the EASY scheduler for the IBM SP1, is a technique that allows short or small jobs to use idle nodes while the job at the head of the queue waits because too few nodes are free for it to run. Backfilling can improve node utilization, but it requires each job to specify its maximum execution time so that only jobs that will not delay the start of the job at the head of the queue are backfilled. A preempted job is often given a reservation to run at a future time. Different methods of assigning reservations distinguish the several variants of backfilling. Backfilling techniques address the low utilization caused by the different node-count requirements of parallel jobs; they do not deal with low resource utilization within the parallel jobs themselves.

Gang scheduling allows resource sharing among multiple parallel jobs. The computing capacity of a node is divided into time slices that are shared among the processes of the jobs. A gang scheduling algorithm makes all the processes of a job progress together, so that no process is asleep when another process of the same job needs to communicate with it. The allocation of time slices on different nodes to parallel processes is coordinated, which requires OS support. Some gang scheduling algorithms, such as paired gang scheduling, investigate how to place processes with complementary resource needs together to minimize their interference. For example, when a process performs I/O and leaves the CPU idle, the paired gang scheduling algorithm can find another process to use the idle CPU. A similar strategy is used in cloud resource consolidation, through correlation analysis of resource use among VMs. In common gang scheduling algorithms, the processes of parallel jobs share the computing capacity of a node equally.
This approach can improve utilization to a certain degree, but it is likely to stretch the execution time of individual jobs. There is an attempt to integrate backfilling and gang scheduling, but it achieves only performance comparable to that of the simple backfilling algorithm.
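The backfilling rule described above can be illustrated with a minimal sketch. It assumes the EASY variant, in which only the job at the head of the queue holds a reservation, and it treats each job's user-specified maximum execution time as its actual run time. The names `Job` and `easy_backfill` are hypothetical and chosen for illustration only; this is not the implementation of any particular scheduler.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int        # number of nodes the job requests
    max_runtime: int  # user-specified upper bound on execution time


def easy_backfill(queue, running, total_nodes, now=0):
    """Return the names of the jobs that may start at time `now`.

    queue:   waiting jobs in FCFS order
    running: list of (end_time, nodes) pairs for jobs already running
    """
    free = total_nodes - sum(n for _, n in running)
    running = list(running)
    queue = list(queue)
    started = []

    # FCFS phase: start jobs from the head of the queue while they fit.
    while queue and queue[0].nodes <= free:
        job = queue.pop(0)
        free -= job.nodes
        running.append((now + job.max_runtime, job.nodes))
        started.append(job.name)
    if not queue:
        return started

    # Reservation for the head job: the earliest time at which enough
    # nodes will have been released (often called the shadow time).
    head = queue[0]
    avail, shadow = free, None
    for end, n in sorted(running):
        avail += n
        if avail >= head.nodes:
            shadow = end
            break
    if shadow is None:  # the head job asks for more nodes than exist
        return started
    extra = avail - head.nodes  # nodes the head job leaves unused at shadow

    # Backfill phase: a later job may start now only if it cannot delay
    # the head job's reservation.
    for job in queue[1:]:
        if job.nodes > free:
            continue
        if now + job.max_runtime <= shadow:
            free -= job.nodes        # finishes before the reservation
            started.append(job.name)
        elif job.nodes <= extra:
            free -= job.nodes        # runs past it, but on spare nodes
            extra -= job.nodes
            started.append(job.name)
    return started
```

For example, on an 8-node cluster with 6 nodes busy until time 10, a 4-node head job must wait, while a 2-node job with a maximum execution time of 5 can be backfilled because it finishes before the head job's reservation at time 10.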