computations. Changed the load balancing scheme. Added timing estimates for GPU and driver overhead and for CPU idle time.