-
Improving datacenter job scheduling
(with Thomas Karagiannis and Hitesh Ballani at Microsoft Research, Cambridge)
Job schedulers in distributed execution environments such as Hadoop MapReduce typically make scheduling decisions using simple heuristics that do not incorporate any information about individual jobs. In this project, we show how to design better schedulers that use simple, easily available job-specific information to obtain higher overall throughput and better job completion times.
-
Cloud computing and energy efficiency
Although there has been significant recent interest in characterizing the tradeoff between energy consumption and performance for jobs in cloud computing environments (such as MapReduce implementations), most work in this area has been descriptive, emphasizing the factors that contribute to energy consumption for specific workloads. We focus instead on prediction. Our primary goal is to develop simple analytical models that translate a high-level description of the workload and the execution environment into energy and performance estimates, and to use these models to investigate system design issues in datacenters.
-
Congestion pricing and denial-of-service attacks
Most conventional DoS protection mechanisms require that the network try to identify and separate out malicious traffic; in contrast, we consider an alternative approach in which the network simply auctions off scarce resources among any competing users. Specifically, we analyze Kelly's proportional allocation mechanism -- in both single-stage and multi-stage settings -- and show that it can allocate resources fairly to the legitimate users of the network even in the presence of arbitrary malicious behavior. We also demonstrate that it is strictly superior to less sophisticated classification-free DoS-resistance methods such as fair queueing.
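As a rough illustration of the single-stage mechanism: in Kelly's proportional allocation, each user submits a bid (a payment) and receives a share of the resource proportional to that bid. The sketch below is a minimal example under that definition only; the function name, user names, bid values, and capacity are hypothetical and are not taken from the project itself.

```python
def proportional_allocation(bids, capacity):
    """Split `capacity` among users in proportion to their bids.

    `bids` maps a user name to a non-negative payment. Each user i
    receives capacity * bid_i / (sum of all bids).
    """
    total = sum(bids.values())
    if total <= 0:
        # Nobody bid anything, so nothing is allocated.
        return {user: 0.0 for user in bids}
    return {user: capacity * bid / total for user, bid in bids.items()}


# Hypothetical scenario: two legitimate users and one attacker
# competing for 100 units of link capacity.
bids = {"alice": 2.0, "bob": 2.0, "attacker": 6.0}
shares = proportional_allocation(bids, capacity=100.0)
```

In this toy scenario the attacker obtains a larger share only by paying more than the legitimate users combined, and each legitimate user still receives a share proportional to its own bid. No traffic classification is involved at any point.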
-
Adaptive routing with end-to-end feedback
Traditional routing protocols rely on the network (specifically, its intermediate nodes) both to collect information about network state and to participate in making routing decisions. We demonstrate that acceptable performance can also be obtained when end-hosts make routing decisions solely on the basis of observed end-to-end performance. We have studied the performance of our algorithms on an overlay deployment on PlanetLab, and are currently working on obtaining more accurate performance estimates from an in-network deployment on the GENI OpenFlow testbed.
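To make the idea of routing from end-to-end feedback concrete, here is a minimal sketch in which an end-host picks among candidate paths using only the latencies it has observed itself. The description above does not specify the algorithms used in the project; this example substitutes a generic epsilon-greedy rule with an exponentially weighted moving average, and the class name, path names, latencies, and parameters are all hypothetical.

```python
import random


class PathSelector:
    """Epsilon-greedy path choice from observed end-to-end latency.

    This is an illustrative stand-in, not the project's algorithm.
    """

    def __init__(self, paths, epsilon=0.1, alpha=0.2, seed=None):
        self.paths = list(paths)
        self.epsilon = epsilon  # probability of exploring a random path
        self.alpha = alpha      # weight given to the newest observation
        self.rng = random.Random(seed)
        self.estimate = {p: None for p in self.paths}

    def choose(self):
        # Try every path once before trusting any estimate.
        untried = [p for p in self.paths if self.estimate[p] is None]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.paths)
        # Otherwise exploit the path with the lowest estimated latency.
        return min(self.paths, key=lambda p: self.estimate[p])

    def update(self, path, latency):
        # Blend the new end-to-end measurement into the running estimate.
        old = self.estimate[path]
        if old is None:
            self.estimate[path] = latency
        else:
            self.estimate[path] = (1 - self.alpha) * old + self.alpha * latency


# Hypothetical fixed latencies (ms) standing in for real measurements.
true_latency = {"via_A": 80.0, "via_B": 40.0, "via_C": 120.0}
selector = PathSelector(true_latency, epsilon=0.1, seed=0)
for _ in range(200):
    path = selector.choose()
    selector.update(path, true_latency[path])
best = min(selector.estimate, key=selector.estimate.get)
```

The selector never queries intermediate nodes; the only input is the latency the end-host measures on the path it just used. The occasional random choice lets it notice when a previously slow path improves.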