Large Scale Management of Virtual Machines (Virtualization)
Leader: Adrien Lèbre (ASCOLA)
Involved teams: ASCOLA, AVALON, ALGORILLE
Live migration of virtual machines is one of the key features of virtualization technologies. Besides simplifying maintenance operations, it makes it possible to implement fine-grained scheduling policies such as consolidation or load-balancing strategies.
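To make the notion of a consolidation policy concrete, here is a minimal sketch in Python. It uses a first-fit-decreasing heuristic, which is an illustrative choice and not necessarily the algorithm studied in this challenge. The `Host` class, the `consolidate` function, the node names and the capacity unit are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: int  # hypothetical unit, e.g. number of CPU cores
    vms: dict[str, int] = field(default_factory=dict)

    @property
    def free(self) -> int:
        # Remaining capacity once the hosted VMs are accounted for.
        return self.capacity - sum(self.vms.values())


def consolidate(vm_demands: dict[str, int], hosts: list[Host]) -> dict[str, str]:
    """First-fit decreasing: place the largest VMs first,
    each on the first host that still has enough room."""
    placement: dict[str, str] = {}
    by_size = sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True)
    for vm, demand in by_size:
        for host in hosts:
            if host.free >= demand:
                host.vms[vm] = demand
                placement[vm] = host.name
                break
        else:
            raise RuntimeError(f"no host can fit {vm}")
    return placement


# Illustrative data only: four VMs and two 8-core nodes.
demands = {"vm1": 4, "vm2": 2, "vm3": 1, "vm4": 1}
hosts = [Host("node-1", 8), Host("node-2", 8)]
placement = consolidate(demands, hosts)
used_hosts = {h.name for h in hosts if h.vms}
```

In this toy case all four VMs fit on `node-1`, so `node-2` stays empty and could be powered down. In a real infrastructure, reaching such a placement from an existing one is what requires live migration.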
However, manipulating VMs throughout a large-scale, highly distributed infrastructure as easily as traditional OSes handle processes on local nodes still faces several issues. Among the major ones are the implementation of suitable mechanisms to schedule VMs efficiently and to ensure access to the VM images from different locations. Such mechanisms must be able to control, monitor, and communicate with both the host OSes and the guest instances spread across the infrastructure at any time. Although several works have addressed these concerns, real experiments are in most cases limited to a few nodes, and there is a clear need to study such concerns at larger scales.
The goal of this challenge is to introduce techniques to deploy and operate a large number of virtual machines over the Grid'5000 platform. Concretely, once the VMs are deployed, end-user scenarios can be executed on each VM in order to observe and analyze how they behave.
The result of this work opens the door to the manipulation of virtual machines throughout a distributed infrastructure and to the investigation of particular concerns, such as the impact of migrating a large number of VMs simultaneously or the study of new proposals dealing with VM image management.
[Booting and Using Virtual Machines on Grid'5000 |A set of scripts] has been implemented and validated through the deployment of 10000 KVM instances over 4 Grid'5000 sites. This framework has also been presented in the latest publication related to Grid'5000.
By means of this framework, major limitations of available VM scheduling algorithms have been investigated, and a new proposal leveraging cooperative and autonomous techniques has been put forward. These results were presented during the Grid'5000 challenges session of the Grid'5000 school held in Nantes in December 2012.
Flauncher is currently being extended to run complex stress workflows within VMs during migration operations. This work is supervised by AVALON with the support of ASCOLA and ALGORILLE. The ultimate objective is to define new models for VM migrations under different conditions.