Grid'5000 user report for Bruno Bzeznik
User information
Bruno Bzeznik (users, user, account-manager, ct, grenoble, grenoble-staff, ml-users user)
- Node virtualisation (Networking) [in progress]
Description: We want to experiment with host virtualization on g5k nodes. The aim is to have more free (virtual) nodes for large-scale experiments. We have to test virtualization tools, choose one of them, test it on every cluster, check whether a master node can still use the high-performance network (Myrinet/InfiniBand), and devise a way for users to deploy and reserve virtual nodes. Other things may be tested, such as the ability to run virtual images directly from NFS, disk usage optimization with LVM/COW, IP addressing...
Results: I think my work has a lot in common with Benjamin Quetier's work on Vgrid (http://www.lri.fr/~quetier/vgrid/), but in a different context: we could have different OSes on the hosts, and users may deploy environments on virtual nodes as they used to do on physical nodes with kadeploy.
- Distributed filesystems (Operating System) [in progress]
Description: We have to test several distributed filesystems (GlusterFS, Lustre, ...) for use in the CIMENT project. For the moment, two possible fields of application motivate these tests:
* We are about to buy a new cluster, and one of the criteria is the performance of the filesystem when collecting the results of a distributed job.
* We are developing and using a local grid (CiGri) that may take advantage of a filesystem distributed across all the sites.
This experiment is also linked to future tests of a Scientific Linux + gLite environment, so we are using Scientific Linux as a custom environment for those tests.
- OAR RPM tests on Scientific Linux, Fedora, CentOS, ... (Middleware) [planned]
Description: As the OAR RPM packager, I use a Scientific Linux environment to test my packages. I deploy one node as the OAR server and the others as nodes of the OAR cluster, then run test jobs to check that everything works. I plan to set up environments for several RPM-based distributions (CentOS, Fedora, SUSE, ...) to validate my RPMs on each of them.
- IDTRUIT: Testing the robustness of frequently rebooted hardware (Other) [achieved]
Description: Taking 4 nodes of the venerable and famous IDPOT cluster, we ran hard power off/on cycles for weeks to test whether the hardware can be damaged by an energy-saving feature that halts the nodes when no jobs are running on a cluster.
Results: About 14000 reboots without damaging the hardware (equivalent to about 12 reboots per day for 3 years).
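The equivalence quoted in the IDTRUIT results can be checked with a few lines of Python. This is only a sanity-check sketch: the total and the 3-year period come from the report, while the 365-day year and the function name `reboots_per_day` are assumptions made for illustration.

```python
# Sanity check of the reboot-rate equivalence quoted in the IDTRUIT results.
TOTAL_REBOOTS = 14_000  # "about 14000 reboots", from the report
YEARS = 3               # comparison period, from the report
DAYS_PER_YEAR = 365     # assumption: leap days are ignored


def reboots_per_day(total, years, days_per_year=DAYS_PER_YEAR):
    """Average number of reboots per day if `total` is spread over `years`."""
    return total / (years * days_per_year)


rate = reboots_per_day(TOTAL_REBOOTS, YEARS)
print(f"{rate:.1f} reboots per day")  # prints "12.8 reboots per day"
```

Under these assumptions the rounded total works out to slightly under 13 reboots per day, which is consistent with the approximate figure of 12 per day given in the results.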