# Grid'5000 user report

## User information

*More user information in the user management interface.*

## Experiments

**random patterns (Programming) [in progress]**

*Description:* The goal is to find a randomly generated pattern that improves a Go-playing program. The final goal is for the player to improve itself by testing different patterns and incorporating the "good" ones without human intervention. The algorithm is called RBGP (Racing-Based Genetic Programming).

The RBGP race algorithm:

    S = S0 = some initial set of mutations
    while S ≠ ∅ do
        Select s ∈ S
            // the selection rule is not specified here
            // (the result is independent of it)
        Let n be the number of simulations of mutation s
        Simulate s n more times (i.e. now s has been simulated 2n times)
            // this ensures nbTests(s) = O(log(n(s)))
        computeBounds(s)
        if lb(s) > 0.501 then
            Accept s; exit the program
        else if ub(s) < 0.504 then
            S ← S \ {s}
        end if
    end while

(In our experiments, a mutation is a random pattern)

Through racing, the algorithm concentrates its effort on the patterns that seem most promising. The player with one candidate pattern plays against the same player without that pattern. From the results of all games, we compute a confidence interval using Hoeffding's inequality. A pattern is validated if its lower Hoeffding bound exceeds 50.1%, and it is rejected from the race when its upper Hoeffding bound falls below 50.4%. The experiment stops when the best pattern is validated or all candidate patterns have been rejected.
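As a concrete illustration, the accept/reject decision described above can be sketched in Python. This is an illustrative sketch, not the actual RBGP code: the names `hoeffding_interval` and `race_decision` and the confidence parameter `delta` are assumptions; only the 50.1% and 50.4% thresholds come from the text.

```python
import math

def hoeffding_interval(wins, games, delta=0.05):
    """Two-sided Hoeffding confidence interval for a win rate.

    With probability >= 1 - delta, the true win rate p satisfies
    |p - wins/games| <= sqrt(ln(2/delta) / (2 * games)).
    """
    p_hat = wins / games
    radius = math.sqrt(math.log(2.0 / delta) / (2.0 * games))
    return max(0.0, p_hat - radius), min(1.0, p_hat + radius)

def race_decision(wins, games, accept=0.501, reject=0.504):
    """Accept/reject/continue rule of the race described above."""
    lb, ub = hoeffding_interval(wins, games)
    if lb > accept:
        return "accept"    # pattern validated
    if ub < reject:
        return "reject"    # pattern removed from the race
    return "continue"      # simulate again (doubling the pattern's games)
```

Note how the interval radius shrinks like 1/sqrt(games): a pattern whose true win rate sits between the two thresholds may require very many games before the race can decide either way.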

When a random pattern has been validated or all random patterns have been rejected, a new iteration of the complete RBGP algorithm begins.

The complete RBGP algorithm:

    Let P be the current agent
    Let λ be the number of agents
    for a given number of iterations do
        Create randomly a population pop of λ agents by mutation of P
        Apply RBGP BernsteinRace(pop, δ)
        if an agent P′ ∈ pop has been accepted then
            P′ becomes the current baseline: P ← P′
        else
            The current baseline is not changed
        end if
    end for

(In our experiment, an agent is the program P + a mutation).
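The complete loop above can be sketched as follows. This is a structural sketch only: `mutate` and `run_race` are placeholder hooks standing in for the real pattern mutation and the Hoeffding/Bernstein race, and the function name `rbgp` is illustrative.

```python
def rbgp(baseline, mutate, run_race, lam=16, iterations=10):
    """Structural sketch of the complete (1+lambda) RBGP loop.

    baseline  -- the current agent (program + accepted patterns)
    mutate    -- produces a randomly mutated copy of an agent
    run_race  -- races a population; returns an accepted agent or None
    """
    current = baseline
    for _ in range(iterations):
        # Create a population of lambda agents by mutation of the baseline.
        pop = [mutate(current) for _ in range(lam)]
        accepted = run_race(pop)
        if accepted is not None:
            current = accepted  # the validated mutant becomes the new baseline
        # otherwise the current baseline is kept unchanged
    return current
```

This is a (1+λ) evolutionary scheme: the baseline is only ever replaced by a mutant that has been statistically validated, so progress is monotone (non-regression) up to the confidence level of the race.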

This experiment is launched as a besteffort job.

For the moment, the experiment runs on NoGo (a variant of Go in which the first player to capture one or several stones loses). The goal is to compare the speed of progression for different values of λ.

*Results:* With NoGo

The baseline is NoGo without random patterns.

| λ | Site | Random patterns found | Status | Win rate against the baseline |
|---|------|-----------------------|--------|-------------------------------|
| 1 | Orsay | 45 | stopped | 69.63% ± 0.3 |
| 2 | Lyon | 28 | stopped | 65.71% ± 0.3 |
| 16 | Lyon | 25 | in progress | 65.11% ± 0.3 |
| 128 | Orsay | 3 | in progress | 54.12% ± 0.3 |

Old results with MoGo:

One "random" pattern was found in a run of 20 random patterns in 9x9 Go.

Individual 20: SUCCESS

O!X! 33528 - 32064 [ 50.73% ; 51.11% ; 51.49% ]

Hoeffding: 50139 --- 52092

Four random patterns (P1, P2, P3 and P4) have been validated in 9x9 Go (10000 simulations per move). The bracketed values are the lower bound, mean, and upper bound of the win rate:

| Match | Wins - Losses | Win rate [lower ; mean ; upper] |
|-------|---------------|---------------------------------|
| MoGoCVS+P1 vs. MoGoCVS | 66707 - 64498 | [ 50.57% ; 50.84% ; 51.11% ] |
| MoGoCVS+P1+P2 vs. MoGoCVS+P1 | 33600 - 32027 | [ 50.81% ; 51.19% ; 51.58% ] |
| MoGoCVS+P1+P2 vs. MoGoCVS | 52018 - 48186 | [ 51.6% ; 51.91% ; 52.22% ] |
| MoGoCVS+P1+P2+P3 vs. MoGoCVS+P1+P2 | 33432 - 32253 | [ 50.51% ; 50.89% ; 51.27% ] |
| MoGoCVS+P1+P2+P3 vs. MoGoCVS | 52634 - 47400 | [ 52.3% ; 52.61% ; 52.92% ] |
| MoGoCVS+P1+P2+P3+P4 vs. MoGoCVS+P1+P2+P3 | 66396 - 64749 | [ 50.35% ; 50.62% ; 50.89% ] |
| MoGoCVS+P1+P2+P3+P4 vs. MoGoCVS | 53569 - 46622 | [ 53.15% ; 53.46% ; 53.77% ] |

## Publications

**Progress Rate in Noisy Genetic Programming for Choosing $\lambda$ [2011]** *(international, inproceedings)*

Jean-Baptiste Hoock and Olivier Teytaud. In *Artificial Evolution*, Angers, France. HAL: inria-00622150, http://hal.inria.fr/inria-00622150/en/, PDF: http://hal.inria.fr/inria-00622150/PDF/ea2011.pdf

*Abstract:* Recently, it has been proposed to use Bernstein races for implementing non-regression testing in noisy genetic programming. We study the population size of such a (1+$\lambda$) evolutionary algorithm applied to a noisy fitness function optimization by a progress rate analysis and experiment it on a policy search application.

**Bandit-Based Genetic Programming [2010]** *(international, inproceedings)*

Jean-Baptiste Hoock and Olivier Teytaud. In *13th European Conference on Genetic Programming*, Springer, Istanbul, Turkey. HAL: inria-00452887, PDF: http://hal.archives-ouvertes.fr/inria-00452887/PDF/pattern.pdf

*Abstract:* We consider the validation of randomly generated patterns in a Monte-Carlo Tree Search program. Our bandit-based genetic programming (BGP) algorithm, with proved mathematical properties, outperformed a highly optimized handcrafted module of a well-known computer-Go program with several world records in the game of Go.

**Intelligent Agents for the Game of Go [2010]** *(international, article)*

Jean-Baptiste Hoock, Chang-Shing Lee, Arpad Rimmel, Fabien Teytaud, Olivier Teytaud and Mei-Hui Wang. *IEEE Computational Intelligence Magazine*, IEEE, November 2010. HAL: inria-00544758, PDF: http://hal.inria.fr/inria-00544758/PDF/cimok.pdf

*Abstract:* Monte-Carlo Tree Search (MCTS) is a very efficient recent technology for games and planning, particularly in the high-dimensional case, when the number of time steps is moderate and when there is no natural evaluation function. Surprisingly, MCTS makes very little use of learning. In this paper, we present four techniques (ontologies, Bernstein races, Contextual Monte-Carlo and poolRave) for learning agents in Monte-Carlo Tree Search, and experiment them in difficult games and in particular, the game of Go.

## Collaborations

## Success stories and benefits from Grid'5000

*last update: 2011-09-12 13:57:21*