Minirisk
(last update: April 4th 2007)
Minirisk is designed to be a relatively simple testbed for the multi-agent reasoning with uncertainty. The Minirisk game has rules simple enough to easily understand and implement, while complex enough to leave space for advanced artificial intelligence algorithms.
Minirisk is a fully-observable random constant-sum game with a variable number of adversaries. Because of the nature of the rules it gives a strong advantage to the player who models the opponents' behavior.
Table of contents
- Remark
- News
- Mission of the project
- Rules
- Prerequisites
- Installation
- Running minirisk
- Minirisk server optional parameters
- Implementing an own player in Java
- Protocol
- Warning
- Download
- Brief description of the players
- Results
- Observations
Remark
A similar text in Czech is available by Ondøej Sýkora.
News
- April 14th, 2010: A link to a bitbucket repository was added.
- April 4th, 2007: Nash equilibria observations added.
- April 3rd, 2007: SlidingReactiveClient description, its matches were finished.
- March 19th, 2007: The quaternary matches; the SlidingReactiveClient added to the pairwise results.
- March 18th, 2007: Brief description of the players added and the construction of the pairwise matches table started.
- March 17th, 2007: Table of contents and the Observations sections added to this page.
- February 5th, 2007:
- NaiveMCLClient (naive Monte-Carlo player) added.
- Basic client extended by few usefull constants.
- Typo fixed in a comment in the RandomClient.
- Example game added to this page.
- November 24th, 2006: SimpleRLClient refactored. Code cleaned and commented.
- November 23rd, 2006: Simple reinforcement learning player (
cz.matfyz.isa.jiri.minirisk.client.simpleRL.SimpleRLClient) added to the package. - November 20th, 2006: Improved rules following control.
- November 19th, 2006: Winner determination fixed. If there are more deck cards on the table, the determination rule depends on the sum of their values. Not only on the value of the last card.
Mission of the project
During the artificial intelligence and multi-agent system studies and while designing a new reasoning-with-uncertainty method one often needs a simple testbed. There are several tasks which may be considered "interesting" and "promissing" (such as chess, go, poker). However, many of them are very difficult and too complex for education or methods comparison.
For this reason the
It would be my great pleasure to receive other people players, compare their performance and together with their basic descriptions publish their results.
Rules
Number of players: 2-5
The game sequence
In the begining of the game every player obtains the currency cards (bid cards) with values from one to fifteen. Other cards (so called deck cards), with values -5, -4, -3, -2, -1, 1, 2, ..., 10 are shuffled in the deck.
- In every step of the game a single card from the deck is placed face up on the table.
- Every player puts one of her remaining currency cards face-down on the table. Every currency card of a player is played exactly once during the game.
- Once all players have done so, all their currency cards on the table are turned face-up.
- The winner determined by the method described bellow wins the number of points equal to the value of the card from the deck.
After several repetitions of the game the player with the highest amount of points is the winner.
The card winner determination method
Because all players want to collect as many positive points as possible, the player who puts the highest card as a bid for a positive-value deck card wins the points. But if there are more players bidding the same, they are all skipped and the next best bid gets the chance. This way the highest single bid wins the points.
The method is a bit different if there is a card with the negative value. Because nobody wants it, the lowest single bid "wins" it.
If no single bid exists and thus the winner cannot be determined, the card is left on the table and is auctioned together with the next card.
In the round with more than one deck card on the table, their values sum is significant for the winner determination. The zero sum is treated as negative (the lowest bid "wins").
Sample game
| Table | Player A | Player B | Player C | |
|---|---|---|---|---|
| Bid cards in hand: | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 | |
| Current deck card(s): | 7 | |||
| Bids: | 10 | 11 | 10 | |
| First round result: | wins 0 points | wins 7 points | wins 0 points | |
| Bid cards in hand: | 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15 | |
| Current deck card(s): | -3 | |||
| Bids: | 7 | 8 | 8 | |
| Second round result: | wins -3 points | wins 0 points | wins 0 points | |
| Bid cards in hand: | 1, 2, 3, 4, 5, 6, 8, 9, 11, 12, 13, 14, 15 | 1, 2, 3, 4, 5, 6, 7, 9, 10, 12, 13, 14, 15 | 1, 2, 3, 4, 5, 6, 7, 9, 11, 12, 13, 14, 15 | |
| Current deck card(s): | 4 | |||
| Bids: | 6 | 6 | 5 | |
| Third round result: | wins 0 points | wins 0 points | wins 4 points | |
| Bid cards in hand: | 1, 2, 3, 4, 5, 8, 9, 11, 12, 13, 14, 15 | 1, 2, 3, 4, 5, 7, 9, 10, 12, 13, 14, 15 | 1, 2, 3, 4, 6, 7, 9, 11, 12, 13, 14, 15 | |
| Current deck card(s): | 3 | |||
| Bids: | 4 | 4 | 4 | |
| Fourth round result: | wins 0 points | wins 0 points | wins 0 points | |
| Bid cards in hand: | 1, 2, 3, 5, 8, 9, 11, 12, 13, 14, 15 | 1, 2, 3, 5, 7, 9, 10, 12, 13, 14, 15 | 1, 2, 3, 6, 7, 9, 11, 12, 13, 14, 15 | |
| Current deck card(s): | 3, -5 | |||
| Bids: | 9 | 10 | 11 | |
| Fifth round result: | wins -2 points | wins 0 points | wins 0 points | |
| Game status after the fifth round: | -5 points | 7 points | 4 points |
Prerequisites
Minirisk is implemented using the Java language. For a run it needs a Java Runtime Environment (version 5.0 or newer). The Java Developement Kit of the corresponding version is needed to compile the game server and the sample players.
Installation
After the source code is downloaded and unpacked, the source can be compiled using the compile.sh script (on platforms with sh and javac in the PATH). However, the downloaded package contains the software already compiled. The source is distributed under the GNU Public License, Version 2.
Running minirisk
The run_example.sh can be used to start a sample match and to check how the server and clients are started.
Minirisk server optional parameters
-p|--port
|
The server port number. |
-n|--count n
|
The number of players. After this amount of players logged in, the tournament is started. |
-g|--games n
|
The number of games in a tournament. |
-t|--timeout n
|
The time for the players to think about what to bid (in milliseconds). |
-h|--help
|
The help message is printed. |
Implementing an own player in Java
The cz.matfyz.isa.jiri.minirisk.client.Client class implements the basic communication loop and few callback function that can be easily overloaded. For a sample check the cz.matfyz.isa.jiri.minirisk.client.random.RandomClient source code or an even more simple cz.matfyz.isa.jiri.client.reactive.ReactiveClient.
Protocol
Server communicates with all clients using the TCP connection (listening by default on the port 7654). Every message between the game server and the game player is a single text line (ASCII).
- Player sends its name (used for an identification).
- Server sends the number of players in the game.
- The server sends an ID of the player (its index).
- The server sends a per-move thinking timeout (in milliseconds).
- Every game is started by the "start" message.
- Server sends the comma-separated list of card values on the table.
- Player sends the bid.
- Server sends a comma-separated list of all bids.
- Server sends the amount of points obtained by the player in the current game step.
- The game end is announced by the server using the "end" message.
- The tournament end is announced by the server using the "quit" message.
Warning
The implementation is very naive and is not designed to handle any error. It is expected that all players obey the given protocol, otherwise the tournament ends abruptly.
Download
| April 14th, 2010 | The code now resides at http://bitbucket.org/ondrasej/minirisk |
| November 16th, 2009 | minirisk_20091116.tar.gz (without the compiled binaries) |
| November 3rd, 2009 | minirisk_20091103.tar.gz (without the compiled binaries) |
| February 5th, 2007 | minirisk_20070205.tar.gz |
| November 24th, 2006 | minirisk_20061124.tar.gz |
| November 23rd, 2006 | minirisk_20061123.tar.gz |
| November 20th, 2006 | minirisk_20061120.tar.gz |
| November 19th, 2006 | minirisk_20061119.tar.gz |
| November 18th, 2006 | minirisk_20061118.tar.gz |
Brief description of the players
RandomClient
As the name suggests, there is no planning or strategy involved in this player. This player takes randomly one of the cards on hand (uniform distribution) and plays it.
ReactiveClient
ReactiveClient contains a hand-coded strategy table. When a deck card appears on the table, this player looks it up in its strategy table and reads there the bid card value that ought to be played. This makes the player totaly predictable.
The ReactiveClient can be invoked with the "strategy shift" parameter:
java cz.matfyz.isa.jiri.client.reactive.ReactiveClient n
n is an integer value that influences the strategy table. The n = 1 case corresponds to the "+1 strategy" described in "Being predictable is wrong", if playing against the default ReactiveCLient (n = 0).
SlidingReactiveClient
SlidingReactiveClient is basically a ReactiveClient with the "strategy shift" parameter increased by one every new game.
SimpleRLClient
The SimpleRLClient uses a very straightforward reinforcement learning. The "deck_card × bid_card → expected_utility" mapping is estimated by RL based on the experience gained during the match.
NaiveMCLClient
NaiveMCLClient uses a naive version of the Monte-Carlo planning to estimate the expected value of its action. It is called "naive", because it considers its own and the opponents' future actions as random events with uniform distribution (which is far from being true).
This player was implemented to gain an insight of the naive implementation performance. From the very beginning it was not expected to give impressive results, but what if?
Results
Results of the pairwise matches
The table below shows an average per-game gain of all the players in the pairwise matches. 2000 rounds are played to obtain these numbers.
The results of a match sum up (in all but rare cases) to 40. This means, that result above 20 points can be considered "win", while below 20 points it is "loss".
| # | Player | 1. | 2. | 3. | 4. | 5. | 6. | 7. | Total | Rank |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. | RandomClient | X | 8.36 | 14.74 | 10.03 | 20.02 | 16.13 | 20.08 | 89.36 | 6. |
| 2. | ReactiveClient | 31.39 | X | 7.05 | 23.85 | 4.65 | 13.06 | 30.19 | 110.19 | 5. |
| 3. | SimpleRLClient | 25.05 | 32.94 | X | 16.62 | 20.56 | 17.61 | 24.20 | 136.98 | 3. |
| 4. | NaiveMCLClient | 29.99 | 16.17 | 23.39 | X | 11.63 | 15.89 | 29.75 | 126.82 | 4. |
| 5. | MemoryClient | 19.84 | 35.26 | 19.21 | 28.39 | X | 18.94 | 19.73 | 141.37 | 1. |
| 6. | MemoryMCClient | 23.82 | 26.47 | 21.34 | 24.13 | 20.96 | X | 23.72 | 140.44 | 2. |
| 7. | SlidingReactiveClient | 19.70 | 7.45 | 15.58 | 10.22 | 20.02 | 16.19 | X | 89.16 | 7. |
And the winner izzz ... MemoryClient by Ond�ej S�kora! Congratulations!
The game logs are also available.
Results of the quaternary matches
Initially the players are sorted in the order of their final position in the pairwise tournament. The last four players form a quaternion, that plays a match of 2000 rounds. The player with least points drops out and is replaced by the next weakest player from the bottom of the original list. This drop&replace is repeated until all players participated.
Using this approach even the weakest player of the pairwise matches can become a champion of quaternary matches.
The average gain of a quaternary match is 10 points. Scoring more may be considered "win", scoring less may be considered "loss."
Note: The quaternary tournament was started when the SlidingReactiveClient did not exist yet.
| RandomClient | ReactiveClient | NaiveMCLClient | SimpleRLClient |
|---|---|---|---|
| 4.21 | 15.42 | 9.61 | 10.77 |
RandomClient drops out, the tournament continues ...
| NaiveMCLClient | SimpleRLClient | ReactiveClient | MemoryClient |
|---|---|---|---|
| 2.19 | 7.83 | 15.43 | 14.55 |
NaiveMCLClient drops out, the tournament continues ...
| SimpleRLClient | MemoryClient | ReactiveClient | MemoryMCClient |
|---|---|---|---|
| 6.31 | 13.27 | 14.21 | 6.18 |
And the winner izzz ... ReactiveClient by Ji�� I�a! Congratulations to myself!
Observations
Being predictable is wrong
If player's actions can all be predicted, it looses. The opponent can model the predictable player and play always one bid point higher ("+1 strategy"). This way there is only a single situation, when the predictable player wins a bid - when it bids the highest possible bid and the smart opponent has to bid the lowest possible bid. In all other cases the positive deck cards are won by the smart player, the negative deck cards go to the predictable one. This way the predictable player always looses the game.
Conclusion: An optimal strategy must not be predictable.Nash equilibria
The following observations are for simplicity formulated for the two-player game. Their equivalents for n-player game also hold.
Mixed strategy
As Nash has proved (1950), for every finite two-player game there exists a pair of mixed (stochastic) strategies {πa, πb} such that πa is a best response to πb and vice versa.
Symmetric game symmetric Nash equilibrium
Only recently, in 2004, the existence of a symmetric Nash equilibrium strategy in symmetric games has been proved: A finite symmetric game has a symmetric mixed-strategy equilibrium (πa = πb, πa is a best response to itself).
Conclusion: There exists a mixed strategy for Minirisk, which is a best response to itself.