Minirisk

Minirisk

(last update: April 4th 2007)

Minirisk is designed to be a relatively simple testbed for the multi-agent reasoning with uncertainty. The Minirisk game has rules simple enough to easily understand and implement, while complex enough to leave space for advanced artificial intelligence algorithms.

Minirisk is a fully-observable random constant-sum game with a variable number of adversaries. Because of the nature of the rules it gives a strong advantage to the player who models the opponents' behavior.

Table of contents

Remark

A similar text in Czech is available by Ondøej Sýkora.

News

Mission of the project

During the artificial intelligence and multi-agent system studies and while designing a new reasoning-with-uncertainty method one often needs a simple testbed. There are several tasks which may be considered "interesting" and "promissing" (such as chess, go, poker). However, many of them are very difficult and too complex for education or methods comparison.

For this reason the minirisk project is designed. It is an implementation of a simple card based game. The rules are easy to understand and allow to use a general AI or game theory method, while avoiding special domain specific tricks.

It would be my great pleasure to receive other people players, compare their performance and together with their basic descriptions publish their results.

Rules

Number of players: 2-5

The game sequence

In the begining of the game every player obtains the currency cards (bid cards) with values from one to fifteen. Other cards (so called deck cards), with values -5, -4, -3, -2, -1, 1, 2, ..., 10 are shuffled in the deck.

After several repetitions of the game the player with the highest amount of points is the winner.

The card winner determination method

Because all players want to collect as many positive points as possible, the player who puts the highest card as a bid for a positive-value deck card wins the points. But if there are more players bidding the same, they are all skipped and the next best bid gets the chance. This way the highest single bid wins the points.

The method is a bit different if there is a card with the negative value. Because nobody wants it, the lowest single bid "wins" it.

If no single bid exists and thus the winner cannot be determined, the card is left on the table and is auctioned together with the next card.

In the round with more than one deck card on the table, their values sum is significant for the winner determination. The zero sum is treated as negative (the lowest bid "wins").

Sample game

 TablePlayer APlayer BPlayer C
Bid cards in hand: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
Current deck card(s): 7
Bids: 10 11 10
First round result: wins 0 points wins 7 points wins 0 points
Bid cards in hand: 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15
Current deck card(s): -3
Bids: 7 8 8
Second round result: wins -3 points wins 0 points wins 0 points
Bid cards in hand: 1, 2, 3, 4, 5, 6, 8, 9, 11, 12, 13, 14, 15 1, 2, 3, 4, 5, 6, 7, 9, 10, 12, 13, 14, 15 1, 2, 3, 4, 5, 6, 7, 9, 11, 12, 13, 14, 15
Current deck card(s): 4
Bids: 6 6 5
Third round result: wins 0 points wins 0 points wins 4 points
Bid cards in hand: 1, 2, 3, 4, 5, 8, 9, 11, 12, 13, 14, 15 1, 2, 3, 4, 5, 7, 9, 10, 12, 13, 14, 15 1, 2, 3, 4, 6, 7, 9, 11, 12, 13, 14, 15
Current deck card(s): 3
Bids: 4 4 4
Fourth round result: wins 0 points wins 0 points wins 0 points
Bid cards in hand: 1, 2, 3, 5, 8, 9, 11, 12, 13, 14, 15 1, 2, 3, 5, 7, 9, 10, 12, 13, 14, 15 1, 2, 3, 6, 7, 9, 11, 12, 13, 14, 15
Current deck card(s): 3, -5
Bids: 9 10 11
Fifth round result: wins -2 points wins 0 points wins 0 points
Game status after the fifth round: -5 points 7 points 4 points

Prerequisites

Minirisk is implemented using the Java language. For a run it needs a Java Runtime Environment (version 5.0 or newer). The Java Developement Kit of the corresponding version is needed to compile the game server and the sample players.

Installation

After the source code is downloaded and unpacked, the source can be compiled using the compile.sh script (on platforms with sh and javac in the PATH). However, the downloaded package contains the software already compiled. The source is distributed under the GNU Public License, Version 2.

Running minirisk

The run_example.sh can be used to start a sample match and to check how the server and clients are started.

Minirisk server optional parameters

-p|--port The server port number.
-n|--count n The number of players. After this amount of players logged in, the tournament is started.
-g|--games n The number of games in a tournament.
-t|--timeout n The time for the players to think about what to bid (in milliseconds).
-h|--help The help message is printed.

Implementing an own player in Java

The cz.matfyz.isa.jiri.minirisk.client.Client class implements the basic communication loop and few callback function that can be easily overloaded. For a sample check the cz.matfyz.isa.jiri.minirisk.client.random.RandomClient source code or an even more simple cz.matfyz.isa.jiri.client.reactive.ReactiveClient.

Protocol

Server communicates with all clients using the TCP connection (listening by default on the port 7654). Every message between the game server and the game player is a single text line (ASCII).

  1. Player sends its name (used for an identification).
  2. Server sends the number of players in the game.
  3. The server sends an ID of the player (its index).
  4. The server sends a per-move thinking timeout (in milliseconds).
  5. Every game is started by the "start" message.
  6. Server sends the comma-separated list of card values on the table.
  7. Player sends the bid.
  8. Server sends a comma-separated list of all bids.
  9. Server sends the amount of points obtained by the player in the current game step.
  10. The game end is announced by the server using the "end" message.
  11. The tournament end is announced by the server using the "quit" message.

Warning

The implementation is very naive and is not designed to handle any error. It is expected that all players obey the given protocol, otherwise the tournament ends abruptly.

Download

April 14th, 2010 The code now resides at http://bitbucket.org/ondrasej/minirisk
November 16th, 2009 minirisk_20091116.tar.gz (without the compiled binaries)
November 3rd, 2009 minirisk_20091103.tar.gz (without the compiled binaries)
February 5th, 2007 minirisk_20070205.tar.gz
November 24th, 2006 minirisk_20061124.tar.gz
November 23rd, 2006 minirisk_20061123.tar.gz
November 20th, 2006 minirisk_20061120.tar.gz
November 19th, 2006 minirisk_20061119.tar.gz
November 18th, 2006 minirisk_20061118.tar.gz

Brief description of the players

RandomClient

As the name suggests, there is no planning or strategy involved in this player. This player takes randomly one of the cards on hand (uniform distribution) and plays it.

ReactiveClient

ReactiveClient contains a hand-coded strategy table. When a deck card appears on the table, this player looks it up in its strategy table and reads there the bid card value that ought to be played. This makes the player totaly predictable.

The ReactiveClient can be invoked with the "strategy shift" parameter:

java cz.matfyz.isa.jiri.client.reactive.ReactiveClient n

n is an integer value that influences the strategy table. The n = 1 case corresponds to the "+1 strategy" described in "Being predictable is wrong", if playing against the default ReactiveCLient (n = 0).

SlidingReactiveClient

SlidingReactiveClient is basically a ReactiveClient with the "strategy shift" parameter increased by one every new game.

SimpleRLClient

The SimpleRLClient uses a very straightforward reinforcement learning. The "deck_card × bid_card → expected_utility" mapping is estimated by RL based on the experience gained during the match.

NaiveMCLClient

NaiveMCLClient uses a naive version of the Monte-Carlo planning to estimate the expected value of its action. It is called "naive", because it considers its own and the opponents' future actions as random events with uniform distribution (which is far from being true).

This player was implemented to gain an insight of the naive implementation performance. From the very beginning it was not expected to give impressive results, but what if?

Results

Results of the pairwise matches

The table below shows an average per-game gain of all the players in the pairwise matches. 2000 rounds are played to obtain these numbers.

The results of a match sum up (in all but rare cases) to 40. This means, that result above 20 points can be considered "win", while below 20 points it is "loss".

# Player 1. 2. 3. 4. 5. 6. 7. Total Rank
1. RandomClient X 8.36 14.74 10.03 20.02 16.13 20.08 89.36 6.
2. ReactiveClient 31.39 X 7.05 23.85 4.65 13.06 30.19 110.19 5.
3. SimpleRLClient 25.05 32.94 X 16.62 20.56 17.61 24.20 136.98 3.
4. NaiveMCLClient 29.99 16.17 23.39 X 11.63 15.89 29.75 126.82 4.
5. MemoryClient 19.84 35.26 19.21 28.39 X 18.94 19.73 141.37 1.
6. MemoryMCClient 23.82 26.47 21.34 24.13 20.96 X 23.72 140.44 2.
7. SlidingReactiveClient 19.70 7.45 15.58 10.22 20.02 16.19 X 89.16 7.

And the winner izzz ... MemoryClient by Ond�ej S�kora! Congratulations!

The game logs are also available.

Results of the quaternary matches

Initially the players are sorted in the order of their final position in the pairwise tournament. The last four players form a quaternion, that plays a match of 2000 rounds. The player with least points drops out and is replaced by the next weakest player from the bottom of the original list. This drop&replace is repeated until all players participated.

Using this approach even the weakest player of the pairwise matches can become a champion of quaternary matches.

The average gain of a quaternary match is 10 points. Scoring more may be considered "win", scoring less may be considered "loss."

Note: The quaternary tournament was started when the SlidingReactiveClient did not exist yet.

RandomClientReactiveClientNaiveMCLClientSimpleRLClient
4.21 15.42 9.61 10.77

RandomClient drops out, the tournament continues ...

NaiveMCLClientSimpleRLClientReactiveClientMemoryClient
2.19 7.83 15.43 14.55

NaiveMCLClient drops out, the tournament continues ...

SimpleRLClientMemoryClientReactiveClientMemoryMCClient
6.31 13.27 14.21 6.18

And the winner izzz ... ReactiveClient by Ji�� I�a! Congratulations to myself!

Observations

Being predictable is wrong

If player's actions can all be predicted, it looses. The opponent can model the predictable player and play always one bid point higher ("+1 strategy"). This way there is only a single situation, when the predictable player wins a bid - when it bids the highest possible bid and the smart opponent has to bid the lowest possible bid. In all other cases the positive deck cards are won by the smart player, the negative deck cards go to the predictable one. This way the predictable player always looses the game.

Conclusion: An optimal strategy must not be predictable.

Nash equilibria

The following observations are for simplicity formulated for the two-player game. Their equivalents for n-player game also hold.

Mixed strategy

As Nash has proved (1950), for every finite two-player game there exists a pair of mixed (stochastic) strategies {πa, πb} such that πa is a best response to πb and vice versa.

Symmetric game symmetric Nash equilibrium

Only recently, in 2004, the existence of a symmetric Nash equilibrium strategy in symmetric games has been proved: A finite symmetric game has a symmetric mixed-strategy equilibrium (πa = πb, πa is a best response to itself).

Conclusion: There exists a mixed strategy for Minirisk, which is a best response to itself.