Models used in DP#

Below is an overview of the models and environments we use in the DP section of the course.

Environments#

| Name | Environment | File | Actions | States |
|------|-------------|------|---------|--------|
| Pacman | GameState | irlc/pacman/gamestate.py (note: you don't have to read the file) | State-dependent: "North", "East", "South", "West", "Stop" | Each state is a GameState object |
| Inventory environment | InventoryEnvironment | See irlc/ex01/inventory_environment.py | Discrete(3) | Discrete(3) |
| Bob's friend environment | BobFriendEnvironment | See irlc/ex01/bobs_friend.py | Discrete(2) | All positive numbers |
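As a quick sanity check, the action spaces listed above can be inspected directly on the environment objects. The following is a minimal sketch; it assumes the environments expose the gym-style action_space attribute and the same reset() interface used for the Pacman environment later on this page:

from irlc.ex01.inventory_environment import InventoryEnvironment

env = InventoryEnvironment()
print(env.action_space)   # Expected to show Discrete(3), matching the table above
s, _ = env.reset()        # Same reset interface as the other environments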

This is a list of the notable models in this section of the course:

Models#

| Name | Class and file | Comments |
|------|----------------|----------|
| Pacman DP Model | | This model corresponds to the Pacman game. You will implement it as part of project 1. |

Pacman#

Warning

When you use the Pacman environment, I strongly recommend that you stick to the functions that are documented here (see GameState) and mentioned in the project description. The GameState object gives you access to other internal, game-specific functions and data structures, but their behavior may differ from what you expect, and I recommend that you don't use them.

Pacman is among the most complex environments considered in this course. Each state needs to keep track of:

  • Pacman's position

  • The ghosts' positions

  • The maze layout

  • The remaining food pellets

The states \(x_k\) will therefore be instances of a small class, GameState.

Let's create an environment and note that print(state) provides a convenient representation of the game configuration in the current state:

>>> from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_maze
>>> env = PacmanEnvironment(very_small_maze)
>>> s, _ = env.reset() # Works just like any other environment.
>>> s # Confirm the state is an object
<irlc.pacman.gamestate.GameState object at 0x7f7a385ff530>
>>> print(s)
%%%%%%
%<. .%
%  %%%
%%%%%%
Score: 0

The state has a few functions that tell Pacman what he can do. For instance, we can check which actions are available, or whether we have won or lost, as follows:

>>> from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_maze
>>> env = PacmanEnvironment(very_small_maze)
>>> s, _ = env.reset() # Works just like any other environment.
>>> print("Available actions are", s.A())
Available actions are ['South', 'East', 'Stop']
>>> print("Have we won?", s.is_won(), "have we lost?", s.is_lost())
Have we won? False have we lost? False

We can move around using the s.f(action) function as follows:

>>> from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_maze
>>> env = PacmanEnvironment(very_small_maze)
>>> s0, _ = env.reset() # Works just like any other environment.
>>> s1 = s0.f("East")
>>> s2 = s1.f("East")
>>> s3 = s2.f("East")
>>> print("Is s2 won?", s2.is_won(), "is s3 won?", s3.is_won())
Is s2 won? False is s3 won? True
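Since s.A() and s.f() together define the one-step dynamics, a common pattern in the DP exercises is to enumerate every state reachable from the current state. The following is a small sketch of this pattern (not part of the exercises themselves):

from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_maze

env = PacmanEnvironment(very_small_maze)
s, _ = env.reset()
# One-step lookahead: the state reached by each available action.
successors = {a: s.f(a) for a in s.A()}
for a, sp in successors.items():
    print("Taking", a, "leads to a won state:", sp.is_won())
env.close()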

Pacman with ghosts#

When there are \(G\) ghosts, Pacman becomes a multi-player game. Pacman is labelled player number 0, and the ghosts are players \(1, 2, \dots, G\).

The game then proceeds in turns, starting with Pacman. He makes a move, then each of the ghosts makes a move (in order), and finally we are back at Pacman's turn.

The following example illustrates the effect of Pacman and the ghost taking one step each:

>>> from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_haunted_maze
>>> print("The maze layout\n", very_small_haunted_maze)
The maze layout
 
%%%%%%
%P. .%
% %%%%
%   G%
%%%%%%

>>> env = PacmanEnvironment(very_small_haunted_maze)
>>> s0, _ = env.reset() # Works just like any other environment.
>>> s0.players() # Get the number of players (Pacman plus one ghost)
2
>>> s0.player() # Get the current player
0
>>> s0 = s0.f("East") # Pacman has now moved east
>>> s0.player() # It is now the ghost's turn
1
>>> s0 = s0.f("West") # The ghost moves west and it is Pacman's turn again
>>> env.close()
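If you want to advance the game by one full round (Pacman followed by every ghost), you can keep calling f() until it is player 0's turn again. Below is a small sketch of this pattern; for simplicity it picks the first available action for each player, whereas the ghosts in the game are actually modelled as moving at random:

from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_haunted_maze

env = PacmanEnvironment(very_small_haunted_maze)
s, _ = env.reset()
s = s.f(s.A()[0])  # Pacman (player 0) moves first.
# Let each ghost move in turn until it is Pacman's turn again (or the game ends).
while s.player() != 0 and not (s.is_won() or s.is_lost()):
    s = s.f(s.A()[0])
print("One full round completed. Won?", s.is_won(), "Lost?", s.is_lost())
env.close()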

Hint about the win-probability questions#

Warning

This hint does not add or change anything about the problem. It simply points out that the ghosts do not always have 3 actions, i.e., A() may contain fewer than 3 elements, and code that relies on this assumption may give wrong results.

The win-probability problems, in particular the one with two ghosts, are typically what cause the most trouble. If you pass all the other tests, the issue will most likely come down to the implementation of the p_next function, and specifically to how you compute the probabilities.

Let's say there is one ghost. In that case the ghost can have 1, 2, or 3 actions available. As an example, consider the maze plotted above:

>>> from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_haunted_maze
>>> env = PacmanEnvironment(layout_str=very_small_haunted_maze)
>>> s0, _ = env.reset() # Get starting state
>>> s0 = s0.f("East") # Pacman has now moved east
>>> print(f"It is the ghosts turn since {s0.player()=} and the actions are {s0.A()=}")
It is the ghosts turn since s0.player()=1 and the actions are s0.A()=['West']
>>> env.close()

Another important case where this occurs is when the game is lost, in which case both Pacman and the ghost have only a single action available (i.e., len(s.A()) == 1).

If you compute the probabilities using \(p(w | x_k, u) = \frac{1}{| \mathcal{A} |}\), where \(\mathcal{A}\) is the set of actions available to the ghost in a given position (see A()), this will not be a problem. However, for two ghosts you need to be a little more careful. In this case, a game update consists of:

  1. Pacman makes a move

  2. Ghost 1 takes one of \(\mathcal{A}'\) available actions

  3. Ghost 2 takes one of \(\mathcal{A}''\) available actions

The probability of reaching a particular next state \(w\) is therefore the probability of the last two events combined:

\[p(w | x, u) = \frac{1}{ | \mathcal{A}'| }\frac{1}{ | \mathcal{A}''| }\]

In most cases these probabilities will be \(\frac{1}{9}\). However, if e.g. ghost 1 eats Pacman, these probabilities can be correspondingly higher since \(| \mathcal{A}''| = 1\).

Tip

Note that when you implement the one-ghost case you will have to compute \(\frac{1}{ | \mathcal{A}'| }\). Thus the two-ghost case is just™ a matter of computing \(\frac{1}{ | \mathcal{A}'| }\) as in the one-ghost case and multiplying it by the inverse of the number of actions for the second ghost (i.e., \(\frac{1}{ | \mathcal{A}''| }\)).
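To make the bookkeeping concrete, here is a rough sketch of how the two-ghost probabilities could be accumulated. It is not the p_next function from the project (the function name, signature, and surrounding model are hypothetical and up to you), and it assumes GameState objects can be used as dictionary keys; it only illustrates the nested loop over the two ghosts' actions:

from collections import defaultdict

def two_ghost_transition_probabilities(x, u):
    """Sketch (not the official p_next): distribution over states w after Pacman
    takes action u in state x and both ghosts then move uniformly at random."""
    probs = defaultdict(float)
    s = x.f(u)                    # Pacman moves; it is now ghost 1's turn.
    for a1 in s.A():              # Ghost 1: each action has probability 1/|A'|.
        s1 = s.f(a1)
        for a2 in s1.A():         # Ghost 2: each action has probability 1/|A''|.
            w = s1.f(a2)
            probs[w] += 1 / (len(s.A()) * len(s1.A()))
    assert abs(sum(probs.values()) - 1) < 1e-8   # Sanity check: must sum to 1.
    return dict(probs)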

The takeaways are:

  • Check your probabilities sum to 1.

  • Don’t normalize your probabilities at the end of p_next. If they don’t sum to 1, you have a bug.

class irlc.pacman.gamestate.GameState(prevState=None)[source]#

A GameState specifies the full game state, including the food, capsules, agent configurations and score changes.

GameStates are used by the Game object to capture the actual state of the game and can be used by agents to reason about the game.

Much of the information in a GameState is stored in a GameStateData object. We strongly suggest that you access that data via the accessor methods below rather than referring to the GameStateData object directly.

Note that in classic Pacman, Pacman is always agent 0.

To get you started, here are some examples.

>>> from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_haunted_maze
>>> env = PacmanEnvironment(layout_str=very_small_haunted_maze)
>>> state, _ = env.reset() # Get starting state
>>> print(state)
%%%%%%
%<. .%
% %%%%
%   G%
%%%%%%
Score: 0

In the above code, state is a GameState instance, i.e., it has all the methods found in this class. So, for instance, to know if the game is won or lost you can do:

>>> from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_haunted_maze
>>> env = PacmanEnvironment(layout_str=very_small_haunted_maze)
>>> state, _ = env.reset() # Get starting state
>>> print("Did we win?", state.is_won(), "did we loose?", state.is_lost())
Did we win? False did we loose? False

Or to get the available actions, and then the next state that results from taking an action a:

>>> from irlc.pacman.pacman_environment import PacmanEnvironment, very_small_haunted_maze
>>> env = PacmanEnvironment(layout_str=very_small_haunted_maze)
>>> state, _ = env.reset() # Get starting state
>>> actions = state.A()
>>> print("Available actions are", actions)
Available actions are ['South', 'East', 'Stop']
>>> next_state = state.f(actions[0]) # Take the first action
>>> print(next_state) # Result of taking the first of the available actions.
%%%%%%
% . .%
%^%%%%
%   G%
%%%%%%
Score: -1

When a ghost moves, it selects uniformly at random among the available actions. Thus, the probability of any single move is 1/len(state.A()).
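In code, the resulting distribution over the ghost's successor states can be written as in the sketch below (a hypothetical helper, assuming GameState objects are hashable so they can serve as dictionary keys):

def ghost_move_distribution(state):
    # Each available action is chosen with probability 1/len(state.A()).
    return {state.f(a): 1 / len(state.A()) for a in state.A()}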

player()[source]#

Return the current player.

The players take turns. Initially player=0, meaning it is Pacman's (your) turn. If there are ghosts, player will then increment as each ghost moves, and once all ghosts have moved, player = 0 again and the game is ready for the next round.

Return type:

int

Returns:

The id of the player who will make the next move.

players()[source]#

Return the total number of players.

Returns:

The number of ghosts + 1 (Pacman).

A()[source]#

Return the available actions for the current player in this state.

If the state is won/lost, the actions will be just the stop-action: ["Stop"].

Returns:

Available actions as a list.

f(a)[source]#

Let the current player take action a.

This will return a new GameState corresponding to the current player taking an action.

Parameters:

a (str) – The action to take.

Return type:

object

Returns:

The next GameState.

is_lost()[source]#

Determine if this is a lost game.

Returns:

True if this GameState corresponds to a lost game (a ghost ate Pacman).

is_won()[source]#

Determine if this is a won game.

Returns:

True if this GameState corresponds to a won game (all pellets eaten).

__init__(prevState=None)[source]#