An Algorithm for Solving the Game 2048 in Java

codingman included in Algorithms Java

2018-03-26

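1. Introduction

**Recently, we looked at an algorithm for solving the game 2048.** We discussed it from a theoretical point of view, without any real code behind it.

**Here, we're going to write an implementation of it in Java.** It will play as both the human and the computer player, showing how a more optimal game can be played.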

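2. Initial Setup

The first thing we need is a setup in which we can play the game and see how it progresses.

This will give us all the constructs we need to play the game, along with a complete implementation of the computer player, which only places random tiles anyway. That leaves us room to implement a "human" player to play the game.

2.1. The Board

First of all, we need a game board. This is a grid of cells into which numbers can be placed.

To make things a bit easier to work with, let's begin with a simple representation of a cell location. This is just a wrapper around a pair of coordinates: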

public class Cell {
    private final int x;
    private final int y;
    // constructor, getters, and toString
}
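The constructor, getters, and toString are elided in the listing above. A minimal sketch of how they might be filled in follows. The class is named CellSketch only to keep it apart from the article's Cell, and the toString format is an assumption.

```java
// Hypothetical completion of the Cell wrapper; not the article's exact code.
final class CellSketch {
    private final int x;
    private final int y;

    public CellSketch(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() {
        return x;
    }

    public int getY() {
        return y;
    }

    @Override
    public String toString() {
        // Assumed format, chosen only so that cells are readable in logs
        return "Cell(" + x + ", " + y + ")";
    }
}
```

Because both fields are final and there are no setters, a cell can be shared freely between boards without defensive copies.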

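We can now write a class to represent the board itself. This stores the values in a simple two-dimensional array but lets us access them through the Cell class above: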

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Board {
    private final int[][] board;
    private final int score;
    public Board(int size) {
        this.board = new int[size][];
        this.score = 0;
        for (int x = 0; x < size; ++x) {
            this.board[x] = new int[size];
            for (int y = 0; y < size; ++y) {
                board[x][y] = 0;
            }
        }
    }
    public int getSize() {
        return board.length;
    }
    public int getScore() {
        return score;
    }
    public int getCell(Cell cell) {
        return board[cell.getX()][cell.getY()];
    }
    public boolean isEmpty(Cell cell) {
        return getCell(cell) == 0;
    }
    public List<Cell> emptyCells() {
        List<Cell> result = new ArrayList<>();
        for (int x = 0; x < board.length; ++x) {
            for (int y = 0; y < board[x].length; ++y) {
                Cell cell = new Cell(x, y);
                if (isEmpty(cell)) {
                    result.add(cell);
                }
            }
        }
        return result;
    }
}

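**This is an immutable class that represents a board and lets us query it to find out the current state.** It also keeps track of the current score, which we'll come to later.

2.2. A Computer Player and Placing Tiles

Now that we have a board, we want to be able to play with it. The first thing we want is the computer player, because it is a purely random player and will be exactly what we need later on.

The computer player does nothing more than place a tile into a cell, so we need some way to do that on our board. We want to keep the board immutable, so placing a tile will produce a brand new board in the new state.

First, we want a constructor that takes the actual board state, as opposed to our earlier one that just constructed a blank board: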

private Board(int[][] board, int score) {
    this.score = score;
    this.board = new int[board.length][];
    for (int x = 0; x < board.length; ++x) {
        this.board[x] = Arrays.copyOf(board[x], board[x].length);
    }
}

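This is private so that it can only be used by other methods within the same class. This helps with our encapsulation of the board.

**Next, we'll add a method to place a tile.** This returns a brand new board that is identical to the current one, except that it has the given number in the given cell: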

public Board placeTile(Cell cell, int number) {
    if (!isEmpty(cell)) {
        throw new IllegalArgumentException("That cell is not empty");
    }
    Board result = new Board(this.board, this.score);
    result.board[cell.getX()][cell.getY()] = number;
    return result;
}
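The copy-then-modify idea behind placeTile can be shown on a raw grid, without the rest of the Board class. The sketch below is an illustration under that assumption; PlaceTileSketch and withTile are hypothetical names.

```java
import java.util.Arrays;

// Sketch of copy-on-write tile placement on a raw grid (illustrative names).
final class PlaceTileSketch {
    // Returns a new grid with `number` at (x, y); the input grid is not modified.
    static int[][] withTile(int[][] grid, int x, int y, int number) {
        if (grid[x][y] != 0) {
            throw new IllegalArgumentException("That cell is not empty");
        }
        int[][] copy = new int[grid.length][];
        for (int i = 0; i < grid.length; ++i) {
            // Copy each inner array so the two grids share no storage
            copy[i] = Arrays.copyOf(grid[i], grid[i].length);
        }
        copy[x][y] = number;
        return copy;
    }
}
```

Copying each inner array matters here: copying only the outer array would leave both grids pointing at the same rows, and a write to one would show up in the other.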

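Finally, **we'll write a new class that represents the computer player.** It has a single method that takes the current board and returns the new one: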

import java.security.SecureRandom;
import java.util.List;

public class Computer {
    private final SecureRandom rng = new SecureRandom();
    public Board makeMove(Board input) {
        List<Cell> emptyCells = input.emptyCells();
        double numberToPlace = rng.nextDouble();
        int indexToPlace = rng.nextInt(emptyCells.size());
        Cell cellToPlace = emptyCells.get(indexToPlace);
        return input.placeTile(cellToPlace, numberToPlace >= 0.9 ? 4 : 2);
    }
}

**This gets the list of every empty cell on the board, picks one at random, and then puts a number in it.** We randomly put a "4" into the cell 10% of the time and a "2" the other 90%.
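The 90/10 split can be checked in isolation. The following is a minimal sketch that assumes a seeded java.util.Random in place of the article's SecureRandom, purely so that runs are repeatable. TileChoiceSketch and its method names are hypothetical.

```java
import java.util.Random;

// Sketch of the 90%/10% tile choice; a seeded Random is used only for repeatability.
final class TileChoiceSketch {
    // Same rule as the listing above: a draw of 0.9 or more yields a 4, anything else a 2.
    static int pickTileValue(Random rng) {
        return rng.nextDouble() >= 0.9 ? 4 : 2;
    }

    // Fraction of 4s over `samples` draws; this should land close to 0.1.
    static double fractionOfFours(Random rng, int samples) {
        int fours = 0;
        for (int i = 0; i < samples; ++i) {
            if (pickTileValue(rng) == 4) {
                ++fours;
            }
        }
        return (double) fours / samples;
    }
}
```

Since nextDouble() returns values from 0.0 (inclusive) to 1.0 (exclusive), the test `>= 0.9` selects the top tenth of that range.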

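2.3. A "Human" Player and Shifting Tiles

The next thing we need is a "human" player. **This isn't the end goal yet, but a purely random player that picks a random direction to shift the tiles every time it makes a move.** It will then serve as the place where we build our optimal player.

First, we need to define an enumeration of the possible moves that can be made: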

public enum Move {
    UP,
    DOWN,
    LEFT,
    RIGHT
}

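**Next, we need to augment the Board class to support making moves by shifting tiles in one of these directions.** To reduce the complexity here, we rotate the board so that we're always shifting tiles in the same direction.

This means we need a way both to transpose and to reverse the board: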

private static int[][] transpose(int[][] input) {
    int[][] result = new int[input.length][];
    for (int x = 0; x < input.length; ++x) {
        result[x] = new int[input[0].length];
        for (int y = 0; y < input[0].length; ++y) {
            result[x][y] = input[y][x];
        }
    }
    return result;
}
private static int[][] reverse(int[][] input) {
    int[][] result = new int[input.length][];
    for (int x = 0; x < input.length; ++x) {
        result[x] = new int[input[0].length];
        for (int y = 0; y < input[0].length; ++y) {
            result[x][y] = input[x][input[0].length - y - 1];
        }
    }
    return result;
}

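Transposing the board swaps all rows and columns, so that the top edge becomes the left edge. Reversing the board mirrors it along the second index. In the orientation the code uses (x across, y down), this means the top edge becomes the bottom edge.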
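A small worked example makes the two operations easier to follow. The sketch below restates them for a square grid indexed as grid[x][y]. RotationSketch is a hypothetical name, and the square-grid assumption is mine.

```java
// Sketch showing how transpose and reverse act on a small square grid indexed as grid[x][y].
final class RotationSketch {
    // Swaps the two indices: result[x][y] takes the value from input[y][x].
    static int[][] transpose(int[][] input) {
        int n = input.length; // square grid assumed
        int[][] result = new int[n][n];
        for (int x = 0; x < n; ++x) {
            for (int y = 0; y < n; ++y) {
                result[x][y] = input[y][x];
            }
        }
        return result;
    }

    // Reverses the order of the values inside each inner array.
    static int[][] reverse(int[][] input) {
        int n = input.length; // square grid assumed
        int[][] result = new int[n][n];
        for (int x = 0; x < n; ++x) {
            for (int y = 0; y < n; ++y) {
                result[x][y] = input[x][n - y - 1];
            }
        }
        return result;
    }
}
```

For the grid {{1, 2}, {3, 4}}, transpose gives {{1, 3}, {2, 4}} and reverse gives {{2, 1}, {4, 3}}. Each operation is its own inverse, so applying the same operations in the opposite order after a shift restores the original orientation.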

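Next, we add a method to the Board that makes a move in a given direction and returns a new Board in the new state.

We start by making a copy of the board state that we can then work with: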

public Board move(Move move) {
    // Clone the board (newScore is declared further down, next to result)
    int[][] tiles = new int[this.board.length][];
    for (int x = 0; x < this.board.length; ++x) {
        tiles[x] = Arrays.copyOf(this.board[x], this.board[x].length);
    }

Next, we manipulate our copy so that we're always going to be shifting tiles up:

if (move == Move.LEFT || move == Move.RIGHT) {
    tiles = transpose(tiles);
}
if (move == Move.DOWN || move == Move.RIGHT) {
    tiles = reverse(tiles);
}

We also need another array of tiles, this time the one into which we'll build the final result, and a tracker for the new score gained by this move:

int[][] result = new int[tiles.length][];
int newScore = 0;

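Now that we're ready to start shifting tiles, and have manipulated things so that we're always working in the same direction, we can begin.

**We can shift each column independently of the others.** We just need to iterate over the columns and repeat, starting by building yet another copy of the tiles that we're shifting.

This time we build them into a LinkedList because we want to be able to pop values off it easily. We also only add the actual tiles that have numbers and skip over the empty ones.

This achieves our shifting, but not yet the merging of tiles: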

for (int x = 0; x < tiles.length; ++x) {
    LinkedList<Integer> thisRow = new LinkedList<>();
    for (int y = 0; y < tiles[0].length; ++y) {
        if (tiles[x][y] > 0) {
            thisRow.add(tiles[x][y]);
        }
    }

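Next, we need to merge tiles. We need to do this separately from the above; otherwise, we risk merging the same tile multiple times.

We do this by building another LinkedList of the tiles from the one above, but this time merging as we go: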

LinkedList<Integer> newRow = new LinkedList<>();
while (thisRow.size() >= 2) {
    int first = thisRow.pop();
    int second = thisRow.peek();
    if (second == first) {
        int newNumber = first * 2;
        newRow.add(newNumber);
        newScore += newNumber;
        thisRow.pop();
    } else {
        newRow.add(first);
    }
}
newRow.addAll(thisRow);

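Here we're also calculating the new score for this move. This is the sum of the tiles created as a result of merges.

We can now build this into the result array. Once we've run out of tiles from our list, the rest is populated with the value "0" to indicate that those cells are blank: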

    result[x] = new int[tiles[0].length];
    for (int y = 0; y < tiles[0].length; ++y) {
        if (newRow.isEmpty()) {
            result[x][y] = 0;
        } else {
            result[x][y] = newRow.pop();
        }
    }
}
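The three steps inside the loop (drop the blanks, merge neighbours, pad with zeros) can be tried on a single line of tiles. The following is a minimal sketch that leaves out score tracking. ShiftLineSketch and shiftLine are hypothetical names.

```java
import java.util.LinkedList;

// Sketch of the shift, merge, and pad steps for a single line of tiles.
final class ShiftLineSketch {
    // Shifts tiles towards index 0, merging equal neighbours at most once each.
    static int[] shiftLine(int[] line) {
        // Step 1: keep only the non-empty tiles, in order
        LinkedList<Integer> tiles = new LinkedList<>();
        for (int value : line) {
            if (value > 0) {
                tiles.add(value);
            }
        }
        // Step 2: merge equal neighbours into a second list
        LinkedList<Integer> merged = new LinkedList<>();
        while (tiles.size() >= 2) {
            int first = tiles.pop();
            int second = tiles.peek();
            if (first == second) {
                merged.add(first * 2);
                tiles.pop(); // the second tile is consumed by the merge
            } else {
                merged.add(first);
            }
        }
        merged.addAll(tiles);
        // Step 3: write the tiles back; the remaining cells stay at 0
        int[] result = new int[line.length];
        for (int i = 0; i < result.length && !merged.isEmpty(); ++i) {
            result[i] = merged.pop();
        }
        return result;
    }
}
```

With this sketch, {2, 0, 2, 4} becomes {4, 4, 0, 0}: the two 2s merge, but the resulting 4 does not merge again with the existing 4 in the same move. That is the reason merged tiles go into a separate list.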

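Once we've finished shifting tiles, we need to manipulate them back into the correct rotation. This is the exact opposite of what we did earlier: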

if (move == Move.DOWN || move == Move.RIGHT) {
    result = reverse(result);
}
if (move == Move.LEFT || move == Move.RIGHT) {
    result = transpose(result);
}

Finally, we can build and return a new board with this new set of tiles and the newly calculated score:

    return new Board(result, this.score + newScore);
}

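**We're now in a position to write our random "human" player.** This does nothing more than generate a random move and call the method above to play it: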

import java.security.SecureRandom;

public class Human {
    private final SecureRandom rng = new SecureRandom();
    public Board makeMove(Board input) {
        Move move = Move.values()[rng.nextInt(4)];
        return input.move(move);
    }
}

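2.4. Playing the Game

**We have enough components to play the game, albeit not very successfully.** Soon, however, we'll improve the way the Human class plays, and this will let us see the difference easily.

First, we need a way to print out the board.

For this example, we're just going to print to the console, so System.out.print is good enough. For a real game, we'd want better graphics: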

private static void printBoard(Board board) {
    StringBuilder topLines = new StringBuilder();
    StringBuilder midLines = new StringBuilder();
    for (int x = 0; x < board.getSize(); ++x) {
        topLines.append("+--------");
        midLines.append("|        ");
    }
    topLines.append("+");
    midLines.append("|");
    for (int y = 0; y < board.getSize(); ++y) {
        System.out.println(topLines);
        System.out.println(midLines);
        for (int x = 0; x < board.getSize(); ++x) {
            Cell cell = new Cell(x, y);
            System.out.print("|");
            if (board.isEmpty(cell)) {
                System.out.print("        ");
            } else {
                StringBuilder output = new StringBuilder(Integer.toString(board.getCell(cell)));
                while (output.length() < 8) {
                    output.append(" ");
                    if (output.length() < 8) {
                        output.insert(0, " ");
                    }
                }
                System.out.print(output);
            }
        }
        System.out.println("|");
        System.out.println(midLines);
    }
    System.out.println(topLines);
    System.out.println("Score: " + board.getScore());
}
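The inner while loop in printBoard centres each number in an 8-character cell by adding spaces to alternating sides. As an illustration, the same loop can be pulled out into a helper. CenterSketch and center are hypothetical names, not part of the article's code.

```java
// Sketch of the centring loop from printBoard, extracted into a helper (hypothetical name).
final class CenterSketch {
    // Pads `text` with spaces on alternating sides until it is `width` characters wide.
    static String center(String text, int width) {
        StringBuilder output = new StringBuilder(text);
        while (output.length() < width) {
            output.append(" ");         // right side first
            if (output.length() < width) {
                output.insert(0, " ");  // then the left side
            }
        }
        return output.toString();
    }
}
```

When the padding cannot be split evenly, the extra space ends up on the right, so a single-digit tile in an 8-character cell gets three spaces before it and four after.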

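We're nearly ready to go. We just need to set things up.

This means creating the board and the two players, and having the computer make two initial moves, that is, placing two random numbers on the board: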

Board board = new Board(4);
Computer computer = new Computer();
Human human = new Human();
for (int i = 0; i < 2; ++i) {
    board = computer.makeMove(board);
}

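And now we have the actual game loop. This is a repetition of the human and computer players taking turns, stopping only when there are no empty cells left: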

printBoard(board);
do {
    System.out.println("Human move");
    System.out.println("==========");
    board = human.makeMove(board);
    printBoard(board);
    System.out.println("Computer move");
    System.out.println("=============");
    board = computer.makeMove(board);
    printBoard(board);
} while (!board.emptyCells().isEmpty());
System.out.println("Final Score: " + board.getScore());

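At this point, if we were to run the program, we would see a random game of 2048 being played.

3. Implementing the 2048 Player

Once we have a base from which to play the game, we can start implementing the "human" player and play a better game than just picking a random direction.

3.1. Simulating Moves

The algorithm we're implementing here is based on the Expectimax algorithm. As such, the core of the algorithm is to simulate every possible move, allocate a score to each one, and select the one that does best.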
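Before applying this to the board, it may help to see the general shape of expectimax on a toy tree: our own turns take the maximum over their children, while random events take a probability-weighted average. The node classes below are purely illustrative and are not part of the article's code.

```java
import java.util.Arrays;
import java.util.List;

// Toy expectimax over an explicit tree; node classes are illustrative, not from the article.
final class ExpectimaxSketch {
    interface Node {
        double value();
    }

    // Terminal position with a fixed heuristic score.
    static final class Leaf implements Node {
        private final double score;
        Leaf(double score) { this.score = score; }
        public double value() { return score; }
    }

    // Our own move: we pick the child with the highest value.
    static final class MaxNode implements Node {
        private final List<Node> children;
        MaxNode(Node... children) { this.children = Arrays.asList(children); }
        public double value() {
            return children.stream().mapToDouble(Node::value).max().orElse(0.0);
        }
    }

    // Random event: each outcome is weighted by its probability.
    static final class ChanceNode implements Node {
        private final double[] probabilities;
        private final Node[] outcomes;
        ChanceNode(double[] probabilities, Node[] outcomes) {
            this.probabilities = probabilities;
            this.outcomes = outcomes;
        }
        public double value() {
            double expected = 0.0;
            for (int i = 0; i < outcomes.length; ++i) {
                expected += probabilities[i] * outcomes[i].value();
            }
            return expected;
        }
    }
}
```

For example, a chance node with a 90% outcome worth 10 and a 10% outcome worth 100 is worth 0.9 × 10 + 0.1 × 100 = 19, so a max node choosing between it and a certain 15 prefers the chance node.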

We'll be making heavy use of Java 8 Streams to help structure this code, for reasons we'll see later.

We'll start by rewriting the *makeMove()* method in our Human class:

public Board makeMove(Board input) {
    return Arrays.stream(Move.values())
      .map(input::move)
      .max(Comparator.comparingInt(board -> generateScore(board, 0)))
      .orElse(input);
}

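For every possible direction we can move in, we generate the new board and then start the scoring algorithm, passing in this board and a depth of 0. We then select the move that has the best score.

Our *generateScore()* method then simulates every possible computer move, that is, placing either a "2" or a "4" into every empty cell, and then sees what could happen next: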

private int generateScore(Board board, int depth) {
    if (depth >= 3) {
        return calculateFinalScore(board);
    }
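    // Pair is not part of the JDK; getFirst()/getSecond() match, for example,
    // org.apache.commons.math3.util.Pair from Apache Commons Math
    return board.emptyCells().stream()
      .flatMap(cell -> Stream.of(new Pair<>(cell, 2), new Pair<>(cell, 4)))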
      .mapToInt(move -> {
          Board newBoard = board.placeTile(move.getFirst(), move.getSecond());
          int boardScore = calculateScore(newBoard, depth + 1);
          return (int) (boardScore * (move.getSecond() == 2 ? 0.9 : 0.1));
      })
      .sum();
}

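If we've reached our depth limit, we stop immediately and calculate a final score for how good this board is; otherwise, we continue with our simulation.

Our *calculateScore()* method is then the continuation of our simulation, running the human move side of the equation.

This is very similar to the *makeMove()* method above, but we return the ongoing score instead of the actual board: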

private int calculateScore(Board board, int depth) {
    return Arrays.stream(Move.values())
      .map(board::move)
      .mapToInt(newBoard -> generateScore(newBoard, depth))
      .max()
      .orElse(0);
}

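3.2. Scoring Final Boards

We're now in a situation where we can simulate moves back and forth by the human and computer players, stopping when we've simulated enough of them. We need to be able to generate a score for the final board in each simulation branch so that we can see which branch is the one we want to pursue.

Our scoring is a combination of factors, each of which we apply to every row and every column on the board. These all get summed together, and the total is returned.

As such, we need to generate a list of rows and columns to score. This snippet and the ones that follow go into the calculateFinalScore() method called from generateScore() above: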

List<List<Integer>> rowsToScore = new ArrayList<>();
for (int i = 0; i < board.getSize(); ++i) {
    List<Integer> row = new ArrayList<>();
    List<Integer> col = new ArrayList<>();
    for (int j = 0; j < board.getSize(); ++j) {
        row.add(board.getCell(new Cell(i, j)));
        col.add(board.getCell(new Cell(j, i)));
    }
    rowsToScore.add(row);
    rowsToScore.add(col);
}

Then we take the list that we've built, score each entry, and sum the scores together. This is a placeholder that we're about to fill out:

return rowsToScore.stream()
    .mapToInt(row -> {
        int score = 0;
        return score;
    })
    .sum();

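Finally, we need to actually generate our scores. This goes inside the lambda above, and several different factors all contribute:

A fixed score for every row
The sum of every number in the row
Every merge possible in the row
Every empty cell in the row
The monotonicity of the row. This represents how far the row is organized in ascending numerical order.

Before we can calculate the scores, we need to build some extra data.

First, we want a list of the numbers with the blank cells removed: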

List<Integer> preMerged = row.stream()
  .filter(value -> value != 0)
  .collect(Collectors.toList());

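We can then make some counts from this new list: the number of adjacent cells with the same number, plus the total size of the steps where the numbers strictly descend and where they strictly ascend: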

int numMerges = 0;
int monotonicityLeft = 0;
int monotonicityRight = 0;
for (int i = 0; i < preMerged.size() - 1; ++i) {
    Integer first = preMerged.get(i);
    Integer second = preMerged.get(i + 1);
    if (first.equals(second)) {
        ++numMerges;
    } else if (first > second) {
        monotonicityLeft += first - second;
    } else {
        monotonicityRight += second - first;
    }
}

Now we can calculate our score for this row:

int score = 1000;
score += 250 * row.stream().filter(value -> value == 0).count();
score += 750 * numMerges;
score -= 10 * row.stream().mapToInt(value -> value).sum();
score -= 50 * Math.min(monotonicityLeft, monotonicityRight);
return score;
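To see how the factors interact, the snippets above can be assembled into a single helper and run on a few rows. This is a sketch that keeps the article's weights. RowScoreSketch and scoreLine are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch combining the row-scoring snippets into one helper (hypothetical name scoreLine).
final class RowScoreSketch {
    static int scoreLine(List<Integer> row) {
        // Drop the blanks so that neighbours are compared across gaps
        List<Integer> preMerged = row.stream()
          .filter(value -> value != 0)
          .collect(Collectors.toList());
        int numMerges = 0;
        int monotonicityLeft = 0;
        int monotonicityRight = 0;
        for (int i = 0; i < preMerged.size() - 1; ++i) {
            int first = preMerged.get(i);
            int second = preMerged.get(i + 1);
            if (first == second) {
                ++numMerges;
            } else if (first > second) {
                monotonicityLeft += first - second;
            } else {
                monotonicityRight += second - first;
            }
        }
        int emptyCells = (int) row.stream().filter(value -> value == 0).count();
        int score = 1000;                                              // fixed score per row
        score += 250 * emptyCells;                                     // reward empty cells
        score += 750 * numMerges;                                      // reward available merges
        score -= 10 * row.stream().mapToInt(Integer::intValue).sum();  // penalise the tile total
        score -= 50 * Math.min(monotonicityLeft, monotonicityRight);   // penalise zig-zag rows
        return score;
    }
}
```

For the row [2, 2, 0, 4] this gives 1000 + 250 + 750 - 80 - 0 = 1920. The zig-zag row [2, 4, 2, 8] has no empty cells or merges and a monotonicity penalty of 50 × 2, giving 1000 - 160 - 100 = 740.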

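The numbers selected here are relatively arbitrary. Different numbers will affect how well the game plays, prioritizing different factors in how we play.

4. Improvements to the Algorithm

**What we have so far works, and we can see that it plays a good game, but it's slow.** Each human move takes around 1 minute. We can do better than this.

4.1. Parallel Processing

**The obvious thing we can do is to work in parallel.** This is a huge benefit of working with Java Streams: we can make the work parallel just by adding a single statement to each stream.

This change alone brings the time down to around 20 seconds per move.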
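As an illustration of that single statement, the sketch below runs the same reduction sequentially and in parallel. The scoring function is a made-up stand-in for an expensive board evaluation, and ParallelSketch with its methods are hypothetical names.

```java
import java.util.Arrays;

// Sketch: the same reduction run sequentially and in parallel (scores are made-up numbers).
final class ParallelSketch {
    // Stand-in for an expensive, side-effect-free evaluation such as scoring a simulated board.
    static int expensiveScore(int candidate) {
        int acc = candidate;
        for (int i = 0; i < 1000; ++i) {
            acc = (acc * 31 + i) % 10007;
        }
        return acc;
    }

    static int bestSequential(int[] candidates) {
        return Arrays.stream(candidates)
          .map(ParallelSketch::expensiveScore)
          .max()
          .orElse(0);
    }

    static int bestParallel(int[] candidates) {
        return Arrays.stream(candidates)
          .parallel() // the single added call
          .map(ParallelSketch::expensiveScore)
          .max()
          .orElse(0);
    }
}
```

Both methods return the same result because the mapping function has no shared mutable state. The immutable Board plays the same role in the article's pipelines. How much faster the parallel version runs depends on the machine and the workload.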

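4.2. Pruning Unplayable Branches

**The next thing we can do is prune out any branches that are unplayable.** That is, any time a human move results in an unchanged board. These are almost certainly branches that lead to worse outcomes, since they effectively give the computer a free move, but they cost us processing time to pursue.

To do this, we need to implement an equals method on our Board so that we can compare boards: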

@Override
public boolean equals(Object o) {
    if (this == o) {
        return true;
    }
    if (o == null || getClass() != o.getClass()) {
        return false;
    }
    Board board1 = (Board) o;
    return Arrays.deepEquals(board, board1.board);
}
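The listing above overrides equals only. That is enough for the filter used here, but Java's general contract expects equal objects to report equal hash codes, which matters if boards are ever placed in a HashSet or used as HashMap keys. A minimal sketch of a matching pair follows, with GridSketch as a hypothetical stand-in for Board.

```java
import java.util.Arrays;

// Sketch of an equals/hashCode pair over a 2D grid (GridSketch is a stand-in for Board).
final class GridSketch {
    private final int[][] board;

    GridSketch(int[][] board) {
        this.board = board;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        // deepEquals compares the nested arrays element by element
        return Arrays.deepEquals(board, ((GridSketch) o).board);
    }

    @Override
    public int hashCode() {
        // deepHashCode is consistent with deepEquals for nested arrays
        return Arrays.deepHashCode(board);
    }
}
```

As in the article's version, the score is left out of the comparison, so two boards with identical tiles count as equal however they were reached.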

We can then add some filters to our stream pipelines to stop processing anything that hasn't changed.

return Arrays.stream(Move.values())
    .parallel()
    .map(board::move)
    .filter(moved -> !moved.equals(board))
    ........

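This has minimal impact in the early parts of a game: when there are very few filled cells, there are very few moves that can be trimmed. Later on, however, it starts to make a much bigger impact, reducing move times down to only a few seconds.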