After I got the ‘gameenv’ API working, it became very easy to program an agent/player that picks completely random actions (with equal probability for every possible move). As shown in the source code test/test_gameenv.lua, all that’s required is to call gameenv.new_game() once, followed by gameenv.step(a) repeatedly until the game is over.
    for i = 1, opt.games do
        local terminal = false
        local cnt = 0
        gameenv.new_game()
        while not terminal do
            cnt = cnt + 1
            if cnt % opt.actstep == 0 then
                -- take a random action
                _, _, terminal = gameenv.step(actions[math.random(#actions)])
            else
                _, _, terminal = gameenv.step()
            end
        end
        gameenv.step(0)  -- release all buttons
        print(('Game #%3d, score = '):format(i) .. gameenv.get_score())
        history[#history + 1] = gameenv.get_score()
    end
I experimented with different opt.actstep values (the same action is repeated for this many consecutive steps/frames before another action is picked). The results are shown in the table below: average, minimum and maximum scores over 100 games for each of the 6 configurations. As it turned out, the random player with opt.actstep = 2 got the highest average score: 6314.3. So that would be the score to beat for my DQN AI agent!
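For reference, here is a minimal sketch of how the average, minimum and maximum could be computed from a list of per-game scores like the history table filled in by the loop above. The summarize helper is a hypothetical name of mine, not something in test/test_gameenv.lua, and the scores in the example are made up purely for illustration.

```lua
-- Summarise a list of per-game scores: mean, minimum and maximum.
-- `summarize` is a hypothetical helper, not part of test/test_gameenv.lua.
local function summarize(history)
  assert(#history > 0, 'history must not be empty')
  local sum, min, max = 0, math.huge, -math.huge
  for _, score in ipairs(history) do
    sum = sum + score
    if score < min then min = score end
    if score > max then max = score end
  end
  return sum / #history, min, max
end

-- Made-up scores, for illustration only (not real results).
local avg, min, max = summarize({ 1200, 8000, 400, 6400 })
print(('avg = %.1f, min = %d, max = %d'):format(avg, min, max))
```

Calling it once per opt.actstep setting, on that setting's own history, would give one row of the table.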
I also made some histograms for comparison: opt.actstep = 1 to 4, over 100 games each.
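A histogram of this kind boils down to counting how many scores land in each fixed-width bin. The sketch below shows one way that could be done in plain Lua, with a crude text rendering; the histogram function, the bin_width parameter and the sample scores are all assumptions for illustration, not the code or data behind the actual plots. It also assumes scores are non-negative.

```lua
-- Count how many scores fall into each fixed-width bin.
-- `histogram` and `bin_width` are hypothetical names for illustration.
local function histogram(history, bin_width)
  local bins, max_bin = {}, 0
  for _, score in ipairs(history) do
    local b = math.floor(score / bin_width)  -- bin 0 covers [0, bin_width)
    bins[b] = (bins[b] or 0) + 1
    if b > max_bin then max_bin = b end
  end
  return bins, max_bin
end

-- Made-up scores, for illustration only (not real results).
local bin_width = 2000
local bins, max_bin = histogram({ 1200, 8000, 400, 6400, 1900 }, bin_width)

-- Print one row per bin, with one '#' per game in that bin.
for b = 0, max_bin do
  local lo, hi = b * bin_width, (b + 1) * bin_width - 1
  print(('%5d-%5d | %s'):format(lo, hi, ('#'):rep(bins[b] or 0)))
end
```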