Growing an AI with NEAT
Let's grow an artificial brain! The idea came after watching this video by Seth Bling, and another one by Code Bullet. They both implement a solution called NeuroEvolution of Augmenting Topologies, or NEAT for short. It is a genetic algorithm for the generation of evolving artificial neural networks.
Unlike them, I went with an existing NEAT implementation, and since my main language is C#, I chose SharpNEAT. It provides a nice framework and handles all the neuro-evolution parts, so I only have to tweak a few parameters and I can focus on implementing my simulation.
Let's Get Started
My ultimate goal is to grow an artificial brain that can play Wolfenstein 3D. For that, I intend to recycle my old Codename Z code and add a parser for actual Wolf3D maps. But I figured I should start with something simple, so I can have some sense of achievement in the beginning before I rage-quit. For that I made a simple tile-based maze using Prim's Algorithm, and the "player" (or rather the agent) can only step one tile at a time, no in-between positions.
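The maze generation can be sketched roughly like this. My actual project is in C#, so this is just a minimal randomized-Prim's implementation in Python for illustration:

```python
import random

def prim_maze(width, height, seed=None):
    """Generate a tile-based maze with randomized Prim's algorithm.
    Cells are True (wall) or False (free); rooms sit at odd coordinates,
    so width and height should be odd."""
    rng = random.Random(seed)
    maze = [[True] * width for _ in range(height)]
    maze[1][1] = False  # carve the starting room
    # Frontier entries are (x, y, nx, ny): a carved room and a candidate
    # room two tiles away, with a wall between them.
    steps = ((2, 0), (-2, 0), (0, 2), (0, -2))
    frontier = [(1, 1, 1 + dx, 1 + dy) for dx, dy in steps]
    while frontier:
        x, y, nx, ny = frontier.pop(rng.randrange(len(frontier)))
        if 0 < nx < width and 0 < ny < height and maze[ny][nx]:
            maze[ny][nx] = False                        # carve the new room
            maze[(y + ny) // 2][(x + nx) // 2] = False  # knock down the wall between
            frontier += [(nx, ny, nx + dx, ny + dy) for dx, dy in steps]
    return maze
```

Because Prim's algorithm grows a spanning tree, the resulting maze has exactly one path between any two tiles, which is what makes dead ends (and backtracking) matter later.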
Since NEAT is all about figuring out the inner connections on its own through neuro-evolution, I only need to provide the inputs, outputs, and a fitness evaluation.
The input is what the brain senses. In this case it is the state of the four neighboring tiles: whether each is a wall (1) or a free space (0) it can move into next. The output is what comes out of the neural network. It is the consequence of both the input and the internal state, and tells which step the brain chooses to take next.
For the outputs, each value is taken as a floating point number between 0 and 1, and interpreted as true if greater than 0.5. By design, the agent does not accept conflicting commands: only one direction can be active at a time, otherwise no movement is made.
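Decoding the four outputs boils down to a few lines. This is a sketch of the rule described above, not SharpNEAT's actual API:

```python
def decode_outputs(outputs):
    """Map four activations in [0, 1] to a single move, or None on conflict.
    `outputs` is ordered (up, down, left, right); a value counts as
    active when it is greater than 0.5."""
    directions = ("up", "down", "left", "right")
    active = [d for d, v in zip(directions, outputs) if v > 0.5]
    # Conflicting commands (zero or several active outputs): no movement.
    return active[0] if len(active) == 1 else None
```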
During the simulation, each agent performs the same tasks, and in the end a fitness score is assigned based on its performance. The agent has no idea about its fitness during the simulation. When a whole generation of agents has been evaluated, the ones with the best scores are used to create offspring for the next generation. This happens by copying the neural network of the parent, then randomly adding or removing neurons and connections, and mutating existing ones. This way, by random chance, they might become better or worse than their parents.
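The shape of that select-copy-mutate loop can be shown in a toy form. Real NEAT mutations are structural (adding and removing neurons and connections, which SharpNEAT handles for me); this sketch only jitters weights, just to illustrate the generational step:

```python
import random

def next_generation(population, evaluate, keep=0.1, sigma=0.5, rng=random):
    """Toy generational step: score every genome, keep the best fraction,
    and fill the next generation with mutated copies of the survivors.
    (Real NEAT also adds/removes neurons and connections; here a genome
    is just a flat list of weights, and mutation jitters one of them.)"""
    scored = sorted(population, key=evaluate, reverse=True)
    parents = scored[:max(1, int(len(population) * keep))]
    children = []
    while len(children) < len(population):
        genome = list(rng.choice(parents))    # copy a parent's genome
        i = rng.randrange(len(genome))
        genome[i] += rng.gauss(0, sigma)      # mutate one weight
        children.append(genome)
    return children
```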
NEAT can also handle multiple simultaneous species. These are separate neural networks that develop independently of each other, but in each generation a certain number of children can inherit properties from two different species. This way they have a chance to inherit and combine beneficial features from both parents and become more successful than them alone.
To measure fitness I used a single metric: the distance to the exit. The closer the agent gets, the better the fitness. I scaled it so the starting tile is 0 and the exit is 1. Some dead ends are farther from the exit than the start, which would produce a negative value, so those are clipped to 0; a fitness score is therefore always between 0 and 1.
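In code, the scaling is just a linear interpolation between the start distance and zero, clipped at the bottom. A Python sketch, assuming the tile distances come from a breadth-first search over the maze:

```python
def fitness(dist_to_exit, start_dist):
    """Scale distance-to-exit so the start tile scores 0 and the exit 1.
    Tiles farther from the exit than the start would score negative,
    so clip the result to 0."""
    score = 1.0 - dist_to_exit / start_dist
    return max(0.0, score)
```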
Since most agents will not make it out, I have to cut the simulation short: each agent has 750 steps to finish the maze, otherwise it is terminated, and its fitness score is assigned based on its position at termination.
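The per-agent evaluation loop then looks roughly like this. It's a hedged sketch: `agent.step`, `maze.sense`, `maze.move`, and `maze.fitness_at` are hypothetical stand-ins for the real simulation code, not SharpNEAT calls:

```python
MAX_STEPS = 750

def evaluate(agent, maze):
    """Run one agent on one maze; return its fitness at the exit,
    or at wherever it stood when the step budget ran out."""
    pos = maze.start
    for _ in range(MAX_STEPS):
        direction = agent.step(maze.sense(pos))  # feed inputs, read decoded output
        pos = maze.move(pos, direction)          # no-op on None or a wall
        if pos == maze.exit:
            break
    # Fitness is based on the final position, whether the exit was reached or not.
    return maze.fitness_at(pos)
```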
I started with a population of 1000 and the default genome parameters (mutation rates). The very first generation figured out that it needs to go right and down to advance, then got stuck there for the next few thousand generations.
1st Milestone: Discovering Up and Left
The first evolutionary milestone came several hours later, when the agent figured out how to move in the other two directions, and started finishing most of the mazes.
At this point, I only evaluated each agent against 20 mazes that were randomized each time (not just for each generation, but for each agent). I realized this was not doing much good: some mazes are just a straight line across while others require backtracking, so fitness depended on the luck of drawing simpler mazes rather than on actual performance. To fix this, I specified a fixed seed for each maze, and also raised the number of mazes to 100.
While I was at it, I figured I might as well cache the mazes too, along with the fitness values (represented by the green tiles). That sped up the simulation significantly.
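With fixed seeds, every agent of every generation sees the exact same 100 mazes, so scores become comparable, and the mazes only need to be built once. A self-contained sketch, where `build_maze` and `evaluate` are hypothetical hooks into the rest of the simulation:

```python
MAZE_SEEDS = range(100)  # fixed seeds: every agent sees the same 100 mazes
_maze_cache = {}         # each maze is built once, then reused forever

def overall_fitness(agent, build_maze, evaluate):
    """Average an agent's fitness over the fixed maze set,
    caching each maze by its seed on first use."""
    total = 0.0
    for seed in MAZE_SEEDS:
        if seed not in _maze_cache:
            _maze_cache[seed] = build_maze(seed)
        total += evaluate(agent, _maze_cache[seed])
    return total / len(MAZE_SEEDS)
```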
This change also meant I had to start over with a fresh population, but this time around it was much faster because of the caching. With the discovery of up and left, the best agents in each generation could finish about 70% of the mazes.
2nd Milestone: Full Backtracking
That 70% completion rate went up by about 1% every 100 to 200 generations until it reached 100%. Now, that does not mean it can solve every maze I throw at it, only that it can solve these particular 100. So I tested it on a million mazes, and it solved 98.52% of them.
Here are some of the results:
Backtracking was ultimately the most time-consuming feature to learn, slowly extending how far the agent could spring back after hitting a dead end. It's interesting to see that whenever the process was struggling to reach the next milestone, the number of extra neurons and connections went up, and once the problem was solved, the network gradually got simpler again. This complexification and simplification is a key feature of NEAT: instead of having a predefined number of neurons and connections, the network "grows" to match the complexity of the task. Take a look at these two examples:
Interestingly, it did not figure out the easiest solution, which is to stick to either the left or the right wall and just track along it. My theory is that this would require a different input/output scheme, where the agent has a facing direction, and every sense and movement is relative to that.
SharpNEAT provides a few parameters to tweak regarding the mutation rate, the number of simultaneous species, and the number of cross-species offspring in each generation. Keeping the mutation rate low risks offspring not mutating enough in one generation to make a difference in the fitness score, while a higher mutation rate risks generating more defective offspring and possibly devolving the species. During the simulations I constantly changed the values, but these are the ones I generally used:
- Number of species: 30
- Interspecies mating: 0.01% - 0.1%
- Mutate connection weights: 60% - 80% (this is inverse, so 100% means no mutation)
- Mutate add neuron: 0.01% - 0.25% (chance of adding a new neuron)
- Mutate add connection: 0.01% - 0.25% (chance of adding a new connection between two neurons)
- Mutate delete connection: around half of the previous value
It's hard to tell how many generations it ultimately took, because I stopped, tweaked, and resumed it so many times, but it's somewhere around 5,000. Maybe for the next one I'll add a persistent counter.