I'm learning about machine learning algorithms by implementing them in Java. The project also includes tests that run the implemented algorithms against simulations.
- NEAT Algorithm
- AlphaZero
  - Inference implementation
  - Gradient descent learning implementation
  - References:
    - Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
    - A Simple Alpha(Go) Zero Tutorial
- Gradient Descent Neural Network
- Q-Learning (update rule sketched below)
- Deep Q-Learning
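At the core of tabular Q-Learning is the update Q(s, a) ← Q(s, a) + α(r + γ·maxₐ′ Q(s′, a′) − Q(s, a)). Here is a minimal, self-contained sketch of that rule with an epsilon-greedy policy (class name and constants are illustrative, not the repository's actual API):

```java
import java.util.Random;

// Minimal tabular Q-Learning sketch (illustrative, not the repository's actual API):
// Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
public final class QLearningSketch {
    private final double[][] qTable;    // Q-values indexed by [state][action]
    private final double alpha = 0.1;   // learning rate (assumed value)
    private final double gamma = 0.9;   // discount factor (assumed value)
    private final double epsilon = 0.1; // exploration rate (assumed value)
    private final Random random = new Random();

    public QLearningSketch(int stateCount, int actionCount) {
        qTable = new double[stateCount][actionCount];
    }

    // Epsilon-greedy action selection: explore randomly with probability epsilon.
    public int selectAction(int state) {
        if (random.nextDouble() < epsilon) {
            return random.nextInt(qTable[state].length);
        }
        return argMax(qTable[state]);
    }

    // Apply the Q-Learning update after observing one transition.
    public void update(int state, int action, double reward, int nextState) {
        double bestNext = qTable[nextState][argMax(qTable[nextState])];
        qTable[state][action] += alpha * (reward + gamma * bestNext - qTable[state][action]);
    }

    private static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }
}
```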
- XOR test 👍

| | experiment #1 | experiment #2 |
| --- | --- | --- |
| **neat algorithm** | | |
| population size | 150 | |
| input topology | 1 for X<br>1 for Y | |
| output topology | 1 sigmoid | 2 sigmoid |
| bias topology | 1 with a bias of 1 | |
| initial hidden layer topology | 0 layers | |
| sample result | iteration: 1, generation: 37, species: 45, hidden nodes: 1, expressed connections: 6, total connections: 8, maximum fitness: 3.403556 | iteration: 1, generation: 4, species: 1, hidden nodes: 0, expressed connections: 6, total connections: 6, maximum fitness: 3.578723 |

*Blank cells reuse the value from experiment #1.*
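For context, a common way to score a candidate network on XOR is one point per input case minus the absolute output error, giving a 0–4 scale that would be consistent with the maximum fitness values above. A minimal sketch, assuming a hypothetical `NeuralNetwork` interface (the repository's actual types and fitness formula may differ):

```java
// Hypothetical interface standing in for an evolved NEAT phenotype.
interface NeuralNetwork {
    double[] activate(double[] inputs);
}

final class XorFitness {
    private static final double[][] INPUTS = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
    private static final double[] EXPECTED = {0, 1, 1, 0};

    // Scores a network between 0 and 4: one point per case, minus the absolute error.
    static double evaluate(NeuralNetwork network) {
        double fitness = 0.0;
        for (int i = 0; i < INPUTS.length; i++) {
            double output = network.activate(INPUTS[i])[0];
            fitness += 1.0 - Math.abs(EXPECTED[i] - output);
        }
        return fitness;
    }
}
```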
Cart pole test

| | experiment #1 | experiment #2 |
| --- | --- | --- |
| **neat algorithm** | | |
| population size | 150 | |
| input topology | 1 for cart position<br>1 for cart velocity<br>1 for pole angle<br>1 for pole velocity at tip | |
| output topology | 1 sigmoid | 2 sigmoid |
| bias topology | 1 with a bias of 1 | |
| initial hidden layer topology | 0 layers | |
| sample result | iteration: 1, generation: 11, species: 28, hidden nodes: 1, expressed connections: 6, total connections: 6, maximum fitness: 60.009998 | iteration: 1, generation: 3, species: 1, hidden nodes: 0, expressed connections: 10, total connections: 10, maximum fitness: 60.009998 |
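The four inputs correspond to the state of the classic cart-pole balancing problem. For reference, a minimal sketch of the standard cart-pole physics step (constants follow the common Barto/Sutton benchmark formulation; the repository's simulation may differ):

```java
// One Euler-integration step of the classic cart-pole dynamics.
final class CartPole {
    private static final double GRAVITY = 9.8;
    private static final double CART_MASS = 1.0;
    private static final double POLE_MASS = 0.1;
    private static final double TOTAL_MASS = CART_MASS + POLE_MASS;
    private static final double POLE_HALF_LENGTH = 0.5;
    private static final double FORCE_MAGNITUDE = 10.0;
    private static final double TIME_STEP = 0.02;

    double x, xVelocity, theta, thetaVelocity; // the four network inputs

    // pushRight selects between the two discrete actions (+F or -F).
    void step(boolean pushRight) {
        double force = pushRight ? FORCE_MAGNITUDE : -FORCE_MAGNITUDE;
        double cosTheta = Math.cos(theta);
        double sinTheta = Math.sin(theta);
        double temp = (force + POLE_MASS * POLE_HALF_LENGTH
                * thetaVelocity * thetaVelocity * sinTheta) / TOTAL_MASS;
        double thetaAcceleration = (GRAVITY * sinTheta - cosTheta * temp)
                / (POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cosTheta * cosTheta / TOTAL_MASS));
        double xAcceleration = temp - POLE_MASS * POLE_HALF_LENGTH
                * thetaAcceleration * cosTheta / TOTAL_MASS;

        // Euler integration of position and velocity for both cart and pole.
        x += TIME_STEP * xVelocity;
        xVelocity += TIME_STEP * xAcceleration;
        theta += TIME_STEP * thetaVelocity;
        thetaVelocity += TIME_STEP * thetaAcceleration;
    }

    // Failure condition in the common benchmark: pole beyond ~12 degrees or cart off track.
    boolean hasFallen() {
        return Math.abs(theta) > 12.0 * Math.PI / 180.0 || Math.abs(x) > 2.4;
    }
}
```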
Tic-tac-toe test

| | experiment #1 | experiment #2 | experiment #3 | experiment #4 |
| --- | --- | --- | --- | --- |
| **neat algorithm** | | | | |
| population size | 150 | | | |
| input topology | 1 for player 1<br>1 for player 2 | | | |
| output topology | 1 tanh (value network)<br>9 sigmoid (policy network) | 2 tanh (value network)<br>18 sigmoid (policy network) | | |
| bias topology | 0 | | | |
| initial hidden layer topology | 0 layers | 2 layers of 5, 5 | | |
| **classic monte carlo tree search duels** | | | | |
| training | 12 matches (6 as X player and 6 as O player) | | | |
| acceptance | 55% win rate vs 30 cached classic monte carlo simulations | | | |
| **alpha zero** | | | | |
| maximum expansions | 15 | | | |
| value reversed on player 2 | enabled | | | |
| state heuristic as value network | disabled | | | |
| policy reversed on player 2 | enabled | | | |
| dirichlet noise on root node | disabled | enabled (shape: 0.03, epsilon: 0.25) | | |
| cpuct | 1 | | | |
| back propagation | BackPropagationType.REVERSED_ON_OPPONENT | | | |
| temperature threshold | 3rd depth | | | |
| sample result | iteration: 1, generation: 195, species: 69, hidden nodes: 1, expressed connections: 20, total connections: 23, maximum fitness: 2.208129 | iteration: 1, generation: 138, species: 74, hidden nodes: 1, expressed connections: 41, total connections: 42, maximum fitness: 1.960799 | iteration: 1, generation: 130, species: 77, hidden nodes: 3, expressed connections: 42, total connections: 49, maximum fitness: 1.958826 | iteration: 1, generation: 61, species: 89, hidden nodes: 12, expressed connections: 138, total connections: 141, maximum fitness: 2.234746 |
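The cpuct, Dirichlet noise, and expansion rows above parameterize AlphaZero's PUCT selection rule, score(a) = Q(a) + cpuct · P(a) · √N_parent / (1 + N(a)). A minimal sketch of that rule, and of the root-noise mixing from experiment #2, using plain arrays and hypothetical names rather than the repository's types:

```java
// PUCT score used to pick the next child during AlphaZero tree selection:
//   score(a) = Q(a) + cpuct * P(a) * sqrt(N_parent) / (1 + N(a))
final class PuctSelection {
    // cpuct = 1, matching the experiments above.
    private static final double CPUCT = 1.0;

    // visitCounts[a] = N(a), totalValues[a] = W(a), priors[a] = P(a) from the policy network.
    static int selectChild(int[] visitCounts, double[] totalValues, double[] priors, int parentVisits) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < visitCounts.length; a++) {
            double meanValue = visitCounts[a] == 0 ? 0.0 : totalValues[a] / visitCounts[a]; // Q(a)
            double exploration = CPUCT * priors[a] * Math.sqrt(parentVisits) / (1 + visitCounts[a]);
            double score = meanValue + exploration;
            if (score > bestScore) {
                bestScore = score;
                best = a;
            }
        }
        return best;
    }

    // Root-node exploration from experiment #2: mix Dirichlet noise into the priors,
    //   P'(a) = (1 - epsilon) * P(a) + epsilon * noise(a), with epsilon = 0.25, shape = 0.03.
    static double[] mixDirichletNoise(double[] priors, double[] dirichletNoise, double epsilon) {
        double[] mixed = new double[priors.length];
        for (int a = 0; a < priors.length; a++) {
            mixed[a] = (1 - epsilon) * priors[a] + epsilon * dirichletNoise[a];
        }
        return mixed;
    }
}
```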
2048 test

| | experiment #1 |
| --- | --- |
| **monte carlo tree search** | |
| heuristics | snake shape board: 25%, higher free tile count: 25%, monotonicity: 50% |
| maximum selections | 200 |
| maximum simulation rollout depth | 8 |
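The three heuristics are blended by the listed weights into a single board evaluation used during rollouts. A minimal sketch of that weighted mix (the individual heuristic implementations here are illustrative stand-ins; the repository's scoring may differ):

```java
// Weighted evaluation of a 2048 board, using the weights from the table above:
// snake shape 25%, free tile count 25%, monotonicity 50%.
final class Board2048Heuristics {
    static double evaluate(int[][] board) {
        return 0.25 * snakeShapeScore(board)
             + 0.25 * freeTileScore(board)
             + 0.50 * monotonicityScore(board);
    }

    // Fraction of empty tiles: more free space means more room to maneuver.
    static double freeTileScore(int[][] board) {
        int free = 0, total = 0;
        for (int[] row : board) {
            for (int tile : row) {
                total++;
                if (tile == 0) free++;
            }
        }
        return (double) free / total;
    }

    // Fraction of adjacent horizontal pairs that are non-increasing left to right:
    // one simple notion of row monotonicity.
    static double monotonicityScore(int[][] board) {
        int ordered = 0, pairs = 0;
        for (int[] row : board) {
            for (int c = 0; c + 1 < row.length; c++) {
                pairs++;
                if (row[c] >= row[c + 1]) ordered++;
            }
        }
        return (double) ordered / pairs;
    }

    // Snake-shape stand-in: weight tiles along a serpentine path with exponentially
    // decaying weights, so large tiles are rewarded for hugging one corner.
    static double snakeShapeScore(int[][] board) {
        double weighted = 0.0, total = 0.0, weight = 1.0;
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board[r].length; c++) {
                int col = (r % 2 == 0) ? c : board[r].length - 1 - c; // alternate direction per row
                weighted += board[r][col] * weight;
                total += board[r][col];
                weight *= 0.5;
            }
        }
        return total == 0 ? 0.0 : weighted / total; // roughly normalized to [0, 1]
    }
}
```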