Machine-Learned Movement with ML-Agents (using PPO)

After several failed attempts, I trained a working self-balancing inverted-pendulum robot that can move across the terrain.

1.1. Problems I encountered

Some self-intersecting colliders blocked the arm movement, so I used a custom script to ignore collisions between those parts (Physics.IgnoreCollision).
Setting the penalty for falling too low produced faster movement, as intended, but greatly worsened the use of the arms for balancing.
The motor strength was too low for hills.

1.2. Setup

My setup was very modular, using interfaces for respawning the target and agent, which will make training other agents a lot easier.

It trained for 10 hours on a high-end desktop PC and was still improving at the end, though at a slower rate.

The exact setup is not complicated but involves many details and thousands of lines of code.
The main parts are:

All positions, rotations, and velocities are expressed relative to the white cube seen in the picture, which is rotated towards the target on the x,z plane and kept aligned with the global y-axis. The motivation is to describe the movement relative to gravity and to the direction of the target, which is more stable across all possible poses.
The reward is 1 * (alignment of the forward direction with the orientationCube forward direction) + 10 * dot(speedVector, goalDirection).
Inputs:
- velocity, angularVelocity, position, rotation for each joint
- current joint force
- global position of the main body part and of the target (part beyond the head)
- raycasts for terrain height, plus raycasts from the body for better terrain estimation
Outputs:
- joint position (interpolated between max and min for each moveable axis)
- motor torque
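
The orientation frame, the reward, and the joint-position output described above can be sketched roughly as follows. This is a plain Python illustration of the math, not the Unity code of the project. Several things are assumptions: forward is taken as +z and up as +y (the Unity convention), the alignment term is taken to be a dot product of unit vectors, the goal direction is taken to be the cube's forward direction, and actions are assumed to arrive in [-1, 1]. All function names are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Unit-length copy of v (zero vector stays zero)."""
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v) if n > 0 else (0.0, 0.0, 0.0)


def orientation_yaw(agent_pos, target_pos):
    """Yaw about the global y-axis that points +z at the target on the x,z plane."""
    dx = target_pos[0] - agent_pos[0]
    dz = target_pos[2] - agent_pos[2]
    return math.atan2(dx, dz)


def frame_axes(yaw):
    """Right and forward axes of the orientation frame; up stays the global y-axis."""
    forward = (math.sin(yaw), 0.0, math.cos(yaw))
    right = (math.cos(yaw), 0.0, -math.sin(yaw))
    return right, forward


def to_local(vec, yaw):
    """Express a world-space direction or velocity in the orientation frame.

    For positions, subtract the frame origin before calling this.
    """
    right, forward = frame_axes(yaw)
    return (dot(vec, right), vec[1], dot(vec, forward))


def step_reward(body_forward, velocity, yaw, w_align=1.0, w_speed=10.0):
    """1 * alignment + 10 * speed towards the goal (weights from the text)."""
    _, goal_dir = frame_axes(yaw)  # assumption: goal direction = frame forward
    alignment = dot(normalize(body_forward), goal_dir)  # assumption: dot product
    speed_to_goal = dot(velocity, goal_dir)
    return w_align * alignment + w_speed * speed_to_goal


def action_to_joint_angle(action, min_angle, max_angle):
    """Interpolate a joint target between its limits from an action in [-1, 1]."""
    a = max(-1.0, min(1.0, action))  # clamp out-of-range actions
    t = (a + 1.0) / 2.0              # map [-1, 1] to [0, 1]
    return min_angle + t * (max_angle - min_angle)
```

Because the frame only rotates about the global y-axis, the vertical component of every observation stays aligned with gravity no matter how the body itself is tilted, which is the stability argument made above.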

1.3. Result

I am very satisfied with the results; they complement our main technical achievement of visually appealing visualizations. In contrast to hard-coded animations with Inverse Kinematics or even procedural animations, our system reacts to external forces very naturally. The built-in springs and physical movements significantly strengthen the impression of observing a real robot.

1.4. Next Steps

The next step will be to incorporate it into an RTS system so that the player can give movement commands to the robots, and to find a suitable recovery method if they trip (use some magic force to stand up, or try to train it with ML).
