After several failed attempts, I trained a working inverted-pendulum, self-balancing robot that can move across the terrain.

1.1. Problems I encountered

  • some self-intersecting colliders blocked the arm movement => I used a custom script to ignore collisions between those parts (Physics.IgnoreCollision)
  • setting the penalty for falling too low produced faster movement, as intended, but greatly worsened the use of the arms for balancing
  • the motor strength was too low for hills

1.2. Setup

My setup is very modular: it uses interfaces for respawning the target and the agent, which will make training other agents a lot easier.

It trained for 10 hours on a high-end desktop PC and was still improving at the end, though at a slower rate.

The exact setup is not complicated, but it involves many details and thousands of lines of code.
The main parts are:

  • all positions, rotations, and velocities are relative to the white cube seen in the picture, which is rotated towards the target in the x,z plane and aligned with the global y-axis; the motivation is to express the movement relative to gravity and to the direction of the target, which is more stable across all possible poses
  • the reward is 1*(alignment of the forward direction with the orientationCube forward) + 10*(DotProduct(speedVector, goalDirection))
  • inputs:
    • velocity, angularVelocity, position, and rotation for each joint
    • current joint force
    • global position of the main body part and of the target (part beyond the head)
    • raycasts for terrain height, plus raycasts from the body for better terrain estimation
  • outputs:
    • joint position (interpolated between max and min for each moveable axis)
    • motor torque
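
The main parts above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the project's Unity code. All function names are hypothetical, the alignment term is assumed to be a dot product of unit vectors, and the policy outputs are assumed to lie in [-1, 1]; none of these details are confirmed by the setup described here.

```python
import math


def orientation_forward(body_pos, target_pos):
    """Forward axis of the orientation cube: the direction from the body
    to the target, projected onto the x,z plane (y stays global up)."""
    dx = target_pos[0] - body_pos[0]
    dz = target_pos[2] - body_pos[2]
    length = math.hypot(dx, dz)
    if length < 1e-9:
        # Assumption: fall back to global forward when the target is
        # directly above or below the body.
        return (0.0, 0.0, 1.0)
    return (dx / length, 0.0, dz / length)


def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def step_reward(body_forward, velocity, goal_dir, w_align=1.0, w_speed=10.0):
    """Reward = 1 * alignment + 10 * speed towards the goal.
    Assumption: alignment is measured as dot(body_forward, goal_dir)."""
    return w_align * dot(body_forward, goal_dir) + w_speed * dot(velocity, goal_dir)


def joint_target(action, min_angle, max_angle):
    """Interpolate a policy output (assumed in [-1, 1]) between the
    minimum and maximum angle of one movable joint axis."""
    clamped = max(-1.0, min(1.0, action))
    t = (clamped + 1.0) / 2.0
    return min_angle + t * (max_angle - min_angle)
```

With these weights, moving towards the target dominates the reward as soon as the robot is faster than about 0.1 units per second, which may be part of why tuning the fall penalty against it was delicate.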


1.3. Result

I am very satisfied with the results; they complement our main technical achievement of visually appealing visualizations. In contrast to hard-coded animations with Inverse Kinematics, or even Procedural Animations, our system reacts to external forces very naturally. The built-in springs and physical movements significantly strengthen the impression of observing a real robot.

1.4. Next Steps


The next step is to incorporate it into an RTS system so that the player can give the robots movement commands, and to find a suitable recovery method for when they trip (use some magic force to stand them up, or try to train recovery with ML).
