Balancing the Inverted Pendulum

Keywords: #ros2 #kiwi #gazebo #docker #pid #inverted pendulum

I have (finally) gotten around to getting KIWI to balance in simulation. All in, it took me about an hour to write up the pid_controller and get it to stay upright, and a little longer to nail down twist tracking via a /cmd_vel command. The core logic is remarkably simple, and I think it serves as a great showcase of the “power of PID”. In the same way that people tend to overlook linear regression because it’s not shiny enough, I really think people underestimate how far a well-tuned PID can get you.

Before we look at the control diagram, I want to emphasise a phrase from above, where I mentioned tracking a /cmd_vel command. There are a million and one inverted pendulum projects online and, upon inspection, many of them stop abruptly at achieving balance, maybe moving forward/back and left/right if the author was feeling clever. I think the reason for this is an underestimation of two factors:

  1. Where the difficult part of balancing an inverted pendulum actually is:

    In honesty, getting an inverted pendulum to stay upright is actually trivial. Your feedback only really needs to be in the right order of magnitude to achieve “not falling over”. It’s easy to get tunnel vision and think that once you’ve achieved this, the next phase of navigation and path planning will come just as easily. This usually stems from…

  2. Not understanding how vital it is that you have some form of units with your control.

    Sounds obvious, but hear me out. It’s easy to not realise that a robot (usually) operates in a discretised environment, which is completely different to how a person tends to navigate. To move to a goal, you don’t need to estimate how far away it is, calculate how long it will take at your current velocity, or any of that. You simply walk to it. A robot requires a discretisation of not only the navigation space, but also the control space. Your path planner is going to measure distances to your goals, obstacles, and everything in between. Your commands are going to have units like $\frac{m}{s}$ and $\frac{rad}{s}$, and following a trajectory requires that your robot can actually move at those velocities. If it cannot, you’re in for a rough time when you start attempting navigation.
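To make the units point concrete, here is a minimal sketch of what "moving at those velocities" means at the wheel level. It assumes a two-wheeled differential-drive base; the function name, `wheel_radius`, and `track_width` are hypothetical and not taken from KIWI's actual code.

```python
def twist_to_wheel_rates(v, omega, wheel_radius, track_width):
    """Convert a body twist into wheel angular velocity targets.

    v            -- commanded linear velocity in m/s
    omega        -- commanded angular (yaw) velocity in rad/s
    wheel_radius -- wheel radius in m (assumed geometry)
    track_width  -- distance between the two wheels in m (assumed geometry)

    Returns (left, right) wheel angular velocities in rad/s.
    """
    # Linear speed of each wheel's contact point, in m/s.
    # Sign convention (an assumption): positive omega speeds up the right wheel.
    v_left = v - omega * track_width / 2.0
    v_right = v + omega * track_width / 2.0
    # Dividing by the radius turns m/s at the rim into rad/s at the axle.
    return v_left / wheel_radius, v_right / wheel_radius
```

If the wheel rates that come out of a conversion like this are not ones the robot can physically hold while balancing, the planner's $\frac{m}{s}$ and $\frac{rad}{s}$ targets are meaningless, which is the whole point of insisting on units.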

In a few posts, I’ll go over properly quantifying how well you can follow a trajectory, and use that system to compare different control methods as part of an automated CI system. But for now, let’s have a look at our pid_control_node. As can be seen, it consists of a few subscribers, a hand-rolled PID class, and a low-pass filter. It’s ugly having them all in the same file, and good practice would suggest splitting them up, but I wanted to be able to point readers towards a single file that gives the whole picture. Not shown is the keyboard_control_node I wrote, which takes in WASD and outputs a /cmd_vel for KIWI to track. This gives our hierarchy:

keyboard_control_node -> pid_controller -> joint_effort_controller

Or, described visually:

What may surprise you is how many PIDs we have: three in total. It may be tempting to look at the Linear and Balance PIDs and consider them to be an example of cascaded PID. Ignore that voice in your head: for them to be cascaded, at least one of the PIDs must take the output of a parent PID as an input. That forms an inner and an outer loop, allowing the outer loop to control the set point, or target, that the inner loop tracks. No such architecture can be found here; these PIDs are simply being summed. The need for this sort of architecture arises from the fact that we have three control targets that are (sort of) independent in our state space, but are tracked in our control space via the same output. That is to say, achieving any of our target states (angular velocity, linear velocity, or torso angle) happens through a combined effort command sent to the motors.
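As a rough illustration of "summed, not cascaded", here is a minimal sketch. The `PID` class, the `wheel_efforts` function, and the sign conventions are all assumptions made for the example, not the contents of the actual pid_control_node.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID. Illustrative only, not KIWI's hand-rolled class."""
    kp: float
    ki: float = 0.0
    kd: float = 0.0
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float, dt: float) -> float:
        self.integral += error * dt
        # Note: prev_error starts at 0, so the first call sees a derivative kick.
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def wheel_efforts(balance, linear, angular,
                  pitch_err, lin_vel_err, ang_vel_err, dt):
    """Sum three independent PID outputs into (left, right) wheel efforts.

    Each PID sees only its own error. No PID output is used as another
    PID's set point, which is what would make the structure cascaded.
    """
    u_balance = balance.update(pitch_err, dt)
    u_linear = linear.update(lin_vel_err, dt)
    u_angular = angular.update(ang_vel_err, dt)
    # Balance and linear terms drive both wheels together; the angular term
    # is applied differentially (sign convention is an assumption).
    common = u_balance + u_linear
    return common - u_angular, common + u_angular
```

In a cascaded design, by contrast, the linear PID's output would be fed in as the pitch set point of the balance PID rather than added to its output.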

The result is that all three control targets can act antagonistically, fighting each other for their targets, with none of them ending up successful. There is a surprisingly simple solution to this: paying attention to the nature of our control targets. Keeping KIWI upright is a fundamentally different control problem to achieving a linear velocity, in that balancing has a (relatively speaking) high-frequency solution space, while velocity control is low-frequency. Taking advantage of this, we can pipe our wheel velocity feedback through a low-pass filter, which filters out the wheel motion caused by the corrections of the high-gain balance PID. As a result, the fast corrections being made by the balance PID are not propagated to the Linear/Angular PIDs. This comes at the cost of response time for the velocity tracking, but that’s alright…
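For reference, a first-order low-pass filter of the kind described can be sketched in a few lines. This is an assumed, generic form (exponential smoothing with a cutoff in Hz); the class name and the cutoff value are hypothetical, and the filter in the actual node may differ.

```python
import math


class LowPassFilter:
    """First-order (exponential) low-pass filter."""

    def __init__(self, cutoff_hz: float):
        # Time constant of the equivalent RC filter.
        self.tau = 1.0 / (2.0 * math.pi * cutoff_hz)
        self.state = None

    def update(self, sample: float, dt: float) -> float:
        if self.state is None:
            # Initialise on the first sample to avoid a start-up transient from 0.
            self.state = sample
        else:
            # alpha -> 0 smooths heavily, alpha -> 1 passes the input through.
            alpha = dt / (self.tau + dt)
            self.state += alpha * (sample - self.state)
        return self.state


# A 1.0 m/s cruise with a fast +/-0.5 m/s wobble from balance corrections.
lpf = LowPassFilter(cutoff_hz=1.0)  # cutoff is an illustrative guess
filtered = 0.0
for i in range(500):
    wobble = 0.5 if i % 2 == 0 else -0.5
    filtered = lpf.update(1.0 + wobble, dt=0.01)
```

The velocity PIDs would then compute their error against `filtered` rather than the raw wheel velocity, so they respond to the slow trend and mostly ignore the balance wobble. A lower cutoff hides more of the wobble but makes velocity tracking slower, which is the response-time cost mentioned above.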

Some driving:

Alright, you may be thinking “Wow, that looks awful!”, and you’d be correct! It IS awful. Sure, I could spend a few hours dialling it in if I wanted to waste precious life fiddling with PIDs, but that would take some of the impact out of the next few posts, where I go over more advanced control methods. Don’t get me wrong, there is tuning in our future. However, I’m trying to exploit the Pareto principle as much as I can here, and I actually have a life (believe it or not).

Stay ready for the next entry, where we do a deep dive into LQR, otherwise known as “babby’s first cost function”! I am personally very excited to get into it.