# A Mixed-signal 3D Footstep Planning SoC for Motion Control of Humanoid Robots with Embedded Zero-Moment-Point based Gait Scheduler and Neural Inverse Kinematics

Qiankai Cao, Juin Chuen Oh, Jie Gu

Northwestern University, Evanston, IL, USA

**Abstract** - This work presents a footstep planning SoC for humanoid robots. A time-domain graph search engine for 3D footstep planning and a mixed-signal zero moment point (ZMP) gait scheduler with neural inverse kinematics are developed for efficient robot motion control. A 65nm SoC is fabricated and demonstrated in situ on a humanoid robot, achieving state-of-the-art search rate and energy efficiency for humanoid robot control and footstep planning.

#### Introduction

Humanoid robots have recently drawn significant interest. Compared to wheeled mobile robots, humanoid robots with human-like joint systems enable high degree-of-freedom (DOF) locomotion for complex tasks, e.g. search and rescue, housework, or medical treatment. However, there are significant challenges in motion control of such robots. First, 3D footstep planning on a humanoid robot [1] is complicated and computationally heavy because of the added dimension of height and special movements, e.g. stepping over or stepping onto objects. Second, the complex 10~20-DOF kinematic model for robot joint control leads to a high computation workload. Third, to maintain balance and prevent falls, the trajectory of the robot's center of mass (CoM) must be carefully controlled through the zero moment point (ZMP).

Prior works have investigated 2D path planning, such as a wavefront expansion of the A\* algorithm for graph search [3] and an oscillator-based NeuroSLAM accelerator for mobile wheeled robots [4]. An efficient mixed-signal accelerator was developed for real-time swarm intelligence [5]. A motion-control ASIC for robots was also designed, mainly focusing on industrial robot arms [6]. Despite the above works, an SoC solution for humanoid robots has been lacking; this work presents one. As highlighted in Fig. 1, the contributions of this work include: (1) a time-domain graph search engine for 3D footstep planning featuring 3D search, D\* replanning for on-the-fly adjustment, blocking of redundant paths and efficient readout of search results; (2) an efficient mixed-signal ZMP gait scheduler for robot balancing; (3) a time-domain neural-network-based inverse kinematic module for robot joint control; (4) in-situ demonstrations on a real assembled robot, with the 65nm SoC rendering 2.7X overall energy saving for graph search and 18.4X higher energy efficiency for motion control compared with prior works.

### **Humanoid Robot SoC Design and Implementation**

Fig. 2 shows the chip top-level architecture, which contains (1) a 40x40 time-domain graph search engine with special mixed-signal circuits for high-level 3D footstep planning, (2) a ZMP gait trajectory generator that controls the center of mass (CoM) for robot balancing, (3) a hybrid time-digital domain neural network as an inverse kinematic estimator for joint control, and (4) a motor control module with UART to manage external motors via CAN bus. After the high-level footstep planning is completed, the ZMP gait scheduler module is enabled to pass the CoM trajectory in (X,Y,Z) format to low-level joint control. Subsequently, the top control switches to low-level control, where the neuro-kinematic module is activated to convert the Cartesian-space positions of the end-effectors to the 10-DOF joint space for each motor on the robot. Final motion control commands are transmitted through the motor control module using the CAN bus and UART protocol. For demonstration, a mini-FPGA is used for scan chain and memory loading into the SoC.

Fig. 3 elaborates the details of the 3D footstep planning and the time-domain graph search engine. While 2D occupancy grid maps typically meet the needs of wheeled robots [4], humanoid robots require additional terrain height information to account for the special movements of stepping over or onto objects in 3D space. Instead of the widely used A\* algorithm, the more sophisticated D\* replanning algorithm [7] is adopted in this work, enabling the robot to adjust its path while heading to the destination. In the circuit implementation shown in Fig. 4, a 40x40 vertex array is deployed to generate the locomotion trajectory. The map information, e.g. the distance of a single step or the height of stairs, is mapped into programmable 2-bit delay cells at the interconnects between vertices. Inside each vertex, time-domain signals arrive from eight directions: four planar directions, as in prior work [3], and another four directions for the new dimension of height, for stepping-over or stepping-on movements. Each "vertex lock" circuit includes multiple NAND and NOR gates and a DFF for catching the earliest time-domain signal and producing a 'Lock' signal for later tasks. A set of "direction lock" (DL) modules records the direction of the first-arriving time-domain signal. The time-domain signals propagate as a wavefront through the whole map, so that the shortest path is locked in the DL circuits. Besides static planning, this work also supports D\* on-the-fly replanning when the environment changes, e.g. when an object is moved by the robot. As shown in Fig. 4, after a map update, another round of graph search starting at the changed node is processed without repeating the map search on unrelated predecessors, leading to 1.8X savings over the conventional A\* algorithm. To implement the D\* algorithm, time-domain NOR and NAND gates are used to forward-trace the successors of changed nodes. A global unlock signal is issued for replanning to unlock all the successors, followed by relaunching of the remaining search. To further improve hardware efficiency, predetermined redundant paths on the map are blocked so that only the relevant portion of the map is processed, rendering an average energy saving of 32.9% over 50 randomly generated map search tasks. Finally, rather than a full memory scan outputting all direction values as in [3], this design traces back only along the shortest path by using the direction information. This leads to a 29.1X speedup compared with a full memory scan. Overall, a 2.7X energy saving on path planning is achieved over prior work [3] thanks to the low-power techniques in this work.

Fig. 5 shows details of the ZMP-based CoM control for the gait scheduler and the neuro-kinematic circuits for robot joint control. ZMP refers to the point on the ground at which the robot's total moment is zero. The robot remains dynamically stable only if the ZMP is kept inside a supporting region. ZMP is used to create the target CoM trajectory of the robot. A mixed-signal circuit with a VCO and MUXs synthesizes the targeted sinusoidal-like CoM trajectory with 3.4X power saving compared with an equivalent digital counterpart. The resulting sinusoidal trajectory (X, Y, Z) is sent out as time pulses for inverse kinematics (IK) calculation. Because IK involves highly complex trigonometric computation, a neural network (NN) is used to approximate the calculation. As the ZMP gait scheduler produces time-domain pulses, a hybrid neuro-kinematic circuit is implemented, consisting of 8-bit (4-bit MSB and 4-bit LSB) time-domain MACs (TDMAC) [8] for the first layer of the NN and digital MACs for the later layers. The mixed-signal neuro-kinematic circuit achieves 12.1X area saving and 1.8X latency reduction on motion control compared with a digital design. A 2% loss of accuracy is observed using neuro-kinematics, and an additional 1% accuracy is lost from the time-domain implementation.
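The relation between a sinusoidal CoM trajectory and the ZMP stability condition can be illustrated numerically. The sketch below assumes the common cart-table (linear inverted pendulum) approximation, x_zmp = x_com - (z_c / g) * x_com'', which the text does not state explicitly. It considers a single axis only, and the amplitude, frequency, CoM height, and support half-width are hypothetical values chosen for the example.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def zmp_from_com(x, x_ddot, com_height):
    """Cart-table model (assumed): ZMP from CoM position and acceleration."""
    return x - (com_height / G) * x_ddot

def sinusoidal_com(amplitude, freq_hz, t):
    """Sinusoidal CoM sway and its analytic second derivative at time t."""
    w = 2.0 * math.pi * freq_hz
    x = amplitude * math.sin(w * t)
    x_ddot = -amplitude * w * w * math.sin(w * t)
    return x, x_ddot

def zmp_stays_in_support(amplitude, freq_hz, com_height, half_width, samples=200):
    """Sample one sway period and check that |ZMP| never exceeds half_width."""
    period = 1.0 / freq_hz
    for k in range(samples):
        x, x_ddot = sinusoidal_com(amplitude, freq_hz, k * period / samples)
        if abs(zmp_from_com(x, x_ddot, com_height)) > half_width:
            return False
    return True

# Hypothetical numbers: 2 cm sway at 1 Hz with the CoM 25 cm above the ground.
print(zmp_stays_in_support(0.02, 1.0, 0.25, half_width=0.05))  # True
print(zmp_stays_in_support(0.02, 1.0, 0.25, half_width=0.03))  # False
```

For a sinusoidal CoM the ZMP is also sinusoidal, with its amplitude scaled by (1 + z_c * w^2 / g). A faster or larger sway therefore needs a wider supporting region, which is why the CoM trajectory has to be scheduled rather than chosen freely.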

#### **Measurement Results**

A 65nm test chip is fabricated. Fig. 6 shows the demonstration on a real assembled humanoid robot with measured results. Fig. 7 shows the die photo and a comparison table. Because there are no direct prior works on SoCs for humanoid robots, comparisons are made with mobile-robot and pathfinding works. The energy efficiency for motion control in this work is 645Hz/mW, which is 18.4X higher than prior work [6], thanks to the mixed-signal circuit implementations. Compared with prior 2D pathfinding [3], this work demonstrates more complex 3D footstep planning with a 1.6X higher search rate and an overall 2.7X improvement in energy per task due to the aforementioned low-power features.
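The improvement factors quoted above follow directly from the entries of the Fig. 7 comparison table. The short check below simply recomputes them; the variable and function names are illustrative.

```python
# Values copied from the comparison table in Fig. 7.
CONTROL_EFF_THIS_WORK = 645.0      # Hz/mW
CONTROL_EFF_PRIOR = 35.0           # Hz/mW, prior work [6]
ENERGY_PER_TASK_PRIOR = 1166.2     # pJ, prior work [3]
ENERGY_PER_TASK_THIS_WORK = 424.7  # pJ
SEARCH_RATE_THIS_WORK = 910e6      # edges/sec
SEARCH_RATE_PRIOR = 559e6          # edges/sec, prior work [3]

def ratio(numerator, denominator):
    """Improvement factor expressed as a plain ratio."""
    return numerator / denominator

control_gain = ratio(CONTROL_EFF_THIS_WORK, CONTROL_EFF_PRIOR)
energy_gain = ratio(ENERGY_PER_TASK_PRIOR, ENERGY_PER_TASK_THIS_WORK)
rate_gain = ratio(SEARCH_RATE_THIS_WORK, SEARCH_RATE_PRIOR)

print(f"{control_gain:.1f}X")  # 18.4X
print(f"{energy_gain:.1f}X")   # 2.7X
print(f"{rate_gain:.1f}X")     # 1.6X
```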



Fig. 1 Humanoid robot control diagram, challenges, and contributions.



Fig. 2 Chip top-level architecture and processing sequences of this work.



Fig. 3 3D footstep planning, D\* algorithms and assembled robot.



Fig. 4 Circuit design of time-based graph search engine for 3D footstep planning, replanning, blocking of redundant paths and scan-out.



Fig. 5 Detailed description and implementation of ZMP gait scheduler and neuro-kinematics for inverse kinematics in low-level motion control.



Fig. 6 Measurement results and robot demonstration.

Die photo: 2 mm x 1.67 mm; labeled blocks include the 40x40 Graph Array, NN Kinematics, and ZMP.

| | ISSCC'18 [3] | ISSCC'20 [4] | ISSCC'19 [5] | ISSCC'23 [6] | This work |
|---|---|---|---|---|---|
| Application | A\* shortest path | SLAM | Swarm intelligence | Motion control | 3D footstep planning + Motion control |
| Hardware | 65nm/Time | Mixed-signal | 65nm | 28nm/Digital | 65nm/MS Time |
| Graph size | 40 x 40 | 7 x 7 | - | - | 40 x 40 |
| Total area (mm<sup>2</sup>) | 0.4 | 5 | 2 | 3.56 | 3.34 |
| Memory size | 2.3 kB | 37.9 kB | 16 kB | - | 22 kB |
| Frequency | - | 78.2-130.8 MHz | 1 kHz-1.5 MHz | 200 MHz | 1 MHz |
| Peak power | 26.4 mW | 17.87 mW | 3.4 uW | 142 mW | 432.8 uW |
| Energy efficiency for control | - | - | - | 35 Hz/mW | 645 Hz/mW <sup>1)</sup> |
| Energy efficiency for NN | - | 8.0 TOPS/W | 1.1-9.1 TOPS/W | - | 3.2-6.5 TOPS/W |
| Path planning: energy per task | 1166.2 pJ <sup>2)</sup> | - | - | - | 424.7 pJ |
| Path planning: search rate (edges/sec) | 559M <sup>3)</sup> | - | - | - | 910M <sup>3)</sup> |

1) Efficiency for control = Control rate/Power (20Hz control rate is used in this work from motor spec

Fig. 7 Die photo and Comparison table with prior works.

#### Acknowledgements

This work was supported in part by NSF under grant number CCF-1846424.

## References

- [1] K. Garimort, et al. ICRA, 2011
- [2] C. Chung, et al. ISSCC, 2020
- [3] L. Everson, et al. ISSCC, 2019
- [4] J. Yoon, et al. ISSCC, 2020
- [5] N. Cao, et al. ISSCC, 2019
- [6] I. Lin, et al. ISSCC, 2023
- [7] S. Koenig, et al. AAAI, 2002
- [8] Y. Toyama, et al. JSSC, 2019