Journal Article
SCALES: SCALable and Area-Efficient Systolic Accelerator for Ternary Polynomial Multiplication
Coulon, Samuel; Bao, Tianyou; Xie, Jiafeng

Polynomial multiplication is a key component in many post-quantum cryptography and homomorphic encryption schemes. One recurring variation, ternary polynomial multiplication over the ring Z_q/(x^n+1), where one input polynomial has ternary coefficients {−1, 0, 1} and the other has large integer coefficients in [0, q−1], has recently drawn significant attention from various communities. Following this trend, this paper presents a novel SCALable and area-Efficient Systolic (SCALES) accelerator for ternary polynomial multiplication. In total, we have carried out three layers of coherent, interdependent efforts. First, we have rigorously derived a novel block-processing strategy and algorithm based on the schoolbook method for polynomial multiplication. Then, we have implemented the proposed algorithm as the SCALES accelerator with the help of a number of field-programmable gate array (FPGA)-oriented optimization techniques. Lastly, we have conducted a thorough implementation analysis to showcase the efficiency of the proposed accelerator. The comparison demonstrates that the SCALES accelerator has at least 19.0% and 23.8% less equivalent area-time product (eATP) than the state-of-the-art designs. We hope this work can stimulate continued research in the field.

Publisher: IEEE
Publication date: 2024-07-01
Record identifier: 10568341
Journal: IEEE Computer Architecture Letters
Volume: 23
Issue: 2
Pages: 243 to 246
ISSN: 1556-6056
DOI: https://doi.org/10.1109/LCA.2024.3505872
Award number: 2020625
Keywords: Area-efficient; block-processing; FPGA; scalable; systolic hardware accelerator; ternary polynomial multiplication
Sponsoring organization: National Science Foundation
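
For readers unfamiliar with the operation named in the abstract, the following is a minimal software sketch of plain schoolbook ternary polynomial multiplication in Z_q/(x^n+1). It is not the block-processing algorithm or the systolic architecture of the paper, neither of which is described in this record. The function name `ternary_polymul` and the example parameters (n = 4, q = 17) are hypothetical and chosen only for illustration. Because one operand is ternary, each partial product reduces to an addition, a subtraction, or a skip, and the reduction modulo x^n+1 turns any wrap past degree n−1 into a sign flip.

```python
def ternary_polymul(a, t, q):
    """Schoolbook product of a and t in Z_q/(x^n + 1).

    a: list of n integer coefficients in [0, q-1], lowest degree first
    t: list of n ternary coefficients in {-1, 0, 1}, lowest degree first
    q: modulus (hypothetical example value below)
    """
    n = len(a)
    if len(t) != n:
        raise ValueError("both polynomials must have n coefficients")
    if any(tj not in (-1, 0, 1) for tj in t):
        raise ValueError("t must have ternary coefficients")

    c = [0] * n
    for j, tj in enumerate(t):
        if tj == 0:
            continue  # a zero coefficient contributes nothing
        for i, ai in enumerate(a):
            k = i + j
            term = ai if tj == 1 else -ai  # no real multiplication is needed
            if k >= n:
                # x^n = -1 in this ring, so wrap the index and negate
                k -= n
                term = -term
            c[k] = (c[k] + term) % q
    return c


# Illustrative values only: a(x) = 1 + 2x + 3x^2 + 4x^3, t(x) = 1 - x^2
print(ternary_polymul([1, 2, 3, 4], [1, 0, -1, 0], 17))  # [4, 6, 2, 2]
```

The inner loop performs only additions and subtractions, a property commonly exploited by hardware designs for this operation; how SCALES organizes that work into blocks is detailed in the paper itself.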