<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:dcq="http://purl.org/dc/terms/">
  <records count="1" morepages="false" start="1" end="1">
    <record rownumber="1">
      <dc:product_type>Conference Paper</dc:product_type>
      <dc:title>Accelerating 1-Bit LLMs via In-Memory Computing Architectures</dc:title>
      <dc:creator>Malekar, Jinendra [Computer Science and Engineering, University of South Carolina, Columbia, SC, 29201]; Zand, Ramtin [Computer Science and Engineering, University of South Carolina, Columbia, SC, 29201]</dc:creator>
      <dc:corporate_author/>
      <dc:editor/>
      <dc:description>In this paper, we present a novel hybrid computing architecture designed to accelerate inference in 1-bit large language models (LLMs). Our approach combines the strengths of analog in-memory computing (IMC) and digital systolic arrays to address the diverse precision requirements across different layers of 1-bit LLMs. Specifically, we utilize analog IMC to accelerate low-precision matrix multiplication (MatMul) operations within the projection layers, which are naturally amenable to extreme quantization. Meanwhile, digital systolic arrays are employed to efficiently handle high-precision MatMul operations in the attention heads, preserving accuracy where precision is most critical. By partitioning the computational workload based on precision needs, our hybrid architecture increases throughput and energy efficiency. Experimental evaluations demonstrate that our design delivers up to an 80x improvement in tokens processed per second and achieves a 70% increase in energy efficiency (tokens per joule) when compared to conventional digital hardware accelerators.</dc:description>
      <dc:publisher>IEEE</dc:publisher>
      <dc:date>2025-11-25</dc:date>
      <dc:nsf_par_id>10674875</dc:nsf_par_id>
      <dc:journal_name>Conference proceedings</dc:journal_name>
      <dc:journal_volume/>
      <dc:journal_issue/>
      <dc:page_range_or_elocation>178 to 182</dc:page_range_or_elocation>
      <dc:issn>1558-3899</dc:issn>
      <dc:isbn>979-8-3315-8934-9</dc:isbn>
      <dc:doi>https://doi.org/10.1109/MWSCAS53549.2025.11244527</dc:doi>
      <dcq:identifierAwardId>2409697; 2340249</dcq:identifierAwardId>
      <dc:subject/>
      <dc:version_number/>
      <dc:location/>
      <dc:rights/>
      <dc:institution/>
      <dc:sponsoring_org>National Science Foundation</dc:sponsoring_org>
    </record>
  </records>
</rdf:RDF>
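The abstract above describes partitioning MatMul workloads by precision: extreme-quantized projection layers mapped to analog IMC, full-precision attention MatMuls kept on digital systolic arrays. As a minimal illustrative sketch (not the paper's implementation), the Python snippet below models that split in software, assuming a BitNet-b1.58-style absmean ternarization for the projection weights, a scheme commonly used for 1-bit LLMs but not specified by this record; the function names `absmean_ternarize`, `projection_matmul`, and `attention_scores` are hypothetical.

```python
import numpy as np

def absmean_ternarize(w, eps=1e-6):
    """Assumed BitNet-b1.58-style quantization of weights to
    {-1, 0, +1} with a per-tensor scale; this is the low-precision
    form the paper maps onto analog IMC crossbars."""
    scale = np.mean(np.abs(w)) + eps
    return np.clip(np.round(w / scale), -1, 1), scale

def projection_matmul(x, w):
    """Low-precision projection MatMul (IMC-mapped path in the
    paper's partitioning): ternary weights, rescaled output."""
    w_q, scale = absmean_ternarize(w)
    return (x @ w_q) * scale

def attention_scores(q, k):
    """High-precision attention MatMul (systolic-array-mapped path):
    kept in full precision to preserve accuracy."""
    return (q @ np.swapaxes(k, -1, -2)) / np.sqrt(q.shape[-1])

# Toy usage: one projection followed by an attention-score MatMul.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64)).astype(np.float32)       # activations
w_proj = rng.standard_normal((64, 64)).astype(np.float32) # projection weights

h = projection_matmul(x, w_proj)  # extreme-quantized path
s = attention_scores(h, h)        # full-precision path
print(h.shape, s.shape)           # (4, 64) (4, 4)
```

This only mirrors the numerical partitioning; the reported 80x throughput and 70% energy-efficiency gains come from the hardware mapping itself, which a software sketch cannot reproduce.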