NSF PAR Search | NSF Public Access Repository

DReX: Accurate and Scalable Dense Retrieval Acceleration via Algorithmic-Hardware Codesign

https://doi.org/10.1145/3695053.3731079

Quinn, Derrick; Yücel, E Ezgi; Prammer, Martin; Fan, Zhenxing; Skadron, Kevin; Patel, Jignesh M; Martínez, José F; Alian, Mohammad (June 2025, ACM)

Free, publicly-accessible full text available June 20, 2026
PIMsynth: A Unified Compiler Framework for Bit-Serial Processing-In-Memory Architectures

https://doi.org/10.1109/LCA.2025.3600588

Guo, Deyuan; Gholamrezaei, Mohammadhosein; Hofmann, Matthew; Venkat, Ashish; Zhang, Zhiru; Skadron, Kevin (January 2025, IEEE Computer Architecture Letters)

Full Text Available
Abakus: Accelerating k-mer Counting with Storage Technology

https://doi.org/10.1145/3632952

Wu, Lingxi; Zhou, Minxuan; Xu, Weihong; Venkat, Ashish; Rosing, Tajana; Skadron, Kevin (March 2024, ACM Transactions on Architecture and Code Optimization)

This work seeks to leverage processing-with-storage-technology (PWST) to accelerate a key bioinformatics kernel called k-mer counting, which involves processing large files of sequence data on disk to build a histogram of fixed-size genome sequence substrings and thereby entails prohibitively high I/O overhead. In particular, this work proposes a set of accelerator designs called Abakus that offer varying tradeoffs among performance, efficiency, and hardware implementation complexity. The key to these designs is a set of domain-specific hardware extensions that accelerate the key operations for k-mer counting at various levels of the SSD hierarchy, with the goal of enhancing the limited computing capabilities of conventional SSDs while exploiting the parallelism of multi-channel, multi-way SSDs. Our evaluation suggests that Abakus can achieve 8.42×, 6.91×, and 2.32× speedup over the CPU-, GPU-, and near-data processing solutions, respectively.
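For readers unfamiliar with the kernel, the histogram that k-mer counting builds can be sketched in software as follows. This is only a minimal CPU illustration of the definition given in the abstract, not the Abakus design; the function name `count_kmers` and the sample sequence are assumptions made for the example.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Return a histogram of every length-k substring of `sequence`.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Slide a window of width k over the sequence. A sequence shorter
    # than k yields no windows, so the histogram is empty.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


if __name__ == "__main__":
    histogram = count_kmers("ACGTACGT", 3)
    for kmer, count in sorted(histogram.items()):
        print(kmer, count)
```

Real inputs are sequence files far larger than memory, which is why the abstract identifies I/O overhead rather than the counting itself as the bottleneck.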
Full Text Available
Architectural Modeling and Benchmarking for Digital DRAM PIM

https://doi.org/10.1109/IISWC63097.2024.00030

Siddique, Farzana Ahmed; Guo, Deyuan; Fan, Zhenxing; Gholamrezaei, Mohammadhosein; Baradaran, Morteza; Ahmed, Alif; Abbot, Hugo; Durrer, Kyle; Nandagopal, Kumaresh; Ermovick, Ethan; et al. (September 2024, IEEE)

Full Text Available
Synthesizing Legacy String Code for FPGAs Using Bounded Automata Learning

https://doi.org/10.1109/MM.2022.3178037

Angstadt, Kevin; Tracy, Tommy; Skadron, Kevin; Jeannin, Jean-Baptiste; Weimer, Westley (September 2022, IEEE Micro)

Full Text Available
Speculative Code Compaction: Eliminating Dead Code via Speculative Microcode Transformations

https://doi.org/10.1109/MICRO56248.2022.00024

Moody, Logan; Qi, Wei; Sharifi, Abdolrasoul; Berry, Layne; Rudek, Joey; Gaur, Jayesh; Parkhurst, Jeff; Subramoney, Sreenivas; Skadron, Kevin; Venkat, Ashish (October 2022, 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO))

Full Text Available
Ultra Efficient Acceleration for De Novo Genome Assembly via Near-Memory Computing

https://doi.org/10.1109/PACT52795.2021.00022

Zhou, Minxuan; Wu, Lingxi; Li, Muzhou; Moshiri, Niema; Skadron, Kevin; Rosing, Tajana (September 2021, 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT))

Full Text Available
Sieve: Scalable In-situ DRAM-based Accelerator Designs for Massively Parallel k-mer Matching

https://doi.org/10.1109/ISCA52012.2021.00028

Wu, Lingxi; Sharifi, Rasool; Lenjani, Marzieh; Skadron, Kevin; Venkat, Ashish (June 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA))
Full Text Available
A Scalable Solution for Rule-Based Part-of-Speech Tagging on Novel Hardware Accelerators

https://doi.org/10.1145/3219819.3219889

Sadredini, Elaheh; Guo, Deyuan; Bo, Chunkun; Rahimi, Reza; Skadron, Kevin; Wang, Hongning (August 2018, KDD '18 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)

Part-of-speech (POS) tagging is the foundation of many natural language processing applications. Rule-based POS tagging is a well-known solution, which assigns tags to words using a set of predefined rules. Many researchers favor statistical approaches over rule-based methods for better empirical accuracy. However, until now, the computational cost of rule-based POS tagging has made it difficult to study whether more complex rules or larger rule sets could lead to accuracy competitive with statistical approaches. In this paper, we leverage two hardware accelerators, the Automata Processor (AP) and Field Programmable Gate Arrays (FPGA), to accelerate rule-based POS tagging by converting rules to regular expressions and exploiting the highly parallel regular-expression-matching ability of these accelerators. We study the relationship between rule set size and accuracy, and observe that adding more rules poses only minimal overhead on the AP and FPGA. This allows a substantial increase in the number and complexity of rules, leading to accuracy improvement. Our experiments on the Treebank and Brown corpora achieve up to 2,600× and 1,914× speedups on the AP and on the FPGA, respectively, over rule-based methods on the CPU in the rule-matching stage, up to 58× speedup over the Perceptron POS tagger on the CPU in total testing time, and up to 253× speedup over the LSTM tagger on the GPU in total testing time, while showing competitive accuracy compared to neural-network and statistical solutions.
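The idea of expressing tagging rules as regular expressions can be illustrated with a small software sketch. The rules, tag names, and the `tag` function below are hypothetical and chosen only for illustration; they are not the paper's rule set, and this loop tries the patterns one at a time on a CPU, whereas the accelerators described above match many patterns in parallel.

```python
import re

# Hypothetical word-shape rules, tried in order; the first match wins.
RULES = [
    (re.compile(r"^\d+(\.\d+)?$"), "CD"),  # cardinal numbers
    (re.compile(r".*ing$"), "VBG"),        # gerunds
    (re.compile(r".*ed$"), "VBD"),         # past-tense verbs
    (re.compile(r".*ly$"), "RB"),          # adverbs
    (re.compile(r".*s$"), "NNS"),          # plural nouns
]
DEFAULT_TAG = "NN"  # fallback when no rule matches


def tag(tokens):
    """Assign each token the tag of the first matching rule."""
    tagged = []
    for token in tokens:
        for pattern, label in RULES:
            if pattern.match(token):
                tagged.append((token, label))
                break
        else:
            # No rule matched this token.
            tagged.append((token, DEFAULT_TAG))
    return tagged


if __name__ == "__main__":
    print(tag(["running", "quickly", "42", "dogs", "cat"]))
```

In this sequential form each added rule costs extra matching time per token, which is the overhead the abstract reports as minimal on the AP and FPGA.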
Full Text Available
ASPEN: A Scalable In-SRAM Architecture for Pushdown Automata

https://doi.org/10.1109/MICRO.2018.00079

Angstadt, Kevin; Subramaniyan, Arun; Sadredini, Elaheh; Rahimi, Reza; Skadron, Kevin; Weimer, Westley; Das, Reetuparna (October 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO))

Full Text Available