


Award ID: 2329096


Abstract: Reinforcement learning (RL) relies on Gaussian and sigmoid functions to balance exploration and exploitation, but implementing these functions in hardware typically requires iterative computations, increasing power consumption and circuit complexity. Here, Gaussian-sigmoid reinforcement transistors (GS-RTs) are reported that integrate both activation functions into a single device. The transistors feature a vertical n-p-i-p heterojunction stack composed of a-IGZO and DNTT, with asymmetric source-drain contacts and a parylene interlayer that enables voltage-tunable transitions between sigmoid, Gaussian, and mixed responses. This architecture emulates the behavior of three transistors in one, reducing the required circuit complexity from dozens of transistors to just a few. The GS-RT exhibits a peak current of 5.95 µA at VG = −17 V and supports nonlinear transfer characteristics suited for neuromorphic computing. In a multi-armed bandit task, GS-RT-based RL policies demonstrate 20% faster convergence and 30% higher final reward compared to conventional sigmoid- or Gaussian-based approaches. Extending this advantage further, a GS-RT-based activation function in deep RL for cart-pole balancing significantly outperforms the traditional ReLU-based activation function in terms of learning speed and tolerance to input perturbations.
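The abstract does not give the device's transfer equations, but the idea of a single voltage-tunable response that interpolates between a sigmoid (exploitation-friendly, monotonic) and a Gaussian (exploration-friendly, peaked) can be sketched in software. The mixing parameter `alpha`, the arm rewards, and the softmax-style policy below are illustrative assumptions, not the authors' method:

```python
import numpy as np

def sigmoid(x):
    """Monotonic activation: saturates toward 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

def gaussian(x, mu=0.0, sigma=1.0):
    """Peaked activation: maximal response near mu, decaying on both sides."""
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def gs_activation(x, alpha):
    """Hypothetical mixed response. alpha in [0, 1] stands in for the
    voltage-tunable transition the GS-RT provides in hardware:
    alpha = 1 gives a pure sigmoid, alpha = 0 a pure Gaussian."""
    return alpha * sigmoid(x) + (1.0 - alpha) * gaussian(x)

def run_bandit(alpha, n_steps=500, seed=0):
    """Toy multi-armed bandit: action preferences are passed through the
    mixed activation, then normalized into a sampling distribution.
    Arm means and the update rule are illustrative choices."""
    rng = np.random.default_rng(seed)
    true_means = np.array([0.1, 0.5, 0.9])  # hypothetical arm rewards
    q = np.zeros(3)       # incremental value estimates
    counts = np.zeros(3)  # pulls per arm
    total_reward = 0.0
    for _ in range(n_steps):
        prefs = gs_activation(q - q.mean(), alpha)  # mixed response
        probs = prefs / prefs.sum()
        a = rng.choice(3, p=probs)
        r = rng.normal(true_means[a], 0.1)
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]  # sample-average update
        total_reward += r
    return q, total_reward
```

With `alpha` near 0 the Gaussian term flattens differences between arms (more exploration); with `alpha` near 1 the sigmoid term sharpens them (more exploitation). Tuning `alpha` over training is one plausible software analogue of the voltage-tunable transition the device offers.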