Poor-quality facial images pose challenges in biometric authentication, especially in passport photo acquisition and recognition. This study proposes a novel, open-source solution to these issues: real-time facial image quality analysis using computer vision on a low-power single-board computer. We present a complete open-source hardware solution that consists of a Jetson processor, a 16 MP autofocus RGB camera, a custom enclosure, and a touch sensor LCD for user interaction. To ensure the integrity and confidentiality of captured facial data, the Advanced Encryption Standard (AES) is used for secure image storage. In a pilot data collection, the system demonstrated its ability to capture high-quality images, achieving 98.98% accuracy in storing images of acceptable quality. This open-source, readily deployable, secure system offers promising potential for diverse real-time applications such as passport verification and security systems.
                            Ookami: Deployment and Initial Experiences
                        
                    
    
Ookami [3] is a computer technology testbed supported by the United States National Science Foundation. It provides researchers with access to the A64FX processor developed by Fujitsu [17] in collaboration with RIKEN [35, 37] for the Japanese path to exascale computing, as deployed in Fugaku [36], the fastest computer in the world [34]. By focusing on crucial architectural details, the ARM-based, multi-core, 512-bit SIMD-vector processor with ultrahigh-bandwidth memory promises to retain familiar and successful programming models while achieving very high performance for a wide range of applications. We review relevant technology and system details, and the main body of the paper focuses on initial experiences with the hardware and software ecosystem for micro-benchmarks, mini-apps, and full applications, and starts to answer questions about where such technologies fit into the NSF ecosystem.
        
    
    
- PAR ID: 10292742
- Date Published:
- Journal Name: PEARC '21: Practice and Experience in Advanced Research Computing
- Page Range / eLocation ID: 1 to 8
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The design of computing systems has changed dramatically over the past decade, but most courses in advanced computer architecture remain unchanged. Computer architecture education lies at the intersection between computer science and electrical engineering, with practical exercises in classes based on appropriate levels of abstraction in the computing system design stack. Hardware-centric lab exercises often require broad infrastructure resources and tend to navigate around tedious practical implementation concepts, while software-centric exercises leave a gap between modeling and system implementation implications that students later need to overcome in professional settings. Vertical integration trends in domain-specific compute systems, as well as software-hardware co-design, are often covered in classroom lectures, but are not reflected in laboratory exercises due to complex tooling and simulation infrastructure. We describe our experiences with a joint hardware-software approach to exploring computer architecture concepts in class exercises, using open-source processor hardware implementations, generator-based hardware design methodologies, and cloud-hosted FPGAs. This approach further enables scaling course enrollment, remote learning, and a cross-class collaborative lab ecosystem, creating a connecting thread between computer science and electrical engineering experience-based curricula.
- Innovative processor architectures aim to play a critical role in future sustainment of performance improvements under severe limitations imposed by the end of Moore’s Law. The Reconfigurable Optical Computer (ROC) is one such innovative, Post-Moore’s Law processor. ROC is designed to solve partial differential equations in one shot as opposed to existing solutions, which are based on costly iterative computations. This is achieved by leveraging physical properties of a mesh of optical components that behave analogously to lumped electrical components. However, virtualization is required to combat shortfalls of the accelerator hardware. Namely, 1) the infeasibility of building large photonic arrays to accommodate arbitrarily large problems, and 2) underutilization brought about by mismatches in problem and accelerator mesh sizes due to future advances in manufacturing technology. In this work, we introduce an architecture and methodology for light-weight virtualization of ROC which exploits advantages borne from optical computing technology. Specifically, we apply temporal and spatial virtualization to ROC and then extend the accelerator scheduling tradespace with the introduction of spectral virtualization. Additionally, we investigate multiple resource scheduling strategies for a system-on-chip (SoC)-based PDE acceleration architecture and show that virtual configuration management offers a speedup of approximately 2×. Finally, we show that overhead from virtualization is minimal, and our experimental results show two orders of magnitude increased speed as compared to microprocessor execution while keeping errors due to virtualization under 10%.
- Abstract. Objective. A major challenge in designing closed-loop brain-computer interfaces is finding optimal stimulation patterns as a function of ongoing neural activity for different subjects and different objectives. Traditional approaches, such as those currently used for deep brain stimulation, have largely followed a manual trial-and-error strategy to search for effective open-loop stimulation parameters, a strategy that is inefficient and does not generalize to closed-loop activity-dependent stimulation. Approach. To achieve goal-directed closed-loop neurostimulation, we propose the use of brain co-processors, devices which exploit artificial intelligence to shape neural activity and bridge injured neural circuits for targeted repair and restoration of function. Here we investigate a specific type of co-processor called a ‘neural co-processor’ which uses artificial neural networks and deep learning to learn optimal closed-loop stimulation policies. The co-processor adapts the stimulation policy as the biological circuit itself adapts to the stimulation, achieving a form of brain-device co-adaptation. Here we use simulations to lay the groundwork for future in vivo tests of neural co-processors. We leverage a previously published cortical model of grasping, to which we applied various forms of simulated lesions. We used our simulations to develop the critical learning algorithms and study adaptations to non-stationarity in preparation for future in vivo tests. Main results. Our simulations show the ability of a neural co-processor to learn a stimulation policy using a supervised learning approach, and to adapt that policy as the underlying brain and sensors change. Our co-processor successfully co-adapted with the simulated brain to accomplish the reach-and-grasp task after a variety of lesions were applied, achieving recovery towards healthy function in the range 75%–90%. Significance. Our results provide the first proof-of-concept demonstration, using computer simulations, of a neural co-processor for adaptive activity-dependent closed-loop neurostimulation for optimizing a rehabilitation goal after injury. While a significant gap remains between simulations and in vivo applications, our results provide insights on how such co-processors may eventually be developed for learning complex adaptive stimulation policies for a variety of neural rehabilitation and neuroprosthetic applications.
- The pin count largely determines the cost of a chip package, which is often comparable to the cost of a die. In 3D processor-memory designs, power and ground (P/G) pins can account for the majority of the pins. This is because packages include separate pins for the disjoint processor and memory power delivery networks (PDNs). Supporting separate PDNs and P/G pins for processor and memory is inefficient, as each set has to be provisioned for the worst-case power delivery requirements. In this paper, we propose to reduce the number of P/G pins of both processor and memory in a 3D design, and dynamically and opportunistically divert some power between the two PDNs on demand. To perform the power transfer, we use a small bidirectional on-chip voltage regulator that connects the two PDNs. Our concept, called Snatch, is effective. It allows the computer to execute code sections with high processor or memory power requirements without having to throttle performance. We evaluate Snatch with simulations of an 8-core multicore stacked with two memory dies. In a set of compute-intensive codes, the processor snatches memory power for 30% of the time on average, speeding up the codes by up to 23% over advanced turbo-boosting; in memory-intensive codes, the memory snatches processor power. Alternatively, Snatch can reduce the package cost by about 30%.
 An official website of the United States government