Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
ObjectiveThis study aims to improve workers’ postures and thus reduce the risk of musculoskeletal disorders in human-robot collaboration by developing a novel model-free reinforcement learning method. BackgroundHuman-robot collaboration has been a flourishing work configuration in recent years. Yet, it could lead to work-related musculoskeletal disorders if the collaborative tasks result in awkward postures for workers. MethodsThe proposed approach follows two steps: first, a 3D human skeleton reconstruction method was adopted to calculate workers’ continuous awkward posture (CAP) score; second, an online gradient-based reinforcement learning algorithm was designed to dynamically improve workers’ CAP score by adjusting the positions and orientations of the robot end effector. ResultsIn an empirical experiment, the proposed approach can significantly improve the CAP scores of the participants during a human-robot collaboration task when compared with the scenarios where robot and participants worked together at a fixed position or at the individual elbow height. The questionnaire outcomes also showed that the working posture resulted from the proposed approach was preferred by the participants. ConclusionThe proposed model-free reinforcement learning method can learn the optimal worker postures without the need for specific biomechanical models. The data-driven nature of this method can make it adaptive to provide personalized optimal work posture. ApplicationThe proposed method can be applied to improve the occupational safety in robot-implemented factories. Specifically, the personalized robot working positions and orientations can proactively reduce exposure to awkward postures that increase the risk of musculoskeletal disorders. The algorithm can also reactively protect workers by reducing the workload in specific joints.more » « less
-
Generative AI holds promise for advancing human factors and ergonomics. However, limited training data can reduce variability in generative models. This study investigates using transfer learning to enhance variability in generative posture prediction. We pre-trained a conditional diffusion model on lifting postures where hands are near body center and fine-tuned it on limited extended-reach postures. Compared to training from scratch, transfer learning significantly improved joint angle variability across multiple body segments while maintaining similar accuracy in posture similarity and validity. Additionally, it reduced training time by over 90%, demonstrating efficiency benefits. These findings highlight transfer learning’s potential to enrich generative model outputs with more variable ergonomics data, supporting scalable and adaptive workplace design.more » « less
-
Fine-tuning large pretrained Transformer models can focus on either introducing a small number of new learnable parameters (parameter efficiency) or editing representations of a small number of tokens using lightweight modules (representation efficiency). While the pioneering method LoRA (Low-Rank Adaptation) inherently balances parameter, compute, and memory efficiency, many subsequent variants trade off compute and memory efficiency and/or performance to further reduce fine-tuning parameters. To address this limitation and unify parameter-efficient and representation-efficient fine-tuning, we propose Weight-Generative Fine-Tuning (WeGeFT, pronounced wee-gift), a novel approach that learns to generate fine-tuning weights directly from the pretrained weights. WeGeFT employs a simple low-rank formulation consisting of two linear layers, either shared across multiple layers of the pretrained model or individually learned for different layers. This design achieves multifaceted efficiency in parameters, representations, compute, and memory, while maintaining or exceeding the performance of LoRA and its variants. Extensive experiments on commonsense reasoning, arithmetic reasoning, instruction following, code generation, and visual recognition verify the effectiveness of our proposed WeGeFT.more » « lessFree, publicly-accessible full text available May 1, 2026
-
White-box targeted adversarial attacks reveal core vulnerabilities in Deep Neural Networks (DNNs), yet two key challenges persist: (i) How many target classes can be attacked simultaneously in a specified order, known as the ordered top-K attack problem (K ≥ 1)? (ii) How to compute the corresponding adversarial perturbations for a given benign image directly in the image space? We address both by showing that ordered top-K perturbations can be learned via iteratively optimizing linear combinations of the right singular vectors of the adversarial Jacobian (i.e., the logit-to-image Jacobian constrained by target ranking). These vectors span an orthogonal, informative subspace in the image domain. We introduce RisingAttacK, a novel Sequential Quadratic Programming (SQP)-based method that exploits this structure. We propose a holistic figure-of-merits (FoM) metric combining attack success rates (ASRs) and ℓp-norms (p = 1, 2, ∞). Extensive experiments on ImageNet-1k across six ordered top-K levels (K = 1, 5, 10, 15, 20, 25, 30) and four models (ResNet-50, DenseNet-121, ViTB, DEiT-B) show RisingAttacK consistently surpasses the state-of-the-art QuadAttacK.more » « lessFree, publicly-accessible full text available May 1, 2026
-
Existing score-based adversarial attacks mainly focus on crafting top-1 adversarial examples against classifiers with single-label classification. Their attack success rate and query efficiency are often less than satisfactory, particularly under small perturbation requirements; moreover, the vulnerability of classifiers with multilabel learning is yet to be studied. In this paper, we propose a comprehensive surrogate free score-based attack, named geometric score-based black-box attack (GSBAK), to craft adversarial examples in an aggressive top-K setting for both untargeted and targeted attacks, where the goal is to change the top-K predictions of the target classifier. We introduce novel gradient-based methods to find a good initial boundary point to attack. Our iterative method employs novel gradient estimation techniques, particularly effective in top-K setting, on the decision boundary to effectively exploit the geometry of the decision boundary. Additionally, GSBAK can be used to attack against classifiers with top-K multi-label learning. Extensive experimental results on ImageNet and PASCAL VOC datasets validate the effectiveness of GSBAK in crafting top-K adversarial examples.more » « lessFree, publicly-accessible full text available May 1, 2026
-
The adversarial vulnerability of Deep Neural Networks (DNNs) has been wellknown and widely concerned, often under the context of learning top-1 attacks (e.g., fooling a DNN to classify a cat image as dog). This paper shows that the concern is much more serious by learning significantly more aggressive ordered top-K clearbox 1 targeted attacks proposed in [Zhang and Wu, 2020]. We propose a novel and rigorous quadratic programming (QP) method of learning ordered top-K attacks with low computing cost, dubbed as QuadAttacK. Our QuadAttacK directly solves the QP to satisfy the attack constraint in the feature embedding space (i.e., the input space to the final linear classifier), which thus exploits the semantics of the feature embedding space (i.e., the principle of class coherence). With the optimized feature embedding vector perturbation, it then computes the adversarial perturbation in the data space via the vanilla one-step back-propagation. In experiments, the proposed QuadAttacK is tested in the ImageNet-1k classification using ResNet-50, DenseNet-121, and Vision Transformers (ViT-B and DEiT-S). It successfully pushes the boundary of successful ordered top-K attacks from K = 10 up to K = 20 at a cheap budget (1 × 60) and further improves attack success rates for K = 5 for all tested models, while retaining the performance for K = 1.more » « less
-
Advances in robotics have contributed to the prevalence of human-robot collaboration (HRC). Working and interacting with collaborative robots in close proximity can be psychologically stressful. Therefore, it is important to understand the impacts of human-robot interaction (HRI) on mental stress to promote psychological well-being at the workplace. To this end, this study investigated how the HRI presence, complexity, and modality affect psychological stress in humans and discussed possible HRI design criteria during HRC. An experimental setup was implemented in which human operators worked with a collaborative robot on a Lego assembly task, using different interaction paradigms involving pressing buttons, showing hand gestures, and giving verbal commands. The NASA-Task Load Index, as a subjective measure, and the physiological galvanic skin conductance response, as an objective measure, were used to assess the levels of mental stress. The results revealed that the introduction of interactions during HRC helped reduce mental stress and that complex interactions resulted in higher mental stress than simple interactions. Meanwhile, the use of certain interaction modalities, such as verbal commands or hand gestures, led to significantly higher mental stress than pressing buttons, while no significant difference on mental stress was found between showing hand gestures and giving verbal commands.more » « less
-
Decision-based black-box attacks often necessitate a large number of queries to craft an adversarial example. Moreover, decision-based attacks based on querying boundary points in the estimated normal vector direction often suffer from inefficiency and convergence issues. In this paper, we propose a novel query-efficient \b curvature-aware \b geometric decision-based \b black-box \b attack (CGBA) that conducts boundary search along a semicircular path on a restricted 2D plane to ensure finding a boundary point successfully irrespective of the boundary curvature. While the proposed CGBA attack can work effectively for an arbitrary decision boundary, it is particularly efficient in exploiting the low curvature to craft high-quality adversarial examples, which is widely seen and experimentally verified in commonly used classifiers under non-targeted attacks. In contrast, the decision boundaries often exhibit higher curvature under targeted attacks. Thus, we develop a new query-efficient variant, CGBA-H, that is adapted for the targeted attack. In addition, we further design an algorithm to obtain a better initial boundary point at the expense of some extra queries, which considerably enhances the performance of the targeted attack. Extensive experiments are conducted to evaluate the performance of our proposed methods against some well-known classifiers on the ImageNet and CIFAR10 datasets, demonstrating the superiority of CGBA and CGBA-H over state-of-the-art non-targeted and targeted attacks, respectively.more » « less
-
The main challenge of monocular 3D object detection is the accurate localization of 3D center. Motivated by a new and strong observation that this challenge can be remedied by a 3D-space local-grid search scheme in an ideal case, we propose a stage-wise approach, which combines the information flow from 2D-to-3D (3D bounding box proposal generation with a single 2D image) and 3D-to-2D (proposal verification by denoising with 3D-to-2D contexts) in a topdown manner. Specifically, we first obtain initial proposals from off-the-shelf backbone monocular 3D detectors. Then, we generate a 3D anchor space by local-grid sampling from the initial proposals. Finally, we perform 3D bounding box denoising at the 3D-to-2D proposal verification stage. To effectively learn discriminative features for denoising highly overlapped proposals, this paper presents a method of using the Perceiver I/O model [20] to fuse the 3D-to-2D geometric information and the 2D appearance information. With the encoded latent representation of a proposal, the verification head is implemented with a self-attention module. Our method, named as MonoXiver, is generic and can be easily adapted to any backbone monocular 3D detectors. Experimental results on the well-established KITTI dataset and the challenging large-scale Waymo dataset show that MonoXiver consistently achieves improvement with limited computation overhead.more » « less
An official website of the United States government

Full Text Available