Title: RL-LABEL: A Deep Reinforcement Learning Approach Intended for AR Label Placement in Dynamic Scenarios
Labels are widely used in augmented reality (AR) to display digital information. Ensuring the readability of AR labels requires placing them occlusion-free while keeping visual linkings legible, especially when multiple labels exist in the scene. Although existing optimization-based methods, such as force-based methods, are effective in managing AR labels in static scenarios, they often struggle in dynamic scenarios with constantly moving objects. This is due to their focus on generating layouts optimal for the current moment, neglecting future moments and leading to sub-optimal or unstable layouts over time. In this work, we present RL-LABEL, a deep reinforcement learning-based method for managing the placement of AR labels in scenarios involving moving objects. RL-LABEL considers the current and predicted future states of objects and labels, such as positions and velocities, as well as the user’s viewpoint, to make informed decisions about label placement. It balances the trade-offs between immediate and long-term objectives. Our experiments on two real-world datasets show that RL-LABEL effectively learns the decision-making process for long-term optimization, outperforming two baselines (i.e., no view management and a force-based method) by minimizing label occlusions, line intersections, and label movement distance. Additionally, a user study involving 18 participants indicates that RL-LABEL excels over the baselines in aiding users to identify, compare, and summarize data on AR labels within dynamic scenes.
Journal Name: IEEE Transactions on Visualization and Computer Graphics
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. A critical learning outcome of undergraduate engineering mechanics courses is the ability to understand how a structure's internal forces and bending moment will change in response to static and dynamic loads. One of the major challenges associated with both teaching and learning these concepts is the invisible nature of the internal effects. Although concentrated forces applied to the top of the beam can be easily visualized, observing the corresponding changes in the shear and bending moment diagrams is not a trivial task. Nonetheless, proficiency in this concept is vital for students to succeed in subsequent mechanics courses and, ultimately, as a professional practitioner. One promising technology that can enable students to see the invisible internal effects is augmented reality (AR), where virtual or digital objects can be seen through a device such as a smart phone or headset. This paper describes the proof-of-concept development of a Unity®-based AR application called "AR Stairs" that allows students to visualize (in-situ) the relative magnitude of the internal bending moment in an actual structure. The app is specifically tailored to an existing 40-foot long, 16-foot high steel staircase structure located at the authors' institution. This paper details the application design, analysis assumptions, calculations, technical challenges encountered, development environment, and content development. The key features of the app are discussed, which include: (a) coordinate system identification and placement, (b) automatic mapping of a stairs model in-situ, (c) creation of a virtual 2-dimensional staircase model, (d) object detection and tracking of people moving on the stairs, (e) image recognition to approximate people's weight, (f) overlays of virtual force vectors onto moving people, and (g) use of a chromatic scale to visually convey the relative intensity of the internal bending moment at nodes spaced over the length of the structure. 
It is the authors' intention to also provide the reader with an overall picture of the resources needed to develop AR applications for use in pedagogical settings, the design decision tradeoffs, and practical issues related to deployment. As AR technologies continually improve, they are expected to become an integral part of the pedagogical toolset used by engineering educators to improve the quality of education delivered to engineering students. 
  2. Matni, N ; Morari, M ; Pappas, G.J. (Ed.)
    One of the long-term objectives of Machine Learning is to endow machines with the capacity of structuring and interpreting the world as we do. This is particularly challenging in scenes involving time series, such as video sequences, since seemingly different data can correspond to the same underlying dynamics. Recent approaches seek to decompose video sequences into their composing objects, attributes and dynamics in a self-supervised fashion, thus simplifying the task of learning suitable features that can be used to analyze each component. While existing methods can successfully disentangle dynamics from other components, there have been relatively few efforts in learning parsimonious representations of these underlying dynamics. In this paper, motivated by recent advances in non-linear identification, we propose a method to decompose a video into moving objects, their attributes and the dynamic modes of their trajectories. We model video dynamics as the output of a Koopman operator to be learned from the available data. In this context, the dynamic information contained in the scene is encapsulated in the eigenvalues and eigenvectors of the Koopman operator, providing an interpretable and parsimonious representation. We show that such decomposition can be used for instance to perform video analytics, predict future frames or generate synthetic video. We test our framework in a variety of datasets that encompass different dynamic scenarios, while illustrating the novel features that emerge from our dynamic modes decomposition: Video dynamics interpretation and user manipulation at test-time. We successfully forecast challenging object trajectories from pixels, achieving competitive performance while drawing useful insights. 
  3. Semantic cues and statistical regularities in real-world environment layouts can improve efficiency for navigation in novel environments. This paper learns and leverages such semantic cues for navigating to objects of interest in novel environments, by simply watching YouTube videos. This is challenging because YouTube videos do not come with labels for actions or goals, and may not even showcase optimal behavior. Our method tackles these challenges through the use of Q-learning on pseudo-labeled transition quadruples (image, action, next image, reward). We show that such off-policy Q-learning from passive data is able to learn meaningful semantic cues for navigation. These cues, when used in a hierarchical navigation policy, lead to improved efficiency at the ObjectGoal task in visually realistic simulations. We observe a relative improvement of 15-83% over end-to-end RL, behavior cloning, and classical methods, while using minimal direct interaction. 
  4. Understanding how environmental characteristics affect biodiversity patterns, from individual species to communities of species, is critical for mitigating effects of global change. A central goal for conservation planning and monitoring is the ability to accurately predict the occurrence of species communities and how these communities change over space and time. This in turn leads to a challenging and long-standing problem in the field of computer science - how to perform accurate multi-label classification with hundreds of labels? The key challenge of this problem is its exponential-sized output space with regards to the number of labels to be predicted. Therefore, it is essential to facilitate the learning process by exploiting correlations (or dependency) among labels. Previous methods mostly focus on modelling the correlation on label pairs; however, complex relations between real-world objects often go beyond second order. In this paper, we propose a novel framework for multi-label classification, High-order Tie-in Variational Autoencoder (HOT-VAE), which performs adaptive high-order label correlation learning. We experimentally verify that our model outperforms the existing state-of-the-art approaches on a bird distribution dataset on both conventional F1 scores and a variety of ecological metrics. To show our method is general, we also perform empirical analysis on seven other public real-world datasets in several application domains, and HOT-VAE exhibits superior performance to previous methods. 
  5. We present the Arctic atmospheric river (AR) climatology based on twelve sets of labels derived from ERA5 and MERRA-2 reanalyses for 1980–2019. The ARs were identified and tracked in the 3-hourly reanalysis data with a multifactorial approach based on either atmospheric column-integrated water vapor (IWV) or integrated water vapor transport (IVT) exceeding one of the three climate thresholds (75th, 85th, and 95th percentiles). Time series analysis of the AR event counts from the AR labels showed overall upward trends from the mid-1990s to 2019. The 75th-percentile IVT- and IWV-based labels, as well as the 85th-percentile IWV-based labels, are likely more sensitive to Arctic surface warming and therefore detected some broadening of AR-affected areas over time, while the rest of the labels did not. Spatial exploratory analysis of these labels revealed that the AR frequency of occurrence maxima shifted poleward from over land in 1980–1999 to over the Arctic Ocean and its outlying seas in 2000–2019. Regions from the Atlantic across the Arctic to the Pacific Ocean trended toward higher AR occurrence, surface temperature, and column-integrated moisture. Meanwhile, ARs were increasingly responsible for the rising moisture transport into the Arctic. Even though the increase of Arctic AR occurrence was primarily associated with long-term Arctic surface warming and moistening, the effects of changing atmospheric circulation could stand out locally, such as on the Pacific side over the Chukchi Sea. The changing teleconnection patterns strongly modulated AR activities in time and space, with prominent anomalies in the Arctic-Pacific sector during the latest decade. In addition, the extreme events identified by the 95th-percentile labels displayed the most significant changes and were most influenced by the teleconnection patterns. 
The twelve Arctic AR labels and the detailed graphics in the atlas can help navigate the uncertainty of detecting and quantifying Arctic ARs and their associated effects in current and future studies. 