Embeddings, low-dimensional vector representations of objects, are fundamental in building modern machine learning systems. In industrial settings, there is usually an embedding team that trains an embedding model to solve intended tasks (e.g., product recommendation). The produced embeddings are then widely consumed by consumer teams to solve their unintended tasks (e.g., fraud detection). However, as the embedding model gets updated and retrained to improve performance on the intended task, the newly generated embeddings are no longer compatible with the existing consumer models. This means that either historical versions of the embeddings can never be retired, or all consumer teams have to retrain their models to make them compatible with the latest version of the embeddings, both of which are extremely costly in practice. Here we study the problem of embedding version updates and their backward compatibility. We formalize the problem where the goal is for the embedding team to keep updating the embedding version, while the consumer teams do not have to retrain their models. We develop a solution based on learning backward compatible embeddings, which allows the embedding model version to be updated frequently, while also allowing the latest version of the embedding to be quickly transformed into any backward compatible historical version of it, so that consumer teams do not have to retrain their models. Our key idea is that whenever a new embedding model is trained, we learn it together with a light-weight backward compatibility transformation that aligns the new embedding to the previous version of it. Our learned backward transformations can then be composed to produce any historical version of the embedding. Under our framework, we explore six methods and systematically evaluate them on a real-world recommender system application. We show that the best method, which we call BC-Aligner, maintains backward compatibility with existing unintended tasks even after multiple model version updates. Simultaneously, BC-Aligner achieves intended task performance similar to that of an embedding model optimized solely for the intended task.
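The composition of backward transformations described in this abstract can be illustrated with a minimal sketch. It assumes each transformation is a plain linear map stored as a matrix; that choice, the function name `to_version`, and the list layout of `transforms` are hypothetical and are not taken from the abstract, which only says the transformation is light-weight. The sketch covers composition only, not the joint training of the embedding model and its transformation.

```python
import numpy as np


def to_version(z_latest, transforms, target_version):
    """Map latest-version embeddings back to an earlier version.

    Assumption (not from the abstract): transforms[k] is a (d, d) matrix
    that maps version k+1 embeddings, stored as row vectors, to version k.
    The latest version is therefore len(transforms).
    """
    latest = len(transforms)
    if not 0 <= target_version <= latest:
        raise ValueError(f"target_version must be in [0, {latest}]")
    z = np.asarray(z_latest, dtype=float)
    # Apply the single-step transformations one after another,
    # from the newest version down to the requested one.
    for k in range(latest - 1, target_version - 1, -1):
        z = z @ transforms[k]
    return z


# Toy example with two updates (versions 0, 1, 2) and 2-dimensional embeddings.
transforms = [2.0 * np.eye(2), 3.0 * np.eye(2)]  # v1 -> v0, v2 -> v1
z_v2 = np.ones((1, 2))
z_v0 = to_version(z_v2, transforms, target_version=0)
z_same = to_version(z_v2, transforms, target_version=2)
```

Under this layout a consumer model trained on version 0 would keep receiving `z_v0`, while the embedding team stores only the latest embeddings plus one small matrix per update.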
Investigating a classical neuropsychological test in a real world context
This study was performed to investigate the validity of a real world version of the Trail Making Test (TMT) across age strata, compared to the current standard TMT which is delivered using a pen-paper protocol. We developed a real world version of the TMT, the Can-TMT, that involves the retrieval of food cans, with numeric or alphanumerical labels, from a shelf in ascending order. Eye tracking data was acquired during the Can-TMT to calculate task completion time and compared to that of the Paper-TMT. Results indicated a strong significant correlation between the real world and paper tasks for both TMTA and TMTB versions of the tasks, indicative of the validity of the real world task. Moreover, the two age groups exhibited significant differences on the TMTA and TMTB versions of both task modalities (paper and can), further supporting the validity of the real world task. This work will have a significant impact on our ability to infer skill or impairment with visual search, spatial reasoning, working memory, and motor proficiency during complex real-world tasks. Thus, we hope to fill a critical need for an exam with the resolution capable of determining deficits which subjective or reductionist assessments may otherwise miss.
- PAR ID: 10357674
- Date Published:
- Journal Name: 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
- Page Range / eLocation ID: 1566 to 1569
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Representing and reasoning about uncertainty is crucial for autonomous agents acting in partially observable environments with noisy sensors. Partially observable Markov decision processes (POMDPs) serve as a general framework for representing problems in which uncertainty is an important factor. Online sample-based POMDP methods have emerged as efficient approaches to solving large POMDPs and have been shown to extend to continuous domains. However, these solutions struggle to find long-horizon plans in problems with significant uncertainty. Exploration heuristics can help guide planning, but many real-world settings contain significant task-irrelevant uncertainty that might distract from the task objective. In this paper, we propose STRUG, an online POMDP solver capable of handling domains that require long-horizon planning with significant task-relevant and task-irrelevant uncertainty. We demonstrate our solution on several temporally extended versions of toy POMDP problems as well as robotic manipulation of articulated objects using a neural perception frontend to construct a distribution of possible models. Our results show that STRUG outperforms current sample-based online POMDP solvers on several tasks.
- In this paper, we present a shared manipulation task performed both in virtual reality with a simulated robot and in the real world with a physical robot. A collaborative assembly task where the human and robot work together to construct a simple electrical circuit was chosen. While there are platforms available for conducting human-robot interactions using virtual reality, there has not been significant work investigating how it can influence human perception of tasks that are typically done in person. We present an overview of the simulation environment used, describe the paired experiment being performed, and finally enumerate a set of design desiderata to be considered when conducting sim2real experiments involving humans in a virtual setting.
- Using GUI-based workflows for data analysis is an iterative process. During each iteration, an analyst makes changes to the workflow to improve it, generating a new version each time. The results produced by executing these versions are materialized to help users refer to them in the future. In many cases, a new version of the workflow, when submitted for execution, produces a result equivalent to that of a previous one. Identifying such equivalence can save computational resources and time by reusing the materialized result. One way to optimize the performance of executing a new version is to compare the current version with a previous one and test if they produce the same results using a workflow version equivalence verifier. As the number of versions grows, this testing can become a computational bottleneck. In this paper, we present Raven, an optimization framework to accelerate the execution of a new version request by detecting and reusing the results of previous equivalent versions with the help of a version equivalence verifier. Raven ranks and prunes the set of prior versions to quickly identify those that may produce an equivalent result to the version execution request. Additionally, when the verifier performs computation to verify the equivalence of a version pair, there may be a significant overlap with previously tested version pairs. Raven identifies and avoids such repeated computations by extending the verifier to reuse previous knowledge of equivalence tests. We evaluated the effectiveness of Raven compared to baselines on real workflows and datasets.
- The present study evaluated at the behavioral and neurophysiological level the effect of feedback validity on learning in adults and children. Participants (82 children aged 7-11; 42 adults aged 18-25) completed a two-choice classification task, in which they sorted items from eight different categories into one of two bins, by pressing one of two buttons on a response box. Each response was followed by positive or negative feedback. Four of the eight categories were mapped consistently to a specific response, leading to consistent valid feedback. The other four were mapped to a specific response 80% of the time; in 20% of these trials, participants received invalid feedback. As participants performed the task, their EEG data were recorded. Behaviorally, accuracy was greater for the consistently valid condition than the inconsistently valid condition for both adults and children. There were no significant differences in accuracy between adults and children. Feedback-related event-related potentials (ERPs) were evaluated and compared between the two groups. The amplitudes of the feedback-related negativity (FRN) and fronto-central positivity (FCP) were sensitive to valence and age group, with FRN being larger in children, and FCP larger in adults. Interaction effects suggested that FRN response to positive feedback was sensitive to feedback validity in both age groups. However, the FCP was sensitive to validity only for positive feedback in children and only for negative feedback in adults. These results provide further evidence of differing neurophysiological reactions to feedback in learning between children and adults.