
Title: Effect of Image Captioning with Description on the Working Memory
Working memory plays an important role in human activities across academic, professional, and social settings. Working memory is defined as the memory extensively involved in goal-directed behaviors in which information must be retained and manipulated to ensure successful task execution. The aim of this research is to understand the effect of image captioning with image description on an individual’s working memory. A study was conducted with eight neutral images comprising situations relatable to daily life such that each image could have a positive or negative description associated with the outcome of the situation in the image. The study consisted of three rounds where the first and second round involved two parts and the third round consisted of one part. The image was captioned a total of five times across the entire study. The findings highlighted that only 25% of participants were able to recall the captions which they captioned for an image after a span of 9–15 days; when comparing the recall rate of the captions, 50% of participants were able to recall the image caption from the previous round in the present round; and out of the positive and negative description associated with the image, 65% of participants recalled the former description rather than the latter.
Authors:
; ;
Award ID(s):
1828010
Publication Date:
NSF-PAR ID:
10344385
Journal Name:
International Conference on Human-Computer Interaction (HCII) 2022. Lecture Notes in Computer Science
Volume:
13096
Page Range or eLocation-ID:
107–120
Sponsoring Org:
National Science Foundation
More Like this
  1. This research paper is a study of the support needs of nontraditional students in engineering (NTSE). Nontraditional students in engineering are one segment of the student body that has traditionally not been a part of the conversation in engineering education: those students who do not go through a typical four-year college degree largely at a residential campus. It is only by better understanding the range of issues that NTSE face that we will be able to design interventions and support systems that can assist them. Recent work in engineering education particularly argues that co-curricular support is a critical factor in student success as it affects curricular progress, but there has been no work looking specifically at co-curricular support for NTSE and their retention and persistence. The population of NTSE is increasing across campuses as more students take on jobs to support their education and as those in the workforce return to complete their education. It is imperative that higher educational systems understand how to serve the needs of these students better. Although there are a range of ways in which nontraditional students (NTS) are defined, the NCES has proposed a comprehensive definition that includes enrollment criteria, financial and family status, and high school graduation status. Overall, the seven characteristics specifically associated with NTS are: (1) Delayed enrollment by a year or more after high school, (2) attended part-time, (3) having dependents, (4) being a single parent, (5) working full time while enrolled, (6) being financially independent from parents, and (7) did not receive a standard high school diploma. We ground our research in the Model of Co-Curricular Support (MCCS) which suggests it is the role of the institution to provide the necessary support for integration. 
If students are aware of and have access to resources, which lead to their success, then they will integrate into the university environment at higher rates than those students who are not aware of and do not have access to those resources. This research study focuses on answering one research question: How do NTSE engage with co-curricular supports as they progress through their degree programs? To answer this question, we recruited 11 NTSE with a range of nontraditional characteristics to complete prompted reflective journaling assignments five times throughout the Fall 2021 semester. Qualitative results showcase the nuanced lives of NTSE as they pursue their engineering degrees. In particular, results indicate students interact with faculty, classmates, and friends/peers the most, and only interact with advising when required. Students rarely reach out to larger student support for help or are involved with campus or other events happening. Classmate and friend/peer interactions are the most positive, while interactions with faculty had the largest negative outcomes.
  2. Abstract: Jury notetaking can be controversial despite evidence suggesting benefits for recall and understanding. Research on note taking has historically focused on the deliberation process. Yet, little research explores the notes themselves. We developed a 10-item coding guide to explore what jurors take notes on (e.g., simple vs. complex evidence) and how they take notes (e.g., gist vs. specific representation). In general, jurors made gist representations of simple and complex information in their notes. This finding is consistent with Fuzzy Trace Theory (Reyna & Brainerd, 1995) and suggests notes may serve as a general memory aid, rather than verbatim representation. Summary: The practice of jury notetaking in the courtroom is often contested. Some states allow it (e.g., Nebraska: State v. Kipf, 1990), while others forbid it (e.g., Louisiana: La. Code of Crim. Proc., Art. 793). Some argue notes may serve as a memory aid, increase juror confidence during deliberation, and help jurors engage in the trial (Hannaford & Munsterman, 2001; Heuer & Penrod, 1988, 1994). Others argue notetaking may distract jurors from listening to evidence, that juror notes may be given undue weight, and that those who took notes may dictate the deliberation process (Dann, Hans, & Kaye, 2005). While research has evaluated the efficacy of juror notes on evidence comprehension, little work has explored the specific content of juror notes. In a similar project on which we build, Dann, Hans, and Kaye (2005) found jurors took on average 270 words of notes each with 85% including references to jury instructions in their notes. In the present study we use a content analysis approach to examine how jurors take notes about simple and complex evidence. We were particularly interested in how jurors captured gist and specific (verbatim) information in their notes as they have different implications for information recall during deliberation. 
According to Fuzzy Trace Theory (Reyna & Brainerd, 1995), people extract “gist” or qualitative meaning from information, and also exact, verbatim representations. Although both are important for helping people make well-informed judgments, gist-based understandings are purported to be even more important than verbatim understanding (Reyna, 2008; Reyna & Brainerd, 2007). As such, it could be useful to examine how laypeople represent information in their notes during deliberation of evidence. Methods Prior to watching a 45-minute mock bank robbery trial, jurors were given a pen and notepad and instructed they were permitted to take notes. The evidence included testimony from the defendant, witnesses, and expert witnesses from prosecution and defense. Expert testimony described complex mitochondrial DNA (mtDNA) evidence. The present analysis consists of pilot data representing 2,733 lines of notes from 52 randomly-selected jurors across 41 mock juries. Our final sample for presentation at AP-LS will consist of all 391 juror notes in our dataset. Based on previous research exploring jury note taking as well as our specific interest in gist vs. specific encoding of information, we developed a coding guide to quantify juror note-taking behaviors. Four researchers independently coded a subset of notes. Coders achieved acceptable interrater reliability [(Cronbach’s Alpha = .80-.92) on all variables across 20% of cases]. Prior to AP-LS, we will link juror notes with how they discuss scientific and non-scientific evidence during jury deliberation. Coding Note length. Before coding for content, coders counted lines of text. Each notepad line with at minimum one complete word was coded as a line of text. Gist information vs. Specific information. Any line referencing evidence was coded as gist or specific. We coded gist information as information that did not contain any specific details but summarized the meaning of the evidence (e.g., “bad, not many people excluded”). 
Specific information was coded as such if it contained a verbatim descriptive (e.g., “<1 of people could be excluded”). We further coded whether this information was related to non-scientific evidence or related to the scientific DNA evidence. Mentions of DNA Evidence vs. Other Evidence. We were specifically interested in whether jurors mentioned the DNA evidence and how they captured complex evidence. When DNA evidence was mentioned, we coded the content of the DNA reference. Mentions of the characteristics of mtDNA vs nDNA, the DNA match process or who could be excluded, heteroplasmy, references to database size, and other references were coded. Reliability. When referencing DNA evidence, we were interested in whether jurors mentioned the evidence reliability. Any specific mention of reliability of DNA evidence was noted (e.g., “MT DNA is not as powerful, more prone to error”). Expert Qualification. Finally, we were interested in whether jurors noted an expert’s qualifications. All references were coded (e.g., “Forensic analyst”). Results On average, jurors took 53 lines of notes (range: 3-137 lines). Most (83%) mentioned jury instructions before moving on to case specific information. The majority of references to evidence were gist references (54%) focusing on non-scientific evidence and scientific expert testimony equally (50%). When jurors encoded information using specific references (46%), they referenced non-scientific evidence and expert testimony equally as well (50%). Thirty-three percent of lines were devoted to expert testimony with every juror including at least one line. References to the DNA evidence were usually focused on who could be excluded from the FBI’s database (43%), followed by references to differences between mtDNA vs nDNA (30%), and mentions of the size of the database (11%). Less frequently, references to DNA evidence focused on heteroplasmy (5%). 
Of those references that did not fit into a coding category (11%), most focused on the DNA extraction process, general information about DNA, and the uniqueness of DNA. We further coded references to DNA reliability (15%) as well as references to specific statistical information (14%). Finally, 40% of jurors made reference to an expert’s qualifications. Conclusion Jury note content analysis can reveal important information about how jurors capture trial information (e.g., gist vs verbatim), what evidence they consider important, and what they consider relevant and irrelevant. In our case, it appeared jurors largely created gist representations of information that focused equally on non-scientific evidence and scientific expert testimony. This finding suggests note taking may serve not only to represent information verbatim, but also and perhaps mostly as a general memory aid summarizing the meaning of evidence. Further, jurors’ references to evidence tended to be equally focused on the non-scientific evidence and the scientifically complex DNA evidence. This observation suggests jurors may attend just as much to non-scientific evidence as they do to complex scientific evidence in cases involving complicated evidence, an observation that might inform future work on understanding how jurors interpret evidence in cases with complex information. Learning objective: Participants will be able to describe emerging evidence about how jurors take notes during trial.
  3. In March 2020, the global COVID-19 pandemic forced universities across the United States to immediately stop face-to-face activities and transition to virtual instruction. While this transition was not easy for anyone, the shift to online learning was especially difficult for STEM courses, particularly engineering, which has a strong practical/laboratory component. Additionally, underrepresented students (URMs) in engineering experienced a range of difficulties during this transition. The purpose of this paper is to highlight underrepresented engineering students’ experiences as a result of COVID-19. In particular, we aim to highlight stories shared by participants who indicated a desire to share their experience with their instructor. In order to better understand these experiences, research participants were asked to share a story, using the novel data collection platform SenseMaker, based on the following prompt: Imagine you are chatting with a friend or family member about the evolving COVID-19 crisis. Tell them about something you have experienced recently as an engineering student. Conducting a SenseMaker study involves four iterative steps: 1) Initiation is the process of designing signifiers, testing, and deploying the instrument; 2) Story Collection is the process of collecting data through narratives; 3) Sense-making is the process of exploring and analyzing patterns of the collection of narratives; and 4) Response is the process of amplifying positive stories and dampening negative stories to nudge the system to an adjacent possible (Van der Merwe et al. 2019). Unlike traditional surveys or other qualitative data collection methods, SenseMaker encourages participants to think more critically about the stories they share by inviting them to make sense of their story using a series of triads and dyads. 
After completing their narrative, participants were asked a series of triadic, dyadic, and sentiment-based multiple-choice questions (MCQ) relevant to their story. One MCQ in particular that participants were required to answer was “If you could do so without fear of judgment or retaliation, who would you share this story with?”, with the following options: 1) Family 2) Instructor 3) Peers 4) Prefer not to answer 5) Other. A third of the participants indicated that they would share their story with their instructor. Therefore, we further explored this particular question. Additionally, this paper aims to highlight this subset of students whose primary motivation for their actions was based on Necessity. High-level qualitative findings from the data show that students valued Grit and Perseverance, recent experiences influenced their Sense of Purpose, and their decisions were largely made based on Intuition. Chi-squared tests showed that there were not any significant differences between race and the desire to share with their instructor; however, there were significant differences when factoring in gender, suggesting that gender has a large impact on the complexity of navigating school during this time. Lastly, ~50% of participants reported feeling negative or extremely negative about their experiences, ~30% reported feeling neutral, and ~20% reported feeling positive or extremely positive about their experiences. In the study, a total of 500 micro-narratives from underrepresented engineering students were collected from June to July 2020. Undergraduate and graduate students were recruited for participation through the researchers’ personal networks, social media, and through organizations like NSBE. Participants had the option to indicate who is able to read their stories: 1) Everyone, 2) Researchers Only, or 3) No one. This work presents qualitative stories of those who granted permission for everyone to read.
  4. Obeid, I. (Ed.)
    The Neural Engineering Data Consortium (NEDC) is developing the Temple University Digital Pathology Corpus (TUDP), an open source database of high-resolution images from scanned pathology samples [1], as part of its National Science Foundation-funded Major Research Instrumentation grant titled “MRI: High Performance Digital Pathology Using Big Data and Machine Learning” [2]. The long-term goal of this project is to release one million images. We have currently scanned over 100,000 images and are in the process of annotating breast tissue data for our first official corpus release, v1.0.0. This release contains 3,505 annotated images of breast tissue including 74 patients with cancerous diagnoses (out of a total of 296 patients). In this poster, we will present an analysis of this corpus and discuss the challenges we have faced in efficiently producing high quality annotations of breast tissue. It is well known that state of the art algorithms in machine learning require vast amounts of data. Fields such as speech recognition [3], image recognition [4] and text processing [5] are able to deliver impressive performance with complex deep learning models because they have developed large corpora to support training of extremely high-dimensional models (e.g., billions of parameters). Other fields that do not have access to such data resources must rely on techniques in which existing models can be adapted to new datasets [6]. A preliminary version of this breast corpus release was tested in a pilot study using a baseline machine learning system, ResNet18 [7], that leverages several open-source Python tools. The pilot corpus was divided into three sets: train, development, and evaluation. Portions of these slides were manually annotated [1] using the nine labels in Table 1 [8] to identify five to ten examples of pathological features on each slide. 
Not every pathological feature is annotated, meaning excluded areas can include focuses particular to these labels that are not used for training. A summary of the number of patches within each label is given in Table 2. To maintain a balanced training set, 1,000 patches of each label were used to train the machine learning model. Throughout all sets, only annotated patches were involved in model development. The performance of this model in identifying all the patches in the evaluation set can be seen in the confusion matrix of classification accuracy in Table 3. The highest performing labels were background, 97% correct identification, and artifact, 76% correct identification. A correlation exists between labels with more than 6,000 development patches and accurate performance on the evaluation set. Additionally, these results indicated a need to further refine the annotation of invasive ductal carcinoma (“indc”), inflammation (“infl”), nonneoplastic features (“nneo”), normal (“norm”) and suspicious (“susp”). This pilot experiment motivated changes to the corpus that will be discussed in detail in this poster presentation. To increase the accuracy of the machine learning model, we modified how we addressed underperforming labels. One common source of error arose with how non-background labels were converted into patches. Large areas of background within other labels were isolated within a patch resulting in connective tissue misrepresenting a non-background label. In response, the annotation overlay margins were revised to exclude benign connective tissue in non-background labels. Corresponding patient reports and supporting immunohistochemical stains further guided annotation reviews. The microscopic diagnoses given by the primary pathologist in these reports detail the pathological findings within each tissue site, but not within each specific slide. 
The microscopic diagnoses informed revisions specifically targeting annotated regions classified as cancerous, ensuring that the labels “indc” and “dcis” were used only in situations where a micropathologist diagnosed it as such. Further differentiation of cancerous and precancerous labels, as well as the location of their focus on a slide, could be accomplished with supplemental immunohistochemically (IHC) stained slides. When distinguishing whether a focus is a nonneoplastic feature versus a cancerous growth, pathologists employ antigen targeting stains to the tissue in question to confirm the diagnosis. For example, a nonneoplastic feature of usual ductal hyperplasia will display diffuse staining for cytokeratin 5 (CK5) and no diffuse staining for estrogen receptor (ER), while a cancerous growth of ductal carcinoma in situ will have negative or focally positive staining for CK5 and diffuse staining for ER [9]. Many tissue samples contain cancerous and non-cancerous features with morphological overlaps that cause variability between annotators. The informative fields IHC slides provide could play an integral role in machine model pathology diagnostics. Following the revisions made on all the annotations, a second experiment was run using ResNet18. Compared to the pilot study, an increase of model prediction accuracy was seen for the labels indc, infl, nneo, norm, and null. This increase is correlated with an increase in annotated area and annotation accuracy. Model performance in identifying the suspicious label decreased by 25% due to the decrease of 57% in the total annotated area described by this label. A summary of the model performance is given in Table 4, which shows the new prediction accuracy and the absolute change in error rate compared to Table 3. The breast tissue subset we are developing includes 3,505 annotated breast pathology slides from 296 patients. The average size of a scanned SVS file is 363 MB. The annotations are stored in an XML format. 
A CSV version of the annotation file is also available which provides a flat, or simple, annotation that is easy for machine learning researchers to access and interface to their systems. Each patient is identified by an anonymized medical reference number. Within each patient’s directory, one or more sessions are identified, also anonymized to the first of the month in which the sample was taken. These sessions are broken into groupings of tissue taken on that date (in this case, breast tissue). A deidentified patient report stored as a flat text file is also available. Within these slides there are a total of 16,971 annotated regions with an average of 4.84 annotations per slide. Among those annotations, 8,035 are non-cancerous (normal, background, null, and artifact), 6,222 are carcinogenic signs (inflammation, nonneoplastic and suspicious), and 2,714 are cancerous labels (ductal carcinoma in situ and invasive ductal carcinoma in situ). The individual patients are split up into three sets: train, development, and evaluation. Of the 74 cancerous patients, 20 were allotted for both the development and evaluation sets, while the remaining 34 were allotted for train. The remaining 222 patients were split up to preserve the overall distribution of labels within the corpus. This was done in hope of creating control sets for comparable studies. Overall, the development and evaluation sets each have 80 patients, while the training set has 136 patients. In a related component of this project, slides from the Fox Chase Cancer Center (FCCC) Biosample Repository (https://www.foxchase.org/research/facilities/genetic-research-facilities/biosample-repository-facility) are being digitized in addition to slides provided by Temple University Hospital. This data includes 18 different types of tissue including approximately 38.5% urinary tissue and 16.5% gynecological tissue. 
These slides and the metadata provided with them are already anonymized and include diagnoses in a spreadsheet with sample and patient ID. We plan to release over 13,000 unannotated slides from the FCCC Corpus simultaneously with v1.0.0 of TUDP. Details of this release will also be discussed in this poster. Few digitally annotated databases of pathology samples like TUDP exist due to the extensive data collection and processing required. The breast corpus subset should be released by November 2021. By December 2021 we should also release the unannotated FCCC data. We are currently annotating urinary tract data as well. We expect to release about 5,600 processed TUH slides in this subset. We have an additional 53,000 unprocessed TUH slides digitized. Corpora of this size will stimulate the development of a new generation of deep learning technology. In clinical settings where resources are limited, an assistive diagnoses model could support pathologists’ workload and even help prioritize suspected cancerous cases. ACKNOWLEDGMENTS This material is supported by the National Science Foundation under grants nos. CNS-1726188 and 1925494. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. REFERENCES [1] N. Shawki et al., “The Temple University Digital Pathology Corpus,” in Signal Processing in Medicine and Biology: Emerging Trends in Research and Applications, 1st ed., I. Obeid, I. Selesnick, and J. Picone, Eds. New York City, New York, USA: Springer, 2020, pp. 67–104. https://www.springer.com/gp/book/9783030368432. [2] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning.” Major Research Instrumentation (MRI), Division of Computer and Network Systems, Award No. 1726188, January 1, 2018 – December 31, 2021. https://www.isip.piconepress.com/projects/nsf_dpath/. 
[3] A. Gulati et al., “Conformer: Convolution-augmented Transformer for Speech Recognition,” in Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), 2020, pp. 5036–5040. https://doi.org/10.21437/interspeech.2020-3015. [4] C.-J. Wu et al., “Machine Learning at Facebook: Understanding Inference at the Edge,” in Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2019, pp. 331–344. https://ieeexplore.ieee.org/document/8675201. [5] I. Caswell and B. Liang, “Recent Advances in Google Translate,” Google AI Blog: The latest from Google Research, 2020. [Online]. Available: https://ai.googleblog.com/2020/06/recent-advances-in-google-translate.html. [Accessed: 01-Aug-2021]. [6] V. Khalkhali, N. Shawki, V. Shah, M. Golmohammadi, I. Obeid, and J. Picone, “Low Latency Real-Time Seizure Detection Using Transfer Deep Learning,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2021, pp. 1–7. https://www.isip.piconepress.com/publications/conference_proceedings/2021/ieee_spmb/eeg_transfer_learning/. [7] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning,” Philadelphia, Pennsylvania, USA, 2020. https://www.isip.piconepress.com/publications/reports/2020/nsf/mri_dpath/. [8] I. Hunt, S. Husain, J. Simons, I. Obeid, and J. Picone, “Recent Advances in the Temple University Digital Pathology Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2019, pp. 1–4. https://ieeexplore.ieee.org/document/9037859. [9] A. P. Martinez, C. Cohen, K. Z. Hanley, and X. (Bill) Li, “Estrogen Receptor and Cytokeratin 5 Are Reliable Markers to Separate Usual Ductal Hyperplasia From Atypical Ductal Hyperplasia and Low-Grade Ductal Carcinoma In Situ,” Arch. Pathol. Lab. Med., vol. 140, no. 7, pp. 686–689, Apr. 2016. 
https://doi.org/10.5858/arpa.2015-0238-OA.
  5. The prefrontal cortex is larger than would be predicted by body size or visual cortex volume in great apes compared with monkeys. Because prefrontal cortex is critical for working memory, we hypothesized that recognition memory tests would engage working memory in orangutans more robustly than in rhesus monkeys. In contrast to working memory, the familiarity response that results from repetition of an image is less cognitively taxing and has been associated with nonfrontal brain regions. Across three experiments, we observed a striking species difference in the control of behavior by these two types of memory. First, we found that recognition memory performance in orangutans was controlled by working memory under conditions in which this memory system plays little role in rhesus monkeys. Second, we found that unlike the case in monkeys, familiarity was not involved in recognition memory performance in orangutans, shown by differences with monkeys across three different measures. Memory in orangutans was not improved by use of novel images, was always impaired by a concurrent cognitive load, and orangutans did not accurately identify images seen minutes ago. These results are surprising and puzzling, but do support the view that prefrontal expansion in great apes favored working memory. At least in orangutans, increased dependence on working memory may come at a cost in terms of the availability of familiarity.
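
The corpus figures quoted in item 4 (the TUDP breast corpus abstract) can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on numbers stated in that abstract. The variable and function names are illustrative and not part of any TUDP tooling, and the per-set counts of non-cancerous patients are derived here by subtraction rather than stated in the abstract.

```python
# Figures quoted in the TUDP breast corpus abstract (item 4 above).
TOTAL_PATIENTS = 296
CANCEROUS_PATIENTS = 74
SPLIT_TOTALS = {"train": 136, "development": 80, "evaluation": 80}
CANCEROUS_SPLIT = {"train": 34, "development": 20, "evaluation": 20}

ANNOTATED_SLIDES = 3505
TOTAL_ANNOTATIONS = 16971
ANNOTATIONS_BY_GROUP = {
    "non_cancerous": 8035,       # normal, background, null, artifact
    "carcinogenic_signs": 6222,  # inflammation, nonneoplastic, suspicious
    "cancerous": 2714,           # the two carcinoma labels
}


def non_cancerous_split():
    """Derive (by subtraction) how many non-cancerous patients fall in each set."""
    return {name: SPLIT_TOTALS[name] - CANCEROUS_SPLIT[name] for name in SPLIT_TOTALS}


def check_corpus_figures():
    """Assert the quoted totals agree; return mean annotations per slide."""
    assert sum(SPLIT_TOTALS.values()) == TOTAL_PATIENTS
    assert sum(CANCEROUS_SPLIT.values()) == CANCEROUS_PATIENTS
    assert sum(non_cancerous_split().values()) == TOTAL_PATIENTS - CANCEROUS_PATIENTS
    assert sum(ANNOTATIONS_BY_GROUP.values()) == TOTAL_ANNOTATIONS
    return round(TOTAL_ANNOTATIONS / ANNOTATED_SLIDES, 2)


if __name__ == "__main__":
    print(non_cancerous_split())
    print(check_corpus_figures())
```

Under these figures the three annotation groups sum to the stated 16,971 regions, and the 222 remaining patients divide into 102, 60, and 60 across train, development, and evaluation.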