Title: FRAMING A STEM EDUCATION-CAREER BRIDGE PROGRAM WITH A GLOBAL PARTNERSHIP MODEL AND FORENSICS ANALYTICS
The National Science Foundation funded the University of Central Oklahoma (UCO), a state university in the United States, for a three-year bridge program to broaden the participation of female students in Science, Technology, Engineering, and Mathematics (STEM). The project team proposed a global government-university-industry (GUI) model to collaborate with partnering institutions at the international, federal, and state levels. Partnering institutions included IBM, the FBI, the Oklahoma State Bureau of Investigation (OSBI), the Oklahoma Center for the Advancement of Science and Technology, the Francis Tuttle Innovation Center, and the Established Program to Stimulate Competitive Research. Representatives from these institutions served as advisory board members and internship sponsors who identified skill requirements and job trends. Phase one (2018) focused on the research and development (R&D) and implementation of a STEM program centered on Forensics Analytics (FA). The STEM+FA curriculum was designed around real-world applications and emerging technologies (e.g., IBM Watson, simulation, virtual reality). The STEM+FA pilot program consisted of simulated learning environments, STEM modules, cloud-based tutorials, and relational databases. These databases were similar to the Combined DNA Index System and the Automated Fingerprint Identification System, which the FBI and the OSBI have adopted to solve modern-day crimes (e.g., cyber security, homicide). Researchers pilot tested the STEM+FA program by collecting and analyzing quantitative and qualitative data. Findings from the pilot study indicated that the STEM+FA pilot program had positive effects on female students' career awareness and perceived competencies, whereas career interest remained unchanged.
Award ID(s):
1758975
NSF-PAR ID:
10124700
Journal Name:
Journal of applied global research
Volume:
11
Issue:
26
ISSN:
1940-1841
Page Range / eLocation ID:
61-77
Sponsoring Org:
National Science Foundation
More Like this
  1. Obeid, I. (Ed.)
    The Neural Engineering Data Consortium (NEDC) is developing the Temple University Digital Pathology Corpus (TUDP), an open source database of high-resolution images from scanned pathology samples [1], as part of its National Science Foundation-funded Major Research Instrumentation grant titled “MRI: High Performance Digital Pathology Using Big Data and Machine Learning” [2]. The long-term goal of this project is to release one million images. We have currently scanned over 100,000 images and are in the process of annotating breast tissue data for our first official corpus release, v1.0.0. This release contains 3,505 annotated images of breast tissue including 74 patients with cancerous diagnoses (out of a total of 296 patients). In this poster, we will present an analysis of this corpus and discuss the challenges we have faced in efficiently producing high quality annotations of breast tissue. It is well known that state of the art algorithms in machine learning require vast amounts of data. Fields such as speech recognition [3], image recognition [4] and text processing [5] are able to deliver impressive performance with complex deep learning models because they have developed large corpora to support training of extremely high-dimensional models (e.g., billions of parameters). Other fields that do not have access to such data resources must rely on techniques in which existing models can be adapted to new datasets [6]. A preliminary version of this breast corpus release was tested in a pilot study using a baseline machine learning system, ResNet18 [7], that leverages several open-source Python tools. The pilot corpus was divided into three sets: train, development, and evaluation. Portions of these slides were manually annotated [1] using the nine labels in Table 1 [8] to identify five to ten examples of pathological features on each slide. 
Not every pathological feature is annotated, meaning excluded areas can include focuses particular to these labels that are not used for training. A summary of the number of patches within each label is given in Table 2. To maintain a balanced training set, 1,000 patches of each label were used to train the machine learning model. Throughout all sets, only annotated patches were involved in model development. The performance of this model in identifying all the patches in the evaluation set can be seen in the confusion matrix of classification accuracy in Table 3. The highest performing labels were background, 97% correct identification, and artifact, 76% correct identification. A correlation exists between labels with more than 6,000 development patches and accurate performance on the evaluation set. Additionally, these results indicated a need to further refine the annotation of invasive ductal carcinoma (“indc”), inflammation (“infl”), nonneoplastic features (“nneo”), normal (“norm”) and suspicious (“susp”). This pilot experiment motivated changes to the corpus that will be discussed in detail in this poster presentation. To increase the accuracy of the machine learning model, we modified how we addressed underperforming labels. One common source of error arose with how non-background labels were converted into patches. Large areas of background within other labels were isolated within a patch resulting in connective tissue misrepresenting a non-background label. In response, the annotation overlay margins were revised to exclude benign connective tissue in non-background labels. Corresponding patient reports and supporting immunohistochemical stains further guided annotation reviews. The microscopic diagnoses given by the primary pathologist in these reports detail the pathological findings within each tissue site, but not within each specific slide. 
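The balanced training set described above (1,000 patches per label) could be drawn roughly as follows. This is a minimal sketch and not the authors' pipeline: the function name `balanced_sample`, the `(patch_id, label)` input format, and the toy data are assumptions made for illustration; only the label names "norm" and "indc" and the 1,000-per-label figure come from the text.

```python
import random
from collections import defaultdict

def balanced_sample(patches, per_label=1000, seed=0):
    """Draw up to `per_label` patch IDs for each label (hypothetical helper)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    # Group patch IDs by their annotation label.
    for patch_id, label in patches:
        by_label[label].append(patch_id)
    sample = {}
    for label, ids in sorted(by_label.items()):
        # A label with fewer patches than the quota contributes all of them.
        k = min(per_label, len(ids))
        sample[label] = rng.sample(ids, k)
    return sample

# Toy data (assumed): five "norm" patches and two "indc" patches.
toy = [(f"p{i}", "norm") for i in range(5)] + [(f"q{i}", "indc") for i in range(2)]
picked = balanced_sample(toy, per_label=3)
counts = {label: len(ids) for label, ids in picked.items()}
print(counts)
```

Sampling without replacement keeps each patch at most once per label; how the authors handled labels with fewer than 1,000 patches is not stated, so the cap used here is only one possible choice.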
The microscopic diagnoses informed revisions specifically targeting annotated regions classified as cancerous, ensuring that the labels “indc” and “dcis” were used only in situations where a micropathologist diagnosed it as such. Further differentiation of cancerous and precancerous labels, as well as the location of their focus on a slide, could be accomplished with supplemental immunohistochemically (IHC) stained slides. When distinguishing whether a focus is a nonneoplastic feature versus a cancerous growth, pathologists employ antigen targeting stains to the tissue in question to confirm the diagnosis. For example, a nonneoplastic feature of usual ductal hyperplasia will display diffuse staining for cytokeratin 5 (CK5) and no diffuse staining for estrogen receptor (ER), while a cancerous growth of ductal carcinoma in situ will have negative or focally positive staining for CK5 and diffuse staining for ER [9]. Many tissue samples contain cancerous and non-cancerous features with morphological overlaps that cause variability between annotators. The informative fields IHC slides provide could play an integral role in machine model pathology diagnostics. Following the revisions made on all the annotations, a second experiment was run using ResNet18. Compared to the pilot study, an increase of model prediction accuracy was seen for the labels indc, infl, nneo, norm, and null. This increase is correlated with an increase in annotated area and annotation accuracy. Model performance in identifying the suspicious label decreased by 25% due to the decrease of 57% in the total annotated area described by this label. A summary of the model performance is given in Table 4, which shows the new prediction accuracy and the absolute change in error rate compared to Table 3. The breast tissue subset we are developing includes 3,505 annotated breast pathology slides from 296 patients. The average size of a scanned SVS file is 363 MB. The annotations are stored in an XML format. 
A CSV version of the annotation file is also available; it provides a flat, or simple, annotation that is easy for machine learning researchers to access and interface to their systems. Each patient is identified by an anonymized medical reference number. Within each patient’s directory, one or more sessions are identified, with dates anonymized to the first of the month in which the sample was taken. These sessions are broken into groupings of tissue taken on that date (in this case, breast tissue). A deidentified patient report stored as a flat text file is also available. Within these slides there are a total of 16,971 annotated regions, with an average of 4.84 annotations per slide. Among those annotations, 8,035 are non-cancerous (normal, background, null, and artifact), 6,222 are carcinogenic signs (inflammation, nonneoplastic, and suspicious), and 2,714 are cancerous labels (ductal carcinoma in situ and invasive ductal carcinoma). The individual patients are split into three sets: train, development, and evaluation. Of the 74 cancerous patients, 20 each were allotted to the development and evaluation sets, while the remaining 34 were allotted to train. The remaining 222 patients were split to preserve the overall distribution of labels within the corpus. This was done in the hope of creating control sets for comparable studies. Overall, the development and evaluation sets each have 80 patients, while the training set has 136 patients. In a related component of this project, slides from the Fox Chase Cancer Center (FCCC) Biosample Repository (https://www.foxchase.org/research/facilities/genetic-research-facilities/biosample-repository-facility) are being digitized in addition to slides provided by Temple University Hospital (TUH). This data includes 18 different types of tissue, including approximately 38.5% urinary tissue and 16.5% gynecological tissue. 
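The patient-level split and the annotation counts reported above can be checked with a short sketch. It is illustrative only: the patient IDs and the helper `split_patients` are hypothetical, and the sketch shuffles patients at random within the cancerous and non-cancerous groups rather than preserving the full label distribution as the authors describe. The quotas (20 cancerous patients in each of the development and evaluation sets, 80 patients in each of those sets) are taken from the text.

```python
import random

def split_patients(cancerous, other, seed=0):
    """Split patient IDs into train/dev/eval at the patient level (hypothetical helper)."""
    rng = random.Random(seed)
    cancerous = list(cancerous)
    other = list(other)
    rng.shuffle(cancerous)
    rng.shuffle(other)
    n_c = 20        # cancerous patients in each of dev and eval (from the text)
    n_o = 80 - n_c  # non-cancerous patients needed to reach 80 per set
    return {
        "dev": cancerous[:n_c] + other[:n_o],
        "eval": cancerous[n_c:2 * n_c] + other[n_o:2 * n_o],
        "train": cancerous[2 * n_c:] + other[2 * n_o:],
    }

# Hypothetical anonymized IDs: 74 cancerous and 222 other patients (296 in total).
cancerous_ids = [f"c{i:03d}" for i in range(74)]
other_ids = [f"n{i:03d}" for i in range(222)]
splits = split_patients(cancerous_ids, other_ids)
sizes = {name: len(ids) for name, ids in splits.items()}
print(sizes)

# Consistency checks on the reported annotation counts.
total_regions = 8035 + 6222 + 2714
print(total_regions, round(total_regions / 3505, 2))
```

Splitting by patient rather than by slide or patch keeps all of a patient's data in a single set, which avoids leakage between the training and evaluation sets.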
These slides and the metadata provided with them are already anonymized and include diagnoses in a spreadsheet with sample and patient ID. We plan to release over 13,000 unannotated slides from the FCCC Corpus simultaneously with v1.0.0 of TUDP. Details of this release will also be discussed in this poster. Few digitally annotated databases of pathology samples like TUDP exist due to the extensive data collection and processing required. The breast corpus subset should be released by November 2021. By December 2021 we should also release the unannotated FCCC data. We are currently annotating urinary tract data as well. We expect to release about 5,600 processed TUH slides in this subset. We have an additional 53,000 unprocessed TUH slides digitized. Corpora of this size will stimulate the development of a new generation of deep learning technology. In clinical settings where resources are limited, an assistive diagnostic model could support pathologists’ workload and even help prioritize suspected cancerous cases. ACKNOWLEDGMENTS This material is supported by the National Science Foundation under grants nos. CNS-1726188 and 1925494. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. REFERENCES [1] N. Shawki et al., “The Temple University Digital Pathology Corpus,” in Signal Processing in Medicine and Biology: Emerging Trends in Research and Applications, 1st ed., I. Obeid, I. Selesnick, and J. Picone, Eds. New York City, New York, USA: Springer, 2020, pp. 67–104. https://www.springer.com/gp/book/9783030368432. [2] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning.” Major Research Instrumentation (MRI), Division of Computer and Network Systems, Award No. 1726188, January 1, 2018 – December 31, 2021. https://www.isip.piconepress.com/projects/nsf_dpath/. 
[3] A. Gulati et al., “Conformer: Convolution-augmented Transformer for Speech Recognition,” in Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), 2020, pp. 5036–5040. https://doi.org/10.21437/interspeech.2020-3015. [4] C.-J. Wu et al., “Machine Learning at Facebook: Understanding Inference at the Edge,” in Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2019, pp. 331–344. https://ieeexplore.ieee.org/document/8675201. [5] I. Caswell and B. Liang, “Recent Advances in Google Translate,” Google AI Blog: The latest from Google Research, 2020. [Online]. Available: https://ai.googleblog.com/2020/06/recent-advances-in-google-translate.html. [Accessed: 01-Aug-2021]. [6] V. Khalkhali, N. Shawki, V. Shah, M. Golmohammadi, I. Obeid, and J. Picone, “Low Latency Real-Time Seizure Detection Using Transfer Deep Learning,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2021, pp. 1–7. https://www.isip.piconepress.com/publications/conference_proceedings/2021/ieee_spmb/eeg_transfer_learning/. [7] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning,” Philadelphia, Pennsylvania, USA, 2020. https://www.isip.piconepress.com/publications/reports/2020/nsf/mri_dpath/. [8] I. Hunt, S. Husain, J. Simons, I. Obeid, and J. Picone, “Recent Advances in the Temple University Digital Pathology Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2019, pp. 1–4. https://ieeexplore.ieee.org/document/9037859. [9] A. P. Martinez, C. Cohen, K. Z. Hanley, and X. (Bill) Li, “Estrogen Receptor and Cytokeratin 5 Are Reliable Markers to Separate Usual Ductal Hyperplasia From Atypical Ductal Hyperplasia and Low-Grade Ductal Carcinoma In Situ,” Arch. Pathol. Lab. Med., vol. 140, no. 7, pp. 686–689, Apr. 2016. https://doi.org/10.5858/arpa.2015-0238-OA. 
  2. Need/Motivation (e.g., goals, gaps in knowledge) The ESTEEM project implemented a STEM capacity-building effort that gives students early access to sustainable and innovative STEM stepping stones called Micro-Internships (MIs). The goal is to deliver the key benefits of full-length internships and undergraduate research experiences in an abbreviated format, including access, success, degree completion, transfer, and the recruitment and retention of more Latinx and underrepresented students in the STEM workforce. The MIs are designed to provide students at a community college and Hispanic-Serving Institution (HSI) with authentic STEM research and applied learning experiences (ALE), support for choosing an appropriate STEM pathway or career, the preparation and confidence to succeed in STEM and to engage in summer-long Research Experiences for Undergraduates (REUs), and improved outcomes. The MI projects are accessible early to more students and build momentum to better overcome critical obstacles to success. The MIs are shorter, flexibly scheduled throughout the year, and easily accessible, and participation in multiple MIs is encouraged. ESTEEM also establishes a sustainable and collaborative model, working with partners from BSCS Science Education, for MI mentoring, training, compliance, and capacity building, with shared values and practices to maximize the improvement of student outcomes. New Knowledge (e.g., hypothesis, research questions) Research indicates that REU/internship experiences can be particularly powerful for students from Latinx and underrepresented groups in STEM. However, those experiences are difficult to access for many HSI community college students (85% of our students hold off-campus jobs), and lack of confidence is a barrier for a majority of our students. 
The gap between those who can and those who cannot is the “internship access gap.” This project is at a central California Community College (CCC) and HSI, the only affordable post-secondary option in a region serving a historically underrepresented population in STEM, including 75% Hispanic, and 87% have not completed college. MI is designed to reduce inequalities inherent in the internship paradigm by providing access to professional and research skills for those underserved students. The MI has been designed to reduce barriers by offering: shorter duration (25 contact hours); flexible timing (one week to once a week over many weeks); open access/large group; and proximal location (on-campus). MI mentors participate in week-long summer workshops and ongoing monthly community of practice with the goal of co-constructing a shared vision, engaging in conversations about pedagogy and learning, and sustaining the MI program going forward. Approach (e.g., objectives/specific aims, research methodologies, and analysis) Research Question and Methodology: We want to know: How does participation in a micro-internship affect students’ interest and confidence to pursue STEM? We used a mixed-methods design triangulating quantitative Likert-style survey data with interpretive coding of open-responses to reveal themes in students’ motivations, attitudes toward STEM, and confidence. Participants: The study sampled students enrolled either part-time or full-time at the community college. Although each MI was classified within STEM, they were open to any interested student in any major. Demographically, participants self-identified as 70% Hispanic/Latinx, 13% Mixed-Race, and 42 female. Instrument: Student surveys were developed from two previously validated instruments that examine the impact of the MI intervention on student interest in STEM careers and pursuing internships/REUs. 
Also, the pre- and post-surveys (every e months to assess longitudinal outcomes) included relevant open-response prompts. The surveys collected students’ demographics; interest, confidence, and motivation in pursuing a career in STEM; perceived obstacles; and past experiences with internships and MIs. At the time of submission, 171 students had responded to the pre-survey. Outcomes (e.g., preliminary findings, accomplishments to date) Because we have just finished year 1, we do not yet have longitudinal data to reveal whether student confidence is maintained over time and whether students are more likely to (i) enroll in more internships, (ii) transfer to a four-year university, or (iii) shorten the time to degree attainment. For short-term outcomes, students significantly increased their confidence to continue pursuing opportunities to develop within the STEM pipeline, including full-length internships, completing STEM degrees, and applying for jobs in STEM. For example, using a two-tailed t-test we compared means before and after the MI experience. Of the 16 questions that showed improvement in scores, 15 were related to student confidence to pursue STEM or perceived enjoyment of a STEM career. Findings from the free-response questions showed that the majority of students reported enrolling in the MI to gain knowledge and experience. After the MI, 66% of students reported having gained valuable knowledge and experience, and 35% of students spoke about gaining confidence and/or momentum to pursue STEM as a career. 
Broader Impacts (e.g., the participation of underrepresented minorities in STEM; development of a diverse STEM workforce, enhanced infrastructure for research and education) The ESTEEM project has the potential for a transformational impact on access and success in STEM undergraduate education for underrepresented and Latinx community college students, as well as on STEM capacity building at Hartnell College, a CCC and HSI, for students, faculty, professionals, and processes that foster research in STEM and education. By sharing and transferring the ESTEEM model to similar institutions, the project has the potential to change the way students are served at an early and critical stage of their higher education experience at CCCs. One in every five community college students in the nation attends a CCC, over 67% of CCC students identify themselves with ethnic backgrounds that are not White, and 40 to 50% of University of California and California State University graduates in STEM started at a CCC, making the CCC system a key leverage point for recruiting and retaining a more diverse STEM workforce. 
  3. To remain competitive in the global economy, the United States needs skilled technical workers in occupations requiring a high level of domain-specific technical knowledge to meet the country’s anticipated shortage of 5 million technically-credentialed workers. The changing demographics of the country are of increasing importance to addressing this workforce challenge. According to federal data, half the students earning a certificate in 2016-17 received credentials from community colleges where the percent enrollment of Latinx (a gender-neutral term referencing Latin American cultural or racial identity) students (56%) exceeds that of other post-secondary sectors. If this enrollment rate persists, then by 2050 over 25% of all students enrolled in higher education will be Latinx. Hispanic Serving Institutions (HSIs) are essential points of access as they enroll 64% of all Latinx college students, and nearly 50% of all HSIs are 2-year institutions. Census estimates predict Latinxs are the fastest-growing segment reaching 30% of the U.S. population while becoming the youngest group comprising 33.5% of those under 18 years by 2060. The demand for skilled workers in STEM fields will be met when workers reflect the diversity of the population, therefore more students—of all ages and backgrounds—must be brought into community colleges and supported through graduation: a central focus of community colleges everywhere. While Latinx students of color are as likely as white students to major in STEM, their completion numbers drop dramatically: Latinx students often have distinct needs that evolved from a history of discrimination in the educational system. 
HSI ATE Hub is a three-year collaborative research project funded by the National Science Foundation Advanced Technological Education Program (NSF ATE) being implemented by Florence Darlington Technical College and Science Foundation Arizona Center for STEM at Arizona State University to address the imperative that 2-year Hispanic Serving Institutions (HSIs) develop and improve engineering technology and related technician education programs in a way that is culturally inclusive. Interventions focus on strengthening grant-writing skills among CC HSIs to fund advancements in technician education and connecting 2-year HSIs with resources for faculty development and program improvement. A mixed methods approach will explore the following research questions: 1) What are the unique barriers and challenges for 2-year HSIs related to STEM program development and grant-writing endeavors? 2) How do we build capacity at 2-year HSIs to address these barriers and challenges? 3) How do mentoring efforts/styles need to differ? 4) How do existing ATE resources need to be augmented to better serve 2-year HSIs? 5) How do proposal submission and success rates compare for 2-year HSIs that have gone through the KS STEM planning process but not M-C, through the M-C cohort mentoring process but not KS, and through both interventions? The project will identify HSI-relevant resources, augment existing ATE resources, and create new ones to support 2-year HSI faculty as potential ATE grantees. To address the distinct needs of Latinx students in STEM, resources representing best practices and frameworks for cultural inclusivity, as well as faculty development will be included. Throughout, the community-based tradition of the ATE Program is being fostered with particular emphasis on forming, nurturing, and serving participating 2-year HSIs. 
This paper will discuss the need, baseline data, and early results for the three-year program, setting the stage for a series of annual papers that report new findings. 
  5. Research and evidence-based practices that center sense of belonging and engineering identity development drive strong outcomes for undergraduate students in engineering—especially those who are first-generation college students, from low-income families, and identify as other underrepresented groups in engineering (Deil-Amen, 2011; Hurtado, Cabrera, Lin, & Arellano, 2009; Patrick & Prybutok, 2018). The process from ideation to organizational implementation is not well-documented in the literature on student success, leaving a gap in practitioners’ understanding of how to bring strong, research-informed practices to fruition in their institutions. Implementation is arguably as important as the design of a student intervention and knowing how to implement a good idea is an art and a science. This paper explores the various people and processes that take theory to practice for a National Science Foundation Improving Undergraduate STEM Education funded program. In this paper, I invoke an autoethnographic approach to reflect on the experience of designing a student-facing program while managing the organizational systems that empower or restrain transformative organizational change for students. Autoethnography as a methodology can be a helpful mode to understanding practice, as the researcher can move more fluidly between their lived experience and the organizational, sociological, or psychosocial theory that it mirrors (Berry & Hodges, 2015). The proposed paper discusses my team’s approaches to working with stakeholders and gatekeepers in our organization and in our community to execute a program designed to build sense of belonging and engineering identity while supporting academic attainment of underserved student populations using Community Cultural Wealth (Yosso, 2005) and Street-Level Bureaucracy (Lipsky, 1980) as theoretical lenses. 
A small, summer-intensive program required the cooperation and capital of gatekeepers across the campus of our large, research university in the southwestern United States. This program, which serves students from marginalized ethnic and socioeconomic backgrounds in engineering disciplines, became the basis for an NSF Improving Undergraduate STEM Education award. Students spent part of their summer (six weeks during the pilot program, which evolved to ten weeks for the second cohort) taking summer classes that helped them advance into their sophomore year of an engineering degree. They also took a career development class, which featured regular field trips to various regional engineering employers. Outcomes from the pilot program and subsequent year are promising, and include high rates of persistence, strong academic performance, and increased sense of engineering identity, but this paper focuses on the structure of the program, the need for collaborators, and the way that the team implemented an initiative which challenges the assumptions of stakeholders from within and outside of the institution. Major themes discussed are personal reflections of the process of coalition-building, gaining buy-in from critical partners on-campus and in the community, and co-investing in programmatic improvement with early cohorts of participating students. 