This content will become publicly available on December 10, 2024
- PAR ID:
- 10491341
- Publisher / Repository:
- PMLR
- Date Published:
- Journal Name:
- Proceedings of the 3rd Machine Learning for Health Symposium
- Format(s):
- Medium: X
- Location:
- New Orleans, LA
- Sponsoring Org:
- National Science Foundation
More Like this
-
Mortazavi, Bobak J ; Sarker, Tasmie ; Beam, Andrew ; Ho, Joyce C (Ed.) Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit on frequent tokens. The token imbalance may dampen the robustness of radiology report generators, as complex medical terms appear less frequently but reflect more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets (IU X-RAY and MIMIC-CXR) of radiology report generation. To solve the challenge, we propose the Token Imbalance Adapter (TIMER), aiming to improve generation robustness on infrequent tokens. The model automatically leverages token imbalance by an unlikelihood loss and dynamically optimizes generation processes to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness both overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method has a major effect in adapting to token imbalance for radiology report generation.
-
Deep generative models have enabled the automated synthesis of high-quality data for diverse applications. However, the most effective generative models are specialized to data from a single domain (e.g., images or text). Real-world applications such as healthcare require multi-modal data from multiple domains (e.g., both images and corresponding text), which are difficult to acquire due to limited availability and privacy concerns and are much harder to synthesize. To tackle this joint synthesis challenge, we propose an End-to-end MultImodal X-ray genERative model (EMIXER) for jointly synthesizing X-ray images and corresponding free-text reports, all conditional on diagnosis labels. EMIXER is a conditional generative adversarial model that works by 1) generating an image based on a label, 2) encoding the image to a hidden embedding, 3) producing the corresponding text via a hierarchical decoder from the image embedding, and 4) assessing both the image and the corresponding text with a joint discriminator. EMIXER also enables self-supervision to leverage vast amounts of unlabeled data. Extensive experiments with real X-ray report data illustrate how data augmentation using synthesized multimodal samples can improve the performance of a variety of supervised tasks, including COVID-19 X-ray classification with very limited samples. The quality of the generated images and reports is also confirmed by radiologists. We quantitatively show that EMIXER-generated synthetic datasets can augment X-ray image classification and report generation models to achieve 5.94% and 6.9% improvements, respectively, over models trained only on real data samples. Taken together, our results highlight the promise of state-of-the-art generative models to advance clinical machine learning.
-
Summary: A fast pink‐beam X‐ray microtomography methodology was developed at the GSECARS 13‐BMD beamline at the Advanced Photon Source to study multiphase flow in porous media. The white beam X‐ray distribution of the Advanced Photon Source is modified using a 1‐mm copper filter and the beam is reflected off a platinum mirror angled at 1.5 mrad, resulting in a pink beam with X‐ray intensities predominantly in the range of 40–60 keV. Bubble formation in the wetting phase and wettability alteration of the solid phase from X‐ray exposure can be a problem with high flux and high energy beams, but the suggested pink‐beam configuration mitigates these effects. With a 14‐second acquisition time for capturing a complete dataset, the evolving fluid fronts of nonequilibrium three‐dimensional multiphase flow can be studied in real time, and the images contain adequate contrast and quality to measure important multiphase quantities such as contact angles and interfacial areas.
Lay Description: Understanding how fluids are transported through porous materials is pertinent to many important societal processes in the environment (e.g. groundwater flow for drinking water) and industry (e.g. drying of industrial materials such as pulp and paper). To develop accurate models and theories of this fluid transport, experiments need to track fluids in three dimensions quickly. This is difficult to do because most materials are opaque, so cameras cannot capture fluid movement directly. However, with the help of X‐rays, scientists can track fluids in 3D using an imaging technique called X‐ray microtomography (μCT). Standard μCT takes about 15 minutes for one image, which can produce blurry images if fluids are flowing quickly through the material. We present a technique, fast μCT, which uses a larger spectrum of X‐rays than the standard technique and acquires a 3D image in 14 seconds. With the large number of X‐rays used in this technique, bubbles can start to form in the fluids from X‐ray exposure. We optimized the X‐ray spectrum to limit bubble formation while still achieving rapid 3D image acquisition with adequate image quality and contrast. With this technique, scientists can study fluid transport in 3D porous materials in near real time to improve the models used to ensure public and environmental health.
-
Creating a large-scale dataset of abnormality annotations on medical images is a labor-intensive and costly task. Leveraging weak supervision from readily available data such as radiology reports can compensate for the lack of large-scale data for anomaly detection methods. However, most current methods only use image-level pathological observations, failing to utilize the relevant anatomy mentions in reports. Furthermore, Natural Language Processing (NLP)-mined weak labels are noisy due to label sparsity and linguistic ambiguity. We propose an Anatomy-Guided chest X-ray Network (AGXNet) to address these issues of weak annotation. Our framework consists of a cascade of two networks, one responsible for identifying anatomical abnormalities and the second responsible for pathological observations. The critical component in our framework is an anatomy-guided attention module that aids the downstream observation network in focusing on the relevant anatomical regions generated by the anatomy network. We use Positive Unlabeled (PU) learning to account for the fact that lack of mention does not necessarily mean a negative label. Our quantitative and qualitative results on the MIMIC-CXR dataset demonstrate the effectiveness of AGXNet in disease and anatomical abnormality localization. Experiments on the NIH Chest X-ray dataset show that the learned feature representations are transferable and can achieve state-of-the-art performance in disease classification and competitive disease localization results. Our code is available at https://github.com/batmanlab/AGXNet.
-
Photobombing occurs very often in photography and inconveniences the main subject(s) of a photo. There is therefore a legitimate need to remove photobombing from captured images to produce a pleasing result. In this paper, we conduct a benchmark on this problem. To this end, we first collect a dataset of images with undesired and distracting elements that require the removal of photobombing. Then, we annotate the photobombed regions that should be removed. Next, different image inpainting methods are leveraged to remove the photobombed regions and reconstruct the image. We further invited professional photoshoppers to remove the unwanted regions. These photoshopped images are considered the ground truth. In our benchmark, several performance metrics are leveraged to compare the results of different methods with the ground truth. The experiments provide insightful results that demonstrate the effectiveness of inpainting methods on this particular problem.