This work presents an optimization framework that embeds the wave equation into dictionary learning as a strategy for incorporating prior scientific knowledge into a machine learning algorithm. We adapt dictionary learning to study ultrasonic guided wave-based defect detection for non-destructive structural health monitoring systems. Specifically, we modify the popular K-SVD algorithm for dictionary learning by enforcing prior knowledge about the ultrasonic guided wave problem through a physics-based regularization derived from the wave equation; we call the resulting method "wave-informed K-SVD." Training dictionaries with both K-SVD and wave-informed K-SVD on noisy data simulated from a fixed string, we show that the columns of the learned dictionary are more physically consistent with the known modal behavior of several one-dimensional wave simulations.
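The abstract does not spell out the exact form of the physics-based regularizer, so the sketch below is only illustrative: it updates a single dictionary atom under an assumed squared Helmholtz penalty (the discrete analogue of d'' + k²d = 0, which a string mode of wavenumber k obeys). The function names, the wavenumber k, and the weight gamma are placeholders, not the authors' implementation.

```python
import numpy as np

def second_difference(n):
    """Discrete 1-D second-difference (Laplacian) matrix with fixed ends."""
    return -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

def wave_regularized_atom_update(Y_res, x_row, k, gamma):
    """Update one dictionary atom d against the residual Y_res (n x m) and its
    sparse-code row x_row (length m), penalizing departure from the discrete
    Helmholtz relation d'' + k^2 d = 0:

        minimize_d  ||Y_res - d x_row^T||_F^2 + gamma * ||(L + k^2 I) d||_2^2
    """
    n = Y_res.shape[0]
    M = second_difference(n) + (k ** 2) * np.eye(n)
    A = (x_row @ x_row) * np.eye(n) + gamma * (M.T @ M)   # normal equations
    b = Y_res @ x_row
    d = np.linalg.solve(A, b)
    return d / (np.linalg.norm(d) + 1e-12)                # unit-norm atom
```

With gamma = 0 this reduces to an ordinary least-squares atom update; a large gamma pushes the atom toward a sinusoidal string mode of the assumed wavenumber.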
K-SVD based Periodicity Dictionary Learning
It has recently been shown that periodicity in discrete-time data can be analyzed using Ramanujan sums and associated dictionaries. This paper explores the role of dictionary learning methods in the context of period estimation and periodic signal representation using dictionaries. It is shown that a well-known dictionary learning algorithm, namely K-SVD, is able to learn Ramanujan and Farey periodicity dictionaries from the noisy, sparse coefficient data generated from them without imposing any periodicity structure in the learning stage. This similarity between the learned dictionary and the underlying original periodicity dictionary reaffirms the ability of K-SVD to recover the right dictionary from data without explicit application-specific constraints. The paper also examines how the choice of different parameter values affects the similarity of the learned dictionary to the underlying dictionary. Two versions of K-SVD, along with different initializations, are analyzed for their effect on representation and denoising error for the data.
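For readers unfamiliar with K-SVD, here is a minimal sketch of one iteration of the standard algorithm (greedy sparse coding followed by per-atom rank-1 SVD updates). It is a simplified illustration only, not the exact variants, pursuit stage, or parameter settings compared in the paper.

```python
import numpy as np

def ksvd_step(Y, D, sparsity):
    """One K-SVD iteration: greedy sparse coding of each signal in Y (n x m),
    then a rank-1 SVD update of every dictionary atom and its coefficients."""
    n_atoms = D.shape[1]
    X = np.zeros((n_atoms, Y.shape[1]))
    # --- sparse coding (simplified orthogonal matching pursuit) ---
    for j in range(Y.shape[1]):
        r, support = Y[:, j].copy(), []
        for _ in range(sparsity):
            support.append(int(np.argmax(np.abs(D.T @ r))))
            coef, *_ = np.linalg.lstsq(D[:, support], Y[:, j], rcond=None)
            r = Y[:, j] - D[:, support] @ coef
        X[support, j] = coef
    # --- dictionary update, one atom at a time ---
    for k in range(n_atoms):
        users = np.nonzero(X[k, :])[0]          # signals that use atom k
        if users.size == 0:
            continue
        E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]
        X[k, users] = s[0] * Vt[0, :]
    return D, X
```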
- Award ID(s):
- 1712633
- PAR ID:
- 10275652
- Date Published:
- Journal Name:
- Proc. Asil. Conf. Sig., Sys., and Comp
- Page Range / eLocation ID:
- 1333 to 1337
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
The inverse problem of inferring the clinical gold-standard electrocardiogram (ECG) from the photoplethysmogram (PPG), which can be measured by affordable wearable Internet of Healthcare Things (IoHT) devices, is a research direction receiving growing attention. It combines the easy measurability of PPG and the rich clinical knowledge of ECG for long-term continuous cardiac monitoring. The prior art for reconstruction using a universal basis, such as the discrete cosine transform (DCT), has limited fidelity for uncommon ECG shapes due to its lack of representative power. To better utilize the data and improve data representation, we design two dictionary learning frameworks, the cross-domain joint dictionary learning (XDJDL) and the label-consistent XDJDL (LC-XDJDL), to further improve the ECG inference quality and enrich the PPG-based diagnosis knowledge. Building on the K-SVD technique, the proposed joint dictionary learning frameworks extend its expressive power by simultaneously optimizing a pair of signal dictionaries for PPG and ECG together with the transforms that relate their sparse codes and disease information. The proposed models are evaluated with a variety of PPG and ECG morphologies from two benchmark datasets that cover various age groups and disease types. The results show the proposed frameworks achieve better inference performance than previous methods, with average Pearson coefficients of 0.88 using XDJDL and 0.92 using LC-XDJDL, suggesting an encouraging potential for ECG screening using PPG based on the proactively learned PPG-ECG relationship. By enabling the dynamic monitoring and analysis of the health status of an individual, the proposed frameworks contribute to the emerging digital twins paradigm for personalized healthcare.
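As a rough illustration of how a learned cross-domain pair of dictionaries could be used at inference time, the sketch below sparse-codes a PPG segment, maps the codes through a linear transform, and synthesizes an ECG estimate. The names D_ppg, D_ecg, W, and the greedy pursuit are hypothetical simplifications; the actual XDJDL/LC-XDJDL objectives jointly learn these quantities and, in the label-consistent variant, also couple them to disease labels.

```python
import numpy as np

def infer_ecg_from_ppg(ppg, D_ppg, D_ecg, W, sparsity=8):
    """Sparse-code a PPG cycle over its learned dictionary, map the codes to
    the ECG code space through a learned linear transform W, synthesize ECG."""
    x = np.zeros(D_ppg.shape[1])
    r, support = ppg.copy(), []
    for _ in range(sparsity):                      # simple greedy pursuit
        support.append(int(np.argmax(np.abs(D_ppg.T @ r))))
        coef, *_ = np.linalg.lstsq(D_ppg[:, support], ppg, rcond=None)
        r = ppg - D_ppg[:, support] @ coef
    x[support] = coef
    return D_ecg @ (W @ x)                         # estimated ECG cycle
```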
-
Detecting and locating damage from waves reflected off defects is a common practice in non-destructive structural health monitoring systems. Yet the transmitted ultrasonic guided waves are affected by the physical and material properties of the structure and are often complicated to model mathematically. This calls for data-driven approaches to model the behaviour of waves, where patterns in wave data due to damage can be learned and distinguished from non-damage data. Recent works have used a popular dictionary learning algorithm, K-SVD, to learn an overcomplete dictionary for waves propagating in a metal plate. However, domain knowledge is not utilized, which may lead to fruitless results when the data contain strong patterns that are of no interest to the domain. In this work, instead of treating the K-SVD algorithm as a black box, we create a novel modification by enforcing domain knowledge. In particular, we look at how regularizing the K-SVD algorithm with the one-dimensional wave equation affects the dictionary learned in the simple case of a vibrating string. By adding additional non-wave patterns (noise) to the data, we demonstrate that the "wave-informed K-SVD" does not learn patterns that violate the wave equation, and hence learns patterns from the data rather than from the noise.
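A minimal sketch of the kind of synthetic data described here, assuming snapshots of a fixed-fixed string built from a few standing-wave modes with additive Gaussian noise; the specific mode set, amplitudes, and noise level are placeholders, not the paper's exact simulation.

```python
import numpy as np

def simulate_string_snapshots(n_points=100, n_snapshots=200, modes=(1, 2, 3),
                              length=1.0, noise_std=0.1, seed=0):
    """Synthesize fixed-fixed string vibration data: each snapshot is a random
    mixture of standing-wave mode shapes sin(m*pi*x/L), plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, length, n_points)
    Y = np.zeros((n_points, n_snapshots))
    for m in modes:
        shape = np.sin(m * np.pi * x / length)      # spatial mode shape
        Y += np.outer(shape, rng.standard_normal(n_snapshots))
    return Y + noise_std * rng.standard_normal(Y.shape)
```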
-
Dictionaries remain the most well-studied class of data structures. A dictionary supports insertions, deletions, membership queries, and usually successor, predecessor, and extract-min. In a RAM, all such operations take O(log n) time on n elements. Dictionaries are often cross-referenced as follows. Consider a set of tuples {⟨a_i, b_i, c_i, …⟩}. A database might include more than one dictionary on such a set, for example, one indexed on the a's, another on the b's, and so on. Once again, in a RAM, inserting into a set of L cross-referenced dictionaries takes O(L log n) time, as does deleting. The situation is more interesting in external memory. On a Disk Access Machine (DAM), B-trees achieve O(log_B N) I/Os for insertions and deletions on a single dictionary, and K-element range queries take optimal O(log_B N + K/B) I/Os. These bounds are also achievable by a B-tree on cross-referenced dictionaries, with a slowdown of a factor of L on insertions and deletions. In recent years, both the theory and practice of external-memory dictionaries have been revolutionized by write-optimization techniques. A dictionary is write-optimized if it is close to a B-tree for query time while beating B-trees on insertions. The best (and optimal) write-optimized dictionaries achieve a substantially improved amortized insertion and deletion cost on a single dictionary while maintaining optimal O(log_{1+B^ε} N + K/B)-I/O range queries. Although write optimization still helps for insertions into cross-referenced dictionaries, its value for deletions would seem to be greatly reduced. A deletion into a cross-referenced dictionary only specifies a key a. It seems to be necessary to look up the associated values b, c, … in order to delete them from the other dictionaries. This takes Ω(log_B N) I/Os, well above the per-dictionary write-optimization budget, so the lookup dominates the total deletion cost. In short, for deletions, write optimization offers an advantage over B-trees in that L multiplies only a lower-order term, but even when L = 2, write optimization seems to offer no asymptotic advantage over B-trees. That is, no known query-optimal solution for pairs of cross-referenced dictionaries seems to beat B-trees for deletions. In this paper, we show a lower bound establishing that a pair of cross-referenced dictionaries that is optimal for range queries and supports deletions cannot match the write-optimization bound available to insert-only dictionaries. This result thus establishes a limit to the applicability of write-optimization techniques on which many new databases and file systems are based. Read More: http://epubs.siam.org/doi/10.1137/1.9781611974782.99
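To make the deletion argument concrete, the accounting below assumes the standard B^ε-tree write-optimization bound for the per-dictionary cost; this bound is an assumption for illustration, and the paper's exact statement may differ.

```latex
% Back-of-the-envelope I/O accounting for deleting key a from L
% cross-referenced dictionaries (per-dictionary write-optimized cost is the
% assumed standard B^epsilon-tree bound, not a quotation from the paper).
\begin{align*}
  \text{B-tree delete, $L$ dictionaries:}\quad
    & O\!\left(L \log_B N\right) \\
  \text{Write-optimized deletes alone:}\quad
    & O\!\left(L \cdot \frac{\log_{B^{\varepsilon}} N}{B^{1-\varepsilon}}\right) \\
  \text{Looking up the cross-referenced keys $b, c, \dots$:}\quad
    & \Omega\!\left(\log_B N\right) \\
  \text{Total cross-referenced delete:}\quad
    & \Omega\!\left(\log_B N\right)
      + O\!\left(L \cdot \frac{\log_{B^{\varepsilon}} N}{B^{1-\varepsilon}}\right)
\end{align*}
```

The lookup term does not shrink with write optimization, which is why the L factor only multiplies a lower-order term and why small L yields no asymptotic gain over B-trees for deletions.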
-
It is common practice for data providers to include text descriptions for each column when publishing data sets in the form of data dictionaries. While these documents are useful in helping an end-user properly interpret the meaning of a column in a data set, existing data dictionaries typically are not machine-readable and do not follow a common specification standard. We introduce the Semantic Data Dictionary, a specification that formalizes the assignment of a semantic representation of data, enabling standardization and harmonization across diverse data sets. In this paper, we present our Semantic Data Dictionary work in the context of our work with biomedical data; however, the approach can and has been used in a wide range of domains. The rendition of data in this form helps promote improved discovery, interoperability, reuse, traceability, and reproducibility. We present the associated research and describe how the Semantic Data Dictionary can help address existing limitations in the related literature. We discuss our approach, present an example by annotating portions of the publicly available National Health and Nutrition Examination Survey data set, present modeling challenges, and describe the use of this approach in sponsored research, including our work on a large National Institutes of Health (NIH)-funded exposure and health data portal and in the RPI-IBM collaborative Health Empowerment by Analytics, Learning, and Semantics project. We evaluate this work in comparison with traditional data dictionaries, mapping languages, and data integration tools.