Traditional machine learning approaches for recognizing modes of transportation rely heavily on hand-crafted feature extraction methods, which require domain knowledge. We therefore propose a hybrid deep learning model, Deep Convolutional Bidirectional-LSTM (DCBL), which combines convolutional and bidirectional LSTM layers and is trained directly on raw sensor data to predict transportation modes. We compare our model to the traditional machine learning approach of training Support Vector Machine and Multilayer Perceptron models on extracted features. In our experiments, DCBL achieves higher accuracy than the feature-extraction-based methods and simplifies the data processing pipeline. The models are trained on the Sussex-Huawei Locomotion-Transportation (SHL) dataset. The submission of our team, Vahan, to the SHL recognition challenge uses an ensemble of DCBL models trained on raw data with different combinations of sensors and window sizes, and achieved an F1-score of 0.96 on our test data.
                    
                            
                            Amino Acid k-mer Feature Extraction for Quantitative Antimicrobial Resistance (AMR) Prediction by Machine Learning and Model Interpretation for Biological Insights
                        
                    
    
            Machine learning algorithms can learn mechanisms of antimicrobial resistance from DNA sequence data without any a priori information. Interpreting a trained machine learning model can be used both to validate the model and to obtain new information about resistance mechanisms. Different feature extraction methods, such as SNP calling and counting nucleotide k-mers, have been proposed for presenting DNA sequences to the model. However, these methods involve trade-offs between interpretability, computational complexity, and accuracy. In this study, we propose a new feature extraction method, counting amino acid k-mers (oligopeptides), which provides easier model interpretation than counting nucleotide k-mers and reaches the same or even better accuracy than the other methods. Additionally, we trained machine learning algorithms using the different feature extraction methods and compared the results in terms of accuracy, model interpretability, and computational complexity. We also built a new feature selection pipeline for extracting important features so that new AMR determinants can be discovered by analyzing them. This pipeline allows the construction of models that use only a small number of features yet predict resistance accurately.
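As a rough illustration of what counting amino acid k-mers might look like, the sketch below builds a count matrix from already-translated protein sequences. It is a minimal example under stated assumptions, not the authors' pipeline: the function names (`count_kmers`, `genome_kmer_counts`, `build_feature_matrix`), the overlapping-window convention, and the toy sequences are all hypothetical.

```python
from collections import Counter


def count_kmers(protein, k):
    """Count overlapping amino acid k-mers in one protein sequence."""
    # Sequences shorter than k yield an empty Counter.
    return Counter(protein[i:i + k] for i in range(len(protein) - k + 1))


def genome_kmer_counts(proteins, k):
    """Sum k-mer counts over all proteins of one genome (assumed pre-translated)."""
    total = Counter()
    for protein in proteins:
        total.update(count_kmers(protein, k))
    return total


def build_feature_matrix(genomes, k):
    """Return (vocabulary, matrix): one row per genome, one column per k-mer."""
    per_genome = [genome_kmer_counts(proteins, k) for proteins in genomes]
    vocabulary = sorted(set().union(*per_genome))
    # Counter returns 0 for k-mers that a genome does not contain.
    matrix = [[counts[kmer] for kmer in vocabulary] for counts in per_genome]
    return vocabulary, matrix


# Toy input: two "genomes", each a list of protein sequences (hypothetical data).
vocabulary, matrix = build_feature_matrix([["MKK"], ["KKL", "MK"]], k=2)
```

Each column corresponds to a short peptide, which is one plausible reason such features are easier to relate to proteins than nucleotide k-mers; the resulting matrix could then be passed to any standard classifier or regressor.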
        
    
- Award ID(s): 1936791
- PAR ID: 10290339
- Date Published:
- Journal Name: Biology
- Volume: 9
- Issue: 11
- ISSN: 2079-7737
- Page Range / eLocation ID: 365
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            The melting temperature is important for materials design because of its relationship with thermal stability, synthesis, and processing conditions. Current empirical and computational melting point estimation techniques are limited in scope, computational feasibility, or interpretability. We report the development of a machine learning methodology for predicting melting temperatures of binary ionic solid materials. We evaluated different machine-learning models trained on a dataset of the melting points of 476 non-metallic crystalline binary compounds, using materials embeddings constructed from elemental properties and density-functional theory calculations as model inputs. A direct supervised-learning approach yields a mean absolute error of around 180 K but suffers from low interpretability. We find that the fidelity of predictions can be further improved by introducing an additional unsupervised-learning step that first classifies the materials before the melting-point regression. Not only does this two-step model exhibit improved accuracy, but the approach also provides a level of interpretability, with insights into feature importance and into different types of melting that depend on the specific atomic bonding inside a material. Motivated by this finding, we used a symbolic learning approach to find interpretable physical models for the melting temperature, which recovered the best-performing features from both prior models and provided additional interpretability.
- 
            With the trend toward deeper renewable energy integration, active distribution networks face increasing uncertainty and security issues, among which arcing fault detection (AFD) has baffled researchers for years. Existing machine learning based AFD methods are deficient in feature extraction and model interpretability. To overcome these limitations, we have designed a way to translate the non-transparent machine learning prediction model into an implementable logic for AFD. Moreover, the AFD logic is tested under different fault scenarios and with realistic renewable generation data, using our self-developed AFD software. The results of these tests show that the interpretable prediction model has high accuracy, dependability, security, and speed under the integration of renewable energy.
- 
            X-ray CT imaging provides a 3D view of a sample and is a powerful tool for investigating the internal features of porous rock. Reliable phase segmentation in these images is essential but, as with any other digital rock imaging technique, is time-consuming, labor-intensive, and subjective. Combining 3D X-ray CT imaging with machine learning methods that can simultaneously consider several extracted features in addition to color attenuation is a promising and powerful method for reliable phase segmentation. Machine learning-based phase segmentation of X-ray CT images enables faster data collection and interpretation than traditional methods. This study investigates the performance of several filtering techniques with three machine learning methods and a deep learning method to assess the potential for reliable feature extraction and pixel-level phase segmentation of X-ray CT images. Features were first extracted from images using well-known filters and from the second convolutional layer of the pre-trained VGG16 architecture. Then, K-means clustering, Random Forest, and Feed Forward Artificial Neural Network methods, as well as the modified U-Net model, were applied to the extracted input features. The models' performances were then compared to determine the influence of the machine learning method and input features on reliable phase segmentation. The results showed that considering more feature dimensions is promising and that all classification algorithms achieve high accuracy, ranging from 0.87 to 0.94. Feature-based Random Forest demonstrated the best performance among the machine learning models, with an accuracy of 0.88 for Mancos and 0.94 for Marcellus. The U-Net model with the linear combination of focal and dice loss also performed well, with an accuracy of 0.91 and 0.93 for Mancos and Marcellus, respectively. In general, considering more features provided promising and reliable segmentation results that are valuable for analyzing the composition of dense samples, such as shales, which are significant unconventional reservoirs in oil recovery.
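The "linear combination of focal and dice loss" mentioned above can be sketched as follows. This is an illustrative version under assumptions only: it treats a single binary phase mask flattened to a list, and the mixing `weight`, the focusing parameter `gamma`, and the `smooth` term are hypothetical defaults, not values reported in the study.

```python
import math


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t)."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p    # probability assigned to the true class
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(p) + sum(y) + smooth)."""
    intersection = sum(p * y for p, y in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, targets, weight=0.5, gamma=2.0):
    """Linear combination: weight * focal + (1 - weight) * dice (weight is assumed)."""
    return (weight * focal_loss(probs, targets, gamma)
            + (1.0 - weight) * dice_loss(probs, targets))


# Hypothetical per-pixel probabilities and labels for one phase.
loss = combined_loss([0.9, 0.1, 0.8], [1, 0, 1])
```

The focal term down-weights pixels that are already classified well, while the Dice term rewards overlap between predicted and true regions; combining them is a common way to handle class imbalance, and a multi-phase segmentation would typically apply the same idea per class.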
- 
            The application of machine learning models and algorithms towards describing atomic interactions has been a major area of interest in materials simulations in recent years, as machine learning interatomic potentials (MLIPs) are seen as being more flexible and accurate than their classical potential counterparts. This increase in accuracy of MLIPs over classical potentials has come at the cost of significantly increased complexity, leading to higher computational costs and lower physical interpretability and spurring research into improving the speeds and interpretability of MLIPs. As an alternative, in this work we leverage “machine learning” fitting databases and advanced optimization algorithms to fit a class of spline-based classical potentials, showing that they can be systematically improved in order to achieve accuracies comparable to those of low-complexity MLIPs. These results demonstrate that high model complexities may not be strictly necessary in order to achieve near-DFT accuracy in interatomic potentials and suggest an alternative route towards sampling the high accuracy, low complexity region of model space by starting with forms that promote simpler and more interpretable interatomic potentials.
 An official website of the United States government