Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

This paper revisits the kmismatch shortest unique substring finding problem and demonstrates that a technique recently presented in the context of solving the kmismatch average common substring problem can be adapted and combined with parts of the existing solution, resulting in a new algorithm which has expected time complexity of O(n log^k n), while maintaining a practical space complexity at O(kn), where n is the string length. When , which is the hard case, our new proposal significantly improves the anycase O(n^2) time complexity of the prior best method for kmismatch shortest unique substring finding. Experimental study shows that ourmore »

Let T[1,n] be a string of length n and T[i,j] be the substring of T starting at position i and ending at position j. A substring T[i,j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which wemore »

Abstract Background Alignmentfree methods for sequence comparisons have become popular in many bioinformatics applications, specifically in the estimation of sequence similarity measures to construct phylogenetic trees. Recently, the average common substring measure, ACS , and its k mismatch counterpart, ACS k , have been shown to produce results as effective as multiplesequence alignment based methods for reconstruction of phylogeny trees. Since computing ACS k takes O ( n log k n ) time and hence impractical for large datasets, multiple heuristics that can approximate ACS k have been introduced. Results In this paper, we present a novel lineartime heuristic tomore »

In recent years several compressed indexes based on variants of the BurrowsWheeler transformation have been introduced. Some of these are used to index structures far more complex than a single string, as was originally done with the FMindex [Ferragina and Manzini, J. ACM 2005]. As such, there has been an increasing effort to better understand under which conditions such an indexing scheme is possible. This has led to the introduction of Wheeler graphs [Gagie et al., Theor. Comput. Sci., 2017]. Gagie et al. showed that de Bruijn graphs, generalized compressed suffix arrays, and several other BWT related structures can bemore »

Destruction of pharmaceuticals excreted in urine can be an efficient approach to eliminate these environmental pollutants. However, urine contains high concentrations of chloride, ammonium, and bicarbonate, which may hinder treatment processes. This study evaluated the application of ferrate(VI) (FeVIO42, Fe(VI)) to oxidize pharmaceuticals (carbamazepine (CBZ), naproxen (NAP), trimethoprim (TMP) and sulfonamide antibiotics (SAs)) in synthetic hydrolyzed human urine and uncovered new effects from urine’s major inorganic constituents. Chloride slightly decreased pharmaceuticals’ removal rate by Fe(VI) due to the ionic strength effect. Ammonium (0.5 M) in undiluted hydrolyzed urine posed a strong scavenging effect, but lower concentrations (≤ 0.25 M) ofmore »