Class MotifTools


  • public class MotifTools
    extends java.lang.Object
    MotifTools contains utility methods for sequence motifs.
    Author:
    Keith James
    • Constructor Summary

      Constructors 
      Constructor Description
      MotifTools()  
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String createRegex(SymbolList motif)
      createRegex creates a regular expression which matches the SymbolList.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • createRegex

        public static java.lang.String createRegex(SymbolList motif)

        createRegex creates a regular expression which matches the SymbolList. Runs of identical symbols are collapsed into a single element with a repeat count, and ambiguous Symbols are transformed into character classes. For example, the nucleotide sequence "AAGCTT" becomes "A{2}GCT{2}" and "CTNNG" is expanded to "CT[ABCDGHKMNRSTVWY]{2}G".

        The character class for an ambiguity symbol is generated by calling its getMatches method to obtain the alphabet of AtomicSymbols it matches, calling getAllSymbols on this alphabet, removing any gap symbols and tokenizing the remainder. The tokens in a character class are sorted in ascending order of their character values, as determined by Arrays.sort(char[]).

        The Alphabet of the SymbolList must be finite and must have a character token type. Regular expressions may be generated for any such SymbolList, not just DNA, RNA and protein.
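
The transformation described above can be illustrated without the library. The following is a minimal standalone sketch, not the createRegex implementation: it works on plain strings rather than a SymbolList, and the class name MotifRegexSketch, the method toRegex and the AMBIGUITY lookup table are hypothetical names introduced for illustration. The table holds only the expansion of N quoted in the example above; a real alphabet would supply these tokens through getMatches and getAllSymbols. The sketch also assumes that no token is a regular expression metacharacter.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class MotifRegexSketch {
    // Hypothetical lookup table: ambiguity token -> tokens of the symbols it expands to.
    // Only N is filled in, using the expansion quoted in the description above.
    private static final Map<Character, String> AMBIGUITY = new HashMap<>();
    static {
        AMBIGUITY.put('N', "ABCDGHKMNRSTVWY");
    }

    // Builds a regex from a motif string: runs of the same token are collapsed
    // into "{n}" repeat counts and ambiguity tokens become character classes.
    static String toRegex(String motif) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < motif.length()) {
            char token = motif.charAt(i);
            // Measure the length of the run of identical tokens starting at i.
            int run = 1;
            while (i + run < motif.length() && motif.charAt(i + run) == token) {
                run++;
            }
            regex.append(element(token));
            if (run > 1) {
                regex.append('{').append(run).append('}');
            }
            i += run;
        }
        return regex.toString();
    }

    // Returns the token itself, or a sorted character class for an ambiguity token.
    private static String element(char token) {
        String matches = AMBIGUITY.get(token);
        if (matches == null) {
            return String.valueOf(token);
        }
        char[] tokens = matches.toCharArray();
        Arrays.sort(tokens); // ascending order of character values
        return "[" + new String(tokens) + "]";
    }

    public static void main(String[] args) {
        System.out.println(toRegex("AAGCTT")); // A{2}GCT{2}
        System.out.println(toRegex("CTNNG"));  // CT[ABCDGHKMNRSTVWY]{2}G
    }
}
```

Keeping the ambiguity expansions in a table keeps the run-collapsing loop independent of any particular alphabet, which mirrors the statement that regular expressions may be generated for alphabets other than DNA, RNA and protein.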

        Parameters:
        motif - a SymbolList.
        Returns:
        a String regular expression.
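
Since the return value is an ordinary String, it can be passed to java.util.regex. The following is a minimal sketch of that usage, assuming the regular expression has already been obtained; it uses the literal strings from the examples above instead of calling createRegex, and the class name MotifMatchSketch and the method matchStarts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MotifMatchSketch {
    // Returns the 0-based start offset of every non-overlapping match of regex in sequence.
    static List<Integer> matchStarts(String regex, String sequence) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(sequence);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // Regular expression quoted in the description for the motif "AAGCTT".
        String regex = "A{2}GCT{2}";
        System.out.println(matchStarts(regex, "GGAAGCTTCCAAGCTTA")); // [2, 10]
    }
}
```

Pattern matching is case-sensitive by default, so the text being searched needs to use the same case as the tokens in the regular expression, or the pattern can be compiled with Pattern.CASE_INSENSITIVE.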