Class ArrayBasedCharEscaper


  • @Beta
    @GwtCompatible
    public abstract class ArrayBasedCharEscaper
    extends CharEscaper
    A CharEscaper that uses an array to quickly look up replacement characters for a given char value. An additional safe range is provided that determines whether char values without specific replacements are to be considered safe and left unescaped or should be escaped in a general way.

    A good use of this class is Java source code escaping, where the replacement array holds the escapes for special ASCII characters such as \t and \n, while escapeUnsafe(char) is overridden to handle general escaping of the form \uxxxx.

    The size of the data structure used by ArrayBasedCharEscaper is proportional to the highest-valued character that requires escaping. For example, a replacement map containing the single character '\u1000' will require approximately 16K of memory. If you need to create multiple escaper instances that share the same character replacement mapping, consider using ArrayBasedEscaperMap.
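
The memory estimate above can be checked with a small plain-Java model of an array-backed table. This is only a sketch and not Guava's implementation: `ReplacementTableSketch`, `buildTable`, and `approximateBytes` are hypothetical names, and the figure of 4 bytes per array slot is an assumption about the JVM.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical model of an array-backed replacement table; not Guava's code.
final class ReplacementTableSketch {
    // Builds a table indexed by char value. Its length is (highest mapped char + 1),
    // so a single high-valued key forces a large, mostly empty array.
    static char[][] buildTable(Map<Character, String> replacements) {
        if (replacements.isEmpty()) {
            return new char[0][];
        }
        char max = Collections.max(replacements.keySet());
        char[][] table = new char[max + 1][];
        for (Map.Entry<Character, String> e : replacements.entrySet()) {
            char key = e.getKey();
            table[key] = e.getValue().toCharArray();
        }
        return table;
    }

    // Rough size of the reference array alone, ASSUMING 4 bytes per reference.
    static long approximateBytes(char[][] table) {
        return 4L * table.length;
    }
}
```

Under that assumption, a map whose only key is '\u1000' yields 0x1001 slots, or about 16K, which matches the estimate in the class description.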

    Since:
    15.0
    Author:
    Sven Mawson, David Beaumont
    • Constructor Detail

      • ArrayBasedCharEscaper

        protected ArrayBasedCharEscaper(Map<Character,String> replacementMap,
                                        char safeMin,
                                        char safeMax)
        Creates a new ArrayBasedCharEscaper instance with the given replacement map and specified safe range. If safeMax < safeMin then no characters are considered safe.

        If a character has no mapped replacement, it is checked against the safe range. If it lies outside that range, escapeUnsafe(char) is called; otherwise, no escaping is performed.

        Parameters:
        replacementMap - a map of characters to their escaped representations
        safeMin - the lowest character value in the safe range
        safeMax - the highest character value in the safe range
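
The order of checks described for this constructor can be sketched in plain Java as follows. This is an illustrative model, not Guava's code; `SafeRangeSketch`, `Outcome`, and `classify` are hypothetical names introduced for the example.

```java
import java.util.Map;

// Hypothetical model of the lookup order described above; not Guava's code.
final class SafeRangeSketch {
    enum Outcome { REPLACED, SAFE, UNSAFE }

    static Outcome classify(char c, Map<Character, String> replacements,
                            char safeMin, char safeMax) {
        if (replacements.containsKey(c)) {
            return Outcome.REPLACED;   // an explicit replacement always wins
        }
        if (c >= safeMin && c <= safeMax) {
            return Outcome.SAFE;       // inside the safe range: left unescaped
        }
        return Outcome.UNSAFE;         // would be handed to escapeUnsafe(char)
    }
}
```

When safeMax < safeMin, the range test can never succeed, so every character without a replacement is classified as UNSAFE.
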
      • ArrayBasedCharEscaper

        protected ArrayBasedCharEscaper(ArrayBasedEscaperMap escaperMap,
                                        char safeMin,
                                        char safeMax)
        Creates a new ArrayBasedCharEscaper instance with the given replacement map and specified safe range. If safeMax < safeMin then no characters are considered safe. This constructor is useful when explicit instances of ArrayBasedEscaperMap are used to share large replacement mappings between escapers.

        If a character has no mapped replacement, it is checked against the safe range. If it lies outside that range, escapeUnsafe(char) is called; otherwise, no escaping is performed.

        Parameters:
        escaperMap - the mapping of characters to be escaped
        safeMin - the lowest character value in the safe range
        safeMax - the highest character value in the safe range
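
The benefit of sharing a mapping can be illustrated with a plain-Java model in which two escaper-like objects hold the same table by reference while using different safe ranges. This is a sketch only, not Guava's code; `SharedTableSketch` and its methods are hypothetical names.

```java
// Hypothetical model of sharing one replacement table; not Guava's code.
final class SharedTableSketch {
    private final char[][] table;   // shared, never modified after construction
    private final char safeMin;
    private final char safeMax;

    SharedTableSketch(char[][] table, char safeMin, char safeMax) {
        this.table = table;         // stores the reference; no copy is made
        this.safeMin = safeMin;
        this.safeMax = safeMax;
    }

    // True if both instances are backed by the very same array object.
    boolean sharesTableWith(SharedTableSketch other) {
        return this.table == other.table;
    }

    // True if c has no replacement and lies inside this instance's safe range.
    boolean isSafe(char c) {
        boolean mapped = c < table.length && table[c] != null;
        return !mapped && c >= safeMin && c <= safeMax;
    }
}
```

With Guava itself, the analogous step would presumably be to build one ArrayBasedEscaperMap (for example with its static create(Map) factory) and pass that instance to each escaper's constructor, so the large array is allocated once.
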
    • Method Detail

      • escape

        public final String escape(String s)
        Description copied from class: CharEscaper
        Returns the escaped form of a given literal string.
        Overrides:
        escape in class CharEscaper
        Parameters:
        s - the literal string to be escaped
        Returns:
        the escaped form of s
      • escape

        protected final char[] escape(char c)
        Escapes a single character using the replacement array and safe range values. If the given character does not have an explicit replacement and lies outside the safe range then escapeUnsafe(char) is called.
        Specified by:
        escape in class CharEscaper
        Parameters:
        c - the character to escape if necessary
        Returns:
        the replacement characters, or null if no escaping was needed
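
The null-return contract of escape(char), and how a string-level loop might use it, can be sketched in plain Java. This is a simplified model under stated assumptions, not Guava's implementation: `CharEscapeSketch` is a hypothetical class, and the lazy StringBuilder strategy is just one possible way to avoid copying strings that need no escaping.

```java
// Hypothetical model of the per-character contract; not Guava's code.
abstract class CharEscapeSketch {
    private final char[][] table;
    private final char safeMin;
    private final char safeMax;

    CharEscapeSketch(char[][] table, char safeMin, char safeMax) {
        this.table = table;
        this.safeMin = safeMin;
        this.safeMax = safeMax;
    }

    // Fallback for characters with no replacement that lie outside the safe range.
    abstract char[] escapeUnsafe(char c);

    // Returns the replacement characters, or null if c can be emitted unchanged.
    final char[] escape(char c) {
        if (c < table.length && table[c] != null) {
            return table[c];
        }
        if (c >= safeMin && c <= safeMax) {
            return null;
        }
        return escapeUnsafe(c);
    }

    // Allocates a builder only once the first character needing escaping is found.
    final String escape(String s) {
        StringBuilder out = null;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            char[] r = escape(c);
            if (r != null && out == null) {
                out = new StringBuilder(s.length() + 16).append(s, 0, i);
            }
            if (out != null) {
                if (r != null) {
                    out.append(r);
                } else {
                    out.append(c);
                }
            }
        }
        return out == null ? s : out.toString();
    }
}
```

In this model, a string made up entirely of safe characters is returned as the same object, with no allocation.
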
      • escapeUnsafe

        protected abstract char[] escapeUnsafe(char c)
        Escapes a char value that has no explicit replacement in the replacement array and lies outside the stated safe range. Subclasses should override this method to provide generalized escaping for characters.

        Note that arrays returned by this method must not be modified once they have been returned. However, it is acceptable to return the same array multiple times (even for different input characters).

        Parameters:
        c - the character to escape
        Returns:
        the replacement characters, or null if no escaping was required
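
For the Java source code use case mentioned in the class description, the body of an escapeUnsafe(char) override might look like the helper below, which produces the \uxxxx form. This is a sketch only: `UnicodeEscapeSketch` and `unicodeEscape` are hypothetical names, and lowercase hex digits are an arbitrary choice.

```java
// Hypothetical helper showing one way an escapeUnsafe(char) override could be written.
final class UnicodeEscapeSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Returns a fresh six-character array: a backslash, 'u', and four hex digits.
    static char[] unicodeEscape(char c) {
        return new char[] {
            '\\', 'u',
            HEX[(c >>> 12) & 0xF],
            HEX[(c >>> 8) & 0xF],
            HEX[(c >>> 4) & 0xF],
            HEX[c & 0xF]
        };
    }
}
```

Because a new array is built on every call and never touched afterwards, this approach trivially satisfies the rule that returned arrays must not be modified.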