Class XMLEncodingDetector

java.lang.Object
org.apache.jasper.xmlparser.XMLEncodingDetector

public class XMLEncodingDetector extends Object
  • Field Details

    • stream

      private InputStream stream
    • encoding

      private String encoding
    • isEncodingSetInProlog

      private boolean isEncodingSetInProlog
    • isBigEndian

      private Boolean isBigEndian
    • hasBom

      private Boolean hasBom
    • reader

      private Reader reader
    • DEFAULT_BUFFER_SIZE

      public static final int DEFAULT_BUFFER_SIZE
    • DEFAULT_XMLDECL_BUFFER_SIZE

      public static final int DEFAULT_XMLDECL_BUFFER_SIZE
    • fAllowJavaEncodings

      private boolean fAllowJavaEncodings
    • fSymbolTable

      private SymbolTable fSymbolTable
    • fCurrentEntity

      private XMLEncodingDetector fCurrentEntity
    • fBufferSize

      private int fBufferSize
    • lineNumber

      private int lineNumber
    • columnNumber

      private int columnNumber
    • literal

      private boolean literal
    • ch

      private char[] ch
    • position

      private int position
    • count

      private int count
    • mayReadChunks

      private boolean mayReadChunks
    • fString

      private XMLString fString
    • fStringBuffer

      private XMLStringBuffer fStringBuffer
    • fStringBuffer2

      private XMLStringBuffer fStringBuffer2
    • fVersionSymbol

      private static final String fVersionSymbol
    • fEncodingSymbol

      private static final String fEncodingSymbol
    • fStandaloneSymbol

      private static final String fStandaloneSymbol
    • fMarkupDepth

      private int fMarkupDepth
    • fStrings

      private String[] fStrings
    • err

      private ErrorDispatcher err
  • Constructor Details

    • XMLEncodingDetector

      public XMLEncodingDetector()
      Constructor
  • Method Details

    • getEncoding

      public static Object[] getEncoding(String fname, JarFile jarFile, JspCompilationContext ctxt, ErrorDispatcher err) throws IOException, JasperException
      Autodetects the encoding of the XML document supplied by the given input stream. Encoding autodetection is done according to the XML 1.0 specification, Appendix F.1: Detection Without External Encoding Information.
      Returns:
      Two-element array, where the first element (of type java.lang.String) contains the name of the (auto)detected encoding, and the second element (of type java.lang.Boolean) specifies whether the encoding was specified using the 'encoding' attribute of an XML prolog (TRUE) or autodetected (FALSE).
      Throws:
      IOException
      JasperException
    • getEncoding

      private Object[] getEncoding(InputStream in, ErrorDispatcher err) throws IOException, JasperException
      Throws:
      IOException
      JasperException
    • endEntity

      void endEntity()
    • createInitialReader

      private void createInitialReader() throws IOException, JasperException
      Throws:
      IOException
      JasperException
    • createReader

      private Reader createReader(InputStream inputStream, String encoding, Boolean isBigEndian) throws IOException, JasperException
      Creates a reader capable of reading the given input stream in the specified encoding.
      Parameters:
      inputStream - The input stream.
      encoding - The name of the encoding used by the input stream. If the user has specified that Java encoding names are allowed, this may be a Java encoding name; otherwise, it is an IANA encoding name.
      isBigEndian - For encodings (like UCS-4) whose names cannot specify a byte order, this tells whether the order is big-endian. null means unknown or not relevant.
      Returns:
      Returns a reader.
      Throws:
      IOException
      JasperException
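      The mapping from an encoding name plus an optional byte-order flag to a Reader can be sketched as follows. This is a minimal illustration, not the Jasper implementation: the class ReaderFactorySketch is hypothetical, and the use of the UTF-32BE/UTF-32LE charsets for UCS-4 is an assumption (these charsets ship with common JDKs but are not guaranteed by the Java specification).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper for illustration only; not part of org.apache.jasper.
final class ReaderFactorySketch {

    static Reader createReader(InputStream in, String encoding, Boolean isBigEndian)
            throws IOException {
        String name = encoding;
        // Assumption: a UCS-4 name carries no byte order, so the flag picks the variant.
        if ("ISO-10646-UCS-4".equalsIgnoreCase(encoding) && isBigEndian != null) {
            name = isBigEndian ? "UTF-32BE" : "UTF-32LE";
        }
        try {
            return new InputStreamReader(in, Charset.forName(name));
        } catch (IllegalArgumentException e) {
            // Charset.forName signals illegal or unsupported names with
            // subclasses of IllegalArgumentException.
            throw new IOException("Unsupported encoding: " + encoding, e);
        }
    }
}
```

      When the encoding name already fixes the byte order (for example UTF-16BE), the flag is ignored in this sketch.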
    • getEncodingName

      private Object[] getEncodingName(byte[] b4, int count)
      Returns the IANA encoding name that is auto-detected from the bytes specified, with the endian-ness of that encoding where appropriate.
      Parameters:
      b4 - The first four bytes of the input.
      count - The number of bytes actually read.
      Returns:
      A two-element array: the first element is an IANA encoding name (String); the second element is a Boolean that is true if the document is big-endian, false if it is little-endian, and null if the distinction is not relevant.
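      The kind of byte-pattern check described in XML 1.0 Appendix F.1 can be sketched as below. This is an illustrative subset, not the actual private method: the class BomSniffer and method guessEncoding are hypothetical names, only a few of the patterns from the appendix are covered, and falling back to UTF-8 is an assumption of this sketch.

```java
// Hypothetical helper for illustration only; not part of org.apache.jasper.
final class BomSniffer {

    private static Object[] result(String name, Boolean bigEndian) {
        return new Object[] {name, bigEndian};
    }

    /** Returns {encodingName, isBigEndian}; isBigEndian is null when byte order is not relevant. */
    static Object[] guessEncoding(byte[] b4, int count) {
        if (count < 2) {
            return result("UTF-8", null);
        }
        int b0 = b4[0] & 0xFF;
        int b1 = b4[1] & 0xFF;
        // UTF-16 byte order marks.
        if (b0 == 0xFE && b1 == 0xFF) return result("UTF-16BE", Boolean.TRUE);
        if (b0 == 0xFF && b1 == 0xFE) return result("UTF-16LE", Boolean.FALSE);
        if (count < 3) {
            return result("UTF-8", null);
        }
        int b2 = b4[2] & 0xFF;
        // UTF-8 byte order mark.
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return result("UTF-8", null);
        if (count < 4) {
            return result("UTF-8", null);
        }
        int b3 = b4[3] & 0xFF;
        // No BOM: look at how the leading '<' (0x3C) and '?' (0x3F) are laid out.
        if (b0 == 0x00 && b1 == 0x00 && b2 == 0x00 && b3 == 0x3C) {
            return result("ISO-10646-UCS-4", Boolean.TRUE);
        }
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x00 && b3 == 0x00) {
            return result("ISO-10646-UCS-4", Boolean.FALSE);
        }
        if (b0 == 0x00 && b1 == 0x3C && b2 == 0x00 && b3 == 0x3F) {
            return result("UTF-16BE", Boolean.TRUE);
        }
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x3F && b3 == 0x00) {
            return result("UTF-16LE", Boolean.FALSE);
        }
        // Assumed default when nothing else matches.
        return result("UTF-8", null);
    }
}
```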
    • isExternal

      public boolean isExternal()
      Returns true if the current entity being scanned is external.
    • peekChar

      public int peekChar() throws IOException
      Returns the next character on the input.

      Note: The character is not consumed.

      Throws:
      IOException - Thrown if i/o error occurs.
      EOFException - Thrown on end of file.
    • scanChar

      public int scanChar() throws IOException
      Returns the next character on the input.

      Note: The character is consumed.

      Throws:
      IOException - Thrown if i/o error occurs.
      EOFException - Thrown on end of file.
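      The difference between peeking and scanning can be sketched with a one-character pushback. This is a simplified illustration built on java.io.PushbackReader, not the buffer-based logic of this class; PeekScanSketch is a hypothetical name.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

// Hypothetical helper for illustration only; not part of org.apache.jasper.
final class PeekScanSketch {
    private final PushbackReader in;

    PeekScanSketch(Reader reader) {
        this.in = new PushbackReader(reader, 1);
    }

    /** Returns the next character without consuming it. */
    int peekChar() throws IOException {
        int c = in.read();
        if (c == -1) {
            throw new EOFException();
        }
        in.unread(c); // push it back so the next read sees it again
        return c;
    }

    /** Returns the next character and consumes it. */
    int scanChar() throws IOException {
        int c = in.read();
        if (c == -1) {
            throw new EOFException();
        }
        return c;
    }
}
```

      Calling peekChar() repeatedly yields the same character each time; only scanChar() advances the input.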
    • scanName

      public String scanName() throws IOException
      Returns a string matching the Name production appearing immediately on the input as a symbol, or null if no Name string is present.

      Note: The Name characters are consumed.

      Note: The string returned must be a symbol. The SymbolTable can be used for this purpose.

      Throws:
      IOException - Thrown if i/o error occurs.
      EOFException - Thrown on end of file.
    • scanLiteral

      public int scanLiteral(int quote, XMLString content) throws IOException
      Scans a range of attribute value data, setting the fields of the XMLString structure appropriately.

      Note: The characters are consumed.

      Note: This method does not guarantee to return the longest run of attribute value data. This method may return before the quote character due to reaching the end of the input buffer or any other reason.

      Note: The fields contained in the XMLString structure are not guaranteed to remain valid upon subsequent calls to the entity scanner. Therefore, the caller is responsible for immediately using the returned character data or making a copy of the character data.

      Parameters:
      quote - The quote character that signifies the end of the attribute value data.
      content - The content structure to fill.
      Returns:
      Returns the next character on the input, if known. This value may be -1, but this does not designate end of file.
      Throws:
      IOException - Thrown if i/o error occurs.
      EOFException - Thrown on end of file.
    • scanData

      public boolean scanData(String delimiter, XMLStringBuffer buffer) throws IOException
      Scans a range of character data up to the specified delimiter, setting the fields of the XMLString structure appropriately.

      Note: The characters are consumed.

      Note: This assumes that the internal buffer is at least as large as the delimiter and that the delimiter contains at least one character.

      Note: This method does not guarantee to return the longest run of character data. This method may return before the delimiter due to reaching the end of the input buffer or any other reason.

      Note: The fields contained in the XMLString structure are not guaranteed to remain valid upon subsequent calls to the entity scanner. Therefore, the caller is responsible for immediately using the returned character data or making a copy of the character data.

      Parameters:
      delimiter - The string that signifies the end of the character data to be scanned.
      buffer - The data structure to fill.
      Returns:
      Returns true if there is more data to scan, false otherwise.
      Throws:
      IOException - Thrown if i/o error occurs.
      EOFException - Thrown on end of file.
    • skipChar

      public boolean skipChar(int c) throws IOException
      Skips a character appearing immediately on the input.

      Note: The character is consumed only if it matches the specified character.

      Parameters:
      c - The character to skip.
      Returns:
      Returns true if the character was skipped.
      Throws:
      IOException - Thrown if i/o error occurs.
      EOFException - Thrown on end of file.
    • skipSpaces

      public boolean skipSpaces() throws IOException
      Skips space characters appearing immediately on the input.

      Note: The characters are consumed only if they are space characters.

      Returns:
      Returns true if at least one space character was skipped.
      Throws:
      IOException - Thrown if i/o error occurs.
      EOFException - Thrown on end of file.
    • skipString

      public boolean skipString(String s) throws IOException
      Skips the specified string appearing immediately on the input.

      Note: The characters are consumed only if they match the specified string.

      Parameters:
      s - The string to skip.
      Returns:
      Returns true if the string was skipped.
      Throws:
      IOException - Thrown if i/o error occurs.
      EOFException - Thrown on end of file.
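      A minimal sketch of all-or-nothing string skipping, assuming the intended behavior is that the input position advances only when the whole string matches. SkipStringSketch is a hypothetical class that holds the entire input in memory, so unlike the real method it never needs to load more data or throw IOException.

```java
// Hypothetical helper for illustration only; not part of org.apache.jasper.
final class SkipStringSketch {
    private final char[] ch;
    private int position;

    SkipStringSketch(String input) {
        this.ch = input.toCharArray();
    }

    /** Consumes s if it appears at the current position; otherwise leaves the position unchanged. */
    boolean skipString(String s) {
        if (position + s.length() > ch.length) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (ch[position + i] != s.charAt(i)) {
                return false; // nothing consumed on a partial match
            }
        }
        position += s.length();
        return true;
    }

    int position() {
        return position;
    }
}
```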
    • load

      final boolean load(int offset, boolean changeEntity) throws IOException
      Loads a chunk of text.
      Parameters:
      offset - The offset into the character buffer to read the next batch of characters.
      changeEntity - True if the load should change entities at the end of the entity, otherwise leave the current entity in place and the entity boundary will be signaled by the return value.
      Throws:
      IOException
    • scanXMLDecl

      private void scanXMLDecl() throws IOException, JasperException
      Throws:
      IOException
      JasperException
    • scanXMLDeclOrTextDecl

      private void scanXMLDeclOrTextDecl(boolean scanningTextDecl) throws IOException, JasperException
      Scans an XML or text declaration.

       [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
       [24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
       [80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' |  "'" EncName "'" )
       [81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
       [32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'")
                       | ('"' ('yes' | 'no') '"'))
      
       [77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
       
      Parameters:
      scanningTextDecl - True if a text declaration is to be scanned instead of an XML declaration.
      Throws:
      IOException
      JasperException
    • scanXMLDeclOrTextDecl

      private void scanXMLDeclOrTextDecl(boolean scanningTextDecl, String[] pseudoAttributeValues) throws IOException, JasperException
      Scans an XML or text declaration.

       [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
       [24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
       [80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' |  "'" EncName "'" )
       [81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
       [32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'")
                       | ('"' ('yes' | 'no') '"'))
      
       [77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
       
      Parameters:
      scanningTextDecl - True if a text declaration is to be scanned instead of an XML declaration.
      pseudoAttributeValues - An array of size 3 to return the version, encoding and standalone pseudo attribute values (in that order).
      Note: This method uses fString; anything in it at the time of calling is lost.
      Throws:
      IOException
      JasperException
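      The three-slot result layout (version, encoding, standalone) can be illustrated with a small regular-expression sketch. It is not how this class scans the declaration: XmlDeclSketch is a hypothetical name, the whole declaration is assumed to be available as a String, and the order and well-formedness checks required by the grammar above are not performed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of org.apache.jasper.
final class XmlDeclSketch {
    // name = "value" or name = 'value', preceded by whitespace.
    private static final Pattern PSEUDO_ATTR = Pattern.compile(
            "\\s+(version|encoding|standalone)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    /** Returns {version, encoding, standalone}; a slot is null if that pseudo attribute is absent. */
    static String[] pseudoAttributes(String decl) {
        String[] values = new String[3];
        if (!decl.startsWith("<?xml")) {
            return values;
        }
        Matcher m = PSEUDO_ATTR.matcher(decl);
        while (m.find()) {
            // Group 3 holds a double-quoted value, group 4 a single-quoted one.
            String value = m.group(3) != null ? m.group(3) : m.group(4);
            switch (m.group(1)) {
                case "version":
                    values[0] = value;
                    break;
                case "encoding":
                    values[1] = value;
                    break;
                default:
                    values[2] = value;
                    break;
            }
        }
        return values;
    }
}
```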
    • scanPseudoAttribute

      public String scanPseudoAttribute(boolean scanningTextDecl, XMLString value) throws IOException, JasperException
      Scans a pseudo attribute.
      Parameters:
      scanningTextDecl - True if scanning this pseudo-attribute for a TextDecl; false if scanning XMLDecl. This flag is needed to report the correct type of error.
      value - The string to fill in with the attribute value.
      Returns:
      The name of the attribute.
      Note: This method uses fStringBuffer2; anything in it at the time of calling is lost.
      Throws:
      IOException
      JasperException
    • scanPIData

      private void scanPIData(String target, XMLString data) throws IOException, JasperException
      Scans processing instruction data. This is needed to handle the situation where a document starts with a processing instruction whose target name starts with "xml" (e.g. xmlfoo).

      Note: This method uses fStringBuffer; anything in it at the time of calling is lost.
      Parameters:
      target - The PI target
      data - The string to fill in with the data
      Throws:
      IOException
      JasperException
    • scanSurrogates

      private boolean scanSurrogates(XMLStringBuffer buf) throws IOException, JasperException
      Scans surrogates and appends them to the specified buffer.

      Note: This assumes the current char has already been identified as a high surrogate.

      Parameters:
      buf - The StringBuffer to append the read surrogates to.
      Throws:
      IOException
      JasperException
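      How a high/low surrogate pair is validated and appended can be sketched with the standard java.lang.Character helpers. SurrogateSketch is a hypothetical class that takes both characters as arguments and uses a StringBuilder in place of XMLStringBuffer; the real method reads the characters from the input and reports errors through the ErrorDispatcher.

```java
// Hypothetical helper for illustration only; not part of org.apache.jasper.
final class SurrogateSketch {

    /** Appends the pair to buf if it forms a valid supplementary code point; returns false otherwise. */
    static boolean appendSurrogates(char high, char low, StringBuilder buf) {
        if (!Character.isHighSurrogate(high) || !Character.isLowSurrogate(low)) {
            return false; // not a valid pair, nothing appended
        }
        int codePoint = Character.toCodePoint(high, low);
        buf.appendCodePoint(codePoint); // writes both UTF-16 code units
        return true;
    }
}
```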
    • reportFatalError

      private void reportFatalError(String msgId, String arg) throws JasperException
      Convenience function used in all XML scanners.
      Throws:
      JasperException