Class AbstractDOMParser

  • All Implemented Interfaces:
    XMLDocumentHandler, XMLDTDContentModelHandler, XMLDTDHandler
    Direct Known Subclasses:
    DOMParser, DOMParserImpl

    public class AbstractDOMParser
    extends AbstractXMLDocumentParser
    This is the base class of all DOM parsers. It implements the XNI callback methods that build the DOM tree. After an XML document has been parsed successfully, the DOM Document object can be retrieved with the getDocument method. The actual parsing pipeline is defined by the parser configuration.
    Author:
    Arnaud Le Hors, IBM, Andy Clark, IBM, Elena Litani, IBM
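    The parse-then-query flow described above can be sketched with the standard JAXP API, which on many Java runtimes is backed by a parser derived from this class. That backing is an assumption, and the class name ParseToDomSketch is hypothetical; DocumentBuilder.parse returns the Document directly instead of requiring a separate getDocument call.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ParseToDomSketch {
    /** Parses the given XML text and returns the resulting DOM Document. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // The builder drives the parse and hands back the finished tree.
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<greeting lang=\"en\">hello</greeting>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```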
    • Field Detail

      • CREATE_ENTITY_REF_NODES

        protected static final String CREATE_ENTITY_REF_NODES
        Feature id: create entity ref nodes.
        See Also:
        Constant Field Values
      • INCLUDE_COMMENTS_FEATURE

        protected static final String INCLUDE_COMMENTS_FEATURE
        Feature id: include comments.
        See Also:
        Constant Field Values
      • CREATE_CDATA_NODES_FEATURE

        protected static final String CREATE_CDATA_NODES_FEATURE
        Feature id: create CDATA nodes.
        See Also:
        Constant Field Values
      • INCLUDE_IGNORABLE_WHITESPACE

        protected static final String INCLUDE_IGNORABLE_WHITESPACE
        Feature id: include ignorable whitespace.
        See Also:
        Constant Field Values
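        The feature string values are not listed here, so the sketch below uses the standard JAXP switches that correspond conceptually to three of these features: setIgnoringComments, setCoalescing, and setExpandEntityReferences. Whether a given JAXP implementation maps them onto these exact feature ids is an assumption. DomFeatureSwitchSketch and countChildren are hypothetical names.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomFeatureSwitchSketch {
    // One comment, one internal entity reference, and one CDATA section.
    static final String XML =
        "<!DOCTYPE r [<!ENTITY e \"hello\">]>"
        + "<r><!--note-->&e;<![CDATA[raw]]></r>";

    /** Counts direct children of the root element that have the given DOM node type. */
    static int countChildren(short nodeType, Consumer<DocumentBuilderFactory> settings)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        settings.accept(factory); // apply the caller's switches before building
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("comment nodes when comments are kept: "
            + countChildren(Node.COMMENT_NODE, f -> f.setIgnoringComments(false)));
        System.out.println("CDATA nodes when coalescing is off: "
            + countChildren(Node.CDATA_SECTION_NODE, f -> f.setCoalescing(false)));
        System.out.println("entity reference nodes when expansion is off: "
            + countChildren(Node.ENTITY_REFERENCE_NODE, f -> f.setExpandEntityReferences(false)));
    }
}
```

        Flipping each switch to true in the calls above removes the corresponding node type from the tree.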
      • DEFER_NODE_EXPANSION

        protected static final String DEFER_NODE_EXPANSION
        Feature id: defer node expansion.
        See Also:
        Constant Field Values
      • DOCUMENT_CLASS_NAME

        protected static final String DOCUMENT_CLASS_NAME
        Property id: document class name.
        See Also:
        Constant Field Values
      • DEFAULT_DOCUMENT_CLASS_NAME

        protected static final String DEFAULT_DOCUMENT_CLASS_NAME
        Default document class name.
        See Also:
        Constant Field Values
      • fInDTD

        protected boolean fInDTD
        True if inside DTD.
      • fCreateEntityRefNodes

        protected boolean fCreateEntityRefNodes
        Create entity reference nodes.
      • fIncludeIgnorableWhitespace

        protected boolean fIncludeIgnorableWhitespace
        Include ignorable whitespace.
      • fIncludeComments

        protected boolean fIncludeComments
        Include comments.
      • fCreateCDATANodes

        protected boolean fCreateCDATANodes
        Create CDATA nodes.
      • fDocument

        protected Document fDocument
        The document.
      • fDocumentImpl

        protected CoreDocumentImpl fDocumentImpl
        The default Xerces document implementation, if used.
      • fStorePSVI

        protected boolean fStorePSVI
        Whether to store PSVI information in DOM tree.
      • fDocumentClassName

        protected String fDocumentClassName
        The document class name to use.
      • fDocumentType

        protected DocumentType fDocumentType
        The document type node.
      • fCurrentNode

        protected Node fCurrentNode
        Current node.
      • fCurrentCDATASection

        protected CDATASection fCurrentCDATASection
      • fCurrentEntityDecl

        protected EntityImpl fCurrentEntityDecl
      • fDeferredEntityDecl

        protected int fDeferredEntityDecl
      • fStringBuffer

        protected final StringBuffer fStringBuffer
        Character buffer
      • fInternalSubset

        protected StringBuffer fInternalSubset
        Internal subset buffer.
      • fDeferNodeExpansion

        protected boolean fDeferNodeExpansion
      • fNamespaceAware

        protected boolean fNamespaceAware
      • fDocumentIndex

        protected int fDocumentIndex
      • fDocumentTypeIndex

        protected int fDocumentTypeIndex
      • fCurrentNodeIndex

        protected int fCurrentNodeIndex
      • fCurrentCDATASectionIndex

        protected int fCurrentCDATASectionIndex
      • fInDTDExternalSubset

        protected boolean fInDTDExternalSubset
        True if inside DTD external subset.
      • fRoot

        protected Node fRoot
        Root element node.
      • fInCDATASection

        protected boolean fInCDATASection
        True if inside CDATA section.
      • fFirstChunk

        protected boolean fFirstChunk
        True if the first chunk of characters has been seen.
      • fFilterReject

        protected boolean fFilterReject
        LSParserFilter: specifies that the element with the given QName and all of its children must be rejected.
      • fBaseURIStack

        protected final Stack fBaseURIStack
        Base URI stack.
      • fRejectedElementDepth

        protected int fRejectedElementDepth
        LSParserFilter: tracks the element depth within a rejected subtree.
      • fSkippedElemStack

        protected Stack fSkippedElemStack
        LSParserFilter: stores the depth of skipped elements.
      • fInEntityRef

        protected boolean fInEntityRef
        LSParserFilter: true if inside an entity reference.
    • Constructor Detail

    • Method Detail

      • getDocumentClassName

        protected String getDocumentClassName()
        Returns the name of the current document class.
      • setDocumentClassName

        protected void setDocumentClassName​(String documentClassName)
        This method allows the programmer to decide which document factory to use when constructing the DOM tree. However, doing so will lose the functionality of the default factory. Also, a document class other than the default will lose the ability to defer node expansion on the DOM tree produced.
        Parameters:
        documentClassName - The fully qualified class name of the document factory to use when constructing the DOM tree.
        See Also:
        getDocumentClassName(), DEFAULT_DOCUMENT_CLASS_NAME
      • getDocument

        public Document getDocument()
        Returns the DOM document object.
      • dropDocumentReferences

        public final void dropDocumentReferences()
        Drops all references to the last DOM which was built by this parser.
      • setLocale

        public void setLocale​(Locale locale)
        Set the locale to use for messages.
        Parameters:
        locale - The locale object to use for localization of messages.
      • startGeneralEntity

        public void startGeneralEntity​(String name,
                                       XMLResourceIdentifier identifier,
                                       String encoding,
                                       Augmentations augs)
                                throws XNIException
        Notifies of the start of a general entity.

        Note: This method is not called for entity references appearing as part of attribute values.

        Specified by:
        startGeneralEntity in interface XMLDocumentHandler
        Overrides:
        startGeneralEntity in class AbstractXMLDocumentParser
        Parameters:
        name - The name of the general entity.
        identifier - The resource identifier.
        encoding - The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal entities or a document entity that is parsed from a java.io.Reader).
        augs - Additional information that may include infoset augmentations
        Throws:
        XNIException - Thrown by handler to signal an error.
      • textDecl

        public void textDecl​(String version,
                             String encoding,
                             Augmentations augs)
                      throws XNIException
        Notifies of the presence of a TextDecl line in an entity. If present, this method will be called immediately following the startEntity call.

        Note: This method will never be called for the document entity; it is only called for external general entities referenced in document content.

        Note: This method is not called for entity references appearing as part of attribute values.

        Specified by:
        textDecl in interface XMLDocumentHandler
        Specified by:
        textDecl in interface XMLDTDHandler
        Overrides:
        textDecl in class AbstractXMLDocumentParser
        Parameters:
        version - The XML version, or null if not specified.
        encoding - The IANA encoding name of the entity.
        augs - Additional information that may include infoset augmentations
        Throws:
        XNIException - Thrown by handler to signal an error.
      • processingInstruction

        public void processingInstruction​(String target,
                                          XMLString data,
                                          Augmentations augs)
                                   throws XNIException
        A processing instruction. Processing instructions consist of a target name and, optionally, text data. The data is only meaningful to the application.

        Typically, a processing instruction's data will contain a series of pseudo-attributes. These pseudo-attributes follow the form of element attributes but are not parsed or presented to the application as anything other than text. The application is responsible for parsing the data.

        Specified by:
        processingInstruction in interface XMLDocumentHandler
        Specified by:
        processingInstruction in interface XMLDTDHandler
        Overrides:
        processingInstruction in class AbstractXMLDocumentParser
        Parameters:
        target - The target.
        data - The data or null if none specified.
        augs - Additional information that may include infoset augmentations
        Throws:
        XNIException - Thrown by handler to signal an error.
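        Since the parser hands pseudo-attributes over as plain text, the application has to split them itself. A minimal sketch of one way to do that follows. The class PseudoAttributeSketch and its regular expression are illustrative assumptions and deliberately simplified: character and entity references inside values are left unresolved.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PseudoAttributeSketch {
    // Matches name="value" or name='value'; group 3 or group 4 holds the value.
    private static final Pattern PSEUDO_ATTR =
        Pattern.compile("([A-Za-z_][\\w.-]*)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    /** Extracts pseudo-attributes from processing instruction data, in order of appearance. */
    static Map<String, String> parse(String data) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PSEUDO_ATTR.matcher(data);
        while (m.find()) {
            String value = m.group(3) != null ? m.group(3) : m.group(4);
            result.put(m.group(1), value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Data as it might appear in <?xml-stylesheet type="text/xsl" href='style.xsl'?>
        System.out.println(parse("type=\"text/xsl\" href='style.xsl'"));
    }
}
```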
      • startDocument

        public void startDocument​(XMLLocator locator,
                                  String encoding,
                                  NamespaceContext namespaceContext,
                                  Augmentations augs)
                           throws XNIException
        The start of the document.
        Specified by:
        startDocument in interface XMLDocumentHandler
        Overrides:
        startDocument in class AbstractXMLDocumentParser
        Parameters:
        locator - The document locator, or null if the document location cannot be reported during the parsing of this document.
        encoding - The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal entities or a document entity that is parsed from a java.io.Reader).
        namespaceContext - The namespace context in effect at the start of this document. This object represents the current context. Implementors of this class are responsible for copying the namespace bindings from the current context (and its parent contexts) if that information is important.
        augs - Additional information that may include infoset augmentations
        Throws:
        XNIException - Thrown by handler to signal an error.
      • xmlDecl

        public void xmlDecl​(String version,
                            String encoding,
                            String standalone,
                            Augmentations augs)
                     throws XNIException
        Notifies of the presence of an XMLDecl line in the document. If present, this method will be called immediately following the startDocument call.
        Specified by:
        xmlDecl in interface XMLDocumentHandler
        Overrides:
        xmlDecl in class AbstractXMLDocumentParser
        Parameters:
        version - The XML version.
        encoding - The IANA encoding name of the document, or null if not specified.
        standalone - The standalone value, or null if not specified.
        augs - Additional information that may include infoset augmentations
        Throws:
        XNIException - Thrown by handler to signal an error.
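        At the DOM level, the values reported through this callback typically surface as the DOM Level 3 accessors getXmlVersion, getXmlEncoding, and getXmlStandalone on Document. The sketch below reads them back after a JAXP parse. That this class is what populates them on a given runtime is an assumption, and XmlDeclSketch is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XmlDeclSketch {
    /** Parses XML from UTF-8 bytes so the parser sees the declaration as written. */
    static Document parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><r/>");
        System.out.println("version:    " + doc.getXmlVersion());
        System.out.println("encoding:   " + doc.getXmlEncoding());
        System.out.println("standalone: " + doc.getXmlStandalone());
    }
}
```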
      • doctypeDecl

        public void doctypeDecl​(String rootElement,
                                String publicId,
                                String systemId,
                                Augmentations augs)
                         throws XNIException
        Notifies of the presence of the DOCTYPE line in the document.
        Specified by:
        doctypeDecl in interface XMLDocumentHandler
        Overrides:
        doctypeDecl in class AbstractXMLDocumentParser
        Parameters:
        rootElement - The name of the root element.
        publicId - The public identifier if an external DTD or null if the external DTD is specified using SYSTEM.
        systemId - The system identifier if an external DTD, null otherwise.
        augs - Additional information that may include infoset augmentations
        Throws:
        XNIException - Thrown by handler to signal an error.
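        The three values passed to this callback map naturally onto the DocumentType node of the finished tree. As a hedged illustration using JAXP (DoctypeSketch is a hypothetical name), a DOCTYPE with only an internal subset yields a root element name but null public and system identifiers:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

final class DoctypeSketch {
    /** Parses the XML text and returns its DocumentType node, or null if there is no DOCTYPE. */
    static DocumentType doctypeOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDoctype();
    }

    public static void main(String[] args) throws Exception {
        // Internal subset only, so no external DTD has to be fetched.
        DocumentType dt = doctypeOf("<!DOCTYPE r [<!ELEMENT r EMPTY>]><r/>");
        System.out.println("root element: " + dt.getName());
        System.out.println("public id:    " + dt.getPublicId());
        System.out.println("system id:    " + dt.getSystemId());
    }
}
```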
      • startElement

        public void startElement​(QName element,
                                 XMLAttributes attributes,
                                 Augmentations augs)
                          throws XNIException
        The start of an element. If the document specifies the start element by using an empty tag, then the startElement method will immediately be followed by the endElement method, with no intervening methods.
        Specified by:
        startElement in interface XMLDocumentHandler
        Overrides:
        startElement in class AbstractXMLDocumentParser
        Parameters:
        element - The name of the element.
        attributes - The element attributes.
        augs - Additional information that may include infoset augmentations
        Throws:
        XNIException - Thrown by handler to signal an error.
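        The empty-tag behaviour can be observed with the analogous SAX callbacks, which are easier to reach from application code than XNI. This is an illustration by analogy, not a trace of this class; EmptyTagEventsSketch is a hypothetical name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class EmptyTagEventsSketch {
    /** Records start, end, and character events in the order the parser reports them. */
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                log.add("chars");
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        // For the empty tag <b/>, "start:b" is directly followed by "end:b".
        System.out.println(events("<a><b/></a>"));
    }
}
```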
      • ignorableWhitespace

        public void ignorableWhitespace​(XMLString text,
                                        Augmentations augs)
                                 throws XNIException
        Ignorable whitespace. For this method to be called, the document source must have some way of determining that text containing only whitespace characters should be considered ignorable. For example, the validator can determine whether a run of whitespace characters in the document is ignorable based on the element content model.
        Specified by:
        ignorableWhitespace in interface XMLDocumentHandler
        Overrides:
        ignorableWhitespace in class AbstractXMLDocumentParser
        Parameters:
        text - The ignorable whitespace.
        augs - Additional information that may include infoset augmentations
        Throws:
        XNIException - Thrown by handler to signal an error.
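        To see the effect of a content model on whitespace, the sketch below validates a small document against an internal DTD and toggles the JAXP setting setIgnoringElementContentWhitespace. That this setting controls whether this class keeps ignorable whitespace is an assumption about the JAXP implementation. IgnorableWhitespaceSketch is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class IgnorableWhitespaceSketch {
    // The content model (a) allows only an <a> child, so the line breaks and
    // indentation inside <r> are element-content whitespace.
    static final String XML =
        "<!DOCTYPE r [<!ELEMENT r (a)><!ELEMENT a (#PCDATA)>]>\n"
        + "<r>\n  <a>x</a>\n</r>";

    /** Returns the number of direct children of the root element. */
    static int rootChildCount(boolean ignoreWhitespace) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // the DTD must be applied to classify whitespace
        factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // DefaultHandler ignores warnings and recoverable errors; the sample is valid anyway.
        builder.setErrorHandler(new DefaultHandler());
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("whitespace kept:    " + rootChildCount(false));
        System.out.println("whitespace ignored: " + rootChildCount(true));
    }
}
```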
      • handleBaseURI

        protected final void handleBaseURI​(Node node)
        Records base URI information for an Element (by adding an xml:base attribute) or for a ProcessingInstruction (by setting a baseURI field). Used for the non-deferred DOM.
        Parameters:
        node - The Element or ProcessingInstruction node to record the base URI for.
      • handleBaseURI

        protected final void handleBaseURI​(int node)
        Records base URI information for an Element (by adding an xml:base attribute) or for a ProcessingInstruction (by setting a baseURI field). Used for the deferred DOM.
        Parameters:
        node - The index of the Element or ProcessingInstruction node in the deferred DOM.
      • startDTD

        public void startDTD​(XMLLocator locator,
                             Augmentations augs)
                      throws XNIException
        The start of the DTD.
        Specified by:
        startDTD in interface XMLDTDHandler
        Overrides:
        startDTD in class AbstractXMLDocumentParser
        Parameters:
        locator - The document locator, or null if the document location cannot be reported during the parsing of the document DTD. However, it is strongly recommended that a locator be supplied that can at least report the base system identifier of the DTD.
        augs - Additional information that may include infoset augmentations.
        Throws:
        XNIException - Thrown by handler to signal an error.
      • internalEntityDecl

        public void internalEntityDecl​(String name,
                                       XMLString text,
                                       XMLString nonNormalizedText,
                                       Augmentations augs)
                                throws XNIException
        An internal entity declaration.
        Specified by:
        internalEntityDecl in interface XMLDTDHandler
        Overrides:
        internalEntityDecl in class AbstractXMLDocumentParser
        Parameters:
        name - The name of the entity. Parameter entity names start with '%', whereas the name of a general entity is just the entity name.
        text - The value of the entity.
        nonNormalizedText - The non-normalized value of the entity. This value contains the same sequence of characters that was in the internal entity declaration, without any entity references expanded.
        augs - Additional information that may include infoset augmentations.
        Throws:
        XNIException - Thrown by handler to signal an error.
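        The '%' naming convention is shared by the SAX2 DeclHandler interface, which offers a convenient way to observe it from application code. The sketch below is an analogy using SAX, not this class's XNI callback; EntityDeclNamesSketch is a hypothetical name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

final class EntityDeclNamesSketch {
    /** Collects the names of internal entity declarations in declaration order. */
    static List<String> declaredEntityNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void internalEntityDecl(String name, String value) {
                names.add(name);
            }
        };
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Standard SAX2 property id for registering a DeclHandler.
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        // One parameter entity (p) and one general entity (g).
        System.out.println(declaredEntityNames(
            "<!DOCTYPE r [<!ENTITY % p \"x\"><!ENTITY g \"y\">]><r/>"));
    }
}
```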
      • startParameterEntity

        public void startParameterEntity​(String name,
                                         XMLResourceIdentifier identifier,
                                         String encoding,
                                         Augmentations augs)
                                  throws XNIException
        Notifies of the start of a parameter entity. The parameter entity name starts with a '%' character.
        Specified by:
        startParameterEntity in interface XMLDTDHandler
        Overrides:
        startParameterEntity in class AbstractXMLDocumentParser
        Parameters:
        name - The name of the parameter entity.
        identifier - The resource identifier.
        encoding - The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal parameter entities).
        augs - Additional information that may include infoset augmentations.
        Throws:
        XNIException - Thrown by handler to signal an error.
      • attributeDecl

        public void attributeDecl​(String elementName,
                                  String attributeName,
                                  String type,
                                  String[] enumeration,
                                  String defaultType,
                                  XMLString defaultValue,
                                  XMLString nonNormalizedDefaultValue,
                                  Augmentations augs)
                           throws XNIException
        An attribute declaration.
        Specified by:
        attributeDecl in interface XMLDTDHandler
        Overrides:
        attributeDecl in class AbstractXMLDocumentParser
        Parameters:
        elementName - The name of the element that this attribute is associated with.
        attributeName - The name of the attribute.
        type - The attribute type. This value will be one of the following: "CDATA", "ENTITY", "ENTITIES", "ENUMERATION", "ID", "IDREF", "IDREFS", "NMTOKEN", "NMTOKENS", or "NOTATION".
        enumeration - If the type has the value "ENUMERATION" or "NOTATION", this array holds the allowed attribute values; otherwise, this array is null.
        defaultType - The attribute default type. This value will be one of the following: "#FIXED", "#IMPLIED", "#REQUIRED", or null.
        defaultValue - The attribute default value, or null if no default value is specified.
        nonNormalizedDefaultValue - The attribute default value with no normalization performed, or null if no default value is specified.
        augs - Additional information that may include infoset augmentations.
        Throws:
        XNIException - Thrown by handler to signal an error.
      • createElementNode

        protected Element createElementNode​(QName element)
      • createAttrNode

        protected Attr createAttrNode​(QName attrQName)
      • setCharacterData

        protected void setCharacterData​(boolean sawChars)