Class XMLEntityManager

  • All Implemented Interfaces:
    XMLComponent, XMLEntityResolver

    public class XMLEntityManager
    extends Object
    implements XMLComponent, XMLEntityResolver
    The entity manager handles the registration of general and parameter entities, resolves entities, and starts entities. It is a central component in a standard parser configuration, and this class works directly with the entity scanner to manage the underlying XNI.

    This component requires the following features and properties from the component manager that uses it:

    • http://xml.org/sax/features/validation
    • http://xml.org/sax/features/external-general-entities
    • http://xml.org/sax/features/external-parameter-entities
    • http://apache.org/xml/features/allow-java-encodings
    • http://apache.org/xml/properties/internal/symbol-table
    • http://apache.org/xml/properties/internal/error-reporter
    • http://apache.org/xml/properties/internal/entity-resolver
    Author:
    Andy Clark, IBM, Arnaud Le Hors, IBM
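
    The feature identifiers listed above are the same strings an application passes through JAXP. The following is a minimal sketch, assuming the JDK's built-in parser and `javax.xml.parsers.SAXParserFactory` rather than this class or its component manager; the class name `EntityFeatureConfig` and its helper methods are hypothetical.

```java
import javax.xml.parsers.SAXParserFactory;

final class EntityFeatureConfig {
    static final String VALIDATION =
        "http://xml.org/sax/features/validation";
    static final String EXTERNAL_GENERAL_ENTITIES =
        "http://xml.org/sax/features/external-general-entities";
    static final String EXTERNAL_PARAMETER_ENTITIES =
        "http://xml.org/sax/features/external-parameter-entities";

    /** Builds a factory whose parsers do not load external entities. */
    static SAXParserFactory newRestrictedFactory() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature(VALIDATION, false);
            factory.setFeature(EXTERNAL_GENERAL_ENTITIES, false);
            factory.setFeature(EXTERNAL_PARAMETER_ENTITIES, false);
            return factory;
        } catch (Exception e) {
            // Assumption: a parser that rejects these features is a setup error.
            throw new IllegalStateException(e);
        }
    }

    /** Reads back the current state of a feature from the factory. */
    static boolean isEnabled(SAXParserFactory factory, String featureId) {
        try {
            return factory.getFeature(featureId);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SAXParserFactory factory = newRestrictedFactory();
        System.out.println(isEnabled(factory, EXTERNAL_GENERAL_ENTITIES));
    }
}
```

    Turning both external-entity features off is a common hardening choice; whether it suits a given configuration depends on whether documents legitimately reference external entities.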
    • Field Detail

      • DEFAULT_BUFFER_SIZE

        public static final int DEFAULT_BUFFER_SIZE
        Default buffer size (2048).
        See Also:
        Constant Field Values
      • DEFAULT_XMLDECL_BUFFER_SIZE

        public static final int DEFAULT_XMLDECL_BUFFER_SIZE
        Default buffer size used before the XMLDecl has been fully processed.
        See Also:
        Constant Field Values
      • DEFAULT_INTERNAL_BUFFER_SIZE

        public static final int DEFAULT_INTERNAL_BUFFER_SIZE
        Default internal entity buffer size (512).
        See Also:
        Constant Field Values
      • EXTERNAL_GENERAL_ENTITIES

        protected static final String EXTERNAL_GENERAL_ENTITIES
        Feature identifier: external general entities.
        See Also:
        Constant Field Values
      • EXTERNAL_PARAMETER_ENTITIES

        protected static final String EXTERNAL_PARAMETER_ENTITIES
        Feature identifier: external parameter entities.
        See Also:
        Constant Field Values
      • ALLOW_JAVA_ENCODINGS

        protected static final String ALLOW_JAVA_ENCODINGS
        Feature identifier: allow Java encodings.
        See Also:
        Constant Field Values
      • WARN_ON_DUPLICATE_ENTITYDEF

        protected static final String WARN_ON_DUPLICATE_ENTITYDEF
        Feature identifier: warn on duplicate entity declaration.
        See Also:
        Constant Field Values
      • STANDARD_URI_CONFORMANT

        protected static final String STANDARD_URI_CONFORMANT
        Feature identifier: standard URI conformant.
        See Also:
        Constant Field Values
      • SECURITY_MANAGER

        protected static final String SECURITY_MANAGER
        Property identifier: security manager.
        See Also:
        Constant Field Values
      • fValidation

        protected boolean fValidation
        Validation. This feature identifier is: http://xml.org/sax/features/validation
      • fExternalGeneralEntities

        protected boolean fExternalGeneralEntities
        External general entities. This feature identifier is: http://xml.org/sax/features/external-general-entities
      • fExternalParameterEntities

        protected boolean fExternalParameterEntities
        External parameter entities. This feature identifier is: http://xml.org/sax/features/external-parameter-entities
      • fAllowJavaEncodings

        protected boolean fAllowJavaEncodings
        Allow Java encoding names. This feature identifier is: http://apache.org/xml/features/allow-java-encodings
      • fWarnDuplicateEntityDef

        protected boolean fWarnDuplicateEntityDef
        Warn on duplicate entity declaration. This feature identifier is: http://apache.org/xml/features/warn-on-duplicate-entitydef
      • fStrictURI

        protected boolean fStrictURI
        Standard URI conformant (strict URI). This feature identifier is: http://apache.org/xml/features/standard-uri-conformant
      • fSymbolTable

        protected SymbolTable fSymbolTable
        Symbol table. This property identifier is: http://apache.org/xml/properties/internal/symbol-table
      • fErrorReporter

        protected XMLErrorReporter fErrorReporter
        Error reporter. This property identifier is: http://apache.org/xml/properties/internal/error-reporter
      • fEntityResolver

        protected XMLEntityResolver fEntityResolver
        Entity resolver. This property identifier is: http://apache.org/xml/properties/internal/entity-resolver
      • fValidationManager

        protected ValidationManager fValidationManager
        Validation manager. This property identifier is: http://apache.org/xml/properties/internal/validation-manager
      • fBufferSize

        protected int fBufferSize
        Buffer size. We get this value from a property. The default size is used if the input buffer size property is not specified. REVISIT: do we need a property for internal entity buffer size?
      • fStandalone

        protected boolean fStandalone
        True if the document entity is standalone. This should really only be set by the document source (e.g. XMLDocumentScanner).
      • fHasPEReferences

        protected boolean fHasPEReferences
        True if the current document contains parameter entity references.
      • fInExternalSubset

        protected boolean fInExternalSubset
      • fEntityScanner

        protected XMLEntityScanner fEntityScanner
        Current entity scanner.
      • fXML10EntityScanner

        protected XMLEntityScanner fXML10EntityScanner
        XML 1.0 entity scanner.
      • fXML11EntityScanner

        protected XMLEntityScanner fXML11EntityScanner
        XML 1.1 entity scanner.
      • fEntityExpansionLimit

        protected int fEntityExpansionLimit
      • fEntityExpansionCount

        protected int fEntityExpansionCount
      • fEntities

        protected final Hashtable fEntities
        Entities.
      • fEntityStack

        protected final Stack fEntityStack
        Entity stack.
      • fDeclaredEntities

        protected Hashtable fDeclaredEntities
        Shared declared entities.
      • fReaderStack

        protected Stack fReaderStack
    • Constructor Detail

      • XMLEntityManager

        public XMLEntityManager()
        Default constructor.
      • XMLEntityManager

        public XMLEntityManager​(XMLEntityManager entityManager)
        Constructs an entity manager that shares the specified entity declarations during each parse.

        REVISIT: We might want to think about the "right" way to expose the list of declared entities. For now, the knowledge of how to access the entity declarations is implicit.

    • Method Detail

      • setStandalone

        public void setStandalone​(boolean standalone)
        Sets whether the document entity is standalone.
        Parameters:
        standalone - True if document entity is standalone.
      • isStandalone

        public boolean isStandalone()
        Returns true if the document entity is standalone.
      • setEntityHandler

        public void setEntityHandler​(XMLEntityHandler entityHandler)
        Sets the entity handler. When an entity starts and ends, the entity handler is notified of the change.
        Parameters:
        entityHandler - The new entity handler.
      • addInternalEntity

        public void addInternalEntity​(String name,
                                      String text,
                                      int paramEntityRefs)
        Adds an internal entity declaration.

        Note: This method ignores subsequent entity declarations.

        Note: The name should be a unique symbol. The SymbolTable can be used for this purpose.

        Parameters:
        name - The name of the entity.
        text - The text of the entity.
        paramEntityRefs - Count of direct and indirect references to parameter entities in the value of the entity.
        See Also:
        SymbolTable
      • addInternalEntity

        public void addInternalEntity​(String name,
                                      String text)
        Adds an internal entity declaration.

        Note: This method ignores subsequent entity declarations.

        Note: The name should be a unique symbol. The SymbolTable can be used for this purpose.

        Parameters:
        name - The name of the entity.
        text - The text of the entity.
        See Also:
        SymbolTable
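
        The note that subsequent declarations are ignored corresponds to the XML rule that the first declaration of an entity name is the binding one. The following sketch illustrates that rule through the JDK's `DocumentBuilder`, not through `addInternalEntity` itself; the class name `DuplicateEntityDemo` and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DuplicateEntityDemo {
    // The same internal entity is declared twice in the internal subset.
    static final String XML =
        "<!DOCTYPE r [\n"
        + "  <!ENTITY greeting \"first\">\n"
        + "  <!ENTITY greeting \"second\">\n"
        + "]>\n"
        + "<r>&greeting;</r>";

    /** Parses the document and returns the text of the root element. */
    static String parseText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The second declaration is ignored, so the first value is expanded.
        System.out.println(parseText(XML)); // prints "first"
    }
}
```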
      • getParamEntityRefCount

        public int getParamEntityRefCount​(String entityName)
        Returns the number of direct and indirect references to parameter entities in the value of the entity. This value will only be non-zero for an internal parameter entity.
        Parameters:
        entityName - The name of the entity to check.
        Returns:
        Count of direct and indirect references to parameter entities in the value of the entity
      • addExternalEntity

        public void addExternalEntity​(String name,
                                      String publicId,
                                      String literalSystemId,
                                      String baseSystemId)
                               throws IOException
        Adds an external entity declaration.

        Note: This method ignores subsequent entity declarations.

        Note: The name should be a unique symbol. The SymbolTable can be used for this purpose.

        Parameters:
        name - The name of the entity.
        publicId - The public identifier of the entity.
        literalSystemId - The system identifier of the entity.
        baseSystemId - The base system identifier of the entity. This is the system identifier of the entity in which the entity being added is declared; it is used to expand the system identifier when the system identifier is a relative URI. When null, the system identifier of the first external entity on the stack is used instead.
        Throws:
        IOException
        See Also:
        SymbolTable
      • isExternalEntity

        public boolean isExternalEntity​(String entityName)
        Checks whether an entity given by name is external.
        Parameters:
        entityName - The name of the entity to check.
        Returns:
        True if the entity is external, false otherwise (including when the entity is not declared).
      • isEntityDeclInExternalSubset

        public boolean isEntityDeclInExternalSubset​(String entityName)
        Checks whether the declaration of an entity given by name is in the external subset.
        Parameters:
        entityName - The name of the entity to check.
        Returns:
        True if the entity was declared in the external subset, false otherwise (including when the entity is not declared).
      • addUnparsedEntity

        public void addUnparsedEntity​(String name,
                                      String publicId,
                                      String systemId,
                                      String baseSystemId,
                                      String notation)
        Adds an unparsed entity declaration.

        Note: This method ignores subsequent entity declarations.

        Note: The name should be a unique symbol. The SymbolTable can be used for this purpose.

        Parameters:
        name - The name of the entity.
        publicId - The public identifier of the entity.
        systemId - The system identifier of the entity.
        notation - The name of the notation.
        See Also:
        SymbolTable
      • isUnparsedEntity

        public boolean isUnparsedEntity​(String entityName)
        Checks whether an entity given by name is unparsed.
        Parameters:
        entityName - The name of the entity to check.
        Returns:
        True if the entity is unparsed, false otherwise (including when the entity is not declared).
      • isDeclaredEntity

        public boolean isDeclaredEntity​(String entityName)
        Checks whether an entity given by name is declared.
        Parameters:
        entityName - The name of the entity to check.
        Returns:
        True if the entity is declared, false otherwise.
      • resolveEntity

        public XMLInputSource resolveEntity​(XMLResourceIdentifier resourceIdentifier)
                                     throws IOException,
                                            XNIException
        Resolves the specified public and system identifiers. This method first attempts to resolve the entity using the entity resolver registered by the application. If no entity resolver is registered, or if the registered entity resolver is unable to resolve the entity, default entity resolution occurs.
        Specified by:
        resolveEntity in interface XMLEntityResolver
        Parameters:
        resourceIdentifier - The XMLResourceIdentifier for the resource to resolve.
        Returns:
        Returns an input source that wraps the resolved entity. This method will never return null.
        Throws:
        IOException - Thrown on i/o error.
        XNIException - Thrown by entity resolver to signal an error.
        See Also:
        XMLResourceIdentifier
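
        The resolver-first order described above can be observed from application code. The sketch below is an assumption-laden illustration: it uses the JDK's SAX `EntityResolver` and `DocumentBuilder` rather than the XNI `XMLEntityResolver` documented here, and the class name `ResolverFirstDemo`, the system identifier, and the replacement text are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ResolverFirstDemo {
    // An external general entity whose system identifier is never fetched,
    // because the application resolver answers first.
    static final String XML =
        "<!DOCTYPE r [<!ENTITY ext SYSTEM \"http://example.invalid/ext.ent\">]>"
        + "<r>&ext;</r>";

    /** Parses xml, supplying replacement as the content of every external entity. */
    static String parseWithResolver(String xml, String replacement) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Returning null here instead would hand the request back to the
            // parser's default resolution.
            builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader(replacement)));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseWithResolver(XML, "resolved text"));
    }
}
```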
      • startEntity

        public void startEntity​(String entityName,
                                boolean literal)
                         throws IOException,
                                XNIException
        Starts a named entity.
        Parameters:
        entityName - The name of the entity to start.
        literal - True if this entity is started within a literal value.
        Throws:
        IOException - Thrown on i/o error.
        XNIException - Thrown by entity handler to signal an error.
      • startDocumentEntity

        public void startDocumentEntity​(XMLInputSource xmlInputSource)
                                 throws IOException,
                                        XNIException
        Starts the document entity. The document entity has the "[xml]" pseudo-name.
        Parameters:
        xmlInputSource - The input source of the document entity.
        Throws:
        IOException - Thrown on i/o error.
        XNIException - Thrown by entity handler to signal an error.
      • startDTDEntity

        public void startDTDEntity​(XMLInputSource xmlInputSource)
                            throws IOException,
                                   XNIException
        Starts the DTD entity. The DTD entity has the "[dtd]" pseudo-name.
        Parameters:
        xmlInputSource - The input source of the DTD entity.
        Throws:
        IOException - Thrown on i/o error.
        XNIException - Thrown by entity handler to signal an error.
      • startExternalSubset

        public void startExternalSubset()
      • endExternalSubset

        public void endExternalSubset()
      • startEntity

        public void startEntity​(String name,
                                XMLInputSource xmlInputSource,
                                boolean literal,
                                boolean isExternal)
                         throws IOException,
                                XNIException
        Starts an entity.

        This method can be used to insert an application defined XML entity stream into the parsing stream.

        Parameters:
        name - The name of the entity.
        xmlInputSource - The input source of the entity.
        literal - True if this entity is started within a literal value.
        isExternal - whether this entity should be treated as an internal or external entity.
        Throws:
        IOException - Thrown on i/o error.
        XNIException - Thrown by entity handler to signal an error.
      • setupCurrentEntity

        public String setupCurrentEntity​(String name,
                                         XMLInputSource xmlInputSource,
                                         boolean literal,
                                         boolean isExternal)
                                  throws IOException,
                                         XNIException
        This method uses the passed-in XMLInputSource to make fCurrentEntity usable for reading.
        Parameters:
        name - name of the entity ("[xml]" if it is the document entity)
        xmlInputSource - the input source, with sufficient information to begin scanning characters.
        literal - True if this entity is started within a literal value.
        isExternal - whether this entity should be treated as an internal or external entity.
        Returns:
        the encoding of the new entity or null if a character stream was employed
        Throws:
        IOException - if anything can't be read.
        XNIException - if any parser-specific error occurs.
      • setScannerVersion

        public void setScannerVersion​(short version)
      • getEntityScanner

        public XMLEntityScanner getEntityScanner()
        Returns the entity scanner.
      • closeReaders

        public void closeReaders()
        Closes all InputStreams and Readers opened by this parser.
      • reset

        public void reset​(XMLComponentManager componentManager)
                   throws XMLConfigurationException
        Resets the component. The component can query the component manager about any features and properties that affect the operation of the component.
        Specified by:
        reset in interface XMLComponent
        Parameters:
        componentManager - The component manager.
        Throws:
        XMLConfigurationException - Thrown by component on initialization error. For example, if a feature or property is required for the operation of the component, the component manager may throw a SAXNotRecognizedException or a SAXNotSupportedException.
      • reset

        public void reset()
      • getRecognizedFeatures

        public String[] getRecognizedFeatures()
        Returns a list of feature identifiers that are recognized by this component. This method may return null if no features are recognized by this component.
        Specified by:
        getRecognizedFeatures in interface XMLComponent
      • setFeature

        public void setFeature​(String featureId,
                               boolean state)
                        throws XMLConfigurationException
        Sets the state of a feature. This method is called by the component manager any time after reset when a feature changes state.

        Note: Components should silently ignore features that do not affect the operation of the component.

        Specified by:
        setFeature in interface XMLComponent
        Parameters:
        featureId - The feature identifier.
        state - The state of the feature.
        Throws:
        SAXNotRecognizedException - The component should not throw this exception.
        SAXNotSupportedException - The component should not throw this exception.
        XMLConfigurationException - Thrown for configuration error. In general, components should only throw this exception if it is really a critical error.
      • getRecognizedProperties

        public String[] getRecognizedProperties()
        Returns a list of property identifiers that are recognized by this component. This method may return null if no properties are recognized by this component.
        Specified by:
        getRecognizedProperties in interface XMLComponent
      • setProperty

        public void setProperty​(String propertyId,
                                Object value)
                         throws XMLConfigurationException
        Sets the value of a property. This method is called by the component manager any time after reset when a property changes value.

        Note: Components should silently ignore properties that do not affect the operation of the component.

        Specified by:
        setProperty in interface XMLComponent
        Parameters:
        propertyId - The property identifier.
        value - The value of the property.
        Throws:
        SAXNotRecognizedException - The component should not throw this exception.
        SAXNotSupportedException - The component should not throw this exception.
        XMLConfigurationException - Thrown for configuration error. In general, components should only throw this exception if it is really a critical error.
      • getFeatureDefault

        public Boolean getFeatureDefault​(String featureId)
        Returns the default state for a feature, or null if this component does not want to report a default value for this feature.
        Specified by:
        getFeatureDefault in interface XMLComponent
        Parameters:
        featureId - The feature identifier.
        Since:
        Xerces 2.2.0
      • getPropertyDefault

        public Object getPropertyDefault​(String propertyId)
        Returns the default value for a property, or null if this component does not want to report a default value for this property.
        Specified by:
        getPropertyDefault in interface XMLComponent
        Parameters:
        propertyId - The property identifier.
        Since:
        Xerces 2.2.0
      • absolutizeAgainstUserDir

        public static void absolutizeAgainstUserDir​(URI uri)
                                             throws URI.MalformedURIException
        Absolutizes a URI using the current value of the "user.dir" property as the base URI. If the URI is already absolute, this is a no-op.
        Parameters:
        uri - the URI to absolutize
        Throws:
        URI.MalformedURIException
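
        As a rough illustration of absolutizing against "user.dir", the sketch below uses `java.net.URI` and `java.io.File` from the JDK. It is not the documented method: that method takes the Xerces `URI` type and modifies it in place, whereas `java.net.URI` is immutable, so this hypothetical `absolutize` helper returns a new value instead.

```java
import java.io.File;
import java.net.URI;

final class UserDirAbsolutizer {
    /** Returns uri unchanged if it is absolute, else resolves it against user.dir. */
    static URI absolutize(URI uri) {
        if (uri.isAbsolute()) {
            return uri; // already absolute: nothing to do
        }
        // File.toURI() appends a trailing slash for an existing directory,
        // so relative paths resolve inside it rather than beside it.
        URI base = new File(System.getProperty("user.dir")).toURI();
        return base.resolve(uri);
    }

    public static void main(String[] args) {
        System.out.println(absolutize(URI.create("data/doc.xml")));
        System.out.println(absolutize(URI.create("http://example.org/a.xml")));
    }
}
```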
      • expandSystemId

        public static String expandSystemId​(String systemId,
                                            String baseSystemId,
                                            boolean strict)
                                     throws URI.MalformedURIException
        Expands a system id and returns the system id as a URI, if it can be expanded. A return value of null means that the identifier is already expanded. An exception thrown indicates a failure to expand the id.
        Parameters:
        systemId - The systemId to be expanded.
        Returns:
        Returns the URI string representing the expanded system identifier. A null value indicates that the given system identifier is already expanded.
        Throws:
        URI.MalformedURIException
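
        Expanding a relative system identifier against a base can be sketched with `java.net.URI`. This is only an analogue under stated assumptions: the helper name `expand` is hypothetical, there is no `strict` flag, and an identifier that is already absolute is returned as is rather than signalled with null.

```java
import java.net.URI;

final class SystemIdExpansion {
    /**
     * Resolves systemId against baseSystemId when systemId is relative.
     * An absolute systemId, or a null base, leaves systemId unchanged.
     */
    static String expand(String systemId, String baseSystemId) {
        URI uri = URI.create(systemId);
        if (uri.isAbsolute() || baseSystemId == null) {
            return systemId;
        }
        return URI.create(baseSystemId).resolve(uri).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.org/docs/main.xml";
        // prints "http://example.org/docs/chapters/ch1.xml"
        System.out.println(expand("chapters/ch1.xml", base));
        // prints "http://example.org/a.dtd"
        System.out.println(expand("http://example.org/a.dtd", base));
    }
}
```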
      • getEncodingInfo

        protected org.smooks.engine.delivery.sax.ng.org.apache.xerces.impl.XMLEntityManager.EncodingInfo getEncodingInfo​(byte[] b4,
                                                                                                                         int count)
        Returns the IANA encoding name that is auto-detected from the bytes specified, with the endian-ness of that encoding where appropriate.
        Parameters:
        b4 - The first four bytes of the input.
        count - The number of bytes actually read.
        Returns:
        an instance of EncodingInfo which represents the auto-detected encoding.
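
        Auto-detection from the first four bytes generally follows the byte patterns described in Appendix F of the XML 1.0 specification. The sketch below covers only a subset of those patterns and returns a plain string; it is not this class's implementation, and the class name `EncodingSniffer` and the returned labels are assumptions.

```java
final class EncodingSniffer {
    /** Guesses an encoding from up to four leading bytes; defaults to UTF-8. */
    static String detect(byte[] b4, int count) {
        if (count < 2) {
            return "UTF-8"; // too little data to tell
        }
        int b0 = b4[0] & 0xFF;
        int b1 = b4[1] & 0xFF;
        // Byte order marks for UTF-16.
        if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
        if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
        if (count < 3) {
            return "UTF-8";
        }
        int b2 = b4[2] & 0xFF;
        // Byte order mark for UTF-8.
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return "UTF-8";
        if (count < 4) {
            return "UTF-8";
        }
        int b3 = b4[3] & 0xFF;
        // No BOM: look for "<?" encoded as two-byte code units.
        if (b0 == 0x00 && b1 == 0x3C && b2 == 0x00 && b3 == 0x3F) return "UTF-16BE";
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x3F && b3 == 0x00) return "UTF-16LE";
        return "UTF-8";
    }

    public static void main(String[] args) {
        byte[] utf16be = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x3C};
        System.out.println(detect(utf16be, 4)); // prints "UTF-16BE"
    }
}
```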
      • createReader

        protected Reader createReader​(InputStream inputStream,
                                      String encoding,
                                      Boolean isBigEndian)
                               throws IOException
        Creates a reader capable of reading the given input stream in the specified encoding.
        Parameters:
        inputStream - The input stream.
        encoding - The encoding name that the input stream is encoded using. If the user has specified that Java encoding names are allowed, then the encoding name may be a Java encoding name; otherwise, it is an IANA encoding name.
        isBigEndian - For encodings (like UCS-4) whose names cannot specify a byte order, this tells whether the order is big-endian. Null means unknown or not relevant.
        Returns:
        Returns a reader.
        Throws:
        IOException
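
        The general idea of wrapping a byte stream in a reader for a named encoding can be sketched with the JDK's `InputStreamReader`. This is a simplified stand-in: it ignores the `isBigEndian` argument and any specialised readers this class may use, and the names `ReaderFactory` and `readAll` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

final class ReaderFactory {
    /** Creates a buffered reader that decodes the stream using the named encoding. */
    static Reader createReader(InputStream in, String encoding) throws IOException {
        if (!Charset.isSupported(encoding)) {
            throw new UnsupportedEncodingException(encoding);
        }
        return new BufferedReader(new InputStreamReader(in, Charset.forName(encoding)));
    }

    /** Convenience helper: decodes a whole byte array to a string. */
    static String readAll(byte[] bytes, String encoding) {
        try (Reader reader = createReader(new ByteArrayInputStream(bytes), encoding)) {
            StringBuilder out = new StringBuilder();
            int ch;
            while ((ch = reader.read()) != -1) {
                out.append((char) ch);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "café" encoded as ISO-8859-1: the last byte 0xE9 is 'é'.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        System.out.println(readAll(latin1, "ISO-8859-1"));
    }
}
```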
      • fixURI

        protected static String fixURI​(String str)
        Fixes a platform dependent filename to standard URI form.
        Parameters:
        str - The string to fix.
        Returns:
        Returns the fixed URI string.
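
        One plausible reading of "fixing" a platform-dependent filename is sketched below: backslashes become forward slashes, a Windows drive letter gains a leading slash, and spaces are percent-encoded. These three steps are assumptions made for illustration, not a statement of what `fixURI` does, and the class name `FileNameToUri` is hypothetical.

```java
final class FileNameToUri {
    /** Converts a filename such as C:\data\my file.xml to a URI-style path. */
    static String fix(String str) {
        // Normalise Windows separators regardless of the host platform.
        String s = str.replace('\\', '/');
        // A leading drive letter ("C:") needs a slash in front to form a path.
        if (s.length() >= 2 && s.charAt(1) == ':' && Character.isLetter(s.charAt(0))) {
            s = "/" + s;
        }
        // Spaces are not legal in URIs; encode them.
        return s.replace(" ", "%20");
    }

    public static void main(String[] args) {
        System.out.println(fix("C:\\data\\my file.xml")); // prints "/C:/data/my%20file.xml"
        System.out.println(fix("/tmp/a.xml"));            // prints "/tmp/a.xml"
    }
}
```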