org.attoparser.markup.xml
Class DOMXmlAttoHandler

Object
  extended by org.attoparser.AbstractAttoHandler
      extended by org.attoparser.markup.AbstractBasicMarkupAttoHandler
          extended by org.attoparser.markup.AbstractDetailedMarkupAttoHandler
              extended by org.attoparser.markup.xml.AbstractDetailedXmlAttoHandler
                  extended by org.attoparser.markup.xml.AbstractStandardXmlAttoHandler
                      extended by org.attoparser.markup.xml.DOMXmlAttoHandler
All Implemented Interfaces:
IAttoHandler, ITimedDocumentHandling, IAttributeSequenceHandling, IBasicDocTypeHandling, IBasicElementHandling, ICDATASectionHandling, ICommentHandling, IDetailedDocTypeHandling, IDetailedElementHandling, IProcessingInstructionHandling, IXmlDeclarationHandling, IDetailedXmlElementHandling

public final class DOMXmlAttoHandler
extends AbstractStandardXmlAttoHandler

Implementation of IAttoHandler that treats its input as XML and builds an attoDOM tree from objects in package org.attoparser.markup.dom.

Use of this handler requires the document to be well-formed from the XML standpoint.

Since:
1.1
Author:
Daniel Fernández

Constructor Summary
DOMXmlAttoHandler()
           Creates a new instance of this handler.
DOMXmlAttoHandler(String documentName)
           Creates a new instance of this handler.
 
Method Summary
 Document getDocument()
           Returns the attoDOM Document created during parsing.
 long getParsingEndTimeNanos()
           Returns the time (in nanoseconds) when parsing ended.
 long getParsingStartTimeNanos()
           Returns the time (in nanoseconds) when parsing started.
 long getParsingTotalTimeNanos()
           Returns the difference (in nanoseconds) between parsing start and end.
 void handleCDATASection(char[] buffer, int offset, int len, int line, int col)
           Called when a CDATA section is found.
 void handleComment(char[] buffer, int offset, int len, int line, int col)
           Called when a comment is found.
 void handleDocType(String elementName, String publicId, String systemId, String internalSubset, int line, int col)
           Called when a DOCTYPE clause is found.
 void handleProcessingInstruction(String target, String content, int line, int col)
           Called when a Processing Instruction is found.
 void handleText(char[] buffer, int offset, int len, int line, int col)
           Called when a text artifact is found.
 void handleXmlCloseElement(String elementName, int line, int col)
           Called when a close element (a close tag) is found.
 void handleXmlDeclaration(String version, String encoding, String standalone, int line, int col)
           Called when an XML Declaration is found.
 void handleXmlDocumentEnd(long endTimeNanos, long totalTimeNanos, int line, int col)
           
 void handleXmlDocumentStart(long startTimeNanos, int line, int col)
           
 void handleXmlOpenElement(String elementName, Map<String,String> attributes, int line, int col)
           Called when an open element (an open tag) is found.
 void handleXmlStandaloneElement(String elementName, Map<String,String> attributes, int line, int col)
           Called when a standalone element (a minimized tag) is found.
 boolean isParsingFinished()
           Returns whether parsing has already finished or not.
 
Methods inherited from class org.attoparser.markup.xml.AbstractStandardXmlAttoHandler
handleAttribute, handleCDATASection, handleComment, handleDocType, handleInnerWhiteSpace, handleProcessingInstruction, handleXmlCloseElementEnd, handleXmlCloseElementStart, handleXmlDeclarationDetail, handleXmlOpenElementEnd, handleXmlOpenElementStart, handleXmlStandaloneElementEnd, handleXmlStandaloneElementStart
 
Methods inherited from class org.attoparser.markup.xml.AbstractDetailedXmlAttoHandler
handleAutoCloseElementEnd, handleAutoCloseElementStart, handleCloseElementEnd, handleCloseElementStart, handleDocumentEnd, handleDocumentStart, handleOpenElementEnd, handleOpenElementStart, handleStandaloneElementEnd, handleStandaloneElementStart
 
Methods inherited from class org.attoparser.markup.AbstractDetailedMarkupAttoHandler
handleCloseElement, handleDocType, handleDocumentEnd, handleDocumentStart, handleOpenElement, handleStandaloneElement, handleUnmatchedCloseElementEnd, handleUnmatchedCloseElementStart, handleXmlDeclaration
 
Methods inherited from class org.attoparser.markup.AbstractBasicMarkupAttoHandler
handleStructure
 
Methods inherited from class org.attoparser.AbstractAttoHandler
getEndTimeNanos, getStartTimeNanos, getTotalTimeNanos, handleDocumentEnd, handleDocumentStart
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DOMXmlAttoHandler

public DOMXmlAttoHandler()

Creates a new instance of this handler.


DOMXmlAttoHandler

public DOMXmlAttoHandler(String documentName)

Creates a new instance of this handler.

Method Detail

getDocument

public Document getDocument()

Returns the attoDOM Document created during parsing.

Returns:
the built DOM document object.

getParsingStartTimeNanos

public long getParsingStartTimeNanos()

Returns the time (in nanoseconds) when parsing started.

Returns:
the start time.

getParsingEndTimeNanos

public long getParsingEndTimeNanos()

Returns the time (in nanoseconds) when parsing ended.

Returns:
the end time.

getParsingTotalTimeNanos

public long getParsingTotalTimeNanos()

Returns the difference (in nanoseconds) between parsing start and end.

Returns:
the parsing time in nanos.
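
The three timing getters are related by simple arithmetic: the total is the end time minus the start time. The following is a minimal sketch of that relationship and of a conversion to milliseconds, using only JDK calls. The class ParsingTimes and its methods are hypothetical and not part of attoparser, and using System.nanoTime() as the clock source is an assumption.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of attoparser: illustrates the arithmetic behind the timing getters.
final class ParsingTimes {

    // Difference between two nanosecond timestamps, matching the documented meaning of the total time.
    static long totalNanos(long startNanos, long endNanos) {
        return endNanos - startNanos;
    }

    // Converts a nanosecond duration to whole milliseconds (fractions are truncated).
    static long toMillis(long nanos) {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();   // assumed clock source
        long end = System.nanoTime();
        long total = totalNanos(start, end);
        System.out.println("total ns: " + total + ", ms: " + toMillis(total));
    }
}
```

With a real handler, the value passed to toMillis would presumably come from getParsingTotalTimeNanos() once isParsingFinished() returns true.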

isParsingFinished

public boolean isParsingFinished()

Returns whether parsing has already finished or not.

Returns:
true if parsing has finished, false if not.

handleXmlDocumentStart

public void handleXmlDocumentStart(long startTimeNanos,
                                   int line,
                                   int col)
                            throws AttoParseException
Overrides:
handleXmlDocumentStart in class AbstractDetailedXmlAttoHandler
Throws:
AttoParseException

handleXmlDocumentEnd

public void handleXmlDocumentEnd(long endTimeNanos,
                                 long totalTimeNanos,
                                 int line,
                                 int col)
                          throws AttoParseException
Overrides:
handleXmlDocumentEnd in class AbstractDetailedXmlAttoHandler
Throws:
AttoParseException

handleXmlDeclaration

public void handleXmlDeclaration(String version,
                                 String encoding,
                                 String standalone,
                                 int line,
                                 int col)
                          throws AttoParseException
Description copied from class: AbstractStandardXmlAttoHandler

Called when an XML Declaration is found.

Overrides:
handleXmlDeclaration in class AbstractStandardXmlAttoHandler
Parameters:
version - the version value specified (cannot be null).
encoding - the encoding value specified (can be null).
standalone - the standalone value specified (can be null).
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException

handleDocType

public void handleDocType(String elementName,
                          String publicId,
                          String systemId,
                          String internalSubset,
                          int line,
                          int col)
                   throws AttoParseException
Description copied from class: AbstractStandardXmlAttoHandler

Called when a DOCTYPE clause is found.

Overrides:
handleDocType in class AbstractStandardXmlAttoHandler
Parameters:
elementName - the root element name present in the DOCTYPE clause (e.g. "html").
publicId - the public ID specified, if present (might be null).
systemId - the system ID specified, if present (might be null).
internalSubset - the internal subset specified, if present (might be null).
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException

handleXmlStandaloneElement

public void handleXmlStandaloneElement(String elementName,
                                       Map<String,String> attributes,
                                       int line,
                                       int col)
                                throws AttoParseException
Description copied from class: AbstractStandardXmlAttoHandler

Called when a standalone element (a minimized tag) is found.

Note that the element attributes map can be null if no attributes are present.

Overrides:
handleXmlStandaloneElement in class AbstractStandardXmlAttoHandler
Parameters:
elementName - the element name (e.g. "<img src="logo.png"/>" -> "img").
attributes - the element attributes map, or null if no attributes are present.
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException

handleXmlOpenElement

public void handleXmlOpenElement(String elementName,
                                 Map<String,String> attributes,
                                 int line,
                                 int col)
                          throws AttoParseException
Description copied from class: AbstractStandardXmlAttoHandler

Called when an open element (an open tag) is found.

Note that the element attributes map can be null if no attributes are present.

Overrides:
handleXmlOpenElement in class AbstractStandardXmlAttoHandler
Parameters:
elementName - the element name (e.g. "<div class="content">" -> "div").
attributes - the element attributes map, or null if no attributes are present.
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException
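
Because the attributes map can be null when an element has no attributes, code that reads attributes needs a null check. The following is a minimal sketch of one way to do that. The class AttributeAccess and its methods are hypothetical helpers and not part of attoparser.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of attoparser: null-safe access to an element's attributes map.
final class AttributeAccess {

    // Returns the given map, or an empty map when no attributes were reported (null).
    static Map<String, String> orEmpty(Map<String, String> attributes) {
        return attributes == null ? Collections.<String, String>emptyMap() : attributes;
    }

    // Looks up one attribute, returning the fallback when the map or the attribute is absent.
    static String attribute(Map<String, String> attributes, String name, String fallback) {
        String value = orEmpty(attributes).get(name);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Collections.singletonMap("class", "content");
        System.out.println(attribute(attrs, "class", "none"));
        System.out.println(attribute(null, "class", "none"));
    }
}
```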

handleXmlCloseElement

public void handleXmlCloseElement(String elementName,
                                  int line,
                                  int col)
                           throws AttoParseException
Description copied from class: AbstractStandardXmlAttoHandler

Called when a close element (a close tag) is found.

Overrides:
handleXmlCloseElement in class AbstractStandardXmlAttoHandler
Parameters:
elementName - the element name (e.g. "</div>" -> "div").
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException

handleComment

public void handleComment(char[] buffer,
                          int offset,
                          int len,
                          int line,
                          int col)
                   throws AttoParseException
Description copied from class: AbstractStandardXmlAttoHandler

Called when a comment is found.

This artifact is returned as a char[] instead of a String because its content can be large. To convert it into a String, use new String(buffer, offset, len).

Overrides:
handleComment in class AbstractStandardXmlAttoHandler
Parameters:
buffer - the document buffer.
offset - the offset of the artifact in the document buffer.
len - the length (in chars) of the artifact.
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException
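
The conversion described above, new String(buffer, offset, len), can be sketched as follows. The class ArtifactText is a hypothetical helper and not part of attoparser. The buffer and the offset and length values are picked by hand for illustration; which span the parser actually reports is not shown here.

```java
// Hypothetical helper, not part of attoparser: turns a reported buffer range into a String.
final class ArtifactText {

    // Copies len chars starting at offset out of the buffer into a new String.
    static String asString(char[] buffer, int offset, int len) {
        return new String(buffer, offset, len);
    }

    public static void main(String[] args) {
        // Hand-made buffer; offset 4 and length 6 select the chars between "<!--" and "-->".
        char[] buffer = "<!-- note -->".toCharArray();
        System.out.println("[" + asString(buffer, 4, 6) + "]");
    }
}
```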

handleCDATASection

public void handleCDATASection(char[] buffer,
                               int offset,
                               int len,
                               int line,
                               int col)
                        throws AttoParseException
Description copied from class: AbstractStandardXmlAttoHandler

Called when a CDATA section is found.

This artifact is returned as a char[] instead of a String because its content can be large. To convert it into a String, use new String(buffer, offset, len).

Overrides:
handleCDATASection in class AbstractStandardXmlAttoHandler
Parameters:
buffer - the document buffer.
offset - the offset of the artifact in the document buffer.
len - the length (in chars) of the artifact.
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException

handleText

public void handleText(char[] buffer,
                       int offset,
                       int len,
                       int line,
                       int col)
                throws AttoParseException
Description copied from interface: IAttoHandler

Called when a text artifact is found.

A sequence of chars is considered to be text when no structures of any kind are contained inside it. In markup parsers, for example, this means no tags (a.k.a. elements), DOCTYPEs, processing instructions, etc. are contained in the sequence.

Text sequences might include any number of new line and/or control characters.

Text artifacts are reported using the document buffer directly, and this buffer should not be considered immutable. Reported texts should therefore be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same range.

Implementations of this handler should never modify the document buffer.

Specified by:
handleText in interface IAttoHandler
Overrides:
handleText in class AbstractAttoHandler
Parameters:
buffer - the document buffer (not copied)
offset - the offset (position in buffer) where the text artifact starts.
len - the length (in chars) of the text artifact, starting at offset.
line - the line in the original document where this text artifact starts.
col - the column in the original document where this text artifact starts.
Throws:
AttoParseException
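
Since the document buffer may change after the call returns, text that needs to be stored should be copied first. The following is a minimal sketch of both copying options mentioned above. The class TextCopy is a hypothetical helper and not part of attoparser, and overwriting the buffer with Arrays.fill only simulates reuse; how the parser actually reuses its buffer is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper, not part of attoparser: copies reported text out of a shared buffer.
final class TextCopy {

    // Copies len chars starting at offset into a new, independent char[].
    static char[] copyChars(char[] buffer, int offset, int len) {
        return Arrays.copyOfRange(buffer, offset, offset + len);
    }

    // Creates an independent String from the same range.
    static String copyString(char[] buffer, int offset, int len) {
        return new String(buffer, offset, len);
    }

    public static void main(String[] args) {
        char[] buffer = "hello world".toCharArray();
        String stored = copyString(buffer, 0, 5);
        Arrays.fill(buffer, 'x');        // simulates the buffer being overwritten later
        System.out.println(stored);      // the copy is unaffected by the overwrite
    }
}
```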

handleProcessingInstruction

public void handleProcessingInstruction(String target,
                                        String content,
                                        int line,
                                        int col)
                                 throws AttoParseException
Description copied from class: AbstractStandardXmlAttoHandler

Called when a Processing Instruction is found.

Overrides:
handleProcessingInstruction in class AbstractStandardXmlAttoHandler
Parameters:
target - the target specified in the processing instruction.
content - the content of the processing instruction, if specified (might be null).
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException


Copyright © 2013 The ATTOPARSER team. All Rights Reserved.