org.attoparser.markup.html
Class AbstractStandardNonValidatingHtmlAttoHandler

Object
  extended by org.attoparser.AbstractAttoHandler
      extended by org.attoparser.markup.AbstractBasicMarkupAttoHandler
          extended by org.attoparser.markup.AbstractDetailedMarkupAttoHandler
              extended by org.attoparser.markup.html.AbstractDetailedNonValidatingHtmlAttoHandler
                  extended by org.attoparser.markup.html.AbstractStandardNonValidatingHtmlAttoHandler
All Implemented Interfaces:
IAttoHandler, ITimedDocumentHandling, IDetailedHtmlElementHandling, IHtmlAttributeSequenceHandling, IAttributeSequenceHandling, IBasicDocTypeHandling, IBasicElementHandling, ICDATASectionHandling, ICommentHandling, IDetailedDocTypeHandling, IDetailedElementHandling, IProcessingInstructionHandling, IXmlDeclarationHandling

public abstract class AbstractStandardNonValidatingHtmlAttoHandler
extends AbstractDetailedNonValidatingHtmlAttoHandler

Since:
1.1
Author:
Daniel Fernández

Constructor Summary
protected AbstractStandardNonValidatingHtmlAttoHandler(HtmlParsingConfiguration configuration)
           
 
Method Summary
 void handleCDATASection(char[] buffer, int offset, int len, int line, int col)
           Called when a CDATA section is found.
 void handleCDATASection(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col)
           Called when a CDATA section is found.
 void handleComment(char[] buffer, int offset, int len, int line, int col)
           Called when a comment is found.
 void handleComment(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col)
           Called when a comment is found.
 void handleDocType(char[] buffer, int keywordOffset, int keywordLen, int keywordLine, int keywordCol, int elementNameOffset, int elementNameLen, int elementNameLine, int elementNameCol, int typeOffset, int typeLen, int typeLine, int typeCol, int publicIdOffset, int publicIdLen, int publicIdLine, int publicIdCol, int systemIdOffset, int systemIdLen, int systemIdLine, int systemIdCol, int internalSubsetOffset, int internalSubsetLen, int internalSubsetLine, int internalSubsetCol, int outerOffset, int outerLen, int outerLine, int outerCol)
           Called when a DOCTYPE clause is found.
 void handleDocType(String elementName, String publicId, String systemId, String internalSubset, int line, int col)
           Called when a DOCTYPE clause is found.
 void handleHtmlAttribute(char[] buffer, int nameOffset, int nameLen, int nameLine, int nameCol, int operatorOffset, int operatorLen, int operatorLine, int operatorCol, int valueContentOffset, int valueContentLen, int valueOuterOffset, int valueOuterLen, int valueLine, int valueCol)
           Called when an attribute is found.
 void handleHtmlCloseElement(IHtmlElement element, String elementName, int line, int col)
           Called when a close element (a close tag) is found.
 void handleHtmlCloseElementEnd(IHtmlElement element, int line, int col)
           Called when the end of a close element (a close tag) is found.
 void handleHtmlCloseElementStart(IHtmlElement element, char[] buffer, int nameOffset, int nameLen, int line, int col)
           Called when the start of a close element (a close tag) is found.
 void handleHtmlInnerWhiteSpace(char[] buffer, int offset, int len, int line, int col)
           Called when an amount of white space is found inside an element.
 void handleHtmlOpenElement(IHtmlElement element, String elementName, Map<String,String> attributes, int line, int col)
           Called when an open element (an open tag) is found.
 void handleHtmlOpenElementEnd(IHtmlElement element, int line, int col)
           Called when the end of an open element (an open tag) is found.
 void handleHtmlOpenElementStart(IHtmlElement element, char[] buffer, int nameOffset, int nameLen, int line, int col)
           Called when the start of an open element (an open tag) is found.
 void handleHtmlStandaloneElement(IHtmlElement element, boolean minimized, String elementName, Map<String,String> attributes, int line, int col)
           Called when a standalone element (a tag with no matching close tag, minimized or not) is found.
 void handleHtmlStandaloneElementEnd(IHtmlElement element, boolean minimized, int line, int col)
           Called when the end of a standalone element is found.
 void handleHtmlStandaloneElementStart(IHtmlElement element, boolean minimized, char[] buffer, int nameOffset, int nameLen, int line, int col)
           Called when the start of a standalone element is found.
 void handleProcessingInstruction(char[] buffer, int targetOffset, int targetLen, int targetLine, int targetCol, int contentOffset, int contentLen, int contentLine, int contentCol, int outerOffset, int outerLen, int line, int col)
           Called when a Processing Instruction is found.
 void handleProcessingInstruction(String target, String content, int line, int col)
           Called when a Processing Instruction is found.
 void handleXmlDeclaration(String version, String encoding, String standalone, int line, int col)
           Called when an XML Declaration is found.
 void handleXmlDeclarationDetail(char[] buffer, int keywordOffset, int keywordLen, int keywordLine, int keywordCol, int versionOffset, int versionLen, int versionLine, int versionCol, int encodingOffset, int encodingLen, int encodingLine, int encodingCol, int standaloneOffset, int standaloneLen, int standaloneLine, int standaloneCol, int outerOffset, int outerLen, int line, int col)
           Called when an XML Declaration is found when using a handler extending from AbstractDetailedMarkupAttoHandler.
 
Methods inherited from class org.attoparser.markup.html.AbstractDetailedNonValidatingHtmlAttoHandler
handleAttribute, handleAutoCloseElementEnd, handleAutoCloseElementStart, handleCloseElementEnd, handleCloseElementStart, handleDocumentEnd, handleDocumentEnd, handleDocumentStart, handleDocumentStart, handleInnerWhiteSpace, handleOpenElementEnd, handleOpenElementStart, handleStandaloneElementEnd, handleStandaloneElementStart, handleUnmatchedCloseElementEnd, handleUnmatchedCloseElementStart
 
Methods inherited from class org.attoparser.markup.AbstractDetailedMarkupAttoHandler
handleCloseElement, handleDocType, handleDocumentEnd, handleDocumentStart, handleOpenElement, handleStandaloneElement, handleXmlDeclaration
 
Methods inherited from class org.attoparser.markup.AbstractBasicMarkupAttoHandler
handleStructure
 
Methods inherited from class org.attoparser.AbstractAttoHandler
getEndTimeNanos, getStartTimeNanos, getTotalTimeNanos, handleDocumentEnd, handleDocumentStart, handleText
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractStandardNonValidatingHtmlAttoHandler

protected AbstractStandardNonValidatingHtmlAttoHandler(HtmlParsingConfiguration configuration)

Method Detail

handleHtmlStandaloneElementStart

public final void handleHtmlStandaloneElementStart(IHtmlElement element,
                                                   boolean minimized,
                                                   char[] buffer,
                                                   int nameOffset,
                                                   int nameLen,
                                                   int line,
                                                   int col)
                                            throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling

Called when the start of a standalone element is found.

A standalone element can be either a minimized tag or not. For example, <img src="..." /> is minimized (self-closed), whereas <img src="..."> is non-minimized, which is perfectly valid in HTML but not in XML or XHTML.

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleHtmlStandaloneElementStart in interface IDetailedHtmlElementHandling
Overrides:
handleHtmlStandaloneElementStart in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
minimized - whether the tag representing this element is minimized (self-closed) or not.
buffer - the document buffer (not copied).
nameOffset - the offset (position in buffer) where the element name appears.
nameLen - the length (in chars) of the element name.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
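
The copy-before-storing advice above can be illustrated with a minimal, self-contained sketch. It does not use attoparser itself: the class ElementNameCopy and the helper copyName are hypothetical names, and the buffer is overwritten by hand to stand in for a parser that reuses its buffer after the event.

```java
import java.util.Arrays;

final class ElementNameCopy {

    // Copies nameLen chars starting at nameOffset into a new String, so the
    // value no longer depends on the contents of the shared buffer.
    static String copyName(char[] buffer, int nameOffset, int nameLen) {
        return new String(buffer, nameOffset, nameLen);
    }

    public static void main(String[] args) {
        char[] buffer = "<img src=\"logo.png\" />".toCharArray();

        // In this buffer the element name "img" starts at index 1 and is 3 chars long.
        String name = copyName(buffer, 1, 3);

        // Assumption for illustration: the buffer is overwritten after the event.
        Arrays.fill(buffer, ' ');

        System.out.println(name); // the copy is unaffected by the overwrite
    }
}
```

Storing the (buffer, nameOffset, nameLen) triple instead of the copied String would leave the stored name pointing at whatever the buffer holds later.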

handleHtmlStandaloneElementEnd

public final void handleHtmlStandaloneElementEnd(IHtmlElement element,
                                                 boolean minimized,
                                                 int line,
                                                 int col)
                                          throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling

Called when the end of a standalone element is found.

Specified by:
handleHtmlStandaloneElementEnd in interface IDetailedHtmlElementHandling
Overrides:
handleHtmlStandaloneElementEnd in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
minimized - whether the tag representing this element is minimized (self-closed) or not.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException

handleHtmlOpenElementStart

public final void handleHtmlOpenElementStart(IHtmlElement element,
                                             char[] buffer,
                                             int nameOffset,
                                             int nameLen,
                                             int line,
                                             int col)
                                      throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling

Called when the start of an open element (an open tag) is found.

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleHtmlOpenElementStart in interface IDetailedHtmlElementHandling
Overrides:
handleHtmlOpenElementStart in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
buffer - the document buffer (not copied).
nameOffset - the offset (position in buffer) where the element name appears.
nameLen - the length (in chars) of the element name.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException

handleHtmlOpenElementEnd

public final void handleHtmlOpenElementEnd(IHtmlElement element,
                                           int line,
                                           int col)
                                    throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling

Called when the end of an open element (an open tag) is found.

Specified by:
handleHtmlOpenElementEnd in interface IDetailedHtmlElementHandling
Overrides:
handleHtmlOpenElementEnd in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException

handleHtmlCloseElementStart

public final void handleHtmlCloseElementStart(IHtmlElement element,
                                              char[] buffer,
                                              int nameOffset,
                                              int nameLen,
                                              int line,
                                              int col)
                                       throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling

Called when the start of a close element (a close tag) is found.

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleHtmlCloseElementStart in interface IDetailedHtmlElementHandling
Overrides:
handleHtmlCloseElementStart in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
buffer - the document buffer (not copied).
nameOffset - the offset (position in buffer) where the element name appears.
nameLen - the length (in chars) of the element name.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException

handleHtmlCloseElementEnd

public final void handleHtmlCloseElementEnd(IHtmlElement element,
                                            int line,
                                            int col)
                                     throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling

Called when the end of a close element (a close tag) is found.

Specified by:
handleHtmlCloseElementEnd in interface IDetailedHtmlElementHandling
Overrides:
handleHtmlCloseElementEnd in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException

handleHtmlAttribute

public final void handleHtmlAttribute(char[] buffer,
                                      int nameOffset,
                                      int nameLen,
                                      int nameLine,
                                      int nameCol,
                                      int operatorOffset,
                                      int operatorLen,
                                      int operatorLine,
                                      int operatorCol,
                                      int valueContentOffset,
                                      int valueContentLen,
                                      int valueOuterOffset,
                                      int valueOuterLen,
                                      int valueLine,
                                      int valueCol)
                               throws AttoParseException
Description copied from interface: IHtmlAttributeSequenceHandling

Called when an attribute is found.

Four [offset, len] pairs are provided for four partitions (name, operator, valueContent and valueOuter):

class="basic_column"
[NAM]* [VALUECONTE]| (*) = [OPERATOR]
|     [VALUEOUTER--]
[OUTER-------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleHtmlAttribute in interface IHtmlAttributeSequenceHandling
Overrides:
handleHtmlAttribute in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
buffer - the document buffer (not copied)
nameOffset - offset for the name partition.
nameLen - length of the name partition.
nameLine - the line in the original document where the name partition starts.
nameCol - the column in the original document where the name partition starts.
operatorOffset - offset for the operator partition.
operatorLen - length of the operator partition.
operatorLine - the line in the original document where the operator partition starts.
operatorCol - the column in the original document where the operator partition starts.
valueContentOffset - offset for the valueContent partition.
valueContentLen - length of the valueContent partition.
valueOuterOffset - offset for the valueOuter partition.
valueOuterLen - length of the valueOuter partition.
valueLine - the line in the original document where the value (outer) partition starts.
valueCol - the column in the original document where the value (outer) partition starts.
Throws:
AttoParseException
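
As a rough illustration of how the name, operator, valueContent and valueOuter partitions relate to the buffer, the sketch below copies each one out of the sample attribute shown in the diagram. The class AttributePartitions and the Attribute value type are hypothetical, and the offsets are counted by hand for this sample rather than produced by the parser.

```java
final class AttributePartitions {

    // Hypothetical value type holding copies of the four partitions.
    static final class Attribute {
        final String name;
        final String operator;
        final String valueContent;
        final String valueOuter;

        Attribute(String name, String operator, String valueContent, String valueOuter) {
            this.name = name;
            this.operator = operator;
            this.valueContent = valueContent;
            this.valueOuter = valueOuter;
        }
    }

    // Copies every partition out of the buffer using its [offset, len] pair.
    static Attribute copy(char[] buffer,
                          int nameOffset, int nameLen,
                          int operatorOffset, int operatorLen,
                          int valueContentOffset, int valueContentLen,
                          int valueOuterOffset, int valueOuterLen) {
        return new Attribute(
                new String(buffer, nameOffset, nameLen),
                new String(buffer, operatorOffset, operatorLen),
                new String(buffer, valueContentOffset, valueContentLen),
                new String(buffer, valueOuterOffset, valueOuterLen));
    }

    public static void main(String[] args) {
        char[] buffer = "class=\"basic_column\"".toCharArray();

        // Hand-counted offsets: name [0,5], operator [5,1],
        // valueContent [7,12] (without quotes), valueOuter [6,14] (with quotes).
        Attribute a = copy(buffer, 0, 5, 5, 1, 7, 12, 6, 14);

        System.out.println(a.name + " | " + a.operator + " | "
                + a.valueContent + " | " + a.valueOuter);
    }
}
```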

handleHtmlInnerWhiteSpace

public final void handleHtmlInnerWhiteSpace(char[] buffer,
                                            int offset,
                                            int len,
                                            int line,
                                            int col)
                                     throws AttoParseException
Description copied from interface: IHtmlAttributeSequenceHandling

Called when an amount of white space is found inside an element.

These attribute separators can contain any amount of whitespace, including line feeds:

<div id="main"        class="basic_column">
              [ATTSEP]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleHtmlInnerWhiteSpace in interface IHtmlAttributeSequenceHandling
Overrides:
handleHtmlInnerWhiteSpace in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
buffer - the document buffer (not copied)
offset - offset for the artifact.
len - length of the artifact.
line - the line in the original document where the artifact starts.
col - the column in the original document where the artifact starts.
Throws:
AttoParseException

handleDocType

public final void handleDocType(char[] buffer,
                                int keywordOffset,
                                int keywordLen,
                                int keywordLine,
                                int keywordCol,
                                int elementNameOffset,
                                int elementNameLen,
                                int elementNameLine,
                                int elementNameCol,
                                int typeOffset,
                                int typeLen,
                                int typeLine,
                                int typeCol,
                                int publicIdOffset,
                                int publicIdLen,
                                int publicIdLine,
                                int publicIdCol,
                                int systemIdOffset,
                                int systemIdLen,
                                int systemIdLine,
                                int systemIdCol,
                                int internalSubsetOffset,
                                int internalSubsetLen,
                                int internalSubsetLine,
                                int internalSubsetCol,
                                int outerOffset,
                                int outerLen,
                                int outerLine,
                                int outerCol)
                         throws AttoParseException
Description copied from interface: IDetailedDocTypeHandling

Called when a DOCTYPE clause is found.

This method reports the DOCTYPE clause splitting it into its different parts.

Seven [offset, len] pairs are provided for seven partitions (outer, keyword, elementName, type, publicId, systemId and internalSubset) of the DOCTYPE clause:

<!DOCTYPE html PUBLIC ".........." ".........." [................]>
| [KEYWO] [EN] [TYPE]  [PUBLICID]   [SYSTEMID]   [INTERNALSUBSET] |
[OUTER------------------------------------------------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleDocType in interface IDetailedDocTypeHandling
Overrides:
handleDocType in class AbstractDetailedMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
keywordOffset - offset for the keyword partition.
keywordLen - length of the keyword partition.
keywordLine - the line in the original document where the keyword partition starts.
keywordCol - the column in the original document where the keyword partition starts.
elementNameOffset - offset for the elementName partition.
elementNameLen - length of the elementName partition.
elementNameLine - the line in the original document where the elementName partition starts.
elementNameCol - the column in the original document where the elementName partition starts.
typeOffset - offset for the type partition.
typeLen - length of the type partition.
typeLine - the line in the original document where the type partition starts.
typeCol - the column in the original document where the type partition starts.
publicIdOffset - offset for the publicId partition.
publicIdLen - length of the publicId partition.
publicIdLine - the line in the original document where the publicId partition starts.
publicIdCol - the column in the original document where the publicId partition starts.
systemIdOffset - offset for the systemId partition.
systemIdLen - length of the systemId partition.
systemIdLine - the line in the original document where the systemId partition starts.
systemIdCol - the column in the original document where the systemId partition starts.
internalSubsetOffset - offset for the internalSubset partition.
internalSubsetLen - length of the internalSubset partition.
internalSubsetLine - the line in the original document where the internalSubset partition starts.
internalSubsetCol - the column in the original document where the internalSubset partition starts.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
outerLine - the line in the original document where this artifact starts.
outerCol - the column in the original document where this artifact starts.
Throws:
AttoParseException
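
With this many [offset, len] pairs, one possible way to keep handler code readable is to bundle each pair into a small value object before working with it. The sketch below is an assumption-laden illustration, not part of attoparser: DocTypeSpans, Span and find are hypothetical names, and find locates tokens with String.indexOf only to stand in for the offsets the parser would report.

```java
final class DocTypeSpans {

    // Hypothetical helper: one [offset, len] pair into a document buffer.
    static final class Span {
        final int offset;
        final int len;

        Span(int offset, int len) {
            this.offset = offset;
            this.len = len;
        }

        // Copies the chars covered by this span out of the buffer.
        String copyFrom(char[] buffer) {
            return new String(buffer, offset, len);
        }
    }

    // Stand-in for parser-reported offsets: finds the first occurrence of token.
    static Span find(String document, String token) {
        int offset = document.indexOf(token);
        if (offset < 0) {
            throw new IllegalArgumentException("token not found: " + token);
        }
        return new Span(offset, token.length());
    }

    public static void main(String[] args) {
        String document = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">";
        char[] buffer = document.toCharArray();

        Span outer = new Span(0, buffer.length);
        Span keyword = find(document, "DOCTYPE");
        Span elementName = find(document, "html");
        Span type = find(document, "PUBLIC");

        System.out.println(keyword.copyFrom(buffer) + " at " + keyword.offset);
        System.out.println(elementName.copyFrom(buffer) + " at " + elementName.offset);
        System.out.println(type.copyFrom(buffer) + " at " + type.offset);
        System.out.println("outer length: " + outer.len);
    }
}
```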

handleComment

public final void handleComment(char[] buffer,
                                int contentOffset,
                                int contentLen,
                                int outerOffset,
                                int outerLen,
                                int line,
                                int col)
                         throws AttoParseException
Description copied from interface: ICommentHandling

Called when a comment is found.

Two [offset, len] pairs are provided for two partitions (outer and content):

<!-- this is a comment -->
|   [CONTENT----------]  |
[OUTER-------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleComment in interface ICommentHandling
Overrides:
handleComment in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
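
The relationship between the outer and content partitions can be sketched as follows: whatever lies inside outer but outside content is the comment's delimiters. The class CommentPartitions and its helpers are hypothetical, and the offsets are counted by hand for the sample comment from the diagram.

```java
final class CommentPartitions {

    // Chars between the start of outer and the start of content (the opening delimiter).
    static String prefix(char[] buffer, int contentOffset, int outerOffset) {
        return new String(buffer, outerOffset, contentOffset - outerOffset);
    }

    // Chars between the end of content and the end of outer (the closing delimiter).
    static String suffix(char[] buffer, int contentOffset, int contentLen,
                         int outerOffset, int outerLen) {
        int contentEnd = contentOffset + contentLen;
        int outerEnd = outerOffset + outerLen;
        return new String(buffer, contentEnd, outerEnd - contentEnd);
    }

    public static void main(String[] args) {
        char[] buffer = "<!-- this is a comment -->".toCharArray();

        // Hand-counted for this sample: content [4,19], outer [0,26].
        int contentOffset = 4, contentLen = 19, outerOffset = 0, outerLen = 26;

        System.out.println("[" + new String(buffer, contentOffset, contentLen) + "]");
        System.out.println(prefix(buffer, contentOffset, outerOffset));
        System.out.println(suffix(buffer, contentOffset, contentLen, outerOffset, outerLen));
    }
}
```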

handleCDATASection

public final void handleCDATASection(char[] buffer,
                                     int contentOffset,
                                     int contentLen,
                                     int outerOffset,
                                     int outerLen,
                                     int line,
                                     int col)
                              throws AttoParseException
Description copied from interface: ICDATASectionHandling

Called when a CDATA section is found.

Two [offset, len] pairs are provided for two partitions (outer and content):

<![CDATA[ this is a CDATA section ]]>
|        [CONTENT----------------]  |
[OUTER------------------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleCDATASection in interface ICDATASectionHandling
Overrides:
handleCDATASection in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
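
When a String is not wanted, the other option mentioned above, copying len chars out of the buffer, might look like the sketch below. CdataCharCopy and copyContent are hypothetical names, and the offsets are counted by hand for the sample CDATA section from the diagram.

```java
import java.util.Arrays;

final class CdataCharCopy {

    // Returns a fresh char[] holding contentLen chars starting at contentOffset.
    // Arrays.copyOfRange takes an exclusive end index, hence offset + len.
    static char[] copyContent(char[] buffer, int contentOffset, int contentLen) {
        return Arrays.copyOfRange(buffer, contentOffset, contentOffset + contentLen);
    }

    public static void main(String[] args) {
        char[] buffer = "<![CDATA[ this is a CDATA section ]]>".toCharArray();

        // Hand-counted for this sample: content [9,25].
        char[] content = copyContent(buffer, 9, 25);

        // The copy is independent: changing the buffer does not change it.
        buffer[10] = 'T';

        System.out.println("[" + new String(content) + "]");
    }
}
```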

handleXmlDeclarationDetail

public final void handleXmlDeclarationDetail(char[] buffer,
                                             int keywordOffset,
                                             int keywordLen,
                                             int keywordLine,
                                             int keywordCol,
                                             int versionOffset,
                                             int versionLen,
                                             int versionLine,
                                             int versionCol,
                                             int encodingOffset,
                                             int encodingLen,
                                             int encodingLine,
                                             int encodingCol,
                                             int standaloneOffset,
                                             int standaloneLen,
                                             int standaloneLine,
                                             int standaloneCol,
                                             int outerOffset,
                                             int outerLen,
                                             int line,
                                             int col)
                                      throws AttoParseException
Description copied from class: AbstractDetailedMarkupAttoHandler

Called when an XML Declaration is found when using a handler extending from AbstractDetailedMarkupAttoHandler.

Five [offset, len] pairs are provided for five partitions (outer, keyword, version, encoding and standalone):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
| [K]          [V]            [ENC]              [S]  |
[OUTER------------------------------------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Overrides:
handleXmlDeclarationDetail in class AbstractDetailedMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
keywordOffset - offset for the keyword partition.
keywordLen - length of the keyword partition.
keywordLine - the line in the original document where the keyword partition starts.
keywordCol - the column in the original document where the keyword partition starts.
versionOffset - offset for the version partition.
versionLen - length of the version partition.
versionLine - the line in the original document where the version partition starts.
versionCol - the column in the original document where the version partition starts.
encodingOffset - offset for the encoding partition.
encodingLen - length of the encoding partition.
encodingLine - the line in the original document where the encoding partition starts.
encodingCol - the column in the original document where the encoding partition starts.
standaloneOffset - offset for the standalone partition.
standaloneLen - length of the standalone partition.
standaloneLine - the line in the original document where the standalone partition starts.
standaloneCol - the column in the original document where the standalone partition starts.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException

handleProcessingInstruction

public final void handleProcessingInstruction(char[] buffer,
                                              int targetOffset,
                                              int targetLen,
                                              int targetLine,
                                              int targetCol,
                                              int contentOffset,
                                              int contentLen,
                                              int contentLine,
                                              int contentCol,
                                              int outerOffset,
                                              int outerLen,
                                              int line,
                                              int col)
                                       throws AttoParseException
Description copied from interface: IProcessingInstructionHandling

Called when a Processing Instruction is found.

Three [offset, len] pairs are provided for three partitions (outer, target and content):

<?xml-stylesheet somePar1="a" somePar2="b"?>
| [TARGET------] [CONTENT----------------] |
[OUTER-------------------------------------]

Note that, although XML Declarations have the same format as processing instructions, they are not considered as such and therefore are handled by a different interface (IXmlDeclarationHandling).

Artifacts are reported directly from the document buffer, which should not be considered immutable. Reported structures that need to be stored should therefore be copied, either by copying len chars from the buffer char[] starting at offset or by creating a String from the same offset and len.

Implementations of this handler should never modify the document buffer.

Specified by:
handleProcessingInstruction in interface IProcessingInstructionHandling
Overrides:
handleProcessingInstruction in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
targetOffset - offset for the target partition.
targetLen - length of the target partition.
targetLine - the line in the original document where the target partition starts.
targetCol - the column in the original document where the target partition starts.
contentOffset - offset for the content partition.
contentLen - length of the content partition.
contentLine - the line in the original document where the content partition starts.
contentCol - the column in the original document where the content partition starts.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
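
The copy-before-storing rule described above can be sketched as follows. This is a minimal, self-contained illustration and not code from the library: the class PartitionCopyDemo, its copyPartition helper and the hand-computed offsets are assumptions made for the example, and no attoparser class is involved.

```java
import java.util.Arrays;

class PartitionCopyDemo {

    // Copies one [offset, len] partition out of the shared buffer into a new String.
    static String copyPartition(char[] buffer, int offset, int len) {
        return new String(buffer, offset, len);
    }

    public static void main(String[] args) {
        char[] buffer = "<?xls-stylesheet somePar1=\"a\" somePar2=\"b\"?>".toCharArray();

        // Offsets as a parser might report them for this buffer (computed by hand here).
        int targetOffset = 2, targetLen = 14;
        int contentOffset = 17, contentLen = 25;
        int outerOffset = 0, outerLen = buffer.length;

        String target = copyPartition(buffer, targetOffset, targetLen);
        String content = copyPartition(buffer, contentOffset, contentLen);
        String outer = copyPartition(buffer, outerOffset, outerLen);

        // Simulate the buffer being overwritten later; the copies are unaffected.
        Arrays.fill(buffer, ' ');

        System.out.println(target);
        System.out.println(content);
        System.out.println(outer);
    }
}
```

The three Strings remain valid after the buffer is overwritten, which is the reason for copying before storing.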

handleHtmlStandaloneElement

public void handleHtmlStandaloneElement(IHtmlElement element,
                                        boolean minimized,
                                        String elementName,
                                        Map<String,String> attributes,
                                        int line,
                                        int col)
                                 throws AttoParseException

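Called when a standalone element is found, that is, an element reported as a single tag with no separate close tag. The tag may or may not be minimized (self-closed); see the minimized parameter.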

Note that the element attributes map can be null if no attributes are present.

Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
minimized - whether the tag representing this element is minimized (self-closed) or not.
elementName - the element name (e.g. "<img src="logo.png">" -> "img").
attributes - the element attributes map, or null if no attributes are present.
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException
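
Because the attributes map can be null, code that iterates it needs a guard. The following sketch shows one way to do that. The class StandaloneElementDemo and its describe helper are hypothetical names introduced for illustration, and the output format is not the library's own serialization.

```java
import java.util.Collections;
import java.util.Map;

class StandaloneElementDemo {

    // Hypothetical helper: renders a tag as text, treating a null map as "no attributes".
    static String describe(String elementName, Map<String, String> attributes, boolean minimized) {
        Map<String, String> safe =
                (attributes != null) ? attributes : Collections.<String, String>emptyMap();
        StringBuilder sb = new StringBuilder("<").append(elementName);
        for (Map.Entry<String, String> entry : safe.entrySet()) {
            sb.append(' ').append(entry.getKey())
              .append("=\"").append(entry.getValue()).append('"');
        }
        // The minimized flag decides between the self-closed and the plain form.
        sb.append(minimized ? "/>" : ">");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("br", null, true));
        System.out.println(describe("img", Collections.singletonMap("src", "logo.png"), false));
    }
}
```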

handleHtmlOpenElement

public void handleHtmlOpenElement(IHtmlElement element,
                                  String elementName,
                                  Map<String,String> attributes,
                                  int line,
                                  int col)
                           throws AttoParseException

Called when an open element (an open tag) is found.

Note that the element attributes map can be null if no attributes are present.

Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
elementName - the element name (e.g. "<div class="content">" -> "div").
attributes - the element attributes map, or null if no attributes are present.
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException

handleHtmlCloseElement

public void handleHtmlCloseElement(IHtmlElement element,
                                   String elementName,
                                   int line,
                                   int col)
                            throws AttoParseException

Called when a close element (a close tag) is found.

Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
elementName - the element name (e.g. "</div>" -> "div").
line - the line in the document where this element appears.
col - the column in the document where this element appears.
Throws:
AttoParseException
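
A handler that needs to know the current nesting can pair the open and close events itself. The sketch below is one possible approach under stated assumptions: ElementStackDemo, onOpen and onClose are hypothetical names standing in for logic that would be called from handleHtmlOpenElement and handleHtmlCloseElement, element names are assumed to arrive with consistent casing, and this page does not state whether events always arrive balanced.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ElementStackDemo {

    private final Deque<String> open = new ArrayDeque<String>();

    // Would be called from an open-element event.
    void onOpen(String elementName) {
        open.push(elementName);
    }

    // Would be called from a close-element event.
    // Returns true only if the name matches the innermost open element.
    boolean onClose(String elementName) {
        if (!open.isEmpty() && open.peek().equals(elementName)) {
            open.pop();
            return true;
        }
        return false;
    }

    int depth() {
        return open.size();
    }

    public static void main(String[] args) {
        ElementStackDemo stack = new ElementStackDemo();
        stack.onOpen("div");
        stack.onOpen("p");
        System.out.println(stack.onClose("div")); // mismatched: "p" is still open
        System.out.println(stack.onClose("p"));
        System.out.println(stack.depth());
    }
}
```

Standalone elements are deliberately not pushed, since they are reported through a separate event and have no close tag.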

handleDocType

public void handleDocType(String elementName,
                          String publicId,
                          String systemId,
                          String internalSubset,
                          int line,
                          int col)
                   throws AttoParseException

Called when a DOCTYPE clause is found.

Parameters:
elementName - the root element name present in the DOCTYPE clause (e.g. "html").
publicId - the public ID specified, if present (might be null).
systemId - the system ID specified, if present (might be null).
internalSubset - the internal subset specified, if present (might be null).
line - the line in the document where this DOCTYPE clause appears.
col - the column in the document where this DOCTYPE clause appears.
Throws:
AttoParseException
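
Since publicId, systemId and internalSubset might each be null, code that consumes them has to handle every combination. As a rough sketch, the hypothetical DocTypeDemo.render helper below rebuilds a DOCTYPE clause from the four values. The layout it produces is an assumption chosen for the example, not the library's output.

```java
class DocTypeDemo {

    // Hypothetical helper: rebuilds a DOCTYPE clause, skipping every part that is null.
    static String render(String elementName, String publicId, String systemId, String internalSubset) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(elementName);
        if (publicId != null) {
            sb.append(" PUBLIC \"").append(publicId).append('"');
            if (systemId != null) {
                sb.append(" \"").append(systemId).append('"');
            }
        } else if (systemId != null) {
            sb.append(" SYSTEM \"").append(systemId).append('"');
        }
        if (internalSubset != null) {
            sb.append(" [").append(internalSubset).append(']');
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        System.out.println(render("html", null, null, null));
        System.out.println(render("html",
                "-//W3C//DTD XHTML 1.0 Strict//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
                null));
    }
}
```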

handleComment

public void handleComment(char[] buffer,
                          int offset,
                          int len,
                          int line,
                          int col)
                   throws AttoParseException

Called when a comment is found.

This artifact is returned as a char[] instead of a String because its content can be large. To convert it into a String, use new String(buffer, offset, len).

Parameters:
buffer - the document buffer.
offset - the offset of the artifact in the document buffer.
len - the length (in chars) of the artifact.
line - the line in the document where this comment appears.
col - the column in the document where this comment appears.
Throws:
AttoParseException
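
The String conversion mentioned above can be sketched as follows. CommentTextDemo and commentText are hypothetical names. The example also assumes that offset and len delimit the text between the comment delimiters, which this page does not state explicitly.

```java
class CommentTextDemo {

    // Copies the reported range out of the shared buffer and trims surrounding whitespace.
    static String commentText(char[] buffer, int offset, int len) {
        return new String(buffer, offset, len).trim();
    }

    public static void main(String[] args) {
        char[] buffer = "<p><!-- note --></p>".toCharArray();
        // Assumed values: offset 7 and len 6 cover " note " between "<!--" and "-->".
        System.out.println(commentText(buffer, 7, 6));
    }
}
```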

handleCDATASection

public void handleCDATASection(char[] buffer,
                               int offset,
                               int len,
                               int line,
                               int col)
                        throws AttoParseException

Called when a CDATA section is found.

This artifact is returned as a char[] instead of a String because its content can be large. To convert it into a String, use new String(buffer, offset, len).

Parameters:
buffer - the document buffer.
offset - the offset of the artifact in the document buffer.
len - the length (in chars) of the artifact.
line - the line in the document where this CDATA section appears.
col - the column in the document where this CDATA section appears.
Throws:
AttoParseException
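
When the content is large, creating a String per event can be avoided by appending the reported range straight into an accumulator. The following is a small sketch of that idea. CDATAAccumulatorDemo and appendTo are hypothetical names, and the offset and len used in main are hand-computed on the assumption that they delimit the CDATA content.

```java
class CDATAAccumulatorDemo {

    // Appends the reported range to the sink without building an intermediate String.
    static void appendTo(StringBuilder sink, char[] buffer, int offset, int len) {
        sink.append(buffer, offset, len);
    }

    public static void main(String[] args) {
        char[] buffer = "<![CDATA[a < b]]>".toCharArray();
        StringBuilder sink = new StringBuilder();
        // Assumed values: offset 9 and len 5 cover "a < b" inside the CDATA delimiters.
        appendTo(sink, buffer, 9, 5);
        System.out.println(sink);
    }
}
```

StringBuilder.append(char[], int, int) copies the chars, so the accumulated text does not depend on the buffer afterwards.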

handleXmlDeclaration

public void handleXmlDeclaration(String version,
                                 String encoding,
                                 String standalone,
                                 int line,
                                 int col)
                          throws AttoParseException

Called when an XML Declaration is found.

Parameters:
version - the version value specified (cannot be null).
encoding - the encoding value specified (can be null).
standalone - the standalone value specified (can be null).
line - the line in the document where this XML Declaration appears.
col - the column in the document where this XML Declaration appears.
Throws:
AttoParseException
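
As encoding and standalone can be null, a consumer usually has to choose defaults. The sketch below shows one possible policy. XmlDeclarationDemo, resolveCharset and isStandalone are hypothetical names, and falling back to UTF-8 when no encoding is declared is an assumption of this example, not behavior documented on this page.

```java
import java.nio.charset.Charset;

class XmlDeclarationDemo {

    // Falls back to UTF-8 when the declaration carries no encoding (an assumption of this sketch).
    // Charset.forName throws an unchecked exception for illegal or unsupported names.
    static Charset resolveCharset(String encoding) {
        return Charset.forName(encoding != null ? encoding : "UTF-8");
    }

    // Treats a missing standalone value the same as "no".
    static boolean isStandalone(String standalone) {
        return "yes".equals(standalone);
    }

    public static void main(String[] args) {
        System.out.println(resolveCharset(null));
        System.out.println(resolveCharset("ISO-8859-1"));
        System.out.println(isStandalone(null));
    }
}
```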

handleProcessingInstruction

public void handleProcessingInstruction(String target,
                                        String content,
                                        int line,
                                        int col)
                                 throws AttoParseException

Called when a Processing Instruction is found.

Parameters:
target - the target specified in the processing instruction.
content - the content of the processing instruction, if specified (might be null).
line - the line in the document where this Processing Instruction appears.
col - the column in the document where this Processing Instruction appears.
Throws:
AttoParseException


Copyright © 2012 The ATTOPARSER team. All Rights Reserved.