org.attoparser.markup.duplicate
Class DuplicatingBasicMarkupAttoHandler

Object
  extended by org.attoparser.AbstractAttoHandler
      extended by org.attoparser.markup.AbstractBasicMarkupAttoHandler
          extended by org.attoparser.markup.duplicate.DuplicatingBasicMarkupAttoHandler
All Implemented Interfaces:
IAttoHandler, ITimedDocumentHandling, IBasicDocTypeHandling, IBasicElementHandling, ICDATASectionHandling, ICommentHandling, IProcessingInstructionHandling, IXmlDeclarationHandling

public final class DuplicatingBasicMarkupAttoHandler
extends AbstractBasicMarkupAttoHandler

Since:
1.0
Author:
Daniel Fernández

Constructor Summary
DuplicatingBasicMarkupAttoHandler(Writer writer)
           
 
Method Summary
 void handleCDATASection(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col)
           Called when a CDATA section is found.
 void handleCloseElement(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col)
           Called when a close element (a close tag) is found.
 void handleComment(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col)
           Called when a comment is found.
 void handleDocType(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col)
           Called when a DOCTYPE clause is found.
 void handleDocumentEnd(long endTimeNanos, long totalTimeNanos, int line, int col)
           Called at the end of document parsing, adding timing information.
 void handleDocumentStart(long startTimeNanos, int line, int col)
           Called at the beginning of document parsing, adding timing information.
 void handleOpenElement(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col)
           Called when an open element (an open tag) is found.
 void handleProcessingInstruction(char[] buffer, int targetOffset, int targetLen, int targetLine, int targetCol, int contentOffset, int contentLen, int contentLine, int contentCol, int outerOffset, int outerLen, int line, int col)
           Called when a Processing Instruction is found.
 void handleStandaloneElement(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col)
           Called when a standalone element (a minimized tag) is found.
 void handleText(char[] buffer, int offset, int len, int line, int col)
           Called when a text artifact is found.
 void handleXmlDeclaration(char[] buffer, int keywordOffset, int keywordLen, int keywordLine, int keywordCol, int versionOffset, int versionLen, int versionLine, int versionCol, int encodingOffset, int encodingLen, int encodingLine, int encodingCol, int standaloneOffset, int standaloneLen, int standaloneLine, int standaloneCol, int outerOffset, int outerLen, int line, int col)
           Called when an XML Declaration is found.
 
Methods inherited from class org.attoparser.markup.AbstractBasicMarkupAttoHandler
handleStructure
 
Methods inherited from class org.attoparser.AbstractAttoHandler
getEndTimeNanos, getStartTimeNanos, getTotalTimeNanos, handleDocumentEnd, handleDocumentStart
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DuplicatingBasicMarkupAttoHandler

public DuplicatingBasicMarkupAttoHandler(Writer writer)

Method Detail

handleDocumentStart

public void handleDocumentStart(long startTimeNanos,
                                int line,
                                int col)
                         throws AttoParseException
Description copied from interface: ITimedDocumentHandling

Called at the beginning of document parsing, adding timing information.

Specified by:
handleDocumentStart in interface ITimedDocumentHandling
Overrides:
handleDocumentStart in class AbstractAttoHandler
Parameters:
startTimeNanos - the current time (in nanoseconds) obtained when parsing starts.
line - the line of the document where parsing starts (usually number 1)
col - the column of the document where parsing starts (usually number 1)
Throws:
AttoParseException

handleDocumentEnd

public void handleDocumentEnd(long endTimeNanos,
                              long totalTimeNanos,
                              int line,
                              int col)
                       throws AttoParseException
Description copied from interface: ITimedDocumentHandling

Called at the end of document parsing, adding timing information.

Specified by:
handleDocumentEnd in interface ITimedDocumentHandling
Overrides:
handleDocumentEnd in class AbstractAttoHandler
Parameters:
endTimeNanos - the current time (in nanoseconds) obtained when parsing ends.
totalTimeNanos - the difference between current times at the start and end of parsing (in nanoseconds)
line - the line of the document where parsing ends (usually the last one)
col - the column of the document where the parsing ends (usually the last one)
Throws:
AttoParseException
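
The relation between the timing parameters can be sketched as follows. This is only an illustration, not the library's code: the class TimingSketch and its methods are hypothetical names, and the sketch assumes nothing beyond what the parameter descriptions state (totalTimeNanos is the difference between the end and start times).

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of attoparser.
final class TimingSketch {

    // Total parsing time as described for totalTimeNanos: end minus start.
    static long totalTimeNanos(long startTimeNanos, long endTimeNanos) {
        return endTimeNanos - startTimeNanos;
    }

    // Converts a nanosecond duration to whole milliseconds (truncating).
    static long toMillis(long nanos) {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }
}
```

A handler that wants to log the duration could pass the totalTimeNanos it receives to a conversion like toMillis before printing it.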

handleDocType

public void handleDocType(char[] buffer,
                          int contentOffset,
                          int contentLen,
                          int outerOffset,
                          int outerLen,
                          int line,
                          int col)
                   throws AttoParseException
Description copied from interface: IBasicDocTypeHandling

Called when a DOCTYPE clause is found.

This method reports the DOCTYPE clause as a whole, not splitting it into its different parts (root element name, publicId, etc.). This splitting should normally be done by implementations of the IDetailedDocTypeHandling interface (like AbstractDetailedMarkupAttoHandler).

Two [offset, len] pairs are provided for two partitions (outer and content) of the DOCTYPE clause:

<!DOCTYPE html PUBLIC "..." "...">
| [CONTENT----------------------]|
[OUTER---------------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleDocType in interface IBasicDocTypeHandling
Overrides:
handleDocType in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
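
As a rough illustration of the two partitions, the sketch below copies each one out of a buffer holding the DOCTYPE example from the diagram. The class DocTypePartitions and the method copy are hypothetical names; in real use the parser supplies the offsets and lengths, whereas here they would be read off the diagram by hand.

```java
// Hypothetical helper, not part of attoparser.
final class DocTypePartitions {

    // Copies len chars starting at offset into a new String, so the result
    // stays valid even if the document buffer is later overwritten.
    static String copy(char[] buffer, int offset, int len) {
        return new String(buffer, offset, len);
    }
}
```

With the 34-char example clause at the start of the buffer, copy(buffer, 0, 34) returns the whole clause (the outer partition) and copy(buffer, 2, 31) returns the content partition, which starts at DOCTYPE and stops before the closing >.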

handleStandaloneElement

public void handleStandaloneElement(char[] buffer,
                                    int contentOffset,
                                    int contentLen,
                                    int outerOffset,
                                    int outerLen,
                                    int line,
                                    int col)
                             throws AttoParseException
Description copied from interface: IBasicElementHandling

Called when a standalone element (a minimized tag) is found.

This method reports the element as a whole, not splitting it into its different parts (element name, attributes). This splitting should normally be done by implementations of the IDetailedElementHandling interface (like AbstractDetailedMarkupAttoHandler).

Two [offset, len] pairs are provided for two partitions (outer and content) of the element:

<img src="/images/logo.png"/>
|[CONTENT-----------------] |
[OUTER----------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleStandaloneElement in interface IBasicElementHandling
Overrides:
handleStandaloneElement in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
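
To give an idea of the kind of splitting that is left to detailed handlers, here is a deliberately naive sketch that takes the element name to be the leading run of non-whitespace chars in the content partition. ElementNameSketch and elementName are hypothetical names, and this is not how AbstractDetailedMarkupAttoHandler is implemented.

```java
// Hypothetical helper, not part of attoparser.
final class ElementNameSketch {

    // Returns the chars of the content partition up to (not including)
    // the first whitespace char, or the whole partition if there is none.
    static String elementName(char[] buffer, int contentOffset, int contentLen) {
        int limit = contentOffset + contentLen;
        int end = contentOffset;
        while (end < limit && !Character.isWhitespace(buffer[end])) {
            end++;
        }
        return new String(buffer, contentOffset, end - contentOffset);
    }
}
```

For the img example in the diagram (content partition at offset 1 with length 26), elementName returns "img".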

handleOpenElement

public void handleOpenElement(char[] buffer,
                              int contentOffset,
                              int contentLen,
                              int outerOffset,
                              int outerLen,
                              int line,
                              int col)
                       throws AttoParseException
Description copied from interface: IBasicElementHandling

Called when an open element (an open tag) is found.

This method reports the element as a whole, not splitting it into its different parts (element name, attributes). This splitting should normally be done by implementations of the IDetailedElementHandling interface (like AbstractDetailedMarkupAttoHandler).

Two [offset, len] pairs are provided for two partitions (outer and content) of the element:

<div class="main_section">
|[CONTENT---------------]|
[OUTER-------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleOpenElement in interface IBasicElementHandling
Overrides:
handleOpenElement in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
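
This page does not describe what the duplicating handler does with each event. Judging only by the class name and the Writer passed to the constructor, one plausible reading is that every outer partition is written to the Writer unchanged. The sketch below illustrates that assumption with hypothetical names (DuplicationSketch, onOpenElement, duplicate); it is not the library's implementation.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch, not the attoparser implementation.
final class DuplicationSketch {

    private final Writer writer;

    DuplicationSketch(Writer writer) {
        this.writer = writer;
    }

    // Takes the buffer and outer partition an open-element callback would
    // receive and writes that partition to the Writer unchanged.
    void onOpenElement(char[] buffer, int outerOffset, int outerLen) throws IOException {
        writer.write(buffer, outerOffset, outerLen);
    }

    // Convenience for trying the sketch: duplicates one partition into a String.
    static String duplicate(String document, int outerOffset, int outerLen) {
        StringWriter out = new StringWriter();
        try {
            new DuplicationSketch(out).onOpenElement(document.toCharArray(), outerOffset, outerLen);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Writing directly from the buffer avoids creating an intermediate String, which is safe here because nothing is stored after the call returns.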

handleCloseElement

public void handleCloseElement(char[] buffer,
                               int contentOffset,
                               int contentLen,
                               int outerOffset,
                               int outerLen,
                               int line,
                               int col)
                        throws AttoParseException
Description copied from interface: IBasicElementHandling

Called when a close element (a close tag) is found.

This method reports the element as a whole, not splitting it into its different parts (element name). This splitting should normally be done by implementations of the IDetailedElementHandling interface (like AbstractDetailedMarkupAttoHandler).

Two [offset, len] pairs are provided for two partitions (outer and content) of the element:

</div>
| [C]|
[OUTE]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleCloseElement in interface IBasicElementHandling
Overrides:
handleCloseElement in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
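
Copying is only needed when a reported structure has to outlive the callback. When a handler merely inspects the artifact, it can read the buffer in place. The sketch below compares the content partition of a close tag with an expected name without allocating anything; CloseElementSketch and contentEquals are hypothetical names.

```java
// Hypothetical helper, not part of attoparser.
final class CloseElementSketch {

    // Compares the content partition with an expected name in place,
    // reading the buffer but neither copying nor modifying it.
    static boolean contentEquals(char[] buffer, int contentOffset, int contentLen, String expected) {
        if (expected.length() != contentLen) {
            return false;
        }
        for (int i = 0; i < contentLen; i++) {
            if (buffer[contentOffset + i] != expected.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

For the close tag in the diagram, the content partition sits at offset 2 with length 3, so contentEquals(buffer, 2, 3, "div") is true.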

handleText

public void handleText(char[] buffer,
                       int offset,
                       int len,
                       int line,
                       int col)
                throws AttoParseException
Description copied from interface: IAttoHandler

Called when a text artifact is found.

A sequence of chars is considered to be text when no structures of any kind are contained inside it. In markup parsers, for example, this means no tags (a.k.a. elements), DOCTYPEs, processing instructions, etc. are contained in the sequence.

Text sequences might include any number of new line and/or control characters.

Text artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported texts should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleText in interface IAttoHandler
Overrides:
handleText in class AbstractAttoHandler
Parameters:
buffer - the document buffer (not copied)
offset - the offset (position in buffer) where the text artifact starts.
len - the length (in chars) of the text artifact, starting in offset.
line - the line in the original document where this text artifact starts.
col - the column in the original document where this text artifact starts.
Throws:
AttoParseException
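
The two copying strategies mentioned above can be sketched as follows. TextCopySketch and its methods are hypothetical names; the only calls used are the standard String(char[], int, int) constructor and Arrays.copyOfRange.

```java
import java.util.Arrays;

// Hypothetical helper, not part of attoparser.
final class TextCopySketch {

    // Strategy 1: create a String from the reported offset and len.
    static String storeAsString(char[] buffer, int offset, int len) {
        return new String(buffer, offset, len);
    }

    // Strategy 2: copy len chars from the buffer, starting at offset,
    // into a new array that the handler owns.
    static char[] storeAsChars(char[] buffer, int offset, int len) {
        return Arrays.copyOfRange(buffer, offset, offset + len);
    }
}
```

Either copy is independent of the buffer: if the buffer is overwritten after the callback returns, the stored value keeps the original text.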

handleComment

public void handleComment(char[] buffer,
                          int contentOffset,
                          int contentLen,
                          int outerOffset,
                          int outerLen,
                          int line,
                          int col)
                   throws AttoParseException
Description copied from interface: ICommentHandling

Called when a comment is found.

Two [offset, len] pairs are provided for two partitions (outer and content):

<!-- this is a comment -->
|   [CONTENT----------]  |
[OUTER-------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleComment in interface ICommentHandling
Overrides:
handleComment in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
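
In the diagram, the content partition includes the spaces next to the comment delimiters. A handler interested only in the comment text might strip them after copying, as in this sketch (CommentSketch and commentText are hypothetical names).

```java
// Hypothetical helper, not part of attoparser.
final class CommentSketch {

    // Copies the content partition, then removes leading and trailing
    // whitespace with String.trim() (chars up to and including U+0020).
    static String commentText(char[] buffer, int contentOffset, int contentLen) {
        return new String(buffer, contentOffset, contentLen).trim();
    }
}
```

For the example comment, the content partition is at offset 4 with length 19, and commentText returns "this is a comment".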

handleCDATASection

public void handleCDATASection(char[] buffer,
                               int contentOffset,
                               int contentLen,
                               int outerOffset,
                               int outerLen,
                               int line,
                               int col)
                        throws AttoParseException
Description copied from interface: ICDATASectionHandling

Called when a CDATA section is found.

Two [offset, len] pairs are provided for two partitions (outer and content):

<![CDATA[ this is a CDATA section ]]>
|        [CONTENT----------------]  |
[OUTER------------------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleCDATASection in interface ICDATASectionHandling
Overrides:
handleCDATASection in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
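
The relation between the two partitions in the diagram can be expressed in terms of the CDATA delimiters. The parser already reports contentOffset and contentLen, so the sketch below is only meant to show how they relate to the outer partition; CdataSketch and contentOf are hypothetical names.

```java
// Hypothetical helper, not part of attoparser.
final class CdataSketch {

    private static final String PREFIX = "<![CDATA[";
    private static final String SUFFIX = "]]>";

    // Derives the content partition from the outer one by skipping the
    // opening and closing delimiters, then copies it into a String.
    static String contentOf(char[] buffer, int outerOffset, int outerLen) {
        int contentOffset = outerOffset + PREFIX.length();
        int contentLen = outerLen - PREFIX.length() - SUFFIX.length();
        return new String(buffer, contentOffset, contentLen);
    }
}
```

For the 37-char example section this gives a content partition at offset 9 with length 25, matching the diagram.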

handleXmlDeclaration

public void handleXmlDeclaration(char[] buffer,
                                 int keywordOffset,
                                 int keywordLen,
                                 int keywordLine,
                                 int keywordCol,
                                 int versionOffset,
                                 int versionLen,
                                 int versionLine,
                                 int versionCol,
                                 int encodingOffset,
                                 int encodingLen,
                                 int encodingLine,
                                 int encodingCol,
                                 int standaloneOffset,
                                 int standaloneLen,
                                 int standaloneLine,
                                 int standaloneCol,
                                 int outerOffset,
                                 int outerLen,
                                 int line,
                                 int col)
                          throws AttoParseException
Description copied from interface: IXmlDeclarationHandling

Called when an XML Declaration is found.

Five [offset, len] pairs are provided for five partitions (outer, keyword, version, encoding and standalone):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
| [K]          [V]            [ENC]              [S]  |
[OUTER------------------------------------------------]

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleXmlDeclaration in interface IXmlDeclarationHandling
Overrides:
handleXmlDeclaration in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
keywordOffset - offset for the keyword partition.
keywordLen - length of the keyword partition.
keywordLine - the line in the original document where the keyword partition starts.
keywordCol - the column in the original document where the keyword partition starts.
versionOffset - offset for the version partition.
versionLen - length of the version partition.
versionLine - the line in the original document where the version partition starts.
versionCol - the column in the original document where the version partition starts.
encodingOffset - offset for the encoding partition.
encodingLen - length of the encoding partition.
encodingLine - the line in the original document where the encoding partition starts.
encodingCol - the column in the original document where the encoding partition starts.
standaloneOffset - offset for the standalone partition.
standaloneLen - length of the standalone partition.
standaloneLine - the line in the original document where the standalone partition starts.
standaloneCol - the column in the original document where the standalone partition starts.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
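
As an illustration of how the keyword, version, encoding and standalone partitions might be consumed, the sketch below copies each one and joins them into a summary line. XmlDeclarationSketch and summarize are hypothetical names, and the sketch assumes all four partitions are present, as in the diagram.

```java
// Hypothetical helper, not part of attoparser.
final class XmlDeclarationSketch {

    // Copies the four inner partitions out of the buffer and formats them.
    static String summarize(char[] buffer,
                            int keywordOffset, int keywordLen,
                            int versionOffset, int versionLen,
                            int encodingOffset, int encodingLen,
                            int standaloneOffset, int standaloneLen) {
        String keyword = new String(buffer, keywordOffset, keywordLen);
        String version = new String(buffer, versionOffset, versionLen);
        String encoding = new String(buffer, encodingOffset, encodingLen);
        String standalone = new String(buffer, standaloneOffset, standaloneLen);
        return "keyword=" + keyword + ", version=" + version
                + ", encoding=" + encoding + ", standalone=" + standalone;
    }
}
```

Reading the offsets off the diagram (keyword at 2, version at 15, encoding at 30, standalone at 49), the example declaration yields keyword=xml, version=1.0, encoding=UTF-8, standalone=yes. Note that the value partitions exclude the surrounding quotes.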

handleProcessingInstruction

public void handleProcessingInstruction(char[] buffer,
                                        int targetOffset,
                                        int targetLen,
                                        int targetLine,
                                        int targetCol,
                                        int contentOffset,
                                        int contentLen,
                                        int contentLine,
                                        int contentCol,
                                        int outerOffset,
                                        int outerLen,
                                        int line,
                                        int col)
                                 throws AttoParseException
Description copied from interface: IProcessingInstructionHandling

Called when a Processing Instruction is found.

Three [offset, len] pairs are provided for three partitions (outer, target and content):

<?xls-stylesheet somePar1="a" somePar2="b"?>
| [TARGET------] [CONTENT----------------] |
[OUTER-------------------------------------]

Note that, although XML Declarations have the same format as processing instructions, they are not considered as such and therefore are handled by a different interface (IXmlDeclarationHandling).

Artifacts are reported using the document buffer directly. This buffer should not be considered immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting at offset, or by creating a String from the same offset and len).

Implementations of this handler should never modify the document buffer.

Specified by:
handleProcessingInstruction in interface IProcessingInstructionHandling
Overrides:
handleProcessingInstruction in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied)
targetOffset - offset for the target partition.
targetLen - length of the target partition.
targetLine - the line in the original document where the target partition starts.
targetCol - the column in the original document where the target partition starts.
contentOffset - offset for the content partition.
contentLen - length of the content partition.
contentLine - the line in the original document where the content partition starts.
contentCol - the column in the original document where the content partition starts.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws:
AttoParseException
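
If a handler needs to keep processing instructions around after the callback returns, it could copy the target and content partitions into a small immutable holder. The class below is a hypothetical sketch of that idea, not an attoparser type.

```java
// Hypothetical value holder, not part of attoparser.
final class ProcessingInstructionSketch {

    final String target;
    final String content;

    // Copies both partitions so the holder does not depend on the buffer.
    ProcessingInstructionSketch(char[] buffer,
                                int targetOffset, int targetLen,
                                int contentOffset, int contentLen) {
        this.target = new String(buffer, targetOffset, targetLen);
        this.content = new String(buffer, contentOffset, contentLen);
    }
}
```

For the example in the diagram, the target partition is at offset 2 with length 14 and the content partition at offset 17 with length 25.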


Copyright © 2013 The ATTOPARSER team. All Rights Reserved.