Object
org.attoparser.AbstractAttoHandler
org.attoparser.markup.AbstractBasicMarkupAttoHandler
org.attoparser.markup.AbstractDetailedMarkupAttoHandler
org.attoparser.markup.html.AbstractDetailedNonValidatingHtmlAttoHandler
org.attoparser.markup.html.trace.HtmlCodeDisplayAttoHandler
public class HtmlCodeDisplayAttoHandler
Constructor Summary |
---|
HtmlCodeDisplayAttoHandler(String documentName, Writer writer, boolean createHtmlAsFragment) |
HtmlCodeDisplayAttoHandler(String documentName, Writer writer, HtmlParsingConfiguration configuration, boolean createHtmlFragment) |
Method Summary | | |
---|---|---|
void | handleCDATASection(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col) | Called when a CDATA section is found. |
void | handleComment(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col) | Called when a comment is found. |
void | handleDocType(char[] buffer, int keywordOffset, int keywordLen, int keywordLine, int keywordCol, int elementNameOffset, int elementNameLen, int elementNameLine, int elementNameCol, int typeOffset, int typeLen, int typeLine, int typeCol, int publicIdOffset, int publicIdLen, int publicIdLine, int publicIdCol, int systemIdOffset, int systemIdLen, int systemIdLine, int systemIdCol, int internalSubsetOffset, int internalSubsetLen, int internalSubsetLine, int internalSubsetCol, int outerOffset, int outerLen, int outerLine, int outerCol) | Called when a DOCTYPE clause is found. |
void | handleDocumentEnd(long endTimeNanos, long totalTimeNanos, int line, int col, HtmlParsingConfiguration configuration) | |
void | handleDocumentStart(long startTimeNanos, int line, int col, HtmlParsingConfiguration configuration) | |
void | handleHtmlAttribute(char[] buffer, int nameOffset, int nameLen, int nameLine, int nameCol, int operatorOffset, int operatorLen, int operatorLine, int operatorCol, int valueContentOffset, int valueContentLen, int valueOuterOffset, int valueOuterLen, int valueLine, int valueCol) | Called when an attribute is found. |
void | handleHtmlCloseElementEnd(IHtmlElement element, int line, int col) | Called when the end of a close element (a close tag) is found. |
void | handleHtmlCloseElementStart(IHtmlElement element, char[] buffer, int offset, int len, int line, int col) | Called when the start of a close element (a close tag) is found. |
void | handleHtmlInnerWhiteSpace(char[] buffer, int offset, int len, int line, int col) | Called when an amount of white space is found inside an element. |
void | handleHtmlOpenElementEnd(IHtmlElement element, int line, int col) | Called when the end of an open element (an open tag) is found. |
void | handleHtmlOpenElementStart(IHtmlElement element, char[] buffer, int offset, int len, int line, int col) | Called when the start of an open element (an open tag) is found. |
void | handleHtmlStandaloneElementEnd(IHtmlElement element, boolean minimized, int line, int col) | Called when the end of a standalone element is found. |
void | handleHtmlStandaloneElementStart(IHtmlElement element, boolean minimized, char[] buffer, int offset, int len, int line, int col) | Called when the start of a standalone element is found. |
void | handleProcessingInstruction(char[] buffer, int targetOffset, int targetLen, int targetLine, int targetCol, int contentOffset, int contentLen, int contentLine, int contentCol, int outerOffset, int outerLen, int line, int col) | Called when a Processing Instruction is found. |
void | handleText(char[] buffer, int offset, int len, int line, int col) | Called when a text artifact is found. |
void | handleXmlDeclarationDetail(char[] buffer, int keywordOffset, int keywordLen, int keywordLine, int keywordCol, int versionOffset, int versionLen, int versionLine, int versionCol, int encodingOffset, int encodingLen, int encodingLine, int encodingCol, int standaloneOffset, int standaloneLen, int standaloneLine, int standaloneCol, int outerOffset, int outerLen, int line, int col) | Called when an XML Declaration is found when using a handler extending from AbstractDetailedMarkupAttoHandler. |
String | tokenify(String text) | |
Methods inherited from class org.attoparser.markup.AbstractDetailedMarkupAttoHandler |
---|
handleCloseElement, handleDocType, handleDocumentEnd, handleDocumentStart, handleOpenElement, handleStandaloneElement, handleXmlDeclaration |
Methods inherited from class org.attoparser.markup.AbstractBasicMarkupAttoHandler |
---|
handleStructure |
Methods inherited from class org.attoparser.AbstractAttoHandler |
---|
getEndTimeNanos, getStartTimeNanos, getTotalTimeNanos, handleDocumentEnd, handleDocumentStart |
Methods inherited from class Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public HtmlCodeDisplayAttoHandler(String documentName, Writer writer, boolean createHtmlAsFragment)
public HtmlCodeDisplayAttoHandler(String documentName, Writer writer, HtmlParsingConfiguration configuration, boolean createHtmlFragment)
Method Detail |
---|
public String tokenify(String text)
public void handleDocumentStart(long startTimeNanos, int line, int col, HtmlParsingConfiguration configuration) throws AttoParseException
handleDocumentStart in class AbstractDetailedNonValidatingHtmlAttoHandler
Throws: AttoParseException
public void handleDocumentEnd(long endTimeNanos, long totalTimeNanos, int line, int col, HtmlParsingConfiguration configuration) throws AttoParseException
handleDocumentEnd in class AbstractDetailedNonValidatingHtmlAttoHandler
Throws: AttoParseException
public void handleHtmlStandaloneElementStart(IHtmlElement element, boolean minimized, char[] buffer, int offset, int len, int line, int col) throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling
Called when the start of a standalone element is found.
A standalone element can be either a minimized tag or not. For example: <img src="..." /> is minimized (self-closed), as opposed to <img src="..."> (non-minimized, perfectly valid from the HTML but not from the XML or XHTML standpoints).
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleHtmlStandaloneElementStart in interface IDetailedHtmlElementHandling
handleHtmlStandaloneElementStart in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
minimized - whether the tag representing this element is minimized (self-closed) or not.
buffer - the document buffer (not copied).
offset - the offset (position in buffer) where the element name appears.
len - the length (in chars) of the element name.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
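The copying advice above can be illustrated without the parser itself. The following is a minimal sketch using only the JDK; the class name `BufferCopyDemo`, the sample buffer and the hand-picked offsets are assumptions made for illustration and are not part of attoparser. It shows the two copying strategies the description mentions (a `String` built from the range, or a fresh `char[]`) and that both survive later changes to the shared buffer.

```java
import java.util.Arrays;

class BufferCopyDemo {

    /** Copies len chars starting at offset into a new String, which is safe to store. */
    static String copyAsString(char[] buffer, int offset, int len) {
        return new String(buffer, offset, len);
    }

    /** Copies the same range into a fresh char[], which is also safe to store. */
    static char[] copyAsChars(char[] buffer, int offset, int len) {
        return Arrays.copyOfRange(buffer, offset, offset + len);
    }

    public static void main(String[] args) {
        // Hypothetical document buffer; offsets below were worked out by hand.
        char[] buffer = "<img src=\"a.png\" />".toCharArray();
        int offset = 1; // the element name starts right after '<'
        int len = 3;    // "img"

        String name = copyAsString(buffer, offset, len);
        char[] nameChars = copyAsChars(buffer, offset, len);

        // Simulate the buffer contents changing after the event was reported.
        Arrays.fill(buffer, 'x');

        System.out.println(name);                  // img
        System.out.println(new String(nameChars)); // img
    }
}
```

Storing `buffer`, `offset` and `len` themselves instead of a copy would leave the stored reference pointing at whatever the buffer holds later.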
public void handleHtmlStandaloneElementEnd(IHtmlElement element, boolean minimized, int line, int col) throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling
Called when the end of a standalone element is found.
handleHtmlStandaloneElementEnd in interface IDetailedHtmlElementHandling
handleHtmlStandaloneElementEnd in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
minimized - whether the tag representing this element is minimized (self-closed) or not.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
public void handleHtmlOpenElementStart(IHtmlElement element, char[] buffer, int offset, int len, int line, int col) throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling
Called when the start of an open element (an open tag) is found.
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleHtmlOpenElementStart in interface IDetailedHtmlElementHandling
handleHtmlOpenElementStart in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
buffer - the document buffer (not copied).
offset - the offset (position in buffer) where the element name appears.
len - the length (in chars) of the element name.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
public void handleHtmlOpenElementEnd(IHtmlElement element, int line, int col) throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling
Called when the end of an open element (an open tag) is found.
handleHtmlOpenElementEnd in interface IDetailedHtmlElementHandling
handleHtmlOpenElementEnd in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
public void handleHtmlCloseElementStart(IHtmlElement element, char[] buffer, int offset, int len, int line, int col) throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling
Called when the start of a close element (a close tag) is found.
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleHtmlCloseElementStart in interface IDetailedHtmlElementHandling
handleHtmlCloseElementStart in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
buffer - the document buffer (not copied).
offset - the offset (position in buffer) where the element name appears.
len - the length (in chars) of the element name.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
public void handleHtmlCloseElementEnd(IHtmlElement element, int line, int col) throws AttoParseException
Description copied from interface: IDetailedHtmlElementHandling
Called when the end of a close element (a close tag) is found.
handleHtmlCloseElementEnd in interface IDetailedHtmlElementHandling
handleHtmlCloseElementEnd in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
element - the IHtmlElement element object representing the corresponding HTML element.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
public void handleHtmlAttribute(char[] buffer, int nameOffset, int nameLen, int nameLine, int nameCol, int operatorOffset, int operatorLen, int operatorLine, int operatorCol, int valueContentOffset, int valueContentLen, int valueOuterOffset, int valueOuterLen, int valueLine, int valueCol) throws AttoParseException
Description copied from interface: IHtmlAttributeSequenceHandling
Called when an attribute is found.
Four [offset, len] pairs are provided for four partitions (name, operator, valueContent and valueOuter):
class="basic_column"
[NAM]* [VALUECONTE]|   (*) = [OPERATOR]
|     [VALUEOUTER--]
[OUTER-------------]
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleHtmlAttribute in interface IHtmlAttributeSequenceHandling
handleHtmlAttribute in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
buffer - the document buffer (not copied).
nameOffset - offset for the name partition.
nameLen - length of the name partition.
nameLine - the line in the original document where the name partition starts.
nameCol - the column in the original document where the name partition starts.
operatorOffset - offset for the operator partition.
operatorLen - length of the operator partition.
operatorLine - the line in the original document where the operator partition starts.
operatorCol - the column in the original document where the operator partition starts.
valueContentOffset - offset for the valueContent partition.
valueContentLen - length of the valueContent partition.
valueOuterOffset - offset for the valueOuter partition.
valueOuterLen - length of the valueOuter partition.
valueLine - the line in the original document where the value (outer) partition starts.
valueCol - the column in the original document where the value (outer) partition starts.
Throws: AttoParseException
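To make the attribute partitions concrete, here is a small JDK-only sketch that slices the sample attribute from the diagram above. The class name `AttributePartitionsDemo`, the helper `part` and the offsets are assumptions worked out by hand from that diagram; in real use the parser supplies these values to handleHtmlAttribute.

```java
class AttributePartitionsDemo {

    /** Copies one [offset, len] partition out of the shared buffer. */
    static String part(char[] buffer, int offset, int len) {
        return new String(buffer, offset, len);
    }

    public static void main(String[] args) {
        char[] buffer = "class=\"basic_column\"".toCharArray();

        // Hand-computed offsets for this sample buffer (illustrative values).
        String name = part(buffer, 0, 5);           // the attribute name
        String operator = part(buffer, 5, 1);       // the '=' sign
        String valueContent = part(buffer, 7, 12);  // value without quotes
        String valueOuter = part(buffer, 6, 14);    // value including quotes

        System.out.println(name);         // class
        System.out.println(operator);     // =
        System.out.println(valueContent); // basic_column
        System.out.println(valueOuter);   // "basic_column"
    }
}
```

The only difference between valueContent and valueOuter in this sample is the pair of surrounding quotes, which is why valueOuter starts one char earlier and is two chars longer.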
public void handleHtmlInnerWhiteSpace(char[] buffer, int offset, int len, int line, int col) throws AttoParseException
Description copied from interface: IHtmlAttributeSequenceHandling
Called when an amount of white space is found inside an element.
These attribute separators can contain any amount of white space, including line feeds:
<div id="main"        class="basic_column">
              [ATTSEP]
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleHtmlInnerWhiteSpace in interface IHtmlAttributeSequenceHandling
handleHtmlInnerWhiteSpace in class AbstractDetailedNonValidatingHtmlAttoHandler
Parameters:
buffer - the document buffer (not copied).
offset - offset for the artifact.
len - length of the artifact.
line - the line in the original document where the artifact starts.
col - the column in the original document where the artifact starts.
Throws: AttoParseException
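As a rough illustration of what an inner white space artifact looks like in the buffer, the sketch below checks that a reported range consists only of white space, line feeds included. It uses only the JDK; the class name `InnerWhiteSpaceDemo`, the sample tag and the offsets are assumptions chosen for the example, and `Character.isWhitespace` is used here as a stand-in for whatever definition of white space the parser applies.

```java
class InnerWhiteSpaceDemo {

    /** Returns true if every char in [offset, offset + len) is white space. */
    static boolean isAllWhitespace(char[] buffer, int offset, int len) {
        for (int i = offset; i < offset + len; i++) {
            if (!Character.isWhitespace(buffer[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A separator made of a space, a line feed and two more spaces.
        char[] buffer = "<div id=\"main\" \n  class=\"basic_column\">".toCharArray();

        // The separator occupies 4 chars starting at index 14 (hand-computed).
        System.out.println(isAllWhitespace(buffer, 14, 4)); // true

        // Shifting the range one char to the left includes the closing quote.
        System.out.println(isAllWhitespace(buffer, 13, 4)); // false
    }
}
```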
public void handleText(char[] buffer, int offset, int len, int line, int col) throws AttoParseException
Description copied from interface: IAttoHandler
Called when a text artifact is found.
A sequence of chars is considered to be text when no structures of any kind are contained inside it. In markup parsers, for example, this means no tags (a.k.a. elements), DOCTYPEs, processing instructions, etc. are contained in the sequence.
Text sequences might include any number of new line and/or control characters.
Text artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported texts should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleText in interface IAttoHandler
handleText in class AbstractAttoHandler
Parameters:
buffer - the document buffer (not copied).
offset - the offset (position in buffer) where the text artifact starts.
len - the length (in chars) of the text artifact, starting in offset.
line - the line in the original document where this text artifact starts.
col - the column in the original document where this text artifact starts.
Throws: AttoParseException
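One common way to keep reported text is to append each range to a `StringBuilder`, which copies the chars as it goes. The sketch below is JDK-only and does not call attoparser; the class `TextAccumulatorDemo` and its `onText` method are hypothetical and merely mirror the (buffer, offset, len) part of the handleText signature, with offsets computed by hand for the sample markup.

```java
class TextAccumulatorDemo {

    private final StringBuilder text = new StringBuilder();

    /** Appends the reported range; StringBuilder copies the chars it is given. */
    void onText(char[] buffer, int offset, int len) {
        text.append(buffer, offset, len);
    }

    /** Returns everything collected so far as an independent String. */
    String collected() {
        return text.toString();
    }

    public static void main(String[] args) {
        char[] buffer = "<p>Hello, <b>world</b>!</p>".toCharArray();
        TextAccumulatorDemo demo = new TextAccumulatorDemo();

        // Three text artifacts, located by hand between the tags.
        demo.onText(buffer, 3, 7);  // "Hello, "
        demo.onText(buffer, 13, 5); // "world"
        demo.onText(buffer, 22, 1); // "!"

        System.out.println(demo.collected()); // Hello, world!
    }
}
```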
public void handleComment(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col) throws AttoParseException
Description copied from interface: ICommentHandling
Called when a comment is found.
Two [offset, len] pairs are provided for two partitions (outer and content):
<!-- this is a comment -->
|   [CONTENT----------]  |
[OUTER-------------------]
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleComment in interface ICommentHandling
handleComment in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied).
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
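The relationship between the outer and content partitions in the diagram above can be checked with a little arithmetic: the content sits between the 4-char `<!--` opener and the 3-char `-->` closer. The sketch below is a JDK-only illustration; the class name `CommentPartitionsDemo` and its helpers are assumptions, and since the parser reports both pairs directly, a handler would not normally need to derive one from the other.

```java
class CommentPartitionsDemo {

    static final int OPEN_LEN = "<!--".length();  // 4
    static final int CLOSE_LEN = "-->".length();  // 3

    /** Content starts right after the comment opener. */
    static int contentOffset(int outerOffset) {
        return outerOffset + OPEN_LEN;
    }

    /** Content length is the outer length minus both delimiters. */
    static int contentLen(int outerLen) {
        return outerLen - OPEN_LEN - CLOSE_LEN;
    }

    public static void main(String[] args) {
        char[] buffer = "<!-- this is a comment -->".toCharArray();
        int outerOffset = 0;
        int outerLen = buffer.length; // 26 chars for this sample

        String content = new String(
                buffer, contentOffset(outerOffset), contentLen(outerLen));

        System.out.println("[" + content + "]");        // [ this is a comment ]
        System.out.println("[" + content.trim() + "]"); // [this is a comment]
    }
}
```

Note that the content partition keeps the spaces next to the delimiters, so trimming is left to the caller.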
public void handleCDATASection(char[] buffer, int contentOffset, int contentLen, int outerOffset, int outerLen, int line, int col) throws AttoParseException
Description copied from interface: ICDATASectionHandling
Called when a CDATA section is found.
Two [offset, len] pairs are provided for two partitions (outer and content):
<![CDATA[ this is a CDATA section ]]>
|        [CONTENT----------------]  |
[OUTER------------------------------]
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleCDATASection in interface ICDATASectionHandling
handleCDATASection in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied).
contentOffset - offset for the content partition.
contentLen - length of the content partition.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
public void handleXmlDeclarationDetail(char[] buffer, int keywordOffset, int keywordLen, int keywordLine, int keywordCol, int versionOffset, int versionLen, int versionLine, int versionCol, int encodingOffset, int encodingLen, int encodingLine, int encodingCol, int standaloneOffset, int standaloneLen, int standaloneLine, int standaloneCol, int outerOffset, int outerLen, int line, int col) throws AttoParseException
Description copied from class: AbstractDetailedMarkupAttoHandler
Called when an XML Declaration is found when using a handler extending from AbstractDetailedMarkupAttoHandler.
Five [offset, len] pairs are provided for five partitions (outer, keyword, version, encoding and standalone):
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
| [K]          [V]            [ENC]              [S]  |
[OUTER------------------------------------------------]
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleXmlDeclarationDetail in class AbstractDetailedMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied).
keywordOffset - offset for the keyword partition.
keywordLen - length of the keyword partition.
keywordLine - the line in the original document where the keyword partition starts.
keywordCol - the column in the original document where the keyword partition starts.
versionOffset - offset for the version partition.
versionLen - length of the version partition.
versionLine - the line in the original document where the version partition starts.
versionCol - the column in the original document where the version partition starts.
encodingOffset - offset for the encoding partition.
encodingLen - length of the encoding partition.
encodingLine - the line in the original document where the encoding partition starts.
encodingCol - the column in the original document where the encoding partition starts.
standaloneOffset - offset for the standalone partition.
standaloneLen - length of the standalone partition.
standaloneLine - the line in the original document where the standalone partition starts.
standaloneCol - the column in the original document where the standalone partition starts.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException
public void handleDocType(char[] buffer, int keywordOffset, int keywordLen, int keywordLine, int keywordCol, int elementNameOffset, int elementNameLen, int elementNameLine, int elementNameCol, int typeOffset, int typeLen, int typeLine, int typeCol, int publicIdOffset, int publicIdLen, int publicIdLine, int publicIdCol, int systemIdOffset, int systemIdLen, int systemIdLine, int systemIdCol, int internalSubsetOffset, int internalSubsetLen, int internalSubsetLine, int internalSubsetCol, int outerOffset, int outerLen, int outerLine, int outerCol) throws AttoParseException
Description copied from interface: IDetailedDocTypeHandling
Called when a DOCTYPE clause is found.
This method reports the DOCTYPE clause splitting it into its different parts.
Seven [offset, len] pairs are provided for seven partitions (outer, keyword, elementName, type, publicId, systemId and internalSubset) of the DOCTYPE clause:
<!DOCTYPE html PUBLIC ".........." ".........." [................]>
| [KEYWO] [EN] [TYPE]  [PUBLICID]   [SYSTEMID]   [INTERNALSUBSET] |
[OUTER------------------------------------------------------------]
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleDocType in interface IDetailedDocTypeHandling
handleDocType in class AbstractDetailedMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied).
keywordOffset - offset for the keyword partition.
keywordLen - length of the keyword partition.
keywordLine - the line in the original document where the keyword partition starts.
keywordCol - the column in the original document where the keyword partition starts.
elementNameOffset - offset for the elementName partition.
elementNameLen - length of the elementName partition.
elementNameLine - the line in the original document where the elementName partition starts.
elementNameCol - the column in the original document where the elementName partition starts.
typeOffset - offset for the type partition.
typeLen - length of the type partition.
typeLine - the line in the original document where the type partition starts.
typeCol - the column in the original document where the type partition starts.
publicIdOffset - offset for the publicId partition.
publicIdLen - length of the publicId partition.
publicIdLine - the line in the original document where the publicId partition starts.
publicIdCol - the column in the original document where the publicId partition starts.
systemIdOffset - offset for the systemId partition.
systemIdLen - length of the systemId partition.
systemIdLine - the line in the original document where the systemId partition starts.
systemIdCol - the column in the original document where the systemId partition starts.
internalSubsetOffset - offset for the internalSubset partition.
internalSubsetLen - length of the internalSubset partition.
internalSubsetLine - the line in the original document where the internalSubset partition starts.
internalSubsetCol - the column in the original document where the internalSubset partition starts.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
outerLine - the line in the original document where this artifact starts.
outerCol - the column in the original document where this artifact starts.
Throws: AttoParseException
public void handleProcessingInstruction(char[] buffer, int targetOffset, int targetLen, int targetLine, int targetCol, int contentOffset, int contentLen, int contentLine, int contentCol, int outerOffset, int outerLen, int line, int col) throws AttoParseException
Description copied from interface: IProcessingInstructionHandling
Called when a Processing Instruction is found.
Three [offset, len] pairs are provided for three partitions (outer, target and content):
<?xml-stylesheet somePar1="a" somePar2="b"?>
| [TARGET------] [CONTENT----------------] |
[OUTER-------------------------------------]
Note that, although XML Declarations have the same format as processing instructions, they are not considered as such and therefore are handled by a different interface (IXmlDeclarationHandling).
Artifacts are reported using the document buffer directly, and this buffer should not be considered to be immutable, so reported structures should be copied if they need to be stored (either by copying len chars from the buffer char[] starting in offset or by creating a String from it using the same specification).
Implementations of this handler should never modify the document buffer.
handleProcessingInstruction in interface IProcessingInstructionHandling
handleProcessingInstruction in class AbstractBasicMarkupAttoHandler
Parameters:
buffer - the document buffer (not copied).
targetOffset - offset for the target partition.
targetLen - length of the target partition.
targetLine - the line in the original document where the target partition starts.
targetCol - the column in the original document where the target partition starts.
contentOffset - offset for the content partition.
contentLen - length of the content partition.
contentLine - the line in the original document where the content partition starts.
contentCol - the column in the original document where the content partition starts.
outerOffset - offset for the outer partition.
outerLen - length of the outer partition.
line - the line in the original document where this artifact starts.
col - the column in the original document where this artifact starts.
Throws: AttoParseException