Class EndTag
- All Implemented Interfaces: CharSequence, Comparable<Segment>
An end tag always has a type that is a subclass of EndTagType, meaning it always starts with the characters '</'.
EndTag instances are obtained using one of the following methods:
Element.getEndTag()
Tag.getNextTag()
Tag.getPreviousTag()
Source.getPreviousEndTag(int pos)
Source.getPreviousEndTag(int pos, String name)
Source.getPreviousTag(int pos)
Source.getPreviousTag(int pos, TagType)
Source.getNextEndTag(int pos)
Source.getNextEndTag(int pos, String name)
Source.getNextEndTag(int pos, String name, EndTagType)
Source.getNextTag(int pos)
Source.getNextTag(int pos, TagType)
Source.getEnclosingTag(int pos)
Source.getEnclosingTag(int pos, TagType)
Source.getTagAt(int pos)
Segment.getAllTags()
Segment.getAllTags(TagType)
The Tag superclass defines the getName() method used to get the name of this end tag.
See also the XML 1.0 specification for end tags.
-
Method Summary
Modifier and Type | Method | Description
static String | generateHTML(String tagName) | Generates the HTML text of a normal end tag with the specified tag name.
String | getDebugInfo() | Returns a string representation of this object useful for debugging purposes.
Element | getElement() | Returns the element that is ended by this end tag.
EndTagType | getEndTagType() | Returns the type of this end tag.
TagType | getTagType() | Returns the type of this tag.
boolean | isUnregistered() | Indicates whether this tag has a syntax that does not match any of the registered tag types.
String | tidy() | Returns an XML representation of this end tag.
Methods inherited from class net.htmlparser.jericho.Tag
getName, getNameSegment, getNextTag, getPreviousTag, getUserData, isXMLName, isXMLNameChar, isXMLNameStartChar, setUserData
Methods inherited from class net.htmlparser.jericho.Segment
charAt, compareTo, encloses, encloses, equals, getAllCharacterReferences, getAllElements, getAllElements, getAllElements, getAllElements, getAllElements, getAllElementsByClass, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTagsByClass, getAllTags, getAllTags, getBegin, getChildElements, getEnd, getFirstElement, getFirstElement, getFirstElement, getFirstElement, getFirstElementByClass, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTagByClass, getFormControls, getFormFields, getMaxDepthIndicator, getNodeIterator, getRenderer, getRowColumnVector, getSource, getStyleURISegments, getTextExtractor, getURIAttributes, hashCode, ignoreWhenParsing, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.CharSequence
chars, codePoints, isEmpty
-
Method Details
-
getElement
Returns the element that is ended by this end tag.
Returns null if this end tag is not properly matched to any start tag in the source document.
This method is much less efficient than the StartTag.getElement() method.
IMPLEMENTATION NOTE: This method is relatively inefficient because more than one start tag type can have the same corresponding end tag type, so it is not possible to know for certain which type of start tag this end tag is matched to (see EndTagType.getCorrespondingStartTagType() for more explanation). Because of this uncertainty, the implementation of this method must check every start tag preceding this end tag, calling its StartTag.getElement() method to see whether it is terminated by this end tag.
- Specified by: getElement in class Tag
- Returns: the element that is ended by this end tag, or null if this end tag is not matched to any start tag.
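The search strategy described in the implementation note can be illustrated with a toy model. The sketch below is not the library's code: it assumes a pre-tokenized array of simple tags such as "<div>" and "</div>", and the class and method names (EndTagMatchSketch, findEndIndex, findStartIndex) are hypothetical. It only shows why starting from an end tag costs more: every preceding start tag has to run its own forward search before the match is known.

```java
// Toy model only; names and the token format are assumptions, not the Jericho API.
final class EndTagMatchSketch {
    private EndTagMatchSketch() {}

    // A token is treated as an end tag when it starts with "</".
    static boolean isEnd(String token) {
        return token.startsWith("</");
    }

    // Extracts the name from "<name>" or "</name>" (case-sensitive, no attributes).
    static String name(String token) {
        return token.substring(isEnd(token) ? 2 : 1, token.length() - 1);
    }

    // Forward search from a start tag: returns the index of the end tag that
    // closes it, counting nested tags of the same name, or -1 if there is none.
    static int findEndIndex(String[] tokens, int startIdx) {
        String wanted = name(tokens[startIdx]);
        int depth = 0;
        for (int i = startIdx + 1; i < tokens.length; i++) {
            if (!name(tokens[i]).equals(wanted)) {
                continue;
            }
            if (isEnd(tokens[i])) {
                if (depth == 0) {
                    return i;
                }
                depth--;
            } else {
                depth++;
            }
        }
        return -1;
    }

    // Backward-looking search from an end tag: tries every preceding start tag
    // and asks whether its forward search terminates at this end tag.
    static int findStartIndex(String[] tokens, int endIdx) {
        for (int i = 0; i < endIdx; i++) {
            if (!isEnd(tokens[i]) && findEndIndex(tokens, i) == endIdx) {
                return i;
            }
        }
        return -1;
    }
}
```

In this model an unmatched end tag yields -1, which mirrors the null return described above. When both directions are available, starting from the start tag avoids the repeated forward searches.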
-
getEndTagType
Returns the type of this end tag.
This is equivalent to (EndTagType)getTagType().
- Returns: the type of this end tag.
-
getTagType
Description copied from class: Tag
Returns the type of this tag.
- Specified by: getTagType in class Tag
- Returns: the type of this tag.
-
isUnregistered
public boolean isUnregistered()
Description copied from class: Tag
Indicates whether this tag has a syntax that does not match any of the registered tag types.
The only requirement of an unregistered tag type is that it starts with '<' and there is a closing '>' character at some position after it in the source document.
The absence or presence of a '/' character after the initial '<' determines whether an unregistered tag is respectively a StartTag with a type of StartTagType.UNREGISTERED or an EndTag with a type of EndTagType.UNREGISTERED.
There are no restrictions on the characters that might appear between these delimiters, including other '<' characters. This may result in a '>' character that is identified as the closing delimiter of two separate tags, one an unregistered tag, and the other a tag of any type that begins in the middle of the unregistered tag. As explained below, unregistered tags are usually only found when specifically looking for them, so it is up to the user to detect and deal with any such nonsensical results.
Unregistered tags are only returned by the Source.getTagAt(int pos) method, named search methods, where the specified name matches the first characters inside the tag, and by tag type search methods, where the specified tagType is either StartTagType.UNREGISTERED or EndTagType.UNREGISTERED.
Open tag searches and other searches always ignore unregistered tags, although every discovery of an unregistered tag is logged by the parser.
The logic behind this design is that unregistered tag types are usually the result of a '<' character in the text that was mistakenly left unencoded, or a less-than operator inside a script, or some other occurrence which is of no interest to the user. By returning unregistered tags in named and tag type search methods, the library allows the user to specifically search for tags with a certain syntax that does not match any existing TagType. This expediency feature avoids the need for the user to create a custom tag type to define the syntax before searching for these tags. By not returning unregistered tags in the less specific search methods, it is providing only the information that most users are interested in.
- Specified by: isUnregistered in class Tag
- Returns: true if this tag has a syntax that does not match any of the registered tag types, otherwise false.
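The syntactic rule for unregistered tags can be sketched in a few lines. This is a stand-alone illustration of the rule as stated above, not the parser's implementation: the class UnregisteredTagSketch, its Kind enum, and the classify method are hypothetical names, and the sketch ignores registered tag types entirely. It checks only that the text at a position starts with '<', that a '>' follows somewhere later, and whether a '/' comes directly after the '<'.

```java
// Illustration of the stated rule only; these names are not part of the Jericho API.
final class UnregisteredTagSketch {
    private UnregisteredTagSketch() {}

    enum Kind { START_TAG, END_TAG, NOT_A_TAG }

    // Classifies the text beginning at pos using the minimal delimiter rule:
    // '<' at pos, a '>' anywhere after it, and '/' right after '<' for end tags.
    static Kind classify(String source, int pos) {
        if (pos < 0 || pos >= source.length() || source.charAt(pos) != '<') {
            return Kind.NOT_A_TAG;
        }
        if (source.indexOf('>', pos + 1) < 0) {
            return Kind.NOT_A_TAG; // no closing delimiter anywhere after '<'
        }
        boolean hasSlash = pos + 1 < source.length() && source.charAt(pos + 1) == '/';
        return hasSlash ? Kind.END_TAG : Kind.START_TAG;
    }
}
```

Because nothing restricts the characters between the delimiters, a stray less-than operator such as the one in "a <b && c> d" would also be classified as a start tag by this rule, which is the kind of nonsensical result the description warns about.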
-
tidy
Returns an XML representation of this end tag.
The tidying of the tag is carried out as follows:
- if this end tag is a NORMAL end tag then any white space before the closing angle bracket is removed.
- otherwise the original source text of the entire tag is returned.
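The two tidying rules can be sketched as a plain string operation. This is an approximation under stated assumptions, not the library's code: the helper tidyEndTag and its isNormalEndTag flag are hypothetical, and the caller is assumed to know already whether the tag is a NORMAL end tag.

```java
// Approximation of the two rules; the names here are not part of the Jericho API.
final class EndTagTidySketch {
    private EndTagTidySketch() {}

    // For a normal end tag, drops white space directly before the closing '>'.
    // For any other tag, returns the original text unchanged.
    static String tidyEndTag(String tagText, boolean isNormalEndTag) {
        if (!isNormalEndTag || !tagText.endsWith(">")) {
            return tagText;
        }
        int end = tagText.length() - 1; // index of the closing '>'
        while (end > 0 && Character.isWhitespace(tagText.charAt(end - 1))) {
            end--;
        }
        return tagText.substring(0, end) + ">";
    }
}
```

For example, tidyEndTag("</div  >", true) yields "</div>", while the same text passed with false comes back untouched.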
-
generateHTML
Generates the HTML text of a normal end tag with the specified tag name.
- Example:
The following method call:
EndTag.generateHTML("INPUT")
returns the following output:
</INPUT>
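The output format in this example can be reproduced with a one-line stand-alone helper. This is a sketch inferred from the single example above, not the library's implementation: the class EndTagHtmlSketch and method generateEndTagHtml are hypothetical, and the sketch assumes the tag name is emitted exactly as given, with no case folding or validation.

```java
// Stand-alone sketch inferred from the documented example; not the Jericho code.
final class EndTagHtmlSketch {
    private EndTagHtmlSketch() {}

    // Wraps the tag name in "</" and ">" without altering it.
    static String generateEndTagHtml(String tagName) {
        return "</" + tagName + ">";
    }
}
```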
-
getDebugInfo
Description copied from class: Segment
Returns a string representation of this object useful for debugging purposes.
- Overrides: getDebugInfo in class Segment
- Returns: a string representation of this object useful for debugging purposes.
-