Class Attributes
- All Implemented Interfaces:
CharSequence, Comparable<Segment>, Iterable<Attribute>, Collection<Attribute>, List<Attribute>
Represents the list of Attribute objects present within a particular StartTag.
This segment starts at the end of the start tag's name and ends at the end of the last attribute.
The attributes in this list are a representation of those found in the source document and are not modifiable.
The OutputDocument.replace(Attributes, Map) and OutputDocument.replace(Attributes, boolean convertNamesToLowerCase) methods provide the means to add, delete or modify attributes and their values in an OutputDocument.
Any server tags encountered inside the attributes area of a non-server tag do not interfere with the parsing of the attributes.
If too many syntax errors are encountered while parsing a start tag's attributes, the parser rejects the entire start tag and generates a log entry.
The threshold for the number of errors allowed can be set using the setDefaultMaxErrorCount(int) static method.
Instances are obtained using the StartTag.getAttributes() method, or explicitly using the Source.parseAttributes(int pos, int maxEnd) method.
It is common for instances of this class to contain no attributes.
See also the XML 1.0 specification for attributes.
-
Method Summary
Modifier and Type | Method | Description
static String | generateHTML(Map<String, String> attributesMap) | Returns the contents of the specified attributes map as HTML attribute name/value pairs.
Attribute | get(String name) | Returns the Attribute with the specified name (case insensitive).
int | getCount() | Returns the number of attributes.
String | getDebugInfo() | Returns a string representation of this object useful for debugging purposes.
static int | getDefaultMaxErrorCount() | Returns the default maximum error count allowed when parsing attributes.
String | getValue(String name) | Returns the decoded value of the attribute with the specified name (case insensitive).
Iterator<Attribute> | iterator() | Returns an iterator over the Attribute objects in this list in order of appearance.
ListIterator<Attribute> | listIterator(int index) | Returns a list iterator of the Attribute objects in this list in order of appearance, starting at the specified position in the list.
Map<String,String> | populateMap(Map<String, String> attributesMap, boolean convertNamesToLowerCase) | Populates the specified Map with the name/value pairs from these attributes.
static void | setDefaultMaxErrorCount(int value) | Sets the default maximum error count allowed when parsing attributes.
Methods inherited from class net.htmlparser.jericho.nodoc.SequentialListSegment
add, add, addAll, addAll, clear, contains, containsAll, get, indexOf, isEmpty, lastIndexOf, listIterator, remove, remove, removeAll, retainAll, set, size, subList, toArray, toArray
Methods inherited from class net.htmlparser.jericho.Segment
charAt, compareTo, encloses, encloses, equals, getAllCharacterReferences, getAllElements, getAllElements, getAllElements, getAllElements, getAllElements, getAllElementsByClass, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTagsByClass, getAllTags, getAllTags, getBegin, getChildElements, getEnd, getFirstElement, getFirstElement, getFirstElement, getFirstElement, getFirstElementByClass, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTagByClass, getFormControls, getFormFields, getMaxDepthIndicator, getNodeIterator, getRenderer, getRowColumnVector, getSource, getStyleURISegments, getTextExtractor, getURIAttributes, hashCode, ignoreWhenParsing, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.CharSequence
chars, codePoints
Methods inherited from interface java.util.Collection
parallelStream, removeIf, stream, toArray
Methods inherited from interface java.util.List
equals, hashCode, replaceAll, sort, spliterator
-
Method Details
-
get
Returns the Attribute with the specified name (case insensitive).
If more than one attribute exists with the specified name (which is illegal HTML), the first is returned.
- Parameters:
name - the name of the attribute to get.
- Returns:
the attribute with the specified name, or null if no attribute with the specified name exists.
-
getValue
Returns the decoded value of the attribute with the specified name (case insensitive).
Returns null if no attribute with the specified name exists or the attribute has no value.
This is equivalent to get(name).getValue(), except that it returns null if no attribute with the specified name exists instead of throwing a NullPointerException.
- Parameters:
name - the name of the attribute to get.
- Returns:
the decoded value of the attribute with the specified name, or null if the attribute does not exist or has no value.
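The difference between getValue(name) and get(name).getValue() can be sketched with plain JDK types. This is only an illustration of the documented behaviour, not the library's code: the GetValueSketch class and its nested Attr class are hypothetical stand-ins for Attributes and Attribute, and value decoding is left out.

```java
import java.util.List;

// Hypothetical stand-in for Attributes (not the Jericho class).
final class GetValueSketch {
    // Hypothetical stand-in for Attribute.
    static final class Attr {
        final String name;
        final String value; // null models an attribute with no value

        Attr(String name, String value) {
            this.name = name;
            this.value = value;
        }

        String getValue() {
            return value;
        }
    }

    private final List<Attr> attrs;

    GetValueSketch(List<Attr> attrs) {
        this.attrs = attrs;
    }

    // First attribute whose name matches, ignoring case; null if there is none.
    Attr get(String name) {
        for (Attr a : attrs) {
            if (a.name.equalsIgnoreCase(name)) {
                return a;
            }
        }
        return null;
    }

    // Null-safe shortcut for get(name).getValue().
    String getValue(String name) {
        Attr a = get(name);
        return a == null ? null : a.getValue();
    }

    static GetValueSketch sample() {
        return new GetValueSketch(List.of(
                new Attr("href", "a.html"),
                new Attr("disabled", null),
                new Attr("HREF", "b.html")));
    }

    public static void main(String[] args) {
        GetValueSketch s = sample();
        System.out.println(s.getValue("HREF"));     // first match wins: a.html
        System.out.println(s.getValue("disabled")); // attribute without a value: null
        System.out.println(s.getValue("missing"));  // no such attribute: null
    }
}
```

A null result alone therefore does not distinguish a missing attribute from one without a value; calling get(name) first and checking for null makes that distinction.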
-
getCount
public int getCount()
Returns the number of attributes.
This is equivalent to calling the size() method specified in the List interface.
- Specified by:
getCount in class SequentialListSegment<Attribute>
- Returns:
the number of attributes.
-
iterator
Returns an iterator over the Attribute objects in this list in order of appearance.
- Specified by:
iterator in interface Collection<Attribute>
- Specified by:
iterator in interface Iterable<Attribute>
- Specified by:
iterator in interface List<Attribute>
- Overrides:
iterator in class SequentialListSegment<Attribute>
- Returns:
an iterator over the Attribute objects in this list in order of appearance.
-
listIterator
Returns a list iterator of the Attribute objects in this list in order of appearance, starting at the specified position in the list.
The specified index indicates the first item that would be returned by an initial call to the next() method. An initial call to the previous() method would return the item with the specified index minus one.
IMPLEMENTATION NOTE: For efficiency reasons this method does not return an immutable list iterator. Calling any of the add(Object), remove() or set(Object) methods on the returned ListIterator does not throw an exception but could result in unexpected behaviour.
- Specified by:
listIterator in interface List<Attribute>
- Specified by:
listIterator in class SequentialListSegment<Attribute>
- Parameters:
index - the index of the first item to be returned from the list iterator (by a call to the next() method).
- Returns:
a list iterator of the items in this list (in proper sequence), starting at the specified position in the list.
- Throws:
IndexOutOfBoundsException - if the specified index is out of range (index < 0 || index > size()).
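The index semantics above follow the general java.util.List contract, so they can be tried out on any JDK list. The sketch below uses a list of attribute names as a stand-in for Attribute objects; the class name ListIteratorSketch and its helper methods are made up for illustration.

```java
import java.util.List;
import java.util.ListIterator;

final class ListIteratorSketch {
    // Names stand in for Attribute objects, in order of appearance.
    static final List<String> NAMES = List.of("id", "class", "href");

    // Item returned by an initial next() call on listIterator(index).
    static String firstNext(int index) {
        return NAMES.listIterator(index).next();
    }

    // Item returned by an initial previous() call on listIterator(index).
    static String firstPrevious(int index) {
        return NAMES.listIterator(index).previous();
    }

    public static void main(String[] args) {
        System.out.println(firstNext(1));     // class
        System.out.println(firstPrevious(1)); // id

        // index == size() is in range: the iterator starts past the last item.
        ListIterator<String> end = NAMES.listIterator(NAMES.size());
        System.out.println(end.hasNext());    // false
    }
}
```

One difference is worth keeping in mind: the immutable JDK list used here throws on add, remove or set, whereas the implementation note above says the iterator returned by this class does not, so the mutating methods are best avoided altogether.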
-
populateMap
public Map<String,String> populateMap(Map<String, String> attributesMap, boolean convertNamesToLowerCase)
Populates the specified Map with the name/value pairs from these attributes.
Both names and values are stored as String objects.
The entries are added in order of appearance in the source document.
An attribute with no value is represented by a map entry with a null value.
Attribute values are automatically decoded before storage in the map.
- Parameters:
attributesMap - the map to populate, must not be null.
convertNamesToLowerCase - specifies whether all attribute names are converted to lower case in the map.
- Returns:
the same map specified as the argument to the attributesMap parameter, populated with the name/value pairs from these attributes.
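The contract described above (entries added in source order, null for a missing value, optional lower-casing, same map instance returned) can be sketched with JDK collections. This is an assumed model rather than the library's implementation: PopulateMapSketch, the name/value array pairs and the use of Locale.ROOT for lower-casing are all illustrative choices, and decoding of values is omitted.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class PopulateMapSketch {
    // Each pair is {name, value}; a null value models an attribute with no value.
    static Map<String, String> populate(List<String[]> pairs,
                                        Map<String, String> target,
                                        boolean convertNamesToLowerCase) {
        for (String[] pair : pairs) {
            String name = convertNamesToLowerCase
                    ? pair[0].toLowerCase(Locale.ROOT) // assumption: locale-neutral lower-casing
                    : pair[0];
            target.put(name, pair[1]);
        }
        return target; // the same map instance that was passed in
    }

    static List<String[]> sample() {
        return List.of(
                new String[] {"HREF", "a.html"},
                new String[] {"disabled", null});
    }

    public static void main(String[] args) {
        // A LinkedHashMap keeps insertion order, so iteration matches source order.
        Map<String, String> map = populate(sample(), new LinkedHashMap<>(), true);
        System.out.println(map); // {href=a.html, disabled=null}
    }
}
```

Because the order of appearance is only observable if the supplied map preserves insertion order, a LinkedHashMap is a natural choice when the order matters. The map must also accept null values, which rules out implementations such as Hashtable or ConcurrentHashMap.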
-
getDebugInfo
Returns a string representation of this object useful for debugging purposes.
- Overrides:
getDebugInfo in class Segment
- Returns:
a string representation of this object useful for debugging purposes.
-
getDefaultMaxErrorCount
public static int getDefaultMaxErrorCount()
Returns the default maximum error count allowed when parsing attributes.
The system default value is 2.
When searching for start tags, the parser can find the end of the start tag only by parsing the attributes, as it is valid HTML for attribute values to contain '>' characters (see the HTML 4.01 specification section 5.3.2).
If the source text being parsed does not follow the syntax of an attribute list at all, the parser assumes that the text which was originally identified as the beginning of a start tag is in fact some other text, such as an invalid '<' character in the middle of some text, or part of a script element. In this case the entire start tag is rejected.
On the other hand, it is quite common for attributes to contain minor syntactical errors, such as an invalid character in an attribute name. For this reason the parser allows a certain number of minor errors to occur while parsing an attribute list before the entire start tag or attribute list is rejected. This property indicates the number of minor errors allowed.
Major syntactical errors cause the start tag or attribute list to be rejected immediately, regardless of the maximum error count setting.
Some errors are considered too minor to count at all (ignorable), such as missing white space between the end of a quoted attribute value and the start of the next attribute name.
The classification of particular syntax errors in attribute lists into major, minor, and ignorable is not part of the specification and may change in future versions.
Errors are logged as they occur.
The value of this property is set using the setDefaultMaxErrorCount(int) method.
- Returns:
the default maximum error count allowed when parsing attributes.
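The interplay of major, minor and ignorable errors with the maximum error count can be modelled in a few lines. The sketch below is a guess at the decision logic based only on the description above, not the parser's actual code: the Severity enum and the accept method are hypothetical, and it assumes the limit is inclusive, so that rejection happens once the number of minor errors exceeds the limit.

```java
import java.util.List;

final class ErrorCountSketch {
    // Hypothetical severity levels mirroring the major/minor/ignorable description.
    enum Severity { MAJOR, MINOR, IGNORABLE }

    // Returns true if the attribute list would be accepted under the assumed rules.
    static boolean accept(List<Severity> errors, int maxErrorCount) {
        int minorCount = 0;
        for (Severity s : errors) {
            switch (s) {
                case MAJOR:
                    return false; // rejected immediately, regardless of the limit
                case MINOR:
                    if (++minorCount > maxErrorCount) {
                        return false; // assumption: the limit itself is still allowed
                    }
                    break;
                case IGNORABLE:
                    break; // never counted
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int limit = 2; // the documented system default
        System.out.println(accept(List.of(Severity.MINOR, Severity.MINOR), limit));
        System.out.println(accept(List.of(Severity.MINOR, Severity.MINOR, Severity.MINOR), limit));
        System.out.println(accept(List.of(Severity.IGNORABLE, Severity.MAJOR), limit));
    }
}
```

Under these assumptions, setting the limit to 0 makes any minor error fatal, while ignorable errors never affect the outcome.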
-
setDefaultMaxErrorCount
public static void setDefaultMaxErrorCount(int value)
Sets the default maximum error count allowed when parsing attributes.
See the getDefaultMaxErrorCount() method for a full description of this property.
- Parameters:
value - the default maximum error count allowed when parsing attributes.
-
generateHTML
Returns the contents of the specified attributes map as HTML attribute name/value pairs.
Each attribute (including the first) is preceded by a single space, and all values are encoded and enclosed in double quotes.
The map keys must be of type String and values must be objects that implement the CharSequence interface.
A null value represents an attribute with no value.
- Parameters:
attributesMap - a map containing attribute name/value pairs.
- Returns:
the contents of the specified attributes map as HTML attribute name/value pairs.
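The output format described above can be approximated with a short stand-alone sketch. It is not the library's implementation: GenerateHtmlSketch is a made-up class, its encoder escapes only the characters &, <, > and the double quote (the real encoding rules may be broader), and rendering a null value as a bare attribute name is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class GenerateHtmlSketch {
    // Simplified encoder (assumption): escapes only &, <, > and double quotes.
    static String encode(CharSequence value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Every attribute, including the first, is preceded by a single space.
    static String generate(Map<String, String> attributes) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append(' ').append(e.getKey());
            if (e.getValue() != null) { // assumption: null renders as a bare name
                sb.append("=\"").append(encode(e.getValue())).append('"');
            }
        }
        return sb.toString();
    }

    static String demo() {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("title", "Tom & \"Jerry\"");
        map.put("disabled", null);
        return generate(map);
    }

    public static void main(String[] args) {
        // Prints:  title="Tom &amp; &quot;Jerry&quot;" disabled
        System.out.println(demo());
    }
}
```

The leading space means the result can be appended directly after a tag name when building a start tag by hand.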
-