Package com.ibm.icu.charset
Class CharsetUTF8
java.lang.Object
java.nio.charset.Charset
com.ibm.icu.charset.CharsetICU
com.ibm.icu.charset.CharsetUTF8
- All Implemented Interfaces:
Comparable<Charset>
- Direct Known Subclasses:
CharsetCESU8
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
(package private) class
(package private) class
-
Field Summary
Fields
Modifier and Type    Field    Description
private static final int[]    BITMASK_FROM_UTF8
private static final byte[]    fromUSubstitution
private final boolean    isCESU8
Fields inherited from class com.ibm.icu.charset.CharsetICU
codepage, conversionType, hasFromUnicodeFallback, hasToUnicodeFallback, icuCanonicalName, maxBytesPerChar, maxCharsPerByte, minBytesPerChar, name, options, platform, ROUNDTRIP_AND_FALLBACK_SET, ROUNDTRIP_SET, subChar, subChar1, subCharLen, unicodeMask
-
Constructor Summary
Constructors
Constructor    Description
CharsetUTF8(String icuCanonicalName, String javaCanonicalName, String[] aliases)
-
Method Summary
Modifier and Type    Method    Description
private static final byte    encodeHeadOf1(int char32)
private static final byte    encodeHeadOf2(int char32)
private static final byte    encodeHeadOf3(int char32)
private static final byte    encodeHeadOf4(int char32)
private static final byte    encodeLastTail(int char32)
private static final byte    encodeSecondToLastTail(int char32)
private static final byte    encodeThirdToLastTail(int char32)
(package private) void    getUnicodeSetImpl(UnicodeSet setFillIn, int which)
This follows the ucnv.c function ucnv_detectUnicodeSignature() to detect the start of the stream, for example U+FEFF (the Unicode BOM/signature character), which can be ignored.
Methods inherited from class com.ibm.icu.charset.CharsetICU
contains, forNameICU, getCharset, getCompleteUnicodeSet, getNonSurrogateUnicodeSet, getUnicodeSet, isFixedWidth, isSurrogate
Methods inherited from class java.nio.charset.Charset
aliases, availableCharsets, canEncode, compareTo, decode, defaultCharset, displayName, displayName, encode, encode, equals, forName, hashCode, isRegistered, isSupported, name, toString
-
Field Details
-
fromUSubstitution
private static final byte[] fromUSubstitution
-
BITMASK_FROM_UTF8
private static final int[] BITMASK_FROM_UTF8
-
isCESU8
private final boolean isCESU8
-
-
Constructor Details
-
CharsetUTF8
-
-
Method Details
-
encodeHeadOf1
private static final byte encodeHeadOf1(int char32)
-
encodeHeadOf2
private static final byte encodeHeadOf2(int char32)
-
encodeHeadOf3
private static final byte encodeHeadOf3(int char32)
-
encodeHeadOf4
private static final byte encodeHeadOf4(int char32)
-
encodeThirdToLastTail
private static final byte encodeThirdToLastTail(int char32)
-
encodeSecondToLastTail
private static final byte encodeSecondToLastTail(int char32)
-
encodeLastTail
private static final byte encodeLastTail(int char32)
-
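The helper names above suggest the usual split of a UTF-8 sequence into one head (lead) byte and up to three tail (continuation) bytes. The following is a minimal sketch of that split, written from the standard UTF-8 bit layout and not copied from the ICU4J source. The class name Utf8BytesSketch, the shortened method names, and the encode method are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: names and bodies are assumptions, not the ICU4J implementation.
final class Utf8BytesSketch {
    // Head bytes: a length prefix followed by the top bits of the code point.
    static byte headOf1(int c) { return (byte) c; }                    // 0xxxxxxx
    static byte headOf2(int c) { return (byte) (0xC0 | (c >>> 6)); }   // 110xxxxx
    static byte headOf3(int c) { return (byte) (0xE0 | (c >>> 12)); }  // 1110xxxx
    static byte headOf4(int c) { return (byte) (0xF0 | (c >>> 18)); }  // 11110xxx

    // Tail bytes: 10xxxxxx, each carrying six payload bits.
    static byte thirdToLastTail(int c)  { return (byte) (0x80 | ((c >>> 12) & 0x3F)); }
    static byte secondToLastTail(int c) { return (byte) (0x80 | ((c >>> 6) & 0x3F)); }
    static byte lastTail(int c)         { return (byte) (0x80 | (c & 0x3F)); }

    // Encodes one Unicode scalar value; surrogates and out-of-range values are rejected.
    static byte[] encode(int c) {
        if (c < 0 || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
            throw new IllegalArgumentException("not a Unicode scalar value: " + c);
        }
        if (c < 0x80) {
            return new byte[] { headOf1(c) };
        }
        if (c < 0x800) {
            return new byte[] { headOf2(c), lastTail(c) };
        }
        if (c < 0x10000) {
            return new byte[] { headOf3(c), secondToLastTail(c), lastTail(c) };
        }
        return new byte[] { headOf4(c), thirdToLastTail(c), secondToLastTail(c), lastTail(c) };
    }

    public static void main(String[] args) {
        int euro = 0x20AC; // encodes as E2 82 AC
        byte[] sketch = encode(euro);
        byte[] jdk = new String(Character.toChars(euro)).getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(sketch, jdk)); // prints true
    }
}
```

Comparing against the JDK's own UTF-8 encoder, as main does, is a quick way to check the bit arithmetic for any scalar value.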
newDecoder
Specified by:
newDecoder in class Charset
-
newEncoder
Specified by:
newEncoder in class Charset
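Because newDecoder and newEncoder are specified by java.nio.charset.Charset, callers can use this charset through the standard coder API. The sketch below shows that pattern with the JDK's built-in StandardCharsets.UTF_8 as a stand-in, so that it runs without ICU4J on the classpath. The class CoderRoundTripSketch and its methods are hypothetical names, and how an ICU charset instance is obtained is left out.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: works with any Charset, here exercised with the JDK's UTF-8.
final class CoderRoundTripSketch {
    // Encodes text with a fresh encoder and returns the number of bytes produced.
    static int encodedLength(Charset cs, String text) {
        try {
            return cs.newEncoder().encode(CharBuffer.wrap(text)).remaining();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encodes and then decodes text using fresh coder instances from the same charset.
    static String roundTrip(Charset cs, String text) {
        try {
            ByteBuffer bytes = cs.newEncoder().encode(CharBuffer.wrap(text));
            return cs.newDecoder().decode(bytes).toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u20ACuro"; // euro sign followed by "uro"
        System.out.println(encodedLength(StandardCharsets.UTF_8, text));
        System.out.println(roundTrip(StandardCharsets.UTF_8, text).equals(text));
    }
}
```

Coders are stateful and not safe for concurrent use, which is why the sketch requests a new encoder and decoder for each call.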
-
getUnicodeSetImpl
Description copied from class: CharsetICU
This follows the ucnv.c function ucnv_detectUnicodeSignature() to detect the start of the stream, for example U+FEFF (the Unicode BOM/signature character), which can be ignored. Detects Unicode signature byte sequences at the start of the byte stream and returns the number of bytes of the BOM of the indicated Unicode charset. Returns 0 when no Unicode signature is recognized.
Specified by:
getUnicodeSetImpl in class CharsetICU
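The copied description refers to signature detection in the style of ucnv_detectUnicodeSignature(). A minimal sketch of that idea, restricted to UTF-8 (whose signature is the byte sequence EF BB BF, the encoding of U+FEFF), might look as follows. The class Utf8SignatureSketch and the method utf8SignatureLength are hypothetical and are not part of ICU4J.

```java
// Illustrative only: handles the UTF-8 signature alone, unlike a general detector.
final class Utf8SignatureSketch {
    // U+FEFF encoded in UTF-8.
    private static final byte[] UTF8_BOM = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };

    // Returns the BOM length (3) if input starts with the UTF-8 signature, otherwise 0.
    static int utf8SignatureLength(byte[] input) {
        if (input.length < UTF8_BOM.length) {
            return 0;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (input[i] != UTF8_BOM[i]) {
                return 0;
            }
        }
        return UTF8_BOM.length;
    }

    public static void main(String[] args) {
        byte[] withBom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x41 };
        byte[] withoutBom = { 0x41, 0x42 };
        System.out.println(utf8SignatureLength(withBom));    // prints 3
        System.out.println(utf8SignatureLength(withoutBom)); // prints 0
    }
}
```

A caller that wants to ignore the signature can skip the returned number of bytes before decoding the rest of the stream.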
-