Package com.ibm.icu.charset
Class CharsetCESU8
java.lang.Object
java.nio.charset.Charset
com.ibm.icu.charset.CharsetICU
com.ibm.icu.charset.CharsetUTF8
com.ibm.icu.charset.CharsetCESU8
- All Implemented Interfaces:
Comparable<Charset>
This class sets isCESU8 to true in its superclass, CharsetUTF8, so that the Charset framework can open
the CESU-8 variant of the UTF-8 converter without extra setup work. CESU-8 encodes and decodes each supplementary
character as 6 bytes (two 3-byte sequences, one per UTF-16 surrogate) instead of the 4 bytes used by standard UTF-8.
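The size difference can be illustrated without ICU on the classpath. The following is a minimal sketch, not the ICU converter: `Cesu8Sketch`, `encodeCesu8`, and `hex` are hypothetical names introduced here, and the helper assumes well-formed UTF-16 input. It encodes every UTF-16 code unit on its own, which is what turns one supplementary character into two 3-byte sequences.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class Cesu8Sketch {

    // Hypothetical helper (not part of ICU): encodes each UTF-16 code unit
    // separately, so a surrogate pair becomes two 3-byte sequences.
    static byte[] encodeCesu8(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c < 0x80) {                       // 1 byte: ASCII
                out.write(c);
            } else if (c < 0x800) {               // 2 bytes
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {                              // 3 bytes, surrogates included
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex for display.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String s = "\uD83D\uDE00"; // U+1F600, a supplementary character
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // F0 9F 98 80
        System.out.println(hex(encodeCesu8(s)));                     // ED A0 BD ED B8 80
    }
}
```

For characters in the Basic Multilingual Plane the two encodings produce identical bytes; only supplementary characters differ.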
-
Nested Class Summary
Nested classes/interfaces inherited from class com.ibm.icu.charset.CharsetUTF8
CharsetUTF8.CharsetDecoderUTF8, CharsetUTF8.CharsetEncoderUTF8
-
Field Summary
Fields inherited from class com.ibm.icu.charset.CharsetICU
codepage, conversionType, hasFromUnicodeFallback, hasToUnicodeFallback, icuCanonicalName, maxBytesPerChar, maxCharsPerByte, minBytesPerChar, name, options, platform, ROUNDTRIP_AND_FALLBACK_SET, ROUNDTRIP_SET, subChar, subChar1, subCharLen, unicodeMask
-
Constructor Summary
Constructors
CharsetCESU8(String icuCanonicalName, String javaCanonicalName, String[] aliases)
-
Method Summary
Modifier and Type: (package private) void
Method: getUnicodeSetImpl(UnicodeSet setFillIn, int which)
Description: This follows the ucnv.c method ucnv_detectUnicodeSignature() to detect the start of the stream, for example U+FEFF (the Unicode BOM/signature character), which can be ignored.
Methods inherited from class com.ibm.icu.charset.CharsetUTF8
newDecoder, newEncoder
Methods inherited from class com.ibm.icu.charset.CharsetICU
contains, forNameICU, getCharset, getCompleteUnicodeSet, getNonSurrogateUnicodeSet, getUnicodeSet, isFixedWidth, isSurrogate
Methods inherited from class java.nio.charset.Charset
aliases, availableCharsets, canEncode, compareTo, decode, defaultCharset, displayName, displayName, encode, encode, equals, forName, hashCode, isRegistered, isSupported, name, toString
-
Constructor Details
-
CharsetCESU8
-
-
Method Details
-
getUnicodeSetImpl
Description copied from class: CharsetICU
This follows the ucnv.c method ucnv_detectUnicodeSignature() to detect the start of the stream, for example U+FEFF (the Unicode BOM/signature character), which can be ignored. Detects Unicode signature byte sequences at the start of the byte stream and returns the number of bytes of the BOM of the indicated Unicode charset. 0 is returned when no Unicode signature is recognized.
- Overrides:
getUnicodeSetImpl in class CharsetUTF8
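The signature detection that the copied description refers to can be sketched in plain Java. This is a rough illustration of the idea only, not the ICU implementation and not the body of getUnicodeSetImpl: `BomSketch`, `signatureLength`, and `startsWith` are hypothetical names, and only the UTF-8, UTF-16, and UTF-32 signatures are assumed to be of interest.

```java
public class BomSketch {

    // Returns true when data begins with the given signature bytes.
    static boolean startsWith(byte[] data, int... sig) {
        if (data.length < sig.length) {
            return false;
        }
        for (int i = 0; i < sig.length; i++) {
            if ((data[i] & 0xFF) != sig[i]) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: length in bytes of a leading Unicode signature,
    // or 0 when none is recognized. The UTF-32 checks come first because the
    // UTF-32LE signature begins with the UTF-16LE signature.
    static int signatureLength(byte[] data) {
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return 4; // UTF-32LE
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return 4; // UTF-32BE
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return 3;       // UTF-8
        if (startsWith(data, 0xFE, 0xFF)) return 2;             // UTF-16BE
        if (startsWith(data, 0xFF, 0xFE)) return 2;             // UTF-16LE
        return 0;
    }

    public static void main(String[] args) {
        byte[] utf8WithBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(signatureLength(utf8WithBom));      // 3
        System.out.println(signatureLength(new byte[] {'h'})); // 0
    }
}
```

A caller would typically skip the returned number of bytes before decoding the rest of the stream.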
-