DCMTK
Version 3.6.7
OFFIS DICOM Toolkit
A class for managing and converting between different character encodings. More...
Public Types | |
enum | ConversionFlags { AbortTranscodingOnIllegalSequence = 1 , DiscardIllegalSequences = 2 , TransliterateIllegalSequences = 4 } |
Constants to control encoder behavior, e.g. regarding illegal character sequences. More... | |
Public Member Functions | |
OFCharacterEncoding () | |
constructor. More... | |
OFCharacterEncoding (const OFCharacterEncoding &rhs) | |
copy constructor. More... | |
~OFCharacterEncoding () | |
destructor | |
OFCharacterEncoding & | operator= (const OFCharacterEncoding &rhs) |
copy assignment. More... | |
operator OFBool () const | |
check whether this object refers to a valid encoder. More... | |
OFBool | operator! () const |
check whether this object does not refer to a valid encoder. More... | |
OFBool | operator== (const OFCharacterEncoding &rhs) const |
check whether two OFCharacterEncoding instances refer to the same encoder. More... | |
OFBool | operator!= (const OFCharacterEncoding &rhs) const |
check whether two OFCharacterEncoding instances do not refer to the same encoder. More... | |
void | clear () |
clear the internal state. More... | |
unsigned | getConversionFlags () const |
get flags controlling converter behavior, e.g. specifying how illegal character sequences should be handled during conversion. More... | |
OFCondition | setConversionFlags (const unsigned flags) |
set flags controlling converter behavior, e.g. specifying how illegal character sequences should be handled during conversion. More... | |
OFCondition | selectEncoding (const OFString &fromEncoding, const OFString &toEncoding) |
select source and destination character encoding for subsequent conversion(s). More... | |
OFCondition | convertString (const OFString &fromString, OFString &toString, const OFBool clearMode=OFTrue) |
convert the given string between the selected character encodings. More... | |
OFCondition | convertString (const char *fromString, const size_t fromLength, OFString &toString, const OFBool clearMode=OFTrue) |
convert the given string between the selected character encodings. More... | |
Static Public Member Functions | |
static OFBool | hasDefaultEncoding () |
determine whether the underlying implementation defines a default encoding. More... | |
static OFString | getLocaleEncoding () |
get the character encoding of the currently set global locale. More... | |
static OFBool | supportsConversionFlags (const unsigned flags) |
determine whether the underlying implementation supports the given conversion flags. More... | |
static OFBool | isLibraryAvailable () |
check whether character set conversion is available, e.g. the underlying encoding library is available. More... | |
static OFString | getLibraryVersionString () |
get version information of the underlying character encoding library. More... | |
static size_t | countCharactersInUTF8String (const OFString &utf8String) |
count characters in given UTF-8 string and return the resulting number of so-called "code points". More... | |
Private Attributes | |
OFshared_ptr< Implementation > | TheImplementation |
shared pointer to internal implementation (interface to character encoding library) | |
A class for managing and converting between different character encodings.
The implementation relies on ICONV (native implementation or libiconv) or ICU, depending on the configuration.
Constants to control encoder behavior, e.g. regarding illegal character sequences.
Currently defined constants may be used to control the implementation's behavior regarding illegal character sequences. An illegal character sequence is a sequence of characters in the source string that is only valid in the context of the source string's character set and has no valid representation in the character set of the destination string. Use these constants to control the transcoding behavior in case an illegal sequence is encountered.
OFCharacterEncoding::OFCharacterEncoding | ( | ) |
constructor.
Will create an OFCharacterEncoding instance that does not refer to an encoder.
OFCharacterEncoding::OFCharacterEncoding | ( | const OFCharacterEncoding & | rhs | ) |
copy constructor.
Will share the encoder of another OFCharacterEncoding instance.
rhs | another OFCharacterEncoding instance. |
void OFCharacterEncoding::clear | ( | ) |
clear the internal state.
This resets the converter and potentially frees all used resources (if this is the last OFCharacterEncoding instance referring to the encoder).
OFCondition OFCharacterEncoding::convertString | ( | const char * | fromString, |
const size_t | fromLength, | ||
OFString & | toString, | ||
const OFBool | clearMode = OFTrue |
||
) |
convert the given string between the selected character encodings.
That means selectEncoding() has to be called prior to this method. Since the length of the input string has to be specified explicitly, the string can contain more than one NULL byte.
fromString | input string to be converted (using the source character encoding). A NULL pointer is regarded as an empty string. |
fromLength | length of the input string (number of bytes without the trailing NULL byte) |
toString | reference to variable where the converted string (using the destination character encoding) is stored (or appended, see parameter 'clearMode') |
clearMode | flag indicating whether to clear the variable 'toString' before appending the converted string |
OFCondition OFCharacterEncoding::convertString | ( | const OFString & | fromString, |
OFString & | toString, | ||
const OFBool | clearMode = OFTrue |
||
) |
convert the given string between the selected character encodings.
That means selectEncoding() has to be called prior to this method.
fromString | input string to be converted (using the source character encoding) |
toString | reference to variable where the converted string (using the destination character encoding) is stored (or appended, see parameter 'clearMode') |
clearMode | flag indicating whether to clear the variable 'toString' before appending the converted string |
static size_t OFCharacterEncoding::countCharactersInUTF8String | ( | const OFString & | utf8String | ) |
count characters in given UTF-8 string and return the resulting number of so-called "code points".
Please note that invalid UTF-8 encodings are not handled properly. ASCII strings (7-bit) are also supported, although OFString::length() is probably much faster.
utf8String | valid character string with UTF-8 encoding |
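To illustrate what counting "code points" means, here is a standalone sketch in standard C++. It is not DCMTK's implementation; the function name countCodePoints is hypothetical. It counts every byte that is not a UTF-8 continuation byte and, like the method documented here, gives meaningful results only for valid UTF-8 input.

```cpp
#include <cstddef>
#include <string>

// Sketch: each code point in valid UTF-8 starts with exactly one byte that is
// not a continuation byte (bit pattern 10xxxxxx), so counting those bytes
// yields the number of code points.
std::size_t countCodePoints(const std::string &utf8)
{
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < utf8.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(utf8[i]);
        if ((byte & 0xC0) != 0x80)   // skip continuation bytes
            ++count;
    }
    return count;
}
```

For a pure 7-bit ASCII string the result equals the string length, which is why OFString::length() is the cheaper choice in that case.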
unsigned OFCharacterEncoding::getConversionFlags | ( | ) | const |
get flags controlling converter behavior, e.g. specifying how illegal character sequences should be handled during conversion.
static OFString OFCharacterEncoding::getLibraryVersionString | ( | ) |
get version information of the underlying character encoding library.
Typical output format: "LIBICONV, Version 1.14". If character encoding is not available the output is: "<no character encoding library available>"
static OFString OFCharacterEncoding::getLocaleEncoding | ( | ) |
get the character encoding of the currently set global locale.
static OFBool OFCharacterEncoding::hasDefaultEncoding | ( | ) |
determine whether the underlying implementation defines a default encoding.
Most implementations define a default encoding, i.e. one can pass an empty string as the toEncoding and/or fromEncoding argument(s) of selectEncoding() to select the current locale's encoding. However, some iconv implementations inside the C standard library do not understand this.
static OFBool OFCharacterEncoding::isLibraryAvailable | ( | ) |
check whether character set conversion is available, e.g. the underlying encoding library is available.
If not, no conversion between different character encodings will be possible (apart from the Windows-specific wide character conversion functions).
OFCharacterEncoding::operator OFBool | ( | ) | const |
check whether this object refers to a valid encoder.
OFBool OFCharacterEncoding::operator! | ( | ) | const |
check whether this object does not refer to a valid encoder.
OFBool OFCharacterEncoding::operator!= | ( | const OFCharacterEncoding & | rhs | ) | const |
check whether two OFCharacterEncoding instances do not refer to the same encoder.
rhs | another OFCharacterEncoding instance. |
OFCharacterEncoding& OFCharacterEncoding::operator= | ( | const OFCharacterEncoding & | rhs | ) |
copy assignment.
Effectively calls clear() and then shares the encoder of another OFCharacterEncoding instance.
rhs | another OFCharacterEncoding instance. |
OFBool OFCharacterEncoding::operator== | ( | const OFCharacterEncoding & | rhs | ) | const |
check whether two OFCharacterEncoding instances refer to the same encoder.
rhs | another OFCharacterEncoding instance. |
OFCondition OFCharacterEncoding::selectEncoding | ( | const OFString & | fromEncoding, |
const OFString & | toEncoding | ||
) |
select source and destination character encoding for subsequent conversion(s).
The encoding names can be found in the documentation of the underlying implementation (e.g. libiconv). Typical names are "ASCII", "ISO-8859-1" and "UTF-8". An empty string denotes the encoding of the current locale (see getLocaleEncoding()).
fromEncoding | name of the source character encoding |
toEncoding | name of the destination character encoding |
OFCondition OFCharacterEncoding::setConversionFlags | ( | const unsigned | flags | ) |
set flags controlling converter behavior, e.g. specifying how illegal character sequences should be handled during conversion.
flags | the conversion flags that shall be used, a combination of the OFCharacterEncoding::ConversionFlags constants, e.g. TransliterateIllegalSequences | DiscardIllegalSequences. |
static OFBool OFCharacterEncoding::supportsConversionFlags | ( | const unsigned | flags | ) |
determine whether the underlying implementation supports the given conversion flags.
flags | the flags to query, a combination of OFCharacterEncoding::ConversionFlags constants, e.g. TransliterateIllegalSequences | DiscardIllegalSequences. |