DCMTK  Version 3.6.1 20120515
OFFIS DICOM Toolkit
OFCharacterEncoding Class Reference

A class for managing and converting between different character encodings.

Public Member Functions

 OFCharacterEncoding ()
 constructor.
 ~OFCharacterEncoding ()
 destructor
void clear ()
 clear the internal state.
OFBool getTransliterationMode () const
 get mode specifying whether a character that cannot be represented in the destination character encoding is approximated through one or more characters that look similar to the original one
OFBool getDiscardIllegalSequenceMode () const
 get mode specifying whether characters that cannot be represented in the destination character encoding will be silently discarded
OFCondition setTransliterationMode (const OFBool mode)
 set mode specifying whether a character that cannot be represented in the destination character encoding is approximated through one or more characters that look similar to the original one.
OFCondition setDiscardIllegalSequenceMode (const OFBool mode)
 set mode specifying whether characters that cannot be represented in the destination character encoding will be silently discarded.
const OFString & getLocaleEncoding () const
 get the current locale's character encoding
OFCondition updateLocaleEncoding ()
 updates the current locale's character encoding.
OFCondition selectEncoding (const OFString &fromEncoding, const OFString &toEncoding)
 select source and destination character encoding for subsequent conversion(s).
OFCondition convertString (const OFString &fromString, OFString &toString, const OFBool clearMode=OFTrue)
 convert the given string between the selected character encodings.
OFCondition convertString (const char *fromString, const size_t fromLength, OFString &toString, const OFBool clearMode=OFTrue)
 convert the given string between the selected character encodings.

Static Public Member Functions

static OFBool isLibraryAvailable ()
 check whether the underlying character encoding library is available.
static OFString getLibraryVersionString ()
 get version information of the underlying character encoding library.
static size_t countCharactersInUTF8String (const OFString &utf8String)
 count characters in given UTF-8 string and return the resulting number of so-called "code points".

Protected Types

typedef void * T_Descriptor
 type of the conversion descriptor (used by libiconv)

Protected Member Functions

OFCondition openDescriptor (T_Descriptor &descriptor, const OFString &fromEncoding, const OFString &toEncoding)
 allocate conversion descriptor for the given source and destination character encoding.
OFCondition closeDescriptor (T_Descriptor &descriptor)
 deallocate the given conversion descriptor that was previously allocated with openDescriptor().
OFBool isDescriptorValid (const T_Descriptor descriptor)
 check whether the given conversion descriptor is valid, i.e. has been allocated by a previous call to openDescriptor()
OFCondition convertString (T_Descriptor descriptor, const char *fromString, const size_t fromLength, OFString &toString, const OFBool clearMode=OFTrue)
 convert the given string between the specified character encodings.

Private Member Functions

 OFCharacterEncoding (const OFCharacterEncoding &)
OFCharacterEncoding & operator= (const OFCharacterEncoding &)
void createErrnoCondition (OFCondition &status, OFString message, const unsigned short code)
 create an error condition based on the current value of "errno" and the given parameters.

Private Attributes

OFString LocaleEncoding
 current locale's character encoding
T_Descriptor ConversionDescriptor
 conversion descriptor used by libiconv
OFBool TransliterationMode
 transliteration mode (default: disabled)
OFBool DiscardIllegalSequenceMode
 discard illegal sequence mode (default: disabled)

Friends

class DcmSpecificCharacterSet

Detailed Description

A class for managing and converting between different character encodings.

The implementation relies on the libiconv toolkit (if available).


Constructor & Destructor Documentation

OFCharacterEncoding::OFCharacterEncoding ( )

constructor.

Initializes the member variables, which includes the current locale's character encoding.


Member Function Documentation

void OFCharacterEncoding::clear ( )

clear the internal state.

This also closes the conversion descriptor if it was allocated before, so selectEncoding() has to be called again before a string can be converted to a new character encoding.

OFCondition OFCharacterEncoding::closeDescriptor ( T_Descriptor &  descriptor) [protected]

deallocate the given conversion descriptor that was previously allocated with openDescriptor().

Please do not pass arbitrary values to this method, since this will result in a segmentation fault.

Parameters:
descriptor  conversion descriptor to be closed. After the descriptor has been deallocated, 'descriptor' is set to an invalid value (see isDescriptorValid()).
Returns:
status, EC_Normal if successful, an error code otherwise. In case an invalid descriptor is passed, it is not regarded as an error.

OFCondition OFCharacterEncoding::convertString ( const OFString &  fromString,
OFString &  toString,
const OFBool  clearMode = OFTrue
)

convert the given string between the selected character encodings.

That means selectEncoding() has to be called prior to this method.

Parameters:
fromString  input string to be converted (using the source character encoding)
toString  reference to variable where the converted string (using the destination character encoding) is stored (or appended, see parameter 'clearMode')
clearMode  flag indicating whether to clear the variable 'toString' before appending the converted string
Returns:
status, EC_Normal if successful, an error code otherwise

OFCondition OFCharacterEncoding::convertString ( const char *  fromString,
const size_t  fromLength,
OFString &  toString,
const OFBool  clearMode = OFTrue
)

convert the given string between the selected character encodings.

That means selectEncoding() has to be called prior to this method. Since the length of the input string has to be specified explicitly, the string can contain embedded NUL bytes.

Parameters:
fromString  input string to be converted (using the source character encoding)
fromLength  length of the input string (number of bytes without the trailing NUL byte)
toString  reference to variable where the converted string (using the destination character encoding) is stored (or appended, see parameter 'clearMode')
clearMode  flag indicating whether to clear the variable 'toString' before appending the converted string
Returns:
status, EC_Normal if successful, an error code otherwise

OFCondition OFCharacterEncoding::convertString ( T_Descriptor  descriptor,
const char *  fromString,
const size_t  fromLength,
OFString &  toString,
const OFBool  clearMode = OFTrue
) [protected]

convert the given string between the specified character encodings.

Since the length of the input string has to be specified explicitly, the string can contain embedded NUL bytes.

Parameters:
descriptor  previously allocated conversion descriptor to be used for the conversion of the character encodings
fromString  input string to be converted (using the source character encoding)
fromLength  length of the input string (number of bytes without the trailing NUL byte)
toString  reference to variable where the converted string (using the destination character encoding) is stored (or appended, see parameter 'clearMode')
clearMode  flag indicating whether to clear the variable 'toString' before appending the converted string
Returns:
status, EC_Normal if successful, an error code otherwise

static size_t OFCharacterEncoding::countCharactersInUTF8String ( const OFString &  utf8String) [static]

count characters in given UTF-8 string and return the resulting number of so-called "code points".

Please note that invalid UTF-8 encodings are not handled properly. ASCII strings (7-bit) are also supported, although OFString::length() is probably much faster.

Parameters:
utf8String  valid character string with UTF-8 encoding
Returns:
number of characters (code points) in given UTF-8 string
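
The difference between bytes and code points can be illustrated with a small standalone sketch. It does not use DCMTK and is not necessarily how countCharactersInUTF8String() is implemented; the helper name countUtf8CodePoints is hypothetical. It relies only on the UTF-8 rule that continuation bytes have the bit pattern 10xxxxxx, and like the method above it assumes valid input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of DCMTK): count UTF-8 code points by
// counting every byte that is NOT a continuation byte (10xxxxxx).
// The input is assumed to be valid UTF-8; invalid input is not detected.
std::size_t countUtf8CodePoints(const std::string &utf8)
{
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < utf8.size(); ++i)
    {
        const unsigned char byte = static_cast<unsigned char>(utf8[i]);
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For a pure ASCII string the result equals the byte length. For the two-byte sequence 0xC3 0xA4 (the letter "ä") the helper returns 1, whereas the byte length is 2.
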

void OFCharacterEncoding::createErrnoCondition ( OFCondition &  status,
OFString  message,
const unsigned short  code
) [private]

create an error condition based on the current value of "errno" and the given parameters.

The function OFStandard::strerror() is used to map the numerical value of the error to a textual description.

Parameters:
status  reference to variable where the condition is stored
message  message text that is used as a prefix to strerror()
code  unique status code of the error condition
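
The idea of prefixing a message to the textual description of "errno" can be sketched with the standard library alone. The helper describeLastError is hypothetical and uses std::strerror() in place of OFStandard::strerror(); it returns a plain string rather than an OFCondition.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not part of DCMTK): combine a caller-supplied prefix
// with the textual description of the current errno value.
std::string describeLastError(const std::string &prefix)
{
    // Copy errno first, because later library calls may overwrite it.
    const int savedErrno = errno;
    return prefix + std::strerror(savedErrno);
}
```

The helper should be called directly after the failing function, before any other call that might modify errno.
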

OFBool OFCharacterEncoding::getDiscardIllegalSequenceMode ( ) const

get mode specifying whether characters that cannot be represented in the destination character encoding will be silently discarded

Returns:
current value of the mode. OFTrue means that the mode is enabled, OFFalse means disabled.

static OFString OFCharacterEncoding::getLibraryVersionString ( ) [static]

get version information of the underlying character encoding library.

Typical output format: "LIBICONV, Version 1.14". If the library is not available, the output is: "<no character encoding library available>"

Returns:
name and version number of the character encoding library

const OFString & OFCharacterEncoding::getLocaleEncoding ( ) const

get the current locale's character encoding

Returns:
the current locale's character encoding

OFBool OFCharacterEncoding::getTransliterationMode ( ) const

get mode specifying whether a character that cannot be represented in the destination character encoding is approximated through one or more characters that look similar to the original one

Returns:
current value of the mode. OFTrue means that the mode is enabled, OFFalse means disabled.

OFBool OFCharacterEncoding::isDescriptorValid ( const T_Descriptor  descriptor) [protected]

check whether the given conversion descriptor is valid, i.e. has been allocated by a previous call to openDescriptor()

Parameters:
descriptor  conversion descriptor to be checked
Returns:
OFTrue if the conversion descriptor is valid, OFFalse otherwise
static OFBool OFCharacterEncoding::isLibraryAvailable ( ) [static]

check whether the underlying character encoding library is available.

If the library is not available, no conversion between different character encodings will be possible.

Returns:
OFTrue if the character encoding library is available, OFFalse otherwise

OFCondition OFCharacterEncoding::openDescriptor ( T_Descriptor &  descriptor,
const OFString &  fromEncoding,
const OFString &  toEncoding
) [protected]

allocate conversion descriptor for the given source and destination character encoding.

Please make sure that the descriptor is deallocated with closeDescriptor() when not needed any longer.

Parameters:
descriptor  reference to variable where the newly allocated conversion descriptor is stored
fromEncoding  name of the source character encoding
toEncoding  name of the destination character encoding
Returns:
status, EC_Normal if successful, an error code otherwise

OFCondition OFCharacterEncoding::selectEncoding ( const OFString &  fromEncoding,
const OFString &  toEncoding
)

select source and destination character encoding for subsequent conversion(s).

The encoding names can be found in the documentation of the libiconv toolkit. Typical names are "ASCII", "ISO-8859-1" and "UTF-8". An empty string denotes the locale dependent character encoding (see getLocaleEncoding()).

Parameters:
fromEncoding  name of the source character encoding
toEncoding  name of the destination character encoding
Returns:
status, EC_Normal if successful, an error code otherwise
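
Selecting an encoding pair and then converting a length-delimited input mirrors the usual iconv call sequence (iconv_open, iconv, iconv_close). The following is a minimal sketch that calls the POSIX iconv interface directly instead of OFCharacterEncoding; it is not the class's implementation. The function name convertWithIconv and the use of exceptions are assumptions made for illustration. Note that iconv_open() takes the destination encoding first, which is the reverse of the parameter order of selectEncoding().

```cpp
#include <cerrno>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of DCMTK): convert 'input' from one
// encoding to another using the POSIX iconv API.
std::string convertWithIconv(const std::string &input,
                             const char *fromEncoding,
                             const char *toEncoding)
{
    // iconv_open() expects the destination encoding as its first argument.
    iconv_t cd = iconv_open(toEncoding, fromEncoding);
    if (cd == (iconv_t)(-1))
        throw std::runtime_error("iconv_open failed");

    std::string output;
    char buffer[64];
    // The length is passed explicitly, so embedded NUL bytes are converted too.
    char *inPtr = const_cast<char *>(input.data());
    size_t inLeft = input.size();
    while (inLeft > 0)
    {
        char *outPtr = buffer;
        size_t outLeft = sizeof(buffer);
        const size_t rc = iconv(cd, &inPtr, &inLeft, &outPtr, &outLeft);
        output.append(buffer, sizeof(buffer) - outLeft);
        // E2BIG only means that the output buffer was full; keep looping.
        if (rc == (size_t)(-1) && errno != E2BIG)
        {
            iconv_close(cd);
            throw std::runtime_error("iconv failed");
        }
    }
    // Flush a pending shift sequence (only relevant for stateful encodings).
    char *flushPtr = buffer;
    size_t flushLeft = sizeof(buffer);
    if (iconv(cd, NULL, NULL, &flushPtr, &flushLeft) != (size_t)(-1))
        output.append(buffer, sizeof(buffer) - flushLeft);

    iconv_close(cd);
    return output;
}
```

Converting the single ISO-8859-1 byte 0xE4 to UTF-8 with this helper yields the two bytes 0xC3 0xA4. On systems where iconv is not part of the C library, linking against libiconv is required.
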

OFCondition OFCharacterEncoding::setDiscardIllegalSequenceMode ( const OFBool  mode)

set mode specifying whether characters that cannot be represented in the destination character encoding will be silently discarded.

By default, this mode is disabled.

Parameters:
mode  enable mode by OFTrue or disable it by OFFalse
Returns:
status, EC_Normal if successful, an error code otherwise

OFCondition OFCharacterEncoding::setTransliterationMode ( const OFBool  mode)

set mode specifying whether a character that cannot be represented in the destination character encoding is approximated through one or more characters that look similar to the original one.

By default, this mode is disabled.

Parameters:
mode  enable mode by OFTrue or disable it by OFFalse
Returns:
status, EC_Normal if successful, an error code otherwise
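
When iconv is used directly, comparable behaviour is commonly requested through suffixes on the destination encoding name passed to iconv_open(): "//TRANSLIT" for transliteration and "//IGNORE" for discarding characters that cannot be converted. The sketch below only builds such a name. Whether OFCharacterEncoding maps its two modes this way is an assumption, not something this page states, and the helper name buildDestinationName is hypothetical.

```cpp
#include <string>

// Hypothetical helper (not part of DCMTK): append the iconv_open() suffixes
// that request transliteration and/or discarding of unconvertible characters.
std::string buildDestinationName(const std::string &toEncoding,
                                 const bool transliterate,
                                 const bool discardIllegal)
{
    std::string name = toEncoding;
    if (transliterate)
        name += "//TRANSLIT";
    if (discardIllegal)
        name += "//IGNORE";
    return name;
}
```

The actual result of a transliterating conversion depends on the iconv implementation and, for some implementations, on the current locale.
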

OFCondition OFCharacterEncoding::updateLocaleEncoding ( )

updates the current locale's character encoding.

This is only needed if the locale setting changed during the lifetime of this object, because the current locale's character encoding is always determined in the constructor. If possible, the canonical encoding names listed in "config.charset" (see libiconv toolkit) are used.

Returns:
status, EC_Normal if successful, an error code otherwise
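
Why a refresh can be necessary is easiest to see with the POSIX facilities for querying the locale's encoding. The following sketch uses setlocale() and nl_langinfo(CODESET); it is an analogue for illustration and not a description of how this class determines the encoding. The helper name queryLocaleEncoding is hypothetical, and the names returned by nl_langinfo() are platform specific and need not match the canonical names mentioned above.

```cpp
#include <clocale>
#include <cstddef>
#include <langinfo.h>
#include <string>

// Hypothetical helper (not part of DCMTK): return the character encoding
// name of the locale selected by the environment (LANG, LC_ALL, ...).
std::string queryLocaleEncoding()
{
    // Adopt the environment's locale for character classification.
    std::setlocale(LC_CTYPE, "");
    const char *codeset = nl_langinfo(CODESET);
    return (codeset != NULL) ? std::string(codeset) : std::string();
}
```

A value obtained once and cached becomes stale if the program later changes the locale with setlocale(), which is the situation updateLocaleEncoding() is meant for.
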

The documentation for this class was generated from the following file:


Generated on Tue May 15 2012 for DCMTK Version 3.6.1 20120515 by Doxygen 1.7.5.1-20111027