ULS provides two sets of functions for converting text to or from UCS-2.

Take care when switching between them, because the two sets of functions order their parameters differently. The UniStr*Ucs functions specify the output buffer first, followed by the input buffer. The UniUconv*Ucs functions do the opposite: they specify the input buffer parameters first, followed by the output buffer parameters.

In addition, the parameters used to specify length behave somewhat differently between the two sets of functions; see the next section for details.

UniStrToUcs and UniStrFromUcs

These functions may be used to convert a string as a single operation, using relatively straightforward syntax:

UniStrToUcs
    Converts a multi-byte codepage string to a UCS-2 string.
UniStrFromUcs
    Converts a UCS-2 string to a multi-byte codepage string.

Each of these functions takes the following parameters:

If the function returns successfully, the converted string is placed in the output buffer; the other parameters are unchanged. If an error is encountered, the conversion aborts and an error code is returned. Character substitution is always enabled, regardless of the UconvObject's attributes.

UniUconvToUcs and UniUconvFromUcs

These functions are also used to convert strings, but with a greater level of control and error recovery:

UniUconvToUcs
    Converts a multi-byte codepage string to a UCS-2 string.
UniUconvFromUcs
    Converts a UCS-2 string to a multi-byte codepage string.

Each of these functions takes the following parameters:

When the function returns, the output buffer contains the converted string, and the substitution count contains the number of character substitutions performed. If an error is encountered during conversion, an error code is returned, and the parameters are modified as follows:

If the conversion completes with no errors, the input length value will be set to 0 when the function returns.
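The way the length parameters report progress can be illustrated with a small portable model. The sketch below is not the ULS implementation and does not call any ULS function: sketch_to_ucs, its status codes, and its one-byte-to-one-unit mapping are all hypothetical, and the pointer-advancing behaviour is an assumption modelled on iconv-style interfaces. It is only meant to show why an input length of 0 indicates that all of the input was consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch; these are not ULS error codes. */
enum { SKETCH_OK = 0, SKETCH_BUFFER_FULL = 1 };

/*
 * Toy stand-in for a UniUconvToUcs-style call (hypothetical, not ULS).
 * Each input byte is widened to one 16-bit unit, which is an assumed
 * Latin-1 style mapping chosen only to keep the example short.
 * The buffer pointers and remaining-length counters are advanced as data
 * is consumed, so the caller can tell how far the conversion got.
 */
static int sketch_to_ucs(const unsigned char **in, size_t *in_left,
                         uint16_t **out, size_t *out_left)
{
    assert(in && in_left && out && out_left);

    while (*in_left > 0) {
        if (*out_left == 0)
            return SKETCH_BUFFER_FULL;  /* stop early; counters show progress */
        **out = (uint16_t)**in;         /* convert one character */
        (*in)++;
        (*in_left)--;
        (*out)++;
        (*out_left)--;
    }
    return SKETCH_OK;                   /* all input consumed: *in_left is 0 */
}
```

With this model, converting the five bytes of "Hello" into a three-unit output buffer returns SKETCH_BUFFER_FULL and leaves the input length at 2. Calling it again with a fresh output buffer converts the remaining two characters and leaves the input length at 0.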

Character substitution

During conversion, particularly from UCS-2 to another codepage, the input string often contains a character that has no equivalent in the target codepage. When this happens, the unsupported character is normally replaced by a generic symbol, or "substitution character". This process is known as character substitution.

Every codepage (as well as UCS-2) has its own default substitution character which is normally used when converting text to that codepage. For example, under codepage 850 the default substitution character is 0x7F ("⌂"). The default substitution character for UCS-2 is U+FFFD (the Unicode replacement character).

You may specify a different substitution character in the UconvObject, either through the conversion specifier passed to UniCreateUconvObject, or subsequently through UniSetUconvObject. The substitution character should be a displayable glyph under the target codepage.

When using the UniUconvToUcs and UniUconvFromUcs functions, substitution may be disabled by setting the attributes of the UconvObject accordingly. However, if substitution is disabled, these functions will return with an error condition (with the results described above) whenever an unsupported character is encountered.
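The difference between substitution being enabled and disabled can be modelled in a few lines of portable C. This is a sketch under stated assumptions, not ULS code: sketch_conv, sketch_from_ucs, and the SUBST_* codes are hypothetical names, and the target "codepage" is assumed to be a simple single-byte set in which only code points below 0x100 are representable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch; these are not ULS error codes. */
enum { SUBST_OK = 0, SUBST_UNSUPPORTED = 2 };

/* Hypothetical stand-in for the substitution settings of a conversion object. */
typedef struct {
    unsigned char subchar;  /* byte written in place of an unsupported character */
    int substitute;         /* nonzero: substitution enabled; zero: disabled */
} sketch_conv;

/*
 * Toy UCS-2 to single-byte conversion (hypothetical, not ULS).
 * Assumption: only code points below 0x100 exist in the target codepage.
 * The caller must supply an output buffer of at least in_len bytes.
 * On return, *substitutions holds the number of replacements made so far.
 */
static int sketch_from_ucs(const sketch_conv *cv,
                           const uint16_t *in, size_t in_len,
                           unsigned char *out, size_t *substitutions)
{
    assert(cv && in && out && substitutions);

    *substitutions = 0;
    for (size_t i = 0; i < in_len; i++) {
        if (in[i] < 0x100) {
            out[i] = (unsigned char)in[i];  /* representable: copy across */
        } else if (cv->substitute) {
            out[i] = cv->subchar;           /* replace and count it */
            (*substitutions)++;
        } else {
            return SUBST_UNSUPPORTED;       /* substitution disabled: stop */
        }
    }
    return SUBST_OK;
}
```

Given the input 'A', U+20AC, 'B' and a substitution character of '?', the model produces "A?B" with a substitution count of 1 when substitution is enabled. With substitution disabled it stops at U+20AC and returns SUBST_UNSUPPORTED.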

