HP OpenVMS Systems Documentation

HP OpenVMS/Hangul RTL Korean Processing (HSY$) Manual

1.2 Features of HSYSHR

HSYSHR provides the following features and capabilities:

HSYSHR performs a wide range of general multi-byte processing operations. You can call the HSY$ routines instead of writing your own code to perform these operations.
Routines in HSYSHR follow the OpenVMS Procedure Calling Standard. This allows you to call any HSY$ routine from any programming language supported in OpenVMS/Hangul, thus increasing program flexibility.
Because all routines are shared, they take up less of a process's virtual address space.
When a new version of HSYSHR is installed, you do not need to revise your calling program, and generally do not need to relink.

1.3 Linking with HSYSHR

Routines in HSYSHR execute entirely in the mode of the caller and are intended to be called in user mode. To link an application that contains explicit calls to HSYSHR, use the following LINK command:

$ LINK program, SYS$LIBRARY:HSYIMGLIB.OLB/LIBRARY

Chapter 2
MULTI-BYTE CHARACTER CONCEPTS

This chapter describes some important multi-byte character concepts that are used throughout the documentation.

2.1 What is Multi-byte Character?

The DEC Hangul character set is implemented as a multi-byte character set containing Korean characters, punctuation marks, and various kinds of symbols. Each multi-byte character is a two-byte character with the most significant bit of the first byte always set. The OpenVMS/Hangul operating system adopts the DEC Hangul character set, and all Korean characters are represented as multi-byte characters from that character set. For a detailed discussion of the DEC Hangul character set, refer to the OpenVMS/Hangul User Guide.
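The rule that the first byte of a multi-byte character always has its most significant bit set can be illustrated with a minimal C sketch. The function names below are hypothetical and are not part of HSYSHR. The sketch assumes a simplified model in which any byte with the most significant bit set, followed by at least one more byte, starts a two-byte character.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the byte has its most significant
   bit set, i.e. it may be the first byte of a two-byte character. */
int is_lead_byte(unsigned char b)
{
    return (b & 0x80) != 0;
}

/* Hypothetical helper: length in bytes of the character that starts at p,
   given that `remaining` bytes are left in the buffer.
   Simplified model (an assumption): a lead byte followed by any byte
   forms a two-byte character. Returns 0 for an empty buffer. */
size_t char_len(const unsigned char *p, size_t remaining)
{
    if (remaining == 0)
        return 0;
    if (is_lead_byte(p[0]) && remaining >= 2)
        return 2;
    return 1;
}

/* Counts characters (not bytes) in s[0..len). */
size_t count_chars(const unsigned char *s, size_t len)
{
    size_t i = 0, n = 0;
    while (i < len) {
        i += char_len(s + i, len - i);
        n++;
    }
    return n;
}
```

Under this model, the four-byte buffer 41 B0 A1 42 (hex) holds three characters: "A", the two-byte character B0A1, and "B".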

2.2 Proper Character Boundary

In HSYSHR, most routines process data character by character rather than by the conventional byte-by-byte approach. Some routines require the input character pointer to point at a proper character boundary in the user buffer. "Pointing at the proper character boundary" means that the character pointer must not point to a non-first byte of a multi-byte character.
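As an illustration of the boundary rule, the following C sketch checks whether a byte position is on a proper character boundary by scanning forward from the start of the buffer. The function name is hypothetical and is not an HSYSHR routine. It assumes the simplified model in which a byte with the most significant bit set, followed by another byte, starts a two-byte character.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if byte position `pos` in buf[0..len)
   is on a proper character boundary, 0 otherwise.
   The position just past the last byte (pos == len) counts as a boundary.
   Simplified model (an assumption): a byte with the most significant bit
   set, followed by another byte, starts a two-byte character. */
int on_char_boundary(const unsigned char *buf, size_t len, size_t pos)
{
    size_t i = 0;

    if (pos > len)
        return 0;
    while (i < pos) {
        if ((buf[i] & 0x80) && i + 1 < len)
            i += 2;   /* step over a two-byte character */
        else
            i += 1;   /* step over a one-byte character */
    }
    /* If the scan overshot pos, pos was inside a two-byte character. */
    return i == pos;
}
```

A forward scan is used because a byte examined in isolation does not reveal whether it is the first or the second byte of a two-byte character.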

2.3 Full Form and Half Form Character

The DEC Hangul character set includes a set of two-byte ASCII characters. To distinguish them from the conventional one-byte 7-bit ASCII characters, the terms "full form" and "half form" are used: full form characters are the two-byte ASCII characters, whereas half form characters are the one-byte 7-bit ASCII characters. The conversion routines in HSYSHR convert between full form and half form characters. In applications where character matching must treat full form and half form characters as equivalent, you can call the searching routines in HSYSHR and specify the conversion flag argument. Note that uppercasing and lowercasing can both be applied to full form characters.

2.4 Multi-byte Character Unsigned Longword Representation

In HSYSHR, the representation of a multi-byte character in a single character argument differs from that in a character string argument. A single character argument uses an unsigned longword integer representation, whereas characters in a string argument use the normal character string representation. For example, the two-byte character B0A1 (hex) is represented differently in the following two cases.

Single character argument: (VMS Usage - longword_unsigned)

         +--+--+--+--+
         |00|00|B0|A1|
         +--+--+--+--+
         H           L

In a string argument: (VMS Usage - char_string)

             --+--+--+-   +--+
         ....  |A1|B0|....|  | start of string
             --+--+--+-   +--+
         H                   L

The read routines in HSYSHR read the buffer in character string format and return the character read in unsigned longword format. The write routines take a character in unsigned longword format and write it to the buffer in character string format.
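The relationship between the two representations can be sketched in C as follows. This is an illustration only, not the HSYSHR implementation, and the function names are hypothetical. It assumes the diagrams are read as follows: in a string, B0 is stored at the lower address and A1 at the next address, while the longword holds the value 0000B0A1 (hex).

```c
#include <stdint.h>

/* Hypothetical helper: reads the two-byte character stored at p in
   character string format (first byte at the lower address) and returns
   it in unsigned longword format, e.g. bytes B0 A1 -> 0x0000B0A1. */
uint32_t mb_to_longword(const unsigned char *p)
{
    return ((uint32_t)p[0] << 8) | (uint32_t)p[1];
}

/* Hypothetical helper: writes a two-byte character given in unsigned
   longword format to p in character string format,
   e.g. 0x0000B0A1 -> bytes B0 A1. */
void longword_to_mb(uint32_t ch, unsigned char *p)
{
    p[0] = (unsigned char)((ch >> 8) & 0xFF);  /* first byte  */
    p[1] = (unsigned char)(ch & 0xFF);         /* second byte */
}
```

Because the conversion uses shifts rather than a memory copy, the sketch does not depend on the byte order of the machine on which it runs.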

HSY$ Reference Section

This section provides detailed discussions of the routines provided in the Korean Processing Run Time Library HSYSHR.

HSY$CH_MOVE

HSY$CH_MOVE moves a substring from a specified source buffer to a specified destination buffer.

Format

HSY$CH_MOVE len,src,dst

Arguments

len

VMS usage: longword_signed

type: longword integer (signed)

access: read only

mechanism: by value

The length in bytes of the substring to be moved.

src

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: read only

mechanism: by value

The address of the starting position of the source buffer.

dst

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: read only

mechanism: by value

The address of the starting position of the destination buffer.

Description

This routine is multi-byte insensitive. If len does not specify a proper multi-byte character boundary, e.g. it indicates the second byte of a two-byte character, then only half of that multi-byte character is moved to the end of the destination string.
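The effect of a len value that is not on a character boundary can be illustrated with a portable C sketch. The functions below are hypothetical stand-ins, not HSY$CH_MOVE itself: the move is modelled as a plain byte copy, and the check assumes the simplified rule that a byte with the most significant bit set starts a two-byte character.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a multi-byte insensitive move:
   copies exactly len bytes, with no regard for character boundaries. */
void model_ch_move(size_t len, const unsigned char *src, unsigned char *dst)
{
    memmove(dst, src, len);
}

/* Hypothetical helper: returns 1 if moving the first len bytes of src
   would split a two-byte character, i.e. the last byte moved is the
   first byte of a two-byte character. Returns 0 otherwise. */
int move_splits_char(const unsigned char *src, size_t len)
{
    size_t i = 0;

    while (i < len) {
        if (src[i] & 0x80) {
            if (i + 1 >= len)
                return 1;   /* the second byte lies outside len */
            i += 2;
        } else {
            i += 1;
        }
    }
    return 0;
}
```

For the source bytes 41 B0 A1 (hex), a length of 2 moves "A" and only the first half of the character B0A1, whereas a length of 1 or 3 ends on a character boundary.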

HSY$DX_TRIM

HSY$DX_TRIM trims trailing one-byte and multi-byte spaces and TAB characters.

Format

HSY$DX_TRIM dst,src,[len]

RETURNS

VMS usage: cond_value

type: longword (unsigned)

access: write only

mechanism: by value

Arguments

dst

VMS usage: char_string

type: character string

access: write only

mechanism: by descriptor

The destination string to store the trimmed string.

src

VMS usage: char_string

type: character string

access: read only

mechanism: by descriptor

The source string that is to be trimmed.

len

VMS usage: word_signed

type: word integer (signed)

access: write only

mechanism: by reference

The length in bytes of the trimmed string. If this optional argument is not supplied, no length information of the trimmed string will be returned to the caller.

Description

dst and src can contain one-byte and multi-byte characters.

CONDITION VALUES RETURNED

LIB$_INVSTRDES Invalid string descriptor. A string descriptor has an invalid value in its DSC$B_CLASS field.

LIB$_STRTRU Procedure successfully completed. String truncated.

LIB$_FATERRLIB Fatal internal error. An internal consistency check has failed.

LIB$_INSVIRMEM Insufficient virtual memory.

SS$_NORMAL Procedure successfully completed.
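The trimming behaviour can be sketched in portable C as a forward scan that remembers where the last non-blank character ends. This is an illustrative model, not the HSY$DX_TRIM implementation, and the function name is hypothetical. It assumes that the multi-byte space is the two-byte code A1A1 (hex), as in KS C 5601, and that a byte with the most significant bit set, followed by another byte, starts a two-byte character.

```c
#include <stddef.h>

/* Hypothetical model: returns the length in bytes of s[0..len) after
   trailing blanks are dropped. Blanks are the one-byte space, the TAB
   character, and the two-byte space, assumed here to be A1 A1 (hex). */
size_t model_trim_len(const unsigned char *s, size_t len)
{
    size_t i = 0;     /* current scan position                   */
    size_t keep = 0;  /* end of the last non-blank character seen */

    while (i < len) {
        /* Simplified model: a lead byte plus one more byte is a
           two-byte character. */
        size_t n = ((s[i] & 0x80) && i + 1 < len) ? 2 : 1;
        int blank = (n == 1)
            ? (s[i] == ' ' || s[i] == '\t')
            : (s[i] == 0xA1 && s[i + 1] == 0xA1);

        i += n;
        if (!blank)
            keep = i;
    }
    return keep;
}
```

The scan runs forward rather than backward because the last byte of a buffer cannot, on its own, be classified as a one-byte character or the second byte of a two-byte character.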

HSY$DX_TRUNC

HSY$DX_TRUNC truncates the input string to the specified length.

Format

HSY$DX_TRUNC dst,src,offset,[len]

RETURNS

VMS usage: cond_value

type: longword (unsigned)

access: write only

mechanism: by value

Arguments

dst

VMS usage: char_string

type: character string

access: write only

mechanism: by descriptor

The specified destination string to store the truncated string.

src

VMS usage: char_string

type: character string

access: read only

mechanism: by descriptor

The specified source string to be truncated.

offset

VMS usage: word_signed

type: word integer (signed)

access: read only

mechanism: by reference

The offset in bytes from the starting position of the source string which indicates the position of the first character just after the truncated string. Note that this offset may not be on the proper character boundary, e.g. it may point to the second byte of a two-byte character.

len

VMS usage: word_signed

type: word integer (signed)

access: write only

mechanism: by reference

The length in bytes of the truncated string. If this optional argument is not supplied, no length information of the truncated string will be returned to the caller.

Description

The value returned in len may not necessarily be equal to the value specified in offset since offset may not be pointing at the first byte of a multi-byte character. In any case, the character indicated by offset will be treated as the first character that follows the truncated string.

CONDITION VALUES RETURNED

LIB$_INVSTRDES Invalid string descriptor. A string descriptor has an invalid value in its DSC$B_CLASS field.

LIB$_STRTRU Procedure successfully completed. Truncated string is further truncated due to insufficient space allocated in the destination string buffer.

LIB$_FATERRLIB Fatal internal error. An internal consistency check has failed.

LIB$_INSVIRMEM Insufficient virtual memory.

SS$_NORMAL Procedure successfully completed.

HSY$TRIM

HSY$TRIM trims trailing one-byte and multi-byte spaces and TAB characters.

Format

HSY$TRIM str,len

RETURNS

VMS usage: longword_signed

type: longword integer (signed)

access: write only

mechanism: by value

The offset in bytes from the starting position of the input string which indicates the position of the terminating character of the trimmed string. If the terminating character is a multi-byte character, the returned offset will be pointing to the first byte of the multi-byte character.

Arguments

str

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: read only

mechanism: by value

The address of the starting position of the input string to be trimmed.

len

VMS usage: longword_signed

type: longword integer (signed)

access: read only

mechanism: by value

The length in bytes of the input string.

Description

str can contain one-byte and multi-byte characters.

HSY$TRUNC

HSY$TRUNC returns the position of the first character that follows the truncated string.

Format

HSY$TRUNC str,len,offset

RETURNS

VMS usage: longword_signed

type: longword integer (signed)

access: write only

mechanism: by value

The offset in bytes which indicates the position of the first character that immediately follows the truncated string. If this character is a multi-byte character, the offset points at the first byte of the multi-byte character.

Arguments

str

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: read only

mechanism: by value

The address of the starting position of the input string.

len

VMS usage: longword_signed

type: longword integer (signed)

access: read only

mechanism: by value

The length in bytes of the input string.

offset

VMS usage: longword_signed

type: longword integer (signed)

access: read only

mechanism: by value

The offset in bytes of the character that immediately follows the truncated string. It may not be on the proper character boundary, e.g. it can point to the second byte of a two-byte character.

Description

str can contain one-byte and multi-byte characters. This routine helps you to position offset at the proper character boundary. Its function is similar to that of routine HSY$CH_CURR, but with a different parameter interface.
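The positioning behaviour described above can be modelled in portable C. The function below is a hypothetical sketch, not HSY$TRUNC itself. It assumes the simplified rule that a byte with the most significant bit set, followed by another byte, starts a two-byte character, and it moves an offset that falls on a second byte back to the first byte of that character.

```c
#include <stddef.h>

/* Hypothetical model: given a byte offset into str[0..len), returns the
   offset of the first byte of the character that contains it. An offset
   already on a character boundary is returned unchanged; an offset
   beyond len is clamped to len. */
size_t model_trunc(const unsigned char *str, size_t len, size_t offset)
{
    size_t i = 0;

    if (offset > len)
        offset = len;
    while (i < offset) {
        /* Simplified model: a lead byte plus one more byte is a
           two-byte character. */
        size_t n = ((str[i] & 0x80) && i + 1 < len) ? 2 : 1;

        if (i + n > offset)
            break;      /* offset falls inside this character */
        i += n;
    }
    return i;
}
```

For the bytes 41 B0 A1 42 (hex), an offset of 2 points at the second byte of B0A1, so the model returns 1. Offsets 0, 1, 3, and 4 are already on character boundaries and are returned unchanged.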

HSY$CH_GCHAR

HSY$CH_GCHAR reads the current character.

Format

HSY$CH_GCHAR cur,end

RETURNS

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: write only

mechanism: by value

The current character.

Arguments

cur

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: read only

mechanism: by value

The address of the current position of the specified current character. Note that this address must be on the proper character boundary, e.g. it should not point to the second byte of a two-byte character.

end

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: read only

mechanism: by value

The address of the string terminating position plus one as illustrated below:

         +---+---+---+---+
     ..  |   |   |   |   |
         +---+---+---+---+
              string      ^
                          end

Description

This routine reads a character with end-of-buffer checking. FFFF (hex) is returned on an attempt to read past the end of the buffer. If the current character is a one-byte 7-bit control character or a one-byte 8-bit character (e.g. an 8-bit character followed by a 7-bit control character), that one-byte 7-bit or 8-bit character is returned. The current pointer is not updated, since cur is passed by value.

HSY$CH_GNEXT

HSY$CH_GNEXT reads the current character and advances the current pointer to the next character.

Format

HSY$CH_GNEXT cur,end

RETURNS

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: write only

mechanism: by value

The current character.

Arguments

cur

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: modify

mechanism: by reference

The address of the current position of the specified current character. Note that this address must be on the proper character boundary, e.g. it should not point to the second byte of a two-byte character.

end

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: read only

mechanism: by value

The address of the string terminating position plus one as illustrated below:

         +---+---+---+---+
     ..  |   |   |   |   |
         +---+---+---+---+
              string      ^
                          end

Description

This routine reads a character with end-of-buffer checking. FFFF (hex) is returned on an attempt to read past the end of the buffer. If the current character is a one-byte 7-bit control character or a one-byte 8-bit character (e.g. an 8-bit character followed by a 7-bit control character), that one-byte 7-bit or 8-bit character is returned. The current pointer is updated: after the read, cur points to the next character position, on the proper character boundary. This routine is useful for reading characters successively.
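Successive reading with a pointer that is updated on each call can be modelled in portable C. The sketch below is illustrative only. model_gnext is a hypothetical stand-in that follows the description above and is not the HSY$CH_GNEXT entry point, and a C pointer-to-pointer stands in for the cur argument passed by reference. It assumes that a byte with the most significant bit set forms a two-byte character with the following byte unless that byte is a 7-bit control character or lies outside the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_END_OF_BUFFER 0xFFFFu  /* value returned past the end */

/* 7-bit control characters: 00-1F (hex) and 7F (hex). */
static int is_7bit_control(unsigned char b)
{
    return b < 0x20 || b == 0x7F;
}

/* Hypothetical model: returns the character at *cur in unsigned longword
   format and advances *cur to the next character. `end` is the address
   of the last byte of the string plus one. */
uint32_t model_gnext(const unsigned char **cur, const unsigned char *end)
{
    const unsigned char *p = *cur;
    uint32_t ch;

    if (p >= end)
        return MODEL_END_OF_BUFFER;

    if ((p[0] & 0x80) && p + 1 < end && !is_7bit_control(p[1])) {
        ch = ((uint32_t)p[0] << 8) | (uint32_t)p[1];  /* two-byte character */
        p += 2;
    } else {
        ch = p[0];                                    /* one-byte character */
        p += 1;
    }
    *cur = p;
    return ch;
}

/* Counts characters by reading successively until the end marker. */
size_t model_count(const unsigned char *buf, size_t len)
{
    const unsigned char *cur = buf;
    const unsigned char *end = buf + len;
    size_t n = 0;

    while (model_gnext(&cur, end) != MODEL_END_OF_BUFFER)
        n++;
    return n;
}
```

For the bytes 41 B0 A1 B0 0A (hex), successive calls in this model return 41, B0A1, B0, and 0A, and then FFFF once the pointer reaches end.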

HSY$CH_NEXTG

HSY$CH_NEXTG reads the next character, skipping the current character.

Format

HSY$CH_NEXTG cur,end

RETURNS

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: write only

mechanism: by value

The next character.

Arguments

cur

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: modify

mechanism: by reference

The address of the current position of the specified current character. Note that this address must be on the proper character boundary, e.g. it should not point to the second byte of a two-byte character.

end

VMS usage: longword_unsigned

type: longword integer (unsigned)

access: read only

mechanism: by value

The address of the string terminating position plus one as illustrated below:

         +---+---+---+---+
     ..  |   |   |   |   |
         +---+---+---+---+
              string      ^
                          end

Description

This routine reads the next character, skipping the current character. FFFF (hex) is returned on an attempt to read past the end of the buffer. If the next character is a one-byte 7-bit control character or a one-byte 8-bit character (e.g. an 8-bit character followed by a 7-bit control character), that one-byte 7-bit or 8-bit character is returned. The current pointer is updated: after the read, cur points to the next character position, on the proper character boundary.
