Qore Programming Language  1.12.2
QoreEncoding.h File Reference
#include <qore/common.h>
#include <qore/QoreThreadLock.h>
#include <cstring>
#include <map>
#include <string>
#include <strings.h>

Classes

class  QoreEncoding
 defines string encoding functions in Qore
 
class  QoreEncodingManager
 manages encodings in Qore
 

Typedefs

typedef qore_offset_t(* mbcs_charlen_t) (const char *str, size_t valid_len)
 for multi-byte encodings: gives the total number of bytes in the character, given one or more of its leading bytes
 
typedef size_t(* mbcs_end_t) (const char *str, const char *end, size_t num_chars, bool &invalid)
 for multi-byte character set encodings: gives the number of bytes occupied by the given number of characters
 
typedef unsigned(* mbcs_get_unicode_t) (const char *p)
 returns the Unicode code point for the given character; assumes that there is enough data for the character and that the character is valid (both must be checked before calling)
 
typedef size_t(* mbcs_length_t) (const char *str, const char *end, bool &invalid)
 for multi-byte character set encodings: gives the length of the string in characters
 
typedef size_t(* mbcs_pos_t) (const char *str, const char *ptr, bool &invalid)
 for multi-byte character set encodings: gives the character position of ptr within the string
 

Variables

DLLEXPORT const QoreEncoding * QCS_DEFAULT
 the default encoding for the Qore library
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_1
 latin-1, Western European encoding
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_10
 latin-6, Nordic character set
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_11
 Thai character set.
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_13
 latin-7, Baltic rim character set
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_14
 latin-8, Celtic character set
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_15
 latin-9, Western European with euro symbol
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_16
 latin-10, Southeast European character set
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_2
 latin-2, Central European encoding
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_3
 latin-3, Southern European character set
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_4
 latin-4, Northern European character set
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_5
 Cyrillic character set.
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_6
 Arabic character set.
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_7
 Greek character set.
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_8
 Hebrew character set.
 
DLLEXPORT const QoreEncoding * QCS_ISO_8859_9
 latin-5, Turkish character set
 
DLLEXPORT const QoreEncoding * QCS_KOI7
 Russian: Kod Obmena Informatsiey, 7 bit characters.
 
DLLEXPORT const QoreEncoding * QCS_KOI8_R
 Russian: Kod Obmena Informatsiey, 8 bit.
 
DLLEXPORT const QoreEncoding * QCS_KOI8_U
 Ukrainian: Kod Obmena Informatsiey, 8 bit.
 
DLLEXPORT const QoreEncoding * QCS_USASCII
 ASCII encoding
 
DLLEXPORT const QoreEncoding * QCS_UTF16
 UTF-16 (only UTF-* are multi-byte encodings)
 
DLLEXPORT const QoreEncoding * QCS_UTF16BE
 UTF-16BE (only UTF-* are multi-byte encodings)
 
DLLEXPORT const QoreEncoding * QCS_UTF16LE
 UTF-16LE (only UTF-* are multi-byte encodings)
 
DLLEXPORT const QoreEncoding * QCS_UTF8
 UTF-8 multi-byte encoding (only UTF-8 and UTF-16 are multi-byte encodings)
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1250
 Windows 1250: Central/Eastern European.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1251
 Windows 1251: Cyrillic: Russian, Ukrainian, Belarusian, Bulgarian, Serbian Cyrillic, Macedonian, ...
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1252
 Windows 1252: European: Spanish, French, German.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1253
 Windows 1253: Greek.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1254
 Windows 1254: Turkish.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1255
 Windows 1255: Hebrew.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1256
 Windows 1256: Arabic.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1257
 Windows 1257: Baltic.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_1258
 Windows 1258: Vietnamese.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_874
 Windows 874: Latin/Thai - similar to ISO-8859-11.
 
DLLEXPORT const QoreEncoding * QCS_WINDOWS_936
 Windows 936: Simplified Chinese.
 
DLLEXPORT QoreEncodingManager QEM
 the QoreEncodingManager object
 

Detailed Description

Provides definitions related to character encoding support in Qore, including the QoreEncoding class and QCS_DEFAULT, the default encoding for the Qore library.

Typedef Documentation

◆ mbcs_charlen_t

typedef qore_offset_t(* mbcs_charlen_t) (const char *str, size_t valid_len)

for multi-byte encodings: gives the total number of bytes in the character, given one or more of its leading bytes

Parameters
str        a pointer to the character data to check
valid_len  the number of valid bytes at the start of the character pointer
Returns
0 = invalid; a positive value = the total number of bytes in the character; a negative value = the number of additional bytes needed to perform the check
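
The return contract above can be illustrated with a minimal UTF-8 sketch. This is not the routine Qore ships; the function name utf8_charlen_sketch is hypothetical, qore_offset_t is replaced by a local stand-in typedef (assumed here to be a signed integer type), and the check is simplified: it does not reject overlong forms or surrogate code points.

```cpp
#include <cstddef>

// Stand-in for Qore's qore_offset_t (assumption: a signed integer type).
typedef long qore_offset_t;

// Hypothetical UTF-8 routine with the same shape as mbcs_charlen_t.
// Returns 0 for invalid data, a positive total byte count for a complete
// character, or a negative count of additional bytes still needed.
static qore_offset_t utf8_charlen_sketch(const char* str, size_t valid_len) {
    if (!valid_len)
        return -1; // at least one byte is needed to inspect the lead byte

    unsigned char c = static_cast<unsigned char>(*str);
    size_t total;
    if (c < 0x80)
        total = 1; // single-byte (ASCII) character
    else if ((c & 0xE0) == 0xC0)
        total = 2;
    else if ((c & 0xF0) == 0xE0)
        total = 3;
    else if ((c & 0xF8) == 0xF0)
        total = 4;
    else
        return 0; // stray continuation byte or invalid lead byte

    // Every continuation byte that is already available must look like 10xxxxxx.
    size_t avail = valid_len < total ? valid_len : total;
    for (size_t i = 1; i < avail; ++i) {
        if ((static_cast<unsigned char>(str[i]) & 0xC0) != 0x80)
            return 0;
    }

    if (valid_len < total)
        return -static_cast<qore_offset_t>(total - valid_len);
    return static_cast<qore_offset_t>(total);
}
```

Under this sketch, a caller that receives a negative result would obtain that many more bytes and call the routine again, while a result of 0 signals that the data cannot be decoded in this encoding.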