Poco::TextEncoding Class Reference

#include <TextEncoding.h>

Public Types

enum  { MAX_SEQUENCE_LENGTH = 6 }
typedef int CharacterMap [256]
typedef SharedPtr< TextEncoding > Ptr

Public Member Functions

virtual const char * canonicalName () const =0
virtual const CharacterMap & characterMap () const =0
virtual int convert (const unsigned char *bytes) const
virtual int convert (int ch, unsigned char *bytes, int length) const
virtual bool isA (const std::string &encodingName) const =0
virtual ~TextEncoding ()
 Destroys the encoding.

Static Public Member Functions

static void add (TextEncoding::Ptr encoding)
static void add (TextEncoding::Ptr encoding, const std::string &name)
static TextEncoding & byName (const std::string &encodingName)
static TextEncoding::Ptr find (const std::string &encodingName)
static TextEncoding::Ptr global (TextEncoding::Ptr encoding)
static TextEncoding & global ()
 Returns the current global TextEncoding object.
static void remove (const std::string &encodingName)

Static Public Attributes

static const std::string GLOBAL
 Name of the global TextEncoding, which is the empty string.

Static Protected Member Functions

static TextEncodingManager & manager ()

Detailed Description

An abstract base class for implementing text encodings like UTF-8 or ISO 8859-1.

Subclasses must override the canonicalName(), isA(), characterMap() and convert() methods and need to be thread safe and stateless.

TextEncoding also provides static member functions for managing mappings from encoding names to TextEncoding objects.

Definition at line 53 of file TextEncoding.h.


Member Typedef Documentation

typedef int Poco::TextEncoding::CharacterMap[256]

The map[b] member gives information about byte sequences whose first byte is b.

If map[b] is c, where c >= 0, then b by itself encodes the Unicode scalar value c.
If map[b] is -1, then the byte sequence is malformed.
If map[b] is -n, where n >= 2, then b is the first byte of an n-byte sequence that encodes a single Unicode scalar value.

Byte sequences up to 6 bytes in length are supported.

Definition at line 73 of file TextEncoding.h.

typedef SharedPtr<TextEncoding> Poco::TextEncoding::Ptr

Definition at line 66 of file TextEncoding.h.


Member Enumeration Documentation

anonymous enum
Enumerator:
MAX_SEQUENCE_LENGTH 

Definition at line 68 of file TextEncoding.h.


Constructor & Destructor Documentation

Poco::TextEncoding::~TextEncoding ( ) [virtual]

Destroys the encoding.

Definition at line 141 of file TextEncoding.cpp.


Member Function Documentation

void Poco::TextEncoding::add ( TextEncoding::Ptr  encoding) [static]

Adds the given TextEncoding to the table of text encodings, under the encoding's canonical name.

If an encoding with the given name is already registered, it is replaced.

Definition at line 174 of file TextEncoding.cpp.

void Poco::TextEncoding::add ( TextEncoding::Ptr  encoding,
const std::string &  name 
) [static]

Adds the given TextEncoding to the table of text encodings, under the given name.

If an encoding with the given name is already registered, it is replaced.

Definition at line 180 of file TextEncoding.cpp.

TextEncoding & Poco::TextEncoding::byName ( const std::string &  encodingName) [static]

Returns the TextEncoding object for the given encoding name.

Throws a NotFoundException if the encoding with the given name is not available.

Definition at line 158 of file TextEncoding.cpp.

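virtual const char* Poco::TextEncoding::canonicalName ( ) const [pure virtual]

Returns the canonical name of this encoding, e.g. "ISO-8859-1". Encoding name comparisons are case insensitive.

Implemented in Poco::UTF16Encoding, Poco::Latin9Encoding, Poco::ASCIIEncoding, Poco::Latin1Encoding, Poco::UTF8Encoding, and Poco::Windows1252Encoding.

virtual const CharacterMap& Poco::TextEncoding::characterMap ( ) const [pure virtual]

Returns the CharacterMap for the encoding. The CharacterMap should be kept in a static member. Because characterMap() can be called frequently, it should simply return that static map. If the map is built at runtime, this should be done in the constructor.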

Implemented in Poco::UTF16Encoding, Poco::Latin9Encoding, Poco::ASCIIEncoding, Poco::Latin1Encoding, Poco::UTF8Encoding, and Poco::Windows1252Encoding.
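
The first-byte convention behind CharacterMap (an entry >= 0 is a single-byte character, -1 is malformed, -n marks the first byte of an n-byte sequence) can be illustrated with a small standalone sketch. It does not use Poco; fillUtf8LikeMap and sequenceLength are hypothetical helpers introduced only for illustration, and the map is limited to sequences of at most 4 bytes for brevity.

```cpp
// Same shape as the CharacterMap typedef: one int per possible first byte.
typedef int CharacterMap[256];

// Hypothetical helper (not part of Poco): fills a map for a UTF-8-like
// encoding, limited to sequences of at most 4 bytes.
void fillUtf8LikeMap(CharacterMap& map)
{
    for (int b = 0x00; b <= 0x7F; ++b) map[b] = b;   // b alone encodes scalar value b
    for (int b = 0x80; b <= 0xBF; ++b) map[b] = -1;  // continuation byte cannot start a sequence
    for (int b = 0xC0; b <= 0xDF; ++b) map[b] = -2;  // first byte of a 2-byte sequence
    for (int b = 0xE0; b <= 0xEF; ++b) map[b] = -3;  // first byte of a 3-byte sequence
    for (int b = 0xF0; b <= 0xF7; ++b) map[b] = -4;  // first byte of a 4-byte sequence
    for (int b = 0xF8; b <= 0xFF; ++b) map[b] = -1;  // treated as malformed in this sketch
}

// Hypothetical helper: number of bytes in the sequence that starts with
// firstByte, or 0 if the map marks that byte as malformed.
int sequenceLength(const CharacterMap& map, unsigned char firstByte)
{
    int entry = map[firstByte];
    if (entry >= 0) return 1;    // single byte; entry is the scalar value itself
    if (entry == -1) return 0;   // malformed first byte
    return -entry;               // entry == -n with n >= 2
}
```

A real subclass would presumably fill such a table once, keep it in a static member, and return it from characterMap().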

int Poco::TextEncoding::convert ( const unsigned char *  bytes) const [virtual]

The convert function is used to convert multibyte sequences; bytes will point to a byte sequence of n bytes where characterMap()[*bytes] == -n.

The convert function must return the Unicode scalar value represented by this byte sequence, or -1 if the byte sequence is malformed. The default implementation returns (int) bytes[0].

Reimplemented in Poco::UTF16Encoding, Poco::Latin9Encoding, Poco::ASCIIEncoding, Poco::Latin1Encoding, Poco::UTF8Encoding, and Poco::Windows1252Encoding.

Definition at line 146 of file TextEncoding.cpp.
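
As an illustration of what a convert(const unsigned char*) override is expected to do for a multibyte encoding, here is a minimal standalone sketch for UTF-8. It is not Poco's implementation: decodeUtf8Sequence is a hypothetical free function, and it takes the sequence length n as an explicit parameter, whereas a real override would derive n from the character map entry for bytes[0]. It does not reject overlong encodings.

```cpp
// Hypothetical free function (not Poco's implementation): decodes one
// UTF-8 multibyte sequence of n bytes (2 <= n <= 4).
// Returns the Unicode scalar value, or -1 if the sequence is malformed.
int decodeUtf8Sequence(const unsigned char* bytes, int n)
{
    if (bytes == 0 || n < 2 || n > 4) return -1;

    // Payload bits of the lead byte: 5 for n == 2, 4 for n == 3, 3 for n == 4.
    int value = bytes[0] & (0xFF >> (n + 1));

    for (int i = 1; i < n; ++i)
    {
        // Every continuation byte must have the form 10xxxxxx.
        if ((bytes[i] & 0xC0) != 0x80) return -1;
        value = (value << 6) | (bytes[i] & 0x3F);
    }
    return value;
}
```

For example, the two bytes 0xC3 0xA9 decode to 0xE9, and a lead byte followed by a non-continuation byte yields -1.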

int Poco::TextEncoding::convert ( int  ch,
unsigned char *  bytes,
int  length 
) const [virtual]

Transforms the Unicode character ch into the encoding's byte sequence and returns the number of bytes used. The method must not write more than length bytes. bytes and length can also be null (0); in this case only the number of bytes required to represent ch is returned. If the character cannot be converted, 0 is returned and the byte sequence remains unchanged. The default implementation simply returns 0.

Reimplemented in Poco::UTF16Encoding, Poco::Latin9Encoding, Poco::ASCIIEncoding, Poco::Latin1Encoding, Poco::UTF8Encoding, and Poco::Windows1252Encoding.

Definition at line 152 of file TextEncoding.cpp.
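
The contract of convert(int ch, unsigned char* bytes, int length) can be sketched with UTF-8 as the target encoding. This is a standalone illustration, not Poco's implementation: encodeUtf8 is a hypothetical free function. One assumption goes beyond the description: when the buffer is too small, the sketch writes nothing and still returns the required size.

```cpp
// Hypothetical free function (not Poco's implementation) that follows the
// convert(int ch, unsigned char* bytes, int length) contract for UTF-8.
int encodeUtf8(int ch, unsigned char* bytes, int length)
{
    // Surrogates and out-of-range values are not Unicode scalar values:
    // report "cannot be converted" and leave the buffer unchanged.
    if (ch < 0 || ch > 0x10FFFF) return 0;
    if (ch >= 0xD800 && ch <= 0xDFFF) return 0;

    int needed;
    if (ch <= 0x7F)        needed = 1;
    else if (ch <= 0x7FF)  needed = 2;
    else if (ch <= 0xFFFF) needed = 3;
    else                   needed = 4;

    // Size query (null buffer) or buffer too small: write nothing.
    // Returning the required size for a short buffer is an assumption of this sketch.
    if (bytes == 0 || length < needed) return needed;

    if (needed == 1)
    {
        bytes[0] = static_cast<unsigned char>(ch);
        return 1;
    }

    // Lead byte markers by sequence length: 110xxxxx, 1110xxxx, 11110xxx.
    static const unsigned char leadMark[5] = {0x00, 0x00, 0xC0, 0xE0, 0xF0};

    // Fill continuation bytes from the end, 6 payload bits each.
    for (int i = needed - 1; i > 0; --i)
    {
        bytes[i] = static_cast<unsigned char>(0x80 | (ch & 0x3F));
        ch >>= 6;
    }
    bytes[0] = static_cast<unsigned char>(leadMark[needed] | ch);
    return needed;
}
```

Calling encodeUtf8(0x20AC, 0, 0) only reports that 3 bytes are required; with a buffer of at least 3 bytes the same call writes 0xE2 0x82 0xAC.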

TextEncoding::Ptr Poco::TextEncoding::find ( const std::string &  encodingName) [static]

Returns a pointer to the TextEncoding object for the given encodingName, or NULL if no such TextEncoding object exists.

Definition at line 168 of file TextEncoding.cpp.

TextEncoding::Ptr Poco::TextEncoding::global ( TextEncoding::Ptr  encoding) [static]

Sets the global TextEncoding object.

This function sets the global encoding to the argument and returns the previous global encoding.

Definition at line 192 of file TextEncoding.cpp.

TextEncoding & Poco::TextEncoding::global ( ) [static]

Returns the current global TextEncoding object.

Definition at line 200 of file TextEncoding.cpp.

virtual bool Poco::TextEncoding::isA ( const std::string &  encodingName) const [pure virtual]

Returns true if the given name is one of the names of this encoding. For example, the "ISO-8859-1" encoding is also known as "Latin-1".

Encoding name comparisons are case insensitive.

Implemented in Poco::UTF16Encoding, Poco::Latin9Encoding, Poco::ASCIIEncoding, Poco::Latin1Encoding, Poco::UTF8Encoding, and Poco::Windows1252Encoding.
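
A case-insensitive alias check of the kind an isA() override needs might look like the following standalone sketch. It does not use Poco; sameNameIgnoreCase, latin1Names and isLatin1Name are hypothetical names introduced for illustration, and the alias list contains only the two names mentioned in this documentation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: case-insensitive comparison of two ASCII names.
bool sameNameIgnoreCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        // Cast to unsigned char first: std::tolower is undefined for negative values.
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Hypothetical alias list; the first entry plays the role of the canonical name.
static const char* const latin1Names[] = { "ISO-8859-1", "Latin-1" };

// Sketch of what an isA() override might do for this encoding.
bool isLatin1Name(const std::string& encodingName)
{
    const std::size_t count = sizeof(latin1Names) / sizeof(latin1Names[0]);
    for (std::size_t i = 0; i < count; ++i)
    {
        if (sameNameIgnoreCase(encodingName, latin1Names[i])) return true;
    }
    return false;
}
```

With this sketch, "latin-1" and "iso-8859-1" are both accepted, while "UTF-8" is not.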

TextEncodingManager & Poco::TextEncoding::manager ( ) [static, protected]

Returns the TextEncodingManager.

Definition at line 206 of file TextEncoding.cpp.

void Poco::TextEncoding::remove ( const std::string &  encodingName) [static]

Removes the encoding with the given name from the table of text encodings.

Definition at line 186 of file TextEncoding.cpp.


Member Data Documentation

const std::string Poco::TextEncoding::GLOBAL [static]

Name of the global TextEncoding, which is the empty string.

Definition at line 159 of file TextEncoding.h.


The documentation for this class was generated from the following files:

TextEncoding.h
TextEncoding.cpp

pluginlib
Author(s): Tully Foote and Eitan Marder-Eppstein
autogenerated on Sat Dec 28 2013 17:20:20