lexical analysis
Public Types | |
enum | token_type { uninitialized, literal_true, literal_false, literal_null, value_string, value_number, begin_array, begin_object, end_array, end_object, name_separator, value_separator, parse_error, end_of_input } |
token types for the parser | |
Public Member Functions | |
void | get_number (basic_json &result) const |
return number value for number tokens | |
string_t | get_string () const |
return string value for string tokens | |
string_t | get_token () const |
return string representation of last read token | |
lexer (const string_t &s) noexcept | |
constructor with a given buffer | |
lexer (std::istream *s) noexcept | |
constructor with a given stream | |
lexer () | |
default constructor | |
lexer (const lexer &) | |
lexer | operator= (const lexer &) |
token_type | scan () noexcept |
long double | str_to_float_t (long double *, char **endptr) const |
parse floating point number | |
double | str_to_float_t (double *, char **endptr) const |
parse floating point number | |
float | str_to_float_t (float *, char **endptr) const |
parse floating point number | |
void | yyfill () noexcept |
append data from the stream to the internal buffer | |
Static Public Member Functions | |
static string_t | to_unicode (const std::size_t codepoint1, const std::size_t codepoint2=0) |
create a string from a Unicode code point | |
static std::string | token_type_name (token_type t) |
return name of values of type token_type (only used for errors) | |
Private Attributes | |
string_t | m_buffer |
the buffer | |
const lexer_char_t * | m_content = nullptr |
the buffer pointer | |
const lexer_char_t * | m_cursor = nullptr |
pointer to the current symbol | |
const lexer_char_t * | m_limit = nullptr |
pointer to the end of the buffer | |
const lexer_char_t * | m_marker = nullptr |
pointer for backtracking information | |
const lexer_char_t * | m_start = nullptr |
pointer to the beginning of the current symbol | |
std::istream * | m_stream = nullptr |
optional input stream |
lexical analysis
This class organizes the lexical analysis during JSON deserialization. The core of it is a scanner generated by [re2c](http://re2c.org) that processes a buffer and recognizes tokens according to RFC 7159.
enum nlohmann::basic_json::lexer::token_type |
token types for the parser
uninitialized |
indicating the scanner is uninitialized |
literal_true |
the `true` literal |
literal_false |
the `false` literal |
literal_null |
the `null` literal |
value_string |
a string -- use get_string() for actual value |
value_number |
a number -- use get_number() for actual value |
begin_array |
the character for array begin `[` |
begin_object |
the character for object begin `{` |
end_array |
the character for array end `]` |
end_object |
the character for object end `}` |
name_separator |
the name separator `:` |
value_separator |
the value separator `,` |
parse_error |
indicating a parse error |
end_of_input |
indicating the end of the input buffer |
nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::lexer | ( | const string_t & | s | ) | [inline, explicit] |
nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::lexer | ( | std::istream * | s | ) | [inline, explicit] |
nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::lexer | ( | ) |
default constructor
nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::lexer | ( | const lexer & | ) |
void nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::get_number | ( | basic_json & | result | ) | const [inline] |
return number value for number tokens
This function translates the last token into the most appropriate number type (either integer, unsigned integer or floating point), which is passed back to the caller via the result parameter.
This function parses the integer component up to the radix point or exponent while collecting information about the 'floating point representation', which it stores in the result parameter. If there is no radix point or exponent, and the number can fit into a number_integer_t or number_unsigned_t then it sets the result parameter accordingly.
If the number is a floating point number, it is then parsed using std::strtod (or std::strtof or std::strtold).
[out] | result | basic_json object to receive the number, or NAN if the conversion read past the current token. The latter case must be handled by the caller. |
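The decision described above (integer, unsigned integer, or floating point) can be illustrated with a minimal sketch. It is not the library's implementation: `number_kind`, `parsed_number` and `classify_number` are hypothetical names, the token is assumed to already be a valid JSON number, and `std::strtoll`/`std::strtoull` stand in for the digit-by-digit accumulation that the text describes.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical result type; the real function writes into a basic_json.
enum class number_kind { integer, unsigned_integer, floating_point };

struct parsed_number {
    number_kind kind;
    long long i;            // valid if kind == integer
    unsigned long long u;   // valid if kind == unsigned_integer
    double f;               // valid if kind == floating_point
};

// Assumes `token` is a syntactically valid JSON number.
parsed_number classify_number(const std::string& token) {
    parsed_number r{number_kind::floating_point, 0, 0, 0.0};

    // A radix point or exponent forces the floating point path.
    const bool has_fraction_or_exponent =
        token.find_first_of(".eE") != std::string::npos;

    if (!has_fraction_or_exponent) {
        errno = 0;
        if (!token.empty() && token[0] == '-') {
            const long long v = std::strtoll(token.c_str(), nullptr, 10);
            if (errno != ERANGE) {
                r.kind = number_kind::integer;
                r.i = v;
                return r;
            }
        } else {
            const unsigned long long v = std::strtoull(token.c_str(), nullptr, 10);
            if (errno != ERANGE) {
                r.kind = number_kind::unsigned_integer;
                r.u = v;
                return r;
            }
        }
    }

    // Either a float by syntax, or an integer too large for the integer types.
    r.f = std::strtod(token.c_str(), nullptr);
    return r;
}
```

In this sketch, an integer that does not fit the chosen integer type falls through to the floating point path instead of being reported as an error.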
string_t nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::get_string | ( | ) | const [inline] |
return string value for string tokens
The function iterates the characters between the opening and closing quotes of the string value. The complete string is the range [m_start,m_cursor). Consequently, we iterate from m_start+1 to m_cursor-1.
We differentiate two cases:
1. Escaped characters. In this case, a new character is constructed according to the nature of the escape. Some escapes create new characters (e.g., `"\\n"` is replaced by `"\n"`), some are copied as is (e.g., `"\\\\"`). Furthermore, Unicode escapes of the shape `"\\uxxxx"` need special care. In this case, to_unicode takes care of the construction of the values.
2. Unescaped characters are copied as is.
std::out_of_range | if to_unicode fails |
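The two cases can be sketched as a loop over the characters between the quotes. This is an illustration only: `unescape_token` is a hypothetical free function, it takes the whole token (including both quotes) as a `std::string` instead of reading `m_start` and `m_cursor`, and it leaves out `\uxxxx` escapes, rejecting them like any other unsupported escape.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: `token` is the complete string token, quotes included.
std::string unescape_token(const std::string& token) {
    if (token.size() < 2) {
        throw std::invalid_argument("token must include both quotes");
    }

    std::string result;
    result.reserve(token.size() - 2);

    // Iterate from the character after the opening quote
    // to the character before the closing quote.
    for (std::size_t i = 1; i + 1 < token.size(); ++i) {
        const char c = token[i];
        if (c != '\\') {
            result += c;  // case 2: unescaped characters are copied as is
            continue;
        }

        // case 1: escaped character; look at the character after the backslash
        ++i;
        if (i + 1 >= token.size()) {
            throw std::invalid_argument("dangling escape");
        }
        switch (token[i]) {
            case '"':  result += '"';  break;
            case '\\': result += '\\'; break;
            case '/':  result += '/';  break;
            case 'b':  result += '\b'; break;
            case 'f':  result += '\f'; break;
            case 'n':  result += '\n'; break;
            case 'r':  result += '\r'; break;
            case 't':  result += '\t'; break;
            default:
                // \uxxxx is deliberately not handled in this sketch
                throw std::invalid_argument("unsupported escape");
        }
    }
    return result;
}
```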
string_t nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::get_token | ( | ) | const [inline] |
lexer nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::operator= | ( | const lexer & | ) |
token_type nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::scan | ( | ) | [inline] |
This function implements a scanner for JSON. It is specified using regular expressions that try to follow RFC 7159 as closely as possible. These regular expressions are then translated into a minimized deterministic finite automaton (DFA) by the tool [re2c](http://re2c.org). As a result, the translated code for this function consists of a large block of code with `goto` jumps.
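To give a feel for what the generated automaton does, here is a small hand-written scanner. It is a sketch under heavy assumptions, not the re2c output: `mini_scanner` is a hypothetical class, its `token_type` is a reduced copy of the enumeration, and it recognizes only structural characters and the three literals (no strings or numbers).

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Reduced, hypothetical copy of lexer::token_type.
enum class token_type {
    literal_true, literal_false, literal_null,
    begin_array, begin_object, end_array, end_object,
    name_separator, value_separator, parse_error, end_of_input
};

// The scanner keeps pointers into `s`, so `s` must outlive the scanner.
class mini_scanner {
public:
    explicit mini_scanner(const std::string& s) noexcept
        : m_start(s.data()), m_cursor(s.data()), m_limit(s.data() + s.size()) {}

    token_type scan() noexcept {
        // skip insignificant whitespace: space, tab, line feed, carriage return
        while (m_cursor != m_limit &&
               (*m_cursor == ' ' || *m_cursor == '\t' ||
                *m_cursor == '\n' || *m_cursor == '\r')) {
            ++m_cursor;
        }
        m_start = m_cursor;  // beginning of the current token
        if (m_cursor == m_limit) {
            return token_type::end_of_input;
        }
        switch (*m_cursor++) {
            case '[': return token_type::begin_array;
            case ']': return token_type::end_array;
            case '{': return token_type::begin_object;
            case '}': return token_type::end_object;
            case ':': return token_type::name_separator;
            case ',': return token_type::value_separator;
            case 't': return match_rest("rue", token_type::literal_true);
            case 'f': return match_rest("alse", token_type::literal_false);
            case 'n': return match_rest("ull", token_type::literal_null);
            default:  return token_type::parse_error;
        }
    }

    // text of the last token, i.e. the range [m_start, m_cursor)
    std::string get_token() const { return std::string(m_start, m_cursor); }

private:
    // consume `rest` if the input continues with it
    token_type match_rest(const char* rest, token_type t) noexcept {
        const std::size_t n = std::strlen(rest);
        if (static_cast<std::size_t>(m_limit - m_cursor) >= n &&
            std::strncmp(m_cursor, rest, n) == 0) {
            m_cursor += n;
            return t;
        }
        return token_type::parse_error;
    }

    const char* m_start;
    const char* m_cursor;
    const char* m_limit;
};
```

Calling `scan()` repeatedly on the input `[true, null]` yields `begin_array`, `literal_true`, `value_separator`, `literal_null`, `end_array` and finally `end_of_input`.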
long double nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::str_to_float_t | ( | long double * | , |
char ** | endptr | ||
) | const [inline] |
parse floating point number
This function (and its overloads) selects the most appropriate standard floating point number parsing function based on the type supplied via the first parameter. Pass static_cast<number_float_t*>(nullptr) as that argument.
[in] | type | the number_float_t in use |
[in,out] | endptr | receives a pointer to the first character after the number |
double nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::str_to_float_t | ( | double * | , |
char ** | endptr | ||
) | const [inline] |
parse floating point number
This function (and its overloads) selects the most appropriate standard floating point number parsing function based on the type supplied via the first parameter. Pass static_cast<number_float_t*>(nullptr) as that argument.
[in] | type | the number_float_t in use |
[in,out] | endptr | receives a pointer to the first character after the number |
float nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::str_to_float_t | ( | float * | , |
char ** | endptr | ||
) | const [inline] |
parse floating point number
This function (and its overloads) selects the most appropriate standard floating point number parsing function based on the type supplied via the first parameter. Pass static_cast<number_float_t*>(nullptr) as that argument.
[in] | type | the number_float_t in use |
[in,out] | endptr | receives a pointer to the first character after the number |
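The three overloads use a typed null pointer purely as a tag for overload resolution. The sketch below shows the pattern with free functions. The names `str_to_float` and `parse_float` are hypothetical, and the extra `const char*` parameter is an assumption made so the example is self-contained (the member functions read from the lexer's own buffer instead).

```cpp
#include <cstdlib>

// One overload per floating point type; the unnamed pointer parameter is
// never dereferenced, it only selects the overload.
inline long double str_to_float(const char* s, long double*, char** endptr) {
    return std::strtold(s, endptr);
}

inline double str_to_float(const char* s, double*, char** endptr) {
    return std::strtod(s, endptr);
}

inline float str_to_float(const char* s, float*, char** endptr) {
    return std::strtof(s, endptr);
}

// Hypothetical convenience wrapper: FloatT plays the role of number_float_t.
template <typename FloatT>
FloatT parse_float(const char* s, char** endptr) {
    return str_to_float(s, static_cast<FloatT*>(nullptr), endptr);
}
```

With this in place, `parse_float<float>("0.25", &end)` dispatches to `std::strtof`, while `parse_float<long double>` dispatches to `std::strtold`, without any runtime branching on the type.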
static string_t nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::to_unicode | ( | const std::size_t | codepoint1, |
const std::size_t | codepoint2 = 0 |
||
) | [inline, static] |
create a string from a Unicode code point
[in] | codepoint1 | the code point (can be high surrogate) |
[in] | codepoint2 | the code point (can be low surrogate or 0) |
std::out_of_range | if code point is > 0x10ffff; example: `"code points above 0x10FFFF are invalid"` |
std::invalid_argument | if the low surrogate is invalid; example: `"missing or wrong low surrogate"` |
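A minimal sketch of such a conversion follows, assuming the result is UTF-8 in a `std::string` (the real function returns `string_t`). The name `to_utf8` is hypothetical, and the surrogate arithmetic is the standard UTF-16 decoding formula, not code taken from the library. The exception types and messages mirror the ones listed above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for to_unicode that always produces UTF-8.
std::string to_utf8(const std::size_t codepoint1, const std::size_t codepoint2 = 0) {
    std::size_t cp = codepoint1;

    // A high surrogate must be combined with the following low surrogate.
    if (codepoint1 >= 0xD800 && codepoint1 <= 0xDBFF) {
        if (codepoint2 < 0xDC00 || codepoint2 > 0xDFFF) {
            throw std::invalid_argument("missing or wrong low surrogate");
        }
        cp = 0x10000 + ((codepoint1 - 0xD800) << 10) + (codepoint2 - 0xDC00);
    }
    // Note: a lone low surrogate in codepoint1 is not rejected by this sketch.

    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        throw std::out_of_range("code points above 0x10FFFF are invalid");
    }
    return out;
}
```

For example, the escape pair `\uD83D\uDE00` corresponds to `to_utf8(0xD83D, 0xDE00)`, which combines the surrogates into U+1F600 and emits four bytes.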
static std::string nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::token_type_name | ( | token_type | t | ) | [inline, static] |
void nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::yyfill | ( | ) | [inline] |
string_t nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::m_buffer [private] |
const lexer_char_t* nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::m_content = nullptr [private] |
const lexer_char_t* nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::m_cursor = nullptr [private] |
const lexer_char_t* nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::m_limit = nullptr [private] |
const lexer_char_t* nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::m_marker = nullptr [private] |
const lexer_char_t* nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::m_start = nullptr [private] |
std::istream* nlohmann::basic_json< ObjectType, ArrayType, StringType, BooleanType, NumberIntegerType, NumberUnsignedType, NumberFloatType, AllocatorType >::lexer::m_stream = nullptr [private] |