Commit 88de72c

Single-byte character set #95
1 parent 7f88c19 commit 88de72c

1 file changed: +23 -14 lines changed

v2-0-RC2/doc/02FieldEncoding.md

Lines changed: 23 additions & 14 deletions
@@ -502,7 +502,8 @@ discussion of enum fields.
 #### Range attributes for char fields
 
 Character fields are constrained to single-byte characters sets. The recommended encoding is ISO/IEC 8859-1:1998 Latin alphabet No. 1.
-However, other 8-bit encodings may be specified in a message schema.
+However, other 8-bit encodings may be specified in a message schema. The value of characterEncoding attribute should be a preferred
+character set name registered with Internet Assigned Numbers Authority (IANA).
 
 Latin alphabet No. 1 reserves two ranges for control codes defined by ISO/IEC 6429:1992 control character sets C0 and C1.
 
@@ -527,7 +528,7 @@ This is the standard encoding for char type. Note that the length attribute defa
 A character may be specified with a different 8-bit encoding.
 
 ```xml
-<type name="cyrillic" primitiveType="char" description="Latin/Cyrillic alphabet" characterEncoding="8859-5"/>
+<type name="cyrillic" primitiveType="char" description="Latin/Cyrillic alphabet" characterEncoding="ISO-8859-5"/>
 ```
 
 A field may be specified with a constant character value.
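
As a rough illustration of the single-byte constraint in this hunk, the sketch below encodes one character under a named character set and rejects anything that does not fit in one octet. It is written in Python purely for illustration. `encode_char` is a hypothetical helper, not part of SBE, and it assumes Python's codec registry accepts the name given in `characterEncoding` (it does for `ISO-8859-1` and `ISO-8859-5`).

```python
def encode_char(ch: str, character_encoding: str = "ISO-8859-1") -> bytes:
    """Encode one character as a single octet (hypothetical helper)."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # str.encode raises LookupError for an unknown character set name and
    # UnicodeEncodeError if the character is not in the set.
    octets = ch.encode(character_encoding)
    if len(octets) != 1:
        # A multi-byte character set cannot back a char field.
        raise ValueError(f"{character_encoding} is not single-byte for {ch!r}")
    return octets


print(encode_char("Ж", "ISO-8859-5").hex())  # b6
print(encode_char("é").hex())                # e9
```

A multi-byte character set such as UTF-8 fails the one-octet check for any non-ASCII character, which is why such data belongs in a variable-length field instead.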
@@ -562,9 +563,11 @@ primitiveType="char" and a length attribute is required.
 Range attributes minValue and maxValue do not apply to fixed-length
 character arrays.
 
-US-ASCII is the default encoding of character arrays to conform to usual
-FIX values. The characterEncoding attribute may be specified to override
-encoding.
+Character arrays are constrained to single-byte characters sets with the same character ranges as a single-character field. The recommended encoding is ISO/IEC 8859-1:1998 Latin alphabet No. 1.
+
+Other 8-bit encodings may be specified in a message schema with the characterEncoding attribute. The value of characterEncoding should be a preferred
+character set name registered with IANA.
+
 
 #### Examples of fixed-length character arrays
 
@@ -576,6 +579,12 @@ A typical string encoding specification
 <field type="string6" name="Symbol" id="55" semanticType="String"/>
 ```
 
+A character array with an explicit character set, Latin alphabet No. 1.
+
+```xml
+<type name="string6" primitiveType="char" characterEncoding="ISO-8859-1"/>
+```
+
 Wire format of a character array in character and hexadecimal formats
 
 M S F T
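
The wire format mentioned at the end of this hunk can be sketched as follows. This is a minimal Python illustration under stated assumptions: `encode_char_array` is a hypothetical helper, the array length of 6 is inferred from the type name `string6`, and filling the unused trailing octets with NUL (0x00) is an assumption that the excerpt itself does not show.

```python
def encode_char_array(text: str, length: int,
                      character_encoding: str = "ISO-8859-1") -> bytes:
    """Encode text into a fixed-length character array (hypothetical helper)."""
    octets = text.encode(character_encoding)
    if len(octets) > length:
        raise ValueError(f"{text!r} needs {len(octets)} octets, field holds {length}")
    # Assumption: unused trailing octets are filled with NUL (0x00).
    return octets.ljust(length, b"\x00")


wire = encode_char_array("MSFT", 6)
print(wire.hex())  # 4d5346540000
```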
@@ -592,9 +601,9 @@ A character array constant specification. As for a non-constant value, if the co
 
 ### Variable-length string encoding
 
-Variable-length string encoding is used for variable length ASCII
-strings or embedded non-ASCII character data (like EncodedText field). A
-length member conveys the size of the string that follows.
+Variable-length string encoding is used for variable length
+strings or character data with a multi-byte character set (like FIX EncodedText field). A
+length member conveys the size of the string that follows in octets, which may be different than the number of characters.
 
 On the wire, length immediately precedes the data.
 
@@ -614,10 +623,10 @@ for length.
 
 ### Range attributes for string Length
 
-| Schema attribute | length uint8 | length uint16 | data |
-|------------------|-------:|-------:|------|
-| minValue | 0 | 0 | N/A |
-| maxValue | 254 | 65534 | N/A |
+| Schema attribute | length uint8 | length uint16 |
+|------------------|----------------:|----------------:|
+| minValue | 0 | 0 |
+| maxValue | 254 | 65534 |
 
 If the Length element has minValue and maxValue attributes, it specifies
 the minimum and maximum *length* of the variable-length data.
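
To make the octets-versus-characters distinction concrete, here is a small Python sketch of a length-prefixed string. `encode_var_string` is a hypothetical helper. The maximum lengths come from the table in this hunk, while the little-endian byte order of the length member and the choice of UTF-8 as the multi-byte character set are assumptions made only for this example.

```python
import struct

# Largest length each length type may carry, taken from the table above.
MAX_LENGTH = {"uint8": 254, "uint16": 65534}
# Assumption: the length member is encoded little-endian.
LENGTH_FORMAT = {"uint8": "<B", "uint16": "<H"}


def encode_var_string(text: str, length_type: str = "uint8",
                      character_encoding: str = "UTF-8") -> bytes:
    """Return the length member followed by the encoded data (hypothetical helper)."""
    data = text.encode(character_encoding)
    # The length counts octets, not characters.
    if len(data) > MAX_LENGTH[length_type]:
        raise ValueError(f"{len(data)} octets exceed maxValue for {length_type}")
    return struct.pack(LENGTH_FORMAT[length_type], len(data)) + data


encoded = encode_var_string("Zürich")
print(len("Zürich"), encoded[0])  # 6 7
```

The six-character string occupies seven octets because `ü` takes two octets in UTF-8, so the length member carries 7.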
@@ -1091,7 +1100,7 @@ allow more choices.
 ### Value encoding
 
 If a field is of FIX data type char, then its valid values are
-restricted to US-ASCII printable characters. See [Character encoding](#character) above.
+restricted to single-byte printable characters. See [Character encoding](#character) above.
 
 If the field is of FIX data type int, then a primitive integer data type
 should be selected that can contain the number of choices. For most
@@ -1320,7 +1329,7 @@ session protocol.
 | Field value less than minValue | The encoded value falls below the specified valid range. |
 | Field value greater than maxValue | The encoded value exceeds the specified valid range. |
 | Null value set for required field | The null value of a data type is invalid for a required field. |
-| String contains invalid characters | A String contains non-US-ASCII printable characters or other invalid sequence if a different characterEncoding is specified. |
+| String contains invalid characters | A character or character array contains controls characters or a string contains an invalid sequence if a different characterEncoding is specified. |
 | Required members not populated in MonthYear | Year and month must be populated with non-null values, and the month must be in the range 1-12. |
 | UTCTimeOnly exceeds day range | The value must not exceed the number of time units in a day, e.g. greater than 86400 seconds. |
 | TZTimestamp and TZTimeOnly has missing or invalid time zone | The time zone hour and minute offset members must correspond to an actual time zone recognized by international standards. |
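
The revised "String contains invalid characters" check could look roughly like the Python sketch below. `find_control_octets` is a hypothetical helper. It assumes an ISO 8859 family character set, where the C0 and C1 control ranges occupy 0x00-0x1F and 0x80-0x9F, and it does not address how padding octets in a fixed-length array or DEL (0x7F) should be treated.

```python
def find_control_octets(data: bytes) -> list[int]:
    """Return offsets of octets in C0 (0x00-0x1F) or C1 (0x80-0x9F)."""
    return [i for i, octet in enumerate(data)
            if octet <= 0x1F or 0x80 <= octet <= 0x9F]


print(find_control_octets(b"MSFT"))          # []
print(find_control_octets(b"MS\x07FT\x85"))  # [2, 5]
```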
