Commit a930f7e

Merge branch 'master' into issue94
2 parents 4defaff + 62fd24f commit a930f7e

5 files changed

+660
-36
lines changed

v2-0-RC2/doc/01Introduction.md

Lines changed: 14 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -63,6 +63,9 @@ may be customized as needed by agreement between counterparties.
6363
Glossary
6464
------------------------------------------------------------------------------------------------------
6565

66+
67+
**Character set** - A mapping between a sequence of octets and a sequence of characters.
68+
6669
**Data type** - A field type with its associated encoding attributes,
6770
including backing primitive types and valid values or range. Some types
6871
have additional attributes, e.g. epoch of a date.
@@ -149,13 +152,11 @@ References
149152

150153
### Related FIX Standards
151154

152-
*Simple Open Framing Header*, FIX Protocol, Limited. Version 1.0 Draft Standard
153-
specification has been published at
154-
<http://www.fixtradingcommunity.org/>
155+
[Simple Open Framing Header](https://www.fixtrading.org/packages/fix-simple-open-framing-header-draft-standard-1-0)
156+
FIX Protocol, Limited. Version 1.0 Draft Standard
155157

156-
For FIX semantics, see the current FIX message specification, which is
157-
currently [FIX 5.0 Service Pack 2](http://www.fixtradingcommunity.org/pg/structure/tech-specs/fix-version/50-service-pack-2)
158-
with Extension Packs.
158+
[FIX 5.0 Service Pack 2](https://www.fixtrading.org/standards/fix-5-0-sp-2/)
159+
FIX semantics with Extension Packs.
159160

160161
### Dependencies on other standards
161162

@@ -166,6 +167,9 @@ normative for SBE.
166167
[IEEE 754-2008](http://ieeexplore.ieee.org/servlet/opac?punumber=4610933) A
167168
Standard for Binary Floating-Point Arithmetic
168169

170+
[IETF RFC 2978](https://tools.ietf.org/html/rfc2978)
171+
IANA Charset Registration Procedures. See [Character Sets](https://www.iana.org/assignments/character-sets/character-sets.xml)
172+
169173
[ISO 639-1:2002](http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=22109)
170174
Codes for the representation of names of languages - Part 1: Alpha-2
171175
code
@@ -181,10 +185,14 @@ Codes for the representation of currencies and funds
181185
Data elements and interchange formats - Information interchange -
182186
Representation of dates and times
183187

188+
[ISO/IEC 8859-1:1998](https://www.iso.org/standard/28245.html)
189+
8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1
190+
184191
[ISO 10383:2012](http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=61067)
185192
Securities and related financial instruments - Codes for exchanges and
186193
market identification (MIC)
187194

188195
*W3C XML Schema version 1.0* [Part 1](https://www.w3.org/TR/xmlschema-1/) [Part 2](https://www.w3.org/TR/xmlschema-2/)
189196

190197
[W3C XML Inclusions (XInclude) Version 1.0](https://www.w3.org/TR/xinclude/)
198+

v2-0-RC2/doc/02FieldEncoding.md

Lines changed: 47 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -98,10 +98,10 @@ See Common field schema attributes below.
9898
| PriceOffset | Decimal encoding | [Decimal encoding](#decimal-encoding) | A decimal number representing a price offset, which can be mathematically added to a Price. |
9999
| Amt | Decimal encoding | [Decimal encoding](#decimal-encoding) | A field typically representing a Price times a Qty. |
100100
| Percentage | Decimal encoding | [Decimal encoding](#decimal-encoding) | A field representing a percentage (e.g. 0.05 represents 5% and 0.9525 represents 95.25%). |
101-
| char | Character | [Character encoding](#character) | Single US-ASCII character value. Can include any alphanumeric character or punctuation. All char fields are case sensitive (i.e. m != M). |
102-
| String | Fixed-length character array | [Fixed-length character](#fixed-length-character-array) | A fixed-length character array of ASCII encoding |
103-
| String | Variable-length data encoding | [Variable-length string](#variable-length-string-encoding) | Alpha-numeric free format strings can include any character or punctuation. All String fields are case sensitive (i.e. morstatt != Morstatt). ASCII encoding. |
104-
| String—EncodedText | String encoding | [Variable-length string](#variable-length-string-encoding) | Non-ASCII string. The character encoding may be specified by a schema attribute. |
101+
| char | Character | [Character encoding](#character) | Single-byte character value. Can include any alphanumeric character or punctuation. All char fields are case sensitive (i.e. m != M). |
102+
| String | Fixed-length character array | [Fixed-length character](#fixed-length-character-array) | A fixed-length character array of single-byte encoding |
103+
| String | Variable-length data encoding | [Variable-length string](#variable-length-string-encoding) | Alpha-numeric free format strings can include any character or punctuation. All String fields are case sensitive (i.e. morstatt != Morstatt). |
104+
| String—EncodedText | String encoding | [Variable-length string](#variable-length-string-encoding) | A string. The character encoding may be specified by a schema attribute. |
105105
| XMLData | String encoding | [Variable-length string](#variable-length-string-encoding) | Variable-length XML. Must be paired with a Length field. |
106106
| data | Fixed-length data | [Fixed-length data](#fixed-length-data) | Fixed-length non-character data |
107107
| data | Variable-length data encoding | [Variable-length data](#variable-length-data-encoding) | Variable-length data. Must be paired with a Length field. |
@@ -487,30 +487,36 @@ Character data may either be of fixed size or variable size. In Simple
487487
Binary Encoding, fixed-length fields are recommended in order to support
488488
direct access to data. Variable-length encoding should be reserved for
489489
character strings that cannot be constrained to a specific size. It may
490-
also be used for non-ASCII encoded strings.
490+
also be used for multi-byte encodings.
491491

492492
### Character
493493

494-
Character fields hold a single character. They are most commonly used
495-
for field with character code enumerations. See [Enumeration encoding](#enumeration-encoding) below for
494+
Character fields hold a single character of a single-byte character set. They are most commonly used
495+
for fields with character code enumerations. See [Enumeration encoding](#enumeration-encoding) below for
496496
discussion of enum fields.
497497

498-
| FIX data type | Description | Backing primitive | Length (octet) |
499-
|---------------|-----------------------------|-------------------|---------------:|
500-
| char | A single US-ASCII character | char | 1 |
498+
| FIX data type | Description | Backing primitive | Length (octet) |
499+
|---------------|--------------------|-------------------|---------------:|
500+
| char | A single character | char | 1 |
501501

502502
#### Range attributes for char fields
503503

504-
Valid values of a char field are printable characters of the US-ASCII
505-
character set (codes 20 to 7E hex.) The implicit nullValue is the NUL
506-
control character (code 0).
504+
Character fields are constrained to single-byte character sets. The recommended encoding is ISO/IEC 8859-1:1998 Latin alphabet No. 1.
505+
However, other 8-bit encodings may be specified in a message schema. The value of the characterEncoding attribute should be a preferred
506+
character set name registered with the Internet Assigned Numbers Authority (IANA).
507507

508-
Schema attribute | char |
508+
Latin alphabet No. 1 reserves two ranges for control codes defined by ISO/IEC 6429:1992 control character sets C0 and C1.
509+
510+
The implicit nullValue is the NUL control character (code 0).
511+
512+
| Schema attribute | char |
509513
|------------------|--------|
510514
| minValue | hex 20 |
511-
| maxValue | hex 7e |
515+
| maxValue | hex ff |
512516
| nullValue | 0 |
513517

518+
The range hexadecimal 7f-9f is reserved for control character set C1.
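
The range attributes above can be illustrated with a minimal Python sketch. It is not part of the SBE specification; the names `classify_char` and `encode_char` are hypothetical, and the bounds are copied from the table (nullValue 0, minValue hex 20, maxValue hex ff, with hex 7f-9f treated as reserved).

```python
# Illustrative only: bounds taken from the char range table above.
NULL_VALUE = 0x00
MIN_VALUE = 0x20
MAX_VALUE = 0xFF
RESERVED = range(0x7F, 0xA0)  # hex 7f-9f, reserved for control codes


def classify_char(octet: int) -> str:
    """Classify one octet of a char field as 'null', 'control', or 'valid'."""
    if not 0 <= octet <= MAX_VALUE:
        raise ValueError("a char field occupies exactly one octet")
    if octet == NULL_VALUE:
        return "null"
    if octet < MIN_VALUE or octet in RESERVED:
        return "control"
    return "valid"


def encode_char(ch: str, encoding: str = "iso-8859-1") -> int:
    """Encode a single character with a single-byte character set.

    The encoding argument plays the role of the characterEncoding
    attribute; the default mirrors the recommended Latin alphabet No. 1.
    """
    data = ch.encode(encoding)  # raises UnicodeEncodeError if unmappable
    if len(data) != 1:
        raise ValueError("char fields hold exactly one single-byte character")
    return data[0]
```

For instance, `classify_char(encode_char("A"))` yields `"valid"`, while octet hex 85 falls in the reserved range and is reported as `"control"`.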
519+
514520
#### Encoding of char type
515521

516522
This is the standard encoding for char type. Note that the length attribute defaults to 1, producing a single character rather than a character array.
@@ -519,17 +525,21 @@ This is the standard encoding for char type. Note that the length attribute defa
519525
<type name="charType" primitiveType="char"/>
520526
```
521527

528+
A character may be specified with a different 8-bit encoding.
529+
530+
```xml
531+
<type name="cyrillic" primitiveType="char" description="Latin/Cyrillic alphabet" characterEncoding="ISO-8859-5"/>
532+
```
533+
522534
A field may be specified with a constant character value.
523535
```xml
524536
<field type="charType" name="OptAttribute" id="206" presence="constant">P</field>
525537
```
526538

527-
Wire format of char encoding of "A" (ASCII value 65, hexadecimal 41)
539+
Wire format of char encoding of "A" (value 65, hexadecimal 41)
528540

529541
`41`
530542

531-
532-
533543
### Fixed-length character array
534544

535545
Character arrays are allocated a fixed space in a message, supporting
@@ -551,9 +561,10 @@ primitiveType="char" and a length attribute is required.
551561
Range attributes minValue and maxValue do not apply to fixed-length
552562
character arrays.
553563

554-
US-ASCII is the default encoding of character arrays to conform to usual
555-
FIX values. The characterEncoding attribute may be specified to override
556-
encoding.
564+
Character arrays are constrained to single-byte character sets with the same character ranges as a single-character field. The recommended encoding is ISO/IEC 8859-1:1998 Latin alphabet No. 1.
565+
566+
Other 8-bit encodings may be specified in a message schema with the characterEncoding attribute. The value of characterEncoding should be a preferred
567+
character set name registered with IANA.
557568

558569
#### Examples of fixed-length character arrays
559570

@@ -565,6 +576,12 @@ A typical string encoding specification
565576
<field type="string6" name="Symbol" id="55" semanticType="String"/>
566577
```
567578

579+
A character array with an explicit character set, Latin alphabet No. 1.
580+
581+
```xml
582+
<type name="string6" primitiveType="char" length="6" characterEncoding="ISO-8859-1"/>
583+
```
584+
568585
Wire format of a character array in character and hexadecimal formats
569586

570587
M S F T
@@ -581,9 +598,9 @@ A character array constant specification. As for a non-constant value, if the co
581598

582599
### Variable-length string encoding
583600

584-
Variable-length string encoding is used for variable length ASCII
585-
strings or embedded non-ASCII character data (like EncodedText field). A
586-
length member conveys the size of the string that follows.
601+
Variable-length string encoding is used for variable length
602+
strings or character data with a multi-byte character set (like FIX EncodedText field). A
603+
length member conveys the size in octets of the string that follows, which may differ from the number of characters.
587604

588605
On the wire, length immediately precedes the data.
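
The difference between octet length and character count can be seen in a short Python sketch. This is an illustration rather than the normative wire format: it assumes a uint16 length member in little-endian byte order and UTF-8 as the multi-byte character set, and the function names are hypothetical.

```python
import struct

MAX_UINT16_LENGTH = 65534  # maxValue for a uint16 length, per the range table


def encode_var_string(text: str, encoding: str = "utf-8") -> bytes:
    """Prefix the encoded string with its length in octets (assumed uint16, little-endian)."""
    data = text.encode(encoding)
    if len(data) > MAX_UINT16_LENGTH:
        raise ValueError("encoded string exceeds the maximum length")
    return struct.pack("<H", len(data)) + data


def decode_var_string(buf: bytes, encoding: str = "utf-8") -> str:
    """Read the length member, then decode exactly that many octets."""
    (length,) = struct.unpack_from("<H", buf, 0)
    return buf[2:2 + length].decode(encoding)
```

Encoding `"Zürich"` this way produces a length member of 7 even though the string has 6 characters, because `ü` occupies two octets in UTF-8.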
589606

@@ -603,10 +620,10 @@ for length.
603620

604621
### Range attributes for string Length
605622

606-
| Schema attribute | length uint8 | length uint16 | data |
607-
|------------------|-------:|-------:|------|
608-
| minValue | 0 | 0 | N/A |
609-
| maxValue | 254 | 65534 | N/A |
623+
| Schema attribute | length uint8 | length uint16 |
624+
|------------------|----------------:|----------------:|
625+
| minValue | 0 | 0 |
626+
| maxValue | 254 | 65534 |
610627

611628
If the Length element has minValue and maxValue attributes, it specifies
612629
the minimum and maximum *length* of the variable-length data.
@@ -1080,7 +1097,7 @@ allow more choices.
10801097
### Value encoding
10811098

10821099
If a field is of FIX data type char, then its valid values are
1083-
restricted to US-ASCII printable characters. See [Character encoding](#character) above.
1100+
restricted to single-byte printable characters. See [Character encoding](#character) above.
10841101

10851102
If the field is of FIX data type int, then a primitive integer data type
10861103
should be selected that can contain the number of choices. For most
@@ -1309,7 +1326,7 @@ session protocol.
13091326
| Field value less than minValue | The encoded value falls below the specified valid range. |
13101327
| Field value greater than maxValue | The encoded value exceeds the specified valid range. |
13111328
| Null value set for required field | The null value of a data type is invalid for a required field. |
1312-
| String contains invalid characters | A String contains non-US-ASCII printable characters or other invalid sequence if a different characterEncoding is specified. |
1329+
| String contains invalid characters | A character or character array contains control characters, or a string contains an invalid sequence if a different characterEncoding is specified. |
13131330
| Required members not populated in MonthYear | Year and month must be populated with non-null values, and the month must be in the range 1-12. |
13141331
| UTCTimeOnly exceeds day range | The value must not exceed the number of time units in a day, e.g. greater than 86400 seconds. |
13151332
| TZTimestamp and TZTimeOnly has missing or invalid time zone | The time zone hour and minute offset members must correspond to an actual time zone recognized by international standards. |

v2-0-RC2/doc/04MessageSchema.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ However, since it is not supported by all XML processors, the SBE XSD is constra
1313
Certain elements of the SBE message schema support inclusion from a separate XML file. The result of the XInclude mechanism
1414
is a single XML infoset, so the schema description below applies whether a single file is used or multiple files are assembled.
1515

16+
1617
XML namespace
1718
-----------------------------------------------------------------------------------------------------------
1819

@@ -99,6 +100,7 @@ instances of `<types>`, if desired, to organize them by categories. Each `<types
99100
The `<types>` element has attribute `xml:base` to support inclusion from a separate XML file using the XInclude mechanism.
100101
Thus, common encoding types may be shared across multiple SBE message schemas.
101102

103+
102104
Within each set, an unbound number of encodings will be listed in any
103105
sequence:
104106
