Decompile Notes
I decided to decompile a message file for fun ... Yeah, down another rabbit hole.
The MSG file layout is as follows:
Main Header | Index | Country Info | Messages [ | Ext Header ]
The main header is defined in mkmsgf.h by the MSGHEADER structure, which has a packed size of 31 bytes.
// Header of message file
typedef struct _MSGHEADER
{
uint8_t magic_sig[8]; // Magic word signature
uint8_t identifier[3]; // Identifier (SYS, DOS, NET, etc.)
uint16_t numbermsg; // Number of messages
uint16_t firstmsg; // Number of the first message
int8_t offset16bit; // Index entry size: 0 = uint32_t offsets, 1 = uint16_t offsets
uint16_t version; // File version: 2 = new version, 0 = old version
uint16_t hdroffset; // Offset of index table (equals the size of _MSGHEADER)
uint16_t countryinfo; // pointer - Offset of country info block (cp)
uint32_t extenblock; // pointer to ext block - 0 if none
uint8_t reserved[5]; // Must be 0 (zero)
} MSGHEADER, *PMSGHEADER;
The signature is:
uint8_t signature[] = {0xFF, 0x4D, 0x4B, 0x4D, 0x53, 0x47, 0x46, 0x00};
or: 0xFF MKMSGF 0x00
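As a rough sketch of how the header could be checked and decoded from a raw byte buffer, the following reads each field at the offset implied by the packed 31-byte layout. The `msg_header_view` type and the `msg_parse_header`, `rd16`, and `rd32` names are hypothetical (they are not part of mkmsgf.h), and little-endian byte order is an assumption based on the format's x86 origin.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSGHEADER_SIZE 31 /* packed size of MSGHEADER quoted above */

/* Signature bytes as listed above: 0xFF "MKMSGF" 0x00 */
static const uint8_t MSG_SIGNATURE[8] = {0xFF, 0x4D, 0x4B, 0x4D, 0x53, 0x47, 0x46, 0x00};

/* Hypothetical decoded view of the header; not part of mkmsgf.h */
typedef struct {
    char     identifier[4]; /* 3 identifier bytes plus a terminating NUL */
    uint16_t numbermsg;
    uint16_t firstmsg;
    uint8_t  offset16bit;
    uint16_t version;
    uint16_t hdroffset;
    uint16_t countryinfo;
    uint32_t extenblock;
} msg_header_view;

/* Little-endian readers (assumption: the file uses x86 byte order) */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the signature is wrong. */
int msg_parse_header(const uint8_t *buf, size_t len, msg_header_view *out)
{
    if (buf == NULL || out == NULL || len < MSGHEADER_SIZE)
        return -1;
    if (memcmp(buf, MSG_SIGNATURE, sizeof MSG_SIGNATURE) != 0)
        return -1;

    /* Field offsets follow the struct order with no padding */
    memcpy(out->identifier, buf + 8, 3);
    out->identifier[3] = '\0';
    out->numbermsg   = rd16(buf + 11);
    out->firstmsg    = rd16(buf + 13);
    out->offset16bit = buf[15];
    out->version     = rd16(buf + 16);
    out->hdroffset   = rd16(buf + 18);
    out->countryinfo = rd16(buf + 20);
    out->extenblock  = rd32(buf + 22);
    return 0;
}
```

Reading byte by byte avoids depending on compiler-specific packing pragmas, at the cost of spelling out the offsets by hand.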
The index is a funny animal: it contains the offset of each message. It is located after the Main Header and before the Country Info block.
- messageinfo->hdroffset is the size of the header and, as an offset, the start of the index.
- messageinfo->countryinfo - 1 is the end of the index.
Each index record points to a message using either a uint16_t or a uint32_t offset. A uint16_t tops out at 65535, so the file size presumably determines which one is used. I do not know where the original MKMSGF decides to switch from uint16_t to uint32_t, or why. I assume it is left over from a time when message files were smaller, or it was just to save space back in the day.
Either way, the offset size is determined by messageinfo->offset16bit: if 0, the index uses uint32_t; if 1, it uses uint16_t.
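The index rules above could be sketched roughly as follows. The function names `msg_index_count` and `msg_index_entry` are hypothetical, little-endian byte order is assumed, and the entry count is simply derived from the span between hdroffset and countryinfo (it should presumably match numbermsg, but that is not verified here).

```c
#include <stddef.h>
#include <stdint.h>

/* Number of index entries implied by the two header offsets.
 * The index spans hdroffset .. countryinfo - 1. */
size_t msg_index_count(uint16_t hdroffset, uint16_t countryinfo, int offset16bit)
{
    size_t entry_size = offset16bit ? sizeof(uint16_t) : sizeof(uint32_t);
    if (countryinfo <= hdroffset)
        return 0;
    return (size_t)(countryinfo - hdroffset) / entry_size;
}

/* Reads index entry i (0-based) into *offset_out.
 * Returns 0 on success, -1 if the entry lies outside the buffer. */
int msg_index_entry(const uint8_t *file, size_t file_len, uint16_t hdroffset,
                    int offset16bit, size_t i, uint32_t *offset_out)
{
    size_t entry_size = offset16bit ? 2u : 4u;
    size_t pos = (size_t)hdroffset + i * entry_size;

    if (file == NULL || offset_out == NULL || pos + entry_size > file_len)
        return -1;

    const uint8_t *p = file + pos;
    if (offset16bit) {
        /* offset16bit == 1: entries are uint16_t */
        *offset_out = (uint32_t)p[0] | ((uint32_t)p[1] << 8);
    } else {
        /* offset16bit == 0: entries are uint32_t */
        *offset_out = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                      ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    }
    return 0;
}
```

A caller would typically loop from 0 to `msg_index_count(...) - 1` and pass each `i` to `msg_index_entry`.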
The country info block is defined in mkmsgf.h by the FILECOUNTRYINFO structure, which has a packed size of 302 bytes (with CCHMAXPATH equal to 260).
// Country Info block of message file
typedef struct _FILECOUNTRYINFO
{
uint8_t bytesperchar; // Bytes per char (1 - SBCS, 2 - DBCS)
uint16_t country; // ID country
uint16_t langfamilyID; // Language family ID (As in CPI Reference)
uint16_t langversionID; // Language version ID (As in CPI Reference)
uint16_t codepagesnumber; // Number of codepages
uint16_t codepages[16]; // Codepages list (Max 16)
uint8_t filename[CCHMAXPATH]; // Name of file
uint8_t filler; // filler byte - not used
} FILECOUNTRYINFO, *PFILECOUNTRYINFO;
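To illustrate the layout, here is a small sketch that pulls the code page list out of a raw country info block. The name `msg_read_codepages` and the offset macros are hypothetical; the offsets are derived from the struct order with no padding, and little-endian byte order is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define FILECOUNTRYINFO_SIZE 302 /* size quoted above */
#define CODEPAGES_MAX        16  /* length of the codepages array */
#define OFF_CODEPAGESNUMBER  7   /* 1 + 2 + 2 + 2 bytes precede this field */
#define OFF_CODEPAGES        9   /* list starts right after the count */

/* Little-endian 16-bit read (assumption: x86 byte order) */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Copies the code page list into out.
 * Returns the number of code pages, or -1 if the block is too short
 * or the stored count exceeds the 16-entry array. */
int msg_read_codepages(const uint8_t *block, size_t len, uint16_t out[CODEPAGES_MAX])
{
    if (block == NULL || out == NULL || len < FILECOUNTRYINFO_SIZE)
        return -1;

    uint16_t n = rd16(block + OFF_CODEPAGESNUMBER);
    if (n > CODEPAGES_MAX)
        return -1;

    for (uint16_t i = 0; i < n; i++)
        out[i] = rd16(block + OFF_CODEPAGES + 2u * i);
    return (int)n;
}
```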
A message exists for each pointer in the index with the format:
Comp_ID | Message ...
Comp_ID - A three-character component identifier
Msg_Num - A four-digit message number
Msg_Type - A single character specifying message type (E, H, I, P, W, ?)
- E : Error
- H : Help
- I : Information
- P : Prompt
- W : Warning
- ? : no message assigned to this number
Colon_Space - A colon (:), followed by a blank space.
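The fields listed above can be illustrated with a small parser. This is only a sketch under a loud assumption: that the four fields appear back to back in the listed order, as in a line like `DOS0100E: File not found`. The `msg_line` type and `msg_parse_line` function are hypothetical names made up for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed form of one message line */
typedef struct {
    char        comp_id[4]; /* three-character component identifier + NUL */
    int         number;     /* four-digit message number */
    char        type;       /* one of E, H, I, P, W, ? */
    const char *text;       /* points into the input, after ": " */
} msg_line;

/* Assumes the layout Comp_ID Msg_Num Msg_Type ':' ' ' text.
 * Returns 0 on success, -1 if the line does not match. */
int msg_parse_line(const char *line, msg_line *out)
{
    if (line == NULL || out == NULL || strlen(line) < 10)
        return -1;

    memcpy(out->comp_id, line, 3);
    out->comp_id[3] = '\0';

    /* Four decimal digits at positions 3..6 */
    out->number = 0;
    for (int i = 3; i < 7; i++) {
        if (!isdigit((unsigned char)line[i]))
            return -1;
        out->number = out->number * 10 + (line[i] - '0');
    }

    /* Message type must be one of the documented characters */
    if (line[7] == '\0' || strchr("EHIPW?", line[7]) == NULL)
        return -1;
    out->type = line[7];

    /* Colon_Space: a colon followed by a blank */
    if (line[8] != ':' || line[9] != ' ')
        return -1;

    out->text = line + 10;
    return 0;
}
```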
Now, I think I have this figured out ... maybe.
If messageinfo->extenblock != 0, then an extended header exists. In that case, messageinfo->extenblock is the offset of an EXTHDR structure.
// extended header block
typedef struct _EXTHDR
{
uint16_t hdrlen; // length of ???
uint16_t numblocks; // number of additional FILECOUNTRYINFO blocks
} EXTHDR, *PEXTHDR;
I am not sure what hdrlen represents, but numblocks should be the number of FILECOUNTRYINFO blocks that follow. I assume that you can have messages in multiple languages in one file. I just need an example file to understand this better.
Decompiling OSO001.MSG results in a messageinfo->extenblock of 0x2ACF6 (175350), which points at the last bytes of the file. The EXTHDR results are:
** Has an extended header **
Ext header length: 302
Number ext blocks: 0
This is odd because hdrlen is the size of FILECOUNTRYINFO (302 bytes), but there are 0 blocks following. I just need to look into this more in the future.
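For reference, reading the two EXTHDR fields could look roughly like this. The function name `msg_read_exthdr` and its return codes are hypothetical, little-endian byte order is assumed, and nothing here interprets what hdrlen actually means.

```c
#include <stddef.h>
#include <stdint.h>

#define EXTHDR_SIZE 4 /* two uint16_t fields: hdrlen, numblocks */

/* Reads the EXTHDR located at offset extenblock.
 * Returns 0 on success, 1 if there is no extended header (extenblock == 0),
 * and -1 if the offset does not leave room for an EXTHDR in the buffer. */
int msg_read_exthdr(const uint8_t *file, size_t file_len, uint32_t extenblock,
                    uint16_t *hdrlen, uint16_t *numblocks)
{
    if (file == NULL || hdrlen == NULL || numblocks == NULL)
        return -1;
    if (extenblock == 0)
        return 1;
    if (extenblock > file_len || file_len - extenblock < EXTHDR_SIZE)
        return -1;

    const uint8_t *p = file + extenblock;
    *hdrlen    = (uint16_t)(p[0] | (p[1] << 8));
    *numblocks = (uint16_t)(p[2] | (p[3] << 8));
    return 0;
}
```

With the OSO001.MSG values above, this kind of read is what yields a header length of 302 and a block count of 0.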
If you are using the old IBM MKMSGF, the identifier and message lines must end with <LF> <CR>. See Original MKMSGF Issues for more details.