raP File Format - OFP

From Bohemia Interactive Community
Revision as of 14:37, 13 July 2006 by Mikero (talk | contribs) (work in progress, hands off please.)

Gah!

Don't touch this one, folks, just at the moment; it's a cut 'n paste from my website and needs *severe* re-wording.

I've plunked it here in my sandbox to get to work on it as and when....




Bin 2 CPP compression

The mission.sqm files contained within the three official BIS campaigns are compressed and not directly readable in a text editor such as Notepad. Some refer to this as being encrypted, which is misleading. It might be true that by compressing these files BIS intended, by proxy, to make them 'encrypted' but, essentially, they are simply compressed data, similar in intent to zip, rar or pbo files.

Various utilities exist which refer to binary <> cpp compression and extraction (or encoding and decoding). Again, these terms are misleading, because the file concerned is not executable binary data, just compressed strings and values.

The intention of this compression is to reduce the quantity of identically named strings and to produce a 'binary' file that closely reflects the overall, and very minimal, construct of the text version of any sqm file. The construct of a sqm file is minimal and quite rigid. There is no need here to define elaborately what a sqm file is, but it is worth understanding the basics of these files in order to see how little is required of a compression utility. The end result is that the structure of an 'encrypted' file represents very closely how the OFP engine works with all text data internally.

A sqm file contains only three types of construct: names, variables and classes. Names are simply

   name = variable;

Variables come in four flavours:

   name = "A string";
   name = 77;          // integer
   name = 1.855;       // float
   name[] = {......};  // an array containing more names, including (possibly) more arrays or more variables

For example:

   thing[]={ 1.0,  7.67,   "Elephants", fred[]={......} };

A compression utility encodes each of these basic types. The only other construct of a sqm file is the class:

   class classname [:inherit] { ...};

Classes are very similar to arrays[] and may contain multiply embedded classes. [:inherit] is optional and simply refers to another classname. This is the only other construct that a compression utility needs to encode.

An encrypted 'binary file' encodes everything as

class filename {

   number of embedded classes
       lots of embedded classes and variables

};
number of #defines (optional)
#define table (if any)

The beginning class, the filename, is not recorded in the human-readable text output. See below for #defines.

A compressed mission.sqm has the first 7 bytes of the file encoded as follows:

   "\0raP\004\0\0"

The rest of the file contains packets of 3 different construct types, as noted above, with the first byte of each packet defining what 'type' it is. Thus

struct Packet {

   byte    PacketType;	// 0,1 or 2
   ....... depends on packet type

};

Packet Type 0: Classname
Packet Type 1: Variables
Packet Type 2: Arrays

The very first packet encountered is a classname. It is the enclosing class for *everything* else in the file. The name of this class is the name of the file. It is *not* recorded in the human-readable text output.

Packet Type 0: Classname

   class Classname: InheritedClassName { Packets... };

struct ClassPacket {

byte		PacketType;		// = 0 == class
IndexedString 	Classname;
Asciiz		InheritedClassName;	// optional or zero length string
BIS_short	nImbeddedPackets;	// Iterates thru embedded Packet(s) can be zero

};

Packet Type 1: Variables

The byte following PacketType defines what type of variable this is. Thus

struct VarPacket {

byte		PacketType;		// = 1
byte		VarType;		// = 0 to 2
IndexedString 	SomeName;
.... depends on VarType

};

VarType 0: String
VarType 1: Float
VarType 2: LongInteger

   SomeName="SomeOtherName";

struct VarTypString {

byte		PacketType;	// = 1
byte		VarType;	// = 0
IndexedString	SomeName;
IndexedString	SomeOtherName;

};

   SomeName=1.23445;

struct VarTypFloat {

byte		PacketType;	// = 1
byte		VarType;	// = 1
IndexedString	SomeName;
float		value;		// 4 bytes

};

   SomeName=123;

struct VarTypLongInteger {

byte		PacketType;	// = 1
byte		VarType;	// = 2
IndexedString	SomeName;
int		value;		// 4 bytes

};
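
The header signature and the fixed-width values of packet type 1 can be illustrated with a short C sketch. It assumes the file has already been read into memory, that the 4-byte values are stored little-endian, and that floats are IEEE 754 single precision; neither the byte order nor the float format is stated above, so treat both as assumptions. The names has_rap_signature, read_le_u32, read_le_int and read_le_float are hypothetical helpers, not part of any BIS tool.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 7 signature bytes quoted above: "\0raP\004\0\0". */
static const unsigned char RAP_SIGNATURE[7] = { 0x00, 'r', 'a', 'P', 0x04, 0x00, 0x00 };

/* Returns 1 if buf starts with the raP signature, 0 otherwise. */
int has_rap_signature(const unsigned char *buf, size_t len)
{
    return len >= sizeof RAP_SIGNATURE
        && memcmp(buf, RAP_SIGNATURE, sizeof RAP_SIGNATURE) == 0;
}

/* Assembles 4 bytes into an unsigned 32-bit value (ASSUMED little-endian). */
uint32_t read_le_u32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* VarType 2: the 4-byte integer value. */
int32_t read_le_int(const unsigned char *p)
{
    return (int32_t)read_le_u32(p);
}

/* VarType 1: the 4-byte float value (ASSUMED IEEE 754 single precision). */
float read_le_float(const unsigned char *p)
{
    uint32_t bits = read_le_u32(p);
    float f;
    memcpy(&f, &bits, sizeof f);   /* reinterpret the bit pattern */
    return f;
}
```

A reader would typically call has_rap_signature on the start of the file before attempting to walk any packets, and fall back to treating the file as plain text when it returns 0.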

Packet Type 2: Arrays

Arrays[] contain one of four element types. They are the same variable types described above, with the addition of an embedded array type. Here, I refer to them as constants, simply because they are stand-alone values, not associated with a name. Thus

   SomeName[]={ constant,constant[],constant,....};

struct ArrayPacket {

byte		PacketType;		// = 2
IndexedString 	SomeName;
BIS_short	nConstTypes;		// iterate thru ConstTypes, can be 0
.... depends on ConstTypes

};

ConstType 0: String
ConstType 1: Float
ConstType 2: LongInteger
ConstType 3: Embedded_Array { constant, constant, ...};

   "SomeName",

struct ConstTypString {

byte		VarType;	// = 0
IndexedString	SomeName;

};

   1.234,

struct ConstTypFloat {

byte		VarType;	// = 1
float		value;		// 4 bytes

};

   123,

struct ConstTypLongInteger {

byte		VarType;	// = 2
int		value;		// 4 bytes

};

   {{constants...},{constants...},....},

struct ConstTypeArray {

byte		VarType;	// = 3
BIS_short	nConstTypes;		// iterate thru ConstTypes

.... depends on ConstTypes
};

With the above construct (embedded array), each embedded array can contain any constant, including another embedded array. The difference, of course, is that these embedded arrays have no individual name associated with them (unlike the packet array).

Added Wrinkles

#defines

Optionally, an encrypted file can contain a #define table after the filename class definition.

Long NumberOfDefines;

struct DefTable {

   Asciiz String;
   Long    value;

}[NumberOfDefines]
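
Reading the optional #define table could look like the sketch below. It assumes that a Long is a 4-byte little-endian integer (the width matches the 4-byte LongInteger above; the byte order is an assumption) and that an Asciiz is a zero-terminated string. DefEntry, read_long and read_def_table are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry of the #define table: a name and its value. */
typedef struct {
    const char *name;   /* points into the input buffer (zero-terminated) */
    int32_t     value;
} DefEntry;

/* ASSUMED encoding of a Long: 4 bytes, little-endian. */
static int32_t read_long(const unsigned char *p)
{
    return (int32_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/*
 * Parses "Long NumberOfDefines" followed by that many {Asciiz, Long} pairs.
 * Returns the number of entries stored in out, or -1 if the buffer is
 * truncated or the count does not fit into max_out entries.
 */
int read_def_table(const unsigned char *buf, size_t len, DefEntry *out, int max_out)
{
    size_t pos;
    int32_t count;
    int i;

    if (len < 4) return -1;
    count = read_long(buf);
    pos = 4;
    if (count < 0 || count > max_out) return -1;

    for (i = 0; i < count; i++) {
        /* Find the terminating zero of the Asciiz string. */
        const unsigned char *end = memchr(buf + pos, 0, len - pos);
        if (end == NULL) return -1;
        out[i].name = (const char *)(buf + pos);
        pos = (size_t)(end - buf) + 1;

        /* The 4-byte value follows the string. */
        if (len - pos < 4) return -1;
        out[i].value = read_long(buf + pos);
        pos += 4;
    }
    return (int)count;
}
```

The entries point into the caller's buffer rather than copying the strings, so the buffer must stay alive for as long as the table is used.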


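type definitions

Bis_Short

The value is stored in either one or two bytes.

/* Reads one Bis_Short. GetByte() returns the next byte of the file, or EOF.
   GetBisShort is a placeholder name. */
int GetBisShort(void)
{
    int val;

    if ((val = GetByte()) == EOF) return EOF;
    if (val & 0x80)   /* high bit set: a second byte follows */
    {
        int extra;

        if ((extra = GetByte()) == EOF) return EOF;
        val += (extra - 1) * 0x80;
    }
    return val;
}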
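
For experimenting without a file, the same decoding can be written against an in-memory buffer. This is a minimal sketch; read_bis_short and its parameters are hypothetical, and it returns -1 where the listing above returns EOF.

```c
#include <stddef.h>

/*
 * Decodes one Bis_Short from buf starting at *pos and advances *pos.
 * Returns the decoded value, or -1 if the buffer ends too early.
 */
int read_bis_short(const unsigned char *buf, size_t len, size_t *pos)
{
    int val;

    if (*pos >= len) return -1;
    val = buf[(*pos)++];

    if (val & 0x80) {
        int extra;

        if (*pos >= len) return -1;
        extra = buf[(*pos)++];
        /* Same arithmetic as the listing above; it is equivalent to
           (val & 0x7F) + extra * 0x80. */
        val += (extra - 1) * 0x80;
    }
    return val;
}
```

With this arithmetic the single byte 0x7F decodes to 127, the pair 0x80 0x01 to 128, and the pair 0x80 0x02 to 256.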

IndexedString

struct {

   Bis_Short    index;
   Asciiz          String;

};

A table of strings is built up as the file is read: each string is recorded against its index number when that specific index number is first encountered. Although the values appear to be ordinal (0,1,2,3,4,5), you should not assume so.

   0="Peter"
   1="Paul"
   2="Mary"

These are defined index strings and appear, individually and uniquely, within the mission.sqm as and when they are first encountered. From then on you will only see an index string as 1="", because 1 has been defined earlier on.

Note that this is unlike a PostScript dictionary, in that strings are defined on an ad hoc basis: not at the beginning, but only when first encountered. Thus

   0="peter" 0= 0= 1="mary" 0= 1= 0= 2="fred" 2= 1= etc
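
The define-on-first-use behaviour can be sketched in C as follows. StringTable, MAX_STRINGS and read_indexed_string are hypothetical names, and for brevity the index is read as a single byte, which only covers Bis_Short values below 0x80; a full reader would decode the index as a Bis_Short.

```c
#include <stddef.h>
#include <string.h>

#define MAX_STRINGS 0x80   /* this sketch handles one-byte indices only */

typedef struct {
    const char *entries[MAX_STRINGS];   /* NULL = index not defined yet */
} StringTable;

/*
 * Reads one IndexedString (index, then zero-terminated string) at *pos.
 * A non-empty string defines the index; an empty one refers back to the
 * earlier definition. Returns the resolved string, or NULL on error or
 * if the index was never defined.
 */
const char *read_indexed_string(StringTable *t, const unsigned char *buf,
                                size_t len, size_t *pos)
{
    unsigned index;
    const unsigned char *end;
    const char *s;

    if (*pos >= len) return NULL;
    index = buf[(*pos)++];
    if (index >= MAX_STRINGS) return NULL;   /* two-byte index: not handled */

    end = memchr(buf + *pos, 0, len - *pos);
    if (end == NULL) return NULL;            /* string not terminated */
    s = (const char *)(buf + *pos);
    *pos = (size_t)(end - buf) + 1;

    if (*s != '\0')
        t->entries[index] = s;               /* first encounter: define it */
    return t->entries[index];
}
```

Feeding it the bytes for 0="peter" 0= 1="mary" 0= 1= yields peter, peter, mary, peter, mary in that order, provided the table starts out with every entry set to NULL.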