raP File Format - OFP: Difference between revisions

From Bohemia Interactive Community

Revision as of 06:52, 14 July 2006

Gah!

Don't touch this one just at the moment, folks; it's a cut 'n paste from my website and needs *severe* re-wording.

I've plunked it here in my sandbox to get to work on it as and when....

Am including further info about Elite and ArmA.


Introduction

raP encoding applies to any human-readable text file in OFP that contains class statements. Examples of files that are, or should be, raPified are mission.sqm, config.cpp and description.ext.

In fact, any text file that contains class statements contains nothing but class statements. So much so that the entire contents of the file are considered to be a single class!

eg

class mission.sqm
{
  ...
};

The fact of the matter is, if you do not raPify these files, the engine will do so itself before using them (thus causing unnecessary CPU load).

raP encoding simply means that the data inherent in these types of files has been sanitised (stripped of comments and crud) and massaged into a form of indexed lookup table for the engine to use directly. Once done, the engine is free of the need to check for syntax errors, among other things. Hence, much faster processing.

These types of files were once known as 'encrypted' or 'binarised' files. They are no such thing. They are simply a cleaner, closer equivalent to what the engine uses internally. For instance, all your savegames are raP encoded (there is no text equivalent).

A raP encoded file is detected by the magic signature '\0raP' in the first four bytes of the file. Because of the leading 0 byte, no text file can inadvertently have this signature.

Importantly, the filename extension is immaterial.

The engine will work with config.cpp as a raP encoded entity, just as it would work with config.bin.

Tools

Various utilities exist which refer to binary <> cpp compression and extraction (or encoding and decoding). Again, these terms are misleading because the file concerned is not executable binary data, just tokenised strings and values.

Basics

There is no need here to elaborately define what a mission.sqm file is. But, it is worth understanding the basics of these (types of) files to understand the very small requirements needed to raPify them.

Class files contain only 3 types of construct:

ClassNames, TokenNames, Arrays

class classname [:inherit] {...};

[:inherit] is optional and simply refers to another classname.

(For your interest, the [] come from Backus-Naur Form and mean that whatever is within the [...] is optional. The [] do not appear in the text file.)

The class body, the {...}, contains more ClassNames, TokenNames, Arrays, or nothing at all.

TokenNames come in 3 flavours:

aString="A string";
anInteger=77;
aFloat=1.855;

For more on this subject, see TokenNameValueTypes

Arrays

anArray[]={......};

An array contains elements, possibly including more arrays (but not classes). Embedded arrays carry no name of their own:

 thing[]={ 1.0,  7.67,   "Elephants", {......} };

raPifying encodes each of these basic types.

Construct

All raPified data can be expressed as

class filename
{
   class FirstEmbeddedClass
   {
      ... TokenNames
      class FirstEmbeddedEmbeddedClass
      {
        ...
      };
      ...
   };
   ...
   class LastEmbeddedClass
   {
   };
};

followed by

  1. number of #defines (optional)
  2. define table (if any)

The beginning class, the filename, is not recorded in the human-readable text output. See below for #defines.

Header

A raPified file has the first 4 bytes of the file encoded as follows:

"\0raP"

For OFP and RESISTANCE the next three bytes are

"\004\0\0"

see elsewhere for Elite and ArmA.
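
The header check described above can be sketched roughly as follows. This is only an illustration: the enum and the function name rap_classify_header are made up for this example, and anything that carries the signature but not the OFP bytes is simply lumped together as "other" (Elite and ArmA are not decoded here).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not part of the format. */
enum rap_kind {
    RAP_NOT_RAP = 0,  /* no "\0raP" signature: treat as plain text      */
    RAP_OFP     = 1,  /* signature followed by the OFP/Resistance bytes */
    RAP_OTHER   = 2   /* signature present, other bytes follow          */
};

/* Classify a buffer holding the start of a file. */
enum rap_kind rap_classify_header(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x00, 'r', 'a', 'P' };
    static const unsigned char ofp[3]   = { 0x04, 0x00, 0x00 };

    assert(buf != NULL || len == 0);

    /* The leading zero byte is what keeps text files from matching. */
    if (len < sizeof magic || memcmp(buf, magic, sizeof magic) != 0)
        return RAP_NOT_RAP;

    if (len >= sizeof magic + sizeof ofp &&
        memcmp(buf + sizeof magic, ofp, sizeof ofp) == 0)
        return RAP_OFP;

    return RAP_OTHER;
}
```

Note that the test works on file contents only, which fits the point that the filename extension is immaterial.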

The rest of the file contains Class Body packets of 3 different construct types, with the first byte of each packet defining its type.

Thus

struct ClassBody
{
   byte    PacketType; // 0 Classname
                       // 1 TokenName
                       // 2 Array
   ....... depends on packet type
};

The very first packet encountered is a classname. It is the enclosing class for *everything* else in the file. The name of this class is the name of the file. It is *not* recorded in the human-readable text output.
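
As a small illustration of dispatching on that first byte, here is a sketch. The enum names and rap_packet_label are invented for this example; only the values 0, 1 and 2 come from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Packet type values as listed in struct ClassBody above. */
enum rap_packet_type {
    RAP_PKT_CLASS    = 0,
    RAP_PKT_VARIABLE = 1,
    RAP_PKT_ARRAY    = 2
};

/* Returns a label for the packet starting at 'packet', or NULL if the
   first byte is not one of the three known types. */
const char *rap_packet_label(const unsigned char *packet)
{
    assert(packet != NULL);
    switch (packet[0]) {
    case RAP_PKT_CLASS:    return "class";
    case RAP_PKT_VARIABLE: return "variable";
    case RAP_PKT_ARRAY:    return "array";
    default:               return NULL;
    }
}
```

A real reader would switch in the same way but hand off to a routine for each packet layout instead of returning a label.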

Packets

PacketType0: Classname

class Classname: InheritedClassName {  Packets... };
struct ClassPacket
{
 byte		PacketType;		// = 0 == class
 IndexedString 	Classname;
 Asciiz		InheritedClassName;	// optional or zero length string
 BIS_short	nImbeddedPackets;	// Iterates thru embedded Packet(s) can be zero
};


Having no embedded packets is quite legal.

The embedded packets, i.e. the body of this class, immediately follow this packet.

Bear in mind that the following data (the body of this class) may itself have further embedded packets, which may have further embedded packets, which may have.... All of which are contiguous in the datastream (OFP only).

PacketType1: Variables

The second byte of this packet, VarType, defines what type of variable it is. Thus

struct VarPacket
{
 byte		PacketType;		// = 1
 byte		VarType;		// = 0 to 2
 IndexedString 	SomeName;
 .... depends on VarType
};

VarType0 String

SomeName="SomeOtherName";

struct VarTypString 
{
 byte		PacketType;	// = 1
 byte		VarType;	// = 0
 IndexedString	SomeName;
 IndexedString	SomeOtherName;
};

VarType1 Float

SomeName=1.23445;

struct VarTypFloat
{
 byte		PacketType;	// = 1
 byte		VarType;	// = 1
 IndexedString	SomeName;
 float		value;		// 4 bytes
};

VarType2 Integer

SomeName=123;

struct VarTypLongInteger
{
 byte		PacketType;	// = 1
 byte		VarType;	// = 2
 IndexedString	SomeName;
 int		value;		// 4 bytes
};
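
The 4-byte float and integer values above could be pulled out of a byte buffer along these lines. Two things here are assumptions, not stated in the text: that the bytes are stored least significant first (little-endian, as on the PCs OFP runs on), and that the float is IEEE 754 single precision. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assemble 4 bytes, least significant first (assumed byte order). */
static uint32_t rap_u32le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* VarType 2 / ArrayType 2: 4-byte signed integer. */
int32_t rap_read_i32le(const unsigned char *p)
{
    assert(p != NULL);
    uint32_t u = rap_u32le(p);
    int32_t v;
    memcpy(&v, &u, sizeof v);   /* reinterpret the bits, no casts */
    return v;
}

/* VarType 1 / ArrayType 1: 4-byte float (IEEE 754 single assumed). */
float rap_read_f32le(const unsigned char *p)
{
    assert(p != NULL);
    assert(sizeof(float) == 4);
    uint32_t u = rap_u32le(p);
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

Building the value byte by byte keeps the sketch independent of the byte order of the machine doing the decoding.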

PacketType2: Arrays

Arrays[] contain four possible element types. They are the traditional variables mentioned above with an added tweak of an embedded array type.

thus

SomeName[]={ Element,Element[],"element",....};

struct ArrayPacket
{
 byte		PacketType;		// = 2
 IndexedString 	SomeName;
 BIS_short	nConstTypes;		// iterate thru ConstTypes, can be 0
 .... depends on ConstTypes
};

ArrayType0 String

"SomeName",

struct ArrayString 
{
 byte		VarType;	// = 0
 IndexedString	SomeName;
};

ArrayType1 Float

1.234,

struct ArrayFloat
{
 byte		VarType;	// = 1
 float		value;		// 4 bytes
};


ArrayType2 Integer

123,

struct ArrayInteger
{
 byte		VarType;	// = 2
 int		value;		// 4 bytes
};

ArrayType3 Embedded_Array

{ {...}, .... },

struct EmbeddedArray
{
 byte		VarType;	// = 3
 BIS_short	nArrayElements;	// iterate thru ConstTypes
... depends on element types in this embedded array
};

With the above construct (embedded array), each embedded array can contain any ArrayType, including another embedded array. The difference, of course, is that these embedded arrays have no individual name associated with them (unlike the packet array).
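
Because embedded arrays nest, walking an array body is naturally recursive. The sketch below counts the non-array elements in an array body, starting at the element count that follows the array's name. It takes the structs above literally: a string element is a Bis_Short index followed by a zero-terminated string (possibly empty), and floats and integers are 4 bytes each. The cursor struct, all function names and the depth limit of 32 are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Cursor over a byte buffer; 'ok' drops to 0 on truncated or bad data. */
struct rap_cursor {
    const unsigned char *p;
    const unsigned char *end;
    int ok;
};

static int cur_byte(struct rap_cursor *c)
{
    if (!c->ok || c->p >= c->end) { c->ok = 0; return 0; }
    return *c->p++;
}

/* One- or two-byte count, following the Bis_Short rule. */
static int cur_bis_short(struct rap_cursor *c)
{
    int val = cur_byte(c);
    if (val & 0x80)
        val += (cur_byte(c) - 1) * 0x80;
    return val;
}

static void cur_skip(struct rap_cursor *c, size_t n)
{
    if (!c->ok || (size_t)(c->end - c->p) < n) { c->ok = 0; return; }
    c->p += n;
}

static void cur_skip_asciiz(struct rap_cursor *c)
{
    while (c->ok && cur_byte(c) != 0) { /* consume up to and including NUL */ }
}

/* Counts strings, floats and integers, nested ones included.
   The result is only meaningful if c->ok is still 1 afterwards. */
long rap_count_array_scalars(struct rap_cursor *c, int depth)
{
    assert(c != NULL);
    if (depth > 32) { c->ok = 0; return 0; }   /* arbitrary safety limit */

    long total = 0;
    int n = cur_bis_short(c);
    for (int i = 0; i < n && c->ok; i++) {
        int type = cur_byte(c);
        if (!c->ok) break;
        switch (type) {
        case 0: cur_bis_short(c); cur_skip_asciiz(c); total++; break;
        case 1:                                   /* float   */
        case 2: cur_skip(c, 4); total++; break;   /* integer */
        case 3: total += rap_count_array_scalars(c, depth + 1); break;
        default: c->ok = 0; break;                /* unknown element type */
        }
    }
    return total;
}
```

For a body encoding { 1.0, "Elephants", { 123, 7 } } the count comes out as 4, with the cursor left just past the last element.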


Added Wrinkles

#defines

Optionally, a raPified file can contain a #define table after the filename class definition.

Long NumberOfDefines

Struct DefTable
{
   Asciiz String;
   Long    value;
}[NumberOfDefines];
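
Reading that table from a buffer might look like the sketch below. It assumes a Long is 4 bytes, least significant byte first, which the text does not spell out. The struct rap_define and the function rap_read_defines are invented names; the name pointers refer straight into the input buffer rather than copying the strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rap_define {
    const char *name;   /* Asciiz, points into the input buffer */
    int32_t     value;
};

/* 4 bytes, least significant first (assumed layout of a Long). */
static uint32_t def_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* 'buf' starts at NumberOfDefines. Returns the number of entries read,
   or -1 if the data is truncated or holds more than 'max' entries. */
long rap_read_defines(const unsigned char *buf, size_t len,
                      struct rap_define *out, size_t max)
{
    assert(buf != NULL && out != NULL);
    if (len < 4) return -1;

    uint32_t count = def_u32le(buf);
    size_t pos = 4;
    if (count > max) return -1;

    for (uint32_t i = 0; i < count; i++) {
        size_t start = pos;
        while (pos < len && buf[pos] != 0) pos++;   /* find the NUL */
        if (pos >= len) return -1;                  /* no terminator */
        pos++;                                      /* step over NUL */
        if (len - pos < 4) return -1;               /* value missing */
        out[i].name  = (const char *)(buf + start);
        out[i].value = (int32_t)def_u32le(buf + pos);
        pos += 4;
    }
    return (long)count;
}
```

The caller supplies the output array, so the sketch needs no allocation and refuses tables larger than the caller expects.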


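Type definitions

Bis_Short

The value is either one or two bytes.

int GetBisShort(void)  // GetByte() returns the next byte of the file, or EOF
{
 int val;

 if ((val = GetByte())==EOF) return EOF;
 if (val & 0x80)        // high bit set: a second byte follows
 {
  int extra;

  if ((extra = GetByte())==EOF) return EOF;
  val += (extra - 1) * 0x80;
 }
 return val;
}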
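
For anyone working from a memory buffer rather than a byte-at-a-time reader, the same rule can be sketched as below, with a matching encoder. The decoder follows the arithmetic given above; the encoder is simply the inverse worked out from it, not something taken from the engine, and both function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Decodes one Bis_Short from 'buf'. Returns the number of bytes used
   (1 or 2), or 0 if the buffer is too short. */
size_t bis_short_decode(const unsigned char *buf, size_t len, int *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 1) return 0;

    int val = buf[0];
    if (!(val & 0x80)) {            /* 0..127 fits in a single byte */
        *out = val;
        return 1;
    }
    if (len < 2) return 0;
    *out = val + (buf[1] - 1) * 0x80;
    return 2;
}

/* Inverse of the above (derived, see lead-in). Handles 0..0x7FFF.
   Returns the number of bytes written, or 0 if 'val' is out of range. */
size_t bis_short_encode(int val, unsigned char out[2])
{
    assert(out != NULL);
    if (val < 0 || val > 0x7FFF) return 0;
    if (val < 0x80) {
        out[0] = (unsigned char)val;
        return 1;
    }
    out[0] = (unsigned char)(0x80 | (val & 0x7F));  /* low 7 bits + flag */
    out[1] = (unsigned char)(val >> 7);             /* remaining bits    */
    return 2;
}
```

Worked through by hand: the bytes 0x80 0x01 decode to 128, 0xFF 0x01 to 255, and 0x80 0x02 to 256.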

IndexedString

struct IndexedString
{
   Bis_Short    index;
   Asciiz       String;
};

A table of strings is built up as the file is read: each string is recorded against its index number when that specific index number is first encountered. Although the values appear to be ordinal (0,1,2,3,4,5), you should not assume so.

 0="Peter"
 1="Paul"
 2="Mary"

These are defined index strings and appear, individually and uniquely, within the mission.sqm as and when they are first encountered. From then on you will only see an index string as 1="" because 1 has been defined earlier on. Note that this is unlike a PostScript dictionary in that strings are defined on an ad hoc basis, not at the beginning, only when encountered. Thus

 0="peter" 0= 0= 1="mary" 0= 1= 0= 2="fred" 2= 1= etc
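
A reader therefore has to keep its own table as it goes. A minimal sketch of that bookkeeping follows, assuming that a repeat use of an index arrives with an empty string, as the 1="" remark suggests. The table size, struct name and function name are invented, and a string that is genuinely empty on first use cannot be told apart from a repeat in this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define RAP_MAX_STRINGS 1024   /* arbitrary limit for this sketch */

struct rap_strtab {
    const char *entry[RAP_MAX_STRINGS];   /* NULL = not defined yet */
};

/* Resolve one IndexedString. 'text' is the Asciiz that followed the
   index in the file: non-empty the first time an index is seen, empty
   afterwards (assumption). Returns the string for 'index', or NULL if
   the index is out of range or was used before being defined. */
const char *rap_strtab_resolve(struct rap_strtab *t, int index,
                               const char *text)
{
    assert(t != NULL && text != NULL);
    if (index < 0 || index >= RAP_MAX_STRINGS)
        return NULL;
    if (t->entry[index] == NULL && text[0] != '\0')
        t->entry[index] = text;            /* first sighting defines it */
    return t->entry[index];
}
```

The table stores the pointers it is given, so the strings must stay alive for as long as the table is in use. Indexes are looked up directly rather than assumed to arrive in order, in keeping with the warning above.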