PBO File Format: Difference between revisions
Revision as of 03:14, 13 July 2006
Pbo file structure and packing method
Intro
Pbo originally stood for 'packed binary object'. Through use, however, it has come to mean a single 'package' that achieves a result, such as a mission or an addon.
A .pbo file is the output produced by the Mission Editor when 'exporting'. It contains nothing more (and nothing less) than the content of all the files and folders making up a mission, campaign, or addon. It is a single-file representation of a folder tree. The key point to grasp is that anything you make in its own folder, such as a mission, a campaign, or an addon, can be conveniently packaged into a single file called a pbo.
The engine internally expands any pbo back out to its original folder-tree form.
Additionally, the engine will work with the equivalent non-pbo versions of missions or campaigns but not, unfortunately, addons. Addons must be in pbo format to be usable by the engine.
Compression
In addition to simply packaging all the files and folders of a tree into a single file, some, all, or none of the files within can be compressed. Which types of file are compressed is entirely optional. Tools for creating compressed pbo files include MakePbo by Amalfi, among others. The intent behind compression was internet distribution and, in the 'good old days', simply reducing hard disk storage requirements. Compression (a mild form of run length encoding) is becoming less 'popular', as it does represent a load on the engine. Elite, for instance, cannot work with compressed pbo files.
Binarised raP
config.bin definitely, and often config.cpp, mission.sqm, or description.ext, are raPified. This is not the same as being compressed and is not part of the pbo decompression algorithm. The data for mission.sqm may indeed also be compressed within the pbo, but the resulting output is often a raPified version of the original mission.sqm text. It must be further decoded by utilities such as cpp2bin.
raPified files are often mistakenly referred to as 'encrypted' or 'binarised'. They are neither.
Main Format
The format of a pbo is extremely simple. It contains
- a header
- one, contiguous data block.
- a four byte simple accumulative checksum
The header defines each file contained in the pbo: its size, date, name, whether it is compressed, and where it 'is' in the following data block. Every file, even a zero length one, is recorded in the header, and each record is referred to as an 'entry'. Entries, and consequently the 'files' they refer to, are contiguous.
The last 'entry' should be blank, defining the next byte and all bytes thereafter to be the data block. However, Resistance format pbo's sometimes obscure this.
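Because entries and their data are contiguous, the position of each file follows from a running sum of the sizes stored in the header. A minimal Python sketch, assuming the header has already been parsed into (name, data size) pairs; the names data_offsets and header_size are illustrative only and are not part of the format:

```python
def data_offsets(entries, header_size):
    """Return {name: (offset, size)}; offsets are measured from the start of the pbo.

    entries     -- list of (name, data_size) pairs in header order
    header_size -- bytes occupied by the header, including its blank last entry
    """
    offsets = {}
    position = header_size              # the data block starts right after the header
    for name, data_size in entries:
        offsets[name] = (position, data_size)
        position += data_size           # files follow one another with no gaps
    return offsets


layout = data_offsets([("mission.sqm", 100), ("empty.txt", 0), ("init.sqs", 40)], 64)
```

A zero length file still gets an entry; it simply shares its offset with the file that follows it.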
Pbo Header Entry
A standard pbo entry is as follows:
struct entry
{
  Asciiz filename;      // a zero terminated string defining the full path and filename,
                        // relative to the name of this pbo.
                        // Last entry in header has an empty string ('\0' char only).
                        // Other fields in the last entry are filled by zero bytes.
                        // Last entry slightly modified for resistance.
  ulong  PackingMethod; // 0x00000000 uncompressed
                        // 0x43707273 packed
                        // 0x56657273 header start (resistance)
  ulong  OriginalSize;  // Unpacked: 0 or same value as the DataSize
                        // Packed: Size of file after unpacking.
                        // This value is needed for byte boundary unpacking
                        // since unpacking itself can lead to bleeding of up
                        // to 7 extra bytes.
  ulong  Reserved;
  ulong  TimeStamp;     // meant to be the unix filetime of Jan 1 1970 +, but often 0
  ulong  DataSize;      // The size in the data block.
                        // This is also the file size when not packed
};
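As an illustration, a single entry could be read along these lines in Python. This is a sketch under assumptions the text does not state outright: that a ulong is a 32 bit little endian value, and that filenames use one byte per character. The function name read_entry is hypothetical.

```python
import struct

def read_entry(buf, pos=0):
    """Parse one header entry from buf at pos; return (entry, next_pos)."""
    end = buf.index(b"\0", pos)                  # Asciiz: bytes up to the terminating zero
    filename = buf[pos:end].decode("latin-1")    # assumption: one byte per character
    # Assumption: the five ulong fields are 32 bit little endian values.
    fields = struct.unpack_from("<5I", buf, end + 1)
    names = ("PackingMethod", "OriginalSize", "Reserved", "TimeStamp", "DataSize")
    entry = {"filename": filename, **dict(zip(names, fields))}
    return entry, end + 1 + 20                   # 20 = five fields of four bytes each


raw = b"scripts\\init.sqs\0" + struct.pack("<5I", 0, 0, 0, 0, 123)
entry, next_pos = read_entry(raw)
```

The returned next_pos is where the following entry starts, so the function can be called repeatedly to step through a header.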
Null Entries
Entries with no file name indicate boundaries. The obvious one being end of header.
There are two 'boundaries' used in pbo headers.
- Start of header, found only in Resistance style pbo's, and
- End of header
An end of header is (of course) mandatory. It is normally indicated by all other fields in the struct also being zero. However, a sometimes seen case is a 'signature' in the packing method field that applies to the pbo overall: an indication that some, none, or all of the pbo is compressed. Somewhat useless.
This is often the case when a product entry (Resistance) is inserted as the first entry. Thus a uniquely used signature of 0x56657273 (all other fields zero) means a product entry, and 0x43707273 (all other fields zero) means end of header for some Resistance style pbo's.
The truth of the matter is that it doesn't much matter. Detection of the end of header and, when present, the start of header is indicated by no file name. The content of these entries is immaterial; the engine makes no use of them. However, certain 3rd party addon makers rely on the fact that *most* pbo extraction tools expect the fields to be zero (even though they don't matter). As such, this prevents _some_ pbo's from being extracted by those tools.
Data compression
Compression is indicated when the entry carries a signature of 0x43707273 and/or the file sizes in the entry do not match.
The following description also applies to the packing method employed in wrp OPWR files, which have no header info, simply a block of known output length that must be decoded.
The compressed data block consists of contiguous packets of differing lengths:
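One possible reading of this rule, sketched in Python. It combines the signature test with the OriginalSize comment in the entry struct (an unpacked file stores 0 or the same value as DataSize); treating either condition as sufficient is an assumption, and is_packed is a hypothetical name.

```python
PACKED = 0x43707273

def is_packed(packing_method, original_size, data_size):
    """Guess whether an entry's data is compressed."""
    if packing_method == PACKED:
        return True
    # Unpacked entries store OriginalSize as 0 or as a copy of DataSize.
    return original_size != 0 and original_size != data_size
```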
block
{
  packet1, ..., packetN
}
packet
{
  byte Format;
  byte packetdata[...]; // no fixed length
}
The packetdata contains a mixture of raw data bytes, which are passed directly to the output, and 2 byte pointers.
Format: the bit values determine what the packetdata contains. It is interpreted lsb first, thus:
BitN = 1 - append byte directly to file (read single byte)
BitN = 0 - pointer (read two bytes)
for example:
the format byte is 0x12; in binary notation: 00010010.
Two bits are set, so two bytes in the packet will be passed directly to the output when encountered, and there are six pointers.
In this example, reading from the lsb, the first two bytes of packetdata are read to make a pointer, the next byte is passed to the output, two more pointers follow, and so on. For the very last packet of the very last block, it is almost inevitable that there will be excess bits. These are ignored (truncated), as the final output length is always known from the Entry.
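The lsb first reading of a format byte can be checked with a few lines of Python. This is only an illustration for the binary value 00010010; the function name format_items is made up for the purpose.

```python
def format_items(format_byte):
    """List what each of the 8 items in a packet is, reading bits lsb first."""
    return ["raw" if format_byte & (1 << bit) else "pointer" for bit in range(8)]


items = format_items(0b00010010)
# Packetdata bytes consumed: one per raw byte, two per pointer.
consumed = sum(1 if item == "raw" else 2 for item in items)
```

For this format byte the packet therefore carries 14 bytes of packetdata after the format byte itself.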
A pointer consists of a 12 bit address and a 4 bit run length.
The pointer is a reference to somewhere in the previous 4k of built output. Obviously, it has to be in range of the currently built output. Given Intel's little endian word format, the bytes b1 and b2 (in file order) form a short word value B2B1.
The format of B2B1 is unfortunately AAAA LLLL AAAAAAAA, requiring a bit of shift mask fiddling.
The address refers to the start of some data in the currently rebuilt part of the file. It is a distance counted back from the current length of the reconstructed part of the file (FL).
The run length of the data to be copied, the 'pattern', has 4 bits and therefore, in theory, 0 to 15 bytes can be duplicated. In practice 3 is added to the stored value, giving 3..18 bytes, because copying 0, 1 or 2 bytes makes no sense.
Relative position (rpos) into the currently built output is calculated as
rpos = FL - ((B2B1 & 0x00FF) + ((B2B1 & 0xF000) >> 4))
The length of the data block: rlen
rlen = ((B2B1 & 0x0F00) >> 8) + 3
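As a worked example of the shift and mask fiddling, the Python sketch below splits a pointer into rpos and rlen. The byte values 0x34 and 0x12 and the length 1000 are arbitrary illustration values, and decode_pointer is a hypothetical helper name.

```python
def decode_pointer(b1, b2, fl):
    """Split pointer bytes b1, b2 (file order) into (rpos, rlen) for output length fl."""
    b2b1 = b1 | (b2 << 8)                                   # little endian short word
    address = (b2b1 & 0x00FF) + ((b2b1 & 0xF000) >> 4)      # 12 bit address
    rpos = fl - address                                     # counted back from the end
    rlen = ((b2b1 & 0x0F00) >> 8) + 3                       # 4 bit length, biased by 3
    return rpos, rlen


rpos, rlen = decode_pointer(0x34, 0x12, 1000)   # B2B1 = 0x1234
```

Here the address is 0x134 (308), so rpos is 692, and the length nibble 2 gives rlen 5.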
With the values of rpos and rlen there are three basic situations possible:
rpos + rlen < FL // bytes to copy are within the existing reconstructed data
In this situation the data block is added to the end of the file, giving a new length of FL = FL + rlen.
rpos + rlen > FL // data to copy exceeds what's available
In this situation the available data has a length of FL - rpos, and it is added to the reconstructed file repeatedly until FL = FL,Initial + rlen.
rpos + rlen < 0 // pointer lies before the start of the output
This is a special case where spaces are added to the decoded file until FL = FL,Initial + rlen.
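Putting the pieces together, a decoder might look like the Python sketch below. It is not taken from any of the listed tools. It copies byte by byte, which covers the three situations in one loop: bytes inside the existing output are copied, a run that overtakes the end of the output repeats the available pattern, and positions before the start of the output yield spaces. Treating a partly negative run the same way, and rejecting an address of zero, are assumptions. The name unpack is illustrative.

```python
def unpack(data, original_size):
    """Decode packed bytes into exactly original_size bytes of output."""
    out = bytearray()
    i = 0
    while len(out) < original_size:
        fmt = data[i]                       # format byte of the next packet
        i += 1
        for bit in range(8):                # lsb first
            if len(out) >= original_size:
                break                       # excess bits of the last packet are ignored
            if fmt & (1 << bit):
                out.append(data[i])         # raw byte, passed straight to the output
                i += 1
                continue
            b2b1 = data[i] | (data[i + 1] << 8)
            i += 2
            address = (b2b1 & 0x00FF) + ((b2b1 & 0xF000) >> 4)
            if address == 0:
                raise ValueError("pointer refers to data not yet written")
            rpos = len(out) - address
            rlen = ((b2b1 & 0x0F00) >> 8) + 3
            for k in range(rlen):
                if len(out) >= original_size:
                    break                   # truncate bleed past the known length
                src = rpos + k
                out.append(out[src] if src >= 0 else 0x20)   # 0x20 is a space
    return bytes(out)


# Three raw bytes 'abc', then a pointer with address 3 and rlen 6 (B2B1 = 0x0303).
packed = bytes([0b00000111, 0x61, 0x62, 0x63, 0x03, 0x03])
result = unpack(packed, 9)
```

In the usage example rpos is 0 and rpos + rlen exceeds FL (3), so the three available bytes are repeated until six bytes have been added.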
Bibliography
confucious : http://www.ofpec.com/editors/resource_view.php?id=414
ofpinternals : http://www.ofpec.com/editors/resource_view.php?id=147
Bin2Cpp : http://www.ofpec.com/editors/resource_view.php?id=665
Cpp2Bin : http://www.ofpec.com/editors/resource_view.php?id=333
Encryption : http://www.ofpec.com/editors/resource_view.php?id=830
Info cpp<>bin
Res Pbos : http://www.ofpec.com/editors/resource_view.php?id=833
addenda to this doc
Amalfi UnPbo : http://www.ofpec.com/editors/resource_view.php?id=358
Amalfi MakePbo : http://www.ofpec.com/editors/resource_view.php?id=357
Winpbo : http://www.ofpec.com/editors/resource_view.php?id=777
DePboDLL : http://www.ofpec.com/editors/resource_view.php?id=828