Difference between revisions of "BIN (File Format)"

From MK8
(Integer arrays are not always integer arrays)
(10 intermediate revisions by one other user not shown)
Line 1: Line 1:
'''BIN''' is a file format used in [[Mario Kart 8]] to store binary lookup tables. They consist of several sections as defined in the file / main header, in which there are groups containing specific elements of data of a given type.
+
'''BIN''' is a file format used in [[Mario Kart 8]] to store binary lookup tables, like item probabilities at distances to the lead racer, the distances for this themselves, course information, audio information, controller mappings (possibly), kart body/tire/glider settings, engine statistics or (in Mario Kart 8 Deluxe) software racer skill (AI) and user interface configuration.
  
The format is similar to the one used in '''Mario Kart 7''', though it strips a lot of textual data describing the entries of the groups, which makes giving meaning to the values of the elements harder, though many elements are simply evolved or even ported from Mario Kart 7.
+
BIN files are found in the [[Filesystem/content/common/mush]] directory.
  
They are used to lookup typical tabular data like the item probabilities at distances to the lead racer, the distances for this themselves, course information, audio information, controller mappings (possibly), kart body / tire / glider settings or engine statistics.
+
== Format ==
 +
Each BIN file consists of several sections as defined in the file header. The sections have a unique identifier, which the game uses to determine how to parse the data available after the section header.
  
= Format =
+
Mario Kart 8 (Wii U) stores the files in big endian, Mario Kart 8 Deluxe in little endian.
  
The files are stored in big endian. In the following, C# data types are used to describe the data.
+
The format is similar to the one used in '''Mario Kart 7''', though it strips a lot of textual data describing the entries of the groups, which makes giving meaning to the values of the elements harder, though many elements are simply evolved or even ported from Mario Kart 7.
 
 
== Header ==
 
  
 +
=== File Header ===
 
Each BIN file starts with a main header which provides information about the available sections in the file.
 
Each BIN file starts with a main header which provides information about the available sections in the file.
 
{| class="wikitable"
 
{| class="wikitable"
! Offset
+
! Offset !! Size !! Type !! Description
! Size
 
! Type
 
! Description
 
 
|-
 
|-
| 0x00
+
| 0x00 || 4 || UInt32 || '''File identifier'''. Takes the form of a 4 character ASCII string acronym of the file's purpose.
| 4
 
| char[4]
 
| '''File identifier'''. Takes the form of a 4 character long ASCII string acronym of the file's purpose.
 
 
|-
 
|-
| 0x04
+
| 0x04 || 4 || Int32 || '''File size''' in bytes. Sometimes slightly off from the real file size, unclear why this is the case.
| 4
 
| int
 
| '''File size''' in bytes. Sometimes slightly off from the real file size, unclear why this is the case.
 
 
|-
 
|-
| 0x08
+
| 0x08 || 2 || Int16 || '''Number of sections''' following the header.
| 2
 
| short
 
| '''Number of sections''' following the header.
 
 
|-
 
|-
| 0x0A
+
| 0x0A || 2 || Int16 || '''Header size'''. Required to compute absolute offsets to the sections (s. below).
| 2
 
| short
 
| {{Unknown|Unknown value}}
 
 
|-
 
|-
| 0x0C
+
| 0x0C || 4 || Int32 || '''Version number'''. Always 1000. Can be used to determine endianness.
| 4
 
| int
 
| '''Version number'''. Always seems to be 0x00001000.
 
 
|-
 
|-
| 0x10
+
| 0x10 || 4 * Number of sections || Int32[numberOfSections] || '''Offsets to the sections''', relative to the end of the header.
| 4 * Number of sections
 
| int[numberOfSections]
 
| '''Offsets to the sections''', relative to the end of the header.
 
This means that the first offset is always 0, as the first section always starts directly after the header.
 
 
|}
 
|}
  
== Section ==
+
=== Section ===
 +
A section begins with a section header which stores a unique identifier and some parameters required to parse the following section data. The start of each section is aligned by 4 bytes.
  
Each section header at the offsets given in the main header describes the number of groups of data in the section, the number of elements in each group, and the type of the elements.
 
 
{| class="wikitable"
 
{| class="wikitable"
! Offset
+
! Offset !! Size !! Type !! Description
! Size
 
! Type
 
! Description
 
 
|-
 
|-
| 0x00
+
| 0x00 || 4 || UInt32 || '''Section identifier'''. Takes the form of a 4 character ASCII string acronym of the section's purpose.
| 4
 
| char[4]
 
| '''Section identifier'''. Takes the form of a 4 character long ASCII string acronym of the section's purpose.
 
 
|-
 
|-
| 0x04
+
| 0x04 || 2 || Int16 || '''Count''' parameter. Typically used to specify the number of elements in the data arrays (s. next parameter).
| 2
 
| short
 
| '''Element count'''. The number of elements in a group.
 
 
|-
 
|-
| 0x06
+
| 0x06 || 2 || Int16 || '''Repeat''' parameter. Typically used to specify the number of data arrays in this section.
| 2
 
| short
 
| '''Group count'''. The number of groups in the section.
 
 
|-
 
|-
| 0x08
+
|- bgcolor="#AAAAFF"
| 4
+
| colspan="4" align="center" | '''if Mario Kart 8'''
| int
+
|- bgcolor="#DDDDFF"
| '''Type ID'''. Depending on this value, the format of the section data is different.
+
| 0x08 || 4 || Int32 || '''Additional''' parameter. Can be used to assume the type of data following, but this is often not correct.
 +
|-
 +
|- bgcolor="#FFAAFF"
 +
| colspan="4" align="center" | '''if Mario Kart 8 Deluxe'''
 +
|- bgcolor="#FFDDFF"
 +
| 0x08 || 4 || Int32 || '''Unknown''' parameter.
 +
|- bgcolor="#FFDDFF"
 +
| 0x0C || 4 || Int32 || '''Additional''' parameter. Can be used to assume the type of data following, but this is often not correct.
 
|}
 
|}
  
Groups are simply arrays of the given number of elements. Depending on the type ID provided in the section header, the format (and thus size) of the elements differs. The type IDs seem to be the same as the ones used in Mario Kart 7.
+
=== Section Data ===
 +
Depending on the ''Section identifier'', the game knows how to interpret the data following in the section, with the help of the common parameters given in the section header. It can be assumed that the data is directly copied into console memory and then cast to structures. Thus, it is required to know which ''Section identifier'' maps to which structure.
  
The start and end of each section is aligned by 4 bytes, which is important for those sections containing strings.
+
However, even without knowing these structures, several generic section data formats are seen, which are described in the following. This list is not exhaustive.
  
=== Integer / Float Array (ID 0x00000000) ===
+
==== 3-dimensional Dword Array ====
 +
The section data consists of a 3-dimensional array of 4-byte integer or float values. The length of each dimension is given in the section header, except for the last dimension, which has to be computed manually if the resulting structure is not known:
  
The group elements are simply integer or float arrays of a specific length. The game code knows the length by switching on the section identifier (it can be assumed it directly reads those into structures holding only integer members). Without this knowledge, the length of the element arrays can still be computed with the following formula:
+
{| class="wikitable"
 +
! Dimension !! Length
 +
|-
 +
| 0 || '''Repeat''' parameter given in section header.
 +
|-
 +
| 1 || '''Count''' parameter given in section header.
 +
|-
 +
| 2 || section data size / '''Repeat''' / '''Count''' / sizeof(int)
 +
|}
  
elementLength = sectionSizeWithoutHeader / (groupCount * elementCount) / sizeof(int)
+
==== 2-dimensional String Array ====
  
Whether the type is of integer or float has to be determined by the programmer as the game also knows about this by code.
+
The data in this section are arrays of offsets pointing to 0-terminated strings following them. The number of string arrays is given in the '''Repeat''' and the length of each array in the '''Count''' parameter of the section header. Each array is stored as follows:
  
A group looks as follows:
 
 
{| class="wikitable"
 
{| class="wikitable"
! Offset
+
! Size !! Type !! Description
! Size
+
|-
! Type
+
| 4 * elementCount || Int32[elementCount] || '''String offsets''', relative to the end of the last offset. E.g., the first offset is always 0.
! Description
 
 
|-
 
|-
| 0x00
+
| - || String[elementCount] || '''Strings''', 0-terminated, pointed to by the preceding offsets.
| 4 * elementCount * elementLength
 
| int[elementCount][elementLength] or
 
float[elementCount][elementLength]
 
| '''Integer or float arrays'''.
 
 
|}
 
|}
  
=== String Array (ID 0x000000A0) ===
+
==== 2-dimensional Indexed String Array ====
 +
 
 +
The data in this section are arrays of increasing indices and offsets pointing to 0-terminated strings following them. The number of string arrays is given in the '''Repeat''' and the length of each array in the '''Count''' parameter of the section header. Each array is stored as follows:
  
The group elements are an array of offsets pointing and followed by an array of 0-terminated strings. A group looks as follows:
 
 
{| class="wikitable"
 
{| class="wikitable"
! Offset
+
! Size !! Type !! Description
! Size
 
! Type
 
! Description
 
 
|-
 
|-
| 0x00
+
| 8 * elementCount || Entry[elementCount] || '''String entries''', each stored as follows:
| 4 * elementCount
+
|- bgcolor="#DDDDDD"
| int[elementCount]
+
| 1 || Byte || '''Index''' of this string. Starts at 0 and increased by 1.
| '''String offsets''', relative to the end of the last one. E.g., the first offset is always 0.
+
|- bgcolor="#DDDDDD"
 +
| 3 || - || '''Padding'''.
 +
|- bgcolor="#DDDDDD"
 +
| 4 || Int32 || '''String offset''', relative to the end of the last offset. E.g., the first offset is always 0.
 
|-
 
|-
| -
+
| - || String[elementCount] || '''Strings''', 0-terminated, pointed to by the preceding offsets.
| -
 
| string[elementCount]
 
| '''Strings''', 0-terminated, pointed to by the preceding offsets.
 
 
|}
 
|}
  
=== Other Sections ===
+
== Tools ==
 
+
The following tools can handle BIN files:
There are several other types holding other values (most ending with a list of strings) which have not been covered yet here.
 
 
 
= Tools =
 
 
 
The following tools can operate on BIN files:
 
  
* [https://github.com/Syroot/NintenTools.MarioKart8 NintenTools.MarioKart8] provides a .NET library to access BIN data
+
* [[BIN Editor]], by [[Ray Koopa]]: can visualize and modify data stored in Item.bin and Performance.bin files
 +
* [https://github.com/Syroot/NintenTools.MarioKart8 NintenTools.MarioKart8], by [[Ray Koopa]]: provides a .NET library to access BIN data
  
 
[[Category:File Format]]
 
[[Category:File Format]]

Revision as of 19:19, 13 September 2017

BIN is a file format used in Mario Kart 8 to store binary lookup tables, such as item probabilities at given distances to the lead racer, those distances themselves, course information, audio information, (possibly) controller mappings, kart body/tire/glider settings, engine statistics and, in Mario Kart 8 Deluxe, computer-controlled racer (AI) skill and user interface configuration.

BIN files are found in the Filesystem/content/common/mush directory.

Format

Each BIN file consists of several sections as defined in the file header. Each section has a unique identifier, which the game uses to determine how to parse the data following the section header.

Mario Kart 8 (Wii U) stores the files in big endian, Mario Kart 8 Deluxe in little endian.

The format is similar to the one used in Mario Kart 7, but it strips much of the textual data describing the entries of the groups. This makes it harder to assign meaning to the element values, although many elements are simply evolved or even ported from Mario Kart 7.

File Header

Each BIN file starts with a main header which provides information about the available sections in the file.

{| class="wikitable"
! Offset !! Size !! Type !! Description
|-
| 0x00 || 4 || UInt32 || '''File identifier'''. Takes the form of a 4 character ASCII string acronym of the file's purpose.
|-
| 0x04 || 4 || Int32 || '''File size''' in bytes. Sometimes slightly off from the real file size; it is unclear why.
|-
| 0x08 || 2 || Int16 || '''Number of sections''' following the header.
|-
| 0x0A || 2 || Int16 || '''Header size'''. Required to compute absolute offsets to the sections (see below).
|-
| 0x0C || 4 || Int32 || '''Version number'''. Always 1000. Can be used to determine endianness.
|-
| 0x10 || 4 * Number of sections || Int32[numberOfSections] || '''Offsets to the sections''', relative to the end of the header.
|}
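
The header layout above can be sketched in Python as follows. This is only an illustration, not code taken from the game or from any existing tool: the function name <code>parse_file_header</code> and the returned dictionary keys are made up for this example. It also assumes that the version field equals decimal 1000 or hexadecimal 0x1000, since revisions of this page have stated both, and it uses that field to guess the byte order.

```python
import struct

# Assumption: the page has listed the version as both 1000 and 0x00001000,
# so both values are accepted when probing the byte order.
KNOWN_VERSIONS = (1000, 0x1000)


def parse_file_header(data):
    """Parse the main header of a BIN file held in a bytes object."""
    # Probe big endian (Wii U) first, then little endian (Deluxe).
    for order in (">", "<"):
        (version,) = struct.unpack_from(order + "i", data, 0x0C)
        if version in KNOWN_VERSIONS:
            break
    else:
        raise ValueError("version field matches neither byte order")

    identifier = data[0x00:0x04]  # raw bytes, left uninterpreted
    file_size, section_count, header_size = struct.unpack_from(
        order + "ihh", data, 0x04)
    offsets = struct.unpack_from(f"{order}{section_count}i", data, 0x10)
    return {
        "byte_order": order,
        "identifier": identifier,
        "file_size": file_size,
        "section_count": section_count,
        "header_size": header_size,
        "version": version,
        # Offsets are relative to the end of the header.
        "section_starts": [header_size + offset for offset in offsets],
    }
```

The <code>section_starts</code> entry holds absolute positions, obtained by adding the header size to each relative offset.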

Section

A section begins with a section header which stores a unique identifier and some parameters required to parse the following section data. The start of each section is aligned by 4 bytes.

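{| class="wikitable"
! Offset !! Size !! Type !! Description
|-
| 0x00 || 4 || UInt32 || '''Section identifier'''. Takes the form of a 4 character ASCII string acronym of the section's purpose.
|-
| 0x04 || 2 || Int16 || '''Count''' parameter. Typically used to specify the number of elements in the data arrays (see the next parameter).
|-
| 0x06 || 2 || Int16 || '''Repeat''' parameter. Typically used to specify the number of data arrays in this section.
|- bgcolor="#AAAAFF"
| colspan="4" align="center" | '''if Mario Kart 8'''
|- bgcolor="#DDDDFF"
| 0x08 || 4 || Int32 || '''Additional''' parameter. Can be used to guess the type of data following, but this guess is often not correct.
|- bgcolor="#FFAAFF"
| colspan="4" align="center" | '''if Mario Kart 8 Deluxe'''
|- bgcolor="#FFDDFF"
| 0x08 || 4 || Int32 || '''Unknown''' parameter.
|- bgcolor="#FFDDFF"
| 0x0C || 4 || Int32 || '''Additional''' parameter. Can be used to guess the type of data following, but this guess is often not correct.
|}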
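
A minimal Python sketch of reading a section header is given below. The function name <code>parse_section_header</code> and the dictionary keys are hypothetical. The sketch assumes that a little-endian file is a Mario Kart 8 Deluxe file (16-byte section header) and a big-endian file is a Wii U file (12-byte section header), following the endianness statement earlier on this page; <code>order</code> is a <code>struct</code> byte-order prefix, <code>">"</code> or <code>"<"</code>.

```python
import struct


def parse_section_header(data, start, order):
    """Read the section header at absolute position `start`."""
    # Assumption: little endian implies the Deluxe layout.
    deluxe = order == "<"
    identifier = data[start:start + 4]  # raw bytes, left uninterpreted
    count, repeat = struct.unpack_from(order + "hh", data, start + 4)
    if deluxe:
        # Deluxe inserts an unknown Int32 before the additional parameter.
        unknown, additional = struct.unpack_from(order + "ii", data, start + 8)
        size = 16
    else:
        unknown = None
        (additional,) = struct.unpack_from(order + "i", data, start + 8)
        size = 12
    return {
        "identifier": identifier,
        "count": count,
        "repeat": repeat,
        "unknown": unknown,
        "additional": additional,
        "data_start": start + size,  # first byte of the section data
    }
```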

Section Data

Depending on the Section identifier, the game knows how to interpret the data following in the section, with the help of the common parameters given in the section header. It can be assumed that the data is directly copied into console memory and then cast to structures. Thus, it is required to know which Section identifier maps to which structure.

However, even without knowing these structures, several generic section data formats can be observed; they are described below. This list is not exhaustive.

3-dimensional Dword Array

The section data consists of a 3-dimensional array of 4-byte integer or float values. The length of each dimension is given in the section header, except for the last dimension, which has to be computed manually if the resulting structure is not known:

{| class="wikitable"
! Dimension !! Length
|-
| 0 || '''Repeat''' parameter given in section header.
|-
| 1 || '''Count''' parameter given in section header.
|-
| 2 || section data size / '''Repeat''' / '''Count''' / sizeof(int)
|}
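
The length computation for the last dimension can be illustrated with the following Python sketch. The function name <code>read_dword_array</code> and its parameters are hypothetical. It assumes that the caller knows where the section data ends (for instance the start of the next section or the end of the file; trailing padding would distort the result), that the values are laid out with the Repeat dimension outermost, and that the caller decides whether the values are integers or floats.

```python
import struct


def read_dword_array(data, data_start, data_end, count, repeat, order,
                     as_float=False):
    """Read section data as a [repeat][count][inner] array of 4-byte values."""
    # Length of the last dimension: data size / Repeat / Count / sizeof(int).
    inner = (data_end - data_start) // repeat // count // 4
    code = "f" if as_float else "i"
    flat = struct.unpack_from(
        f"{order}{repeat * count * inner}{code}", data, data_start)
    # Slice the flat tuple into nested lists, Repeat outermost.
    return [
        [
            list(flat[(r * count + c) * inner:(r * count + c + 1) * inner])
            for c in range(count)
        ]
        for r in range(repeat)
    ]
```

For example, 48 bytes of section data with a Count of 3 and a Repeat of 2 yield a last dimension of length 48 / 2 / 3 / 4 = 2.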

2-dimensional String Array

The data in this section are arrays of offsets pointing to 0-terminated strings following them. The number of string arrays is given in the Repeat and the length of each array in the Count parameter of the section header. Each array is stored as follows:

{| class="wikitable"
! Size !! Type !! Description
|-
| 4 * elementCount || Int32[elementCount] || '''String offsets''', relative to the end of the last offset. For example, the first offset is always 0.
|-
| - || String[elementCount] || '''Strings''', 0-terminated, pointed to by the preceding offsets.
|}
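
Reading one such string array could look like the following Python sketch. The function name <code>read_string_array</code> is hypothetical, and the strings are assumed to be ASCII, which this page does not confirm. The function also returns the position directly after the last terminator; how consecutive arrays follow each other (for example, whether padding sits between them) is not documented here, so treat that value only as a starting point.

```python
import struct


def read_string_array(data, start, count, order):
    """Read `count` offsets at `start` and the strings they point to."""
    offsets = struct.unpack_from(f"{order}{count}i", data, start)
    base = start + 4 * count  # offsets are relative to the end of the table
    strings = []
    end = base
    for offset in offsets:
        pos = base + offset
        terminator = data.index(b"\0", pos)  # strings are 0-terminated
        # Assumption: ASCII text; undecodable bytes are replaced.
        strings.append(data[pos:terminator].decode("ascii", errors="replace"))
        end = max(end, terminator + 1)
    return strings, end
```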

2-dimensional Indexed String Array

The data in this section are arrays of increasing indices and offsets pointing to 0-terminated strings following them. The number of string arrays is given in the Repeat and the length of each array in the Count parameter of the section header. Each array is stored as follows:

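{| class="wikitable"
! Size !! Type !! Description
|-
| 8 * elementCount || Entry[elementCount] || '''String entries''', each stored as follows:
|- bgcolor="#DDDDDD"
| 1 || Byte || '''Index''' of this string. Starts at 0 and increases by 1 with each entry.
|- bgcolor="#DDDDDD"
| 3 || - || '''Padding'''.
|- bgcolor="#DDDDDD"
| 4 || Int32 || '''String offset''', relative to the end of the last offset. For example, the first offset is always 0.
|-
| - || String[elementCount] || '''Strings''', 0-terminated, pointed to by the preceding offsets.
|}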
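
The indexed variant can be sketched in Python in a similar way. The function name <code>read_indexed_string_array</code> is hypothetical. The sketch assumes that the index byte comes first in each 8-byte entry, followed by the three padding bytes, in both byte orders, and that the strings are ASCII; neither assumption is confirmed by this page.

```python
import struct


def read_indexed_string_array(data, start, count, order):
    """Read `count` 8-byte entries at `start` and the strings they point to."""
    # One entry: index byte, 3 padding bytes, Int32 string offset.
    entry = struct.Struct(order + "B3xi")
    base = start + entry.size * count  # offsets are relative to the table end
    result = []
    for n in range(count):
        index, offset = entry.unpack_from(data, start + n * entry.size)
        pos = base + offset
        terminator = data.index(b"\0", pos)  # strings are 0-terminated
        # Assumption: ASCII text; undecodable bytes are replaced.
        text = data[pos:terminator].decode("ascii", errors="replace")
        result.append((index, text))
    return result
```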

Tools

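The following tools can handle BIN files:

* BIN Editor, by Ray Koopa: can visualize and modify data stored in Item.bin and Performance.bin files
* NintenTools.MarioKart8 (https://github.com/Syroot/NintenTools.MarioKart8), by Ray Koopa: provides a .NET library to access BIN data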