Ghost Data (File Format)

Revision as of 16:05, 11 January 2023 by Scutlet (talk | contribs) (Added replay data info; updated or cleaned up other sections)
Under Construction
This article is not finished. Help improve it by adding accurate information or correcting grammar and spelling.
Platform Notice
This article is about Mario Kart 8 for Wii U. Mario Kart 8 Deluxe uses a similar file format, but documentation of that format is still a work in progress, and the offsets are different.

In Mario Kart 8, Ghost Data is stored in DAT-files. This includes player ghosts, staff ghosts, and downloaded ghosts ("time trial ghosts" in short). MKTV replays have a similar file format, and relics of such replays are thus present in time trial ghost data. Each ghost is stored in a separate DAT-file.

Player Ghosts, Downloaded Ghosts, and MKTV Replays are stored in the user's save folder. Staff Ghosts are stored among the game files.

Differences with Mario Kart Wii

In Mario Kart Wii, as well as earlier titles, ghost files most notably store controller inputs. When a ghost is viewed, the game replays these controller inputs. If the game is perfectly deterministic, the same controller inputs should always yield the same finishing time. In practice, however, Mario Kart Wii is not quite able to recreate the circumstances in which the ghost was recorded, leading to the rare Wiggler Glitch. This causes the ghost to desynchronise from the original Time Trial, often failing to finish.

In Mario Kart 8, this is different. In addition to controller inputs, ghost files also contain kart positions in some form. In practice, this means that a ghost is always able to finish a race in the exact same way (give or take) as it was recorded. This can be verified by replacing the contents of a ghost file with those of another. When doing so, the ghost will ignore the collision of the track (e.g. drive through walls) and follow the layout of the course it was initially recorded on.

As a result, the Wiggler Glitch cannot occur in Mario Kart 8.

The relation to Mario Kart 7 ghosts is currently unknown.

Filename

Some ghost data information is encoded into the filename of the ghost file. The first two characters indicate what type of ghost data is inside the file. The remaining characters should be interpreted as hexadecimal values. The tables below assume the characters are parsed as hex values.

Timestamp

Several timestamps are encoded in a ghost's filename. These all have an identical format. These offsets are relative to the start of the timestamp data. Examples of timestamps are the total time and lap splits.

Timestamp
Offset Size Description
0 1 Minutes.
1 2 Seconds.
3 3 Milliseconds.
6 End of timestamp

A timestamp of 9:59:999 is thus represented as 9 3b 3e7 (spaces added for clarity).
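
The layout above can be sketched in a few lines of Python. This is a minimal illustration, not code taken from the game; the function names `parse_filename_timestamp` and `format_timestamp` are made up for this example.

```python
def parse_filename_timestamp(hex_digits: str) -> tuple[int, int, int]:
    """Decode a 6-character filename timestamp into (minutes, seconds, milliseconds)."""
    if len(hex_digits) != 6:
        raise ValueError("a filename timestamp is exactly 6 hex characters")
    minutes = int(hex_digits[0:1], 16)       # offset 0, size 1
    seconds = int(hex_digits[1:3], 16)       # offset 1, size 2
    milliseconds = int(hex_digits[3:6], 16)  # offset 3, size 3
    return minutes, seconds, milliseconds


def format_timestamp(minutes: int, seconds: int, milliseconds: int) -> str:
    """Render a timestamp in the m:ss:mmm notation used in this article."""
    return f"{minutes}:{seconds:02d}:{milliseconds:03d}"


print(format_timestamp(*parse_filename_timestamp("93b3e7")))  # 9:59:999
```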

Data

Ghost Data Filename
Offset Size Description
0 2 Ghost Type. "sg" for Staff Ghost, "gs" for Player Ghost (possibly a shorthand for "ghost save"), "dg" for Downloaded Ghost, "rp" for MKTV Replay.
2 2 Ghost Number. For staff ghosts and player ghosts this is simply the track ID minus 16.

For downloaded ghosts it is a value from 0x00-0x0f, representing one of the 16 available downloaded ghost slots. If the game detects multiple downloaded ghosts with the same ghost number (even if they're not for the same track), it will keep the ghost for the track with the lowest trackID and discard all others. This number also allows the game to differentiate between ghosts which have otherwise identical data in the filename.

4 2 Track ID. Each track has a unique value here. This does not correspond to the order in which the tracks appear in the game's cups. E.g. Mario Circuit has a value of 16.
6 2 Character ID. E.g. 04 for Yoshi.
8 2 Character Variant ID. Only used for Yoshi, Shy Guy, and Mii (standard and Amiibo suits)
10 2 Mii Weight Class. Only used for Mii to signify whether it is a light (0x00), medium (0x01), or heavy (0x02) character. Defaults to 0x00 for all other characters.
12 2 Kart ID. E.g. 09 for the Buggybud.
14 2 Wheels ID. E.g. 06 for the Button Wheels.
16 2 Glider ID. E.g. 01 for the Cloud Glider.
18 6 Finishing Time timestamp.
24 6 Lap 1 Split timestamp.
30 6 Lap 2 Split timestamp.
36 6 Lap 3 Split timestamp.
42 6 Lap 4 Split timestamp. If this lap split goes unused (e.g., for tracks that only have three laps), then a placeholder lap split of 9:59:999 (i.e., 93b3e7) is used. This was always the case prior to the introduction of GCN Baby Park in the second DLC pack in version 4 of the game.

It is unknown what happens to this data when modifying the lap count of a track to be higher than 3 prior to version 4 of the game.

48 6 Lap 5 Split timestamp. Same restriction as the Lap 4 split.
54 40 Player Name in UTF-16 (big Endian).
94 2 Flag ID. E.g. 5e for the Dutch flag.
96 2 Motion Control Flag. 01 if motion controls are used, 00 otherwise.
98 2/4 Unknown. Probably padding (always 0). For some player ghosts prior to version 4 of the game, the length of this unknown data was sometimes 2 instead of 4. The cause for this is unknown.
- The following values are not present in the filename prior to version 4 of the game. As of version 4, the game will still accept ghosts that omit this data. Conversely, earlier versions of the game reject ghosts that include this data. Manually removing this part of the filename will make earlier versions of the game accept the ghosts as well.
102 6 Lap 6 Split timestamp. 9:59:999 (i.e., 93b3e7) if lap split is unused.
108 6 Lap 7 Split timestamp. 9:59:999 if lap split is unused.
- End of version 4 data.
100/102 (v1-3), 114 (v4) End of filename; ".dat" extension here.
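
As a rough sketch, the table above could be turned into a filename parser as follows. The function name `parse_ghost_filename` and the `example` filename are hypothetical: the example is assembled by hand for illustration and is not a real ghost. The track ID value of 16 is assumed to be decimal (i.e. `10` in the filename).

```python
def parse_ghost_filename(filename: str) -> dict:
    """Split a ghost filename into the fields listed in the table above."""
    stem = filename[:-4] if filename.lower().endswith(".dat") else filename

    def hex_at(offset: int, size: int) -> int:
        return int(stem[offset:offset + size], 16)

    def timestamp_at(offset: int) -> tuple[int, int, int]:
        # 1 hex digit of minutes, 2 of seconds, 3 of milliseconds
        return hex_at(offset, 1), hex_at(offset + 1, 2), hex_at(offset + 3, 3)

    info = {
        "ghost_type": stem[0:2],
        "ghost_number": hex_at(2, 2),
        "track_id": hex_at(4, 2),
        "character_id": hex_at(6, 2),
        "character_variant_id": hex_at(8, 2),
        "mii_weight_class": hex_at(10, 2),
        "kart_id": hex_at(12, 2),
        "wheels_id": hex_at(14, 2),
        "glider_id": hex_at(16, 2),
        "finishing_time": timestamp_at(18),
        "lap_splits": [timestamp_at(24 + 6 * lap) for lap in range(5)],
        # 40 hex characters = 20 bytes = up to 10 UTF-16 (big-endian) code units
        "player_name": bytes.fromhex(stem[54:94]).decode("utf-16-be").rstrip("\x00"),
        "flag_id": hex_at(94, 2),
        "motion_controls": hex_at(96, 2) == 1,
    }
    if len(stem) >= 114:  # version 4 filenames append lap 6 and lap 7 splits
        info["lap_splits"] += [timestamp_at(102), timestamp_at(108)]
    return info


# Synthetic filename built only for illustration; it is not a real ghost.
example = (
    "sg" "00" "10" "04" "00" "00" "09" "06" "01"
    "12d1f4"                      # finishing time 1:45:500
    "023000" "023000" "0231f4"    # laps 1 to 3
    "93b3e7" "93b3e7"             # unused laps 4 and 5
    + "Ghost".encode("utf-16-be").hex().ljust(40, "0")
    + "5e" "00" "0000"            # flag, motion control flag, padding
    + "93b3e7" "93b3e7"           # unused laps 6 and 7 (version 4 only)
    + ".dat"
)
print(parse_ghost_filename(example))
```

Filenames from before version 4 are shorter than 114 characters, so the sketch simply skips the lap 6 and lap 7 fields for them.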

Effect In-Game

Changing these values by renaming the file affects how a ghost is displayed. Changing any of the values will result in the game displaying the modified information on the ghost summary (the additional info accessed by pressing "+" on the controller) that is shown before viewing the replay. Furthermore, changing the Character ID, Kart ID, Wheels ID, or Glider ID will result in the game using these during the ghost race itself, regardless of the values for these inside the file itself. Doing so can break a character's animations or SFX.

If the Ghost Number and Track ID do not correspond to the same track, then the game will be unable to recognise the file.

The game will only show one staff ghost for each track even if multiple are present among the game files. In such cases, the game will choose a ghost in the following order until a difference is found:

  • The lowest finishing time
  • The lowest character ID
  • Other unverified choices

It is not possible to add or remove ghosts through CEMU graphic packs. Behaviour on real hardware through plugins like SDCaffiine is not verified.

Player Ghost File Header

The following header is absent for Staff Ghosts. However, a Player or Downloaded Ghost can easily be converted into a Staff Ghost by removing this header and renaming the first two characters of the filename from "gs"/"dg" to "sg". This header is identical to the header of a save file. Ghosts downloaded through the Nintendo Clients package do not have this header.

Extra Ghost Header
Offset Size Description
0x00 4 "CTG0" in ASCII. Unknown abbreviation.
0x04 4 Game Version. 04 01 00 04 for the latest update.
0x08 4 File Size (in bytes).
0x0c 44 Probably Padding (always 0)
0x38 4 CRC-32 Checksum over all bytes after this header until the end of the file.
0x3c 12 Probably Padding (always 0)
0x48 End of extra ghost header
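
A minimal sketch of reading this header and checking its checksum is shown below. It assumes that multi-byte values are stored big-endian and that the checksum is the common CRC-32 variant implemented by Python's `zlib.crc32`; neither is confirmed by this article. The function name `read_extra_header` is hypothetical.

```python
import struct
import zlib

EXTRA_HEADER_SIZE = 0x48


def read_extra_header(data: bytes) -> dict:
    """Read the extra header of a player or downloaded ghost and check its CRC-32."""
    if data[0:4] != b"CTG0":
        raise ValueError("no CTG0 magic; staff ghosts do not carry this header")
    (file_size,) = struct.unpack_from(">I", data, 0x08)   # assumed big-endian
    (stored_crc,) = struct.unpack_from(">I", data, 0x38)  # assumed big-endian
    # The checksum covers everything after the 0x48-byte header.
    computed_crc = zlib.crc32(data[EXTRA_HEADER_SIZE:]) & 0xFFFFFFFF
    return {
        "game_version": data[0x04:0x08].hex(" "),
        "file_size": file_size,
        "stored_crc": stored_crc,
        "crc_ok": stored_crc == computed_crc,
    }
```

After editing anything behind the header, the value at 0x38 would need to be recomputed in the same way before the checksum matches again.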

Data

The remaining offset values will assume that the additional 0x48 bytes of header data are not present in the ghost file, so add an additional 0x48 to each of the offsets below when working with ghost data set or downloaded in-game.
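
The following sketch shows one way to normalise a file so that the offsets below apply directly, and how the staff ghost conversion described earlier could look. Both function names are hypothetical, and detecting the header by its "CTG0" magic is an assumption of this example.

```python
EXTRA_HEADER_SIZE = 0x48


def strip_extra_header(data: bytes) -> bytes:
    """Return the ghost body, dropping the 0x48-byte extra header if it is present."""
    if data[:4] == b"CTG0":
        return data[EXTRA_HEADER_SIZE:]
    return data


def to_staff_ghost(filename: str, data: bytes) -> tuple[str, bytes]:
    """Rename a "gs"/"dg" ghost to "sg" and remove its extra header."""
    if filename[:2] not in ("gs", "dg"):
        raise ValueError("expected a player (gs) or downloaded (dg) ghost")
    # The rest of the filename is left untouched in this sketch.
    return "sg" + filename[2:], strip_extra_header(data)
```

Note that the Ghost Number of a downloaded ghost holds a slot number rather than a track-derived value, so it may need adjusting separately; this sketch does not attempt that.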

Body Header
Offset Size Description
0x00 6 File magic. Always 00 00 04 00 03 A0.
0x06 2 Version Indicator. 00 01 for v1.0, 00 02 for v2.0, 00 03 for v3.0, 80 00 for v4.0 and v4.1.
0x08 4 Size of payload (length of everything after this 0x48-byte header).

Timestamp

When a new ghost is created, the timestamp from the user's local time is used and included in the ghost data.

Timestamp
Offset Size Description
0x0c 4 Year.
0x10 4 Month.
0x14 4 Day.
0x18 4 Day of Week, with 0x00 being Sunday.
0x1c 4 Hour.
0x20 4 Minute.
0x24 4 Second.

Driver Data

These values do not actually seem to be used. Instead, this data is read from the ghost's filename.

Driver information consists of 12 identical data blocks (one for each driver) of length 0x1c. The first driver slot represents the data of the recorded ghost. If the ghost was recorded while racing against another ghost, the data of this other ghost is stored in the second driver slot with a special CPU flag (see below).

For downloaded ghosts, the third and fourth slot will in such case be occupied by two "Marios" using the standard vehicle combination (Standard Kart, Standard Wheels, Super Glider) with a CPU-flag for unoccupied slots (see below). This could indicate that racing against multiple ghosts at the same time was a planned feature, just like in Mario Kart 7 where up to 7 ghosts can be raced against at the same time.

The 2nd through 4th slots are identical to the remaining 8 slots if the ghost was recorded without racing against another ghost. The 5th through 12th slots are always occupied with filler data (mostly FF-values) and the "unoccupied" CPU-flag.

In Mario Kart 8 Deluxe, the 2nd through 4th driver slots can sometimes also be filled with other valid combo data. The cause for this is currently unknown. So far, this has only been observed in a 200cc ghost for Big Blue.

Individual Driver Data
Offset Size Description
0x00 4 Vehicle Body ID.
0x04 4 Tires ID.
0x08 4 Glider ID.
0x0c 4 Character ID.
0x10 1 Character Variant ID.
0x11 1 Mii Weight Class.
0x12 2 Padding. Always 0.
0x14 4 CPU Flag. 0 for a human player, 1 for a CPU, 3 for the vs ghost in the second driver slot, 4 when the slot is unoccupied.
0x18 4 Team Flag. 0 for red team, 1 for blue team, 2 for a solo team (always the case in Time Trials or during other solo races).
0x1c End of driver data.

The offsets above are relative to the following offsets for each driver.

Driver Data
Offset Description
0x2c Player 1
0x48 Player 2
0x64 Player 3
0x80 Player 4
0x9c Player 5
0xb8 Player 6
0xd4 Player 7
0xf0 Player 8
0x10c Player 9
0x128 Player 10
0x144 Player 11
0x160 Player 12
0x17c End of driver data
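
The twelve driver blocks can be walked with a simple loop, sketched below. Big-endian storage is assumed, and `read_drivers` is a hypothetical name. `body` is the ghost data without the extra 0x48-byte header.

```python
import struct

DRIVER_BASE = 0x2C    # offset of Player 1
DRIVER_SIZE = 0x1C    # length of one driver block
DRIVER_COUNT = 12
CPU_FLAGS = {0: "human", 1: "cpu", 3: "vs ghost", 4: "unoccupied"}
TEAM_FLAGS = {0: "red", 1: "blue", 2: "solo"}


def read_drivers(body: bytes) -> list[dict]:
    """Read all 12 driver blocks from a ghost body."""
    drivers = []
    for slot in range(DRIVER_COUNT):
        offset = DRIVER_BASE + slot * DRIVER_SIZE
        # 4 x u32, 2 x u8, u16 padding, 2 x u32 = 0x1c bytes (assumed big-endian)
        (body_id, tires_id, glider_id, character_id,
         variant_id, weight_class, _padding,
         cpu_flag, team_flag) = struct.unpack_from(">4IBBH2I", body, offset)
        drivers.append({
            "slot": slot + 1,
            "vehicle_body_id": body_id,
            "tires_id": tires_id,
            "glider_id": glider_id,
            "character_id": character_id,
            "character_variant_id": variant_id,
            "mii_weight_class": weight_class,
            "cpu_flag": CPU_FLAGS.get(cpu_flag, f"unknown ({cpu_flag})"),
            "team_flag": TEAM_FLAGS.get(team_flag, f"unknown ({team_flag})"),
        })
    return drivers
```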

Race Settings

Race settings store flags or other identifiers of a race. These are sometimes used in the UI for MKTV Replays.

Race Settings
Offset Size Description
0x17c 4 Track ID.
0x180 4 Online Mode. 0 for local play. 1 for online play.
0x184 4 Game Mode. 0 for Grand Prix, 1 for Time Trial, 2 for VS Race, 3 for Battle Mode.
0x188 4 Unknown. Always 00 00 00 02.
0x18c 4 Unknown. Always 00 00 00 00 or 00 00 00 01.
0x190 4 CC Mode. 0 for 50cc, 1 for 100cc, 2 for 150cc, 3 for 200cc.
0x194 1 Mirror Flag. 0 for standard play, 1 for mirror mode.
0x195 1 Team Flag. 0 for solo play, 1 for team play. If set, drivers with a "Solo" team flag are shown as if they were on the red team.
0x196 1 Unknown. Always 03.
0x197 1 Unknown. Always 00.
0x198 16 Unknown.
0x1a8 4 Number of Drivers. Number is equal to the number of drivers with a CPU flag for "human", "cpu", or "vs ghost" (see Driver Data above).
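
The sketch below reads the documented race settings and cross-checks the Number of Drivers field against the CPU flags in the driver blocks. It assumes big-endian storage; `read_race_settings` is a hypothetical name, and `body` excludes the extra 0x48-byte header.

```python
import struct

GAME_MODES = {0: "Grand Prix", 1: "Time Trial", 2: "VS Race", 3: "Battle Mode"}
CC_MODES = {0: "50cc", 1: "100cc", 2: "150cc", 3: "200cc"}
COUNTED_CPU_FLAGS = {0, 1, 3}  # human, cpu, vs ghost


def read_race_settings(body: bytes) -> dict:
    """Read the known race settings fields and verify the driver count."""
    track_id, online, game_mode = struct.unpack_from(">3I", body, 0x17C)
    (cc_mode,) = struct.unpack_from(">I", body, 0x190)
    mirror, team = struct.unpack_from(">2B", body, 0x194)
    (driver_count,) = struct.unpack_from(">I", body, 0x1A8)

    # The CPU flag sits at offset 0x14 inside each of the 12 driver blocks.
    cpu_flags = [struct.unpack_from(">I", body, 0x2C + slot * 0x1C + 0x14)[0]
                 for slot in range(12)]
    counted = sum(1 for flag in cpu_flags if flag in COUNTED_CPU_FLAGS)

    return {
        "track_id": track_id,
        "online": online == 1,
        "game_mode": GAME_MODES.get(game_mode, f"unknown ({game_mode})"),
        "cc_mode": CC_MODES.get(cc_mode, f"unknown ({cc_mode})"),
        "mirror": mirror == 1,
        "team_play": team == 1,
        "driver_count": driver_count,
        "driver_count_matches": counted == driver_count,
    }
```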

Mii Info

In every single ghost file there is embedded Mii data so that the game can reconstruct the Mii in the case that a Mii was used in the run. Below is the start offset and size of the Mii data. If a Mii was not used during the race, this data will match the Mii of the user profile. Furthermore, this data is used to display the driver name when replaying a ghost.

Mii Info
Offset Size Description
0x244 92 Mii Data. See full description on 3dsbrew. Note that the embedded Mii name is in UTF-16 (little Endian).
0x2a0 2 Probably Padding.
0x2a2 2 CRC-16 XMODEM Checksum.
0x2a4 End of Mii data.
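
A CRC-16 XMODEM checksum (polynomial 0x1021, initial value 0) can be computed as sketched below. This article does not state which bytes the checksum covers or how it is stored; the sketch assumes it covers the 92 bytes of Mii data plus the 2 padding bytes and is stored big-endian. The function names are hypothetical.

```python
import struct

MII_OFFSET = 0x244
MII_CHECKSUM_OFFSET = 0x2A2


def crc16_xmodem(data: bytes) -> int:
    """CRC-16 XMODEM: polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def mii_checksum_ok(body: bytes) -> bool:
    """Compare the stored Mii checksum with one computed over 0x244..0x2a2."""
    covered = body[MII_OFFSET:MII_CHECKSUM_OFFSET]  # assumption: data + padding
    (stored,) = struct.unpack_from(">H", body, MII_CHECKSUM_OFFSET)  # assumed big-endian
    return crc16_xmodem(covered) == stored
```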

Location

This data is used to display the driver's flag at the end of a race when replaying a ghost.

Location
Offset Size Description
0x2a4 1 Country ID.
0x2a5 1 Sub-region ID.
0x2a6 2 Probably Padding (always 0).
0x2a8 1 Country ID repeat (purpose unknown).
0x2a9 1 Sub-region ID repeat (purpose unknown).

Image caption: The Dutch flag is missing from the 200cc staff ghost for Tour Merry Mountain.

In Mario Kart 8 Deluxe, the staff ghosts of Wave III of the Booster Course Pass have this data zeroed out, meaning that no country flags show when completing a race against them. Staff ghosts of Wave I and II do have this data properly set.


Player Info

Player Info
Offset Size Description
0x304 20 Mii Name in UTF-16 big Endian. Purpose Unknown.

Total Time & Lap Splits

This data is shown on the overview at the end of a ghost replay. If a track does not have a given lap, that lap's split is set to 9:59:999, just like in the ghost filename. The exceptions are the lap splits for laps 6 and 7, which are instead 0:00:000 if left unused.

Each timestamp has the same format:

Timestamp
Offset Size Description
0x00 2 Minutes.
0x02 1 Seconds.
0x03 1 Padding (always 0).
0x04 2 Milliseconds.
0x06 6 Probably Padding (always 0). Sometimes 08 00 00 00 00 00 for lap 6 split. Could be related to the ghost version in the Body Header.
0x0c End of timestamp.

The offsets above are relative to the following offsets:

Total Time & Lap Splits
Offset Size Description
0x330 0x0c Lap 1 Split.
0x33c 0x0c Lap 2 Split.
0x348 0x0c Lap 3 Split.
0x354 0x0c Lap 4 Split.
0x360 0x0c Lap 5 Split.
0x36c 0x0c Finishing Time.
0x378 0x0c Lap 6 Split.
0x384 0x0c Lap 7 Split.
0x390 End of timestamps.
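
Reading these timestamps could look like the sketch below. Big-endian storage is assumed and the function names are hypothetical. Both 9:59:999 and 0:00:000 are treated as "unused" here, since the former is used for laps 1 to 5 and the latter for laps 6 and 7.

```python
import struct

FINISHING_TIME_OFFSET = 0x36C
LAP_SPLIT_OFFSETS = [0x330, 0x33C, 0x348, 0x354, 0x360, 0x378, 0x384]  # laps 1 to 7
UNUSED_SPLITS = {(9, 59, 999), (0, 0, 0)}


def read_body_timestamp(body: bytes, offset: int) -> tuple[int, int, int]:
    """Read one 0x0c-byte timestamp as (minutes, seconds, milliseconds)."""
    # u16 minutes, u8 seconds, u8 padding, u16 milliseconds; the last 6 bytes are padding
    minutes, seconds, _padding, milliseconds = struct.unpack_from(">HBBH", body, offset)
    return minutes, seconds, milliseconds


def read_times(body: bytes) -> dict:
    """Return the finishing time and the lap splits that are actually in use."""
    splits = [read_body_timestamp(body, offset) for offset in LAP_SPLIT_OFFSETS]
    return {
        "finishing_time": read_body_timestamp(body, FINISHING_TIME_OFFSET),
        "lap_splits": [split for split in splits if split not in UNUSED_SPLITS],
    }
```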

Additionally, (staff) ghosts recorded in v4 of the game store additional 9:59:999 data, which might be placeholders for laps 8, 9, and 10. In earlier versions of the game this data was always 0. It is unknown if this data is properly written to if the lap count of an existing track is increased. Likewise, neither the behaviour of the UI nor that of the ghost filename is known if this is done.

Possible Lap Split Placeholders
Offset Size Description
0x390 0x0c Possibly Padding or another lap split placeholder. Always 0.
0x39c 0x0c Lap 8 Split Placeholder.
0x3a8 0x0c Lap 9 Split Placeholder.
0x3b4 0x0c Lap 10 Split Placeholder.
0x3c0 0x40 Possibly Padding and/or more lap split placeholders. Always 0.
0x400 End of possible lap split placeholders.

Replay Infos

This section contains all data needed to replay a ghost. It can be exchanged with that of another ghost file, making the game replay the ghost of the injected data (assuming all CRC-elements are fixed).

There are several Yaz0 sections at the end of the ghost file that represent the actual replay itself. In order to obtain this data, all Yaz0 sections need to be decompressed and concatenated. The decompressed and concatenated data of these Yaz0 blocks is henceforth referred to simply as "replay data". Each Yaz0 block contains a compressed portion of the replay data of length exactly 0x8000, with the exception of the last block, where it may be smaller.

All Yaz0 blocks are padded to be a multiple of 0x40. However, recompressing the replay data and not padding it will still allow the game to accept the ghost file and replay the ghost. At this point in time, there have not yet been fully successful attempts at recompressing and injecting the replay data back into the same ghost. Specifically, such a recompressed ghost properly replays the ghost until the last portion of the replay, after which the ghost will warp back in time until the replay ends. This likely indicates that some additional verification is located elsewhere.
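
Obtaining the replay data could be sketched as follows. The decompressor follows the commonly documented Yaz0 layout (a 16-byte header with the "Yaz0" magic and a big-endian decompressed size, followed by groups of a code byte and up to eight chunks); this layout is not described in this article and is stated here as an assumption. The function names are hypothetical, and the block offsets are expected to come from the replay header.

```python
import struct


def decompress_yaz0(data: bytes, offset: int = 0) -> bytes:
    """Decompress one Yaz0 block that starts at the given offset."""
    if data[offset:offset + 4] != b"Yaz0":
        raise ValueError("missing Yaz0 magic")
    (size,) = struct.unpack_from(">I", data, offset + 4)  # decompressed size
    pos = offset + 16                                     # skip the 16-byte header
    out = bytearray()
    while len(out) < size:
        code = data[pos]
        pos += 1
        for bit in range(7, -1, -1):      # bits are consumed from the MSB down
            if len(out) >= size:
                break
            if code & (1 << bit):         # 1: copy one literal byte
                out.append(data[pos])
                pos += 1
            else:                         # 0: back-reference into the output
                b1, b2 = data[pos], data[pos + 1]
                pos += 2
                distance = (((b1 & 0x0F) << 8) | b2) + 1
                count = b1 >> 4
                if count == 0:            # long form: length in a third byte
                    count = data[pos] + 0x12
                    pos += 1
                else:
                    count += 2
                for _ in range(count):    # byte-wise so overlapping copies work
                    out.append(out[-distance])
    return bytes(out[:size])


def read_replay_data(blob: bytes, block_offsets: list[int]) -> bytes:
    """Decompress every Yaz0 block and concatenate the results in order."""
    return b"".join(decompress_yaz0(blob, offset) for offset in block_offsets)
```

Because each block is located by its offset, the padding to a multiple of 0x40 between blocks does not need special handling in this sketch.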

Replay Header

The replay header stores information regarding the yaz0 blocks at the end of the file, and the section data within them when uncompressed.

Replay Header
Offset Size Description
0x400 4 CRC-32 Checksum over all bytes after this value until the first yaz0 section.
0x404 4 First Yaz0 Section Offset. Offset for the first yaz0 block, relative to the start of the replay infos.
0x408 4 Unknown. Always 00 04 ab b8 for time trial ghosts.
0x40c 4 Unknown. For time trial ghosts, always matches the value at 0x460.
0x410 4 Number of subsections in the replay data. Value depends on the number of wheels of the vehicle body. Will always be 15 (two wheels), 16 (three wheels) or 17 (four wheels).
0x414 4 Unknown.
0x418 8 Unknown. Always 00 00 00 8c 00 00 00 30 for time trial ghosts.
0x420 8 Unknown.
0x428 4 Number of subsections in the replay data (repeat). Always identical to the number of subsections in the replay data for time trial ghosts. Purpose unknown.
0x42c 0x2c Unknown.
0x458 4 Size of decompressed Data plus the offset of the first yaz0 block.
0x45c 4 Number of sections in the replay data.
0x460 4 Unknown. For time trial ghosts, always matches the value at 0x40c.
0x464 4 Unknown. Always 7f 7f ff ff for time trial ghosts.
0x468 4 Number of yaz0 blocks at the end of the file.
0x46c 0x20 Probably Padding.
0x48c End of replay header.

Replay Header

After the replay header, offsets to each replay data section (see below) and yaz0 block are located.

Replay Header
Offset Size Description
0x48c 4 First section offset in the replay data, relative to the start of the replay infos.
0x490 Offsets to all other sections in the replay data follow. Number of sections is indicated earlier at offset 0x45c.
- 4 First yaz0 block offset, relative to the start of the replay infos.
- Offsets to all other yaz0 blocks follow. Number of yaz0 blocks is indicated earlier at offset 0x468.
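
The counts in the replay header and the two offset tables that follow it could be read as sketched below. The sketch assumes big-endian storage, that every offset entry is 4 bytes long, and that the Yaz0 block offsets directly follow the section offsets. `read_replay_offsets` is a hypothetical name, and the offsets are returned as stored (relative to the start of the replay infos).

```python
import struct

SECTION_TABLE_OFFSET = 0x48C


def read_replay_offsets(body: bytes) -> dict:
    """Read the subsection/section/block counts and both offset tables."""
    (subsection_count,) = struct.unpack_from(">I", body, 0x410)
    (section_count,) = struct.unpack_from(">I", body, 0x45C)
    (block_count,) = struct.unpack_from(">I", body, 0x468)

    # One 4-byte offset per replay data section, starting at 0x48c.
    section_offsets = struct.unpack_from(f">{section_count}I", body, SECTION_TABLE_OFFSET)

    # Assumption: the Yaz0 block offsets start right after the section offsets.
    block_table_offset = SECTION_TABLE_OFFSET + 4 * section_count
    block_offsets = struct.unpack_from(f">{block_count}I", body, block_table_offset)

    return {
        "subsection_count": subsection_count,
        "section_offsets": list(section_offsets),
        "yaz0_block_offsets": list(block_offsets),
        "end_of_tables": block_table_offset + 4 * block_count,
    }
```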

Unknown Data

Some unknown data follows. When modifying the ghost and setting all this data to 0, the game still accepts the ghost, though the ghost no longer replays and instead warps below the track (likely to the coordinate origin). Coins and laps are also no longer properly counted.

Replay Data

Replay data consists of a number of sections. Each of these sections is identical in structure, and represents a portion of the replay for some number of frames (likely 64 frames). These sections all consist of an equal number of subsections (indicated at offset 0x410), though some of these subsections can be absent.

Replay Data Section
Offset Size Description
0x00 4 CRC-32 Checksum over the data after this checksum until the end of the section.
0x04 2 Size of the first subsection in this section, or 0xff if the subsection is absent.
0x06 Sizes of all other subsections in this section follow.
- - Data of the first subsection.
- Data of all other subsections in this section follows.
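
Splitting one section into its subsections could be sketched as follows. Several points are assumptions of this example rather than facts from the table: values are read big-endian, the checksum is the CRC-32 variant of `zlib.crc32`, the subsection data is stored back to back directly after the size table, and an absent subsection is recognised by the marker value passed in as `absent_marker` (0xff per the table above). `read_section` is a hypothetical name.

```python
import struct
import zlib


def read_section(section: bytes, subsection_count: int, absent_marker: int = 0xFF) -> dict:
    """Verify a replay data section and split it into its subsections."""
    (stored_crc,) = struct.unpack_from(">I", section, 0)
    crc_ok = (zlib.crc32(section[4:]) & 0xFFFFFFFF) == stored_crc

    # One 2-byte size per subsection, directly after the checksum.
    sizes = struct.unpack_from(f">{subsection_count}H", section, 4)

    pos = 4 + 2 * subsection_count
    subsections = []
    for size in sizes:
        if size == absent_marker:      # absent subsection: no data stored
            subsections.append(None)
            continue
        subsections.append(section[pos:pos + size])
        pos += size
    return {"crc_ok": crc_ok, "sizes": list(sizes), "subsections": subsections}
```

The `subsection_count` argument corresponds to the value stored at offset 0x410 of the replay header.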

Each of these subsections contains data for a specific part of the replay for some number of frames. Note that subsections at the same index in different sections can have different lengths. For example, the first subsection of the first section and the first subsection of the second section in the replay data may have different lengths. The cause for this is currently unknown.

Each subsection is responsible for the following parts of a replay. The subsection indices below assume a 4-wheel vehicle was used.

Replay Data Subsections
Subsection Index Description
0 Coin & Lap Count. Coin and Lap values can occur more than once in a single subsection. Coin count is located at offset -1 (i.e., relative to the end of the section). Lap count at offset -10 from the first coin value.
1 Unknown SFX. For example, includes the glider tunnel SFX (but not the glider SFX itself) in Big Blue
2 Kart Position & Drift/Mushroom item VFX & Directional Controller Inputs.
3 Driver rotation & Kart Scale & Glider Popup.
4 Unknown. Always absent entirely.
5 Front-Left Wheel Position & Rotation.
6 Front-Right Wheel Position & Rotation.
7 Back-Left Wheel Position & Rotation.
8 Back-Right Wheel Position & Rotation.
9 Item Slot UI & Various VFX. VFX includes things like items (e.g. the Super Horn), map objects (e.g. GBA Mario Circuit oil slicks), spin boosts, or bump/crash stars. VFX does not include kart drift, or the purple circle VFX when collecting coins on the F-Zero tracks.
10 Driver Rotation & Scale. Possibly includes animations as well.
11 Camera. Influences the behaviour of the camera.
12 Unknown.
13 Item State, like the shroom count. Does not control the UI.
14 Unknown.
15 Kart SFX. SFX includes coin collection, tricks, and glider activation.
16 Driver SFX.

Tools

The following tools can handle Ghost Data (DAT) files:

  • MK8Leaderboards: Fetches and formats data from the time trial leaderboards. Includes a Discord bot.
  • Poltergust: A (staff) ghost visualisation, extraction, and conversion tool.