The STM parser had multiple incorrect assumptions leading to wrong dye color lookups:
- Header is 4 × u16 (magic, version, entry_count, unknown), not u32 + i32
- New format uses u32 keys/offsets with 254 dyes, not u16/128
- Format detection via bytes [0x0A..0x0B] (TexTools heuristic)
- Three sub-table encoding modes: Singleton, OneToOne, Indexed
- Indexed mode uses 1-based indices with 0xFF marker byte
- Expose entries as HashMap<u16, StmEntry> with typed accessors
- Add Dawntrail template ID mapping (>= 1000 → subtract 1000)
Based on TexTools xivModdingFramework STM.cs reference implementation.
Thanks for the investigation! Unfortunately I don't want to accept LLM contributions, but that is 100% not your fault: I did not disclose this anywhere in the project. The contributing guide has now been amended, sorry about that. Apart from personal reasons, this project is copyleft, but any generated output is, as of right now, pretty much uncopyrightable. I'm sure that eventually I or someone else will step up to figure out the new STM format though :)
That's fair, I fully understand that! So I made it a draft and added a disclaimer. I hope some of the findings here may be helpful! Thank you anyway!
closes: #35
Summary
Rewrites the STM parser (`src/stm.rs`) to correctly parse `stainingtemplate.stm` files, including the old format (Endwalker, u16 keys, 128 dyes) and the new format (Dawntrail, u32 keys, 254 dyes). The previous implementation had structural issues that prevented correct data extraction.
Investigation Resources
- `stainingtemplate.stm` files extracted from game data
- TexTools xivModdingFramework `STM.cs` reference implementation
- An experimental pattern file for `staintemplate.stm`: https://gist.github.com/AzurIce/4fb6d40b8e27f306b058542428d69dcf
STM Binary Format Summary
The findings documented below may serve as a reference for the format specification.
Header (8 bytes)
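- 0x00: `magic`, u16 (0x534D, "MS" in little-endian)
- 0x02: `version`, u16 (0x0101 for new format)
- 0x04: `entry_count`, u16
- 0x06: `unknown`, u16
Old vs New Format Detection
Check bytes at `[0x0A]` and `[0x0B]` (3rd and 4th bytes of the first key entry at offset 0x08) to tell the two formats apart, which in turn fixes the dye count:
- Old format: `num_dyes = 128`
- New format: `num_dyes = 254`
Current game files use the new format.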
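As a rough illustration of the header layout and the format check above, the following Rust sketch reads the four u16 header fields and probes bytes 0x0A and 0x0B. The names `StmHeader`, `parse_header`, `is_new_format`, and `num_dyes_for` are hypothetical and not taken from the PR. The sketch assumes little-endian fields, and it assumes that both probe bytes being zero indicates the new format (a small u32 key would have a zero upper half); that condition should be checked against `STM.cs` before relying on it.

```rust
/// Hypothetical view of the 8-byte header: four consecutive u16 fields.
#[derive(Debug, PartialEq)]
struct StmHeader {
    magic: u16,
    version: u16,
    entry_count: u16,
    unknown: u16,
}

/// Reads a little-endian u16, returning None if the slice is too short.
fn read_u16_le(data: &[u8], offset: usize) -> Option<u16> {
    let bytes = data.get(offset..offset + 2)?;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]))
}

/// Parses the header fields in the order magic, version, entry_count, unknown.
fn parse_header(data: &[u8]) -> Option<StmHeader> {
    Some(StmHeader {
        magic: read_u16_le(data, 0x00)?,
        version: read_u16_le(data, 0x02)?,
        entry_count: read_u16_le(data, 0x04)?,
        unknown: read_u16_le(data, 0x06)?,
    })
}

/// ASSUMPTION: bytes 0x0A and 0x0B both being zero means "new format".
/// With u32 keys these bytes are the upper half of the first key; with
/// u16 keys they would instead hold the second key.
fn is_new_format(data: &[u8]) -> Option<bool> {
    let probe = data.get(0x0A..0x0C)?;
    Some(probe[0] == 0 && probe[1] == 0)
}

/// Dye count per format: 254 for the new format, 128 for the old one.
fn num_dyes_for(new_format: bool) -> usize {
    if new_format { 254 } else { 128 }
}
```

Returning `Option` instead of panicking keeps truncated input from crashing the parser, which is one plausible way to satisfy a fuzz test on invalid data.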
Key/Offset Tables
Immediately after the header (at offset 0x08):
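- Keys: u16 × N (old format), u32 × N (new format)
- Offsets: u16 × N (old format), u32 × N (new format)
- `data_base = 8 + key_size × 2 × entry_count`
- Entry `i` starts at `data_base + offsets[i] × 2`
Entry Structure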
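To connect the table layout above with the entry layout described in this section, here is a minimal Rust sketch of the address arithmetic and of turning the five cumulative end offsets into byte ranges. All function names are hypothetical. The ranges are relative to the start of the entry's sub-table data, which this sketch assumes follows the five end offsets directly; the write-up does not state the base explicitly.

```rust
use std::ops::Range;

/// data_base = 8 + key_size * 2 * entry_count
/// (8-byte header, then one key table and one offset table of equal width).
fn data_base(key_size: usize, entry_count: usize) -> usize {
    8 + key_size * 2 * entry_count
}

/// Entry address: offsets are stored in half-word (2-byte) units.
fn entry_address(data_base: usize, offset: usize) -> usize {
    data_base + offset * 2
}

/// Reads the five little-endian u16 cumulative end offsets at `entry_addr`.
fn read_ends(data: &[u8], entry_addr: usize) -> Option<[u16; 5]> {
    let raw = data.get(entry_addr..entry_addr.checked_add(10)?)?;
    let mut ends = [0u16; 5];
    for (slot, chunk) in ends.iter_mut().zip(raw.chunks_exact(2)) {
        *slot = u16::from_le_bytes([chunk[0], chunk[1]]);
    }
    Some(ends)
}

/// Converts cumulative ends (half-word units) into byte ranges.
/// Returns None if the ends are not monotonically non-decreasing.
fn sub_table_ranges(ends: [u16; 5]) -> Option<Vec<Range<usize>>> {
    let mut start = 0usize;
    let mut ranges = Vec::with_capacity(5);
    for end in ends {
        let end = usize::from(end) * 2;
        if end < start {
            return None;
        }
        ranges.push(start..end);
        start = end;
    }
    Some(ranges)
}
```

For the 43 entries reported under Testing, `data_base(4, 43)` gives 352 with u32 keys and offsets.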
Each entry begins with 5 × u16 cumulative end offsets (in half-word units), referred to here as `ends[0]` through `ends[4]`.
Sub-table byte ranges are derived from cumulative differences:
- `[0, ends[0]×2)`: element type Half3 (3 × f16 = 6 bytes)
- `[ends[0]×2, ends[1]×2)`: Half3
- `[ends[1]×2, ends[2]×2)`: Half3
- `[ends[2]×2, ends[3]×2)`: Half1 (1 × f16 = 2 bytes)
- `[ends[3]×2, ends[4]×2)`: Half1
Sub-table Encoding Modes
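A minimal Rust sketch of the mode selection this section describes (Singleton, OneToOne, Indexed) follows. It is not the PR's `read_array<T>()`; `decode_sub_table` and `read_elem` are hypothetical names. To stay within the standard library it returns raw f16 bit patterns as `u16` rather than converted floats, with `N = 3` for Half3 and `N = 1` for Half1. For Indexed mode it assumes the palette comes first and the `num_dyes` one-byte indices follow it immediately, and it treats an empty sub-table as an error since that case is not covered here.

```rust
/// Reads one element of N little-endian half-floats as raw bit patterns.
fn read_elem<const N: usize>(bytes: &[u8]) -> [u16; N] {
    let mut out = [0u16; N];
    for (slot, half) in out.iter_mut().zip(bytes.chunks_exact(2)) {
        *slot = u16::from_le_bytes([half[0], half[1]]);
    }
    out
}

/// Expands one sub-table into exactly `num_dyes` elements.
fn decode_sub_table<const N: usize>(sub: &[u8], num_dyes: usize) -> Option<Vec<[u16; N]>> {
    let elem_size = N * 2;
    let array_size = sub.len() / elem_size;

    if array_size == 0 {
        // Not described in the write-up; rejected in this sketch.
        return None;
    }
    if array_size == 1 {
        // Singleton: one value replicated for every dye.
        return Some(vec![read_elem::<N>(&sub[..elem_size]); num_dyes]);
    }
    if array_size >= num_dyes {
        // OneToOne: one stored value per dye index.
        return Some(
            sub.chunks_exact(elem_size)
                .take(num_dyes)
                .map(read_elem::<N>)
                .collect(),
        );
    }

    // Indexed: P palette entries, then num_dyes index bytes (assumed order).
    let palette_len = sub.len().checked_sub(num_dyes)? / elem_size;
    let palette: Vec<[u16; N]> = sub
        .chunks_exact(elem_size)
        .take(palette_len)
        .map(read_elem::<N>)
        .collect();
    let index_start = palette_len * elem_size;
    let indices = sub.get(index_start..index_start + num_dyes)?;

    indices
        .iter()
        .map(|&index| match index {
            // 0 and 255 both select the default (zero) value.
            0 | 255 => Some([0u16; N]),
            // Other indices are 1-based; out-of-range ones fail the decode.
            i => palette.get(usize::from(i) - 1).copied(),
        })
        .collect()
}
```

Failing on an out-of-range palette index is a design choice of this sketch; a lenient parser could substitute the default value instead.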
Each sub-table uses one of three encoding modes, determined by `array_size = sub_size / sizeof(T)`:
1. Singleton (`array_size == 1`): A single value, replicated for all `num_dyes` entries.
2. OneToOne (`array_size >= num_dyes`): Direct storage of `num_dyes` values, one per dye index.
3. Indexed (`1 < array_size < num_dyes`): Compressed storage using a palette and index table, where `P = (sub_size - num_dyes) / sizeof(T)`.
Index interpretation:
- `0` or `255` → default value (zero)
- any other value → `palette[index - 1]`
Dawntrail Template ID Mapping
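The rule in this section (template IDs at or above 1000 have 1000 subtracted) is small enough to sketch directly. The function names below are hypothetical, and passing IDs below 1000 through unchanged is an assumption of this sketch. The lookup is generic over the value type so it does not depend on the PR's actual `StmEntry` definition.

```rust
use std::collections::HashMap;

/// Maps a ColorDyeTable template ID to an STM key.
/// ASSUMPTION: IDs below 1000 are already STM keys and pass through as-is.
fn stm_key(template_id: u16) -> u16 {
    if template_id >= 1000 {
        template_id - 1000
    } else {
        template_id
    }
}

/// Looks up an entry by template ID after applying the mapping above.
fn lookup_entry<V>(entries: &HashMap<u16, V>, template_id: u16) -> Option<&V> {
    entries.get(&stm_key(template_id))
}
```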
Dawntrail materials use `template_id >= 1000` in their ColorDyeTable (e.g. 1200, 1500).
These map to legacy STM keys by subtracting 1000: `stm_key = template_id - 1000`.
Changes
- `StainingTemplate::from_existing()` with correct binary parsing
- `read_array<T>()` generic method implementing all three sub-table encoding modes
- `StmEntry` struct with per-channel accessor methods
- `DyePack` struct and `get_dye_pack()` convenience method with Dawntrail ID mapping
- `read_half3_array()` and `read_half1_array()` for type conversion
Testing
- `stainingtemplate.stm` from game data: 43 entries parsed, each with 254 dye values
- `test_invalid` fuzz test passes