This is the second in a series of posts on streamlining score sharing in beatmania IIDX. Using the data we found last time, we'll build an internal library to read score data from memory, find and hook a function to run our code on the result screen, and finally, hijack an import to get our library loaded automatically.
Before we get started, let me quickly emphasise the importance of testing. Indeed, haphazardly scanning memory is fine by me - so long as the results are reliable! After spending some time trying out various gameplay permutations, I found that some things didn't quite add up and needed adjusting.
Those gaps in the judgement structure were, unsurprisingly, values used for the P2 side. Interestingly, data is populated for both sides when playing in the Double play style. Just add them together to get the values shown on the result screen.
There are an additional 24 bytes after the current miss count in StageResultDrawFrame. These determine which cells in the 'best' and 'current' columns will appear and whether they should have a flashing background, indicating the best value.
Accounting for those new values, the first 56 bytes in StageResultDrawFrame all correspond to the left side of the result screen used for P1. The next 56 bytes repeat in the same order as the first, but for the P2 side.
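A rough sketch of that layout might look like the following. The field names are borrowed from the printf calls later in this post, but the 32-bit field widths, the field order, and the names frame_side_t, frame_sides_t, display_flags and active_side are assumptions made for illustration. Anything else the real class holds, such as a virtual table pointer, is ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-side block: eight 32-bit values followed by the 24 bytes
// of visibility / highlight flags described above (8 * 4 + 24 = 56 bytes).
struct frame_side_t {
    std::int32_t best_clear_type;
    std::int32_t current_clear_type;
    std::int32_t best_dj_level;
    std::int32_t current_dj_level;
    std::int32_t best_ex_score;
    std::int32_t current_ex_score;
    std::int32_t best_miss_count;
    std::int32_t current_miss_count;
    std::uint8_t display_flags[24]; // meaning of individual bytes not mapped here
};

// Two consecutive blocks: P1 first, then P2 in the same order.
struct frame_sides_t {
    frame_side_t p1;
    frame_side_t p2;
};

static_assert(sizeof(frame_side_t) == 56, "each side should span 56 bytes");
static_assert(offsetof(frame_sides_t, p2) == 56, "P2 data starts right after P1");

// Pick the block for the active side.
inline const frame_side_t& active_side(const frame_sides_t& sides, bool p1_active) {
    return p1_active ? sides.p1 : sides.p2;
}
```

The static_assert lines are the useful part: if a guess about a field width is wrong, the build fails instead of silently reading the wrong bytes.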
The active play side, style, difficulty and current music entry pointer were all close together, so I combined them into a single structure called state_t.
All the updated constants, structures, classes, and a few new utility functions can be found in these header files.
Without further ado, let's get started! I'll be trying to keep the code simple, albeit a little untidy, for expository purposes. Feel free to re-organise and split things up into separate classes and files as you see fit.
std::uintptr_t bm2dx_addr = 0; // Base address of the 'bm2dx.exe' module.
state_t* state = nullptr; // Structure containing various game state data.
judgement_t* judgement = nullptr; // Structure containing judgement data for both players.
const char* player_name = nullptr; // DJ name of the player, works for both sides.
void* get_pacemaker_data = nullptr; // A function for retrieving pacemaker data.
// Print score data out to the console. Just a placeholder for now.
void scorehook_dump(StageResultDrawFrame* frame = nullptr) {}
DWORD WINAPI scorehook_init(LPVOID dll_instance) {
// Create a console window for printing text.
AllocConsole();
FILE* console_out = nullptr; // Receives the reopened stream; required by freopen_s.
freopen_s(&console_out, "CONOUT$", "w", stdout);
// Get the base address of bm2dx.exe.
bm2dx_addr = (std::uintptr_t) GetModuleHandleA("bm2dx.exe");
// Cast various interesting areas of game memory.
state = (state_t*) (bm2dx_addr + game_state_addr);
judgement = (judgement_t*) (bm2dx_addr + judgement_addr);
player_name = (const char*) (bm2dx_addr + player_name_addr);
get_pacemaker_data = (void*) (bm2dx_addr + pacemaker_addr);
do {
if (GetAsyncKeyState(VK_F9))
scorehook_dump();
else if (GetAsyncKeyState(VK_F10))
break;
Sleep(100);
} while (true);
// Print some text, just so we know that something is happening.
printf("Detaching from process..\n");
// Free resources and detach from process.
FreeConsole();
FreeLibraryAndExitThread((HMODULE) dll_instance, EXIT_SUCCESS);
return EXIT_SUCCESS;
}
BOOL APIENTRY DllMain(HMODULE dll_instance, DWORD reason, LPVOID) {
if (reason == DLL_PROCESS_ATTACH)
CreateThread(NULL, NULL, scorehook_init, dll_instance, NULL, NULL);
return TRUE;
}
Here's my usual boilerplate code. When loaded into a process it creates a console window before entering an infinite input processing loop. We can invoke scorehook_dump by pressing F9, although this won't actually do anything just yet.
When the shutdown signal is received (in this case, when the F10 key is pressed) the library frees any resources and detaches from the process. It can then be changed, recompiled and loaded into the process again.
If you really want to see an empty console window, feel free to compile and load that into the game now. Pretty much any remote library loader should work just fine. I'm using Process Hacker 2 with the "Miscellaneous > Inject DLL" context menu option.
Now for something a bit more practical - let's start filling in the scorehook_dump function.
auto chart = bm2dx::get_chart();
auto judgement = bm2dx::get_judgement();
This is where the bm2dx_util.h helper functions come in handy.
get_chart uses the global state variable to check which side the player is on, the play style and chosen difficulty for that side, and finally, the current active music entry. Using these, it returns a chart_t containing the rating, BPM and note count.
get_judgement is a little simpler. It returns a judgement_player_t structure, which is just a cut down version of judgement_t containing values for a single player rather than both. For the Double play style, it combines both P1 and P2 values.
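To illustrate the idea, here is a minimal sketch of how the per-side values could be selected or combined. It is not the code from bm2dx_util.h: the judgement_pair_t container, the field widths, and the combine and select_judgement names are hypothetical, and the real judgement_t layout may well interleave the two sides differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-player view of the judgement data.
struct judgement_player_t {
    std::uint32_t pgreat;
    std::uint32_t great;
    std::uint32_t good;
    std::uint32_t bad;
    std::uint32_t poor;
    std::uint32_t combo_break;
    std::uint32_t fast;
    std::uint32_t slow;
};

// Hypothetical container holding one set of values per side.
struct judgement_pair_t {
    judgement_player_t p1;
    judgement_player_t p2;
};

// Add two sets of judgement values field by field.
inline judgement_player_t combine(const judgement_player_t& a, const judgement_player_t& b) {
    return judgement_player_t{
        a.pgreat + b.pgreat, a.great + b.great, a.good + b.good, a.bad + b.bad,
        a.poor + b.poor, a.combo_break + b.combo_break, a.fast + b.fast, a.slow + b.slow};
}

// Double play: sum both sides. Single play: return the active side only.
inline judgement_player_t select_judgement(const judgement_pair_t& j, bool p1_active, bool double_play) {
    if (double_play)
        return combine(j.p1, j.p2);
    return p1_active ? j.p1 : j.p2;
}
```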
printf("Player name: %s\n\n", player_name);
printf("Active play side: %s\n", state->p1_active ? "P1": "P2");
printf("Active play style: %s\n\n", state->play_style == 0 ? "Single": "Double");
printf("Current music: %s - %s\n", state->music->artist, state->music->title);
printf("Chart note count: %i\n", chart.notes);
printf("Chart rating: Lv. %i\n", chart.rating);
printf("Chart BPM: %i ~ %i\n\n", chart.bpm_min, chart.bpm_max);
printf("PGREAT: %i\n", judgement.pgreat);
printf("GREAT: %i\n", judgement.great);
printf("GOOD: %i\n", judgement.good);
printf("BAD: %i\n", judgement.bad);
printf("POOR: %i\n\n", judgement.poor);
printf("COMBO BREAK: %i\n\n", judgement.combo_break);
printf("FAST: %i\n", judgement.fast);
printf("SLOW: %i\n\n", judgement.slow);
Player name: AIXXE
Active play side: P1
Active play style: Single
Current music: dj TAKA - Liberation
Chart note count: 760
Chart rating: Lv. 6
Chart BPM: 0 ~ 150
PGREAT: 463
GREAT: 239
GOOD: 58
BAD: 0
POOR: 0
COMBO BREAK: 0
FAST: 250
SLOW: 47
That all looks about right. Nice and easy so far. Now to add in some data from StageResultDrawFrame.
Let's not dwell on finding the address of this class programmatically just yet - I'll cover that a little further down. For now I've hard-coded in an address using the "Find out what accesses this address" technique from the previous post.
frame = (StageResultDrawFrame*) 0x4939DC00;
auto frame_data = state->p1_active ? frame->p1: frame->p2;
printf("Best clear type: %s\n", CLEAR_TYPE[frame_data.best_clear_type]);
printf("Current clear type: %s\n\n", CLEAR_TYPE[frame_data.current_clear_type]);
printf("Best DJ level: %s\n", DJ_LEVEL[frame_data.best_dj_level]);
printf("Current DJ level: %s\n\n", DJ_LEVEL[frame_data.current_dj_level]);
printf("Best EX score: %i\n", frame_data.best_ex_score);
printf("Current EX score: %i\n\n", frame_data.current_ex_score);
printf("Best miss count: %i\n", frame_data.best_miss_count);
printf("Current miss count: %i\n\n", frame_data.current_miss_count);
Best clear type: clear_failed
Current clear type: clear_fullcombo
Best DJ level: level_s_c
Current DJ level: level_s_a
Best EX score: 820
Current EX score: 1165
Best miss count: 60
Current miss count: 0
Excellent. That just leaves the pacemaker data now. Remember what I said last time?
As luck would have it, getting to this data programmatically would be easy. I would only need to allocate enough memory to store all the above, meaning 264 bytes, then call sub_520A40 with a pointer to said memory.
That's still what we're going to do, but there's a slight complication.
int __usercall sub_512160@<eax>(_DWORD *a1@<eax>)
The address has changed in an update, making this sub_512160 now, but that's not it.
That __usercall calling convention is IDA's way of telling us that this function doesn't follow one of the standard calling conventions: we should place the a1 argument in the eax register before calling it. The @<eax> after the function name means that it will also return a value in the eax register.
It's a little different, but nothing a bit of inline assembly can't handle.
pacemaker_t pacemaker_data;
_asm {
lea eax, [pacemaker_data];
call [get_pacemaker_data];
}
printf("Pacemaker target: %i [%s]\n", pacemaker_data.score, pacemaker_data.name);
printf("Pacemaker type: %s\n", PACEMAKER_TYPE[pacemaker_data.type]);
Pacemaker target: 1216 [80%]
Pacemaker type: sg_pacemaker
That was pretty fast. Of course, there's still that one glaring issue: the hard-coded StageResultDrawFrame address.
Let's take another look at Class Informer for some inspiration.
If I had to guess, I'd bet that something in CStageResultScene is responsible for instantiating all those other stage result drawing classes. Perhaps we'll even find them, or something that could lead us to them somewhere inside.
That doesn't sound like an entirely unreasonable line of thought, so let's jump back to StageResultDrawFrame's virtual function table in IDA and press Ctrl+X on the first entry to bring up the cross-references list.
A single reference in sub_511010. Here's the cut-down pseudo-code of the interesting parts.
_DWORD *__stdcall sub_511010(_DWORD *a1)
{
// ...
*a1 = &StageResultDrawFrame::`vftable';
// ...
return a1;
}
Seems that the a1 argument passed to this function becomes an instance of StageResultDrawFrame.
int __thiscall sub_516930(void *this, int a2)
{
// ...
*(_DWORD *)a2 = &CStageResultScene::`vftable';
// ...
sub_511010((_DWORD *)(a2 + 1480));
// ...
}
Jumping up to where this is called takes us to sub_516930. The code here suggests we'd be able to find a StageResultDrawFrame by adding 1480 bytes to the address of a CStageResultScene. Let's confirm that.
For this, you can utilise the ancient art of "setting breakpoints on a bunch of virtual functions until one of them fires". This only took a few attempts in the CStageResultScene table before I was able to find sub_5B52B0.
Adjusted for the image base, this became bm2dx.exe+0x1B52B0.
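The conversion is a simple subtraction. The sketch below assumes IDA loaded the executable at an image base of 0x400000, which is inferred from the 5B52B0 to 1B52B0 adjustment above rather than stated anywhere; the function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed image base used by IDA for bm2dx.exe (inferred, not confirmed).
constexpr std::uintptr_t ida_image_base = 0x400000;

// Convert an address as shown in IDA to an offset from the module base.
constexpr std::uintptr_t to_module_offset(std::uintptr_t ida_address) {
    return ida_address - ida_image_base;
}

// Rebase an IDA address onto wherever the module actually got loaded.
constexpr std::uintptr_t to_runtime_address(std::uintptr_t module_base, std::uintptr_t ida_address) {
    return module_base + to_module_offset(ida_address);
}

static_assert(to_module_offset(0x5B52B0) == 0x1B52B0, "sub_5B52B0 becomes bm2dx.exe+0x1B52B0");
```

At runtime, module_base would be the value returned by GetModuleHandleA("bm2dx.exe"), as stored in bm2dx_addr earlier.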
If you look back at the earlier code, you'll see that the hard-coded address for StageResultDrawFrame was 0x4939DC00.
If we take that StageResultDrawFrame address, 4939DC00, then subtract the 4939D638 currently in the ecx register, we get 5C8, or 1480 in decimal. So yeah, that works and we could do that, but after looking around in ReClass we can do even better.
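For the record, the subtraction can be checked at compile time. The two addresses are the ones from this debugging session and will differ on every run; frame_from_scene is a hypothetical helper showing how the offset could be applied.

```cpp
#include <cassert>
#include <cstdint>

// Addresses observed in one debugging session only.
constexpr std::uintptr_t observed_frame = 0x4939DC00; // StageResultDrawFrame
constexpr std::uintptr_t observed_scene = 0x4939D638; // ecx, the CStageResultScene

// Offset of the frame inside the scene, matching `a2 + 1480` in sub_516930.
constexpr std::uintptr_t frame_offset = observed_frame - observed_scene;
static_assert(frame_offset == 0x5C8, "difference in hex");
static_assert(frame_offset == 1480, "same value in decimal");

// Hypothetical helper: derive the frame address from a scene address.
constexpr std::uintptr_t frame_from_scene(std::uintptr_t scene) {
    return scene + frame_offset;
}
```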
It turns out there's a class a bit further down from StageResultDrawFrame called StageResultDrawParts. It literally just contains pointers to all the other classes, including itself. We're reaching levels of convenience that shouldn't be possible.
class StageResultDrawBg;
class StageResultDrawGraph;
class StageResultDrawDjLevel;
class StageResultDrawRivalWindow;
class StageResultDrawMusicInfo;
class StageResultDrawInvalidFrame;
class StageResultDrawDeadPoint;
class StageResultDrawParts {
private:
virtual ~StageResultDrawParts() = 0;
public:
StageResultDrawBg* bg;
StageResultDrawGraph* graph;
StageResultDrawDjLevel* dj_level;
StageResultDrawRivalWindow* rival_window;
StageResultDrawMusicInfo* music_info;
StageResultDrawFrame* frame;
StageResultDrawInvalidFrame* invalid_frame;
StageResultDrawDeadPoint* dead_point;
StageResultDrawParts* parts;
};
We're only interested in StageResultDrawFrame at the moment but there's plenty more fun to be had with these classes.
For instance, you could extract all your rival names and scores from StageResultDrawRivalWindow, or read out the groove graph points from StageResultDrawGraph. I'll leave that up to you for now, though.
int __usercall sub_517330@<eax>(int a1@<eax>, int a2@<ecx>)
{
int result; // eax
switch ( a2 )
{
case 0: result = a1 + 144; break; // StageResultDrawBg
case 1: result = a1 + 420; break; // StageResultDrawGraph
case 2: result = a1 + 464; break; // StageResultDrawDjLevel
case 3: result = a1 + 476; break; // StageResultDrawRivalWindow
case 4: result = a1 + 1468; break; // StageResultDrawMusicInfo
case 5: result = a1 + 1480; break; // StageResultDrawFrame
case 6: result = a1 + 1652; break; // StageResultDrawInvalidFrame
case 7: result = a1 + 1660; break; // StageResultDrawDeadPoint
case 8: result = a1 + 1672; break; // StageResultDrawParts
default: result = 0; break;
}
return result;
}
Looking around a bit more, I found this further down in sub_516930. As you might've guessed from the number of cases in the switch, this function gets called nine times in order to populate the pointers in StageResultDrawParts.
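The behaviour of that loop can be modelled in portable C++ as below. This is only a sketch: the offsets come from the switch above, while part_offsets, get_part and populate_parts are illustrative names, and plain integers stand in for the real class pointers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Offsets from the start of CStageResultScene, in switch order (0 to 8).
constexpr std::array<std::uintptr_t, 9> part_offsets = {{
    144,  // StageResultDrawBg
    420,  // StageResultDrawGraph
    464,  // StageResultDrawDjLevel
    476,  // StageResultDrawRivalWindow
    1468, // StageResultDrawMusicInfo
    1480, // StageResultDrawFrame
    1652, // StageResultDrawInvalidFrame
    1660, // StageResultDrawDeadPoint
    1672  // StageResultDrawParts
}};

// Model of sub_517330: a1 is the scene address, a2 the part index.
inline std::uintptr_t get_part(std::uintptr_t scene, int index) {
    if (index < 0 || index >= static_cast<int>(part_offsets.size()))
        return 0; // mirrors the default case
    return scene + part_offsets[static_cast<std::size_t>(index)];
}

// Model of the calling loop: nine calls, one result stored per slot.
inline std::array<std::uintptr_t, 9> populate_parts(std::uintptr_t scene) {
    std::array<std::uintptr_t, 9> parts{};
    for (int i = 0; i < 9; ++i)
        parts[static_cast<std::size_t>(i)] = get_part(scene, i);
    return parts;
}
```

Feeding in the scene address from the earlier breakpoint, slot 5 comes out as the same StageResultDrawFrame address that was hard-coded before.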
I was initially thinking of hooking something from CStageResultScene and just padding out the class until it reached either StageResultDrawFrame or StageResultDrawParts, but this function has given me another idea.
loc_516A56: ; CODE XREF: sub_516930+136↓j
00516A56 8B C5 mov eax, ebp
00516A58 E8 D3 08 00 00 call sub_517330 ; Call Procedure
00516A5D 89 02 mov [edx], eax
00516A5F 41 inc ecx ; Increment by 1
00516A60 83 C2 04 add edx, 4 ; Add
00516A63 83 F9 09 cmp ecx, 9 ; Compare Two Operands
00516A66 7C EE jl short loc_516A56 ; Jump if Less (SF!=OF)
I reckon we could intercept the call instruction in the loop where this function gets called and redirect it to our code.
We would then have to call sub_517330 ourselves. Once again we're dealing with __usercall, so we'd have access to one of the class pointers in eax and the number from the switch in ecx, which can be used to identify the class we're dealing with.
Then, simply check if ecx matches 5, the number for StageResultDrawFrame, and call the scorehook_dump function. Finally, return to the game by jumping back to the mov instruction at 516A5D, immediately after where we placed the hook.
That sounds like it would work, so let's go ahead and give it a shot.
constexpr std::uintptr_t result_hook_addr = 0x116A58;
void* result_hook_original_fn = nullptr; // 517330
void* result_hook_return_addr = nullptr; // 516A5D (516A58 + 5)
We'll define these properly when we install the hook.
__declspec(naked) void scorehook_intercept() {
__asm {
// instructions go here..
}
}
The __declspec(naked) attribute prevents the compiler from generating an unnecessary prologue or epilogue for this function. We'll be writing the whole hook function using the inline assembler. Perhaps not strictly necessary, but certainly more fun.
call [result_hook_original_fn];
As mentioned earlier, calling this function puts the important data into the eax and ecx registers.
The only relevant class to us is StageResultDrawFrame, so let's make sure that's the one we're dealing with here.
cmp ecx, 5;
jne back;
Remember that 5 was the index for StageResultDrawFrame in the switch, so if the current value in ecx is not 5 then we should jump to the back label. It's not defined yet, but it'll allow us to return to the function we placed our hook in.
pushad;
Now that we're definitely dealing with the right class, we need to make sure we don't mess up the existing registers when we call scorehook_dump. We can do this by pushing all their values onto the stack with a single convenient instruction.
push eax;
call [scorehook_dump];
Before calling scorehook_dump, we push the StageResultDrawFrame* argument it expects from eax to the stack.
pop eax;
popad;
Almost done. Now just clean up that argument we pushed, then pop everything back into the correct registers.
back:
jmp [result_hook_return_addr];
Finally, we hand control back to the game by returning to the next instruction in the function we hooked.
I don't know about you, but I'm pretty excited to try this out. There's just one last thing we have to take into account since the call we're replacing is a relative one. The bytes that make up the instruction are E8 D3 08 00 00.
How is IDA turning that into 517330? Nothing too complicated, just take the address of the call, 516A58, add the length of the instruction, meaning 5 bytes, then add the relative displacement part, D3 08 00 00, which is stored in little-endian byte order, so 8D3.
516A58 + 5 + 8D3 = 517330
Well, that's the gist of it. For a proper explanation, check out the x86 instruction reference page for call over here.
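As a worked version of that arithmetic, here is a small sketch that resolves the target of a 5-byte E8 call from its raw bytes. It is not the helper from bm2dx_util.h; resolve_rel32_call is a hypothetical name, and the code assumes a little-endian host, which holds for x86.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Resolve the absolute target of a `call rel32` instruction (opcode E8).
//   insn      - pointer to the 5 instruction bytes
//   insn_addr - address the instruction is located at
inline std::uintptr_t resolve_rel32_call(const std::uint8_t* insn, std::uintptr_t insn_addr) {
    assert(insn[0] == 0xE8 && "expected a relative call");

    // The displacement is a signed 32-bit little-endian value after the opcode.
    std::int32_t displacement = 0;
    std::memcpy(&displacement, insn + 1, sizeof(displacement));

    // The displacement is relative to the address of the NEXT instruction.
    const std::uintptr_t next_insn = insn_addr + 5;
    return next_insn + static_cast<std::uintptr_t>(static_cast<std::intptr_t>(displacement));
}
```

Passing the bytes E8 D3 08 00 00 with an address of 0x516A58 gives 0x517330. Because the displacement is signed, the same function also handles calls that jump backwards.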
I've included a helper function called GetAbsoluteAddress in bm2dx_util.h that will handle this for us. But that's enough explaining, now we just need to get this thing hooked up! I'll be using MinHook for this example.
// This should go before the 'do while' loop in scorehook_init.
MH_Initialize();
const std::uintptr_t hook_address = (bm2dx_addr + result_hook_addr);
result_hook_original_fn = GetAbsoluteAddress(hook_address);
result_hook_return_addr = (void*) (hook_address + 5);
MH_CreateHook((void*) hook_address, &scorehook_intercept, NULL);
MH_EnableHook((void*) hook_address);
// This should go after the 'do while' loop in scorehook_init.
MH_Uninitialize();
We've essentially reached the end for this part now. Compile and load that into the game and score data will be printed to the console when you reach the result screen. No more manual key pressing required!
Okay.. there's actually one more thing to do. Last one for real this time. Our library may be functional now, but the process of getting it running is still way too manual for my liking. Does anyone really want to do this every time they start the game?
If you don't, I've got just the trick. It's commonly known as DLL hijacking. I'll only be giving a quick rundown of how it can be applied here since it's already well documented elsewhere. By taking advantage of the DLL search order in Windows, we can replace an imported library with a fake version that acts identically (well, identically enough) to the original.
Here you can see the game executable imports d3d9.dll in order to call Direct3DCreate9. Let's cross our fingers and hope that's all the game expects of that library. We'll quickly build a drop-in replacement that provides the same functionality, then place it in the same directory as bm2dx.exe. According to MSDN, this would give it a higher search priority than the real one.
If SafeDllSearchMode is enabled, the search order is as follows:
- The directory from which the application loaded.
- The system directory. Use the GetSystemDirectory function to get the path of this directory.
We'll assume that the real d3d9.dll can be found in the directory returned from GetSystemDirectory.
Our fake library only needs to do a few things for this to work: export Direct3DCreate9 with the same signature as the original, load the real d3d9.dll from the system directory, call the real Direct3DCreate9 and return the real result.
Anything else is up to us. We'll keep it simple and just load the scorehook library somewhere in-between.
#include <windows.h>
#include <filesystem>
class IDirect3D9* __stdcall Direct3DCreate9(UINT SDKVersion) {
// Ensure the function is exported without a decorated name.
#pragma comment(linker, "/EXPORT:" __FUNCTION__ "=" __FUNCDNAME__)
// Load our library first.
LoadLibrary(L"scorehook.dll");
// Assume we can find the real 'd3d9.dll' library in system32.
TCHAR system_dir[MAX_PATH];
GetSystemDirectory(system_dir, MAX_PATH);
// Locate the real 'Direct3DCreate9' function, call it and return the result.
const auto real_d3d9 = LoadLibrary(std::filesystem::path(system_dir).append("d3d9.dll").c_str());
const auto real_create = reinterpret_cast<decltype(&Direct3DCreate9)>(GetProcAddress(real_d3d9, "Direct3DCreate9"));
return real_create(SDKVersion);
}
Make sure that the final library has a single Direct3DCreate9 export, then just copy both libraries to the game\app directory and you'll never have to load manually ever again! But yeah, you might want to add some basic error checking first.
Admittedly, the scorehook library doesn't do that much right now, so it's probably not that useful to have it loading automatically yet. This section just doesn't quite fit in any of the upcoming parts but I didn't want to leave it out.
But with that, we're finished for real this time. There's still more to go in this series so look forward to the next instalment, whenever that may be. Until next time, Merry Christmas and a Happy New Year!