PS2: Create Custom Subroutines


  • PS2: Create Custom Subroutines

    Introduction

    WARNING: This is an advanced topic. Don't expect this guide to make any sense if you don't already have a fairly good grasp of MIPS assembly and/or programming and assembly language in general. If you've only just learned what hexadecimal is, you may read on, but don't contact us asking for a guide in layman's terms. It just isn't possible to make one.

    I'm going to try to make this guide so novice hackers can get nearly as much out of it as more advanced hackers. For that reason, some things may not make sense to some people and some people will find other things to be tedious review. I feel for both groups, but please bear with me anyway.

    Subroutine codes work by forcing the game to execute a small bit of machine code that you've written. They are sometimes also called "breakpoint codes," "branch codes," and a few other names. Unlike the usual "ASM" codes, or codes that modify the game's own machine code directly to produce an effect, subroutine codes are created almost entirely from scratch by the hacker and stored in memory external to the game's executable.
    Forcing execution of these external subroutines creates several problems that you must address in your code.

    • Registers used by your subroutine's code must be preserved.
    • If you make any calls to other subroutines in your code, registers used by them may need to be preserved by your code.
    • Your subroutine must return cleanly to the game. Ideally, the only noticeable changes in the game's behavior will be the effects the code was designed to create.
    • You must be able to force the game to call your subroutine.

    There are times when you can violate all the rules except the last one. Unfortunately there isn't any concrete method for making codes that call custom subroutines. Every program is different, but if you try to follow the rules above, you'll have less trouble in the long run.

    When To Jump
    That's the thorny issue. Do you really need a custom subroutine to produce the effect you want?

    When you make the decision, you need to be aware that codes containing subroutines are harder to make; they're harder to test; they're usually fairly long, making them prohibitive to the user; and the chance of user error is greatly increased once the code is released. Don't misunderstand. It's not that these codes are undesirable or a choice of last resort. It's simply that the vast majority of effects can be created with less than a handful of code lines by modifying the game's data or its op codes. On the other hand, I have one subroutine code that is over seventy lines long. I spent a week looking for a way to use the game's programming to accomplish the same effect and never found one.

    Those issues are things to keep in mind. The ultimate decision to jump to a custom subroutine is made by one factor: The number of code lines needed to produce the effect exceeds the number you can replace without negatively impacting the program. That's all there is to it. If you need seven lines of op codes to create an effect and the part of the program you need to put them in has only two lines that you can safely replace, you need a subroutine.

    ASM Codes
    Let's start with a simple example of regular ASM codes. The final product of this borders on a subroutine code; it replaces several parts of the game's code with custom instructions that make a loop behave differently. The modified code is part of a larger subroutine, but by looking at it, you can get a bit of an idea how to write custom code without being thrown into all the issues surrounding subroutines right away.



    The above is a screenshot from the game in question, "Disgaea: Hour of Darkness." What is shown is the starting inventory: three items named "Mint Gum". Three is hardly a treasure trove and you're allowed sixteen in the item bag, so let's start out by trying to fill all 16 slots.
    Here's the code that creates the initial inventory:



    The highlighted line is setting up the item digit to be sent to the function "Item_Generate." The label is mine, not the programmer's. (Don't ask me why it was moved to one register and then immediately shifted to another. I don't get it either.) Looking down toward the bottom of the image, you can see the operation "slti v0, s1, $0003," at address 0x001A3B88. The branch following it checks the results of the operation and branches to the top of the loop (address 0x001A3B40). Since three seems a bit paltry, let's try changing the comparison to "slti v0, s1, $0010" and see if all 16 slots get filled with a Mint Gum. The cheat code for that will be: 201A3B88 2A220010.
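    For anyone who wants to check the machine code by hand, here is a minimal Python sketch of how that cheat line can be derived. The helper names (encode_i_type, cheat_line_32) are made up for this illustration; the opcode and register numbers are the standard MIPS ones, and the "2" prefix is the 32-bit write type used throughout this guide.

```python
# Minimal sketch: build the cheat line for "slti v0, s1, $0010".
# Helper names are hypothetical; register numbers are standard MIPS.
REGS = {"zero": 0, "v0": 2, "s1": 17}

def encode_i_type(opcode, rs, rt, imm):
    """Pack a MIPS I-type instruction: opcode | rs | rt | 16-bit immediate."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def cheat_line_32(address, word):
    """Format a 32-bit constant write (the "2" code type used in this guide)."""
    return f"{0x20000000 | address:08X} {word:08X}"

SLTI = 0x0A  # opcode for "set on less than immediate"
word = encode_i_type(SLTI, REGS["s1"], REGS["v0"], 0x0010)
line = cheat_line_32(0x001A3B88, word)
print(line)  # 201A3B88 2A220010
```

    Changing the last argument of encode_i_type is all it takes to try a different loop limit.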



    Sure enough, we get 16 Mint Gums. They aren't very useful though. This game allows characters to have millions of hit points, so an item that restores 40 doesn't stay useful for long. Since we know what op tells the game to make Mint Gum, we could replace it with another item, but how useful is a code that gives you 16 of the same thing? So let's try to create a customized inventory.
    To do that, we obviously can't just replace ops. You could just increment the item digit, which would mean replacing about three instructions, but that's still not very "custom". So what we'll do in this example is allow the player to create an inventory list in memory that our new code will retrieve and use.

    To save on operations, we'll put the list in memory at 0x000F0000. That way the address can be loaded onto a register with just a load upper immediate (LUI) op. The item digits are 16 bits wide (halfword) so the code will need to store a series of halfwords starting at that address. The part of the program that loads 0x7D2 prior to the Item_Generate call will need to load those halfwords onto the appropriate register. We'll also need the code that modified the loop limit from our previous example.

    Let's start with the custom list codes:
    100F0000 00000???
    100F0002 00000???
    100F0004 00000???
    100F0006 00000???
    100F0008 00000???
    100F000A 00000???
    100F000C 00000???
    100F000E 00000???
    100F0010 00000???
    100F0012 00000???
    100F0014 00000???
    100F0016 00000???
    100F0018 00000???
    100F001A 00000???
    100F001C 00000???
    100F001E 00000???

    Pretty simple, right? I'll insert random item values for testing; it's not really important for this guide.
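    If you'd rather not type sixteen lines by hand, the list can be generated. This is only a sketch: item_list_codes is a hypothetical helper, and the test list simply repeats 0x7D2 (the value the game loads before the Item_Generate call) because real item digits aren't covered here.

```python
def item_list_codes(item_ids, base=0x000F0000):
    """Return one 16-bit write line (the "1" code type) per item digit."""
    lines = []
    for slot, item in enumerate(item_ids):
        address = base + slot * 2  # halfwords sit 2 bytes apart
        lines.append(f"{0x10000000 | address:08X} {item & 0xFFFF:08X}")
    return lines

# Placeholder list: sixteen copies of the digit the game already uses.
codes = item_list_codes([0x07D2] * 16)
print("\n".join(codes))
```

    Swap the placeholder list for whatever sixteen item digits you want in the bag.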

    Now let's get the address loaded onto a register so we can retrieve the values. But there's a problem here. We can't corrupt the data that the game is going to use. Putting new values on registers only works if you're replacing data with new data that serves the same function. Here, we'll be replacing god knows what with an address. Chances are the address we load isn't going to be a useful bit of data for the game and unless we're careful, the game will crash. We're going to have to look around and find registers that we can safely use. How do we do that? It's pretty simple really. The easiest way is to look near the end of the subroutine and see what registers are restored prior to the JR op. Here's the end of the subroutine we're working in:



    Quite a few registers are getting restored. That doesn't necessarily mean we can use them safely, but it does mean that using them won't result in corrupt data on the register once the subroutine returns control to its caller. Now we need to look through the code and make sure that, between the time we plan to use the registers and the time they're restored, they aren't being used by the game. It just so happens that several aren't, including s7 and we'll use that to store the address.

    Note: There are several ways of finding usable registers. The method above works, as does looking for memory loads (e.g. lw reg, off(reg)), immediate loads and register clears (e.g. daddu a0, zero, zero). As long as there's an instruction that replaces the value on the register prior to any instruction that might try to use the corrupted value you put on it, loading the register for your own purposes should be fine.

    Okay, we know where we can store the address to use in our load ops and now we need to find a place to put the instruction to load the address. It just happens that there's a NOP just above the block of instructions in figure 2. That's where we will place the LUI to set up the address. The operation is "lui s7, $000F" and the code will be: 201A3B3C 3C17000F

    Now the address is on s7, the loop will execute 16 times, filling the inventory, and the custom digits are stored in memory from 0xF0000 to 0xF001E. All that's left is to retrieve all the values. All that's required there is to replace the instruction that loads a2 or a0 with 0x7D2. I'll replace the instruction at 0x001A3B48 with "lh a2, $fffe(s7)." I'll explain in a moment why the index/offset from the address is -2 (fffe) rather than zero.

    The last thing that has to be done to get this code working is to increment the address so the same item isn't loaded each time. There just happens to be a NOP after the BNE instruction that we can replace with no worries. Register s7 will need to have 2 added to it after each item is generated. So that will be "addiu s7, s7, $2" and the code will be: 201A3B90 26F70002
    The final product looks like this (modified lines are highlighted):



    Notice the "beq zero, zero" op above the first highlighted line? That branches to the end of the loop and causes the address to immediately be incremented by two. That's why the index value off of s7 was -2 instead of 0. There are several ways this could have been dealt with. One is to start the list at 0xF0002. Another way would be to change the halfword load to load a1 instead and increment the address on the next line. Any of them will work and it just proves that there is more than one way to solve the same problem.
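    The off-by-two business is easier to see in a toy model. The Python below is not an emulator, just a sketch of the control flow described above: the loop is entered at its tail, the addiu in the bne's delay slot always runs, and the load uses s7 - 2. Where exactly the game bumps its counter (s1) inside the body is an assumption here, since only the comparison is quoted in the text.

```python
LIST_BASE = 0x000F0000

def run_patched_loop(memory, count=16):
    """Toy model of the patched loop; memory maps address -> halfword."""
    s7 = LIST_BASE          # lui s7, $000F
    s1 = 0                  # the game's loop counter
    loaded = []
    while True:             # "beq zero, zero" enters at the loop tail
        take_branch = s1 < count        # slti v0, s1, $0010 / bne to the top
        s7 += 2                         # delay slot: addiu s7, s7, 2 (always runs)
        if not take_branch:
            break
        loaded.append(memory[s7 - 2])   # lh a2, $fffe(s7)
        s1 += 1                         # assumed: counter bumped in the body
    return loaded

# Fake list contents so each slot is distinguishable.
memory = {LIST_BASE + 2 * i: 0x100 + i for i in range(16)}
items = run_patched_loop(memory)
```

    With the -2 offset the first load hits 0xF0000 even though s7 has already moved on to 0xF0002.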
    Now we have our final code:

    Disgaea: Hour of Darkness
    Start With Custom, Filled Item Bag
    100F0000 00000???
    100F0002 00000???
    100F0004 00000???
    100F0006 00000???
    100F0008 00000???
    100F000A 00000???
    100F000C 00000???
    100F000E 00000???
    100F0010 00000???
    100F0012 00000???
    100F0014 00000???
    100F0016 00000???
    100F0018 00000???
    100F001A 00000???
    100F001C 00000???
    100F001E 00000???
    201A3B3C 3C17000F
    201A3B48 86E6FFFE
    201A3B88 2A220010
    201A3B90 26F70002

    After a quick in-game test:



    Success! I duplicated item digits in the list a few times, which is why there's more than one of some things. (Don't ask about the item description. The game also has "Horse Weiner" as an accessory. The designers were frigging warped.)

    Simple Subroutine Codes - Slow Motion
    The previous example gave a brief demonstration of preserving registers. The game's programming had to be analyzed and a decision reached as to what register could be safely used in the cheat code. This is an important concept in creating custom subroutines that will be expanded upon in this example.
    Here are some of the rules to follow when making a custom subroutine cheat:
    Current Ongoing Projects :.
    Hacking Turbo Grafx 16 & CD Games and MSX

  • #2
    1. You must be able to force the game to call your subroutine.
    2. Your subroutine must return cleanly to the game. Ideally, the only noticeable changes in the game's behavior will be the effects the code was designed to create.
    3. Registers used by your subroutine's code must be preserved.
    Each of these will be covered by this example.
    A slow-motion code is a fairly generic custom subroutine. The actual code is very straightforward and there are a handful of guides with generic routines already floating around the internet. The basic premise is to create a loop that will slow down the game's execution just enough to be a useful cheat. We're going to create a slow-motion code for Final Fantasy X-2.

    For this code, following Rule 1 is pretty easy. The simplest place to call the routine from (often referred to as "hooking") is the scePadRead function in the game [1]. The scePadRead is part of a generic developer library that deals with the control pad. The easiest way to find it is to import labels in PS2Dis from a game that has its scePadRead labeled. There are other ways (link to joker/mcode guide(s)). The scePadRead in FFX-2 is located at 0x00321600.

    Now, the code definitely shouldn't interfere with the functions of the scePadRead. We only hook it here so that the loop will be executed frequently. Sticking the call in the wrong place might have some interesting effects on the pad. So we'll hook it at the return instruction, which is at 0x00321674. Here's what the scePadRead looks like, the hook will be the "jr ra" op near the bottom of the image:



    Now that we know where our call will be, we need to write the routine.
    The routine will be written in memory at 0x000C4000. The "jr ra" at the end of the scePadRead will need to be changed to jump to that address. There are a couple of ways this can be done. One is to use a "jal" op code. However "jal" is "jump and link" which means that the contents of register "ra" will be overwritten when it is executed. In this case, we don't want to destroy that value, so this code will use a plain "j" op code to 0x000C4000. The cheat code for that will be: 20321674 08031000
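    The jump encoding is worth seeing once, since you'll compute it for every hook you write. A quick Python sketch (encode_j is just a name picked for this example): the "j" opcode is 2, and the low 26 bits hold the target address divided by four.

```python
def encode_j(target):
    """Encode "j target": opcode 2 plus the target's word index (address >> 2)."""
    return (0x02 << 26) | ((target >> 2) & 0x03FFFFFF)

# 32-bit write of the jump over the "jr ra" at the end of scePadRead.
hook = f"{0x20000000 | 0x00321674:08X} {encode_j(0x000C4000):08X}"
print(hook)  # 20321674 08031000
```

    Because the target is stored as a word index, a subroutine address always has to be a multiple of four.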

    Now the delay loop must be written. The code for that is pretty simple.



    A high value is dropped onto register v0 and then decremented in a loop until it reaches zero. The li op in the branch delay slot (0x000C4010) accomplishes the decrement. The actual op is addiu v0, v0, $-1; the li is just an attempt to be helpful by PS2Dis.
    You may be wondering why the code in the screen print starts at 0x000C4008 instead of 0x000C4000. Patience.
    Now we need to follow Rule 2 and return cleanly to the game. Rule 2 can be broken down into a few general guidelines:
    1. If you jal into a subroutine, you should jr ra back out of it, and the return address (register ra) of the function you hooked had better be stored somewhere so it can be retrieved.
    2. If your subroutine will only be called from one place, it's often simpler to just j to it and then j back to the instruction after the branch delay for your hook. This saves you having to maintain the ra register.
    3. If your subroutine is sufficiently simple, you can merely use the return address of the hooked function and return to its caller. You have to be careful not to do this before the hooked function finishes its work.
    For this code, the third option will be used. That's why the hook replaced the jr ra op. Our little subroutine will be able to safely return to the scePadRead's return address.

    Now to add the final touch to the code. Notice that register v0 is used to create the delay. Whatever was on that register before is now gone. The scePadRead is called from a few places in FFX-2, so it's difficult to tell if any register can be safely used the way s7 was in the previous example. To be safe, we'll need to preserve register v0 before the main body of our code.
    Registers are generally preserved as a preamble to most subroutines. The most used method is to move the stack pointer (register sp) back a few bytes. That is, add a negative value to the stack pointer. The registers that need to be preserved are then stored at offsets from the stack pointer. You need to be aware of how much data you need to store and shift the stack pointer back a sufficient number of bytes. There's no harm in using a larger value than necessary to be safe. We've only got one register to store in this subroutine, so we'll just move the stack pointer back by 0x10, or 16 bytes (that is, add -0x10, which is 0xfff0 as a 16-bit immediate). The code for that is addiu sp, sp, $fff0; the cheat code for that will be 200C4000 27BDFFF0. To store v0 and preserve its contents we need only do a "sd v0, $0000(sp)" [2]; the cheat code will be 200C4004 FFA20000.

    That's only half of the task involved in preserving registers though. The second half is restoring the registers once the subroutine has finished its work. That includes restoring the stack pointer. All that's required for this code is to "ld v0, $0000(sp)" and restore v0's old contents. The cheat code for that is 200C4014 DFA20000. Finally, the stack pointer must be restored by adding positive 0x10 to it: "addiu sp, sp, $0010". The cheat code for that is 200C401C 27BD0010.

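    You may notice the single instruction gap between the restore of v0 and the restore for the stack pointer. This is there to allow room for the "jr ra" op that will return from the subroutine. The stack pointer restore then sits in the branch delay slot of the "jr ra", so it still executes before control returns to the game. The cheat code for the "jr ra" will be 200C4018 03E00008.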
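    To convince yourself that the save and restore really do cancel out, here's a rough Python model of the whole routine. It's a sketch, not an emulator: registers and the stack are plain dicts, the sp value in the example is arbitrary, and the branch is modeled as bgtz because that is how the machine word 1C40FFFF decodes. The delay_upper parameter is only there so a test run doesn't have to count down from 0x200000.

```python
def slow_motion(regs, stack, delay_upper=0x0020):
    """Toy model of the routine at 0x000C4000; regs and stack are plain dicts."""
    regs["sp"] -= 0x10                  # addiu sp, sp, $fff0
    stack[regs["sp"]] = regs["v0"]      # sd v0, $0000(sp)
    regs["v0"] = delay_upper << 16      # lui v0, $0020
    while True:
        taken = regs["v0"] > 0          # bgtz v0 (branches back to itself)
        regs["v0"] -= 1                 # delay slot: addiu v0, v0, -1
        if not taken:
            break
    regs["v0"] = stack[regs["sp"]]      # ld v0, $0000(sp)
    regs["sp"] += 0x10                  # addiu sp, sp, $0010 (jr ra delay slot)
    return regs

# Arbitrary starting values; a short delay keeps the demo quick.
regs = slow_motion({"v0": 0x1234, "sp": 0x01FFF000}, {}, delay_upper=0x0001)
print(regs)
```

    Whatever v0 and sp held on entry, they hold the same values on exit, which is the whole point of Rule 3.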
    The final result in assembly looks like this:



    The cheat code, including the "hook" will be:

    Final Fantasy X-2
    Slow Motion Code
    200C4000 27BDFFF0
    200C4004 FFA20000
    200C4008 3C020020
    200C400C 1C40FFFF
    200C4010 2442FFFF
    200C4014 DFA20000
    200C4018 03E00008
    200C401C 27BD0010
    20321674 08031000

    Wait though, slow motion is annoying and not the sort of thing you want active all the time. So let's make an on/off joker set up. We'll use L3 and R3 to deactivate and activate the code. All you have to do is joker the line with the hook in it (the last line) and provide two extra lines to "turn off" the hook by putting the game's old code back. So the real final code will be:

    Press R3 For Slow Motion, L3 Returns To Normal
    200C4000 27BDFFF0
    200C4004 FFA20000
    200C4008 3C020020
    200C400C 1C40FFFF
    200C4010 2442FFFF
    200C4014 DFA20000
    200C4018 03E00008
    200C401C 27BD0010
    D05B29C2 0000FFFB
    20321674 08031000
    D05B29C2 0000FFFD
    20321674 03E00008

    The lines with 20321674 are the on/off codes (03E00008 is off). The joker command for FFX-2 is D05B29C2 0000????. I'd provide a screenshot of the code in action, but it really wouldn't show anything. All pictures are in slow motion.
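    For the curious, the two joker values can be worked out rather than looked up. The sketch below assumes the commonly documented PS2 pad layout (Select, L3, R3 and Start in the low four bits, with a pressed button reading as 0); that layout is an assumption on my part and isn't spelled out anywhere above, though it does reproduce both values used in the code. The function names are hypothetical.

```python
# Assumed bit positions (commonly documented PS2 pad order, active-low).
BUTTON_BITS = {"SELECT": 0, "L3": 1, "R3": 2, "START": 3}

def joker_value(*buttons):
    """Return the 16-bit pad value seen while exactly these buttons are held."""
    mask = 0
    for name in buttons:
        mask |= 1 << BUTTON_BITS[name]
    return 0xFFFF & ~mask

def joker_line(*buttons, address=0x005B29C2):
    """Format a "D" joker line; the default address is the FFX-2 one from this guide."""
    return f"{0xD0000000 | address:08X} {joker_value(*buttons):08X}"

print(joker_line("R3"))  # D05B29C2 0000FFFB
print(joker_line("L3"))  # D05B29C2 0000FFFD
```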


    1 - Not all games have a scePadRead. If a game you're hacking doesn't have one, you'll need to find somewhere else to place the "hook".
    2 - The store op is dependent on the number of bits in use in the register. While most times storing a word will be sufficient, it doesn't hurt to play it safe. When in doubt, store quad (sq reg, off(reg)).
    General Note - This code probably could have been done without the need to preserve registers, however I wanted to do so for demonstration purposes.

    More Complex Subroutines And Code Optimizing
    Now we've got the basic rules of writing custom subroutines down, so let's expand on the concept a bit to create a more complex routine. This example will also include some tips on optimizing the code. In this case "optimization" refers more to refining the code into the fewest lines we can. Speed really isn't a consideration when you're writing such tiny amounts of machine code.
    This example will create a code for the game "Genso Suikoden III." What the code will do is give all characters an S rank in all their skills. Skills have an S rank when 0x8 is present in the memory that represents the rank, so there's the value we need. Now at first blush this might sound like something that could be accomplished simply with cheat device commands, but it really can't be.
    Each character has at most eight skills and ranks (all characters have space for eight, but they aren't always used).

    The way the skills are organized in memory looks like this, where "S" stands for skill and "R" stands for rank: |S|R|S|R|S|R| and so on. Each value is one byte. Therein lies the problem. You have to do 8-bit writes to set the ranks unless you want to overwrite the skills as well. There are 113 characters in the game, so that's about 8 * 113 = 904 lines of code to do it. Now on an AR MAX, you can do 8-bit slide codes and that would reduce it to 16, but that doesn't help people who don't have an AR MAX. And there's another problem: the AR MAX will write an 8 to all eight skills for all characters, even if the skill slot is unused. That may not be a problem, but it would be nice if it can be avoided.

    Luckily, each character occupies the same amount of memory and they're all adjacent to each other. That means all the subroutine has to do is load the address of the first character's first skill, make jumps to set all eight for that character and finally jump to the next character. There are 109 characters in the block, so the code can just use two nested loops to set the whole mess.

    Pseudo-code for this routine:
        LOAD INITIAL ADDRESS
        CHARACTER_COUNT = 0
        WHILE (CHARACTER_COUNT < 109)
            SKILL_COUNT = 0
            WHILE (SKILL_COUNT < 8)
                IF SKILL IS ZERO
                    JUMP TO NEXT SKILL
                ELSE
                    SET SKILL RANK TO 'S' (8)
                END_IF
                INCREMENT SKILL_COUNT
            END_WHILE
            CHARACTER_COUNT = CHARACTER_COUNT + 1
            JUMP TO NEXT CHARACTER
        END_WHILE
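    Here's the same logic as runnable Python working on a bytearray, in case the pseudo-code is too abstract. Treat it as a sketch: promote_all and the constant names are invented for this example, and the record stride of 0x8C is inferred from the finished routine's increments (eight 2-byte skill/rank pairs plus a 0x7C skip), not from any documentation of the game's data.

```python
SKILLS_PER_CHARACTER = 8
RECORD_STRIDE = 0x8C  # inferred: 16 bytes of skill/rank pairs + a 0x7C skip
S_RANK = 8

def promote_all(memory, first_skill, characters=109):
    """Set the rank byte to S for every skill slot that is actually in use."""
    address = first_skill
    for _ in range(characters):
        for _ in range(SKILLS_PER_CHARACTER):
            if memory[address] != 0:          # skill byte; zero means unused
                memory[address + 1] = S_RANK  # rank byte follows the skill
            address += 2                      # next |S|R| pair
        address += RECORD_STRIDE - 2 * SKILLS_PER_CHARACTER  # next character
    return memory

# Two fake characters: one used slot each, everything else empty.
memory = bytearray(2 * RECORD_STRIDE)
memory[0], memory[1] = 5, 1
memory[RECORD_STRIDE], memory[RECORD_STRIDE + 1] = 3, 2
promote_all(memory, 0, characters=2)
```

    Unused slots keep a rank of zero, which is the behavior the AR MAX slide code can't give you.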
    Now we have to decide where to hook it. After that, we have to choose what registers to use and then the subroutine can be written. Since this deals with skills, the normal routine for increasing them seems like a good place to hook it. The jump should be near the bottom of the routine, for safety and other reasons that will be explained shortly. Here's the end of the skill increase routine (it's rather large):



    Several registers are being restored from the stack at the end of this routine. That's perfect for us, because you can save instructions in your subroutine by putting the hook in the right spot and using as many registers as possible that are already saved on the stack. The ideal place to put the hook is before any registers other than ra are restored. Unfortunately, we can't do that here. See the labels "__016c7fb4", "__016c7fb8" and "__016c7fbc"? Each of those indicates that some branch instruction or other refers to that address, so it's tough to guarantee execution of instructions above it. We need to guarantee execution of our subroutine, so the hook should be on or after the last branch target (label). So we'll put our hook at 0x016C7FBC. (The label "__016c7fc4" is mine because my version of the subroutine references it. It's nothing to worry about.)

    Putting our hook there means that the "ld s4, $0040(sp)" op will be replaced. So somewhere in the subroutine we'll need to execute it on our own. The "ld s3, $0030(sp)" will be executed as it will be in the branch delay slot of our jump. That means it will be restored, so if we use it, we'll need to restore it again. If we can avoid using it, then all will be well, so we'll take care to avoid using it. That leaves s2, s1 and s0 free for use and they need not be preserved by us because they'll be restored on returning from our routine.

    We could just as well use whatever registers we want and preserve them ourselves, but there's a good reason for avoiding the need to preserve registers. Each register requires a store and a load op to preserve. In even a very simple subroutine, you can easily add a dozen or more lines to the code just to save registers. In a subroutine that uses five registers, finding a way to avoid preserving four of them in your routine will save eight lines. If you can find a way to avoid preserving all five, then the code can be twelve lines shorter. You gain two more because you no longer need to maintain the stack pointer.
    Now we know that we can use s0, s1 and s2 freely. We might as well use s4 since we'll have to restore it anyway. So let's try to write the subroutine using only those four registers.

    We'll use s0 for the address. s1 and s2 will be the character and skill counters respectively and s4 will be multi-purpose. Here's the assembly for the pseudo-code:



    The last two operations are the replacement for the s4 restore and the return to the game routine. The return address is for the instruction immediately following the branch delay for the hook. We already know the address for the hook, so we just need to make it a j $000C4028 at that address. Making the addresses and machine code for the routine and hook into cheat codes gives us the final result:

    Genso Suikoden III
    One Lesson Promotes All Skills For All Characters To S Rank
    200C4028 3C100196
    200C402C 3610E79C
    200C4030 0000882D
    200C4034 0000902D
    200C4038 82140000
    200C403C 12800002
    200C4040 24140008
    200C4044 A2140001
    200C4048 26520001
    200C404C 2A540008
    200C4050 1680FFF9
    200C4054 26100002
    200C4058 26310001
    200C405C 2A34006D
    200C4060 1680FFF4
    200C4064 2610007C
    200C4074 DFB40040
    200C4078 085B1FF1
    216C7FBC 0803100A

    So our final result is nineteen lines long. Not too shabby. If we had been forced to preserve the registers we used, the final line count would have been twenty-nine, which is rather high. The AR MAX equivalent would be sixteen lines long and seventeen counting the verifier code. And the AR MAX won't be able to check whether a skill is actually present. That takes up two lines in this code, so the code-line count is really pretty good.
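    If you want to read the routine back as assembly without opening PS2Dis, a tiny decoder will do. The Python below is a sketch that only understands the handful of opcodes appearing in this code, names only the registers this code touches, and prints anything else as a raw word. The function name and output format are my own and don't match PS2Dis exactly.

```python
REG = {0: "zero", 2: "v0", 16: "s0", 17: "s1", 18: "s2", 19: "s3",
       20: "s4", 29: "sp", 31: "ra"}
ARITH = {0x09: "addiu", 0x0A: "slti", 0x0D: "ori"}
MEMORY = {0x20: "lb", 0x28: "sb", 0x37: "ld"}
BRANCH = {0x04: "beq", 0x05: "bne"}

def reg(n):
    return REG.get(n, f"${n}")

def disassemble(address, word):
    """Decode one instruction; covers only the ops used in this routine."""
    op, rs, rt, rd = word >> 26, (word >> 21) & 31, (word >> 16) & 31, (word >> 11) & 31
    imm = word & 0xFFFF
    if op == 0 and (word & 0x3F) == 0x2D:
        return f"daddu {reg(rd)}, {reg(rs)}, {reg(rt)}"
    if op == 0x02:
        target = ((address + 4) & 0xF0000000) | ((word & 0x03FFFFFF) << 2)
        return f"j ${target:08x}"
    if op == 0x0F:
        return f"lui {reg(rt)}, ${imm:04x}"
    if op in ARITH:
        return f"{ARITH[op]} {reg(rt)}, {reg(rs)}, ${imm:04x}"
    if op in MEMORY:
        return f"{MEMORY[op]} {reg(rt)}, ${imm:04x}({reg(rs)})"
    if op in BRANCH:
        offset = imm - 0x10000 if imm & 0x8000 else imm   # sign-extend
        return f"{BRANCH[op]} {reg(rs)}, {reg(rt)}, ${address + 4 + (offset << 2):08x}"
    return f".word ${word:08x}"

CODES = """
200C4028 3C100196
200C402C 3610E79C
200C4030 0000882D
200C4034 0000902D
200C4038 82140000
200C403C 12800002
200C4040 24140008
200C4044 A2140001
200C4048 26520001
200C404C 2A540008
200C4050 1680FFF9
200C4054 26100002
200C4058 26310001
200C405C 2A34006D
200C4060 1680FFF4
200C4064 2610007C
200C4074 DFB40040
200C4078 085B1FF1
216C7FBC 0803100A
"""

listing = []
for line in CODES.strip().splitlines():
    code, value = line.split()
    address = int(code, 16) & 0x0FFFFFFF  # drop the "2" code-type nibble
    listing.append(f"{address:08x}  {disassemble(address, int(value, 16))}")
print("\n".join(listing))
```

    Branch targets are printed as absolute addresses, which makes it easy to see the inner loop returning to 0x000C4038 and the outer loop to 0x000C4034.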

    General Note - There's a happy coincidence with the value for S rank being the same as the skill count. An extra code line could have been removed by testing for equality between the skill count and s4 instead of using the slti s4, s2, $0008 op.



    • #3
      *drools*
      I may be lazy, but I can...zzzZZZzzzZZZzzzZZZ...
