Well folks, it appears the other shoe has dropped with regard to
writing ARM subroutines on the 49G+. Thanks to the posting of Cyrille
de Brébisson in the thread entitled "hp49g+ : THE END OF THE QUEST !"
we now have a working example of a 49G+ program with an embedded ARM
subroutine. This subroutine handles the core calculations for
displaying the Mandelbrot set, and does it much faster than emulated
Saturn code.

This post is an attempt to provide a guide for others to do what
Cyrille has done, and as such I will be using his example to describe
the process. Note that you'll probably want to display this message
using a fixed-pitch font to preserve the formatting.

Here is Cyrille's program:

---------------------- START PROGRAM
%%HP: T(3)A(R)F(.);
<< "D9D20430F2CCD20212008FB9760808F70E00F14D29E4192195E4293195E20070A1E30080A1E10C60A3E2920400E4
4640A1E3930500E54650A1E500E480E1090E53EA00000AC5004440E2930300E4002780E3C530A1E3003880E100665
2E1FFFFFA186961D5E10066C3E86961C5E0F18DB8E86961D5E1006683E86961C5E0F18DB8E071341F00108340E000
80B60AE280B331FE9228319AAF0A7C151717FA6E55F1B0210034E922814413437FFFEFFFFAF731F481A60BAF23300
81BFAAF53128AC0B4406340010880BFF860D01564B421544A44580160B44AF23104A7107A6E56C161AF23166A7311
BA6E10B54931FF8018F2120096A6F8F2120096E6F8080310180B338FC77621361B021001448D34150B2130"
13 CHR "" SREPL DROP
10 CHR "" SREPL DROP
H-> EVAL >>
---------------------- END PROGRAM


THE SATURN PORTION
-------------------

The above program is actually a system RPL program with an embedded
CODE object written in Saturn and ARM assembly. The interesting
portion of this code is how the ARM portion is loaded and called
within the Saturn memory space. Disassembling the code reveals the
following:

    ...
    INTOFF
    GOSUB +000E4      // skip over the ARM code, leaving the starting
                      // address of the ARM code on the return stack

    [0xE0 nibbles (0x70 bytes) of ARM code embedded here]

offset_000E4:
    C=RSTK            // C.A = start of ARM code
    D0=C
    D1=#80100
    LC #000E0
    BUSCC 60          // equivalent to MOVEDOWN, this effectively moves
                      // the ARM code to address 0x80100.
    ...
    LC #80100
    BUSCC FF          // Call the ARM code!
    ....

There are three important things to note from this example:

1. By GOSUB'ing over the ARM section into another section of Saturn
   code and then using C=RSTK, the address of the ARM code is obtained
   without the need to calculate offsets manually.

2. The BUSCC 60 opcode (MOVEDOWN) is very important in that it places
   the ARM code at an address that is both unused and on an 8-nibble
   boundary. ARM code must reside on 4-byte boundaries in the ARM
   domain, so in the Saturn domain it must reside on 8-nibble
   boundaries.

3. The address #80100 was chosen as the destination for the ARM code
   because that address is almost always unused. Larger ARM
   subroutines will probably require allocating temporary memory large
   enough to accommodate both the ARM code and the padding needed to
   reach an 8-nibble boundary.


THE ARM PORTION
---------------

Disassembling the ARM portion of the code reveals the following:

        STMFD   SP!, {R4-R8,LR}     // Save off R4-R8,LR
        LDR     R2, [R1,#0x914]     // Saturn register B (nibs 0 thru 7)
        LDR     R3, [R1,#0x924]     // Saturn register D (nibs 0 thru 7)
        MOV     R7, R2
        MOV     R8, R3
        MOV     R6, #0x100

loc_18:
        MUL     R4, R2, R2
        MOV     R4, R4,ASR#12
        MUL     R5, R3, R3
        MOV     R5, R5,ASR#12
        ADD     LR, R4, R5
        CMP     LR, #0x4000
        BGT     loc_60
        SUB     R4, R4, R5
        MUL     R3, R2, R3
        ADD     R2, R7, R4
        MOV     R3, R3,ASR#11
        ADD     R3, R8, R3
        SUBS    R6, R6, #1
        BNE     loc_18
        LDRB    R6, [R1,#0x968]     // Saturn register ST
        BIC     R6, R6, #1
        STRB    R6, [R1,#0x968]     // ST bit 0 = 0
        LDMFD   SP!, {R4-R8,PC}     // Restore R4-R8, set PC to return addr.

loc_60:
        LDRB    R6, [R1,#0x968]     // Saturn register ST
        ORR     R6, R6, #1
        STRB    R6, [R1,#0x968]     // ST bit 0 = 1
        LDMFD   SP!, {R4-R8,PC}     // Restore R4-R8, set PC to return addr.

Important notes:

1.
   The ARM subroutine needs to save and restore the contents of
   certain registers if it modifies them. These registers are R0 and
   any register higher than R3.

2. When an ARM subroutine is called via BUSCC FF, the R0 register
   contains the Saturn PC + 5, which effectively points to the next
   Saturn instruction. You can alter Saturn program flow by altering
   this register. The R1 register contains the base address for all
   the ARM globals, which can be used to access all of the Saturn
   registers. Altering this register will not affect the calling code.
   The LR register contains the return address to get back into Saturn
   emulation.

3. Whichever ARM compiler is used should be set to produce
   position-independent code, if that isn't obvious already. It must
   also be set to produce little-endian code. It is likely possible to
   use the Thumb instruction set, but you're probably better off
   sticking with ARM. Note that the GCC ARM compiler should work fine
   for this purpose when used with an ARM9 architecture setting.

4. The ARM PC register must be set to the ARM LR register at the end
   of the subroutine. Not doing so will at the very least crash the
   calculator, and might cause all memory to be lost, or put the
   calculator into such a state that battery removal is the only way
   to correct it.

5. The HP48/49 binary file format (outside the calculator) combines
   two nibbles into one byte and swaps them, such that the Saturn
   opcode 81B2 is stored as 18 2B. The ARM code must be inserted into
   this file as is, with no nibble reversal. In the Saturn domain the
   nibbles are re-reversed automatically, but not in the ARM domain,
   so the ARM bytes must be loaded as is. Unfortunately this means
   that ARM code viewed via 49G memory display programs will appear
   nibble-reversed.
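
To make the arithmetic in the ARM listing and the nibble swapping in
note 5 a little more concrete, here is a small Python model of both.
It is only a sketch, not something that runs on the calculator. The
function names (s32, mandel_escapes, saturn_nibbles_to_file_bytes) are
made up for this illustration, and the reading of B and D as the real
and imaginary parts of the point in 12-bit fixed point is my
interpretation of the ASR#12 shifts and the 0x4000 threshold, not
something the source states outright.

```python
def s32(x):
    """Wrap an integer to a signed 32-bit value, like an ARM register."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def mandel_escapes(b, d, max_iter=0x100):
    """Model of the ARM loop. b and d stand in for the low 8 nibbles of
    Saturn registers B and D (assumed: c_re and c_im scaled by 4096).
    Returns True where the ARM code would set ST bit 0."""
    r2, r3 = s32(b), s32(d)          # LDR R2 / LDR R3
    r7, r8 = r2, r3                  # MOV R7,R2 / MOV R8,R3
    for _ in range(max_iter):        # R6 counts down from 0x100
        r4 = s32(r2 * r2) >> 12      # MUL + ASR#12 (x squared)
        r5 = s32(r3 * r3) >> 12      # MUL + ASR#12 (y squared)
        if s32(r4 + r5) > 0x4000:    # CMP LR,#0x4000 / BGT loc_60
            return True
        r4 = s32(r4 - r5)            # SUB R4,R4,R5
        r3 = s32(r2 * r3)            # MUL R3,R2,R3
        r2 = s32(r7 + r4)            # ADD R2,R7,R4
        r3 = s32(r8 + (r3 >> 11))    # ASR#11 doubles x*y, then ADD R8
    return False                     # loop ran out: ST bit 0 cleared


def saturn_nibbles_to_file_bytes(nibbles_hex):
    """Pack a string of Saturn nibbles the way note 5 describes:
    two nibbles per byte, swapped within each byte."""
    if len(nibbles_hex) % 2:
        raise ValueError("need an even number of nibbles")
    return bytes(int(nibbles_hex[i + 1] + nibbles_hex[i], 16)
                 for i in range(0, len(nibbles_hex), 2))
```

With this model, saturn_nibbles_to_file_bytes("81B2") gives the bytes
18 2B from note 5. Feeding it the nibbles 0F14D29E, which appear in
Cyrille's hex string, gives F0 41 2D E9, which read as a little-endian
word is E92D41F0, the encoding of the opening STMFD. That is a handy
sanity check that the ARM bytes in the file are in plain byte order.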


API INFORMATION
---------------

So far there is no real API information available, although
experimentally I've been able to determine the following offsets from
the global base register for accessing the emulated Saturn CPU
registers:

Base (R1) offset    Description
------------------------------------------------------------
0x90C               Saturn register A  (low order 8 nibbles)
0x910               Saturn register A  (high order 8 nibbles)
0x914               Saturn register B  (low order 8 nibbles)
0x918               Saturn register B  (high order 8 nibbles)
0x91C               Saturn register C  (low order 8 nibbles)
0x920               Saturn register C  (high order 8 nibbles)
0x924               Saturn register D  (low order 8 nibbles)
0x928               Saturn register D  (high order 8 nibbles)
0x92C               Saturn register R0 (low order 8 nibbles)
0x930               Saturn register R0 (high order 8 nibbles)
0x934               Saturn register R1 (low order 8 nibbles)
0x938               Saturn register R1 (high order 8 nibbles)
0x93C               Saturn register R2 (low order 8 nibbles)
0x940               Saturn register R2 (high order 8 nibbles)
0x944               Saturn register R3 (low order 8 nibbles)
0x948               Saturn register R3 (high order 8 nibbles)
0x94C               Saturn register R4 (low order 8 nibbles)
0x950               Saturn register R4 (high order 8 nibbles)
0x954               Saturn register d0
0x958               Saturn register d1
0x95C               Saturn register P
0x968               Saturn register ST
0x96C               Saturn register HST
0x970               Saturn CARRY flag
0x974               Saturn DECIMAL_MODE flag

Note: I have purposely not included information on calling any of the
internal ARM subroutines contained in the calculator's ROM image.
These subroutines are generally designed to work on data from the
Saturn domain, not ARM, so they're not really suitable for use in a
user-written ARM subroutine. Also, there is no calling (trap) table,
so they must be called using absolute addresses, which will change
with every ROM release. Using them is a good way to make sure your
program crashes in the next ROM release.


SUMMARY
-------

Hopefully this is enough information for users to start creating
their own ARM subroutines.
It seems that the best approach to take when writing these
subroutines is to limit their scope to core functions that will
benefit from the extra speed, and not to try to implement entire
applications in ARM code. Going further than that is unlikely to
provide a tangible benefit: the Saturn emulator inside the calculator
is extremely efficient, and trying to implement UI functions outside
of it will probably not pay off.

Note that all of the above information was obtained through analysis
of disassembled ARM code, the little hints of information scattered
throughout comp.sys.hp48, and Cyrille's Mandelbrot example. This
information is free to distribute, although I'd appreciate a mention
of where it came from. It is by no means guaranteed to be correct, so
use it at your own risk!

-Robert Hildinger