RISC OS 5.18 and OS_Heap
Herbert zur Nedden (92) 37 posts |
Hi, I have a nice app that uses OS_Heap quite a bit and suddenly this one fails… works on RISC OS since ages with absolutely no problems at all… but not in 5.18. It crashes with “Not a heap block” in the call of OS_Heap 3, i.e. when a heap block is to be released. There seems to be some change in the OS_Heap code (I did see some notes on this in the cange log about OS_Heap 7) that might result in incompatibilities… perhaps due to word-alignment forced in the odd place. The code (all in BASIC) uses OS_Heap 0 to initialize the heap, OS_Heap 2 to claim space on the heap, if needed increasing the heap size with OS_Heap 5. Space not needed anymore is released with OS_Heap 3 and when a heap block is released the heap size is shrunk in chunks of pages with OS_Heap 5 (page size in use is read with OS_ReadMemMapInfo) if possible. The basic space for the heap is gotten by changing the WimpSlot of the app if needed. The freeing up of heap space with OS_Heap 5 is not the issue (I deactivated that part of the code just to test). As soon as I removed the OS_Heap 3 calls, i.e. do not free up heap space anymore the app works fine (but eats up memory for sure). So perhaps a pointer to some guide or such outlining what to do to make the app work again (or better a change in OS_Heap to allow the old app to work) or whatever would be appreciated. Or perhaps I discovered a bug or my code that worked for ages is buggy? |
Herbert zur Nedden (92) 37 posts |
FIX found! I amended the code for OS_Heap 2 in my app and now it works: The code is called with S% as the size of the block to allocate; Heap% points to the heap (it is aligned) OLD: OS_Heap 2,Heap%,,S% NEW: OS_Heap 2,Heap%,,((S% OR &3) +1) So basically I ensure the size of the block allocated is a multiple of four. The OR ensures that the value has the last two bits set so that by adding 1 I get a multiple of four. This code results in a bit waste since if the size S% is already a multiple of 4 still I add another 4 so I refined it : NEW+: WISH: |
Sprow (202) 1158 posts |
You could also do S% = (S% + 3) AND NOT 3 to round up to the nearest multiple of 4 without waste when it is already so. However, I can’t reproduce the fault you described with DIM block 10000 which also requests a non multiple of 4? |
Herbert zur Nedden (92) 37 posts |
Thanks – perhaps it works since you don’t use the blocks fully and later free them or the crash comes when several blocks are allocated that are not multiple of 4 and thus some are not aligned…. The “AND NOT 3” is nice indeed and saves the IF. |
Jeffrey Lee (213) 6048 posts |
I’m rather skeptical of your fix – OS_Heap already does round up block sizes to be a multiple of four. Are you sure your code isn’t writing past the end of a heap block? |
Herbert zur Nedden (92) 37 posts |
Thanks… I guess I will do some more checking then – amazing that this app that I use by now for 15 years at least never presented any issues. |
Martin Avison (27) 1494 posts |
Herbert: long-standing bugs can suddenly surface for any number of reasons! I have already found it invaluable for diagnosing Heap problems, and it runs happily on RO5.18. Reporter also contains many other facilities, and is available from http://www.avisoft.f9.co.uk |
Matthew Phillips (473) 721 posts |
I wonder whether the changes to OS_Heap are behind the software problems David Holden alludes to in this newsgroup posting: http://groups.google.com/group/comp.sys.acorn.misc/msg/66ed832b71b8a06b?hl=en Unfortunately David does not say which software packages are affected or how, and did not respond to someone who asked for more details. Perhaps he is investigating? From Herbert’s experiences with FilerMinus, maybe the software is making unwarranted assumptions about OS_Heap’s allocation method? (All very speculative.) |
Steve Revill (20) 1361 posts |
PRM 1, page 349 states:
If programs are breaking because the author ignored that advice, then that’s not RISC OS 5’s problem. And saying “well, this code works fine on RISC OS 6” is hardly a persuasive argument. |
Martin Avison (27) 1494 posts |
I am tying to work with David Holden to help him resolve some of the problems with EFpro. One was because it assumed heap blocks were in multiples of 8, another a program bug which has become more obvious after the OS_Heap changes. Supporting legacy software is not easy! |
Herbert zur Nedden (92) 37 posts |
As for OS_Heap look at the thread about Filer- (https://www.riscosopen.org/forum/forums/4/topics/978). This app failed on 5.18 since it kind of guessed where the next heap block will start and now that the block size is a multiple of 4 instead of 8 the computaton had a good chance to fail (what it did). Filer- is fixed BTW (see above thread). Perhaps some other apps run into similar issues too, i.e. assume block size being multiples of 8 bytes. |
Steve Drain (222) 1620 posts |
I have recently foundered on this same rock with Basalt. Why was the API changed in the first place? The gain in memory must be minimal. |
Martin Avison (27) 1494 posts |
I have to say that the problems I found with EFpro were long standing bugs. There was always a nasty overwrite of storage outside the block requested. The API change just made this more obvious! Were the Basalt problems really problems with the changed API? Or bugs? |
Jeffrey Lee (213) 6048 posts |
The change was made to allow OS_Heap 7 to be implemented cleanly:
Since all the interesting OS heaps (RMA, PCI, etc.) start at MB aligned addresses, this means that with the 8 byte block size restriction in place, it would be impossible to allocate a block of memory at anything greater than 4 byte alignment. By removing the 8 byte block size restriction, OS_Heap 7 is free to insert as much or as little padding as it needs in order to create a block that’s at the alignment the caller desires. Before OS_Heap 7 came along, the PCI module was manually generating aligned addresses by allocating (blocksize + max(alignment,boundary)) bytes and then returning an address within that block. This was typically resulting in several hundred KB of memory going to waste – not an astronomical amount, but I’m sure you’ll agree that it’s something that was better off fixed. |
Steve Drain (222) 1620 posts |
A shortcut relying on the 8-byte size. That is just an excuse, instead of saying it was a bug, because I probably programmed it incorrectly first and then recognised why it worked. Mea culpa. Anyway, all sorted now, I hope. |
Steve Drain (222) 1620 posts |
I follow your reasoning. Thanks. The reasoning of the original designer appears to have been to allow minimum sized blocks to be joined into the free space, and I guess that the 8-byte solution required least processing. |
Steve Drain (222) 1620 posts |
I have supplementary question about using OS_Heap. Basalt needs to change the size of a block to a new size. OS_Heap 4 requires an increase or decrease as a parameter, not the new size. I find the old size using OS_Heap 6, and subtract 4 bytes. This method seems to be in jeopardy. I could claim an extra word to store the size within the claimed memory, but this seems extravagant. Am I safe doing what I am doing? |
Jeffrey Lee (213) 6048 posts |
I’d say that what you’re doing is safe. I’m sure there’s plenty of software doing exactly the same thing, so it’s not something we’d change unless we had a very good reason. I’ve got no idea what Acorn were thinking when they came up with OS_Heap 6. It’s stupid for it to return the size of the usable region plus header, if they don’t allow programmers to assume the header is always going to be the same size. |
Jon Abbott (1421) 2651 posts |
This change also breaks The Cobalt Seed, although despite a week of debugging I’ve yet to figure out exactly how. It wasn’t overrunning the block, but I did spot code which may have been walking heap allocations. I gave up in the end and changed OS_Heap to align to 8 bytes to match RO3.x behaviour which fixed all the problems I was seeing. Whilst debugging the game, I noticed that OS_Heap was returning misaligned addresses, so I started dumping the whole heap both before and after the OS_Heap 2 call and tried verifying it with Reporter – although it appears you can only report on the RMA on RO5.23 – I had to move to RO3.71 before the heap address I gave it was used. REPORTHEAP was reporting “Length > Free” which didn’t mean much to me, other than the heap was corrupt in some way. Looking at the entry in question, it’s length was zero. |