Ticket #547 (WontFix)
Sun Mar 20 12:56:20 UTC 2022
objasm 4.12: Merged literal pool values consume excess space
Reported by: Jeffrey Lee (213) | Severity: Normal
Part: RISC OS: C/C++ toolchain | Release:
Milestone: | Status: WontFix
Details by Jeffrey Lee (213):
If multiple instructions use “LDR reg, =label” to reference the same label, then the assembler should try to only generate one literal pool entry containing the address of that label. Although this logic appears to work, the problem is that each instruction still reserves 4 bytes of space in the literal pool, resulting in large literal pools which only contain a handful of values.
I spotted this when using “LDR reg, =label” to reference a label in another AREA (within the same source file); I haven’t checked if there are other cases which trigger the bug.
Changelog:
Modified by Jeffrey Lee (213) Sun, March 20 2022 - 12:59:55 GMT
- Attachment added: literal
Sample assembler file. “cc literal.s” will compile it, and then if you disassemble the binary you’ll see several words of wasted space between the end of the used portion of the literal pool and the UND instruction that was placed after the LTORG in the source. Adding/removing LDR =’s will add/remove an equivalent number of words of waste.
Modified by Ben Avison (25) Fri, May 06 2022 - 10:56:31 GMT
The reason this happens is that the algorithm to de-duplicate the literals relies on all symbols in the expression being defined in pass 1 of the assembly. If they’re not defined, it can’t tell whether the LDR can be replaced by a MOV, or whether one or more relocations need to be applied, so it errs on the side of caution and allocates 4 bytes of the next literal pool, just in case.
In pass 2, the symbols must be valid, or it’s an invalid source file. Objasm will still try to re-use memory locations in the literal pool if there are any duplicates, including for those expressions that included a forward reference, with the aim of reducing cache usage. However, it can’t deallocate the additional bytes in the literal pool at that stage, because the locations of all subsequent labels were already fixed in pass 1. The redundant bytes of literal pool just get filled with zeros.
In your specific example, you can shrink the literal pool by moving the definition of the data area above the code area, which converts all the symbols in the expressions into backward references. Can you adapt this approach to the situation in which you originally encountered this issue?
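The effect of forward versus backward references described above can be illustrated with a toy model. This is only a sketch of the behaviour as explained in this ticket, not objasm's actual implementation: the function name literal_pool_sizes and the ("label", name) / ("ldr", name) statement tuples are hypothetical, a single literal pool is assumed to be flushed at the end of the statement list, and the LDR-to-MOV replacement and relocation handling are ignored.

```python
def literal_pool_sizes(statements):
    """Return (reserved_words, used_words) for one literal pool.

    `statements` is a list of ("label", name) and ("ldr", name) tuples in
    source order, where ("ldr", name) stands for "LDR reg, =name".
    Hypothetical model: the pool is flushed after the last statement.
    """
    defined = set()   # symbols already defined when pass 1 reaches an LDR
    pooled = set()    # backward-referenced symbols that already own a slot
    reserved = 0      # words committed in pass 1; cannot shrink afterwards

    # Pass 1: the pool layout is decided here.
    for kind, name in statements:
        if kind == "label":
            defined.add(name)
        elif kind == "ldr":
            if name in defined:
                # Backward reference: duplicates can be merged immediately.
                if name not in pooled:
                    pooled.add(name)
                    reserved += 1
            else:
                # Forward reference: reserve a word "just in case".
                reserved += 1
        else:
            raise ValueError(f"unknown statement kind: {kind!r}")

    # Pass 2: every symbol is known, so duplicate values share one slot,
    # but the space reserved in pass 1 stays (the rest is zero-filled).
    used = len({name for kind, name in statements if kind == "ldr"})
    return reserved, used


# Data defined after the code: three forward references to the same label.
forward = [("ldr", "buf"), ("ldr", "buf"), ("ldr", "buf"), ("label", "buf")]
# Data defined before the code: the same references are now backward.
backward = [("label", "buf"), ("ldr", "buf"), ("ldr", "buf"), ("ldr", "buf")]

print(literal_pool_sizes(forward))   # (3, 1): two wasted words
print(literal_pool_sizes(backward))  # (1, 1): no waste
```

In this model the waste is exactly the number of duplicate forward references, which matches the observation that adding or removing LDR =’s adds or removes an equivalent number of wasted words.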
Modified by Jeffrey Lee (213) Mon, May 09 2022 - 23:05:48 GMT
Thanks for the explanation. I first spotted it when looking at the disassembly of some test code, so I’m not worried about that particular case. But it’s good to know that it only affects forward references, so I can work around it in future if needed.
Modified by Sprow (202) Mon, May 16 2022 - 08:06:03 GMT
- Attachment added: ticket547.txt
I also tried the example on the latest armasm (in Keil for Cortex-M) and exactly the same thing happens: unused zeros in the literal pool for forward references. So at least ObjAsm is on a par with the offering from a multi-billion-dollar chip designer!
Output attached (it uses ELF files now).
Modified by Sprow (202) Thu, June 02 2022 - 16:07:34 GMT
- Status changed from Open to WontFix
Ben’s explanation, and Arm’s doing the same, sounds like ‘WontFix’.