Ticket #258 (Fixed) Mon Dec 06 18:30:10 UTC 2010
Incorrect CSE optimisation in cc 5.61 with multiple array access
Reported by: | André Timmermans (100) | Severity: | Normal |
Part: | RISC OS: C/C++ toolchain | Release: | |
Milestone: | | Status: | Fixed |
Details by André Timmermans (100):
The following piece of code is compiled incorrectly:
<pre>
<code>
//#pragma no_optimise_cse
#include <string.h>
#define MAX_DMA_DEFINTIONS (800*5)
typedef struct
{
void* address;
int size;
} dma_transfer_request_t;
typedef struct
{
int no_of_requests;
dma_transfer_request_t dma_source[MAX_DMA_DEFINTIONS];
dma_transfer_request_t dma_drain[MAX_DMA_DEFINTIONS];
} player_t;
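void player_redrawLastFrame(player_t* player)
{
  int i;
  /* assumed loop bound: iterate over the queued requests */
  for (i = 0; i < player->no_of_requests; i++)
  {
    /* direct array access: the form that is miscompiled */
    memcpy( player->dma_drain[i].address
          , player->dma_source[i].address
          , player->dma_source[i].size
          );
/*
    dma_transfer_request_t*psource = &player->dma_source[i];
    memcpy( player->dma_drain[i].address
          , psource->address
          , psource->size
          );
*/
  }
}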
</code>
</pre>
The registers should be set as follows on entry to memcpy:
<pre>
<code>
R0 = load(player address + offset of dma_drain in player + sizeof(dma_transfer_t) * i + 0)
R1 = load(player address + offset of dma_source in player + sizeof(dma_transfer_t) * i + 0)
R2 = load(player address + offset of dma_source in player + sizeof(dma_transfer_t) * i + 4)
</code>
</pre>
The compiler detects that the two arrays dma_drain and dma_source have the same dma_transfer_t element type, and uses this to extract the common part of the address calculation:
<pre>
<code>
tmp = player address + sizeof(dma_transfer_t) * i
R0 = load(tmp + offset of dma_drain in player + 0)
R1 = load(tmp + offset of dma_source in player + 0)
R2 = load(tmp + offset of dma_source in player + 4)
</code>
</pre>
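The factoring itself is a valid transformation: hoisting the shared term only reorders an addition. A minimal sketch that checks this on the structures from the ticket follows. The helper names (direct_offset, factored_offset, forms_agree) are hypothetical and only illustrate the arithmetic; offsets are measured from the start of player_t, and sizeof(dma_transfer_request_t) is 8 only on a 32-bit target such as the one in the ticket.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DMA_DEFINTIONS (800*5)

typedef struct
{
  void* address;
  int size;
} dma_transfer_request_t;

typedef struct
{
  int no_of_requests;
  dma_transfer_request_t dma_source[MAX_DMA_DEFINTIONS];
  dma_transfer_request_t dma_drain[MAX_DMA_DEFINTIONS];
} player_t;

/* Direct form: array offset + element size * i + field offset. */
static size_t direct_offset(size_t array_off, size_t i, size_t field_off)
{
  assert(i < (size_t)MAX_DMA_DEFINTIONS);
  return array_off + sizeof(dma_transfer_request_t) * i + field_off;
}

/* Factored form: the term shared by all three loads is computed once. */
static size_t factored_offset(size_t array_off, size_t i, size_t field_off)
{
  size_t tmp = sizeof(dma_transfer_request_t) * i; /* common subexpression */
  return tmp + array_off + field_off;
}

/* Returns 1 when both forms give the same offsets for the three loads. */
static int forms_agree(size_t i)
{
  size_t src = offsetof(player_t, dma_source);
  size_t drn = offsetof(player_t, dma_drain);
  size_t sz  = offsetof(dma_transfer_request_t, size);

  return direct_offset(drn, i, 0)  == factored_offset(drn, i, 0)
      && direct_offset(src, i, 0)  == factored_offset(src, i, 0)
      && direct_offset(src, i, sz) == factored_offset(src, i, sz);
}
```

Since both forms agree for every valid index, any wrong result has to come from how the factored expression is turned into instructions, not from the factoring.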
However, the generated code is (R4 = player address, R5 = i, sizeof(dma_transfer_t) = 8):
<pre>
<code>
ADD R0,R5,R4,LSL #3
LDMIA R0!,{R1,R2}
ADD R0,R0,#&8000
SUB R0,R0,#&0304
LDR R0,[R0,#&CFC]
</code>
</pre>
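One reading of the listing, taking the register assignment stated above at face value: ADD Rd,Rn,Rm,LSL #n computes Rd = Rn + (Rm << n), so the first instruction yields i + (player << 3), whereas the tmp value described earlier needs player + (i << 3). The ticket does not say which instruction is at fault, so this is an interpretation and not a confirmed diagnosis. The sketch below models only that first instruction; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ARM semantics of ADD Rd,Rn,Rm,LSL #shift: Rd = Rn + (Rm << shift). */
static uint32_t arm_add_lsl(uint32_t rn, uint32_t rm, unsigned shift)
{
  assert(shift < 32);
  return rn + (rm << shift);
}

/* As listed: ADD R0,R5,R4,LSL #3 with R4 = player address, R5 = i. */
static uint32_t tmp_as_listed(uint32_t player, uint32_t i)
{
  return arm_add_lsl(i, player, 3);   /* i + player * 8 */
}

/* What the factored form asks for: player address + 8 * i. */
static uint32_t tmp_intended(uint32_t player, uint32_t i)
{
  return arm_add_lsl(player, i, 3);   /* player + i * 8 */
}
```

For example, with a hypothetical player address of 0x8000 and i = 2, tmp_intended gives 0x8010 while tmp_as_listed gives 0x40002.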
Compiling with #pragma no_optimise_cse enabled, or with the alternate (commented-out) code,
generates correct code.
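For reference, the pointer-based alternative can be written out as a standalone function. This is a sketch under assumptions: the name player_redrawLastFrame_alt is hypothetical, and the loop running over no_of_requests is assumed, since the ticket text does not show the loop header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DMA_DEFINTIONS (800*5)

typedef struct
{
  void* address;
  int size;
} dma_transfer_request_t;

typedef struct
{
  int no_of_requests;
  dma_transfer_request_t dma_source[MAX_DMA_DEFINTIONS];
  dma_transfer_request_t dma_drain[MAX_DMA_DEFINTIONS];
} player_t;

/* Hypothetical name; mirrors the commented-out code in the report. */
void player_redrawLastFrame_alt(player_t* player)
{
  int i;

  assert(player != NULL);

  /* Assumed loop bound: one copy per queued request. */
  for (i = 0; i < player->no_of_requests; i++)
  {
    /* Taking the element address first avoids indexing both arrays
       in the same expression, which is what the report identifies
       as the trigger. */
    dma_transfer_request_t* psource = &player->dma_source[i];

    memcpy( player->dma_drain[i].address
          , psource->address
          , (size_t)psource->size
          );
  }
}
```

The pragma route leaves the source unchanged but disables common subexpression elimination wherever it applies, so the pointer form may be preferable where the optimisation is otherwise wanted.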
Changelog:
Modified by Jeffrey Lee (213) Sat, January 25 2014 - 11:23:17 GMT
- Status changed from Open to Fixed
- Attachment added: 258
Looks like this was fixed sometime between 5.61 and 5.65. cc 5.65 output attached.