Processor kernel vulnerabilities
Pages: 1 2
Steve Pampling (1551) 8170 posts |
Always good to start the day on a bright note (irony) and people had got round to reading a few IT articles yesterday so… I replied that
I didn’t bother pointing out that privilege escalation in RO is an interesting fantasy for reasons we all know. The interesting effect is that patching to make things work securely (which Intel clearly avoided) slows things down by up to 30%. Anyone want to speculate why Intel did it the insecure way? |
Jeffrey Lee (213) 6048 posts |
I think ARM would disagree with you on that one. Cortex-A75 is affected (although El Reg points out that A75 cores aren’t actually commercially available yet), and there’s a variant which affects A15, A57 and A72. Although to me, both Meltdown and Spectre sound as bad as each other, in that a CPU bug/design flaw is allowing code to gain access to information which it shouldn’t be able to access.
And as I’m sure you know, these bugs aren’t about privilege escalation.
Because nobody complained about it until now? |
Steve Pampling (1551) 8170 posts |
User code accessing areas they should not be able to, normally requiring root or similar permissions in essence.
I’m presuming they just did something convenient that worked without bothering to cover all the test points. |
Jeffrey Lee (213) 6048 posts |
But only read-only access (at least until you use the extracted data to exploit a chink in the armour elsewhere). To me, privilege escalation implies both read and write access, so is much more powerful.
When your CPU contains billions of transistors, covering all the test points is rather non-trivial. I was implying that nothing had been done about it until now because no (good-natured) security researchers had found the issues until recently. As operating systems have become more and more secure, side-channel attacks on hardware are now the “in thing” for researchers to look at (and malware authors to exploit). Meltdown/Spectre is a combination of speculative execution, caches, and high-resolution timers. On their own these features are harmless, and when Intel first introduced the features 10+ years ago I doubt anyone within the company considered that they could be combined in such a way as to allow memory to be snooped (much like nobody at ARM or AMD considered it to be a problem). There’s only so much that Intel can patch via microcode updates, so until newer CPUs are released which are able to provide hardware mitigations for the attacks, the only solutions are software ones which by their nature make things slower. |
Jon Abbott (1421) 2651 posts |
When I first read the description of Meltdown I thought…hmm, sounds remarkably like the erratum I found in the ARM7 macrocell a few months back…namely trigger an Abort and the CPU pollutes the cache with a speculatively executed instruction that’s been incorrectly escalated. It’s not the same issue of course, but is remarkably similar in both execution and outcome. The “fix” unmaps the kernel data area from what I gather, but what else is in the memory map that could be read through this method? Could it be used to speculatively read hardware devices for example? |
Jeffrey Lee (213) 6048 posts |
Unlikely – IO devices should be mapped using Device memory type, which doesn’t support speculative reads (since the hardware might be read-sensitive). You might be able to use it to read memory buffers which hardware is using, however (e.g. USB & SATA controllers typically store lots of state information in host RAM) |
Colin Ferris (399) 1814 posts |
Is this going to have any effect on RO users? |
Jeffrey Lee (213) 6048 posts |
Theoretically, yes. The “Spectre” attack is the one which affects ARM the most (many Cortex cores are susceptible to it), and it can be exploited by Javascript. So if you’re running a browser which supports Javascript and provides access to high-resolution timers then a malicious script could exploit the flaw and read memory which it shouldn’t be able to. Of course the default RISC OS timer is only 100Hz, and we don’t yet have a proper API for accessing the HAL timers, so I doubt any of the browsers which do support Javascript (or any other programs which might run untrusted scripts within a “secure” sandbox) will be vulnerable. |
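As a rough back-of-the-envelope check of the timer point above, here is a minimal sketch. The hit and miss latencies are assumed round figures chosen for illustration, not measurements of any particular Cortex core; only the 100Hz tick rate comes from the post.

```python
# Illustrative figures only: the latencies below are assumptions,
# not measurements of any real CPU.
TICK_HZ = 100          # default RISC OS ticker rate mentioned in the post
CACHE_HIT_NS = 10      # assumed latency of a load that hits the cache
CACHE_MISS_NS = 100    # assumed latency of a load that goes out to RAM


def tick_ns(hz):
    """Length of one timer tick in nanoseconds."""
    return 1_000_000_000 // hz


def loads_per_tick(hz, load_ns):
    """How many loads of the given latency fit inside one timer tick."""
    return tick_ns(hz) // load_ns


if __name__ == "__main__":
    print("tick length (ns):", tick_ns(TICK_HZ))
    print("cache misses per tick:", loads_per_tick(TICK_HZ, CACHE_MISS_NS))
    print("hit/miss gap (ns):", CACHE_MISS_NS - CACHE_HIT_NS)
```

Under those assumptions a single tick is long enough for about a hundred thousand cache misses, so the 90ns gap between one hit and one miss is far below what a centisecond timer can resolve directly.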
Rick Murray (539) 13840 posts |
But, wait, enquiring minds want to know – when will the RISC OS kernel be patched to mitigate this heinous serious OMG problem? |
Jon Abbott (1421) 2651 posts |
I’ve now seen the technical detail on these exploits, both of which work on the principle of timing instructions to infer memory contents they don’t have access to. Meltdown exploits an erratum in the CPU implementation of speculative/pipeline execution, where the cache is polluted with data that’s been read speculatively. The only way to “patch” this 100% is to disable the cache; partial patching can be done by unmapping sensitive data from memory and flushing the cache, or by disabling caching of sensitive areas. Patches that have already been rolled out to prevent access to kernel data use the unmap method.
AntiVirus and Interrupt Handler data will be in the memory map. No doubt AV vendors will issue patches if not done so already. Spectre exploits branch prediction and affects all CPUs that speculatively execute branches. The only way to patch this 100% is to disable branch prediction. I’m intrigued as to how this is going to be patched without disabling it, as it’s exploiting expected CPU behaviour.
PMSL |
Steve Drain (222) 1620 posts |
Eben Upton has posted a very clear explanation of why Raspberry Pi isn’t vulnerable to Spectre or Meltdown. Subsequent discussion expands this, and he has said he will be adding to it in due course. |
Jeffrey Lee (213) 6048 posts |
For some bizarre reason ARM have decided the best way to deal with Spectre is to add a new barrier instruction and require software to use it when privileged code accesses data using an array index provided by untrusted code. Which seems like the worst possible solution. Hopefully they’ve just worded things poorly, and this is just a workaround for current CPUs, pending a proper fix for future CPUs. Since both exploits revolve around being able to use speculative execution to load cache lines (and then measure timing of cache/memory accesses to determine how the cache has changed), my thought is that CPU designers could add a “speculation reference count” to cache line entries. I.e. whenever speculative execution causes a cache line to be fetched, a reference count associated with that cache line is incremented. If the speculative instruction is discarded, the reference count will decrease, and if it hits zero the cache line is invalidated, returning the cache to its original state. Only if the speculative instruction is accepted (“architecturally executed”) will the cache line remain in the cache. If speculative memory accesses are allowed to evict cache lines, then the CPU might also need redesigning to hold any speculatively-fetched cache lines in a queue, so that software won’t be able to detect the speculative access by the fact that a previously-valid cache line has now vanished. Only once the instruction is architecturally executed will the speculative cache line be allowed to enter the main cache (and evict any other line as necessary). |
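The “speculation reference count” idea can be sketched as a toy model. This is only an illustration of the bookkeeping described in the post, not how any real cache works; the class and method names are hypothetical, and eviction (the queue variant) is not modelled.

```python
class SpecCache:
    """Toy cache that tracks which lines were fetched only speculatively."""

    def __init__(self):
        self.committed = set()   # lines that are architecturally present
        self.spec_refs = {}      # line -> count of in-flight speculative fetches

    def load(self, line):
        """Ordinary (non-speculative) load: the line simply enters the cache."""
        self.committed.add(line)

    def speculative_load(self, line):
        """Speculative load: count a reference if the line was not already cached."""
        if line not in self.committed:
            self.spec_refs[line] = self.spec_refs.get(line, 0) + 1

    def _drop_ref(self, line):
        if line in self.spec_refs:
            self.spec_refs[line] -= 1
            if self.spec_refs[line] == 0:
                del self.spec_refs[line]

    def squash(self, line):
        """Speculation discarded: at zero references the line is invalidated."""
        self._drop_ref(line)

    def commit(self, line):
        """Instruction architecturally executed: the line stays in the cache."""
        if line in self.spec_refs:
            self.committed.add(line)
        self._drop_ref(line)

    def is_cached(self, line):
        return line in self.committed or line in self.spec_refs
```

In this model a line fetched by a squashed speculative load disappears again, so a later timing probe would see the cache in its original state, while a line that was already cached before the speculation is left alone.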
Rick Murray (539) 13840 posts |
I might be misunderstanding something here, but doesn’t this cache issue imply that the speculative execution totally ignores memory privilege permissions? Why doesn’t the MMU choke and stall the instruction when the processor is not in a sufficiently privileged mode for what the instruction is wanting to do? |
Jeffrey Lee (213) 6048 posts |
Memory access permissions are being correctly adhered to. The attacks don’t actually allow code to read the contents of protected memory, they just allow code to infer the value of addresses that speculatively executed code has been accessing, by checking to see whether the speculative execution has disturbed the cache in any way. ARM’s whitepaper explains it fairly well – https://developer.arm.com/support/security-update
Or, as a summary:
    ; R0 comes from untrusted code
    CMP R0,#array_size
    MOVHS PC,LR ; Invalid values will halt execution here, but the following code may be speculatively executed anyway
    LDRB R0,[R1,R0] ; Speculatively load from an out-of-bounds array offset; could therefore load any address in memory
    LDRB R0,[R2,R0,LSL #n] ; Use the first loaded value to perform a second speculative load
If the attacking code can detect a cache line that’s been loaded or evicted by the second load instruction, it can deduce some of the address bits which that load instruction used. If it’s able to deduce enough of those address bits (e.g. by abusing a number of different routines which have different ‘n’ shift values) then it can deduce what value was loaded by the first load instruction. And since it can control the address that the first load instruction accesses, it can use it as a generic way of reading protected memory. I suppose another solution to the problem would be to make it so that privileged code uses completely separate caches to unprivileged code. |
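The deduction step in the summary above can be sketched as a toy simulation. Everything here is a simplifying assumption: the “cache” is just a set of line numbers, the victim always speculates past its bounds check (the worst case), the line size is taken as 64 bytes, and the attacker inspects the set directly where real code would have to time loads. All names are hypothetical.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class ToyMachine:
    """Flat memory plus a set recording which cache lines have been loaded."""

    def __init__(self, memory, array_base, array_size, probe_base, shift):
        self.memory = memory
        self.array_base = array_base   # R1 in the assembler summary
        self.array_size = array_size
        self.probe_base = probe_base   # R2 in the assembler summary
        self.shift = shift             # the 'n' in LSL #n
        self.cached_lines = set()

    def flush(self):
        self.cached_lines.clear()

    def probe_line(self, value):
        """Cache line touched by the second load for a given first-load value."""
        return (self.probe_base + (value << self.shift)) // LINE_SIZE

    def victim(self, index):
        """Privileged routine. Returns True only if the index passed the check,
        but the speculative loads leave a cache footprint either way."""
        value = self.memory[self.array_base + index]     # LDRB R0,[R1,R0]
        self.cached_lines.add(self.probe_line(value))    # LDRB R0,[R2,R0,LSL #n]
        return index < self.array_size


def recover_byte(machine, target_address):
    """Attacker: flush, trigger the victim out of bounds, see which line appeared."""
    machine.flush()
    machine.victim(target_address - machine.array_base)
    for guess in range(256):
        if machine.probe_line(guess) in machine.cached_lines:
            return guess
    return None
```

With a shift of 6 and 64-byte lines each possible byte value lands on its own cache line, so one call recovers a whole byte. A smaller shift would leave several values sharing a line, which is where combining routines with different ‘n’ values comes in.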
Jon Abbott (1421) 2651 posts |
It does seem a little odd at first, but makes sense when you consider the mess you’d probably get into trying to implement the obvious solution of invalidating the speculatively accessed cache line for out of bounds accesses.
It wouldn’t surprise me if at some point we see processors with dedicated cores for privilege level code which are completely isolated, with their own cache and MMU. |
Rick Murray (539) 13840 posts |
Are you sure? While a piece of code cannot directly read inaccessible memory, it seems to me that the basic process is:
However, it seems to me that this can only work if the processor is in fact bypassing the MMU and memory access permissions while performing speculative execution (or else where would the value retrieved by the first instruction come from?). The actual executing code obeys the rules, but what the processor is doing internally surely isn’t.
1 This is code that is not directly executed (it would fault), it is arranged so the speculative execution system will run this in parallel to code that is being executed. When it comes time to make use of this code, it faults and is never actually performed, but traces of its behaviour remain in the way the cache behaves; thus implying the faulting happens at the point of attempting to execute the code, and not when the speculative execution tries to access protected memory, which is why I suggested that the speculative unit is in fact bypassing the MMU’s access permissions. |
Jeffrey Lee (213) 6048 posts |
You’re right that it can work if speculative execution isn’t taking into account memory permissions (and it wouldn’t surprise me if some CPUs do suffer from that). And it would certainly explain why there’s been a flurry of OS patches to make sure kernel memory is completely unmapped when processors are running user code (and skimming through the doc, I think that’s what the Meltdown attack is – it runs its attack fully from within unprivileged code). But in the general case (Spectre), you can perform the attack by making sure that the speculative execution is performed by a privileged OS routine (called from your unprivileged attack code). |
Steve Pampling (1551) 8170 posts |
and application patches – the latest version of Firefox on Windows specifically lists it as having a fix for “Meltdown”
From the info I’ve read Meltdown exploits a specific feature of Intel designs which isn’t applicable to other manufacturers. As you say the general case is Spectre, but from the item by Eben even that doesn’t apply to all ARM devices. All in all has it changed the security situation in RO? |
Rick Murray (539) 13840 posts |
That’s why my post on Friday; because even if RISC OS was running on an affected processor, it’s a waste of time thinking about a kernel patch when the OS memory (and indeed most things except specific DAs that are restricted to SVC access) are generally read/write from user mode (hello RMA), and for those things that need SVC access, it’s a simple call to OS_EnterOS to get privilege. So we clearly benefit from total immunity to the problem by virtue of having zero “security” to begin with. You can’t lose what you don’t have! |
Steve Pampling (1551) 8170 posts |
Well the main item would be remote code attack via something like javascript, but there is the question over how much support for such a javascript based exploit actually exists. As Jeffrey pointed out:
Sometimes the lack of features actually has a positive side. |
Rick Murray (539) 13840 posts |
There is, of course, a wider question to be asked. Why does an interpreted script language running in a browser need access to high resolution timers at all? I can understand access to a normal timer in various fractions of a second – perhaps centisecond like RISC OS because the old style ticker worked at a weird 18.2 ticks per second, but a timer with a high enough resolution to time cache load differences? Why? [or is x86 just REALLY slow?] |
Rick Murray (539) 13840 posts |
That said – using Javascript, does such a piece of code actually exist? You’d need to write a script to manipulate what the processor does in a very careful way to get the desired effects, plus be able to time behaviour well enough to determine what the cache is up to; while bypassing the overheads of the script interpreter and the browser being ‘polled’ in a preemptive manner (as we’d be talking Windows or Linux here). Is this even possible? It would be like trying to control what happens in the ARM core by using pure BASIC (no assembler). Surely the act of interpreting the instructions (in whatever manner the interpreter prefers) would destroy the ability to accurately control and time the things we need to control and time in order to make the side channel exploit work. That said – we can trust Google to find a way to eff it up → https://developer.chrome.com/native-client 2
1 Or does modern Javascript support inline assembler? Nothing would surprise me these days…
2 Note, of course, the “Portable Native Client” that is translated to the architecture in use blah blah blah explain to me how this isn’t what Java did decades ago? Oh, wait, Google invented it so it must be better, right? |
Jeffrey Lee (213) 6048 posts |
Allegedly it can be done in JavaScript without even requiring access to a timer (just have a timing loop running on a second thread). There are also JavaScript implementations of rowhammer. https://github.com/IAIK/rowhammerjs
Modern JavaScript engines use JIT, so the performance should be pretty good for basic arithmetic and memory access. |
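The “timing loop on a second thread” idea has roughly the following shape. This is a sketch in Python only to show the structure; the names are hypothetical, and Python’s threading makes the counter far coarser than a JIT-compiled JavaScript loop would be, so it says nothing about how precise a real attack could get.

```python
import threading
import time


class CounterClock:
    """A makeshift clock: a background thread that does nothing but count."""

    def __init__(self):
        self.ticks = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Spin, incrementing the counter until asked to stop.
        while not self._stop.is_set():
            self.ticks += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


def measure(clock, operation):
    """Return how many counter ticks elapsed while the operation ran."""
    before = clock.ticks
    operation()
    return clock.ticks - before


if __name__ == "__main__":
    clock = CounterClock()
    clock.start()
    print("ticks during a 10ms sleep:", measure(clock, lambda: time.sleep(0.01)))
    clock.stop()
```

The point is simply that no timer API is involved: the measuring code only reads a number that another thread keeps incrementing.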
Chris Mahoney (1684) 2165 posts |
This article about PowerPC/X360 was an interesting read! |
Martin Avison (27) 1494 posts |
re X360 – Just proves that bugs are reinvented just as often as wheels. |