Naive questions on instruction set encodings.
GavinWraith (26) 1563 posts |
Apropos Paolo’s Ultima VM, I have some questions about the choice of encodings for instruction sets for machines, virtual or otherwise. What I am used to, when I look at the source code for a VM, is an encoding that is baked in. For example, in the Lua sources the files lopcodes.h and lopnames.h contain arrays of codes and names that must both be kept in the same fixed order. |
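To illustrate the "baked in" coupling described above, here is a minimal sketch in Python (not Lua's actual C sources). The opcode set, `OP_NAMES`, `tables_agree` and `disassemble` are all hypothetical names invented for this example; the only point is that the name table is indexed by opcode value, so the two lists must stay in the same order.

```python
from enum import IntEnum


# Hypothetical opcode set, for illustration only (not Lua's or UltimaVM's).
class Op(IntEnum):
    MOVE = 0
    LOADK = 1
    ADD = 2
    RETURN = 3


# Parallel name table: entry i must describe the opcode whose value is i.
OP_NAMES = ["MOVE", "LOADK", "ADD", "RETURN"]


def tables_agree():
    """Return True if the name table matches the opcodes in value order."""
    return [op.name for op in sorted(Op)] == OP_NAMES


def disassemble(code):
    """Map each opcode byte to its name by indexing the parallel table."""
    return [OP_NAMES[b] for b in code]
```

If someone inserts a new opcode in one table but not the other, `tables_agree()` returns False and `disassemble` would silently print the wrong names, which is why the order has to be fixed.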
Paolo Fabio Zaino (28) 1882 posts |
Dear Gavin,
Short answer: YES, there are sneaky optimizations (quite a lot, actually).
Long (and incomplete) answer: how one envisions the encoding of a bytecode instruction depends on a lot of factors, including the general execution architecture the developers have chosen for their VM, the CPU cache size, memory bandwidth and more (hence this answer is incomplete). I am actually using UltimaVM as a research project in bytecode execution optimization (for which using both RISC OS and Linux helps me, since they represent two antithetical approaches to OS design; it’s indeed a fascinating journey). In general, given how slow bytecode can be compared to native binaries, performance considerations carry the heaviest weight (over anything else), but this is a long discussion on its own. Static structures tend (obviously) to be faster (depending on how they are designed!).
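As a rough illustration of what a static encoding can look like, here is a sketch of a fixed-width instruction word. The 32-bit layout (an 8-bit opcode followed by three 8-bit operands) and the names `encode` and `decode` are assumptions made up for this example; this is not UltimaVM's encoding, and Python says nothing about real performance. It only shows that with a fixed layout every field sits at a known position, so decoding is a shift and a mask.

```python
# Hypothetical 32-bit layout: bits 0-7 opcode, 8-15 A, 16-23 B, 24-31 C.
OP_SHIFT, A_SHIFT, B_SHIFT, C_SHIFT = 0, 8, 16, 24
MASK8 = 0xFF


def encode(op, a, b, c):
    """Pack an opcode and three operands into one 32-bit word."""
    for field in (op, a, b, c):
        if not 0 <= field <= MASK8:
            raise ValueError("field does not fit in 8 bits")
    return (op << OP_SHIFT) | (a << A_SHIFT) | (b << B_SHIFT) | (c << C_SHIFT)


def decode(word):
    """Unpack a 32-bit word into (op, a, b, c) with shifts and masks."""
    return (
        (word >> OP_SHIFT) & MASK8,
        (word >> A_SHIFT) & MASK8,
        (word >> B_SHIFT) & MASK8,
        (word >> C_SHIFT) & MASK8,
    )
```

For example, `decode(encode(2, 1, 3, 4))` gives back `(2, 1, 3, 4)`. A variable-length encoding would need to inspect the opcode before it knows where the operands are.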
Security concerns are primarily about what a bytecode application can and cannot do, and what it can and cannot access. This is what a security model is about. For instance, on Ultima you can tell the interpreter what you want a bytecode application to be able to do and to access (including certain instructions, networks, disks, directories, URLs, IPs etc.), and the interpreter must be able to enforce that.
Randomising the encoding just makes the bytecode hard to reverse engineer, so it may have value in terms of obfuscation. But remember, in this case one can’t do pure randomization, otherwise even the interpreter or JIT won’t be able to make sense of the binary blob, and where there is a pattern… ;) One could, instead, encrypt the executable with a secret key and then provide a public key for decryption only (in effect, a digital signature); that would make the binary “readable”, but unmodifiable. So that would help with the malware requirement you mentioned, but only at rest (let’s be clear on this!).
To protect the code at runtime, you need an approach similar to the one I used for Ultima: code is ALWAYS separated, constant and unmodifiable, and the VM must enforce that (no self-modifying code allowed either). In this case, even if a library is malicious, it won’t be able to alter your code (or the code that calls it), i.e. execution is siloed, where the degree of separation is defined by the security model. The security model also ensures it cannot access resources you don’t want it to.
Hope this helps. TBH there is more, but I’m not sure if you are interested in all the various theories; there are a lot of good research papers on various aspects of designing and developing bytecode VMs, btw. |
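The two runtime ideas above (an interpreter that enforces which instructions a program may use, and code that stays constant once loaded) can be sketched as follows. Everything here is hypothetical and invented for illustration: the four opcodes, the `Module` class, `PolicyViolation` and `run` are not UltimaVM's API, and a real security model would cover far more than an opcode allow-list.

```python
# Hypothetical opcodes, for illustration only.
PUSH, ADD, NET_SEND, HALT = 0, 1, 2, 3


class PolicyViolation(Exception):
    """Raised when code uses an instruction its policy does not permit."""


class Module:
    """Loaded code plus its policy; both are frozen at load time."""

    def __init__(self, code, allowed_ops):
        self._code = bytes(code)              # bytes objects are immutable
        self._allowed = frozenset(allowed_ops)

    @property
    def code(self):
        return self._code

    @property
    def allowed(self):
        return self._allowed


def run(module):
    """Interpret the module, checking every opcode against the policy."""
    stack, pc, code = [], 0, module.code
    while pc < len(code):
        op = code[pc]
        if op not in module.allowed:
            raise PolicyViolation("opcode %d not permitted" % op)
        if op == PUSH:
            if pc + 1 >= len(code):
                raise ValueError("truncated PUSH")
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == NET_SEND:
            stack.pop()                       # placeholder for a network side effect
            pc += 1
        elif op == HALT:
            break
        else:
            raise ValueError("unknown opcode %d" % op)
    return stack
```

A module loaded with the policy `{PUSH, ADD, HALT}` can add numbers, but the interpreter stops it with `PolicyViolation` as soon as it reaches `NET_SEND`. Because the code is held as `bytes`, an attempt such as `module.code[0] = 9` raises `TypeError` rather than patching the program.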
GavinWraith (26) 1563 posts |
Paolo, I have a problem with Manjaro-ARM on my Pinebook Pro and I was hoping to get a wise word from you. But my email bounced. Could you email me? Last year I reinstalled Manjaro to the NVMe drive successfully but I have forgotten what I did. After an update last month it has no sound, wifi or internet connectivity. |