Naive questions on instruction set encodings.

3 posts, 2 voices

Feb 22, 2024 5:53pm GavinWraith (26) 1563 posts	A propos Pablo’s Ultima VM, I have some questions about the choice of encodings for instruction sets for machines, virtual or otherwise. What I am used to, when I look at the source code for a VM, is an encoding that is baked in. For example, in the Lua sources the files lopcodes.h and lopnames.h contain arrays of codes and names that must both be in a fixed order. Question 1. Is the correspondence between instructions and opcodes purely arbitrary? Or are there sneaky optimizations to be had by a cunning choice, for a particular platform? Question 2. Are there security concerns arising from fixed encodings? Can encodings be randomized in a way that fools malware but does not break commercial use?
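For readers unfamiliar with the layout Gavin describes, here is a minimal Python sketch of a baked-in encoding. The opcode names and values are hypothetical and are not Lua's; the only point is that the numeric code doubles as an index into a parallel name table, so the two must be kept in the same order.

```python
from enum import IntEnum


class Op(IntEnum):
    # Hypothetical opcodes for illustration; the value is the encoding.
    MOVE = 0
    LOADK = 1
    ADD = 2
    JMP = 3
    RETURN = 4


# Parallel table: OP_NAMES[i] must be the name of the opcode whose value is i.
OP_NAMES = ("MOVE", "LOADK", "ADD", "JMP", "RETURN")


def check_tables():
    """Return True if the name table has the same length and order as Op."""
    return len(OP_NAMES) == len(Op) and all(OP_NAMES[op] == op.name for op in Op)


def disassemble(code):
    """Map a sequence of opcode values to their names via the parallel table."""
    return [OP_NAMES[b] for b in code]
```

If the two tables drift out of order, `disassemble` silently prints the wrong names, which is why a check like `check_tables` (or generating both tables from one list) is a common safeguard.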

Feb 22, 2024 9:16pm Paolo Fabio Zaino (28) 1882 posts	Dear Gavin,
Question 1. Is the correspondence between instructions and opcodes purely arbitrary? Or are there sneaky optimizations to be had by a cunning choice, for a particular platform?
Short answer: YES, there are sneaky optimizations (quite a lot, actually).
Long (and incomplete) answer: How one envisions the encoding of a bytecode instruction depends on a lot of factors, including the general execution architecture the developers have chosen for their VM, the CPU cache size, memory bandwidth and more (hence this answer is incomplete). I am actually using UltimaVM as a research project in bytecode execution optimization (using both RISC OS and Linux helps me here, as they represent two opposite approaches to OS design; it’s indeed a fascinating journey). In general, given how slow bytecode can be compared to native binaries, performance considerations carry the heaviest weight (over anything else), but this is a long discussion on its own. Static structures tend (obviously) to be faster (depending on how they are designed!).
Question 2. Are there security concerns arising from fixed encodings? Can encodings be randomized in a way that fools malware but does not break commercial use?
Security concerns are primarily about what a bytecode application can and cannot do, and what it can and cannot access. This is what a security model is about. For instance, on Ultima you can tell the interpreter what you want a bytecode application to be able to do or not, and to be able to access or not (including certain instructions, networks, disks, directories, URLs, IPs etc.), and the interpreter must be able to enforce that.
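As an illustration of an interpreter enforcing such a policy, here is a minimal Python sketch of a toy stack VM that refuses any instruction not on an allow-list. This is not Ultima's implementation or API; the opcodes (`PUSH`, `ADD`, `OPEN`, `HALT`), the `Policy` class and the `run` function are all hypothetical names invented for this example, and only the per-instruction part of the security model is shown.

```python
from dataclasses import dataclass

# Hypothetical opcodes for a toy stack machine.
PUSH, ADD, OPEN, HALT = range(4)


class PolicyViolation(Exception):
    """Raised when bytecode uses an instruction the policy forbids."""


@dataclass(frozen=True)
class Policy:
    # Set of opcodes the bytecode application is allowed to execute.
    allowed_ops: frozenset


def run(code, policy):
    """Interpret `code`, checking every opcode against `policy` first."""
    stack = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op not in policy.allowed_ops:
            raise PolicyViolation(f"opcode {op} not permitted at pc={pc}")
        if op == PUSH:
            stack.append(code[pc + 1])  # operand follows the opcode
            pc += 2
        elif op == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == OPEN:
            # Stand-in for a privileged instruction (e.g. file access).
            stack.append(-1)
            pc += 1
        elif op == HALT:
            break
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack
```

With a policy of `Policy(frozenset({PUSH, ADD, HALT}))`, a program containing `OPEN` is stopped before that instruction runs, regardless of how its opcodes happen to be numbered.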
Randomising the encoding just makes the bytecode hard to reverse engineer, so it may have value in terms of obfuscation, but remember, in this case one can’t do pure randomization, otherwise even the interpreter or JIT won’t be able to make sense of the binary blob, and where there is a pattern… ;)
One could, instead, encrypt the executable with a secret (private) key and then provide a public key for decryption only; that would make the binary “readable” but unmodifiable. So, that would help with the malware requirement you’ve mentioned while the code is at REST (let’s be clear on this!).
To protect the code at runtime, you need an approach similar to the one I used for Ultima: code is ALWAYS separated, constant and unmodifiable, and the VM must enforce that (no self-modifying code allowed either). There is NO physical “dynamic or static link”: each library is loaded in its own code segment and can only interoperate either in a permissive environment (basically you execute both your code and the library code within the same VM context, so they can both access the same Data memory on Ultima) or in a restrictive environment (where they are executed in two separate VM contexts and share result values through a virtual register copy mechanism within the VM, which also copies specified pre-mapped data memory segments). In this case, even if a library is malicious, it won’t be able to alter your code (or the code that calls it); in other words, execution is siloed, and the degree of separation is defined by the security model. The security model will also ensure the library cannot access resources you don’t want it to.
Hope this helps. TBH there is more, but I’m not sure if you are interested in all the various theories; there are a lot of good research papers on various aspects of designing and developing bytecode VMs, btw.
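The permissive and restrictive modes described above can be sketched roughly as follows. This is a toy model in Python under stated assumptions, not Ultima's actual design: `Context`, `call_permissive`, `call_restrictive` and `library` are hypothetical names, a plain Python function stands in for library bytecode, and the copying of pre-mapped data memory segments is omitted for brevity.

```python
class Context:
    """A toy VM context: immutable code, mutable registers and data memory."""

    def __init__(self, code):
        self.code = bytes(code)      # bytes is immutable: no self-modifying code
        self.registers = [0] * 4
        self.data = bytearray(16)


def library(ctx):
    """Stand-in for library code: doubles r0 and scribbles on data memory."""
    ctx.registers[0] *= 2
    ctx.data[0] = 0xFF


def call_permissive(caller, library_fn):
    # Library runs inside the caller's context and shares its data memory.
    library_fn(caller)


def call_restrictive(caller, lib_ctx, library_fn, arg_regs=(0,), result_regs=(0,)):
    # Copy the argument registers into the library's own context...
    for r in arg_regs:
        lib_ctx.registers[r] = caller.registers[r]
    # ...run the library there, so it only ever touches its own memory...
    library_fn(lib_ctx)
    # ...then copy the result registers back to the caller.
    for r in result_regs:
        caller.registers[r] = lib_ctx.registers[r]
```

In the restrictive call the library's write to `data[0]` lands in its own context and the caller only sees the returned register value; in the permissive call the same write changes the caller's data memory. In both cases `code` cannot be assigned to, which mirrors the "constant and unmodifiable" rule.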

May 2, 2024 7:36pm GavinWraith (26) 1563 posts	Paolo, I have a problem with Manjaro-ARM on my Pinebook Pro and I was hoping to get a wise word from you. But my email bounced. Could you email me? Last year I reinstalled Manjaro to the NVME drive successfully but I have forgotten what I did. After an update last month it has no sound or wifi or internet connectivity.
