Forums → Wish lists →

Static analysis

9 posts, 4 voices

Jan 29, 2018 1:59pm Jeffrey Lee (213) 6048 posts	Although some bits of the OS have recently been subjected to analysis by cppcheck, I’m thinking that something a bit more advanced is needed. Specifically, tools which are able to look at a program as a whole and check for concurrency/re-entrancy issues. A typical workflow might be as follows:

- The programmer adds annotations to the source code to indicate where the different entry points are, how the entry points might be invoked (thread, interrupt handler, re-entrancy), and how variables should be organised into atomic groups.
- The static analyser looks at the code and constructs a call graph showing which functions can call each other, and which variables are accessed (tracking both read & write accesses).
- The call graph is then analysed to work out where different locks need to be placed in the code to ensure the program behaves in a safe manner (and what types of locks are required – e.g. spinlock vs. rwlock vs. mutex), the correct locking order if multiple locks need to be held, etc.
- The tool can then compare the lock information against the source / call graph again in order to work out which bits of code need modifying in order to ensure the locks are implemented correctly (bonus points if it can automatically update the source for you!)
- Warnings can also be produced if it looks like the code may keep interrupts disabled for long periods of time (e.g. timer based wait loops while spinlocks are held), or if excessive locking is likely to have an adverse effect on the peak concurrency of the system, etc.

This is the kind of thing that computers should be good at, but (at least if my initial searches are anything to go by) doesn’t seem to be something that’s commonplace amongst (free/open-source) static analysis tools. Especially if you’re after something that can work with C code. Which seems pretty crazy, considering the amount of open-source threaded code that there is out in the wild. Programming really is a profession that’s still in its infancy.

The closest I’ve found so far are:

- Frama-C, which has a closed-source Mthread plugin, and an open-source (but simple) simple-concurrency plugin. Neither of which looks like it’d be particularly suitable, so I’d need to get my head around OCaml and try writing my own plugin (but maybe that isn’t so bad? Frama-C seems to have an extensive guide on how to write plugins, at least).
- Moose, which is a project which seems to be aiming towards introducing a seismic shift in how people look at code. Despite being around for 20-odd years it still feels a bit research project-y, with lots of effort being put into the core modelling & analysis but not so much on how to make practical use of it analysing source code. There is a C analyser available, but without learning the system myself it’s hard to say whether it’s in a state that’s suitable for use, or how long it would take for me to become proficient enough to be able to make effective use of its capabilities.

A big hurdle with a lot of analysis seems to be the act of transforming the source code into a call graph or other structure which can be easily manipulated, and C, despite being a relatively simple language, seems to be one of the more annoying ones to deal with at the source level (lots of implementation-defined behaviour, implementation-specific headers, the C preprocessor acting as an extra language on top of the ‘pure’ C source, etc.). I know GCC has the option to spit out its RTL which some tools can then make use of, but that feels a bit too low-level for me to use to knock up a quick and dirty analysis tool.
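
As a rough illustration of the call-graph step in the workflow described above, here is a minimal sketch in C. It is not how any of the tools mentioned work. The `program_t` layout, the function and variable numbering, and the `needs_lock` rule (a variable needs protecting when two entry points can both reach it and at least one may write it) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FUNCS 8
#define MAX_VARS  8

/* Hypothetical model of an annotated program. */
typedef struct {
    int  nfuncs, nvars;
    bool calls[MAX_FUNCS][MAX_FUNCS];  /* calls[f][g]: f may call g   */
    bool reads[MAX_FUNCS][MAX_VARS];   /* reads[f][v]: f may read v   */
    bool writes[MAX_FUNCS][MAX_VARS];  /* writes[f][v]: f may write v */
} program_t;

/* Depth-first walk marking every function reachable from f. */
static void mark_reachable(const program_t *p, int f, bool seen[MAX_FUNCS])
{
    if (seen[f])
        return;
    seen[f] = true;
    for (int g = 0; g < p->nfuncs; g++)
        if (p->calls[f][g])
            mark_reachable(p, g, seen);
}

/* True if var is touched from both entry points and written from at least one. */
bool needs_lock(const program_t *p, int entry_a, int entry_b, int var)
{
    bool seen_a[MAX_FUNCS] = { false }, seen_b[MAX_FUNCS] = { false };
    bool touch_a = false, write_a = false, touch_b = false, write_b = false;

    assert(var >= 0 && var < p->nvars);
    mark_reachable(p, entry_a, seen_a);
    mark_reachable(p, entry_b, seen_b);

    for (int f = 0; f < p->nfuncs; f++) {
        bool w = p->writes[f][var];
        bool t = w || p->reads[f][var];
        if (seen_a[f]) { touch_a = touch_a || t; write_a = write_a || w; }
        if (seen_b[f]) { touch_b = touch_b || t; write_b = write_b || w; }
    }
    return (write_a && touch_b) || (write_b && touch_a);
}

/* Hypothetical example: 0 = thread entry, 1 = interrupt handler, 2 = shared helper. */
program_t example_program(void)
{
    program_t p = { .nfuncs = 3, .nvars = 2 };
    p.calls[0][2]  = true;  /* thread entry calls the helper      */
    p.calls[1][2]  = true;  /* interrupt handler calls the helper */
    p.writes[2][0] = true;  /* helper writes variable 0           */
    p.reads[0][1]  = true;  /* only the thread reads variable 1   */
    return p;
}
```

With this example, `needs_lock(&p, 0, 1, 0)` reports that variable 0 needs protecting, while variable 1 does not because only one entry point ever touches it. A real tool would also have to choose the lock type and locking order, which this sketch does not attempt.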

Jan 29, 2018 2:16pm David Thomas (43) 72 posts	If you can get code through Clang, its thread sanitizer is neat for catching a lot of threading issues as they happen.

Jan 30, 2018 8:52pm Jeffrey Lee (213) 6048 posts	Clang’s thread sanitiser does sound useful, but it is kind of the opposite of static analysis :-)

I’ve had a quick play with Frama-C:

- When using it on RISC OS projects there’s a bit of setup needed to get your sources into the right structure (e.g. renaming them all to use DOS/Unix filename extensions instead of c/h directories). But this will be true of pretty much any tool, unless it can be trivially ported to RISC OS.
- Some tweaking of the command line it uses to run the C preprocessor is also necessary (e.g. specifying include paths that are suitable for working on components from the OS sources).
- It doesn’t have a builtin machine definition for ARM, so some bits might not be quite right. This also means it might be difficult to configure it to use the GCCSDK cross compiler as a preprocessor (although I guess if it’s just using the preprocessor, the machine selection won’t really matter much?)
- It looks like a lot of the analysis options rely on the value analysis having been performed, and the value analysis is only likely to succeed if you’ve added ACSL annotations to various functions (e.g. it gets upset because it doesn’t understand how `_swix` returns values).
- Even the callgraph can do unexpected things, e.g. if you’ve got a function pointer and it can’t work out what functions it may be initialised to, it looks like it assumes that it could be initialised to any function with a matching signature.

So it looks like you’d have to put in a lot of effort to make use of Frama-C on an “ordinary” program.

Jan 31, 2018 1:33pm Jeffrey Lee (213) 6048 posts	“e.g. it gets upset because it doesn’t understand how `_swix` returns values”

Thinking about it, I suspect the only sensible way of dealing with SWI calls in this context would be to create wrappers for all the SWIs you want to use. Make sure the prototypes all have ACSL specifications, and don’t let Frama-C see the function bodies (otherwise you’re stuck trying to write a specification for `_swix` again). As a bonus the C compiler will also be able to type check your SWI arguments whenever you use one of the wrappers.
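
A minimal sketch of what such a wrapper might look like, assuming a wrapper for OS_ReadMonotonicTime. The names `os_error_t` and `os_read_monotonic_time` are made up for the example, the ACSL contract is only one plausible specification, and the body is a hosted stand-in so that the sketch compiles away from RISC OS. On RISC OS the body would presumably be a one-line `_swix` call kept in a source file that Frama-C is never shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the error block type that kernel.h provides. */
typedef struct {
    int  errnum;
    char errmess[252];
} os_error_t;

/* ---- wrappers.h: the only part the analyser needs to see ---- */

/*@ requires \valid(time_out);
  @ assigns *time_out;
  @ ensures \result == \null || \valid_read(\result);
  @*/
const os_error_t *os_read_monotonic_time(unsigned int *time_out);

/* ---- wrappers.c: kept away from the analyser ----
 * Hosted stand-in (assumption). On RISC OS this would be something like:
 *     return _swix(OS_ReadMonotonicTime, _OUT(0), time_out);
 */
const os_error_t *os_read_monotonic_time(unsigned int *time_out)
{
    static unsigned int fake_centiseconds;  /* hypothetical clock source */

    assert(time_out != NULL);
    *time_out = fake_centiseconds++;
    return NULL;                            /* NULL means "no error" */
}

/* ---- caller: arguments are now type checked by the compiler ---- */
unsigned int ticks_between_two_reads(void)
{
    unsigned int start = 0, end = 0;

    if (os_read_monotonic_time(&start) != NULL)
        return 0;
    if (os_read_monotonic_time(&end) != NULL)
        return 0;
    return end - start;
}
```

Passing, say, an `int` instead of an `unsigned int *` to the wrapper is rejected at compile time, whereas the variadic `_swix` call would accept it silently.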

Mar 10, 2018 5:31pm Jeffrey Lee (213) 6048 posts	I’m a fan of APIs that are safe by design – i.e. for thread-safe access to a shared object, the API would be designed such that you’re only given access to that object if you can prove that you’ve got an appropriate lock held on it.

In C++ this is fairly easy to achieve, with low overheads. E.g. you can have an opaque ‘foo_reference’ class which represents a pointer to a ‘foo’, and only has two methods: ‘read_lock’ and ‘write_lock’. These methods then return ‘foo_read’ and ‘foo_write’ types which both (a) use their constructors & destructors to act as scoped locks to provide thread safety, and (b) contain members/methods in order to allow the underlying ‘foo’ to be interacted with (in a read-only or read-write manner, as appropriate). Inline functions and a half-decent optimiser in the compiler can help eliminate any of the unnecessary overheads this may introduce, compared to if ‘foo’ was a plain struct and you had to manually call global lock/unlock functions.

Achieving this in C is a lot harder – you have no private functions/members, no constructors, and no destructors. But I’m thinking that maybe it’s possible to get most of what I want using some good old-fashioned macro magic. E.g. you could have code that looks like this:

    OBJECT(foo_t, object1)
    OBJECT(foo_t, object2)

    void do_stuff(int arg1,int arg2 WITH_LOCKS(READ foo_t object1,WRITE foo_t object2))
    {
      object2->some_value = arg1 + arg2 + object1->some_value;
    }

    int main()
    {
      {
        READ_LOCK(foo_t,object1)
        {
          {
            WRITE_LOCK(foo_t,object2)
            {
              do_stuff(1,2 PASS_LOCKS(object1,object2));
              if (something)
              {
                printf("oh no!\n");
                WRITE_UNLOCK(object2)
                READ_UNLOCK(object1)
                return 1;
              }
            }
            WRITE_UNLOCK(object2)
          }
        }
        READ_UNLOCK(object1)
      }
      return 0;
    }

When built normally, the macros would expand to something like this:

    foo_t object1;
    mutex_t object1_mut;
    foo_t object2;
    mutex_t object2_mut;

    void do_stuff(int arg1,int arg2)
    {
      object2->some_value = arg1 + arg2 + object1->some_value;
    }

    int main()
    {
      {
        mutex_read_lock(&object1_mut);
        {
          {
            mutex_write_lock(&object2_mut);
            {
              do_stuff(1,2);
              if (something)
              {
                printf("oh no!\n");
                mutex_write_unlock(&object2_mut);
                mutex_read_unlock(&object1_mut);
                return 1;
              }
            }
            mutex_write_unlock(&object2_mut);
          }
        }
        mutex_read_unlock(&object1_mut);
      }
      return 0;
    }

i.e. with no overheads introduced by the lock checking. But when built in lock checking mode, it would produce code that looks like this:

    foo_t object1_glob;
    mutex_t object1_mut;
    foo_t object2_glob;
    mutex_t object2_mut;

    void do_stuff(int arg1,int arg2, const foo_t object1, foo_t object2)
    {
      object2->some_value = arg1 + arg2 + object1->some_value;
    }

    int main()
    {
      {
        mutex_read_lock(&object1_mut);
        const foo_t object1 = object1_glob;
        {
          {
            mutex_write_lock(&object2_mut);
            foo_t object2 = object2_glob;
            {
              do_stuff(1,2,object1,object2);
              if (something)
              {
                printf("oh no!\n");
                mutex_write_unlock(&object2_mut);
                mutex_read_unlock(&object1_mut);
                return 1;
              }
            }
            mutex_write_unlock(&object2_mut);
          }
        }
        mutex_read_unlock(&object1_mut);
      }
      return 0;
    }

i.e. the globals still exist, but have been renamed so that attempts to access them without a lock held are liable to fail. Meanwhile, the lock acquire macros have been updated to create local references to the globals, with the appropriate const/non-const qualifier.

You probably wouldn’t want to run the code that’s produced by this version (poor performance due to lots of extra function arguments), but it’ll allow the compiler to spot various mistakes you may be making with how you’re accessing your global variables. By explicitly marking the functions as requiring certain locks, it’ll also make it more obvious to the programmer where certain problems may lie (e.g. recursive locks or incorrect lock acquisition order).

It may also be possible to produce a halfway-house version, that keeps the globals protected from accidental access, even for regular builds of the code:

    void do_stuff(int arg1,int arg2 WITH_LOCKS(READ foo_t object1,WRITE foo_t object2))
    {
      ACCEPT_READ_LOCK(foo_t,object1)
      ACCEPT_WRITE_LOCK(foo_t,object2)
      object2->some_value = arg1 + arg2 + object1->some_value;
    }

In lock-checking builds, the ACCEPT_READ_LOCK / ACCEPT_WRITE_LOCK macros would be nops. But in regular builds they would expand as local references to the renamed globals, as follows:

    void do_stuff(int arg1,int arg2)
    {
      const foo_t object1 = object1_glob;
      foo_t object2 = object2_glob;
      object2->some_value = arg1 + arg2 + object1->some_value;
    }

Searching around to see if other people have had similar ideas, I’ve just discovered this article which has two main take-aways:

- Some programming languages have a concept of “thread usage policy”, which allows the compiler/linter to check your code is safe, essentially operating as a more advanced version of the above macros.
- Clang supports thread usage policies for C/C++ via adding attributes to the function declarations (and using intrinsic functions for obtaining/releasing the capabilities that the function attributes check for).

So if we were using clang then I could probably use clang’s system directly.
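
For anyone wanting to experiment with the macro scheme described in this post, here is one way it might be fleshed out into something that compiles. This is a sketch under several assumptions rather than the actual macros used: the locked objects are handed around as pointers (`const foo_t *` for read locks, `foo_t *` for write locks), the `mutex_t` functions are single-threaded stand-ins that only count holders, `LOCK_CHECK` is a made-up build switch, and `run_example` replaces `main` with the early-return path left out.

```c
#include <assert.h>

typedef struct { int some_value; } foo_t;

/* Hypothetical stand-in for a real read/write lock: it only counts holders. */
typedef struct { int readers; int writers; } mutex_t;
static void mutex_read_lock(mutex_t *m)    { assert(m->writers == 0); m->readers++; }
static void mutex_read_unlock(mutex_t *m)  { assert(m->readers > 0); m->readers--; }
static void mutex_write_lock(mutex_t *m)   { assert(m->readers == 0 && m->writers == 0); m->writers = 1; }
static void mutex_write_unlock(mutex_t *m) { assert(m->writers == 1); m->writers = 0; }

#ifdef LOCK_CHECK
/* Checking build: only the renamed global exists at file scope, so the
 * short name is usable only where a lock macro has declared it locally. */
#define OBJECT(type, name)     type name##_glob; mutex_t name##_mut;
#define READ_LOCK(type, name)  mutex_read_lock(&name##_mut); const type *name = &name##_glob;
#define WRITE_LOCK(type, name) mutex_write_lock(&name##_mut); type *name = &name##_glob;
#define WITH_LOCKS(...)        , __VA_ARGS__
#define PASS_LOCKS(...)        , __VA_ARGS__
#define READ                   const
#define WRITE
#else
/* Normal build: the short name is a global pointer, with no extra arguments. */
#define OBJECT(type, name)     type name##_glob; type *const name = &name##_glob; mutex_t name##_mut;
#define READ_LOCK(type, name)  mutex_read_lock(&name##_mut);
#define WRITE_LOCK(type, name) mutex_write_lock(&name##_mut);
#define WITH_LOCKS(...)
#define PASS_LOCKS(...)
#endif
#define READ_UNLOCK(name)      mutex_read_unlock(&name##_mut);
#define WRITE_UNLOCK(name)     mutex_write_unlock(&name##_mut);

OBJECT(foo_t, object1)
OBJECT(foo_t, object2)

/* Requires a read lock on object1 and a write lock on object2. */
void do_stuff(int arg1, int arg2 WITH_LOCKS(READ foo_t *object1, WRITE foo_t *object2))
{
    object2->some_value = arg1 + arg2 + object1->some_value;
}

int run_example(void)
{
    READ_LOCK(foo_t, object1)
    {
        WRITE_LOCK(foo_t, object2)
        do_stuff(1, 2 PASS_LOCKS(object1, object2));
        WRITE_UNLOCK(object2)
    }
    READ_UNLOCK(object1)
    return object2_glob.some_value;
}
```

Building with `-DLOCK_CHECK` and then trying to assign to `object1->some_value` inside `do_stuff`, or using `object1` outside a locked scope, should produce a compile error, which is the point of the checking build. The normal build performs the same computation without the extra arguments.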

Mar 10, 2018 8:04pm Rick Murray (539) 13840 posts	“check for concurrency/re-entrancy issues.”

I’m wondering if there mightn’t be some value in having some sort of simple debug terminal that works via serial port and can interrupt/freeze an active machine for various sorts of analysis? Something a little easier/friendlier than JTAG, but something that doesn’t imply screen or keyboard use (pausing FileCore from a Wimp program, for example, would display what where?).

Mar 10, 2018 10:28pm Jeffrey Lee (213) 6048 posts	Yes, something like that would definitely be useful for some tasks.

Mar 20, 2018 11:21am nemo (145) 2546 posts	Jeffrey wrote: “I’ve just discovered this article”

I have implemented thread role declaration through mutually exclusive headers to control build-time access to functionality. The use of public (external) and private (internal) headers for a module is common and usually well understood, and though careful planning is required to avoid dependency hell, it can be seen that extending that model to provide multiple public role headers is straightforward. There’s no protection for functionality quite like being unable to link to the functionality.

Mar 20, 2018 2:02pm Jeffrey Lee (213) 6048 posts	I think I’m close to solving my immediate problems (tightening up OMAPVideo). Implementing the macro-based system helped make it clear where some refactoring was required to reduce the amount of data that needs to be protected by long spinlocks. I’m not sure if I’ll actually want/need to use the macro system in the final version, however (I’m starting to think that a lot of the code should avoid passing around pointers to write-locked objects, and read-locked objects are now generally getting handled as pointers anyway)
