IIC probing / CMOS sanity failure

13 posts, 5 voices

Mar 11, 2015 6:27pm Rick Murray (539) 13840 posts	I’m going to add this to the bug tracker, but I’ll paste the details and code here, as the bug tracker has a ‘different’ version of Textile that usually mangles program snippets. Plus I can be more descriptive. [update: Filed as ticket #404 https://www.riscosopen.org/tracker/tickets/404 – yay, I got myself a 404! :-) :-)]

While developing some code for CJE, it would have been useful to probe the IIC buses to try to locate a device that is not present on the standard IIC bus. Unfortunately, this has some… unpleasant… side effects.

Do NOT do this on a live system with important unsaved data. We will be messing things up here. ;-) *Your CMOS settings will be reverted to default.*

First up, open a TaskWindow. You’ll need it later. Here’s the code:

    DIM iic% 12
    DIM buf% 8

    FOR bus% = 0 TO 7
      REM Write a single zero byte to the device (address byte &E8)
      buf%?0 = 0
      iic%!0 = &E8
      iic%!4 = buf%
      iic%!8 = 1
      SYS "XOS_IICOp", iic%, (1 + (bus% << 24)) TO ; f%
      IF (NOT f% AND 1) THEN
        REM No error (V clear), so read four bytes back (address byte &E9)
        iic%!0 = &E9
        iic%!8 = 4
        SYS "XOS_IICOp", iic%, (1 + (bus% << 24)) TO ; f%
        IF (NOT f% AND 1) THEN
          REM Terminate the reply with CR so it can be printed as a string
          buf%?4 = 13
          PRINT "Reply is "+$buf%+" on bus "+STR$(bus%)
        ENDIF
      ENDIF
    NEXT
    END

If you run this on a Pi with CJE’s smart RTC/power control module, the response will be:

    Reply is CJE1 on bus 0
    Reply is ÞÞÞÞ on bus 2
    Reply is üüüü on bus 4
    Reply is úúúú on bus 6

Note that when you return to the desktop, everything is in the RISC OS 3.1 style VDU font.

Run the above program again. And again. And again. You will see that it is repeatable, that IIC works (sort of), and that the device on bus #0 returns “CJE1” every time. Yet, for some reason, RISC OS has thrown in the towel, decided there is no CMOS RAM, and reverted everything to defaults. For example:

    *St. BootNet
    BootNet Off
    *Co. BootNet On
    *St. BootNet
    BootNet Off
    *Unplug BASIC
    *Unplug TaskManager
    *Unplug
    No modules are unplugged

(The *Unplug TaskManager command will have RMKilled it, but your machine needs to be rebooted anyway…)

As a result of this, it is clear that probing IIC devices is not going to work.

Questions:

1. Why does RISC OS return gibberish for bus numbers 2, 4, and 6? They don’t exist; and even if they did, this device would not exist on them.
2. Why does RISC OS not throw an error for attempts to talk to invalid IIC buses?
3. What is it that causes the CMOS RAM to revert to defaults?

To reboot in a ‘friendlier’ way, without pressing a reset button or power cycling, go to the TaskWindow you opened earlier:

    *RMReInit BASIC
    *Shutdown
    *BASIC
    SYS "OS_Reset"

You *will* need to reconfigure your machine – I found mine had reset the monitor type (it was Generic, 1280×1024, but reverted to Auto, 1920×1080!), the mouse speed, and the font cache settings. It may be resetting everything; I just don’t change the other settings from their defaults.

I have found, via some testing, that attempting to talk to bus 5 is what trashes the CMOS RAM (at least, on a Pi). Trying to talk to the other buses does not cause this oddity.

Chris has reported that similar behaviour is exhibited on a Pandaboard, so this is not Pi-specific.
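For anyone reading along without a RISC OS machine to hand, here is a minimal sketch of the 12-byte transfer block that the BASIC program pokes into iic%. It is written in Python purely as a modelling language and talks to no hardware. It assumes the usual IIC convention that bit 0 of the address byte selects read or write, which is consistent with the program using &E8 for the write and &E9 for the read. The helper names and the buffer address are hypothetical.

```python
import struct

def split_address_byte(addr_byte):
    """Split an 8-bit IIC address byte into (7-bit address, is_read)."""
    return addr_byte >> 1, bool(addr_byte & 1)

def pack_descriptor(addr_byte, buffer_ptr, length):
    """Pack the three little-endian words stored at iic%!0, iic%!4 and iic%!8."""
    return struct.pack("<III", addr_byte, buffer_ptr, length)

BUF = 0x8000  # hypothetical buffer address, for illustration only

write_desc = pack_descriptor(0xE8, BUF, 1)  # first transfer: write one byte
read_desc = pack_descriptor(0xE9, BUF, 4)   # second transfer: read four bytes

for name, desc in (("write", write_desc), ("read", read_desc)):
    addr_byte, ptr, length = struct.unpack("<III", desc)
    addr7, is_read = split_address_byte(addr_byte)
    print(name, hex(addr7), "read" if is_read else "write", length, "byte(s)")
```

Both transfers address the same 7-bit device address (&74); only the read/write bit differs.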

Mar 11, 2015 9:49pm Chris Evans (457) 1614 posts	This problem is new. I haven’t had time to narrow down which ROM builds first showed the problem, but it appears to have started a few months ago. I believe an alternative solution would be if there was a way of detecting what hardware you are running on. I’ve seen tests for certain CPUs etc. but not a neat 0=RiscPC, 1=Iyonix, 2=Pi A, 3=Pi B, 4=Pi B rev 2 …. 6=Panda ES… or something like that. Sometimes it is the CPU you need to know, but for things like IIC it is the actual hardware and its revision that need identifying. I’ve probably not explained that well but I hope it makes sense.

Mar 16, 2015 9:56pm Jeffrey Lee (213) 6048 posts	I’ve checked in a fix, so starting from tomorrow there will be some parameter validation to catch cases like this.

> 1. Why does RISC OS return gibberish for bus numbers 2, 4, and 6? They don’t exist; and even if they did, this device would not exist on them.

Lack of parameter validation.

> 2. Why does RISC OS not throw an error for attempts to talk to invalid IIC buses?

Implementation oversight.

> 3. What is it that causes the CMOS RAM to revert to defaults?

Memory corruption, and/or bogus transfers on the IIC bus.

> This problem is new.

Actually, since support for multiple IIC buses was implemented by overloading existing OS_IICOp parameters, bad stuff will happen in one form or another on all versions of RISC OS 5 if you try accessing a bus which doesn’t exist. Old versions which only supported one bus will think you’re trying to perform several million transactions. Chances are it will fail with an IIC error pretty soon after it runs past the end of the valid input, but that might be after it’s corrupted some memory or performed a bogus IIC transfer and corrupted your CMOS.

> I believe an alternative solution would be if there was a way of detecting what hardware you are running on. I’ve seen tests for certain CPUs etc but not a neat 0=RiscPC, 1=Iyonix, 2=Pi A, 3 Pi B, 4 Pi B rev 2 …. 6=Panda ES… or something like that. Sometimes it is what CPU you need to know but for things like IIC it is actual hardware and revisions that need identifying.

There are a few potential solutions to the problems you’re facing, depending on exactly what those problems are. Also I’m not entirely sure what solution ROOL would prefer (but past experience suggests a “0=RiscPC, 1=Iyonix”, etc. solution isn’t one of them).

* If you’re developing a product for something which exports an instance of the GPIO HAL device then you could query that to work out the platform + hardware revision (which in this case would actually give you a 0=RiscPC, etc. style result).
* If you’re developing a product which will most likely be fitted as standard to a certain type of machine then you could make the HAL probe for it and then create a HAL device (containing the IIC bus number and any other relevant info) if it finds it (obviously requires support in the HAL!).
* As a more generic version of the first solution, the GPIO HAL device (or the GPIO module) could be extended to contain information about which IIC buses are available for user expansion, and so by extension you have a list of IIC buses which you can safely probe for things.

This bug also highlights the problem that we don’t really have an official way for third-party code to determine how many IIC buses there are in a system. Arguably third-party software shouldn’t need to know (probing for devices on an IIC bus can be dangerous – there’s no way of knowing that you’ve found the right device, you only know whether your interrogation request has succeeded or not. You could confuse the EDID EPROM for the CMOS RAM, or you might unwittingly be sending bogus commands to a device which doesn’t use the same message protocol as your intended target), but if the choice is between code using the ‘for internal use only’ HAL_IICBuses call or using an official SWI then I think an official SWI would be better.

That then raises the question of where we put the SWI – overriding OS_IICOp further isn’t a sane thing to do due to backwards compatibility issues, so as far as I can see that leaves us with the following choices:

1. New OS_ReadSysInfo reason code
2. New OS_PlatformFeatures reason code
3. New OS_Hardware reason code
4. New SWI altogether

At the moment I’m leaning towards #1, since it already reports assorted low-level information such as 82C71x features, IOEB/IOMD presence flags, etc.
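To make the “several million transactions” remark concrete, here is a small arithmetic sketch in Python (no hardware involved). It takes the `1 + (bus% << 24)` expression from the test program at face value and assumes that a multi-bus kernel treats the top 8 bits of that register as the bus number and the low 24 bits as the transfer count. That split is an inference from the expression, not something stated in this thread, and the function names are hypothetical.

```python
BUS_SHIFT = 24                        # assumed position of the bus number
COUNT_MASK = (1 << BUS_SHIFT) - 1     # assumed mask for the transfer count

def encode_r1(count, bus):
    """Combine a transfer count and bus number as the test program does."""
    return count + (bus << BUS_SHIFT)

def decode_multi_bus(r1):
    """How a multi-bus kernel is assumed to read the value: (count, bus)."""
    return r1 & COUNT_MASK, r1 >> BUS_SHIFT

def decode_single_bus(r1):
    """How a single-bus kernel would read it: the whole value is the count."""
    return r1

r1 = encode_r1(1, 5)                  # one transfer on (non-existent) bus 5
print(decode_multi_bus(r1))           # (1, 5)
print(decode_single_bus(r1))          # 83886081
```

Under that assumption, a kernel that only knows about one bus sees a request for roughly 84 million transfers, far beyond the single 12-byte block actually supplied.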

Mar 16, 2015 10:46pm Rick Murray (539) 13840 posts	Thank you for the fast fix and reply. ;-)

> Actually, since support for multiple IIC buses was implemented by overloading existing OS_IICOp parameters, bad stuff will happen in one form or another on all versions of RISC OS 5 if you try accessing a bus which doesn’t exist.

Ah. Well… poop.

> I believe an alternative solution would be if there was a way of detecting what hardware you are running on. I’ve seen tests for certain CPUs etc but not a neat 0=RiscPC, 1=Iyonix, 2=Pi A, 3 Pi B, 4 Pi B rev 2 …. 6=Panda ES… or something like that.

OS_Hardware returns that. CPU type, board type, and board revision. I take the approach that if OS_Hardware can’t tell me that much, it isn’t a compatible platform. ;-)

> Sometimes it is what CPU you need to know but for things like IIC it is actual hardware and revisions that need identifying.

In my specific case, luckily not. But yes, I’m aware that there were some changes in the Pi’s IIC configuration.

> not entirely sure what solution ROOL would prefer (but past experience suggests a “0=RiscPC, 1=Iyonix”, etc. solution isn’t one of them)

The logical approach would be for a call to return a value indicating how many IIC buses were present and a bitmap indicating which ones (to cater for non-sequential ordering).

> If you’re developing a product for something which exports an instance of the GPIO HAL device then you could query that to work out the platform + hardware revision

Ah, yes. That’s what I meant by the OS_Hardware call.

> Arguably third-party software shouldn’t need to know

…because the information should be available from the OS. :-P

> (probing for devices on an IIC bus can be dangerous – there’s no way of knowing that you’ve found the right device, you only know whether your interrogation request has succeeded or not.

Actually, the device will exist at a specific address (which rules out the ~125 other possible devices). The first thing that I do is write ‘0’ to the device to tell it to begin reading at internal address +0. While this is not guaranteed, many of the IIC devices that I have used (from Teletext chips to ADCs) just “assume” the initial write specifies an offset, so in the rare case that this address is used by something else, it ought to behave the same way. Not a guarantee, fair enough, but it’s the best that can be done in lieu of any other way to do this. Following this, four bytes are read. If they come back as “CJE1” then the device has been positively identified. If they don’t, then I’ve probably just killed the GPU and pretty soon blood will start pouring out of the HDMI socket…

> You could confuse the EDID EPROM for the CMOS RAM, or you might unwittingly be sending bogus commands to a device which doesn’t use the same message protocol as your intended target)

As long as the initial identification succeeds, then things will be okay as it is a very strong positive identification.

> New OS_ReadSysInfo reason code

Don’t get me started on this. I reckon ReadSysInfo ought to be telling a lot more about the environment (stuff the OS ought to already know) in a simpler form than requiring OS_Hardware calls (if/when GPIO is implemented); and OS_PlatformFeatures should maybe indicate stuff like “this machine has NEON” and the like. Sure, all this information is already available, either by calling the HAL or poking CP15; but given that the OS already needs to know more than a little about what it is actually running upon… sharing would be nice. ;-)

Thanks again for the fix.
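The identification handshake described in this post (write a zero offset, read four bytes, compare against “CJE1”) can be sketched against a simulated device. The Python below is only a model: `FakeDevice` is a hypothetical stand-in that treats the first written byte as a register offset, which is the behaviour the post says many IIC devices have but which is not guaranteed. The `0xDE` filler is chosen because it is the Latin-1 code for “Þ”, the character seen in the bogus reply on bus 2.

```python
class FakeDevice:
    """Hypothetical IIC device model: first written byte sets the read offset."""

    def __init__(self, memory):
        self.memory = bytes(memory)
        self.pointer = 0

    def write(self, data):
        # Treat the first byte of a write as the internal register offset.
        if data:
            self.pointer = data[0]

    def read(self, count):
        # Return 'count' bytes from the current offset and advance it.
        chunk = self.memory[self.pointer:self.pointer + count]
        self.pointer += count
        return chunk

def identify(device, signature=b"CJE1"):
    """Write offset 0, read back len(signature) bytes, compare."""
    device.write(bytes([0]))
    return device.read(len(signature)) == signature

real = FakeDevice(b"CJE1" + bytes(12))   # device that carries the signature
bogus = FakeDevice(bytes([0xDE]) * 16)   # device returning filler bytes
print(identify(real), identify(bogus))   # True False
```

The check only tells you that whatever answered at that address returned the expected four bytes; it says nothing about what the initial write did to a device that uses a different protocol, which is the risk raised earlier in the thread.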

Mar 16, 2015 10:56pm Jeffrey Lee (213) 6048 posts	> OS_PlatformFeatures should maybe indicate stuff like “this machine has NEON” and the like.

VFPSupport_Features :-) (although it does require some manual decoding of the MVFR registers to check for the exact features you’re after – the downside of there being about 10 different theoretical VFP/NEON combinations)

Mar 17, 2015 8:34am Sprow (202) 1158 posts	> 1. New OS_ReadSysInfo reason code
> 2. New OS_PlatformFeatures reason code
> 3. New OS_Hardware reason code
> 4. New SWI altogether

OS_ReadSysInfo seems to fit the best as that’s where motherboard controller info lives. I always associate OS_PlatformFeatures with processor features, which IIC isn’t; OS_Hardware seems to be for tickling HAL functions/devices rather than getting numbers; and there aren’t many spare kernel SWIs left so probably best to keep them for something big.

Mar 17, 2015 11:36am Chris Evans (457) 1614 posts	Thanks Jeffrey and Rick, quick work :-)

Mar 17, 2015 12:13pm Rick Murray (539) 13840 posts	> Thanks Jeffrey and Rick quick work:-)

Thank Jeffrey. All I did was reliably break it. (^_^)

Mar 17, 2015 1:30pm Jeffrey Lee (213) 6048 posts	And I was the one who added support for multiple buses in the first place. It’s the circle of life!

Mar 29, 2015 4:28pm Jeffrey Lee (213) 6048 posts	I’ve now added OS_ReadSysInfo 14 as a way of getting the number of IIC buses – expect to see it in tomorrow’s ROMs.

Mar 29, 2015 4:40pm Rick Murray (539) 13840 posts	Thanks. Are buses guaranteed to be contiguous, or is there a call I’ve missed to say which buses are valid?

OMG, I could kill myself. I’ve just spent half an hour trying to track down why a module I’m working on is interacting strangely with my OLED clock/status ticker. I knocked out all of the test module startup code and it was still doing it. Then I tried dropping DADebug calls everywhere and saw all sorts of random SWIs being called. Then it hit me. I originally recycled the CMHG file from OLED and I never changed the SWI chunk! Argh! Stupid! So I’ve set it to &59580 for now. [don’t worry, I’ll apply for a proper registration when the software does stuff]

This does show up an interesting quirk of RISC OS, in that the OLED ticker kept on going while the new module was having its lookup routine splattered with random gibberish (well, OLED strings, but certainly not the data it would have expected). This implies that RISC OS can not only load two modules with the same SWI chunk, but it’ll call them both too. That’s either awesome or distressing, I’m really on the fence about it…
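As background to the chunk mix-up, here is a hedged sketch of how a SWI number breaks down, based on the standard RISC OS numbering as I understand it: chunks are 64 SWIs wide, and bit 17 is the “X” (error-returning) flag. The Python below is only arithmetic. It does not model how the kernel actually searches the module chain, so it cannot confirm or refute the “calls them both” observation. The function name is hypothetical.

```python
X_BIT = 1 << 17     # "X" form of a SWI (errors returned rather than raised)
CHUNK_SIZE = 64     # SWIs per chunk

def split_swi(number):
    """Return (chunk base, offset within chunk, is X form) for a SWI number."""
    plain = number & ~X_BIT
    offset = plain % CHUNK_SIZE
    return plain - offset, offset, bool(number & X_BIT)

base, offset, is_x = split_swi(0x59583)
print(hex(base), offset, is_x)   # 0x59580 3 False
```

Two modules built from the same CMHG file declare the same chunk base, so a given SWI number maps to the same (base, offset) pair for both of them and nothing in the number itself says which module was meant.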

Mar 29, 2015 5:34pm Jeffrey Lee (213) 6048 posts	Yes, buses are guaranteed to be contiguous.
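With a bus count available and buses guaranteed contiguous, a probe loop can be bounded instead of blindly trying 0 to 7. The sketch below (Python, arithmetic only) shows the idea. It assumes that numbering starts at bus 0, as in the original test program, and that the count would come from OS_ReadSysInfo 14 on a real machine. The register in which that call returns the count is not stated in this thread, so the value is simply passed in here. `probe_plan` is a hypothetical helper.

```python
def probe_plan(bus_count, transfers=1):
    """List (bus, R1 value) pairs for every valid bus, assuming buses 0..count-1.

    The R1 value mirrors the 'transfers + (bus << 24)' expression from the
    original BASIC test program.
    """
    return [(bus, transfers + (bus << 24)) for bus in range(bus_count)]

# Hypothetical machine reporting two IIC buses.
for bus, r1 in probe_plan(2):
    print("bus", bus, "R1 =", hex(r1))
```

A system reporting zero buses yields an empty plan, so no transfers are attempted at all. This only avoids non-existent buses; the earlier caveat about probing unknown devices on a valid bus still applies.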

Mar 29, 2015 7:23pm h0bby1 (2567) 480 posts	aaaaa
