Date formats
GavinWraith (26) 1563 posts |
Exotic, ancient and oriental methods to the side, it seems to me that we have 24 methods of expressing dates, multiplied by the number of separator symbols used for dividing the three fields: Day, Month, Year (for example hyphens, spaces, periods, slashes backward and forward, etc). There are 3! = 6 possible orderings of the fields: DMY and YMD being the logical ones, as monotone in size. There are 4 formats for months: literal versus numeric (e.g. July versus 07) times full versus abbreviated or dropping leading zeroes (e.g. Jul, 7). I am inspired to write this because ConvText has a built-in command FLIP DATE which converts full numeric DMY to YMD. Now it is not hard to code routines for converting date formats. The difficult part, as I see it, is extracting from the user which of the (>18)^2 transformations is desired. Do you guide the user through a series of questions (keep or change?), or present them with a dialogue window with defaults to accept or untick? There are obviously lots of possible compromises to be made. Changing the date format is not going to be a frequently used facility. Lots of people do not care anyway. The greater the number of choices to be made, the more tedious for the user. Bacon and eggs is it then? |
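For anyone wanting to check the count of 24, here is a minimal Python sketch that enumerates the 6 orderings times the 4 month styles for one sample date. The function names and the choice of a hyphen as separator are purely illustrative assumptions, and the day is always written with two digits to keep the count at 24.

```python
from calendar import month_abbr, month_name
from datetime import date
from itertools import permutations


def month_styles(d):
    """The four month styles: full name, abbreviation, zero-padded, unpadded."""
    return [month_name[d.month], month_abbr[d.month], f"{d.month:02d}", str(d.month)]


def all_formats(d, sep="-"):
    """Render d in every ordering of D, M, Y for every month style."""
    results = []
    for month in month_styles(d):
        fields = {"D": f"{d.day:02d}", "M": month, "Y": str(d.year)}
        for order in permutations("DMY"):  # 3! = 6 orderings
            results.append(sep.join(fields[k] for k in order))
    return results


if __name__ == "__main__":
    for text in all_formats(date(2021, 7, 15)):
        print(text)
```

Each extra separator symbol multiplies the list again, which is where the choice presented to the user starts to get unwieldy.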
Stuart Painting (5389) 714 posts |
My suggestion would be to take a “sneak peek” at the document to be processed, work out which date format(s) are in use in the document, and present the user with a drop-down menu of the date formats (including all options where the date format is ambiguous). Once they’ve chosen the exact date format they are interested in, you can then guide them through the (by now reduced) options for adjusting the order of the components in the date but keeping other features (separator, numeric/alpha month, drop leading zero) unchanged. If you want to offer something more ambitious (e.g. “change all dates – regardless of format – to a standard format”) you may be reduced to prompting them each time something that looks like a date[1] appears in the document (and for ambiguous entries, getting them to specify which of the possible conversions are to be applied). And yes, choosing the output format is going to be fiddly. Some other things to consider:
In summary, are you sure you want to do this?
[1] Processing a document that contains a lot of bank sort codes (e.g. 77-01-03) is going to be tedious in the extreme. |
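A rough idea of what the “sneak peek” could look like, as a Python sketch. It only handles fully numeric dates with a single-character separator; the regular expression, the function name and the "N" shape notation are assumptions made for illustration, not anything ConvText does.

```python
import re
from collections import Counter

# Three numeric fields joined by the same separator (hyphen, slash, dot or space).
DATE_LIKE = re.compile(r"\b\d{1,4}([-/. ])\d{1,2}\1\d{1,4}\b")


def date_shapes(text):
    """Count the distinct 'shapes' of date-like strings found in text.

    Every digit is replaced by 'N', so 15/07/2021 becomes NN/NN/NNNN.
    The resulting counts could populate a drop-down of candidate formats.
    """
    shapes = Counter()
    for match in DATE_LIKE.finditer(text):
        shapes[re.sub(r"\d", "N", match.group(0))] += 1
    return shapes


if __name__ == "__main__":
    sample = "Paid 03/07/2021, due 15/07/2021; sort code 77-01-03."
    print(date_shapes(sample))
```

Note that the sort code in the sample is counted as a date-like shape too, which is exactly the tedium the footnote warns about.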
GavinWraith (26) 1563 posts |
Good point. There is a huge amount of hype about AI in the media, as if it were a panacea, a magic spell carved on a wand and waved over the problem. I suppose eventually we will have document processors that not only correct our spelling and grammar but also insert metadata. But will we be able to sue our machines for misrepresentation? In Time Waits for Winthrop a future in which one may sue one’s own subconscious is described. |
Rick Murray (539) 13850 posts |
I like to think the acronym stands for Artificial Idiocy.
That’s about it. Once upon a time, these sorts of people sold snake oil and tinctures. Now they sell AI solutions.
Your examples miss whether or not st, nd, rd, th is added after the number. So that’s (yet) another set of possibilities.
That’s not hard to do if you restrict yourself to those formats. Dead easy really, just copy the string. Search for any non-numerical (so it doesn’t matter what odd separator the user picked), then atoi it from a string to an integer, and rebuild. It’s like a ten line function. The more interesting question comes when you are presented with a date like: 06-12-21. That is valid in DMY (6th December 2021), MDY (12th June 2021), and YMD (21st December 2006). So… which is it? ;-)
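Roughly what that looks like, sketched in Python rather than C (so int() stands in for atoi). The function names are made up for illustration, the flip insists on a four-digit year, and the ambiguity check assumes every two-digit year belongs to the 2000s, which is a simplification.

```python
import re
from datetime import date


def flip_dmy_to_ymd(text):
    """Reorder a fully numeric D-M-Y date to Y-M-D, reusing its separators."""
    # Split on runs of non-digits, keeping them so any separator survives.
    parts = re.split(r"(\D+)", text.strip())
    if len(parts) != 5 or len(parts[4]) != 4:
        raise ValueError(f"not a full numeric DMY date: {text!r}")
    d, sep1, m, sep2, y = parts
    day, month, year = int(d), int(m), int(y)
    date(year, month, day)  # raises ValueError if the date is impossible
    return f"{year:04d}{sep1}{month:02d}{sep2}{day:02d}"


def interpretations(text, century=2000):
    """Return every ordering under which three numeric fields form a real date."""
    fields = [int(f) for f in re.findall(r"\d+", text)]
    if len(fields) != 3:
        raise ValueError(f"expected three numeric fields: {text!r}")
    found = {}
    for order in ("DMY", "MDY", "YMD"):
        named = dict(zip(order, fields))
        try:
            found[order] = date(century + named["Y"], named["M"], named["D"])
        except ValueError:
            pass  # this ordering gives an impossible date, so rule it out
    return found


if __name__ == "__main__":
    print(flip_dmy_to_ymd("15/07/2021"))
    print(interpretations("06-12-21"))
```

For 06-12-21 all three orderings survive, so the code cannot decide and somebody has to ask the user.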
Much easier said than done!
There’s usually a sliding scale of “dates after this are 19xx, before this are 20xx”. Isn’t the usual cut-off sort-of close to now? (1935, or something, isn’t it?)
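A minimal sketch of that sliding scale in Python. The pivot of 35 is only the guess made above, not a documented value for any particular system, and the function name is hypothetical.

```python
def expand_two_digit_year(yy, pivot=35):
    """Expand a two-digit year using a fixed cut-off.

    Values at or above the pivot are taken as 19xx, values below as 20xx.
    The default pivot of 35 is an assumption for illustration only.
    """
    if not 0 <= yy <= 99:
        raise ValueError(f"not a two-digit year: {yy}")
    return 1900 + yy if yy >= pivot else 2000 + yy


if __name__ == "__main__":
    for yy in (21, 34, 35, 99):
        print(yy, expand_two_digit_year(yy))
```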
Good luck with 令和3年7月15日 ;-) Usually you can parse Japanese dates fairly easily looking for the symbols, they’re in the form 2021年7月15日, and they usually use western numbers rather than 二千二十一年七月十五日 which is possible but unlikely. However the example given with the smiley refers to the official date format, which is to refer to the era name. 2021 is the third year of the Reiwa era, after Emperor Akihito’s abdication. So, to repeat what Stuart said:
:-) |
Rick Murray (539) 13850 posts |
BTW, whilst you’re extremely unlikely to actually run into Japanese dates (especially given RISC OS’ advanced foreign character support ;p ), it does provide a rather nice example of a place that has three completely different ways of writing the exact same date (today), all of which are valid. |
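For the curious, a small Python sketch that copes with two of those three forms: the Gregorian year with western digits and the era-name form. The era table and function name are illustrative assumptions; kanji numerals, the 元年 spelling of an era's first year, and the exact day on which an era changes are all left out.

```python
import re
from datetime import date

# Gregorian year corresponding to year 1 of each era (only two eras listed).
ERA_FIRST_YEAR = {"令和": 2019, "平成": 1989}

JAPANESE_DATE = re.compile(r"(令和|平成)?(\d+)年(\d+)月(\d+)日")


def parse_japanese_date(text):
    """Parse dates such as 2021年7月15日 or 令和3年7月15日."""
    match = JAPANESE_DATE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised date: {text!r}")
    era, year, month, day = match.groups()
    year = int(year)
    if era is not None:
        # Era years count from 1, so year 3 of Reiwa is 2019 + 3 - 1 = 2021.
        year += ERA_FIRST_YEAR[era] - 1
    return date(year, int(month), int(day))


if __name__ == "__main__":
    print(parse_japanese_date("2021年7月15日"))
    print(parse_japanese_date("令和3年7月15日"))
```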
Steve Pampling (1551) 8172 posts |
That collection of idle code that generated the “fun” on the rollover to the first day of the last century of the Gregorian calendar. Some of that tat is still in use.[1]
If you want to be nit-picky, AD is a regnal year system anyway.
[1] No one would be stupid enough to integrate that stuff into a system that came into use in the 2000s, would they? Ah… |
Chris Hall (132) 3558 posts |
Two eggs or one? Rosewood, mahogany, teak? |
Chris Mahoney (1684) 2165 posts |
Just small pieces of lightly-buttered kipper for me, thanks. |
Chris Hall (132) 3558 posts |
Guess what – the NMEA sentences output by GPS receivers specify the date without the century (i.e. a two-digit year). OK, we know it is after 1970 (or even 1980), so it’s all right until 2070 (or 2080). Not my problem.
“we have 24 methods of expressing dates”
There is a standard way of expressing a date: |
David J. Ruck (33) 1636 posts |
IIRC the Risc PC RTC only gave years modulo 4, so base year number had to be stored in CMOS. |
Dave Higton (1515) 3534 posts |
I can confirm this. The PCF8583 is the first IIC device I ever worked with. I remember it well. Still available. Bus speed still 100 kHz. |
Jeff Doggett (257) 234 posts |
Well, actually, since time doesn’t run backwards (citation needed), we can be pretty sure that it’s at least 2021. |
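That lower bound is enough to resolve both the two-digit NMEA year and the modulo-4 RTC year, as long as the true date is less than one full cycle past the bound. A Python sketch, with a made-up function name, and assuming the RTC counter simply equals the year modulo 4:

```python
def resolve_truncated_year(value, modulus, not_before):
    """Return the smallest year >= not_before with year % modulus == value.

    modulus is 100 for a two-digit year, or 4 for a two-bit RTC year
    counter (assuming the counter equals year % 4).
    """
    if not 0 <= value < modulus:
        raise ValueError(f"value {value} out of range for modulus {modulus}")
    return not_before + (value - not_before) % modulus


if __name__ == "__main__":
    print(resolve_truncated_year(21, 100, 2021))  # two-digit year
    print(resolve_truncated_year(2, 4, 2021))     # two-bit RTC year
```

Storing the last known year (as the CMOS base year does) and using it as not_before keeps the answer right provided the clock is read at least once per cycle.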