LibreOffice HTML tidy-up app
Clive Semmens (2335) 3276 posts |
I’ve written a little RISC OS app that takes an HTML file generated by LibreOffice and strips out all the extraneous formatting information that you don’t really want – this is supposed to be HyperText Markup Language, not a detailed format description! Of course it leaves in your italics, headings etc. http://clive.semmens.org.uk/RISCOS/XP1LO2web.html [edited to the new name!] There’s more information about exactly what it’s doing on that linked page – and a link to download the app. It’ll probably do a reasonable job on HTML from other word processors or DTP packages, but I’ve not tested it on those. If you send me samples of HTML files from such packages, I might have a bash at widening its applicability. |
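To give a flavour of what "stripping out the extraneous formatting" could mean, here is a minimal Python sketch. It is not the code of the app itself (see the linked page for what that really does). The function name `strip_presentation` and the list of attributes treated as presentational are assumptions made for illustration, and the regex approach is deliberately crude compared with a real HTML parser.

```python
import re

# Hypothetical choice of attributes regarded as page-layout clutter.
PRESENTATIONAL_ATTRS = ("style", "class", "lang", "align", "dir")

# One presentational attribute, with a quoted or unquoted value.
_ATTR_RE = re.compile(
    r'\s+(?:%s)\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)' % "|".join(PRESENTATIONAL_ATTRS),
    re.IGNORECASE,
)
# <font ...>, </font>, <span ...>, </span>: wrappers that only carry formatting.
_WRAPPER_RE = re.compile(r"</?(?:font|span)\b[^>]*>", re.IGNORECASE)
# Any opening tag (closing tags carry no attributes, so they are left alone).
_TAG_RE = re.compile(r"<[a-zA-Z][^>]*>")


def strip_presentation(html: str) -> str:
    """Drop font/span wrappers and presentational attributes, keep the rest."""
    html = _WRAPPER_RE.sub("", html)
    return _TAG_RE.sub(lambda m: _ATTR_RE.sub("", m.group(0)), html)


sample = '<p style="margin-bottom: 0cm" class="western"><font face="Arial"><i>Hi</i></font></p>'
print(strip_presentation(sample))  # <p><i>Hi</i></p>
```

Italics, headings and links survive because only the listed attributes and the two wrapper tags are touched.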
Clive Semmens (2335) 3276 posts |
Discovered a bug. It’s unlikely to bite you: it only affects files with footnotes, and only a minority of them. Will be fixed later today anyway. I’m also going to change its name. It will be !XP1LO2web. |
Clive Semmens (2335) 3276 posts |
Done. http://clive.semmens.org.uk/RISCOS/XP1LO2web.html is now available and, as far as I know, bug-free. Bug reports or upgrade suggestions welcome; I might even act on them… |
Steve Pampling (1551) 8155 posts |
Have you tried it on an export of Firefox bookmarks? 1 What would you call an embedded screenshot that you can only see when you open the bookmark to edit it while on the page that the snapshot came from… Rant: |
Clive Semmens (2335) 3276 posts |
No, it had never occurred to me. I’ll take a look and see whether it’s a feasible prospect at all… The bloat in LibreOffice HTML is entirely understandable: they actually do their best to make your browser display what your word processor would print… plus a bit of “you’ve deleted something here that was in italics, so we’ll delete the text and leave the tags behind, ha!” My assumption is that most of that was incidental to the content, and you want to leave it to the browser to choose your default font etc. etc. Whenever I take a look at the HTML source of other people’s web pages, I think to myself, “What the hell package did they use to write all that crap, and what the hell is it all for?” but perhaps if I’m going to write a debloating app I ought to try to work out what I can safely remove. Needless to say, I actually wrote this app for my own use. I write a lot of text in LibreOffice… – but I thunk perhaps it might be useful to other folk. But I will take a look at Firefox bookmarks, anyway. |
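The "text deleted, tags left behind" case described above can be tidied by removing inline tag pairs that enclose nothing. The following is a rough Python sketch under that assumption. The function name `drop_empty_inline_tags` and the list of tags are illustrative choices, not taken from the app.

```python
import re

# An inline open tag immediately followed by its own close tag, e.g. <i></i>.
# Pairs containing whitespace are kept so that words are not run together.
_EMPTY_PAIR_RE = re.compile(
    r"<(i|b|u|em|strong|sup|sub)\b[^>]*></\1\s*>", re.IGNORECASE
)


def drop_empty_inline_tags(html: str) -> str:
    """Repeatedly remove empty inline pairs until nothing more changes."""
    previous = None
    while previous != html:
        previous = html
        html = _EMPTY_PAIR_RE.sub("", html)
    return html


print(drop_empty_inline_tags("<p>one <i><b></b></i>two</p>"))  # <p>one two</p>
```

The loop is there because removing an inner empty pair can leave its parent empty in turn.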
Clive Semmens (2335) 3276 posts |
Blimey. I’ve taken a look. I’ll take more of a look later, but it’s not looking good…what an INCREDIBLE pile of crap!!! |
Clive Semmens (2335) 3276 posts |
It isn’t really an HTML file at all… That said, I’ve managed to work out what’s going on, more or less. Definitely unwise to try to edit it, but perfectly possible to create a new, far smaller (proper HTML) file that simply displays a list of links to the sites that are bookmarked. Not sure whether I can make it group them in folders the way Firefox does. I’m inclined to make it a separate app though, rather than making LO2web do it. Is it worth the (not huge) effort? Everything in the bookmarks file appears to be meaningful, although some of it may be carrying information that’s of no interest to you. In the case of the LibreOffice HTML files, I know what information I’m trying to expurgate (formatting that relates to page layout rather than content – and archaeological remnants); in the bookmarks file, it’s not clear to me what to get rid of. The obvious biggie is all the little icons, but I’m inclined to think they might be wanted really. |
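The "new, far smaller file that simply displays a list of links" idea could look something like this in Python. It is only a sketch: it assumes each bookmark appears as an `<A HREF="...">title</A>` entry, ignores folders entirely, and the names `BookmarkLinks` and `bookmarks_to_link_list` are made up for the example.

```python
from html import escape
from html.parser import HTMLParser


class BookmarkLinks(HTMLParser):
    """Collect (href, title) pairs from every <a href=...> in the input."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def bookmarks_to_link_list(bookmarks_html: str) -> str:
    """Return a small, plain HTML page listing the bookmarked links."""
    parser = BookmarkLinks()
    parser.feed(bookmarks_html)
    parser.close()
    items = "\n".join(
        '<li><a href="%s">%s</a></li>' % (escape(href, quote=True), escape(title or href))
        for href, title in parser.links
    )
    return (
        "<!DOCTYPE html>\n<html><head><title>Bookmarks</title></head>\n"
        "<body><ul>\n%s\n</ul></body></html>\n" % items
    )
```

Everything except the link target and its title is discarded, including any icon data, so whether the result is acceptable depends on how much of the other information is actually wanted.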
Steve Pampling (1551) 8155 posts |
Possibly best not to pollute what you have, but then
Probably not, unless you wanted to show Mozilla the error of their ways
Fripperies. Of course, you’ve seen my comments about the Gravatar stuff on these forums so that comment won’t be a surprise. Links are the thing. Pictures in things should have a clear purpose and contribute additional, required, information. Piccies in things for no reason:
|
Steve Pampling (1551) 8155 posts |
Is it worth the (not huge) effort? Having said that:
and ending with a |
Clive Semmens (2335) 3276 posts |
8~) When I’m looking down my bookmarks, I spot the little icon of the thing I’m looking for…however, the “icon” data in the bookmarks file looks like a lot more than is needed to create those little icons. But I’ve not investigated yet whether there’s any way to cut it down a bit. Obviously you could remove the icon data and retrieve the icon from the icon link, but that would make it pretty slow.
|
Steve Pampling (1551) 8155 posts |
Those are part of the details in the view of bookmarks when you do a Show All Bookmarks from the menu or Ctrl-Shift-B; last modified would update the next time you saved the link (perhaps with additional tags). Yes, the ICON data is the bloat, and totally, totally useless. |
Clive Semmens (2335) 3276 posts |
Well, I can easily make a little app that clears out that icon data 8~) – I’ll do that, even if you’re actually the only person who wants it! 8~) I can probably even put in some tiny alternative “icon data” to fool Firefox (and whatever other browser uses Netscape Bookmark files) into accepting the file. We’ll see… |
Steve Pampling (1551) 8155 posts |
If I could remember the syntax I could probably get AWK to do it. |
Clive Semmens (2335) 3276 posts |
It’s a very easy thing to do – when I’m back at the Pi. Just now I’m on a MacBook in the living room, being (relatively) sociable… 8~) …gotta be sociable, it’s my (71st) birthday… 8~) |
Rick Murray (539) 13806 posts |
Sorry Authentic Steve, but some of us are more visually based. I have little pictures in the entries of my phone’s address book. The (very!) few times I call anybody, it’s usually by looking for the picture and not reading the words. I can recognise Clive easily. Etc etc. Pictures good. :-) |
Clive Semmens (2335) 3276 posts |
I like the little icons in the bookmarks sidebar too, but they really oughtn’t to need quite so much gibberish to produce them… But I’m perfectly happy to oblige Authentic Steve with an app to clean them out for him – the rest of us don’t have to use it! |
Rick Murray (539) 13806 posts |
🎂 HAPPY BIRTHDAY 🎂 |
Steve Pampling (1551) 8155 posts |
They don’t. The ICON_URI attribute just before is what does that.
The “gibberish” in ICON= is for the bit you only see when editing a bookmark. Save a bookmark. Yes, that level of stupidity
1 OK, there’s a few million across the pond. |
Rick Murray (539) 13806 posts |
A few million? By last count, a mite under seventy-one million. |
Steve Pampling (1551) 8155 posts |
Well, I think Trump has more than 71 million, but lots of them have difficulty writing an X. |
David J. Ruck (33) 1629 posts |
@Rick it’s meant to be a red dragon, but it’s too small to see on the forum. Here is a bigger version. It was originally a Draw file called BlueDragon, but I filled it with the mica red paint from my RX-8. |
John WILLIAMS (8368) 493 posts |
But if you squint a bit it looks like a bloke with a pointy beard and long lower lashes looking to his left. He might have some idiosyncratic moustache hair as well! So really it’s a failure as an icon!
I’d like to have an icon, but could never get whatever it is to work (some sort of page-making site, I forget). I’ve got a lovely one of my very handsome dog, one of me as Ché G., and one of me as I was once – but … |
Clive Semmens (2335) 3276 posts |
I like it, David! My Facebook avatar is a dragon, but very different from that one – it’s the head of the dragon in the cover picture of my book, Birgom’s Diary – the dragon is the bowsprit of a little ship that figures prominently in the book. http://clive.semmens.org.uk/Fiction/Birgom.html And Thank You Rick! 8~) |
Clive Semmens (2335) 3276 posts |
That explains it. I wonder what will happen if I expurgate that altogether? We’ll see. Failing that, there are two tricks I can play on it: one is to duplicate the shortest example (which is much shorter than the others); if that doesn’t work I can create a proper HTML file instead of a Netscape Bookmarks file – with just the links, and maybe the wee icons. Later. |
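Expurgating the `ICON=` data altogether, while leaving `ICON_URI` alone, might be sketched like this in Python. It assumes the data sits in a double-quoted `ICON="..."` attribute as discussed above. The function name `strip_icon_data` is hypothetical, and whether Firefox will accept the resulting file is exactly the open question, so treat this as an experiment rather than a fix.

```python
import re

# Matches ICON="..." but not ICON_URI="...", because "=" must follow "ICON" directly.
_ICON_ATTR_RE = re.compile(r'\s+ICON="[^"]*"', re.IGNORECASE)


def strip_icon_data(bookmarks_html: str) -> str:
    """Remove every embedded ICON="..." attribute from a bookmarks export."""
    return _ICON_ATTR_RE.sub("", bookmarks_html)


line = (
    '<DT><A HREF="https://example.org/" ADD_DATE="1600000000" '
    'ICON_URI="https://example.org/favicon.ico" '
    'ICON="data:image/png;base64,AAAA">Example</A>'
)
print(strip_icon_data(line))
```

If the browser turns out to insist on an `ICON=` value, the same substitution could write a short placeholder instead of an empty string, which corresponds to the "duplicate the shortest example" trick.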
John WILLIAMS (8368) 493 posts |
Or it could be (at displayed scale) a pair of red lips dribbling blood! |