Rob's Ramblings

Monday, 17 September 2018

Some contemplating on frame storage formats and clashes therein

I recently posted a little bit on how I now store contributed videotex (teletext and viewdata) frames within a database, so as to make accessing them far easier on the application side.

To do this, I had to decide on exactly how to store the visible content of the frame.  Everything else is easy; I created a secondary table holding key=>value pairs, which means it is very easily expandable, and any application needing particular data can go look for its own, and not be confused by anything extra.
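A minimal sketch of that layout, using Python's built-in sqlite3 module. The table names, column names and helper functions (frame, frame_meta, add_frame, get_meta) are made up for illustration and are not the actual schema:

```python
import sqlite3

# In-memory database for the sketch; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE frame (
    id      INTEGER PRIMARY KEY,
    content BLOB NOT NULL              -- the visible frame content
);
CREATE TABLE frame_meta (
    frame_id INTEGER NOT NULL REFERENCES frame(id),
    key      TEXT NOT NULL,
    value    TEXT NOT NULL,
    PRIMARY KEY (frame_id, key)        -- one value per key per frame
);
""")


def add_frame(content, meta):
    """Store a frame plus any number of key=>value pairs; return its id."""
    cur = conn.execute("INSERT INTO frame (content) VALUES (?)", (content,))
    frame_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO frame_meta (frame_id, key, value) VALUES (?, ?, ?)",
        [(frame_id, k, v) for k, v in meta.items()],
    )
    return frame_id


def get_meta(frame_id, key, default=None):
    """Fetch only the key an application cares about; ignore the rest."""
    row = conn.execute(
        "SELECT value FROM frame_meta WHERE frame_id = ? AND key = ?",
        (frame_id, key),
    ).fetchone()
    return row[0] if row else default
```

New kinds of metadata need no schema change: an application simply writes a new key, and everything else carries on ignoring it.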

So. The frame content itself.  I didn't get much help looking at existing storage formats, as I've got at least 17 types documented, and others I know about.  I may however have been influenced somewhat by them.

When you think about a viewdata frame, or a teletext page, you automatically see the 23-25 lines by 40 columns of static image.  Almost every frame you will find that has been saved out by a terminal emulator, or by a teletext capture, will consist of those 920, 960 or 1000 bytes of data, perhaps with some meta-data accompanying, sometimes not.  I think that every third-party viewdata host that I have so far encountered also stored its pages so.  Individual characters took up a single byte as per their ASCII character code, and colour and control characters were also stored as a single byte.  For teletext, this uses the non-display codes below the space, as there is no concept of cursor movements, carriage returns, etc, on a teletext screen, which is what these values are used for in a serial-terminal based service.
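The arithmetic is simple enough to sketch: 23, 24 or 25 rows of 40 bytes give 920, 960 or 1000 bytes. The helper names below (rows_in_frame, split_rows) are hypothetical, and the sketch assumes any accompanying meta-data has already been stripped off:

```python
COLUMNS = 40
VALID_ROWS = (23, 24, 25)   # 920, 960 or 1000 bytes in total


def rows_in_frame(data):
    """Work out how many 40-column rows a raw saved frame holds."""
    rows, remainder = divmod(len(data), COLUMNS)
    if remainder or rows not in VALID_ROWS:
        raise ValueError("not a plain 40-column frame: %d bytes" % len(data))
    return rows


def split_rows(data):
    """Cut the flat byte string into one bytes object per screen row."""
    rows = rows_in_frame(data)
    return [data[r * COLUMNS:(r + 1) * COLUMNS] for r in range(rows)]
```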

Prestel, and viewdata generally, is however serial.  Frames are sent to the user as ASCII characters, but the colour and control codes are sent as command sequences:  Escape then a capital letter.  So, what might be stored in a teletext page as "<01>RED<02>GREEN<07>WHITE" would be sent to a viewdata terminal as "<ESC>ARED<ESC>BGREEN<ESC>GWHITE".  Short lines would be terminated by a carriage return and linefeed, so reducing the need to send the whole 40 characters.
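As a rough sketch of that conversion: each stored code below the space becomes Escape followed by the code plus 64, and short lines are ended with CR/LF. The function name to_viewdata is hypothetical, and the sketch assumes that trailing spaces can be dropped and that a full 40-character row needs no CR/LF because the terminal wraps by itself:

```python
ESC = 0x1B
COLUMNS = 40


def to_viewdata(matrix):
    """Turn a flat 40-column, 7-bit frame into the bytes sent down the line."""
    out = bytearray()
    for start in range(0, len(matrix), COLUMNS):
        # Assumption: trailing spaces on a row carry no information.
        row = matrix[start:start + COLUMNS].rstrip(b" ")
        for byte in row:
            if byte < 0x20:
                # Stored control code -> Escape + capital letter (code + 64).
                out += bytes([ESC, byte + 0x40])
            else:
                out.append(byte)
        if len(row) < COLUMNS:
            # Short line: finish it early with carriage return and linefeed.
            out += b"\r\n"
    return bytes(out)
```

Feeding it the example row above, padded to 40 columns, gives Escape A, RED, Escape B, GREEN, Escape G, WHITE and a CR/LF.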

Now.. Prestel itself is known to have stored the frame data exactly as it would be sent to the user.  There was a hard limit of 920 bytes available to the editor to use, and colour codes, etc, took up two of them.  This made creating complicated graphical pages somewhat difficult, as too many colour changes could quickly eat up all the allocation.  (Response frames were even worse; you only got 716 bytes to play with!)  This is probably why all third party viewdata servers stored their pages as the 22x40 character full image, with the control codes stored as per teletext.  Doing this allowed for much more colour and graphic rich content than was possible on Prestel itself - the conversion was done on transmission.  The actual codes stored varied - some systems used 7 bit data throughout, some used top-bit set letters to indicate that the letter needed an escape sending before it, some used 7 bits for visible characters and top-bit set control codes (codes in the range 128-159), and at least one had everything with the top bit set!
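The four conventions can all be folded down to the same 7-bit codes. The sketch below is illustrative only: the variant names are invented, the caller is assumed to know which convention a file uses, and the exact byte ranges for each convention are my reading of the descriptions above rather than a documented specification:

```python
def normalise(data, variant):
    """Fold one of the storage conventions down to 7-bit control codes."""
    out = bytearray()
    for b in data:
        if variant == "seven_bit":
            # Already in the target form: codes 0-31 are controls.
            out.append(b)
        elif variant == "high_letters":
            # '@'..'_' with the top bit set (0xC0-0xDF) means "Escape + letter".
            out.append(b - 0xC0 if 0xC0 <= b <= 0xDF else b)
        elif variant == "high_controls":
            # Control codes stored as 128-159; visible characters are 7-bit.
            out.append(b - 0x80 if 0x80 <= b <= 0x9F else b)
        elif variant == "all_high":
            # Every byte has the top bit set, so simply strip it.
            out.append(b & 0x7F)
        else:
            raise ValueError("unknown variant: " + variant)
    return bytes(out)
```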

So fast forward 30 years, and I'm writing code to handle saved viewdata pages and display them on this new-fangled World Wide Web thing.  There is zero support for viewdata and teletext format images, so we have to roll our own, converting saved pages in any number of formats into PNG or GIF (to account for flashing characters) images that a web browser can display.

As an intermediate stage, I have to pull that 22-24x40 matrix of characters out, before plotting them onto a graphics image for sending to the viewer.  This intermediate block of characters I called an "internal" format, and was 7-bit clean, so codes below space for the colour codes, and the rest visible.

For nearly ten years this worked fine, and this internal, intermediate format, was the format used when I created the page database.

It is only this week I hit a problem with this, and it is down to a peculiarity with how Prestel stores Response Frames.  (And, I assume, other frames that are not simple static pages.)

A response frame contains a number of fields that are defined by the editor when they create it, and are either filled automatically by the Prestel server when it displays the page, or  can contain text or data to be entered by the user.  When the user hits # on the last field, they are given the option to send (or not) the page to the IP.  It is then delivered to their mailbox in a filled-in state.

When defining a response frame in the standard Prestel online editor, a field is specified by typing, e.g. Ctrl-L n 30 Ctrl-L, which will create a field of 30 characters length containing the subscriber's name - on pressing the second Ctrl-L the system will display 30 "n"s in the required position.  The same procedure is repeated for any other field you request.  What gets stored in the Prestel database is a single Ctrl-L and 30 "n"s.
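Picking the fields back out of the as-stored data might look something like the sketch below. The function name find_fields is hypothetical, and it assumes a field is exactly a Ctrl-L followed by a run of one repeated character; ordinary text starting with the same letter directly after a field would be miscounted as part of it:

```python
CTRL_L = 0x0C   # start-of-field marker in the stored frame


def find_fields(stored):
    """Return (offset, field letter, length) for each field in stored data."""
    fields = []
    i = 0
    while i < len(stored):
        if stored[i] == CTRL_L and i + 1 < len(stored):
            letter = stored[i + 1]
            j = i + 1
            # Count the run of identical letters that follows the marker.
            while j < len(stored) and stored[j] == letter:
                j += 1
            fields.append((i, chr(letter), j - i - 1))
            i = j
        else:
            i += 1
    return fields
```

For a stored line of "Name:", a Ctrl-L and 30 "n"s, this reports one field of type "n" and length 30 at offset 5.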

When you retrieve a page from Prestel using the "Bulk" Online editor, it is sent exactly as stored, so you get the Ctrl-L and sequence of letters alongside the Escape'd colour codes and CR/LFs for short lines.  Uploading a replacement frame you specify the layout in the same manner.

Those of you familiar with the standard ASCII control codes will recognise that Ctrl-L is also known as "Clear Screen", and is a character that is usually sent before sending the frame content.  This is probably why it was used for this purpose - finding it in the middle of the frame content would not make sense, so it was re-purposed as a flag for start-of-field.  Obviously this is never actually sent to the user, but is replaced by a space when viewing on a terminal.

Now ...

I have two small databases in my possession that were pulled back down from Prestel at some point, and these include a number of Response Frames.

When I converted the data to my "internal" format to load them into the database, this normalised the control codes to 7-bit data, filling the lower 32 codes of the table.  On displaying, these codes were sent as <Esc><code + 64>, thus recreating the colour sequences.

When it comes to a <Ctrl-L>, however, this was never stored in the database - the normalisation routine ignored it.  However, even if it had been saved, on recall, it would have been translated into an <Esc>L, the sequence to end double-height text.

So, to summarise, the normalisation I did, in most cases, lost the start-of-field character because it wasn't expected in a frame.  And if it did make it through, it would be indistinguishable from the "Single Height" code, and as that was allowed anywhere in a response frame, it couldn't be deduced from context.
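The clash is easy to show. The sketch below is not the actual normalisation routine (which dropped the Ctrl-L altogether); it is a simplified stand-in, with hypothetical function names, for the case where the raw Ctrl-L is passed through untouched:

```python
ESC = 0x1B
CTRL_L = 0x0C   # raw start-of-field marker in data pulled from Prestel


def normalise_stored(stored):
    """Convert as-transmitted data (Escape + letter) to 7-bit codes."""
    out = bytearray()
    i = 0
    while i < len(stored):
        b = stored[i]
        if b == ESC and i + 1 < len(stored) and 0x40 <= stored[i + 1] <= 0x5F:
            out.append(stored[i + 1] - 0x40)   # Escape L -> 0x0C
            i += 2
        else:
            out.append(b)                      # raw Ctrl-L is also 0x0C
            i += 1
    return bytes(out)


def expand(code):
    """Recreate the sequence sent to the terminal: Escape + (code + 64)."""
    return bytes([ESC, code + 0x40])
```

Both normalise_stored(b"\x0cnnn") and normalise_stored(b"\x1bLnnn") come out as the same bytes, and expand(0x0C) sends Escape L either way, so the field marker cannot be recovered.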

I never noticed, because there were so few frames affected, and there was no need to process the fields the code indicated, anyway!

This last month, however, I've been working on a viewdata host program that will run on a modern server, and which I could use to recreate the look and feel of using the original Prestel service.  I've been testing this using an actual Prestel terminal, and it's been great fun!  It's only when I stumbled across one of these response frames, and decided to support them, that I discovered this problem!

Looking into how other file formats solved this, it seems that at least one of them uses <Esc> itself as the field indicator.  If stored in the database like that, when expanded on recall this would translate into an unused code sequence in viewdata, so it is a suitable alternative.  I will translate the affected pages, eventually!
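A sketch of how that might work, with hypothetical function names and a made-up field_aware switch. The raw Ctrl-L is stored as a bare Escape code (27) in the 7-bit matrix; a naive expansion then produces Escape followed by "[" (27 + 64 = 91), while a host that knows about fields can send a space instead, as Prestel did:

```python
ESC = 0x1B
CTRL_L = 0x0C
FIELD_MARK = ESC   # assumption: a bare 0x1B in the matrix marks a field start


def normalise_with_fields(stored):
    """As-transmitted data -> 7-bit matrix codes, keeping field markers."""
    out = bytearray()
    i = 0
    while i < len(stored):
        b = stored[i]
        if b == ESC and i + 1 < len(stored) and 0x40 <= stored[i + 1] <= 0x5F:
            out.append(stored[i + 1] - 0x40)   # Escape + letter -> code
            i += 2
        elif b == CTRL_L:
            out.append(FIELD_MARK)             # raw Ctrl-L -> field marker
            i += 1
        else:
            out.append(b)
            i += 1
    return bytes(out)


def expand(matrix, field_aware=False):
    """Matrix codes -> bytes for the terminal."""
    out = bytearray()
    for code in matrix:
        if code == FIELD_MARK and field_aware:
            out.append(0x20)                   # shown to the user as a space
        elif code < 0x20:
            out += bytes([ESC, code + 0x40])   # includes Escape '[' for 0x1B
        else:
            out.append(code)
    return bytes(out)
```

The single-height code and the field marker now stay distinct through a round trip.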


So, a decision taken about 10 years ago came back to bite me this week. And it's all to do with 25 year old data in a file format determined 40 years ago that everyone else decided needed to be done differently.

Well done for making it this far!


As an aside .. Prestel added support for "Dynamic frames" which were basically frames that could contain cursor movement characters.  This meant you could go back and change things after you had already drawn them.  This was easy for them, as they stored data in an as-transmitted form anyway.  It's not so easy for host software that expects its frames to be stored in a fixed matrix!  I'll be working on this, once I find some original examples....



Friday, 14 April 2017

Retrochallenge Day 14

A little more work today.

The teletext viewer javascript currently expects to find the entire teletext service held within the html of the web page as encoded links.  This is great for a static archive, as it places very little load on the webserver.

I had created a quick bit of php that could construct the html page when loaded, which was the source of the pages I linked to last time.  This is all well and good, but does not help with the sort of interactive services I would like to use the viewdata version for.

So today's work has been adding functionality to request pages from a server, thus allowing for ever-changing pages to be served up instead.  This involved not only modifying the JavaScript, but writing a server in php to deliver the requested pages.   This also brought out some previously unnoticed limitations in my viewdataviewer class, which I was using to parse the stored data serverside.

So, the class has had some fixes added, server written, and viewer updated.  You can play with it here, although there isn't actually anything dynamic on it at the moment.  Subpages aren't quite working right yet, but most of the rest is!

(I did break the first demo, btw, in case you were reading yesterday's post today, when I fixed a "bug" in the class file, but hadn't removed the work-around in the html generator!  It's fixed, now!)





Thursday, 13 April 2017

Retrochallenge day 13

Blimey, where are the days going...

OK. I've taken the hacked-about teletext-editor-that-acts-as-a-viewer, and split off the hacks.  Then I modified (the latest version of) the teletext editor so that it exposes a bit more of its internals, as the viewer needs that access...

It took a bit of trial and error, but seems to work.  Code is up on github and a demo is (temporarily) here. I added a touch of code to allow direct linking to specific pages while I was at it!

Now to look at doing the viewdata browser that I was supposed to be doing in the first place!




Monday, 10 April 2017

Retrochallenge Day 2.. er... 8 ..er .. 10

Blimey, has it been a week already?

OK.  I've not done much coding since last time, but I have been reading code and daydreaming planning out my next move.

Now Javascript is not my strongest language.  I can read it, and modify it, but actually writing new code is a bit of a challenge.  Part of the rationale behind this task was to get myself a bit more familiar with this hideously back to front language...

The teletext browser I used is that created by Adam Dawes, based on Simon Rawles' (et al.) edit.tf editor, and grabbed from Jason's captures at uniquecodeanddata.co.uk.  The modifications are to add a pile of new functions, and truncate and/or redirect others.

As the original editor has moved on somewhat since this was done, it seems logical that, if I want to do more mods to it, then I should base my code on the latest version.  If I can do it in such a way that I do not need to actually modify the editor, just call it, then that would be best.   In PHP I would, assuming it was a class, extend the class in a new file and override the relevant functions.  So... How to do this in Javascript...

I tried using prototypes ... but hit the problem that the editor is written with lots of private variables and functions, which the new functions in the viewer refer to.  Using the existing editor as the viewer's prototype doesn't work because it cannot access the private variables.  Drat.

<days pass>

After spending more time than I ever expected looking at javascript objects, inheritances, etc., I have decided not to commit myself to ever having to do anything major in this language!!

Sticking with Javascript, I think the best approach at this point would, after all, be to fork edit.tf and modify it to separate out the actual display part from the editor part, that way I can provide for a viewer, indeed, different viewers...  Might even be a mod Simon would like...

Sigh.  Bloody Javascript.

The other option would be to go back to my own viewdata viewer class, which runs serverside to create the images.  I understand this, but I was hoping not to have to do this, as it makes updating the "screen" with the page number being keyed dependent on the server, rather than being local.


So, ten days in, and all I've achieved is discovering that what I thought would be a simple task is much more complicated than I thought it would be.




Saturday, 1 April 2017

Retrochallenge: Day 1


OK. First day, and I have to do something... whether I can keep this up is another matter....

I had a look at the code used for browsing Jason's teletext captures.   These use a modified version of the edit.tf teletext editor (the 'viewer'), driven by an html page consisting of a mass of links!  The viewer grabs all these, displays the first one, then accepts key-presses to get the next page number, as per a teletext page.  Plus it allows up/down arrow shenanigans to skip through.

Teletext, as you should know, shares the exact same display format as Viewdata, namely 24 (or 25) lines of 40 characters of primary colour text and simple block graphics.  As control codes take up a space on the line, this makes it harder than you might think to do multicoloured images..

So, I've got a pile of dumps of teletext pages over at www.teletext.org.uk so the obvious thing to do is use one of those, see if I can use the viewer just as it is.  That way I have a starting point, and can begin to understand the code and decide on the particular direction I want to go.

I've got a php class in working-but-incomplete state that allows me to manipulate viewdata and teletext pages.  A quick bit of code to load one of the teletext archives and then spit out each page as a link took me significantly less than an hour, and only 14 lines of new code!

So...  from this,  to this.    I think that's a positive step.

Today has, however, shown up a lot of features currently missing in vv.class that I need to add in, particularly to deal with the viewdata side of things.  That is partly the whole point of this exercise, though: to get an idea of what I need to do next!




Thursday, 30 March 2017

Retrochallenge 2017/04

OK... after only a few years* hiding away, I've put myself in for this again.

I've set a much smaller goal this time; something I think I can achieve, instead of something I want to achieve!  This is, I think, an important difference.  I got nowhere the last two times, mostly because I was a bit too ambitious for the limited time available to me.  (I've only sort of partially achieved the "tidy the cable nest" of 2012, and never did get anywhere with the dial-up thing from 2013.)

So this year's goal:  Write something that will allow browsing a (static) viewdata database from a web browser for use with the new Wordpress-based viewdata.org.uk.  I have a browser already, but it's a bit clunky and not very easy to incorporate into anything else.  So a nice re-write is in order, whereby I can take advantage of "modern technologies" like javascript and AJAX calls, to make the effect more seamless, and I intend to "borrow" the code behind edit.tf and the teletext-browser version thereof to make things easier..

If I can do that, I might try and finish off enough of the new version of the website to actually launch it, but no promises!!


* - Four.


Sunday, 8 July 2012

Hiccups and Hostings

Ok.  Part of my retrochallenge entry was to get the viewdata website sorted out - it's been in the process of being Wikified for over a year now!  Some of the work involved in that relates to simply translating the article markup from one markup language to another, but it also means I need to re-write all my custom plugins too.  And to do that, I need to finish the re-write of the viewdataviewer code... and holding me back on all of this was a webhost that made everything behave as if I was walking through molasses whenever I tried to change anything..   It used to be pretty good, but as with all shared webhosts, seems to have become oversubscribed and slowed to a crawl as a result.  Add in a good stir of never-updates, some virus infections, and I should have waved them goodbye a long time ago!

Anyway, I've taken the plunge and shifted my hosting... I'm now renting a VPS which means I'm effectively in control of my own server (albeit merely a tiny part of someone else's system) but I can keep everything up to date, and I don't have to worry about somebody else letting a virus in.  It's very nippy, compared to the previous host anyway, so I'm satisfied.  Maybe I'll not see visitors getting fed up waiting for the next page, now .. (how they would have coped at V23 speeds I do not know ...)

As of yet, there's no new content (although I've got a little titbit waiting to be released - thanks Ant) but that's because it's taken about a week to get everything shifted over - there's 17 websites to deal with! Most are placeholders or simple html-only things, but there are a few complicated ones.  But everything seems to work.. phew.... now to concentrate on more interesting things...

The other part of the challenge is re-working the hardware running the BBS.  This, I might not manage.  Rather than tidying up the mess in the photo, it's got worse!  I had to move everything on the left over in order for a surveyor to examine the floor joists - back in 2008 we got a builder in to put in a "proper floor" and better access, so we could actually use the space, among other things.   Unfortunately, he turned into one of those cowboys you see on the TV, and did a job that not even the worst DIY nut would be proud of.  There's a lot more to tell, which I'll no doubt blog about eventually, but there's a chance we might finally get some of it sorted soon.  This, unfortunately, will mean packing everything up from up there while it's done, making it impossible to do the BBS side of things.



Monday, 7 February 2011

php classes are go!

It's a bit of a mind set switch. Most of my time spent employed as a programmer was writing business applications in BOS/COBOL. This mostly involved leading the user through a series of options, with little chance of random things happening. The inclusion of SpeedBase added a little more OO style coding - it's basically a windows manager and relational database spliced on top, so you end up writing snippets of code for the different actions a user could do or buttons they could press within a window, but the options for them were still fairly limited, but that's probably what you need when dealing with Ledgers and Order Processing systems - make things too complicated and you have more chances to introduce errors or, god forbid, bugs!

Obviously this approach doesn't lend itself to programming for the internet, where as far as the server is concerned, pretty much any request could come at any time for any thing! And with forms, you don't get to prompt a user for each item in turn, you basically get the whole set of answers dumped on you all at once. You also can't rely on the answers actually being within the range of values you specified when you displayed the form, so a fair bit of validation is required to stop people trying to disrupt your systems. So it was fun getting used to such changes when I started coding up web backends.

Now, I'm working, finally, on converting my Viewdata Viewer code to a class based system. At present, it's a loose collection of individual programs that can each operate on a range of data formats, but each program does different things, on different sorts of data, in different and sometimes contradictory ways. It started off as just a single bit of code that allowed me to refer to a saved viewdata page within an <img> tag, rather than have to convert it to an image manually, but it snowballed, especially with the inclusion of over 11 file formats!

The conversion is to try and create a defined API between the data files and the programs that need to access them. A lot of the code can obviously be re-used, but I'm trying to clean some bits up as well. Out goes the spaghetti of ifs and elses, and in come switches. Out go hard coded constants, and in come some defined labels. It should make accessing the files much easier, and will hopefully allow for such things as:

$vfile = new ViewdataViewer();
$vfile->LoadFile("./upload/1a");
echo $vfile->ReturnText();

Which is obviously much simpler than trying to figure out the file format, etc, in every app.

The main intended application at the moment is for a database for the Celebrating Viewdata site, into which all the pages found so far can be loaded, with an intent to create functionality similar to web.archive.org. Obviously items in the database need to be stored in a consistent format, and therefore I needed code to parse the source data files. Given I would have to replicate the majority of the functionality currently in place in vv.php, vl.php and vb.php, I thought it best if I bit the bullet and did it right!

Having all the file-format-dependent stuff in a single class file will also, of course, enable the addition of any new file formats to be achieved with the minimum of fuss, and no editing needed to existing applications. It should also make it much, much, easier for anybody else who wants to use the code to actually use it!


Of course, classes are a totally new way of thinking about items, again. It took a little playing and reading before I got the hang of them. It's a little more involved than just writing a library of subroutines, but much better. I wish I'd known about them before I started the viewer originally. But, well, it did start out as barely more than a hundred lines of code that achieved but a single task. It grew somewhat from there...


Anyway, watch out on the project website for the class code to appear. Hopefully I'll be able to post the first working code up within a week or two. Status as of today is that the recognition seems to work, and some of the support routines are coded up for some of the filetypes! I'm pleased with it, and enjoying the coding, which always makes it go faster. I just have to find the time to actually do it..


Friday, 22 January 2010

Making old data visible, easily!!



Many years ago I was heavily involved in the viewdata industry - working for Micronet 800 and then producing software for other Prestel ISPs, running my own viewdata BBS, etc. I therefore accumulated rather a lot of viewdata pages, and managed to recover these from an old backup a few years ago.

As part of a separate project, Viewdata.org.uk, I wanted to display these images. As they were saved using a BBC Micro, I loaded them up in a BBC Micro Emulator, under Windows, took a screen capture, pasted that into Photo Editor, cropped it, saved it out as a GIF, and finally uploaded it to the web server. I then had to add the image to whichever gallery it belonged in. As you can guess, this is fairly labour intensive, and gave rather variable results.

Being a firm believer in "let the computer do the work", I started this side project to condense all this into as little work as possible. What I wanted to achieve was to reduce the steps to: 1. Upload original saved screen file to the web server. 2. End.

I think it has now achieved this, and more so! There are currently two scripts in the suite - vl.php (viewdata lister) will scan a given directory and construct a web page based on the files it finds. vv.php is used as an image source for each file, and this reads the files and constructs a PNG or animated GIF, as appropriate, and returns it to the client.

As a side-benefit of having the original save file available, it's also possible to provide a text-only version of the frames! I hope this will make things more search-engine friendly.

You can find the files here.

At the time of writing, you can find a sample page here that shows the results that can be achieved for a random selection of pages from Prestel, Teletext and some LAN based services.

Please add any comments or suggestions below.
