Rob's Ramblings

Monday, 17 September 2018

Some contemplating on frame storage formats and clashes therein

I recently posted a little bit on how I now store contributed videotex (teletext and viewdata) frames within a database, so as to make accessing them far easier on the application side.

To do this, I had to decide on exactly how to store the visible content of the frame.  Everything else is easy; I created a secondary table holding key=>value pairs, which means it is very easily expandable, and any application needing particular data can go look for its own, and not be confused by anything extra.

So. The frame content itself.  I didn't get much help looking at existing storage formats, as I've got at least 17 types documented, and others I know about.  I may however have been influenced somewhat by them.

When you think about a viewdata frame, or a teletext page, you automatically see the 23-25 lines by 40 columns of static image.  Almost every frame you will find that has been saved out by a terminal emulator, or teletext captures, will consist of those 920, 960 or 1000 bytes of data, perhaps with some meta-data accompanying, sometimes not.  I think that every third-party viewdata host that I have so far encountered also stored its pages so.  Individual characters took up a single byte as per their ASCII character code, and colour and control characters were also stored as a single byte.  For teletext, this uses the non-display codes below the space, as there is no concept of cursor movements, carriage returns, etc, on a teletext screen, which is what these values are used for in a serial-terminal based service.

Prestel, and viewdata generally, is however serial.  Frames are sent to the user as ASCII characters, but the colour and control codes are sent as command sequences:  Escape then a capital letter.  So, what might be stored in a teletext page as "<01>RED<02>GREEN<07>WHITE" would be sent to a viewdata terminal as "<ESC>ARED<ESC>BGREEN<ESC>GWHITE".  Short lines would be terminated by a carriage return and linefeed, so reducing the need to send the whole 40 characters.
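
As a rough sketch of that conversion (not the code actually used on the site; the function name is made up for illustration), a stored 40-column row can be turned into its as-transmitted form by expanding each code below space into Escape plus the code with 64 added. The sketch assumes trailing spaces can simply be dropped in favour of CR/LF, which would not be safe if a background colour is in force on that line.

```python
ESC = 0x1B

def row_to_serial(row: bytes) -> bytes:
    """Convert one stored 40-column row into viewdata serial form (illustrative only)."""
    # Assumption: trailing spaces are not needed, so a short line can end with CR LF.
    trimmed = row.rstrip(b" ")
    out = bytearray()
    for code in trimmed:
        if code < 0x20:
            # Stored 0x01 becomes ESC 'A', 0x02 becomes ESC 'B', and so on.
            out += bytes([ESC, code + 0x40])
        else:
            out.append(code)
    if len(trimmed) < 40:
        out += b"\r\n"
    return bytes(out)

# The example from the text, padded out to a full 40-column row.
row = b"\x01RED\x02GREEN\x07WHITE".ljust(40)
serial = row_to_serial(row)
```

Here `serial` comes out as `<ESC>ARED<ESC>BGREEN<ESC>GWHITE` followed by CR/LF, matching the example above.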

Now.. Prestel itself is known to have stored the frame data exactly as it would be sent to the user.  There was a hard limit of 920 bytes available to the editor to use, and colour codes, etc, took up two of them.  This made creating complicated graphical pages somewhat difficult, as too many colour changes could quickly eat up all the allocation.  (Response frames were even worse; you only got 716 bytes to play with!)  This is probably why all third party viewdata servers stored their page as the 22x40 character full image, with the control codes stored as per teletext.  Doing this allowed for much more colour and graphic rich content than was possible on Prestel itself - the conversion was done on transmission.  The actual codes stored varied - some systems used 7 bit data throughout, some used top-bit-set letters to indicate that the letter needed the escape sending before it, some used 7 bits for visible characters, and top-bit set control codes (codes in the range 128-159) and at least one had everything with the top bit set!
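
To put some numbers on that budget, here is a small sketch that estimates how many bytes a frame would need in as-transmitted form. The function name is invented, and it assumes (as a simplification) that every control code costs two bytes, that trailing spaces are dropped, and that each short line costs a CR/LF pair.

```python
PRESTEL_FRAME_LIMIT = 920  # figure quoted in the text

def transmitted_size(rows) -> int:
    """Estimate the as-transmitted size of a list of stored 40-column rows."""
    total = 0
    for row in rows:
        trimmed = row.rstrip(b" ")
        controls = sum(1 for code in trimmed if code < 0x20)
        # One byte per stored character, plus one extra (the Escape) per control code.
        total += len(trimmed) + controls
        if len(trimmed) < 40:
            total += 2  # CR LF ends a short line
    return total

# A deliberately busy row: a control code before every single graphic character.
busy_row = b"\x11\x7f" * 20
size = transmitted_size([busy_row] * 22)
fits = size <= PRESTEL_FRAME_LIMIT
```

Each busy row is 40 bytes stored but 60 bytes on the wire, so 22 of them need 1320 bytes, well over the 920 available, while the fixed-matrix form stores the same page without complaint.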

So fast forward 30 years, and I'm writing code to handle saved viewdata pages and display them on this new-fangled World Wide Web thing.  There is zero support for viewdata and teletext format images, so we have to roll our own, converting saved pages in any number of formats into PNG or GIF (to account for flashing characters) images that a web browser can display.

As an intermediate stage, I have to pull that 22-24x40 matrix of characters out, before plotting them onto a graphics image for sending to the viewer.  This intermediate block of characters I called an "internal" format, and was 7-bit clean, so codes below space for the colour codes, and the rest visible.

For nearly ten years this worked fine, and this internal, intermediate format, was the format used when I created the page database.

It is only this week I hit a problem with this, and it is down to a peculiarity with how Prestel stores Response Frames.  (And, I assume, other frames that are not simple static pages.)

A response frame contains a number of fields that are defined by the editor when they create it, and are either filled automatically by the Prestel server when it displays the page, or  can contain text or data to be entered by the user.  When the user hits # on the last field, they are given the option to send (or not) the page to the IP.  It is then delivered to their mailbox in a filled-in state.

When defining a response frame in the standard Prestel online editor, a field is specified by typing a short sequence: e.g. Ctrl-L n 30 Ctrl-L will create a field of 30 characters length containing the subscriber's name - on pressing the second Ctrl-L the system will display 30 "n"s in the required position.  The same procedure is repeated for any other field you request.  What gets stored in the Prestel database is a single Ctrl-L and 30 "n"s.

When you retrieve a page from Prestel using the "Bulk" Online editor, it is sent exactly as stored, so you get the Ctrl-L and sequence of letters alongside the Escape'd colour codes and CR/LFs for short lines.  Uploading a replacement frame you specify the layout in the same manner.

Those of you familiar with the standard ASCII control codes will recognise that Ctrl-L is also known as "Clear Screen", and is a character that is usually sent before sending the frame content.  This is probably why it was used for this purpose - finding it in the middle of the frame content would not make sense, so it was re-purposed as a flag for start-of-field.  Obviously this is never actually sent to the user, but is replaced by a space when viewing on a terminal.
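
A minimal sketch of how a host might pick those fields out of as-stored data, assuming a field is always a Ctrl-L followed by a run of one repeated letter (the helper names are hypothetical, and a field that happens to be followed by ordinary text starting with the same letter would be over-counted):

```python
import re

# Ctrl-L (0x0C), then one field-type letter repeated for the length of the field.
FIELD = re.compile(rb"\x0c([A-Za-z])\1*")

def find_fields(stored: bytes):
    """Return (offset, field type letter, field length) for each field found."""
    return [
        (m.start(), m.group(1).decode("ascii"), len(m.group(0)) - 1)
        for m in FIELD.finditer(stored)
    ]

def as_displayed(stored: bytes) -> bytes:
    """The start-of-field flag is never sent; the user sees a space instead."""
    return stored.replace(b"\x0c", b" ")

stored = b"Name: \x0c" + b"n" * 30 + b"\r\n"
fields = find_fields(stored)
```

For the name field described above, `fields` holds a single entry: offset 6, type "n", length 30.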

Now ...

I have two small databases in my possession that were pulled back down from Prestel at some point, and these include a number of Response Frames.

When I converted the data to my "internal" format to load them into the database, this normalised the control codes to 7-bit data, filling the lower 32 entries of the table.  On displaying, these codes were sent as <Esc><code + 64>, thus recreating the colour sequences.

When it comes to a <ctrl l>, however, this was never stored in the database - the normalisation routine ignored it.  However, even if it had been saved, on recall, it would have been translated into an <Esc>L, the sequence to end double-height text.

So, to summarise, the normalisation I did, in most cases, lost the start-of-field character because it wasn't expected in a frame.  And if it did make it through, it would be indistinguishable from the "Single Height" code, and as that was allowed anywhere in a response frame, it couldn't be deduced from context.
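
The clash is easy to demonstrate. The following is a hypothetical reconstruction of the two steps, not the real routine: `normalise` folds Escape sequences down to codes below space, and `expand` does the reverse on display.

```python
ESC = 0x1B
CTRL_L = 0x0C  # start-of-field flag in Prestel's stored form

def normalise(serial: bytes, keep_ctrl_l: bool) -> bytes:
    """Fold <Esc><letter> pairs down to single codes below space."""
    out = bytearray()
    i = 0
    while i < len(serial):
        byte = serial[i]
        if byte == ESC and i + 1 < len(serial):
            out.append(serial[i + 1] & 0x1F)  # ESC 'A' -> 0x01 ... ESC 'L' -> 0x0C
            i += 2
        elif byte == CTRL_L and not keep_ctrl_l:
            i += 1  # the flag is silently dropped
        else:
            out.append(byte)
            i += 1
    return bytes(out)

def expand(code: int) -> bytes:
    """Turn a stored control code back into its <Esc><code + 64> sequence."""
    return bytes([ESC, code + 0x40])

field_start = b"\x0cnnn"     # Ctrl-L then the field
single_height = b"\x1bLnnn"  # <Esc>L then ordinary text

dropped = normalise(field_start, keep_ctrl_l=False)
kept = normalise(field_start, keep_ctrl_l=True)
same_as_height = kept == normalise(single_height, keep_ctrl_l=True)
```

Either the flag vanishes (`dropped` is just the three letters) or it lands on the same stored value, 0x0C, as the single-height code, and `expand(0x0C)` then sends `<Esc>L`.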

I never noticed, because there were so few frames affected, and there was no need to process the fields the code indicated, anyway!

This last month, however, I've been working on a viewdata host program that will run on a modern server, and which I could use to recreate the look and feel of using the original Prestel service.  I've been testing this using an actual Prestel terminal, and it's been great fun!  It's only when I stumbled across one of these response frames, and decided to support them, that I discovered this problem!

Looking into how other file formats solved this, it seems that at least one of them uses <Esc> itself as the field indicator.  If stored in the database like that, it would expand on recall into a code sequence that is unused in viewdata, so it is a suitable alternative.  I will translate the affected pages, eventually!
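
Sketching that idea out (with invented names, and assuming the host shows the start of a field as a space, as Prestel does): a stored Escape, code 27, can be treated as the field marker before the usual expansion is applied. Expanded naively it would become Escape followed by "[", since 27 + 64 = 91, which is the sequence the paragraph above describes as unused.

```python
ESC = 0x1B
FIELD_MARK = 0x1B  # stored <Esc> used as the start-of-field indicator

def expand_for_terminal(stored: bytes) -> bytes:
    """Expand stored codes for sending, treating stored <Esc> as a field marker."""
    out = bytearray()
    for code in stored:
        if code == FIELD_MARK:
            out.append(0x20)  # never sent as a code; the user just sees a space
        elif code < 0x20:
            out += bytes([ESC, code + 0x40])
        else:
            out.append(code)
    return bytes(out)

# Single height (0x0C) and a field marker can now sit side by side.
sent = expand_for_terminal(b"\x0c\x1bnnn")
naive = bytes([ESC, FIELD_MARK + 0x40])
```

With this, `sent` is `<Esc>L`, a space, then the field, so the single-height code and the field marker no longer collide.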


So, a decision taken about 10 years ago came back to bite me this week. And it's all to do with 25 year old data in a file format determined 40 years ago that everyone else decided needed to be done differently.

Well done for making it this far!


As an aside .. Prestel added support for "Dynamic frames" which were basically frames that could contain cursor movement characters.  This meant you could go back and change things after you had already drawn them.  This was easy for them, as they stored data in an as-transmitted form anyway.  It's not so easy for host software that expects its frames to be stored in a fixed matrix!  I'll be working on this, once I find some original examples....



Friday, 14 September 2018

The Videotex Database - submit your pages now!

When I started viewdata.org.uk (and teletext.org.uk), I just uploaded the pages and databases I had as-is, and had my scripts deal with them on an as-accessed basis.  This is because I wanted to preserve the data as much as possible - any translation to a new format (such as JPEGs) would inevitably lose data, as well as context.

As time has moved on, and as the variety of data formats I have had to deal with has proliferated, this has increasingly become somewhat unwieldy. I decided, therefore, to try and rationalise things somewhat.

Each of the various file formats I was dealing with had different properties. Each had strengths, and each had weaknesses.  I could not decide on a single common format to try and convert files into.

Rather than create a new "perfect" file format, I decided therefore to store the frames within a database.  By having a primary table for the page content and certain static data, and a separate table for meta-data, any particular properties a particular file format had could be accommodated.
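
As an illustration of that two-table layout, here is a minimal sketch using SQLite. The table names, column names and sample values are all placeholders invented for the example, not the real schema or real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE frame (
    id      INTEGER PRIMARY KEY,
    service TEXT NOT NULL,
    page    TEXT NOT NULL,
    content BLOB NOT NULL          -- the visible frame content
);
CREATE TABLE frame_meta (
    frame_id   INTEGER NOT NULL REFERENCES frame(id),
    meta_key   TEXT NOT NULL,      -- anything a source file format carried
    meta_value TEXT,
    PRIMARY KEY (frame_id, meta_key)
);
""")

# Placeholder frame: a blank 24x40 page.
cur = conn.execute(
    "INSERT INTO frame (service, page, content) VALUES (?, ?, ?)",
    ("EXAMPLE", "100a", b" " * 960),
)
frame_id = cur.lastrowid
conn.executemany(
    "INSERT INTO frame_meta (frame_id, meta_key, meta_value) VALUES (?, ?, ?)",
    [(frame_id, "source_format", "example-format"), (frame_id, "captured", "1987")],
)

# An application only asks for the keys it understands and ignores the rest.
rows = conn.execute(
    "SELECT meta_key, meta_value FROM frame_meta WHERE frame_id = ?", (frame_id,)
).fetchall()
meta = dict(rows)
```

Adding a new property later means inserting another key, with no change to either table.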

Once the data is held within a standardised database, of course, it makes it much easier to access it and use it from many different applications.  The first, and most obvious, is the ability to search across the entire database for key words or phrases. This is implemented on the front page of the database.

The main in-browser viewer for the saved pages implements a timeline function, where you can see how a given page has changed over time.  See, for example, the CEEFAX news headlines.

And of course, for viewdata pages, one can implement a dial-up host, so 1980s terminals can connect directly into the service and browse it exactly as they did at the time.  (This is mostly done, just pending further tidying up!)

Currently the database contains page data I have collected myself or already been sent. However I am aware that there is a vast amount more out there.  Jason Robertson has been amazing at rescuing teletext pages off old video tapes, and I know of at least one previous Prestel IP that has a massive archive of pages still extant, albeit sat on very old hardware.  I've got part of The Gnome At Home, and I know the rest still exists.

This week's task (one of the various "I'll do something" for Retrochallenge 2018/09) was to create a page for viewers to directly submit their pages to the database.  This is now complete!  It actually places the data into a queue, after briefly validating it, so it can be checked and added later.  I would welcome any contributions, anything from a single frame to a complete service backup!  If you need help, feel free to drop me a line.



Monday, 3 September 2018

A Viewdata Host

One of my aims when setting up viewdata.org.uk was to create a means by which readers could experience connecting to a viewdata service, and also to use such to present what saved pages we had in an appropriate context.

Sadly, there was nothing available that I could find that would allow me to run an actual host, and although I had some success firing up my old BBC Micro based viewdata BBS, this didn't last long due to multiple hardware failures.

Back up to today, and, as I mentioned yesterday, John Newcombe has written, and is running, his own viewdata host called TELSTAR.   I've discussed some things with John, and had been hoping to blag a copy of the software, but it seems that it's not quite what I am looking for.

Now, I have been building up a database of frames - this is yet another unfinished project - over at db.viewdata.org.uk.   This database is what I want to use as the source of the data for a host system.
Although it's mostly got teletext loaded up, I do have a complete copy of the PC Plus demo of Micronet loaded up, which can act as a starting point.

So, what to do?  Well it's obvious, write my own host software.  I've been putting this off for years, but, it's #retrochallenge time, and I do want to achieve something...

A few hours last night got the bare bones sorted out, and a bit of time debugging, and we're at a point where I can dial in and navigate between pages!  Woo!

Whereas John has been concentrating on content for his viewdata host, I'm going to be working on making mine more of a "Prestel Emulator"; it should feel as close to the original as possible.  I've a lot to do, obviously, but not bad for a few hours' work.





Saturday, 1 September 2018

Modem Emulation - an RC2018/09 prologue

Most of you will know by now that I'm really into preserving the memory of Prestel and Viewdata systems generally.  I run www.viewdata.org.uk which, while a bit long in the tooth, is going to get a massive update "soon" ...   But today I'm going to talk about hardware.

Some time back, I fired up my old viewdata BBS "Ringworld" - this operated on a collection of BBC Micros - one per connected user - and an Acorn A5000 acting as fileserver.  I connected these to the internet using a motley selection of modems, ATA telephony adapters, and serial terminal adapters.

The upshot was that, for a user dialling in, the call was answered by the exact same modem it always had been, connected to a SIP ATA - the digital data was transformed to analogue, before being turned back to digital by the modem.  This always seemed like a poor idea to me. What would be better is if some bit of software answered that digitised telephone call, looked at the whistles and warbles, and turned it directly into a sequence of ASCII bytes for delivery to a telnet port.

I had found a program called iaxmodem that allowed an Asterisk-based PBX to emulate a modem, but it was focused on faxing, and I just couldn't get it to work with the V23 dial-up I wanted.  But it was close.   I spent the next few years, off and on, searching for changes to that, or SIP based alternatives, with no luck.

In the meantime, John Newcombe decided to write his own viewdata host service, called Telstar, in Python, and that can be accessed via a raw socket (like telnet, but without the features!).  There's not a lot of software out there that can both talk Viewdata display protocols and connect to a socket, however.  Richard Russell wrote an example viewdata client that could do it, and you can connect from BeebEm if you load up a suitable comms package and set the RS423 IP parameters.

There have also been a couple of projects to produce a "WiFi Modem" that, basically, looks like a Hayes-compatible modem that you connect to via RS232, but it in turn connects to your WiFi, and onwards to a telnet port out on the internet.  This is great for things like BBC Micros, Commodore 64s, etc., where you can just swap out your period modem for this new device.   Not so good for dedicated terminals, or e.g. the ZX Spectrum VTX5000 where the modems are built in.

Then, out of the blue, an old friend, Darren Storer, posted on the BBC Micro Facebook group (I think it was there..) that he'd set up a dial-up number for Telstar, and could people test it.  It took me a week or two to get there, but I pulled out a terminal, dialled the number ... and it didn't work.  Not at all.  I did, however, find out the software he was using ...  asterisk-Softmodem.  This was exactly the sort of project I'd been looking for all those years.  But, it didn't work for him.

I pulled the code and had a look, and could see nothing wrong.  So, firing up an asterisk server, and installing it, I tried to debug.  The first issue was my terminal was not locking onto the carrier, so I added a t(-10) to increase the volume, and that sorted that!

Next problem, I wasn't getting much data on screen - many characters were just missing!   This was somewhat easy to diagnose, as I had an inkling after seeing how you configured asterisk to use softmodem - you specified the number of data bits, being between 5 and 8.  The example had it as 8. Now Prestel, and of course the terminals, all used 7 bits with even parity. What I was seeing was the terminal being sent 8-bit data, and of course interpreting that as most of the characters having an incorrect parity bit, and ignoring those!
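
For anyone unfamiliar with the scheme, this is roughly what "7 bits with even parity" means in code. It is a sketch with made-up function names, not the softmodem patch itself: the eighth bit is set whenever that is needed to make the total number of 1 bits even, and a receiver rejects any byte where the count comes out odd.

```python
def add_even_parity(ch: int) -> int:
    """Return the 7-bit character with bit 7 set so the count of 1 bits is even."""
    ch &= 0x7F
    ones = bin(ch).count("1")
    return (ch | 0x80) if ones % 2 else ch

def parity_ok(byte: int) -> bool:
    """What a 7E1 receiver checks: an even number of 1 bits across all 8 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 0

plain_a = ord("A")  # 0x41 has two 1 bits, so it already passes as it is
plain_c = ord("C")  # 0x43 has three 1 bits, so sent raw it is rejected
fixed_c = add_even_parity(plain_c)
```

So a host sending plain 8-bit data gets "A" through untouched but loses "C", which is the "many characters missing" effect; with parity added, "C" goes out as 0xC3 and is accepted.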

Now, I can set a software terminal to 8-bit data, but not the terminal - there is very little you can configure as a user on these things.  Because the project had no support for parity it looked like a dead end, but that wasn't going to stop me - I'd waited years to find this, and wasn't going to give up now!

Delving into the code, it actually turned out to be a nice simple and straightforward bit of programming.  Adding parity support turned out to be fairly easy... I've published the modifications to my own github fork and submitted a pull request to send them back to the original author.

So now, I can dial into Telstar, CCL4, or anywhere I want to set up a number for!

If you want to try it, the number for Telstar is 0333 340 3311 (from outside UK, +44 333 340 3311). Calls cost the same as an 01 or 02 number and are included in any inclusive minutes you may have. (Calls are free for A&A customers.)

I can't guarantee that number will stay up, and it may not work from time to time if I'm tweaking things, but if it turns out useful to you, please let me know in the comments below!

