Sunday, August 31, 2014

VCSMC - Design and Philosophy

This is the third article of my series of postings on vcsmc. You can find the first and second ones by following the links.

When approaching the idea of a media player for the Atari 2600, I knew that I would want to create a custom cartridge that could run on an unmodified console. The Atari homebrew community has done this before; see the Harmony Flash Cartridge, for example. This cart allows for running a variety of 2600 ROMs off of an SD memory card. There's an ARM chip inside the Harmony cart which provides an on-console UI program as well as handling the loading of the selected ROM for play.

My original vision for the custom cartridge was that it could be a Raspberry Pi or other Linux SoC board that could process an input video signal in real time and output 6502 bytecode at the Atari clock speed of roughly 1 MHz (about 1.19 MHz on an NTSC console) for rendering by the Atari. However, as I've dug into the challenge of getting the Atari to render video with any kind of recognizable image fidelity, I've come to understand that this may not be tractable, or at minimum that I should come up with some intermediate goals that are more in the realm of the achievable.

My current thinking and designs are centered around software that generates 6502 bytecode streams to render individual frames of video at 60 Hz, with no constraints on overall program size. In order to maximize render quality, every available CPU clock cycle should be used to manipulate the TIA state machine to create output. This creates very large programs. With load/store combinations at 4 bytes each and 15 per scanline, 180 vertical scanlines, plus overhead and audio data during the vertical blank, a reasonable upper bound is 16 KB per frame. That is four times the original maximum size (without bank switching) for an entire 2600 program!
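The arithmetic behind that upper bound can be sketched in a few lines of Python. This is only a back-of-the-envelope check, not part of vcsmc. It assumes that a "load/store combination" means an immediate load (LDA #imm, 2 bytes, 2 cycles) followed by a zero-page store (STA zp, 2 bytes, 3 cycles), and that a scanline lasts 76 CPU cycles. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope frame size, using the figures quoted in the text.
# Assumption: one "load/store combination" is LDA #imm (2 bytes, 2 cycles)
# followed by STA zeropage (2 bytes, 3 cycles).

CYCLES_PER_SCANLINE = 76           # CPU cycles per TIA scanline
LOAD_STORE_BYTES = 2 + 2           # LDA #imm + STA zp
LOAD_STORE_CYCLES = 2 + 3          # LDA #imm + STA zp
VISIBLE_SCANLINES = 180            # figure used in the text
FRAME_BUDGET_BYTES = 16 * 1024     # upper bound per frame from the text
ORIGINAL_MAX_ROM_BYTES = 4 * 1024  # largest 2600 ROM without bank switching


def pairs_per_scanline():
    """How many whole load/store pairs fit in one scanline."""
    return CYCLES_PER_SCANLINE // LOAD_STORE_CYCLES


def picture_bytes():
    """Bytes of load/store code needed for the visible picture."""
    return pairs_per_scanline() * LOAD_STORE_BYTES * VISIBLE_SCANLINES


def overhead_headroom():
    """Bytes left in the 16 KB budget for overhead and audio data."""
    return FRAME_BUDGET_BYTES - picture_bytes()


if __name__ == "__main__":
    print("pairs per scanline:", pairs_per_scanline())
    print("picture bytes:", picture_bytes())
    print("headroom bytes:", overhead_headroom())
    print("multiple of a 4 KB ROM:", FRAME_BUDGET_BYTES // ORIGINAL_MAX_ROM_BYTES)
```

Under these assumptions the visible picture alone accounts for 10,800 bytes, which leaves a little over 5 KB of the 16 KB budget for everything else.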

At 60 frames per second the overall rate of data to the Atari will be a bit under 1 MBps, or just under 8 Mbps, which is right around the standard quality video bitrate YouTube recommends for uploading 1080p video. For now, I am focused on writing offline programs that generate these rather large blobs of bytecode (an 8-minute video will be about 480 MB), and then on creating hardware that can read the bytecode blobs and drip-feed them to the Atari frame by frame.
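Here is a quick sketch of where those numbers come from, assuming every frame uses the full 16 KB upper bound. The helper names are hypothetical and exist only for this calculation.

```python
# Data-rate estimate, assuming every frame uses the full 16 KB upper bound.

FRAME_BYTES = 16 * 1024
FRAMES_PER_SECOND = 60


def bytes_per_second(frame_bytes=FRAME_BYTES, fps=FRAMES_PER_SECOND):
    """Sustained byte rate the cartridge hardware would need to deliver."""
    return frame_bytes * fps


def megabits_per_second(frame_bytes=FRAME_BYTES, fps=FRAMES_PER_SECOND):
    """The same rate in decimal megabits per second."""
    return bytes_per_second(frame_bytes, fps) * 8 / 1_000_000


def video_size_bytes(duration_seconds, frame_bytes=FRAME_BYTES,
                     fps=FRAMES_PER_SECOND):
    """Total size of the bytecode blob for a video of the given length."""
    return bytes_per_second(frame_bytes, fps) * duration_seconds


if __name__ == "__main__":
    print("bytes per second:", bytes_per_second())
    print("Mbps:", round(megabits_per_second(), 2))
    print("8-minute video, MB:", round(video_size_bytes(8 * 60) / 1_000_000))
```

With these inputs the rate is 983,040 bytes per second (about 7.9 Mbps), and an 8-minute video is roughly 472 MB, in line with the rounded figures above.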

The two programs in green, aufit and picc, are the offline processing components I am writing. My thinking for these programs is situated somewhere between a compiler/linker and a video compression tool. Furthermore, I have found it useful at times to model this software along the lines of lossy data compression. Although the data rates are not enviable by the standards of a modern video compression algorithm, the video is obviously going through a substantial loss of quality as it is made suitable for rendering on the Atari: to wit, 24-bit color down to a 7-bit fixed palette, and 16-bit 48 kHz stereo audio down to 15 or 32 kHz 4-bit. As I will outline in subsequent postings, it has also become obvious that perceptual coding is very important here, as in all lossy schemes, in order to retain any feeling of image quality.
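To give a feel for how much information is thrown away, here is a deliberately naive sketch of both reductions. It is not how aufit or picc work. The audio function just keeps the top 4 bits of a signed 16-bit sample, and the color function picks the nearest palette entry by plain squared RGB distance, with no perceptual weighting at all. The three-entry palette is a made-up placeholder, not the real 128-color Atari palette, and all names here are hypothetical.

```python
# Naive illustration of the quality loss described above. This is NOT the
# method used by aufit or picc; it ignores perceptual coding entirely.


def quantize_audio_sample(sample_16bit):
    """Map a signed 16-bit sample (-32768..32767) to a 4-bit level (0..15)."""
    if not -32768 <= sample_16bit <= 32767:
        raise ValueError("sample out of signed 16-bit range")
    # Shift to unsigned range, then keep only the top 4 of 16 bits.
    return (sample_16bit + 32768) >> 12


def nearest_palette_index(rgb, palette):
    """Return the index of the palette entry closest to rgb.

    Uses squared Euclidean distance in RGB space, which is a crude
    stand-in for a real perceptual color difference metric.
    """
    def distance_squared(entry):
        return sum((a - b) ** 2 for a, b in zip(rgb, entry))

    return min(range(len(palette)), key=lambda i: distance_squared(palette[i]))


# Hypothetical placeholder palette: the real fixed palette has 128 entries.
EXAMPLE_PALETTE = [(0, 0, 0), (255, 255, 255), (200, 0, 0)]

if __name__ == "__main__":
    print(quantize_audio_sample(0))
    print(nearest_palette_index((250, 10, 10), EXAMPLE_PALETTE))
```

Even this crude version shows the scale of the problem: 65,536 possible audio levels collapse to 16, and about 16.7 million colors collapse to 128.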

Notice that the overall flow has changed a bit from the written description in my first posting. This is because, as I have implemented parts of picc, I've come up against the challenge of scheduling state changes on the 6502. In an effort to squeeze every drop of potential image fidelity out of the 2600, I have been working on algorithms that try to reuse registers and allow state transitions to be scheduled earlier than they might be needed, so long as doing so doesn't impact rendering on the previous scanline. This has turned into quite the rabbit hole, and I will say more about it in a subsequent posting. Suffice it to say that what values need to be set in the TIA audio hardware, and when, has become an important consideration for picc scheduling.