Progress Report February 2020

Hot off the presses is our latest stable release, version 1.5.0, the second stable release since the last progress report. In this past year, support has been added for multiple new platforms to make the emulator more accessible, performance has dramatically increased, new features such as save states and cheat support have landed to make emulating more fun, and numerous accuracy improvements were made to continue polishing the overall emulation experience.

Now on Android and Raspberry Pi

The largest push this past year has been getting redream running on power-efficient but low-performance ARM platforms. To this end:

  • The app was updated to work on 32-bit platforms.
  • The rendering backend has been upgraded to support OpenGL ES.
  • New JIT compilers were written for 32-bit and 64-bit ARM.
  • The user interface was overhauled to work well with touch devices.

In addition to this work, which was required just to get up and running, a huge effort has gone into optimizing in order to ship the exact same product that's available for Windows / Mac / Linux on these platforms. That is, there are no speed hacks, no underclocking and no different or gimped code paths - users running on Android and the Pi get the exact same product.

Outside of optimizing and improving redream itself, patches have been upstreamed to Mesa to grind out more performance on the Pi, and patches have been pushed to SDL to expand on its KMS driver. These patches improve not just redream, but any OpenGL application running on the Raspberry Pi.

Performance improvements

Redream has never slacked on performance, but during the past year a substantial amount of effort has gone into optimizing it even further. The reason for the constant focus on optimization is twofold:

  1. The obvious, games need to run at full speed on these low-power devices.
  2. The less obvious, for each new feature and each accuracy improvement we refuse to compromise on performance.

Refusing to compromise means that these new features and accuracy improvements aren't tucked away behind options and that each is thoroughly tested for performance regressions. If a change is fundamentally slower, all other code is on the optimization chopping block until performance breaks even.

An example of this is the set of changes upstreamed to Mesa. Our renderer had been through the wringer for weeks and performance still wasn't where it needed to be on the Pi. Rather than claim the hardware was too slow or put a few features behind options, another ~10 percent was instead squeezed out of the driver. To date, the only accuracy-related option available in redream is per-pixel polygon sorting. This is because requiring it would, at this time at least, exclude a double-digit percentage of the user base due to its performance impact.

Jumping into the numbers, here's a video that was made right after Android came out but before Pi bringup. It shows side-by-side footage of a sample of games running uncapped on 1.3.0 and on what became 1.4.0. On the low end, many games gained around 25% in speed. On the high end, some games as much as doubled in speed:

After Pi bringup, games made another leap in performance. Here's a view of the raw frame rates on a few different platforms:

These results were gathered by running the first 10,000 frames of each game's attract mode using redream's --runframes argument with a retail BIOS. When this argument is used, frame skip is disabled (so all frames are fully rendered), as is vsync, and the results are printed to the log at the end of the run.

General improvements

Save state support

For the uninitiated, save states are an additional way to save your progress in a game that's specific to emulators. Save states are generated by saving the entire state of the emulator to a file, which can then be loaded at any time, enabling users to save and load their progress outside of the constraints of the original game.

This means that each save state stores the state of the emulated CPU, the GPU, the GD-ROM - everything. Given that, as the emulator changes and evolves, so does this state, and therefore so does the save state format. When the save state format changes, emulators can choose to either make old saves incompatible with new versions of the emulator, or do extra work to make each new version backwards compatible with old saves. If the emulator's state is rapidly changing during development, maintaining backwards compatibility can quickly eat into development time for other features.

With Android and ARM support complete, our code churn had settled enough to finally implement backwards compatible save states. Since release the format has been updated around 10 times without breaking compatibility.
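
As a rough illustration of what backwards compatibility involves, here is a minimal Python sketch of a versioned save state loader. The JSON format, the field names and the migration steps are all hypothetical and are not redream's actual save state format. The point is only that each format change adds one small upgrade step, so a save from any older version can be brought forward:

```python
import json

CURRENT_VERSION = 3  # hypothetical format version


def _v1_to_v2(state):
    # Hypothetical change: version 2 added a GD-ROM section.
    state["gdrom"] = {"sector": 0}
    return state


def _v2_to_v3(state):
    # Hypothetical change: version 3 renamed "cpu_regs" to "sh4_regs".
    state["sh4_regs"] = state.pop("cpu_regs")
    return state


# Maps a version to the step that upgrades it to the next version.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}


def load_state(blob):
    """Parse a save state and upgrade it step by step to CURRENT_VERSION."""
    state = json.loads(blob)
    version = state["version"]
    if version > CURRENT_VERSION:
        raise ValueError("save state was written by a newer version")
    while version < CURRENT_VERSION:
        state = MIGRATIONS[version](state)
        version += 1
    state["version"] = version
    return state
```

With this shape, the cost of a format change is one new entry in the migration table rather than a break for every existing save.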

 

Disc swap support

Most multi-disc games on the Dreamcast don't require emulating disc swapping in order to be played through; they support saving at the end of disc 1, powering off and then starting up disc 2. However, a handful of applications and games (Codebreaker, D2, Pop'n Music 3 and 4, etc.) require that the second disc be swapped while the system is still powered on, which is something that's been sorely missing.

Initial support for swapping GDI images with a real BIOS was added a few months back, but then we went down the rabbit hole of adding support for CDI images as well as swapping with the HLE BIOS. Supporting swapping GDIs with the real BIOS for games that require it was quick work, but numerous issues popped up once the support matrix expanded to GDI x CDI x real BIOS x HLE BIOS x all multi-disc games.

After a week or two of testing and prodding the GD-ROM hardware, these issues were rooted out and disc swapping is now supported for all image types, with or without a BIOS.

Cheat support

Going into this I had never used an external cheat device (GameShark, Code Breaker, etc.) and by the end this became one of my favorite features from this release.

While I hadn't used these devices, I'd seen cheats such as 011EC4B4 0000173E (which for Crazy Taxi gives you an infinite amount of game time) but what did this cheat actually do? As it turns out, there are many good resources scattered around the internet to help answer this question, and for most devices the cheats break down similarly:

  • Each 4-byte word (11223344) is an instruction.
  • The first (from left to right) byte represents an operation.
  • The remaining 3 bytes are arguments for the operation.
  • Additional words can follow to provide additional arguments and form conditionals.

For 011EC4B4 0000173E, the cheat decodes like so:

  • 01 is the operation which in this case means to perform a 16-bit write.
  • 1EC4B4 is the offset in system RAM to write the 16-bit value to.
  • 173E is the value to be written.
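
The breakdown above can be sketched in a few lines of Python. This is only an illustration of the decoding described here, not redream's implementation: the function names are made up, only the 16-bit write operation is handled, and the little-endian byte order and 16 MB RAM size are assumptions based on the Dreamcast's hardware configuration.

```python
OP_WRITE16 = 0x01  # the only operation described in the text


def decode_cheat(code):
    """Split a two-word cheat such as '011EC4B4 0000173E' into its fields."""
    first, second = (int(word, 16) for word in code.split())
    op = first >> 24           # leftmost byte: the operation
    offset = first & 0xFFFFFF  # remaining 3 bytes: offset into system RAM
    return op, offset, second


def apply_cheat(ram, code):
    """Apply a 16-bit write cheat to a bytearray standing in for system RAM."""
    op, offset, value = decode_cheat(code)
    if op != OP_WRITE16:
        raise NotImplementedError(f"operation {op:#04x} is not handled in this sketch")
    # Assumption: the value is stored little-endian.
    ram[offset:offset + 2] = (value & 0xFFFF).to_bytes(2, "little")


ram = bytearray(16 * 1024 * 1024)  # assumed 16 MB of system RAM
apply_cheat(ram, "011EC4B4 0000173E")
```

After the call, the two bytes at offset 0x1EC4B4 hold the value 0x173E.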

So that's cool, these cheat codes are their own little programming language, but how should the code be run to actually apply the cheat? Manually applying a few cheats while a game was running worked fine, but some would take effect and then quickly stop. These devices weren't just applying cheats once at startup, they must be applying them periodically.

After some digging around, I read about master codes which made the process somewhat click - they were somehow modifying games to call into their own code which would apply the cheats. However, master codes are rare and game specific, so there must be some generic way they do this as a default. I couldn't think of a way to generically patch the games, but was reminded of how some of the release groups modified the BIOS system calls to bypass security checks. After remembering that and adding some checks to monitor writes to the system calls, I found my answer - they were patching one of the frequently used GD-ROM system calls to jump to their own code which would apply the cheats.

With the actual implementation of decoding and applying cheats sorted, a large number of publicly available cheats were imported, which show up automatically for GDIs in the Manage cheats menu. Support for adding custom cheats isn't available yet, but a menu for editing them will be added during the next round of cheats work.

Widescreen cheats

Building on the cheat support mentioned above, we also rounded up and manually verified the widescreen cheats created by the members of the AssemblerGames forums (Espirral, yzb37859365, Radaron, ELOTROLADO.NE, Joel, S4pph4rad and VIRGIN).

These can be enabled for supported games from the same cheats menu shown above.

BIOS improvements

Since 1.4.0, all known compatibility issues have been fixed with our replacement BIOS, support for the built-in BIOS font was added, and every issue (other than JP character support in the BIOS font, and support for some homebrew / cracked CDIs) has been closed out in the issue tracker.

If a game runs with the retail BIOS, it will also run with our replacement HLE BIOS.

Graphics improvements

Parameter Selection Modifier Volumes

The Dreamcast's PowerVR supported what they referred to as "modifier volumes." These modifier volumes were user-defined 3D volumes that could be programmed to change how lighting was calculated for geometry that was either inside or outside of the volume.

The PowerVR supported two different modes for processing the modifier volumes:

  1. Intensity Volume Mode. These are used by many games and have been supported for a long time. In this mode, if a pixel is inside a volume the colors used to shade the pixel are scaled by a single value, typically to darken the final color to look like a shadow.

  2. Parameter Selection Volume Mode. These are only used by a handful of games, but are now supported. In this mode, surfaces can send two different sets of texture and shading parameters - one set that is used if the final pixel is outside of a modifier volume and one that is used if the final pixel is inside of a modifier volume. With this, games can do more dynamic effects such as changing the color or texture of a pixel based on whether it's inside or outside of a modifier volume.

This mode is used by Speed Devils in night time races to produce the headlight effect.

Speed Devils at night without this effect

Intermediate results, pixels inside of a volume are red, pixels outside are black

Final results, with different shading applied to pixels inside the volumes
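
To make the difference between the two modifier volume modes concrete, here is a simplified per-pixel sketch in Python. The function names are hypothetical, and each "parameter set" is reduced to an already-shaded RGB color, which glosses over the texture and shading work the real hardware performs:

```python
def shade_intensity_mode(color, inside_volume, scale):
    """Intensity Volume Mode: one parameter set, scaled when inside a volume."""
    if not inside_volume:
        return color
    return tuple(channel * scale for channel in color)


def shade_parameter_selection_mode(outside_color, inside_color, inside_volume):
    """Parameter Selection Volume Mode: choose between two parameter sets."""
    return inside_color if inside_volume else outside_color


# A shadow: the same surface color, darkened inside the volume.
shadowed = shade_intensity_mode((1.0, 0.5, 0.25), True, 0.5)

# A headlight-style effect: a dark road outside the volume, a lit road inside.
dark_road = (0.125, 0.125, 0.125)
lit_road = (0.75, 0.75, 0.5)
lit_pixel = shade_parameter_selection_mode(dark_road, lit_road, True)
```

The first mode can only scale what is already there, while the second can swap in an entirely different look, which is what an effect like headlights needs.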

Fog Support

The PowerVR supported both per-vertex and per-pixel fog modes, both of which are now emulated.

In addition to emulating the fog, the renderer has been refactored to correctly render shadows when fog is used. Previously, the entire scene was shaded, and shadows were cheaply rendered on top of the final scene. This works great when there is no fog, but unfortunately fog processing comes after modifier volume shading in the graphics pipeline, which makes some scenes look wrong with the original approach:

Shadows incorrectly applied after fog

Shadows correctly applied before fog

This is now fixed, but the refactor was bittersweet - the improved accuracy comes with a large performance hit (particularly on mobile) for all games, while only a handful of games are noticeably improved by it.
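
A small numeric sketch shows why the order matters. It assumes, purely for illustration, that fog is a linear blend toward a fog color and that a shadow is a simple scale of the surface color. The function names and values are made up and work on a single grayscale channel:

```python
def lerp(a, b, t):
    """Linearly interpolate from a to b by t in [0, 1]."""
    return a + (b - a) * t


def shadow_then_fog(surface, shadow_scale, fog_color, fog_amount):
    # Order described above: modifier volume shading first, fog last.
    return lerp(surface * shadow_scale, fog_color, fog_amount)


def fog_then_shadow(surface, shadow_scale, fog_color, fog_amount):
    # Old shortcut: shade and fog the scene, then darken the final result.
    return lerp(surface, fog_color, fog_amount) * shadow_scale


# A shadowed pixel seen through thick, bright fog.
correct = shadow_then_fog(0.8, 0.5, 1.0, 0.75)
incorrect = fog_then_shadow(0.8, 0.5, 1.0, 0.75)
```

With these values the correct order gives about 0.85, so the shadow mostly disappears into the fog, while the old order gives about 0.475, darkening the fog itself.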

Secondary Accumulation Buffer Support

Support has been added for a rarely used PowerVR feature - the secondary accumulation buffer. The secondary accumulation buffer was a secondary render target that could be drawn to, and then sampled from, during the same frame. This could be used to provide multi-texturing, bump mapping and trilinear filtering but in practice it was only used by a handful of retail games (Evil Dead and a few 2K sports games).

NHL 2K2 used the secondary accumulation buffer to composite the ice texture with a decal texture and then blended that back into the main scene:

NHL 2K2 without the secondary accumulation buffer output

Ice and decal texture composited to the secondary accumulation buffer

NHL 2K2 with the secondary accumulation buffer output

Trilinear Filtering

For some background, trilinear filtering works by interpolating the results of two adjacent bilinear filtered mipmaps. Modern GPUs have dedicated hardware to select two mipmaps and filter between them, but with the Dreamcast the user had to do additional work in the form of rendering the polygon once for each mipmap level and manually blending them.

In order to render a trilinear filtered, opaque polygon the user had to:

  1. Render polygon, tweaking mipmap selection to use the higher resolution mipmap.
  2. Render polygon, tweaking mipmap selection to use the lower resolution mipmap.

To tweak the mipmap selection, a decimal coefficient was supplied for each polygon which could be used to select a higher or lower mipmap level. Most importantly, the fractional portion of this coefficient was used to scale the final color enabling the results to be additively blended.

Of this process, the mipmap level selection tweaking is not yet emulated, but scaling the final color by the fractional portion of the coefficient is. This causes the final result to just be bilinear filtered, but the results are no longer blended incorrectly:

D coefficient not honored, colors get added together twice

D coefficient honored, blending is correct
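
The effect of the coefficient on blending can be sketched numerically. The weighting below, where one pass is scaled by the fractional part and the other by its complement, is an assumption made for illustration; the text above only states that the fractional portion scales the final color. Values are single-channel and the function name is hypothetical:

```python
def blend_two_passes(high_mip, low_mip, frac, honor_coefficient=True):
    """Additively blend two passes of the same polygon, clamped to 1.0."""
    if honor_coefficient:
        # Assumed weighting: the two scale factors sum to 1.
        w_high, w_low = 1.0 - frac, frac
    else:
        # Coefficient ignored: both passes are added at full strength.
        w_high = w_low = 1.0
    return min(1.0, high_mip * w_high + low_mip * w_low)


# Without mipmap selection tweaking, both passes sample the same
# bilinear filtered color.
texel = 0.4
honored = blend_two_passes(texel, texel, 0.25)
ignored = blend_two_passes(texel, texel, 0.25, honor_coefficient=False)
```

When the coefficient is honored the two passes sum back to roughly the original color, which matches the plain bilinear result described above. When it is ignored the color is counted twice.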

In Closing

The past year has been a long one - it's been a delicate balance of adding new platforms, optimizing for these platforms as well as adding new features and improving compatibility and accuracy along the way. For a more complete list of changes, you can view the changelog.

With these new platforms out of the way, we're really excited to focus purely on improving compatibility over the next few months. Some of the major items to be tackled soon are:

  • Improved input device emulation. Adding support for emulating the Dreamcast's jump pack, keyboard, etc.
  • Improved audio emulation. Adding DSP support, low-pass filter support and more.
  • Windows CE support. Compatibility with non-Windows CE games is nearly complete, so it's finally time to add this.

If you'd like to follow along with the development or suggest improvements, come join us on our Discord server: