Яндекс

Sunday, April 2, 2017

Resident Evil 2

Resident Evil 2 for Nintendo 64 is hard to emulate game. While the game uses standard ucode (or slight modification of standard one), it uses few non-standard tricks, which are hard to reproduce on PC hardware. I spent lots of time on this game when I worked on Glide64 plugin. Abilities of 3dfx graphics card allowed me to obtain pretty good result: the game was fully playable on Voodoo4/5 with some minor glitches. Later necessary functionality was added to glide wrapper, so you can run the game on any modern PC card.

What makes the game hard to emulate? As you know, the game consists of static 2D backgrounds with 3D models moving over. Background size may vary from place to place: someplace it is 436x384, someplace 448x328 and so on. Frame buffer size corresponds to background size. Video interface stretches image to TV resolution 640x480.

The first problem, which hardware plugin faces in this game is the way how background loaded to frame buffer. To optimize background load and rendering on N64 hardware, background loaded as image with width 512. That is 448x328 image is loaded as 512x287. The game allocates color buffer with width 512 and renders background with BgCopy command into it. In fact BgCopy works as memcpy to copy background content from one address in RDRAM to another. When buffer copy completed, the game allocates buffer with the same origin, but with width 448. Now buffer has correct proportions, and 3D models can be rendered over.

Why it is a problem for hardware graphics plugin? The plugin executes BgCopy command, which loads 512x287 image. It is no problem to create 512x287 texture and render it to frame buffer. The result will look like this:


If the background rendered right to frame buffer, that result can't be fixed. If frame buffer object is used for rendering, you may try to change size of buffer texture the same way as N64 changes size of color buffer. I did not find a way to change size of existing texture without loosing its content with OpenGL. glTexImage2D can change the size/format for existing texture object, but it removes all previous pixel data. Of course, it is possible to copy texture data to conventional memory, resize texture and write the data back, but it will be slow. If you know better method, please share.

There is fast solution of the problem: a hack. Value of video interface register VI_WIDTH is the same as actual width of background image. Thus, we can recalculate background image dimensions and load it properly:


I used that hack in Glide64 and I still don't know better solution. Unfortunately, it works only for HLE, because BgCopy is high-level command. For LLE we still need somehow resize buffer texture.

The next problem is depth compare. I already described the problem here and here, so I cite myself:
"Few games use scenes consisting of 3D models moving over 2D background. Some of objects on the background can be visually "closer" to user than 3D model, that is part of the 3D model is "behind" that object and that part must not be drawn. For fully 3D scene problem "object behind other object" is usually solved by depth buffer. 2D background has no depth, and depth buffer by itself can't help. Zelda OOT solves that problem by rendering auxiliary 3D scene with plain shaded polygonal objects, corresponding to the objects on the background. Thus, the scene gets correct depth buffer. Then the background covers this scene and 3D models rendered over the background are cut by depth buffer when the models are behind the original polygonal objects.
In Resident Evil 2 all screens are 3D models over 2D backgrounds. But the game does not render auxiliary 3D geometry to make depth buffer. Instead, the game ROM contains pre-rendered depth buffer data for each background. That depth buffer data is copied into RDRAM and each frame it is rendered as 16bit texture into a color buffer which then is used as the depth buffer. To emulate it on PC hardware the depth buffer data must be converted into format of PC depth buffer and copied into PC card depth buffer."

Glide64 was the first plugin, where the problem was solved. Copy values to depth buffer was relatively easy with glide3x API: glide3x depth buffer format is 16bit integer, as for N64. I could load depth image as 16bit RGB texture, render it to a texture buffer and then use that buffer as depth buffer, exactly as N64 does. OpenGL could not do it, but glide wrapper authors also manged to solve that problem. It was kinda hackish, but it works.

GLideN64 uses another solution. I invented it for NFL Quarterback Club 98 TV monitor effect. It is described in details in my Depth buffer emulation II article. Depth image is loaded as texture with one component RED and texel format of GL_UNSIGNED_SHORT. When the texture is rendered, fragment shader stores fetched texel as its depth value. Depth value from fragment shader passed to depth buffer, exactly as we need.

So, we have color background and depth buffer correctly rendered. Victory? Not yet. Depth buffer compare works, but not always. Here it works ok:


but if I step behind it looks like this:


Where is the problem? The problem is in the way N64 depth buffer works. N64 vertex uses 18bit fixed point depth value. N64 depth buffer stores 16 bit elements. N64 uses non-linear transformation of 18bit vertex depth value to 16bit value, which will be used for depth compare and then kept in the depth buffer. OpenGL uses floats for vertex depth and for depth buffer, but it is incorrect to directly compare GL depth component with value from N64 depth image. First, the same transformation must be applied to vertex depth. Fortunately, necessary shader code was already written for depth based fog, which Beetle Adventure Racing uses. I reused that code and finally got perfect result:







If you want to support my work:


Saturday, April 1, 2017

Major modification of frame buffer and video interface emulation.

I already wrote about N64 Video Interface emulation in GLideN64. It was my first attempt to make things right. Three years passed. Many elements of frame buffer emulation mechanism have been modified since that time. However, one major problem remained. This problem is as old as N64 emulation itself. This is "frame buffer height" problem.

To render anything you first need to allocate rectangular buffer, which will hold your graphics. You need to know width and height to allocate the buffer. The problem is that RDP command SetColorImage set only width of color buffer. Height is not set. RDP does not need to know buffer height. SetColorImage provides buffer origin, number of pixels per line and size of each pixel in bytes. This is enough to calculate position of vertex with given X and Y coordinates within the buffer. Scissor command prevents out of buffer writes. Software graphics plugin works exactly as RDP and also does not need to know buffer height. Hardware plugin is in trouble. Suppose, we selected 960x720 resolution with 4:3 aspect ratio and created 960x720 render buffer in video memory. N64 game allocates buffer with width 320. Which scale should we apply to original N64 coordinates to get correct picture in our render buffer? Since 960 = 3 x 320, it seems that correct scale is 3x. That is we scale original N64 X and Y coordinates by 3 and get picture in our buffer. Will this picture be correct? Only if original buffer also has 4:3 aspect, that is has size 320x240. In reality, it also can be 320x220, 320x256 or even 320x480. In all these case 3x scale for Y will give us wrong result. To get correct Y scale we need to know height of original buffer, but it is not available.

Height of N64 render buffer can be estimated from parameters of Video Interface, which defines how color buffer will be mapped to TV screen. All hardware plugins, which I know from inside use this possibility. Thus, frame buffer allocation becomes dependent on VI registers. This dependency does not exist in N64 itself. The height estimation does not guarantee to be always correct, and in fact it is often incorrect. The estimation code is complex and full of heuristics, to reduce numbers of errors. Nevertheless, this tie still induce many issues, in particular with PAL games and with games, which use interlaced TV modes.


Besides main color buffers, whose content is displayed on TV, N64 games often use auxiliary color buffers. These buffers are used for variety of purposes: dynamic shadows, reflections, TV monitors and so on. Auxiliary color buffer can be of any size. Thus, estimation of auxiliary buffer height is complex and fully heuristic algorithm, which also not always works right. Wrong height lead to visual glitches.

At the end of 2016 I finally invented the way to get rid of necessity to know exact height of  N64 color buffers. The idea is actually very simple. Why RDP does not care about buffer height? It knows that the height is large enough and just fills the buffer with primitives. Video Interface takes necessary part of the buffer and maps it on TV screen. Auxiliary buffers are used as textures: game's program code knows buffer's bounds and maps texture coordinates to its content.
My frame buffer mechanism creates separate frame buffer object in video memory for each buffer allocated by RDP. I used estimated height to create the buffer render target. It caused aforementioned issues when estimation heuristics failed and produced wrong result. So, the idea is to not use estimated buffer height and always use large enough height instead. 'Large enough' should be taken literally. It is some value, which is surely greater or equal to any possible height of N64 buffer. There are some natural limitations: maximal buffer size for NTSC is 640x480 and 640x576 for PAL.
Since I know width of rendering resolution selected by user and I know width of N64 rendering buffer - I know how to scale original coordinates of N64 vertices. This scale can be applied for X and Y coordinate, no matter has the N64 buffer the same aspect as user selected screen resolution or not. Video Interface emulation will map my frame buffer object to screen the same way as N64 Video Interface maps N64 buffer in RDRAM to TV screen.

Pros:

  • No more buffer height estimation heuristics.
  • No more glitches caused by wrong height estimation
  • Emulation of effects, not working before
Cons:
  • More video memory needed. Memory overhead is not large for main buffers, because actual buffer height is usually close to natural limit used as Large Enough Height. Memory allocated for auxiliary can be 10 times more than actually used.

While the idea is simple, its implementation was not.  It was obvious, that lots of things need to be changed. The first step was code refactoring, mentioned in the previous article. After that step I got more clear and easy to modify code. It was not enough though. Some preliminary steps had to be done first.

There is one OpenGL specific problem with emulation of N64 graphics. N64 uses coordinate system with origin in upper left corner. Glide3X API allowed to set origin to either upper left or to lower left. So, when I worked on Glide64, I set origin to upper left and had no inconveniences. OpenGL has origin nailed to lower left corner. If you will use N64 coordinates, you will get image upside down. Thus, Y coordinate must be inverted. (0,0) coordinate translated to (0, maxY), where maxY is buffer's height.


It is simple trick, but you need to apply it everywhere: modify vertex Y, viewport Y, scissor Y. Read from frame buffer to RDRAM have to be done in reverse order. Things could get even more complicated with new frame buffer technique. Thus, I decided to remove Y inversion. Of course, image will be upside down in that case.


However, the image is in frame buffer texture, which I can map to screen as I need. So, it is not a problem. The problem arises when you do not use frame buffer object and do rendering right to back buffer. GLideN64 renders right to screen when frame buffer emulation disabled. I did not want to keep Y inversion code to support "no frame buffer emulation" mode. My goal was to simplify things, not to make them more complex and intricate. Thus, I decided to slightly modify "no frame buffer emulation" mode: use one frame buffer object for rendering instead of direct render to back buffer.  It also mentioned in previous article: "Anti aliasing without frame buffer emulation". After that modification I could safely remove Y inversion code.

After preliminary work completed, real challenge started. Implementation of my idea was a very hard task.  Frame buffer emulation was twisted tight with VI emulation, and I spent many time untangling multiple knots and fixing weirdest glitches. At the end I was totally rewarded. Issues with cut image in PAL games gone. Issues with screen shakes in interlaced mode gone. Many crashes with buffer copy to RDRAM gone. VI effects started to work more smooth. Screen shrink VI effect in Mia Hamm Soccer finally start to work properly.



Saturday, March 25, 2017

Project news

Hello,

Three month passed since the latest Public Release. Time to report about most noticeable changes.

Massive code refactoring


GLideN64 currently supports the following graphics API: OpenGL, GLES2, GLES3, GLES3.1
OpenGL support also divided on GL 3.3 and GL 4.3+. API functions called directly from any place in code. It causes the following problems:

  • The code contains lots of GL version - specific code, separated by #ifdef (or if() for OpenGL versions)
  • Android emulator distributes 4 GLideN64 binaries for each supported CPU family. 

I refactored GLideN64 code to totally remove direct calls to graphics API from main code. All core GLideN64 classes use special proxy class graphics::Context to manipulate with textures , shaders, buffers and so on, and to draw objects. Context passes calls to back-end class. Currently there is one back-end, which uses OpenGL. If somebody wants to add Vulkan API or DirectX API support, it can be made much easier now: just write new back-end.

OpenGL backend designed as dynamically adoptable for available GL version. It may use different functions for the same task. For example, if available GL supports glTexStorage2D the new texture will be initialized with glTexStorage2D and with glTexImage2D otherwise.

Another example is polygons drawing. Core OpenGL 3.3 requires to pass vertex data from application to GL via Vertex Buffer Object (VBO). GLideN64 used immediate mode rendering with data stored in client side arrays. Thus, it could not use core profile. New back-end implements primitives drawer, which uses VBO and supports core profile. However, we found that many Android devices work better with old immediate mode rendering. So, the back-end also has primitives drawer, which uses immediate mode. Back-end decides which drawer to use in run-time and does it transparently to main code.

The amount of code changes was huge. I totally rewrote many parts of code. As the result, the code is much more clean now. Logan McNaughton and Francisco Zurita helped me to tune the back-end and select most effective GL functions for each GL version. In most cases refactored code works as fast or better than before refactoring. Android port now uses only one binary for all versions of GL ES.

VSync support


GLideN64 version for Zilmar-spec emulators does not support vertical sync. I thought that necessity in that option gone with analog monitors. However, users asked me to add it, because they experienced tearing on their monitors without it. VSync is not part of OpenGL specifications. Use of WGL extensions required to enable it on Windows. I haven't time for it until recently. The code refactoring made it possible to use OpenGL core profile on Windows. Core profile also requires use of WGL extensions. I made necessary changes. Adding VSync support was a matter of few lines after that. Ryan Rosser added new control to the GUI.

MacOsX port


While GLideN64 successfully works on Linux and Android, Mac port was impossible until now. Mac OpenGL driver requires from application to use core GL profile if it needs to work with OpenGL 3.3 or above. That implies VBO support. GLideN64 did not support VBO until the refactoring. I made new attempt to port the code to MacOsX after refactoring completed. This attempt was successful:

I don't know how well the port works: I have no Mac. I got remote access to Mac mini via command line and just made code compilable. The video provided by Brent Woodruff, who built plugin from sources and run on his Mac.


Anti aliasing without frame buffer emulation



Frame buffer emulation is enabled by default. It can be disabled, but it also disables many features, including anti aliasing and gamma correction. This is because anti aliasing and gamma correction requires rendering with Frame Buffer Objects (FBO), which is enabled only with frame buffer emulation. I changed it: now plugin always uses FBO for rendering. This made possible to use anti aliasing and gamma correction even when frame buffer emulation disabled. This was done as preliminary step for another large code refactoring, which I will describe next time.


Donations


Donations are welcome. Two options are available: Yandex Money (the form above) and PayPal: https://www.paypal.me/SergeyLipskiy. Both methods work well, my thanks to people, who used them. Also, does anybody know how to place my paypal.me link as widget/gadget on blog layout? I'm helpless in web design. Another side note: it seems that my mail server has problems with sending mails to AOL mailboxes. I tried to say thanks by email, but it probably was not delivered.