Journey to Poom
Doomed Assault
As in many gamedev stories, Poom should not have existed.
It came to be possible on Pico8 thanks to a (still unpublished) project, an Assault demake - a game I used to play at arcades as a kid (dual stick! excellent music! ultra low bass explosions!).
Early 2020, I was quite proud of my silky smooth rotozoom engine: running entirely in memory, well below 50% CPU with enemy units.
@Eniko & @Electricgryphon had already paved the way, I went slightly further:
load #assault (requires a 0.1.12c version to run at full speed)
Engine is looking good, time to export to HTML and demo it. I run a couple of tests on my home computers and mobiles...
It did not go well: performance was all over the place and the game required a powerful PC to run at full speed.
When I reported the bug (?) to Zep (Joseph White, Pico8 author), it became apparent that the game relied too much on binary operations and trashed the web player. The delicate balance of simulated API costs did not account for so many "low level" operations per frame.
Too many binary ops? Nah...
poke4( mem, bor( bor( bor(rotr(band(shl(m[bor(band(mx,0xffff), band(lshr(srcy,16),0x0.ffff))],shl(band(srcx,7),2)),0xf000),28), rotr(band(shl(m[bor(band(mx mdx1,0xffff), band(lshr(srcy-ddy1,16),0x0.ffff))],shl(band(srcx ddx1,7),2)),0xf000),24)), bor(rotr(band(shl(m[bor(band(mx mdx2,0xffff), band(lshr(srcy-ddy2,16),0x0.ffff))],shl(band(srcx ddx2,7),2)),0xf000),20), rotr(band(shl(m[bor(band(mx mdx3,0xffff), band(lshr(srcy-ddy3,16),0x0.ffff))],shl(band(srcx ddx3,7),2)),0xf000),16)) ), bor( bor(rotr(band(shl(m[bor(band(mx mdx4,0xffff), band(lshr(srcy-ddy4,16),0x0.ffff))],shl(band(srcx ddx4,7),2)),0xf000),12), rotr(band(shl(m[bor(band(mx mdx5,0xffff), band(lshr(srcy-ddy5,16),0x0.ffff))],shl(band(srcx ddx5,7),2)),0xf000),8)), bor(rotr(band(shl(m[bor(band(mx mdx6,0xffff), band(lshr(srcy-ddy6,16),0x0.ffff))],shl(band(srcx ddx6,7),2)),0xf000),4), band(shl(m[bor(band(mx mdx7,0xffff), band(lshr(srcy-ddy7,16),0x0.ffff))],shl(band(srcx ddx7,7),2)),0xf000)) ) )
Long story short, Pico8 0.2 is out shortly after - binary operators and tline ("textured line") are a thing...
The new addition to the Pico8 API manual reads:
tline x0 y0 x1 y1 mx my [mdx mdy] [layers]
Draw a textured line from (x0,y0) to (x1,y1), sampling colour values from the map.
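To make the idea concrete, here is a rough Python sketch (not the Pico8 implementation) of what a textured line does: walk the screen pixels from (x0,y0) to (x1,y1) and advance the map coordinate by (mdx,mdy) after each pixel. The helper name `tline_samples` and its return format are made up for illustration; the default step of 1/8 tile per pixel is my reading of the manual.

```python
def tline_samples(x0, y0, x1, y1, mx, my, mdx=0.125, mdy=0.0):
    """Return (screen_x, screen_y, map_u, map_v) for each pixel of a line.

    Hypothetical helper: the map coordinate starts at (mx, my) and is
    advanced by (mdx, mdy) for every pixel drawn on screen.
    """
    steps = max(abs(x1 - x0), abs(y1 - y0))
    samples = []
    for i in range(steps + 1):
        # interpolate the screen position along the line
        t = i / steps if steps else 0.0
        x = round(x0 + (x1 - x0) * t)
        y = round(y0 + (y1 - y0) * t)
        # the texture position moves independently of the screen position
        samples.append((x, y, mx + i * mdx, my + i * mdy))
    return samples


# one horizontal scanline, stepping diagonally through the map
for sample in tline_samples(0, 0, 3, 0, 0, 0, 0.5, 0.25):
    print(sample)
```

For a rotozoom, each screen scanline would be one such call, with (mdx,mdy) derived from the rotation angle and zoom factor.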
Porting Assault to tline proved the huge potential of the function:
45% cpu - 1024x1024 pix map - rotating enemies - nuklear blast!
Half joking I told Zep about the flood of Doom clones we will have in no time...
What if, say, I dig myself a bit more into binary space partitioning (BSP trees) & portals?
A sign that current project is going to have some competition...
Doom Engine? Nah!
The good thing about approaching a (very successful) game 30 years late is that documentation & tooling are top notch!
Driven by portability & extensibility, level geometry is now expressed using an open text format (UDMF), saving the need for tedious binary unpacking of WAD structure (to some extent...) and perfect for quick hacking.
May 9-10th: armed with Real-Time Collision Detection from Christer Ericson, ANTLR for UDMF parsing and a good pinch of Python I got my first BSP compiler up and a running Pico8 renderer:
each polygon has its own color - notice split on pillar
https://twitter.com/FSouchu/status/1259520453128990721?s=2
Sorry Assault, see you on the other side!
Next 2 weeks are spent digging into WAD structure (I think I know what a linedef is by now...) and going over Zdoom wiki a million times.
My custom compiler was quickly dropped in favor of the "official" zbsp compiler (nothing beats years of bug fixing!), Python code fully decodes binary WAD files, complex geometry is supported, including textured walls & floors:
https://twitter.com/FSouchu/status/1266474492890624011?s=20
Slow motion rendering (without textures), each color represents a convex sub-sector:
Quake To The Rescue
Since textures are in, the engine switched from "canon" front-to-back Doom rendering to back-to-front.
I never really thought front-to-back would work anyway - programming for Pico8 is more akin to targeting a very low end GPU.
As per Doom standard, perspective correct texturing & shading work for cheap as long as the world is made of flat walls & floors.
Fast depth shading (the farther you look, the darker it is) is done using standard palette swapping. Depth information indexes a gradient table, stored in Pico8 RAM. This makes a full palette swap faster than a regular pal() call, saving precious cycles in the core wall & floor rasterization loop:
-- pal1: normalized integer depth
if(pal0!=pal1) memcpy(0x5f00,0x4300|pal1<<4,16) pal0=pal1
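The same idea, spelled out as a Python sketch: precompute one 16-colour palette row per depth level, then only swap when the depth level actually changes. The `darker` mapping and the class and function names are placeholders of mine, not the game's data.

```python
def build_gradient_table(levels, base_palette, darker):
    """Row d holds the base palette darkened d times.

    `darker` maps a colour index to the next darker colour index
    (illustrative data, not the real Poom gradient).
    """
    table = [list(base_palette)]
    for _ in range(levels - 1):
        table.append([darker[c] for c in table[-1]])
    return table


class DepthShader:
    def __init__(self, table):
        self.table = table
        self.current = None   # plays the role of pal0
        self.active = None    # stands in for the draw palette memory
        self.swaps = 0

    def select(self, depth):
        # skip the copy when the depth level did not change
        if depth != self.current:
            self.active = self.table[depth]
            self.current = depth
            self.swaps += 1
        return self.active


darker = [max(c - 1, 0) for c in range(16)]
shader = DepthShader(build_gradient_table(4, range(16), darker))
for depth in (2, 2, 2, 3):
    shader.select(depth)
print(shader.swaps)
```

Consecutive spans at the same depth cost nothing extra, which is the common case when walking along a wall or floor.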
Back-to-front rendering means losing the natural "narrowing" of level scanning of standard Doom (see the Black Book for an extensive explanation!) and the limited performance impact of large levels that comes with it.
That's where having read Michael Abrash's Black Book years ago helped.
I knew Quake had the same issue and solved it using a Potentially Visible Set (PVS). A PVS, generated at compile time, is the set of all potentially visible convex sub-sectors from any given sub-sector (read it twice, slowly!).
Of course, zbsp doesn't generate such information...
Some more Python code & maths (and cursing!), it works!!!
bold numbers are all sub-sectors visible from sub-sector 38
Surprisingly, there is little literature on how to generate a proper PVS. The most notable sources I found:
Source Engine PVS - A Closer Look
Potentially Visible Sets explained
The PVS is encoded as a bitfield, stored as 32-bit chunks to keep memory usage under control (a much simpler scheme than the RLE encoding of the Quake PVS):
-- pvs (packed as a bit array)
unpack_array(function()
  -- visible sub-sector id (16 bits)
  local id=unpack_variant()
  -- pack as a bitfield
  pvs[id\32]=bor(pvs[id\32],0x0.0001<<(id&31))
end)
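For readers less used to Pico8 fixed-point tricks, here is the same packing written as a small Python sketch, along with the visibility test the renderer would need. The function names are mine, and a sparse dict stands in for the Lua table.

```python
def pack_pvs(visible_ids):
    """Pack sub-sector ids into a sparse table of 32-bit chunks."""
    pvs = {}
    for sid in visible_ids:
        chunk = sid // 32
        pvs[chunk] = pvs.get(chunk, 0) | (1 << (sid % 32))
    return pvs


def is_visible(pvs, sid):
    """True when sub-sector `sid` is flagged in the packed set."""
    return ((pvs.get(sid // 32, 0) >> (sid % 32)) & 1) == 1


pvs = pack_pvs([0, 31, 32, 38])
print(is_visible(pvs, 38), is_visible(pvs, 37))
```

Chunks with no visible sub-sector are simply absent, so a sparse set costs almost nothing.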
Virtual Sprites
Working some more on getting rotating sprites into the game engine, it became obvious that sprite real-estate was going to be a major pain point. For reference, a single soldier pose eats almost half of the space available. And that's for a "small" sprite...
I knew the game would be multi-cart, so storage was not my main concern, and Pico8 has a large 2MB of Lua RAM to play with.
What if sprites could fit in memory? How then to best use the built-in Pico8 sprite scaling capabilities (read: sspr)?
What if I had some kind of fast memory allocator? A least recently used (LRU) cache is a good design pattern for this case, and simple enough if chunks are of fixed size.
What if I split sprites into little (but not too small) chunks, say 16x16, and got their actual sprite location from that "virtual memory" allocator?
July 15th, this is indeed a very workable approach - memcpy is fast enough to swap required sprite tiles on the fly. Sprites can be up to 16 tiles, i.e. up to 64 by 64 pixels, allowing large monsters like the Cyberdemon at 1/2 of their original size.
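A minimal Python sketch of such a fixed-size LRU allocator, assuming a small pool of sprite-sheet slots. The `TileCache` name and the `copies` counter (standing in for the memcpy of tile pixels) are illustrative, not the game's code.

```python
from collections import OrderedDict


class TileCache:
    """Maps virtual tile ids to a small pool of sprite-sheet slots."""

    def __init__(self, slots):
        self.free = list(range(slots))
        self.used = OrderedDict()   # tile id -> slot, least recent first
        self.copies = 0             # how many tile uploads were needed

    def slot_for(self, tile_id):
        if tile_id in self.used:
            # cache hit: mark as most recently used, no copy required
            self.used.move_to_end(tile_id)
            return self.used[tile_id]
        if self.free:
            slot = self.free.pop()
        else:
            # cache full: recycle the least recently used slot
            _, slot = self.used.popitem(last=False)
        self.used[tile_id] = slot
        self.copies += 1            # here the real thing would memcpy pixels
        return slot


cache = TileCache(2)
for tile in ("imp_head", "imp_legs", "imp_head", "cyb_arm"):
    print(tile, "->", cache.slot_for(tile))
```

Fixed-size chunks are what keep this simple: any free slot fits any tile, so there is no fragmentation to manage.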
Say hi to Cyb'!
Virtual Sprite Engine (tm!) integrated in game - monsters can be much more than a blurry pixelated mess!
Actors are registered in each sub-sector they are touching, based on their radius. Multiple actors on a given sub-sector are sorted using a basic insertion sort, assuming the number of "co-located" actors is usually low:
-- all things in sub-sector
for thing,_ in pairs(segs.things) do
  local x,y=thing[1],thing[2]
  local ax,az=m1*x+m3*y+m4,m9*x+m11*y+m12
  -- todo: take radius into account
  if az>8 and az<854 and ax<az and -ax<az then
    -- default: insert at end of sorted array
    local w,thingi=128/az,#things+1
    -- basic insertion sort
    for i,otherthing in ipairs(things) do
      if(otherthing[1]>w) thingi=i break
    end
    -- perspective scaling
    add(things,{w,thing,63.5+ax*w,63.5-(thing[3]+m8)*w},thingi)
  end
end
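Stripped of the projection maths, the sorting step boils down to the following Python sketch. The sort key is the perspective scale (128 divided by depth), so a smaller scale means farther away and gets drawn first. `insert_sorted` is a name I made up for this illustration.

```python
def insert_sorted(things, scale, thing):
    """Insert (scale, thing) so that `things` stays ordered far to near."""
    index = len(things)            # default: append at the end
    for i, (other_scale, _) in enumerate(things):
        if other_scale > scale:    # first entry nearer than the new one
            index = i
            break
    things.insert(index, (scale, thing))


things = []
for name, depth in (("far imp", 64), ("near imp", 16), ("mid imp", 32)):
    insert_sorted(things, 128 / depth, name)
print([name for _, name in things])
```

With only a handful of actors per sub-sector, the linear scan is cheaper in tokens and cycles than anything smarter.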
Once sorted, sprites are rendered right after sector's polygons.
Note that registering an actor in multiple sub-sectors also solves the issue of overlapping sprites/polygons.
See how back-to-front rendering of convex sectors is erasing part of the Cacodemon in the gif below. The sprite is rendered multiple times, and the last render (in the nearest sub-sector) fixes the image:
Registering actors per sub-sector (i.e. BSP leaves) is also used to speed up player/actor & actor/actor collision detection.
Summer time. At this point my goal is clear: make the engine as easy as possible for an artist to work with, since I'll need a team to realize the vision.
On to gameplay!
Decorate Love Letter
The DECORATE format is brilliant!
Each "thing" runs its own little state machine, can reference sprites and call game functions:
actor ZombieMan : Monster 3004
{
  Health 20
  Radius 20
  Height 56
  Speed 8
  States
  {
  Spawn:
    POSS A 10 A_Look;
    Loop
  See:
    POSS A 8
    POSS B 8 A_Chase;
    Loop
  Missile:
    POSS E 10 A_FaceTarget;
    POSS F 8 A_FireBullets(22.5, 0, 1, 9, "BulletPuff");
    POSS E 8
    Goto See
  Death:
    POSS H 5
    POSS I 5 // A_Scream
    POSS J 5 // A_NoBlocking
    POSS K 5
    POSS L 60
    Stop
  }
}
The Python compiler supports a limited set of features, but enough to support key Doom gameplay elements (states, animations, function calls with parameters).
The runtime part is simple enough to fit in less than 20 lines of code:
-- vm update
tick=function(self)
  while ticks!=-1 do
    -- wait (+ reset random startup delay)
    if(ticks>0) ticks+=delay-1 delay=0 return true
    -- done, next step
    if(ticks==0) i+=1
    ::loop::
    local state=states[i]
    -- stop (or end of vm instructions)
    if(not state or state.jmp==-1) del_thing(self) return
    -- loop or goto
    if(state.jmp) self:jump_to(state.jmp) goto loop
    -- effective state
    self.state=state
    -- get ticks
    ticks=state[1]
    -- trigger function (if any)
    -- provide owner and self (eg. for weapons)
    if(state.fn) state.fn(self.owner or self,self)
  end
end
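If the Lua above is a bit dense, here is a simplified Python sketch of the same kind of tick-driven state machine. The state layout (dicts with `ticks`, `fn` and `jmp` keys) and the `StateMachine` name are assumptions made for the example, and the random startup delay is left out.

```python
STOP = -1


class StateMachine:
    def __init__(self, states):
        self.states = states
        self.index = -1     # no state entered yet
        self.ticks = 0      # frames left in the current state
        self.alive = True

    def tick(self):
        """Advance one frame; return False once the actor is finished."""
        if not self.alive:
            return False
        if self.ticks > 0:
            self.ticks -= 1
        # a duration of -1 never reaches 0: the actor stays there forever
        while self.ticks == 0:
            self.index += 1
            state = self.states[self.index] if self.index < len(self.states) else None
            if state is None or state.get("jmp") == STOP:
                self.alive = False
                return False
            if "jmp" in state:
                # loop or goto: the next iteration lands on the target
                self.index = state["jmp"] - 1
                continue
            self.ticks = state["ticks"]
            if state.get("fn"):
                state["fn"](self)
        return True


log = []
zombie = StateMachine([
    {"ticks": 2, "fn": lambda actor: log.append("look")},
    {"ticks": 1, "fn": lambda actor: log.append("chase")},
    {"jmp": 0},
])
for _ in range(5):
    zombie.tick()
print(log)
```

The compiler's job is then mostly to flatten DECORATE labels into jump indices and bind function names to game functions.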
Best of all, everything in game is described using the same syntax: weapons, bullets, items & monsters!
By August, I have a Python package that can be easily installed. The compiler supports many key Doom features (doors, platforms, monsters, infighting, multiple weapons, pick ups, difficulty levels...).
Main "compilation" pipeline steps:
- Read WAD entries
- Extract normal & pain palettes
- Read actors (from DECORATE file)
  - Split actor sprites into unique tiles
  - Read properties
  - Decode state machine & function bindings
  - Decode sprite properties
- For all maps (from gameinfo file)
  - Convert texture into unique set of tiles (max. 128)
  - Read skybox image (if any)
  - Read level data
    - sectors, sides, vertices, linedefs, pvs, sub-sector & BSP nodes
    - specials (triggers)
    - active textures
    - things (e.g. monsters, weapons...)
- Actors & map data is packed into multiple carts
The Right Match
End of August, I reach out to Paranoid Cactus (of X-Zero fame, and I am a total fan of his work!). Let's wait and see...
Paranoid Cactus (Simon Hulsinga IRL) replies a couple of days later and seems to be interested - good news!
September 9th, Simon delivers a first test level:
wow (it became my favorite expression throughout the project)
Code is almost complete, we should be shipping by what, end September?
....
If we knew...
8192 Tokens Forever
It so happens that Simon is a multi-classed gamedev, mastering code, art, music & gameplay (yeah, life is unfair!).
Paranoid Cactus takes the lead on gameplay decisions, we both get into a routine of challenging current features, reworking the engine to support ever increasing details while keeping tokens in check. To name a few:
- Dedicated title screen
- Weapon wheel
- Save player state between levels
- Flying monster support
- Transparent textures
- Secret sectors
- Skybox
- Sound blocked by walls
- Non-bullet weapons (e.g. hands)
Code goes into massive refactoring, always close to the danger zone, always finding new ways to squeeze our last idea in!
Example token optimization technique, where item identifier, property name and unpacking function are declared as a large text block:
-- layout:
-- property mask
-- property class name
-- property unpack function
local properties_factory=split("0x0.0001,health,unpack_variant,0x0.0002,armor,unpack_variant,0x0.0004,amount,unpack_variant,0x0.0008,maxamount,unpack_variant,0x0.0010,icon,unpack_chr,0x0.0020,slot,mpeek,0x0.0040,ammouse,unpack_variant,0x0.0080,speed,unpack_variant,0x0.0100,damage,unpack_variant,0x0.0200,ammotype,unpack_ref,0x0.0800,mass,unpack_variant,0x0.1000,pickupsound,unpack_variant,0x0.2000,attacksound,unpack_variant,0x0.4000,hudcolor,unpack_variant,0x0.8000,deathsound,unpack_variant,0x1,meleerange,unpack_variant,0x2,maxtargetrange,unpack_variant,0x4,ammogive,unpack_variant,0x8,trailtype,unpack_ref,0x10,drag,unpack_fixed",",",1)
-- properties: property bitfield of current actor
for i=1,#properties_factory,3 do
  if properties_factory[i]&properties!=0 then
    -- unpack & assign value to actor
    actor[properties_factory[i+1]]=_ENV[properties_factory[i+2]](actors)
  end
end
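The pattern itself is language agnostic: one flat string drives a generic loop, and function names are resolved at runtime. A cut-down Python sketch follows, with a made-up three-entry table and made-up readers (`read_byte`, `read_word`) rather than the real Poom unpackers.

```python
# mask, property name, reader name: three fields per entry
PROPERTY_TABLE = "1,health,read_byte,2,armor,read_byte,4,speed,read_word".split(",")


def read_byte(stream):
    return next(stream)


def read_word(stream):
    # little-endian 16-bit value
    low, high = next(stream), next(stream)
    return low | (high << 8)


# plays the role of the _ENV lookup: reader name -> function
READERS = {"read_byte": read_byte, "read_word": read_word}


def unpack_actor(mask, stream):
    """Read only the properties whose bit is set in `mask`."""
    actor = {}
    for i in range(0, len(PROPERTY_TABLE), 3):
        bit, name, reader = PROPERTY_TABLE[i:i + 3]
        if int(bit) & mask:
            actor[name] = READERS[reader](stream)
    return actor


print(unpack_actor(0b101, iter([20, 0x34, 0x12])))
```

On Pico8 the payoff is that a string literal costs a single token however long it is, while spelling each assignment out costs several tokens per property.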
Bumpy Zone
So far it looks like everything went nice and smoothly... It did not!
Collisions
Collision went through major refactoring multiple times to make the world feel solid - one of the key points of Doom.
I was actually surprised to find that Doom used a different data model to handle collision (the well known BLOCKMAP), when the BSP would have been a perfect match (Black Book & Carmack notes confirm it was a missed opportunity).
Poom collision code generates a list of linedefs traversed along a ray, using the BSP to traverse the world in order.
One day you have a super solid collision routine, the next day you got that gif:
The root cause is that a BSP tree loses the spatial relationship between sectors, e.g. 2 walls might be in 2 different parts of the tree, leaving a kind of "gap" when handling collisions in sequence.
The solution was to treat each wall as a capsule (i.e. a ray with a radius), ensuring the player path cannot fall in between wall segments.
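The capsule test reduces to a point-to-segment distance check. Below is a generic Python sketch of that check (a textbook formulation, not the actual Poom routine); the function names and the wall tuple layout are mine.

```python
import math


def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point P to segment AB."""
    abx, aby = bx - ax, by - ay
    length_sq = abx * abx + aby * aby
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # project P on AB, then clamp to the segment end points
    t = ((px - ax) * abx + (py - ay) * aby) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))


def hits_wall(px, py, wall, radius):
    """A wall is (ax, ay, bx, by); the radius turns it into a capsule."""
    return distance_to_segment(px, py, *wall) <= radius


# two walls meeting at a corner: a point just past the corner still collides
walls = [(0, 0, 10, 0), (10, 0, 10, 10)]
print([hits_wall(10.5, -0.5, wall, 1.0) for wall in walls])
```

Because the end points are rounded, two walls sharing a vertex overlap at the corner, so there is no seam for the player to slip through.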
Black sectors
PVS calculation has also been quite tricky to get right in all cases. When it failed, whole sectors would end up disappearing from the screen (dubbed "black sectors" by me and Simon).
Left side: standing in red sector, zone circled yellow is clearly MIA!
Going back to Christer Ericson's book, the correct PVS calculation algorithm ended up as:
from a given sub-sector (convex zone):
  iterate over all double sided linedef
    register connected sub-sector ("other sub-sector")
    # find all anti-portals
    for all double sided linedef from other sub-sector:
      if other linedef is front or straddling current linedef:
        register linedef pair as a portal
  # clip & find visible geometry
  while portal set is not empty:
    create clip region ("anti-penumbra")
    for all sides of destination portal:
      if side is back or straddling:
        clip segment
    if anything remains:
      register a new portal
      mark sub-sector visible
Compression
Compression was not supported until late in the game's development. I already had an LZW encoder for Assault that worked rather well.
Thing is, I soon realized that LZW decompression is a memory hog, certainly not compatible with a game already stretching available RAM...
Google to the rescue, I was sure the embedded crowd had something for me.
A random post on some Arduino forum led me to this: https://www.excamera.com/sphinx/article-compression.html
Compression code in Python, decompression code small enough to fit token budget and best of all, fixed memory cap!
Compression ratio is about 50%, does a good enough job on picture and world data.
Thanks James Bowman!
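To show why this family of compressors has a fixed memory cap, here is a generic LZ77-style sketch in Python. It is not the encoder from the linked article (the token format here is a plain Python list, nothing is bit-packed, and the function names and parameters are mine). The point is that the decoder only ever reads back from its own output, so it needs no dictionary.

```python
def lz_compress(data, window=255, min_match=3, max_match=255):
    """Greedy LZ77: emit literals or (offset, length) back-references."""
    out = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        for start in range(max(0, i - window), i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[start + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - start
        if best_len >= min_match:
            out.append((best_off, best_len))
            i += best_len
        else:
            out.append(data[i])
            i += 1
    return out


def lz_decompress(tokens):
    """Rebuild the data; the only state needed is the output itself."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                # byte by byte, so overlapping copies work
                out.append(out[-offset])
        else:
            out.append(token)
    return bytes(out)


data = b"abcabcabcabcxyz" * 4
tokens = lz_compress(data)
print(len(data), "bytes ->", len(tokens), "tokens")
```

LZW, by contrast, has to rebuild a growing dictionary while decoding, which is where the memory goes.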
Out of Memory
That roadblock was a big one - playing levels back to back would tip the game over the 2MB limit.
Below is the kind of chart I used to find out if I was battling a garbage collection bug, dependency cycles or some other bug...
The game went through heavy refactoring at this point:
- hot reload after death was removed (ensures clean memory slate)
- most data structures were converted to arrays; code readability suffered but memory usage went back to the safe zone
-- 11KB
for i=1,100 do add(buf,{id=1,tex=2,sp=3,hp=4}) end
-- 7KB
for i=1,100 do add(buf,{1,2,3,4}) end
Artwork & Level Design
Poom's appeal would have been nothing without the right pixels - and we needed many many pixels...
Left to right: original Doom Imp | automatic 50% scaling + palette conversion | hand fixed
The best part of September to December was spent redrawing the selected cast to reach the right level of quality.
What Went Right
Choices
Pico8 games are all about making choices. Even with such a large game, not every idea could make it.
Fun & gameplay came first, "hey, it fits!" second. I trusted Simon when it came to game design decisions, while still trying to lure him with new engine features like transparent textures!
Beta Test
Mid November, the game is ready for testing.
As obvious as it seems, once you have spent some months working on a game, having a pair of fresh eyes is invaluable.
I sent out invites to Tom Hall (of Doom design fame and resident Pico8 Discord member) and Henri Stadolnik (of Fuz fame). They quickly proposed a good number of quality of life tweaks & great insight into level progression.
The most notable addition was mouse support (quoting Tom: "I want mouse support"), wiring a "mouse lock" browser handler to Pico8 GPIO to communicate coordinates back to the game.
Demoing that to Zep certainly helped bring mouse lock as a native Pico8 feature (with an under-the-cover patch delivered in the wee hours of the night!) - Thanks!
Conclusion
Beyond the technique, the main takeaway is that such a large game would have been difficult to pull off without teamwork, kind words & contributions from the extended Pico8 community (hey @farbs, @nucleartide, @sam, @valeradhd...!).
The game was our biggest success so far, totalling 10K downloads, 100K web sessions, hundreds of comments...
Thank you all for that (and thanks Id Software for such a timeless game)!
Did reading this give you some new gameplay ideas? Want to rework sprites or try to improve the engine?
Hesitate no more, a full blown SDK is there - the same tools we used to make the game!
Poom SDK (support Discord: https://discord.gg/Bmc4nxjfuE)