WIP: Lossy Sprite Compression #874

Closed
meeq wants to merge 29 commits into DragonMinded:preview from meeq:lossysprite

Conversation

Contributor

@meeq meeq commented Apr 29, 2026

Rasky and I have been working on implementing support for H.264-based lossy compression of sprites. I have a working example that demonstrates lossy-compressing a PNG at build-time and decompressing it at run-time into a LibDragon Sprite.

Public API

  • lossysprite_load — loads/decodes an LSPR file (an H.264 intra slice) into a regular sprite_t
  • sprite_load / sprite_load_buf sniff the LSPR magic and delegate to lossysprite_load, so existing call sites work transparently
  • rdpq_sprite_blit knows how to draw the result
  • rdpq_sprite_upload asserts (single-TMEM upload is impossible for semi-planar YUV)
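
The magic-sniffing dispatch described above can be sketched in plain C. This is only an illustration: the `demo_` names are hypothetical, and the sketch assumes the magic is the four ASCII bytes "LSPR" at offset 0, which the PR description does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* ASSUMPTION: the magic is the four ASCII bytes "LSPR" at offset 0.
   The PR names the magic but does not document its encoding. */
#define DEMO_LSPR_MAGIC "LSPR"

typedef enum { DEMO_KIND_SPRITE, DEMO_KIND_LOSSY } demo_kind_t;

/* Returns true when the buffer is long enough and starts with the magic. */
bool demo_is_lspr(const void *buf, size_t sz)
{
    assert(buf != NULL || sz == 0);
    return sz >= 4 && memcmp(buf, DEMO_LSPR_MAGIC, 4) == 0;
}

/* Hypothetical stand-in for the check at the top of sprite_load_buf:
   lossy files take the decoder path, everything else the regular one. */
demo_kind_t demo_classify(const void *buf, size_t sz)
{
    return demo_is_lspr(buf, sz) ? DEMO_KIND_LOSSY : DEMO_KIND_SPRITE;
}
```

Because the check only reads the first four bytes, existing callers pay a negligible cost when loading regular sprites.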

Decoder (src/lossysprite.c)

  • Decompresses the LSPR file into YUV planes at run-time using the existing H.264 decoder
  • Emits a semi-planar layout: a full-res Y plane (FMT_I8) followed by a half-res UV plane (FMT_IA16, U high / V low — matches what the RSP UV interleaver normally produces).
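
To make the semi-planar layout concrete, here is a small sketch of the byte arithmetic. It assumes 4:2:0 subsampling (UV at half width and half height), even dimensions, and no row padding; the real decoder may pad its planes, and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte layout of a semi-planar buffer: Y plane first, then the UV plane. */
typedef struct {
    size_t y_bytes;    /* full-res Y plane, 1 byte per pixel (I8)      */
    size_t uv_offset;  /* UV plane starts right after the Y plane      */
    size_t uv_bytes;   /* half-res UV plane, 2 bytes per texel (IA16)  */
    size_t total;      /* size of the whole pixel buffer               */
} demo_semiplanar_layout_t;

/* ASSUMPTION: 4:2:0 subsampling, even dimensions, no row padding. */
demo_semiplanar_layout_t demo_semiplanar_layout(int width, int height)
{
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    demo_semiplanar_layout_t l;
    l.y_bytes   = (size_t)width * (size_t)height;
    l.uv_offset = l.y_bytes;
    l.uv_bytes  = (size_t)(width / 2) * (size_t)(height / 2) * 2;
    l.total     = l.y_bytes + l.uv_bytes;
    return l;
}

/* One IA16 texel of the UV plane: U in the high byte, V in the low byte. */
uint16_t demo_pack_uv(uint8_t u, uint8_t v)
{
    return (uint16_t)(((unsigned)u << 8) | v);
}
```

Under these assumptions a 640×480 image needs 307,200 bytes of Y plus 153,600 bytes of UV, 460,800 bytes in total, compared with 614,400 bytes for the same image as RGBA16.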

YUV blit path (include/yuv.h, src/video/yuv.c)

  • yuv_tex_blit_semiplanar skips RSP YUV interleaving and goes straight to RDP load+draw

Sprite integration (src/sprite.c, src/sprite_internal.h, src/rdpq/rdpq_sprite.c)

  • Added support for semi-planar YUV sprites to rdpq_sprite_blit to auto-configure YUV render mode
  • New flag SPRITE_FLAG_YUV_SEMIPLANAR (0x0080) used during sprite blitting
  • Repurposed the spare padding byte in sprite_ext_t for YUV colorspace

Example (examples/lossysprite/)

  • Minimal ROM: load a 640×480 background as .sprite, blit it via rdpq_sprite_blit

Outstanding concerns

  1. Allow opt-in for lossy decoding with sprite_load / sprite_load_buf so that the linker can strip out the H.264 code if unused.
  2. rdpq_set_mode functions should not be called in rdpq_sprite_blit
    • Due to implementation details of lossy compression with H.264, it is more efficient to use YUV16 format for blitting, but this introduces a conflict: the RDP needs to be in YUV mode, but usage of rdpq_sprite_blit is not intended to touch render modes.
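
One common way to address the first concern is a registration hook: the generic loader only calls the decoder through a function pointer that stays NULL until the application opts in. The sketch below models that idea in plain C. Every name is hypothetical, the "LSPR" byte check is an assumption, and in a real build the decoder would need to live in its own object file for the linker to drop it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int lossy; } demo_sprite_t;   /* stand-in for sprite_t */
typedef demo_sprite_t *(*demo_decoder_fn)(const void *buf, size_t sz);

/* NULL until the application opts in, so generic code never names the decoder. */
static demo_decoder_fn demo_lossy_hook = NULL;

/* Stand-in for the H.264-backed decoder. */
static demo_sprite_t *demo_lossy_decode(const void *buf, size_t sz)
{
    static demo_sprite_t decoded = { .lossy = 1 };
    (void)buf;
    (void)sz;
    return &decoded;
}

/* Opt-in point: the only symbol that references the decoder. */
void demo_lossy_init(void)
{
    demo_lossy_hook = demo_lossy_decode;
}

/* Generic loader: dispatches lossy files through the hook if registered. */
demo_sprite_t *demo_sprite_load_buf(const void *buf, size_t sz)
{
    static demo_sprite_t plain = { .lossy = 0 };
    assert(buf != NULL);
    if (sz >= 4 && memcmp(buf, "LSPR", 4) == 0) {
        /* Lossy file: succeeds only if the decoder was registered. */
        return demo_lossy_hook ? demo_lossy_hook(buf, sz) : NULL;
    }
    return &plain;
}
```

The trade-off is that a lossy file loaded without the opt-in call fails at run time rather than at link time, so the failure mode needs a clear assert or error message.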

meeq and others added 29 commits April 28, 2026 23:02
Co-authored-by: Copilot <copilot@github.com>
…se render mode changes

Collaborator

@rasky rasky left a comment

Did a first round only on public headers, APIs, and docs.

Comment thread include/lossysprite.h
*
* - default / `--format=RGBA16` -> #FMT_RGBA16 sprite
* - `--format=RGBA32` -> #FMT_RGBA32 sprite
* - `--format=UYVY` -> #FMT_YUV16 sprite (packed 4:2:2)
Collaborator

FMT_YUV16 is a standard RDP texture format, and follows the RDP format naming convention of "NAME". I might be biased because I'm used to it, but I think it is preferable not to introduce a new name, UYVY, for a standard format.

Comment thread include/lossysprite.h
* (#FMT_I8) + interleaved UV plane
* (#FMT_IA16, half height), tagged as
* #FMT_YUV16 with the semi-planar layout
* recorded in #sprite_ext_t::yuv_attrs
Collaborator

Do not put implementation details in lossysprite.h such as referring to how #sprite_ext_t::yuv_attrs is encoded or the fact that it even exists. You should put them in lossysprite.c. lossysprite.h must only comment the public API (and tools).

Comment thread include/lossysprite.h
* dispatches here automatically once #lossysprite_init has registered the
* decoder. Call #lossysprite_decode_buf directly only when the encoded
* bytes are already in memory (e.g. embedded as a DSO data blob).
*/
Collaborator

As mentioned above this whole initial doc is highly skewed towards implementation details. A few questions the initial docs should answer:

  • What is a lossy sprite and what is its relation to a normal sprite? When is it useful to use it? Is it OK for 32x32 textures? Better for larger files?
  • Does it use the same file .sprite or is it a different file?
  • What's the relationship between the sprite.h API and the lossysprite.h API? Do I need to use both? One or the other? If they are alternatives, what are the pros and cons? Does one support more cases than the other?
  • What are the performance characteristics of a lossysprite? Do I expect it to be slow at loading, slow at drawing, both or neither? Does it use RSP during loading or drawing? Is the loading blocking or asynchronous?
  • What are the memory characteristics of a lossysprite? Is it fully decompressed at loading? Is it decompressed into RGBA or YUV at load time?

Comment thread include/lossysprite.h
* matching files through #lossysprite_decode_buf. Until then, loading an
* LSPR file via #sprite_load fails. Safe to call multiple times.
*/
void lossysprite_init(void);
Collaborator

Should we account for multiple algorithms here? Should there be one init per algorithm, e.g. lossysprite_init_lvl3()?

Comment thread include/lossysprite.h
* @param sz Size of @p buf in bytes
* @return The decoded sprite (free with #sprite_free)
*/
sprite_t *lossysprite_decode_buf(const void *buf, int sz);
Collaborator

Maybe call this lossysprite_load_buf for symmetry with sprite_load_buf

Comment thread include/sprite.h
* @param sprite The sprite to inspect (must be #FMT_YUV16)
* @return The #yuv_format_t identifying the on-disk layout
*/
yuv_format_t sprite_get_yuv_format(sprite_t *sprite);
Collaborator

So to clarify, sprite_get_format() would always return FMT_YUV16 as a way to say "this is a YUV sprite", and then you need sprite_get_yuv_format to extract the exact layout of the YUV bytes?

I think the complexity here, semantically speaking, is that FMT_YUV16 is actually a specific packed format supported by RDP (UYVY). So reusing it as a placeholder for "generic YUV" might be a bit confusing. I don't have a better suggestion.

Maybe we should mirror this in the mksprite command line though. Instead of having --format=NV12, we could have something like --format=YUV16 --yuv=NV12 or --yuv=UYVY. Or maybe even a special syntax like --format=YUV16:NV12, with --format=YUV16 being the same as --format=YUV16:UYVY.
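
For illustration, the FORMAT:LAYOUT syntax floated above could be split with a few lines of C. This is a sketch of the idea only, not mksprite code: the function name is hypothetical, and it assumes that a sub-layout is accepted only for YUV16 and defaults to UYVY when omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits the value of --format= into "FORMAT" and an optional ":LAYOUT".
   ASSUMPTIONS: only YUV16 accepts a layout, and it defaults to "UYVY".
   Returns 0 on success, -1 on a malformed or oversized argument. */
int demo_parse_format(const char *arg, char *fmt, size_t fmt_sz,
                      char *layout, size_t layout_sz)
{
    assert(arg && fmt && layout && fmt_sz > 0 && layout_sz > 0);

    const char *colon = strchr(arg, ':');
    size_t fmt_len = colon ? (size_t)(colon - arg) : strlen(arg);
    if (fmt_len == 0 || fmt_len >= fmt_sz) return -1;
    memcpy(fmt, arg, fmt_len);
    fmt[fmt_len] = '\0';

    if (strcmp(fmt, "YUV16") != 0) {
        if (colon) return -1;       /* a layout makes no sense for RGBA etc. */
        layout[0] = '\0';
        return 0;
    }

    const char *lay = colon ? colon + 1 : "UYVY";
    size_t lay_len = strlen(lay);
    if (lay_len == 0 || lay_len >= layout_sz) return -1;
    memcpy(layout, lay, lay_len + 1);   /* copy including the terminator */
    return 0;
}
```

With this shape, `YUV16` and `YUV16:UYVY` parse to the same pair, which matches the equivalence suggested in the comment.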

Collaborator

I thought a bit more of this issue. I think there's too much freedom.

At the yuv.h level, it makes sense for yuv_format_t to specify the various packings/layouts like NV12, etc. yuv.h is a utility library and should support many different formats that can commonly be produced by different image/video pipelines. So it makes sense to just support all of them equally behind a unified API.

On the other hand, mksprite should be much more strict. Why should a mksprite user want to request a lossysprite which is exactly NV12 or NV16? The pixel layout is mandated by the codec being used. rsph264 currently emits a full planar format. Tomorrow, it could be optimized to directly emit a semi-planar format (that would be a good optimization to have one day). This should be left as an implementation detail to the codec.

The only input that a user might want to specify at the mksprite level would be the chroma subsampling, as that might affect the quality of the image. But that is not the pixel layout. That input could be specified as a different pseudo texture format, for instance "YUV24" to mean YUV 4:4:4 and "YUV12" to mean YUV 4:2:0 (and maybe also add aliases like YUV_444 and YUV_420). Currently we don't support this in rsph264, so discussing the correct syntax here is moot. Anyway, this is just the chroma subsampling: the pixel layout should be chosen by the decoder, and may even change over time.
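
The numbers in names like "YUV24" and "YUV12" are average bits per pixel for 8-bit samples. A short worked sketch of that arithmetic follows; the function name is hypothetical.

```c
#include <assert.h>

/* Average bits per pixel for 8-bit YUV when chroma is subsampled by
   h_div horizontally and v_div vertically (1 means no subsampling). */
int demo_yuv_bits_per_pixel(int h_div, int v_div)
{
    assert(h_div > 0 && v_div > 0);
    int luma   = 8;                          /* one Y sample per pixel        */
    int chroma = (2 * 8) / (h_div * v_div);  /* one U and one V per block     */
    return luma + chroma;
}
```

Here 4:4:4 is `(1, 1)` and yields 24, 4:2:2 is `(2, 1)` and yields 16 (the packed FMT_YUV16 case), and 4:2:0 is `(2, 2)` and yields 12.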

I think by doing this, we could simplify quite a bit of code in lossysprite.c. I think you're currently post-processing pixels to honor the pixel layout requested by the user in mksprite, but I think that code should go. In fact, in addition to being inefficient, I don't see any reason why a user should want that CPU-based post-processing at runtime, when yuv.h can handle the native pixel layout anyway.

Comment thread src/rdpq/rdpq_sprite.c
// tweaks below, since rdpq_set_mode_yuv resets the SOM.
if (sprite_get_format(sprite) == FMT_YUV16)
rdpq_sprite_set_yuv_mode(sprite);

Collaborator

I would avoid this part. I think rdpq_sprite_upload() shouldn't reset the render mode like this, especially since you can't then do the counterpart for the standard mode anyway. I think this is something that should be documented and left as-is.

Comment thread src/rdpq/rdpq_sprite.c
int padded_w = (int)sx->texparms.s.translate;
int padded_h = (int)sx->texparms.t.translate;
uint8_t *base = (uint8_t*)sprite + sx->data_ptr;
size_t y_bytes = (size_t)padded_w * padded_h;
Collaborator

They are set by lossysprite.c, not mksprite. So that's not the producer.

Comment thread include/yuv_format.h
*
* The default value (0) matches the historical behavior, so existing callers
* that build a #yuv_frame_t with designated initializers do not need to set
* @ref yuv_frame_t::format explicitly.
Collaborator

This paragraph has a bit of recentism. In mksprite, our default is UYVY which is the hardware format; maybe we should set that as default value (0) even though existing source code might break (though it would break with an assert, and that's probably acceptable for the preview branch).
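
The reason the value 0 matters is a C language rule: members left out of a designated initializer are zero-initialized, so whichever enumerator is assigned 0 becomes the implicit default for existing callers. A minimal sketch, with hypothetical type and enumerator names that do not mirror the real yuv_format_t:

```c
#include <assert.h>

/* Hypothetical mirror of the enum/struct pair; names are illustrative only. */
typedef enum {
    DEMO_YUV_LAYOUT_FIRST  = 0,  /* whichever layout gets 0 is the implicit default */
    DEMO_YUV_LAYOUT_SECOND = 1,
} demo_yuv_format_t;

typedef struct {
    int width, height;
    demo_yuv_format_t format;
} demo_yuv_frame_t;

/* Builds a frame the way pre-existing callers would: without naming format. */
demo_yuv_frame_t demo_legacy_frame(int width, int height)
{
    assert(width > 0 && height > 0);
    /* Members omitted from a designated initializer are zero-initialized (C99). */
    demo_yuv_frame_t f = { .width = width, .height = height };
    return f;
}
```

Changing which layout is numbered 0 therefore silently changes what such callers request, which is why the comment expects an assert to catch the mismatch.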

Comment thread include/yuv_format.h
*
* Y, U, and V are three separate #FMT_I8 surfaces; U and V are at half
* width and half height. RSP interleaves U+V into a temporary IA16
* buffer before the RDP draw, so #yuv_init must have been called. */
Collaborator

I don't think it's now reasonable to require a user to know whether yuv_init must be called or not depending on the assets (considering that the YUV format is now embedded in the assets).

I think we should always call it as part of lossysprite_init() just like fmv.c calls it.
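
A sketch of what that could look like: the sprite-level init brings up its own dependency behind a guard, so callers never need to know which assets require it. The functions below are plain-C stand-ins with hypothetical names; whether the real yuv_init tolerates repeated calls is not stated in this thread, so the guard on it is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

static int  demo_yuv_init_calls = 0;     /* counts real initializations */
static bool demo_yuv_ready      = false;
static bool demo_lossy_ready    = false;

/* Stand-in for yuv_init(); guarded so repeated calls are harmless. */
void demo_yuv_init(void)
{
    if (demo_yuv_ready) return;
    demo_yuv_ready = true;
    demo_yuv_init_calls++;
}

/* Stand-in for lossysprite_init(): initializes its dependency itself. */
void demo_lossysprite_init(void)
{
    if (demo_lossy_ready) return;        /* safe to call multiple times */
    demo_yuv_init();
    assert(demo_yuv_ready);
    demo_lossy_ready = true;
}

/* Test helper: how many times the YUV layer was really initialized. */
int demo_yuv_init_count(void)
{
    return demo_yuv_init_calls;
}
```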

Comment thread include/rdpq_sprite.h
* @see #yuv_blitter_run
*/
void rdpq_sprite_yuv_blitter_run(yuv_blitter_t *blitter, sprite_t *sprite);

Collaborator

yuv_blitter_t is a recorded block that allows blitting arbitrary YUV surfaces (of the given size/format) by using placeholders. The idea is to prepare a block once for, e.g., drawing 320x240x2 NV12 frames, and then draw any frame during playback (considering MPEG1/H264 will use a different surface_t for different frames).

So I don't understand very well what these functions do. Seems like a bit of a semantic confusion: they accept a specific, single sprite_t as input, but then proceed to create a generic blitter for all sprite_t of the same size? That seems a bit confusing. I assume you are using the sprite as a template to say "create a blitter to draw sprites identical to this one". Still, I find it a bit confusing.

Also, whatever problem you are solving here is not specific to YUV sprites at all. If you feel it's important to provide an API to do this, it would make sense for it to be available for all sprite formats. All sprites would benefit from this API.

If we want to go this way, I think we should introduce a rdpq_sprite_blitter_t, and have APIs such as:

blitter = rdpq_sprite_blitter_new(format, width, height, x0, y0, parms);
rdpq_sprite_blitter_run(blitter, sprite);

and make it work for all formats.

I just noticed that this would also be missing the yuv_format... This makes me wonder whether we really want yuv_format to be a "subformat", or maybe upgrade them to the top-level tex_format_t...


2 participants