Initial TBR chapter. #338

gpx1000 · 2025-08-04T01:18:31Z

NB, fix the TBR link to the Simple Game Engine tutorial when it is published.

cforfang · 2025-08-05T07:21:12Z

Can I ask if a lot of this was written by AI? I'm very surprised by a lot of the text. Also the notion that one might as a SW developer 'choose' between TBR vs IMR, and have code trying to determine what to pick (ref the PowerConsumptionAnalyzer code) is very strange to me.

gpx1000 · 2025-08-05T07:26:38Z

Parts were written by AI, specifically the power consumption analyzer code as that is outside my normal wheelhouse; but looking at the references they looked solid and I edited it to make it read mostly correct to me. So, genesis by AI sure, but heavy human editing.

cforfang · 2025-08-05T07:26:48Z

I'm not able to give point-by-point feedback, but for example the VK_EXT_robustness2 section seems like total nonsense to me. And the claims about use of VK_KHR_dynamic_rendering_local_read by Unity and Unreal are, as far as I know, also not true. As I scroll through the guide there are in general a lot of strange claims and commentary, I think.

gpx1000 · 2025-08-05T07:27:46Z

Okay, no worries, I'll rewrite it.

cforfang · 2025-08-05T07:32:48Z

Probably not too useful to drop more 'random' drive-by comments like this, but for another example I think none of the use-cases mentioned for VK_EXT_shader_tile_image (bloom, edge-detection, FXAA, SSR) make sense, as the extension only gives access to the current pixel while all of these effects need access to other pixels.

FWIW I've pinged some folks here at Arm to see if we can help review and support development of the guide -- I think it's a great initiative to be clear, but it probably needs some close review especially as there is not too much good and up-to-date public info about current mobile GPUs to pull from (hence also why the idea of the guide is good, of course) :)

gpx1000 · 2025-08-05T07:37:21Z

Thanks, it's MUCH appreciated. I'm far from the best expert on TBR, and I really want to get updated information out there. There's a reason I read all of the research articles linked and tried to put as much research into this chapter as I could. If we could get more details and more review, I'd be much happier. As soon as I get a chance, I'm going to update from the comments already generated here.

solidpixel · 2025-08-05T08:52:35Z

The chapter title is Tile-Based Rendering Best Practices, but most of what it talks about has nothing to do with tile-based rendering; it relates to other aspects of vendor-specific implementation detail or to orthogonal mobile GPU issues (constant registers, coherent memory, thermal, etc). For a Vulkan guide I'd probably split this up: a topic focused only on the effects of being tile based is useful, and the rest is somewhat of a distraction.

The most important thing for tilers (good use of loadOp/storeOp) seems to be buried right at the end, and the second most important (good use of pipeline barriers to get pipelining) isn't mentioned at all.

SaschaWillems · 2025-08-05T10:11:13Z

Not that much of a hardware guy, but aren't lazily allocated memory / transient attachments an important Vulkan concept for TBRs? If so, it might be good to add that.
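On the transient attachment point: the usual Vulkan pattern pairs `VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT` on the image with a memory type that has `VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT`, so that a tiler may never need to back the attachment with external memory. A minimal sketch of the memory-type selection is below. It assumes the caller has already queried the memory requirements and memory properties; `pickTransientMemoryType` and the two constants are hypothetical names that mirror the Vulkan bit values so the sketch builds without the SDK.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Local mirrors of two VkMemoryPropertyFlagBits values (assumption: used here
// only so the sketch compiles standalone; real code should use <vulkan/vulkan.h>).
constexpr std::uint32_t kDeviceLocalBit = 0x00000001;      // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr std::uint32_t kLazilyAllocatedBit = 0x00000010;  // VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT

// Hypothetical helper: choose a memory type index for a transient attachment.
// `typeBits` stands in for VkMemoryRequirements::memoryTypeBits and
// `propertyFlags[i]` for VkPhysicalDeviceMemoryProperties::memoryTypes[i].propertyFlags.
std::optional<std::uint32_t> pickTransientMemoryType(
    std::uint32_t typeBits, const std::vector<std::uint32_t>& propertyFlags) {
    assert(propertyFlags.size() <= 32);  // Vulkan exposes at most 32 memory types
    std::optional<std::uint32_t> fallback;
    for (std::uint32_t i = 0; i < propertyFlags.size(); ++i) {
        if ((typeBits & (1u << i)) == 0) continue;              // type not allowed for this image
        if (propertyFlags[i] & kLazilyAllocatedBit) return i;   // preferred: lazily allocated
        if (!fallback && (propertyFlags[i] & kDeviceLocalBit)) fallback = i;
    }
    return fallback;  // first device-local type if no lazily allocated type exists
}
```

The fallback matters because not every implementation exposes a lazily allocated memory type, so the application still needs a regular device-local allocation path.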

SaschaWillems · 2025-08-05T10:14:07Z

And I second the remarks about the power consumption part of that chapter. I tried to understand the code and data, but felt kinda lost. Wouldn't stuff like that require querying vendor-specific APIs to get real-world power usage? I didn't see that mentioned anywhere.

SaschaWillems · 2025-08-05T10:45:22Z

Also some of the links don't point to anything useful, e.g. these:

Imagination PowerVR Architecture Guide: Shows tile memory providing 10-20x bandwidth compared to external memory

Qualcomm Adreno Performance Guide: Demonstrates GMEM (tile memory) efficiency in mobile gaming scenarios

NVIDIA Tegra TBR Analysis: Research paper showing 60% power reduction through bandwidth optimization

IEEE Computer Graphics and Applications: Tile-Based Rendering analysis and improvements research

IEEE Transactions on Computers: Thermal management in mobile graphics processing research

These either point to or redirect to a (company) landing page instead of the linked "research papers" or documents.

SaschaWillems · 2025-08-05T10:46:21Z

And other links don't make sense, e.g. this:

Vulkan-Hpp: Modern C++ bindings with TBR optimization examples

That links to the Vulkan-Hpp headers. I don't see why or how that relates to TBR optimizations?

gpx1000 · 2025-08-05T10:47:42Z

I'm going to rewrite this. Sorry, it's not ready for prime time.

ZehuiLin-Huawei · 2025-08-06T06:57:01Z

Huawei Maleoon GPU Guide: Maleoon GPU Rendering Optimization

awalters-vk-img · 2025-10-06T17:01:29Z

chapters/tile_based_rendering_best_practices.adoc

+
+- **Attachment Configuration**: Final attachments use `VK_ATTACHMENT_STORE_OP_STORE`, intermediate attachments use `VK_ATTACHMENT_STORE_OP_DONT_CARE`
+- **Load Operations**: Use `VK_ATTACHMENT_LOAD_OP_CLEAR` for new content, `VK_ATTACHMENT_LOAD_OP_DONT_CARE` for intermediate results
+- **MSAA Efficiency**: TBR handles 4x MSAA efficiently due to tile memory resolve capabilities


Not sure I'd call out 4x specifically - makes it sound like you should prefer it over 2x, or 8x for example - which I'd not say is generic advice. Though tile memory resolve can be a good source of performance gain if you are going to be using MSAA.

Removed that advice to ensure it's clear.

awalters-vk-img · 2025-10-06T17:03:09Z

chapters/tile_based_rendering_best_practices.adoc

+
+**Tile Memory Management Strategies:**
+
+- **Memory Calculation**: Typical tile memory 512KB, calculate usage based on tile size (32x32 pixels), format size, and sample count


This calculation is not easy for a developer to perform, as the determinations are not quite as simple as that. Different formats might not be stored in tile memory in the way you might naively expect. How MSAA affects tile size is also not widely documented.

Good call, removed
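For reference, the naive estimate that the removed bullet describes would look roughly like the sketch below. It only multiplies tile area by bytes per pixel and sample count, which is exactly the simplification the review comment warns about: real hardware may store formats differently, and MSAA may change the tile size. The function name and the 4-byte, 4-sample inputs are illustrative assumptions, not figures from any vendor.

```cpp
#include <cstdint>

// Naive per-tile footprint: width * height * bytes per pixel * samples.
// Illustrative only; real tilers may pack formats differently or shrink
// tiles under MSAA, so this is not a reliable budget.
constexpr std::uint64_t naiveTileBytes(std::uint32_t tileWidth, std::uint32_t tileHeight,
                                       std::uint32_t bytesPerPixel, std::uint32_t sampleCount) {
    return static_cast<std::uint64_t>(tileWidth) * tileHeight * bytesPerPixel * sampleCount;
}

// One 32x32 tile of a 4-byte-per-pixel colour attachment at 4 samples.
static_assert(naiveTileBytes(32, 32, 4, 4) == 16384, "16 KiB under the naive model");
```

Under this model a single such attachment takes 16 KiB per tile, but the number says little about what a given GPU actually allocates.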

awalters-vk-img · 2025-10-06T17:19:10Z

chapters/tile_based_rendering_best_practices.adoc

+=== Half-Precision Float Optimization
+
+Using half-precision floats in shaders can speed up execution and reduce bandwidth on mobile TBR devices. Use low-precision numbers in fragment and compute shaders when visual quality permits:
+


Might be worth pointing out that mediump should be checked on as many devices as possible, as it acts as something of a hint. Using mediump and testing on only one device, which may still be using F32 under the hood, can be very misleading and lead to visual issues on devices that actually employ mediump.

I'd still recommend using it whenever possible, but it might be a worthwhile note/pointer.

Removed it to ensure the information I have in here is correct and something I've been able to verify. If you recommend adding it back, I'd be happy to.

bacTlink · 2025-10-24T06:41:19Z

chapters/tile_based_rendering_best_practices.adoc

+**Bandwidth Optimization Strategies:**
+
+- **Attachment configuration**: Final attachments use `VK_ATTACHMENT_STORE_OP_STORE`; intermediate attachments use `VK_ATTACHMENT_STORE_OP_DONT_CARE` when you do not need the results.
+- **Load operations**: Use `VK_ATTACHMENT_LOAD_OP_CLEAR` for new content; `VK_ATTACHMENT_LOAD_OP_DONT_CARE` for intermediate results you overwrite.


I am not sure here. I think that the loadOp can also be set to dont_care when rendering opaque objects, even for new content.
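A minimal sketch of the decision being discussed here, assuming the application knows whether a pass needs the previous contents of an attachment and whether it overwrites every pixel in the render area. `LoadOp`, `AttachmentUse`, and `chooseLoadOp` are hypothetical names, not Vulkan API; the enum is a local stand-in for `VkAttachmentLoadOp` so the sketch builds without the SDK.

```cpp
// Local stand-in for VkAttachmentLoadOp (assumption: used only so this
// compiles standalone). The enumerators correspond to
// VK_ATTACHMENT_LOAD_OP_LOAD, VK_ATTACHMENT_LOAD_OP_CLEAR and
// VK_ATTACHMENT_LOAD_OP_DONT_CARE.
enum class LoadOp { Load, Clear, DontCare };

// Hypothetical description of how a render pass uses one attachment.
struct AttachmentUse {
    bool needsPreviousContents;  // pass blends over or reads what is already there
    bool everyPixelOverwritten;  // e.g. opaque geometry covers the whole render area
};

// Pick a load op that is correct for the given use while avoiding
// unnecessary reads from external memory.
constexpr LoadOp chooseLoadOp(const AttachmentUse& use) {
    if (use.needsPreviousContents) return LoadOp::Load;      // contents must be read into the tile
    if (use.everyPixelOverwritten) return LoadOp::DontCare;  // initial values are never observed
    return LoadOp::Clear;                                    // uncovered pixels need a defined value
}
```

This matches the point raised in the comment: new content does not by itself require a clear, as long as nothing can observe the undefined initial values.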

bacTlink · 2025-10-24T06:56:46Z

chapters/tile_based_rendering_best_practices.adoc

+
+**Advanced TBR considerations:**
+
+- Use subpasses and `VK_DEPENDENCY_BY_REGION_BIT` to enable local data reuse where beneficial; always measure on target devices.


I think it's worth mentioning the subpassLoad operation here, which reads a pixel value from tile memory.

bacTlink · 2025-10-24T07:02:08Z

chapters/tile_based_rendering_best_practices.adoc

+
+- No explicit on-chip tile memory model exposed to applications.
+- Overdraw tends to generate more external memory traffic than on tilers; minimizing overdraw is important.
+- Applications should rely on standard Vulkan techniques (early depth/stencil, appropriate load/store ops, and subpasses where helpful) and profile on target devices.


I am seeing "profile on target devices", "measure on target devices", and "profiling results on target hardware" many times in this documentation. These redundant phrases should be cleaned up.

ZehuiLin-Huawei · 2025-10-25T01:34:38Z

Currently, information is scattered in various corners, and the same information appears several times, including things like "profile on target devices" or "Tile size not exposed by core Vulkan".
The documentation structure could be improved by establishing a main line of reasoning and developing the content within that framework. The current version does not seem to be really useful for developers.

Initial TBR chapter. NB, fix the TBR link to the Simple Game Engine tutorial when it is published. (795bbac)

ZehuiLin-Huawei reviewed Aug 5, 2025

janharaldfredriksen-arm mentioned this pull request Aug 15, 2025

Simple game engine KhronosGroup/Vulkan-Tutorial#119

Merged

awalters-vk-img reviewed Oct 6, 2025

Refine TBR chapter with updated Vulkan specifics, optimization guidelines, and implementation-agnostic practices. (c9ad6ab)

bacTlink reviewed Oct 24, 2025
