2025-07-31 - HLSL Working Group Minutes
Propose a discussion topic by making an edit suggestion on the GitHub PR.
- Discussion topics
  - Supporting -O0 in Clang
Generated by AI. Be sure to check for accuracy. Meeting notes:
Struct Decomposition and Scalarization Challenges:
Discussed the technical challenges of struct decomposition in the presence of dynamic array indices, focusing on the SROA pass, language differences between C and HLSL, and the implications for DXIL code generation.
* SROA Pass Limitations: the SROA pass in LLVM refuses to decompose structs containing arrays when the array index is not known at compile time, due to the risk of out-of-bounds accesses that could legally reference other struct fields in C, making decomposition unsafe.
* Language Semantics Differences: in C, negative or out-of-bounds array indexing within structs is defined behavior, which prevents decomposition, whereas in HLSL such accesses are undefined behavior or even ill-formed, allowing for more aggressive decomposition in compilers like DXC.
* Proposed Solutions for Decomposition: one proposal is to modify the SROA pass to allow decomposition when array accesses are marked as inbounds. HLSL codegen could then always generate inbounds accesses, relying on the language's undefined behavior for out-of-bounds accesses.
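To make the decomposition discussion concrete, here is a minimal hand-written sketch in C++. It is not SROA output and not from the meeting; the `Particle` struct and both function names are hypothetical. It shows a struct with an array field indexed dynamically, next to the same computation with each field split into its own local. The two are only interchangeable under the HLSL-style assumption that the index stays in bounds.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical aggregate: a scalar field next to an array field.
struct Particle {
    float mass;
    float pos[4];
};

// Aggregate form: the array is indexed dynamically through the struct.
float scaledCoordAggregate(std::size_t i) {
    assert(i < 4 && "HLSL-style assumption: the index is in bounds");
    Particle p = {2.0f, {1.0f, 2.0f, 3.0f, 4.0f}};
    return p.mass * p.pos[i];
}

// Decomposed form: each field is its own local variable, so `mass` no
// longer shares storage with `pos`. This is only equivalent if `pos[i]`
// can never reach outside the array.
float scaledCoordDecomposed(std::size_t i) {
    assert(i < 4 && "HLSL-style assumption: the index is in bounds");
    float mass = 2.0f;
    float pos[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    return mass * pos[i];
}
```

If an out-of-bounds index were allowed to land on a neighbouring field, the second form would change behavior, which is the concern the notes attribute to SROA.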
Optimization Levels and Legalization Requirements:
Discussed the necessity of running certain optimizations at O0 (no optimization) to generate valid DXIL or SPIR-V, the distinction between legalization and optimization, and the implications for debug information and user expectations.
* O0 and Valid Code Generation: generating valid DXIL or SPIR-V often requires running optimization passes even at O0, as some code patterns (e.g., resource handle propagation, loop unrolling) cannot be lowered correctly without them. DXC often runs a full optimization pass pipeline at O0 for this reason.
* Legalization Versus Optimization: passes required to produce valid output (such as SROA, inlining, and loop unrolling for certain constructs) should be considered legalization rather than optimization, and O0 should still run these passes to ensure correctness.
* Debug Information Considerations: required passes like SROA and loop unrolling can degrade the debugging experience, but they are necessary for correct code generation, so a balance must be struck to keep the output debuggable.
* Military and Special Use Cases for O0: some users, such as those with military or machine learning workloads, may require strict O0 (no optimizations). Consensus emerged that supporting this may not be practical if it leads to invalid output, and that users with such requirements may need to implement their own solutions.
Handling of the Optimize-None Attribute and Pass Pipeline:
Discussed the implications of the optimize-none (optnone) attribute in LLVM, debated whether to run required passes like SROA on functions marked with optnone, and considered changes to the Clang-DXC pipeline to ensure valid code generation for HLSL.
* SROA and Optimize-None: SROA currently skips functions marked with optnone, which can prevent valid DXIL generation. The group considered whether to make an exception for SROA or to avoid marking HLSL functions with optnone altogether.
* Pipeline Adjustments for HLSL: the group considered modifying the Clang-DXC pass pipeline to avoid applying optnone to HLSL functions, recognizing that HLSL’s requirements differ from C++ and that this change would allow necessary legalization passes to run even at O0.
* Tracking and Resolution: @Icohedron to ensure that an issue is tracked for not applying the optnone attribute in this context, and for adjusting the pipeline accordingly.
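The two options above can be compared with a toy model in C++. This is a sketch for illustration only, not LLVM or Clang code: `ToyFunction`, `ToyPass`, `passesThatRun`, and the `required` flag are all hypothetical names. It assumes a pipeline that skips every non-required pass on a function marked optnone.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a function and its optnone marking.
struct ToyFunction {
    std::string name;
    bool optNone;
};

// Hypothetical stand-in for a pass; `required` models "must run for validity".
struct ToyPass {
    std::string name;
    bool required;
};

// Returns the names of the passes that would run on `fn` in this toy model:
// an optnone function only sees passes flagged as required.
std::vector<std::string> passesThatRun(const ToyFunction &fn,
                                       const std::vector<ToyPass> &pipeline) {
    std::vector<std::string> ran;
    for (const ToyPass &pass : pipeline) {
        assert(!pass.name.empty() && "every pass needs a name");
        if (fn.optNone && !pass.required)
            continue;  // skipped because the function is optnone
        ran.push_back(pass.name);
    }
    return ran;
}
```

In this model, either flagging SROA as required or leaving `optNone` unset on the function gets SROA to run, which mirrors the two approaches the group weighed.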
Short-Term and Long-Term Plans for O0 Support:
Discussed possible short-term and long-term strategies for O0 support in Clang-DXC, including promoting O0 to O1, warning users, or removing O0, and the need for further data on user requirements.
* Promoting O0 to O1 or Od: Clang-DXC could treat O0 as O1 or Od under the hood, running the minimal set of required passes to ensure valid output, and possibly warning users that true O0 is unsupported.
* User Experience and Backwards Compatibility: it is important to understand who uses O0 and for what purposes, to avoid breaking user expectations or introducing validation errors when migrating from DXC to Clang-DXC.
* Action Items and Next Steps: more data is needed on O0 usage. In the meantime, the team should focus on generating good debuggable code and tracking differences in behavior between Clang and DXC.
Follow-up tasks:
- Create and track an issue to not put the optimize-none (optnone) attribute on functions in Clang-DXC, so that DXIL legalization passes can run. (@Icohedron)