[0010] - DXIL Data Scalarization
| Status | Under Consideration |
|---|---|
| Author | |
| Sponsor | |
Introduction
In Proposal 0009 we covered scalarization of call instructions and of vector operations such as math ops, logical ops, bitcasts, loads, and stores. Proposal 0009 also handles all scalarization of function scope data. The goal of this proposal is to present a solution for scalarizing data structures via data layout transformations. Cases that result in an implicit cbuffer are not covered by this proposal.
Motivation
As mentioned in DXIL.rst, “HLSL vectors are scalarized” and “Matrices are lowered to vectors”. Therefore, we need to be able to scalarize these types in the clang DirectX backend. Without scalarizing both the data structures and the call instructions, we can’t generate legal DXIL.
Background
In DXC we support five forms of data scalarization: data marked with
`groupshared`, static function scope data, static global scope data, global
data, and data in function scope. Within these forms, the behavior differs
depending on whether we are dealing with vectors or with arrays of vectors.
The data scalarization can then be broken down into three cases:
- `static` scalar layouts
  - a vector of size `M` gets scalarized into `M` elements.
  - an array of size `N` of vectors of size `M` gets scalarized into `M` arrays of size `N`.
- `groupshared` scalar layouts
  - vectors become arrays.
  - an array of size `N` of vectors of size `M` is lowered first to a 2D array `N` by `M` and then to a 1D array.
- cbuffer usage for regular arrays.
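
The difference between the first two layouts can be sketched as an index mapping. This is only an illustration, not DXC's or the backend's implementation: the sizes `N` and `M`, the names `staticSlot` and `groupsharedSlot`, and the row-major order used for the 1D flattening are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sizes: an array of N vectors, each vector holding M lanes.
constexpr std::size_t N = 3;
constexpr std::size_t M = 4;

// Static-style layout: M separate scalar arrays of N elements each.
// Lane `lane` of vector `i` lives in scalar array number `lane` at position `i`.
struct StaticSlot {
  std::size_t array; // which of the M scalar arrays
  std::size_t index; // position inside that array
};

inline StaticSlot staticSlot(std::size_t i, std::size_t lane) {
  assert(i < N && lane < M);
  return StaticSlot{lane, i};
}

// Groupshared-style layout: an N by M 2D array flattened to one array of
// N * M scalars. Row-major order is assumed here for illustration.
inline std::size_t groupsharedSlot(std::size_t i, std::size_t lane) {
  assert(i < N && lane < M);
  return i * M + lane;
}
```

With these assumed sizes, lane 2 of vector 1 would land in scalar array 2 at position 1 under the static-style layout, and at flat index 6 under the groupshared-style layout.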
Proposal
We need to implement a solution to handle groupshared and static data.
To simplify things for upstream, this proposal will deviate from DXC
and use the same layout transformations for both groupshared and static
vectors. Both layouts are legal, so this shouldn’t cause any problems. The plan
is to create two new passes. The first pass will convert vectors to arrays
and potentially create multi-dimensional arrays. The second pass will flatten
these arrays into one dimension. These two passes will run successively.
While these two passes could in principle be order-agnostic, they will run after the scalarizer pass. They will also operate only on global data, since the only cases not handled by the scalarizer pass are the global scope data cases.
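
The effect of the two passes on a global's type can be sketched with a simplified shape model. This is a minimal sketch and not LLVM pass code: `GlobalShape`, `vectorToArray`, and `flattenArray` are hypothetical names introduced for this example, and no real LLVM API is used.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical, simplified model of a global's type: nested array dimensions
// (outermost first) around an element that is a vector of `vectorWidth` lanes.
// A width of 1 means the element is already a scalar.
struct GlobalShape {
  std::vector<std::size_t> dims;
  std::size_t vectorWidth;
};

// Step 1 (vector to array): the vector becomes the innermost array dimension,
// which may produce a multi-dimensional array.
GlobalShape vectorToArray(GlobalShape s) {
  assert(s.vectorWidth >= 1);
  if (s.vectorWidth > 1) {
    s.dims.push_back(s.vectorWidth);
    s.vectorWidth = 1;
  }
  return s;
}

// Step 2 (flatten): collapse all dimensions into a single dimension whose
// size is the product of the original dimensions.
GlobalShape flattenArray(GlobalShape s) {
  if (s.dims.size() > 1) {
    std::size_t total =
        std::accumulate(s.dims.begin(), s.dims.end(), std::size_t{1},
                        std::multiplies<std::size_t>());
    s.dims = {total};
  }
  return s;
}
```

Under this model, an array of 3 vectors of 4 lanes would first become a 3 by 4 array of scalars and then a single array of 12 scalars, while a lone vector of 4 lanes would simply become an array of 4 scalars.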