summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/a4xx: fix fragcoord.z + fragdepthRob Clark2015-12-152-5/+5
| | | | | | | | | | | | | It seems like disabling earlyz on a4xx also, by defaults, disables fragcoord.z to the FS. For frag shaders that both read fragcoord(.z) and write fragdepth, we need to set some extra bits to prevent a lockup. This lets us get rid of the hack of disabling fragcoord.z (which prevented 0ad from lockups, but resulted in rendering corruption). Also fixes fbo-depth-sample-compare. Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2015-12-156-92/+231
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/cmdline: don't dump nir by defaultRob Clark2015-12-151-3/+1
| | | | | | By default we only want the disasm dumped, which we get anyways. Signed-off-by: Rob Clark <[email protected]>
* nir: Get rid of *_indirect variants of input/output load/store intrinsicsJason Ekstrand2015-12-101-32/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the mean time, having the *_indirect variants adds special cases a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convdert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the *_indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of *_indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <[email protected]> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <[email protected]> ir3 changes are Reviewed-by: Rob Clark <[email protected]> NIR changes are Acked-by: Rob Clark <[email protected]>
* freedreno: little clean up in fd_create_surfaceSerge Martin2015-12-091-15/+16
| | | | | | in order to avoid returing invalid adress if CALLOC_STRUCT return NULL. Signed-off-by: Rob Clark <[email protected]>
* freedreno: change to goto failSerge Martin2015-12-091-4/+2
| | | | | | in fd_resource_transfer_map, like the others error cases Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix bind_sampler_states when hwcso is NULLSerge Martin2015-12-093-0/+9
| | | | | | | | | | src/gallium/tests/trivial/compute.c expects samplers to be cleaned when the samplers list is NULL. Like in radeon, the function behave like when the number of samplers parameter is set to 0. [small s/hwsco/hwcso/ typo fix] Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: nir shader prints with 'disasm' debug optionRob Clark2015-12-051-2/+2
| | | | | | | | Move these to 'disasm' instead of the more verbose 'optmsgs' since, like the tgsi dumps, it is useful without the more verbose compiler logging enabled. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: convert scheduler back to recursive algoRob Clark2015-12-042-127/+204
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've played with a few different approaches to tweak instruction priority according to how much they increase/decrease register pressure, etc. But nothing seems to change the fact that compared to original (pre-multiple-block-support) scheduler, in some edge cases we are generating shaders w/ 5-6x higher register usage. The problem is that the priority queue approach completely looses the dependency between instructions, and ends up scheduling all paths at the same time. Original reason for switching was that recursive approach relied on starting from the shader outputs array. But we can achieve more or less the same thing by starting from the depth-sorted list. shader-db results: total instructions in shared programs: 113350 -> 105183 (-7.21%) total dwords in shared programs: 219328 -> 211168 (-3.72%) total full registers used in shared programs: 7911 -> 7383 (-6.67%) total half registers used in shader programs: 109 -> 109 (0.00%) total const registers used in shared programs: 21294 -> 21294 (0.00%) half full const instr dwords helped 0 322 0 711 215 hurt 0 163 0 38 4 The shaders hurt tend to gain a register or two. While there are also a lot of helped shaders that only loose a register or two, the more complex ones tend to loose significanly more registers used. In some more extreme cases, like glsl-fs-convolution-1.shader_test it is more like 7 vs 34 registers! Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: don't reuse a0.x across blocksRob Clark2015-12-041-7/+14
| | | | | | | | It causes confusion in sched if we need to split_addr() since otherwise we wouldn't easily know which block the new addr instr will be scheduled in. So just side-step the whole situation. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rename ir3_block::bdRob Clark2015-12-043-11/+11
| | | | | | | | We'll need to add similar for ir3_instruction, but following the pattern to use 'id' seems confusing. Let's just go w/ generic 'data' as the name. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: assign varying locations laterRob Clark2015-11-264-29/+37
| | | | | | | | | | | | | Rather than assigning inloc up front, when we don't yet know if it will be unused, assign it last thing before the legalize pass. Also, realize when inputs are unused (since for frag shader's we can't rely on them being removed from ir->inputs[]). This doesn't make sense if we don't also dynamically assign the inloc's, since we could end up telling the hw the wrong # of varyings (since we currently assume that the # of varyings and max-inloc are related..) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use instr flag to mark unused instructionsRob Clark2015-11-264-14/+24
| | | | | | Rather than magic depth value, which won't be available in later stages. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: rework vinterp/vpsreplRob Clark2015-11-261-12/+36
| | | | | | Same as previous commit, for a4xx. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: rework vinterp/vpsreplRob Clark2015-11-261-12/+37
| | | | | | | | | | | | Make the interpolation / point-sprite replacement mode setup deal with varying packing. In a later commit, we switch to packing just the varying components that are actually used by the frag shader, so we won't be able to assume everything is vec4's aligned to vec4. Which would highly confuse the previous vinterp/vpsrepl logic. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add support for a few gs5 opsIlia Mirkin2015-11-231-0/+27
| | | | | | | Tested on a4xx. This is part of the builtins added by ARB_gpu_shader5 and GLSL ES 3.10. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add ARB_texture_query_lod supportIlia Mirkin2015-11-232-6/+20
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: re-emit program on dirty framebufferIlia Mirkin2015-11-231-1/+1
| | | | | | | The program emit depends on certain fb details. Make sure those get updated when the fb changes. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: use a factor of 32767 for snorm8 blendingIlia Mirkin2015-11-231-5/+34
| | | | | | | | | | | | It appears that the hardware wants the integer to be scaled the same way that the hardware representation is. snorm16 uses one of the float factors, so this is only relevant for snorm8. This fixes a number of subcases of bin/fbo-blending-formats GL_EXT_texture_snorm Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* freedreno/a4xx: only compute texture offset once for the viewIlia Mirkin2015-11-233-13/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add ARB_texture_view supportIlia Mirkin2015-11-233-8/+10
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add formats for ARB_texture_buffer_object_rgb32 supportIlia Mirkin2015-11-233-3/+9
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add ARB_texture_rgb10_a2ui supportIlia Mirkin2015-11-232-2/+3
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add astc formatsIlia Mirkin2015-11-232-1/+39
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: support 16384 texels in buffer textureIlia Mirkin2015-11-232-5/+4
| | | | | | Looks like the width field's bitmask was off-by-one. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add ARB_texture_buffer_range supportIlia Mirkin2015-11-233-15/+41
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add polygon mode supportIlia Mirkin2015-11-234-4/+26
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nir: s/nir_type_unsigned/nir_type_uintJason Ekstrand2015-11-231-1/+1
| | | | | | | | | | | v2: do the same in tgsi_to_nir (Samuel) v3: added missing cases after rebase (Iago) v4: Add a blank space after '#' in one of the comments (Matt) Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* freedreno/a4xx: disable blending and alphatest for integer rt0Ilia Mirkin2015-11-211-2/+13
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* freedreno/a4xx: fix independent blendIlia Mirkin2015-11-212-2/+3
| | | | | | | This fixes the ext_draw_buffers2 and arb_draw_buffers_blend tests. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* freedreno/a4xx: enable ARB_base_instance supportIlia Mirkin2015-11-211-1/+1
| | | | | | We already pass in start_instance in fd4_draw. Expose the extension. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: set fetchsize in mem2gmem texture restoreIlia Mirkin2015-11-211-1/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add 11_11_10_float vertex type supportIlia Mirkin2015-11-212-1/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: fix 3d texture setupIlia Mirkin2015-11-213-3/+7
| | | | | | | | | | Same fix as on a3xx - set the second (tiny) layer size bitfield to the smallest level's size so that the hw knows not to minify beyond that. This fixes texelFetch sampler3D piglits. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* freedreno/a4xx: only align slices in non-layer_first texturesIlia Mirkin2015-11-211-2/+4
| | | | | | | | | | | | When layer is the container, slices are tightly packed inside of each layer. We don't need any additional alignment. On a3xx, each slice contains all the layers, so having alignment makes sense. This fixes a whole slew of array-related piglits, including texelFetch and tex-miplevel-selection varieties. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* freedreno/a4xx: add missing formats to enable ARB_vertex_type_2_10_10_10_revIlia Mirkin2015-11-202-4/+8
| | | | | | | Same as commit 84d087aea but for a4xx. The RE'd enums had the same issue too. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: use hardware RGTC texture samplersIlia Mirkin2015-11-206-24/+19
| | | | | | | | | a4xx hardware has real support for RGTC so there's no need to fake it like we do on a3xx. Undo the hacks, and keep track of an "internal format" of a resource, which on a3xx will be different, triggering the transfer-time conversions to take place. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: hook up RGB565 formatIlia Mirkin2015-11-202-1/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: logic op handlingIlia Mirkin2015-11-206-29/+35
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add 16-bit unorm/snorm format texturing/renderingIlia Mirkin2015-11-202-25/+47
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: point regid to "red" even for alpha-only rb formatsIlia Mirkin2015-11-201-7/+0
| | | | | | | | | Looks like a4xx hw does this in a more standard way and we don't need to hack around it like we do on a3xx. Fixes GL_ALPHA formats in fbo-blending-formats, fbo-colormask-formats, and fbo-alphatest-formats. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* freedreno: always set all border colorsIlia Mirkin2015-11-201-30/+8
| | | | | | | Instead of playing the guessing game as to which texture format reads from which border color encoding type, just write both of them always. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: fix dst_alpha blend for RGBX render targetsIlia Mirkin2015-11-203-5/+32
| | | | | | | There are not native RGBX render formats, so we must manually force dst_alpha to be one, same as for a3xx. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: add BPTC supportIlia Mirkin2015-11-202-0/+8
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nir: Add nir_texop_samples_identical opcodeIan Romanick2015-11-191-0/+3
| | | | | | | | | | | This is the NIR analog to GLSL IR ir_samples_identical. v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by Ken and Jason. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* freedreno/a4xx: fix 5_5_5_1 texture sampler formatIlia Mirkin2015-11-191-1/+1
| | | | | | | | This fixes teximage-colors, fbo-generatemipmap-formats, and probably others (in relation to the RGB5 formats, others still fail). Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* freedreno/a4xx: add depth clamp and halfz clipIlia Mirkin2015-11-193-4/+9
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: allow seamless cubemap filtering to be enabled per-textureIlia Mirkin2015-11-193-1/+3
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: support lod_biasIlia Mirkin2015-11-192-0/+7
| | | | | | | | | The lower layers assume that we support this, and it's been core since GL 1.4. This fixes a slew of piglit tests, especially around tex-miplevel-selection. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* freedreno/a4xx: add fake RGTC support (required for GL3)Rob Clark2015-11-183-1/+22
| | | | | | | | | | The a4xx bits corresponding to 'freedreno/a3xx: add fake RGTC support (required for GL3)' TODO some more r/e.. maybe we get lucky and hw supports some of this directly? For now this will help us enable gl3. Signed-off-by: Rob Clark <[email protected]>