summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: move FMASK shader logic to shared codeMarek Olšák2018-04-021-72/+2
| | | | | | We'll need it for FBFETCH in both TGSI and NIR paths. Tested-by: Dieter Nützel <[email protected]>
* radeonsi: add R600_DEBUG=nofmask to disable MSAA compressionMarek Olšák2018-04-025-14/+17
| | | | | | For testing. Tested-by: Dieter Nützel <[email protected]>
* ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memoryMarek Olšák2018-04-023-6/+6
|
* radeonsi/nir: fix explicit component packing for geom/tess doublesTimothy Arceri2018-04-021-8/+11
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: gather buffers declared more accurately and use const fast pathTimothy Arceri2018-04-022-6/+90
| | | | | | | | For now we skip SI && HAVE_LLVM < 0x0600 for simplicity. We also skip setting the more accurate masks for builtin uniforms for now as it causes some piglit regressions. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: create load_const_buffer_desc_fast_path() helperTimothy Arceri2018-04-021-39/+49
| | | | | | | | This will be shared by the TGSI and NIR backends. For simplicity we leave the SI LLVM 5.0 and lower work around only in the TGSI backend. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: set TGSI_PROPERTY_NEXT_SHADERTimothy Arceri2018-04-021-0/+3
| | | | Reviewed-by: Marek Olšák <[email protected]>
* freedreno/a5xx: don't align height for PIPE_BUFFERRob Clark2018-04-011-1/+1
| | | | | | | | | Buffers can be large, so we probably don't want to make them all 32x bigger. But they can't be rendered to (at least in GL) so we don't need this workaround to prevent page faults on mem<->gmem. Cc: "18.0" <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix page faults on last levelRob Clark2018-04-011-0/+10
| | | | | | | | | | We could alternatively fall back to using "old style" draw's for mem<->gmem (ie. what <= a4xx do) when height is not aligned to 32, but that is somewhat more work (and not really something that could be applied to stable) Cc: "18.0" <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix issue w/ glamor composite shadersRob Clark2018-03-312-2/+36
| | | | | | | | | | | | | | | Fixes an issue that became possible when we started lowering phi webs to regs (a7ea2b4e) (although was not really seen until we also switched to using peephole select pass (ec8bc54a) instead of lowering *all* if/else to select). If texture coord (or anything else that uses create_collect() to collect scalar values in a sequence of scalar registers) was consuming a value produced on either side of an if/else (ie. a phi lowered to nir reg, which in ir3 is an "array" of length 1) then register allocation would happen incorrectly and we'd end up sampling from garbage coordinates. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: more half-precision fixesRob Clark2018-03-312-8/+37
| | | | | | | | Some instructions require src/dst to be in full or half precision register depending on src/dst type. So do a better job of propagating register type. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add helper to create immed of specified sizeRob Clark2018-03-311-4/+11
| | | | | | | We'll also need to be able to create a half-precision immediate. So re-work create_immed(). Prep work for following patch. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: pass ctx instead of block to create_collect()Rob Clark2018-03-311-18/+19
| | | | | | Prep work for following patch. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: eliminate unused false-depsRob Clark2018-03-312-11/+31
| | | | | | | | | | | | | | Previously false-dependencies would get flagged as used, even if the only "use" was a false dep to (for example) prevent a load from being scheduled after a store. In addition to being pointless instructions, in some cases they can cause problems. For example, ldg (and similar instructions) depend on an immed arg getting CP'd into the instruction, but this doesn't happen if an instruction is otherwise unused. Which can result in undefined results (overwriting unintended registers). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add local_group_sizeRob Clark2018-03-313-2/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: clear SSA flag when assigning "ARRAY" regs tooRob Clark2018-03-311-0/+1
| | | | | | Avoids a misleading "INVALID FLAGS" warning in debug builds. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: print array live rangesRob Clark2018-03-311-4/+10
| | | | | | This is also useful to see if optmsgs are enabled. Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: Implement DP2 instructionWladimir J. van der Laan2018-03-311-0/+21
| | | | | | | | Use DOT2ADDv instruction with 0.0f constant add. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: implement SEQ/SNE instructionsWladimir J. van der Laan2018-03-311-3/+20
| | | | | | | | | Extend translate_sge_slt to emit these, in analogous fashion but using CNDEv. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: Compressed textures supportWladimir J. van der Laan2018-03-311-0/+11
| | | | | | | | | | | | | | Add support for: - PIPE_FORMAT_ETC1_RGB8 - PIPE_FORMAT_DXT1_RGB - PIPE_FORMAT_DXT1_RGBA - PIPE_FORMAT_DXT3_RGBA - PIPE_FORMAT_DXT5_RGBA Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: Support TEXTURE_RECTWladimir J. van der Laan2018-03-313-1/+4
| | | | | | | | | Denormalized texture coordinates are required for text rendering in GALLIUM_HUD. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: Prevent crash in emit_texture if view is not setWladimir J. van der Laan2018-03-311-3/+10
| | | | | | | | | | Textures will sometimes be updated if texture view state was un-set, without this change that causes an assertion crash or segfault. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: Fix fd2_tex_swizWladimir J. van der Laan2018-03-311-9/+9
| | | | | | | | | | | Compose swizzles using util_format_compose_swizzles instead of the custom code (which somehow had a bug). This makes the GL_ALPHA internal format work. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: Change use of BLEND_ to BLEND2_Wladimir J. van der Laan2018-03-311-2/+2
| | | | | | | | | | | | | | | Change use of BLEND_ to BLEND2_, BLEND_* a3xx_rb_blend_opcode BLEND2_* is a2xx_rb_blend_opcode This makes no effective difference as the used enumerant has the same value (0), but the other enumerants do not match 1-to-1 so this will avoid future problems. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: Update rnndb header for formats enumerationWladimir J. van der Laan2018-03-311-20/+13
| | | | | | | | | | The format enumeration comes comes from the yamoto register headers that are part of the amd-gpu kernel driver. (see freedreno envytools commit b8fb7978e7ae106d0d11d0b238ab2ba2d4dd9d43) Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* util: Add and use util_is_power_of_two_nonzeroIan Romanick2018-03-291-1/+1
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* util: Move util_is_power_of_two to bitscan.h and rename to ↵Ian Romanick2018-03-2917-29/+31
| | | | | | | | | | | util_is_power_of_two_or_zero The new name make the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nvc0/ir: fix emiting NOTs with predicatesKarol Herbst2018-03-291-0/+2
| | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* broadcom/vc5: Fix setup of integer surface clear values.Eric Anholt2018-03-281-8/+8
| | | | | | | | I'm disappointed that the compiler didn't warn me about use of uninitialized uc in these paths. Just use the incoming clear color instead of the packing temporary if we're doing our own packing. Fixes GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_*
* broadcom/vc5: Stop trying to swizzle around RGBA4 clear color.Eric Anholt2018-03-281-12/+2
| | | | | | | | We always want A in the A slot in the tile buffer, and any other swapping should happen elsewhere. Fixes RGBA4-using cases in fbo-clear-formats and GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_fixed.
* broadcom/vc5: Work around scissor w/h==0 bug same as rasterizer discard.Eric Anholt2018-03-281-2/+15
| | | | | The 7268 HW apparently lets some rendering through in this case. Fixes GTF-GLES2.gtf.GL2FixedTests.scissor.scissor
* radeonsi: simplify DCC format categoriesMarek Olšák2018-03-281-20/+9
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: don't use the SPI barrier management bug workaroundMarek Olšák2018-03-281-0/+5
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: use maximum OFFCHIP_BUFFERING on Vega12Marek Olšák2018-03-281-1/+8
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: add support for Vega12Marek Olšák2018-03-284-1/+7
| | | | Reviewed-by: Alex Deucher <[email protected]>
* broadcom/vc5: Fix padding of NPOT miplevels >= 2.Eric Anholt2018-03-271-3/+8
| | | | | | | The power-of-two padded size that gets minified is based on level 1's dimensions, not level 0's, which starts to differ at a width of 9. Fixes all failures on texelFetch fs sampler2D 1x1x1-64x64x1
* ac/radeonsi: pass bindless bool to load_sampler_desc()Timothy Arceri2018-03-281-1/+1
| | | | | | | | We also fix the base_index for bindless by using the driver location. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: set uses_bindless_samplers for samplersTimothy Arceri2018-03-281-0/+3
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0/ir: fix INTERP_* with indirect inputsIlia Mirkin2018-03-271-3/+4
| | | | | | | | | | | | There were two problems, both of which are fixed now: - The indirect address was not being shifted by 4 - The indirect address was being placed as an argument in the offset case This fixes some of the new interpolateAt* piglits which now test for these situations. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* radeonsi/nir: fix input processing for packed varyingsTimothy Arceri2018-03-281-3/+2
| | | | | | | | | The location was only being incremented the first time we processed a location. This meant we would incorrectly skip some elements of an array if the first element was packed and proccessed previously but other elements were not. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: fix scanning of multi-slot output varyingsTimothy Arceri2018-03-281-109/+127
| | | | | | | | | | This fixes tcs/tes varying arrays where we dont lower indirects and therefore don't split arrays. Here we also fix useagemask for dual slot doubles. Fixes a number of arb_tessellation_shader piglit tests. Reviewed-by: Marek Olšák <[email protected]>
* broadcom/vc5: Fix RG16I/UI texture sampling.Eric Anholt2018-03-271-2/+2
| | | | | | | How many times did I look at this table without noticing the missing 'G' in the texture column? Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.
* broadcom/vc5: Fix swizzling of RGB10_A2UI render targets.Eric Anholt2018-03-261-1/+1
| | | | | | | This is the actual hardware layout, and we were only swizzling R/B back around in texturing. Fixes part of KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in simulation.
* broadcom/vc5: Implement workaround for GFXH-1431.Eric Anholt2018-03-261-1/+5
| | | | | This should fix some blending errors, but doesn't impact any testcases in the CTS.
* broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well.Eric Anholt2018-03-265-21/+111
| | | | | | | Once we've disabled EZ for some draws, we need to not use EZ on future draws. Implementing that made implementing the GT/GE direction trivial. Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.
* broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled.Eric Anholt2018-03-262-0/+8
| | | | | | On 3.x, we just don't flag the primitive as needing TF, but those primitive bits are now allocated to the new primitive types. Now we need to actually update the enable flag at draw time.
* broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job.Eric Anholt2018-03-263-5/+29
| | | | | | The next job from this client will turn it back on unless TF gets disabled, but we don't want the state to leak from this client to another (which causes GPU hangs).
* broadcom/vc5: Move the BCL epilogue code to a per-version compile.Eric Anholt2018-03-265-24/+67
| | | | I need to do some new packets for transform feedback on 4.1.
* broadcom/vc5: Fix transform feedback in the presence of point size.Eric Anholt2018-03-263-4/+23
| | | | | | | I had this note to myself, and it turns out that a lot of CTS tests use XFB with points to get data out without using a fragment shader. Keep track of two sets of precomputed TF specs (point size in VPM prologue or not), and switch between them when we enable/disable point size.
* broadcom/vc5: Split transform feedback specs update from buffers.Eric Anholt2018-03-261-27/+32
| | | | | The specs update will be changing based on additional state flags in the next commit, and this unindents the buffer update code.