path: root/src/gallium
Commit message | Author | Age | Files | Lines
* trace: add missing set_shader_images() | Samuel Pitoiset | 2016-04-07 | 3 | -0/+81
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable perfect ZPASS counts for PIPE_QUERY_OCCLUSION_PREDICATE | Marek Olšák | 2016-04-07 | 3 | -5/+16
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't use the real barrier instruction in tess ctrl shaders | Marek Olšák | 2016-04-07 | 1 | -0/+8
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* Revert "clover: Fix build against clang SVN >= r265359" | Michel Dänzer | 2016-04-07 | 1 | -3/+0
    This reverts commit 0daab9878d2b96356cf667591a2c877d912be52d. The corresponding clang change was reverted.
    Trivial.
* r600: use radeon_emit in a few more places in evergreen_compute | Dave Airlie | 2016-04-07 | 1 | -4/+4
    This is just a cleanup of the code.
    Acked-by: Tom Stellard <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600: make compute global buffer functions static. | Dave Airlie | 2016-04-07 | 2 | -98/+86
    This moves things around so that the global buffer handling functions in evergreen_compute.c are static.
    Acked-by: Tom Stellard <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600: make two compute functions static. | Dave Airlie | 2016-04-07 | 2 | -5/+3
    These aren't used outside evergreen_compute.c.
    Acked-by: Tom Stellard <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600: using pipe_grid_info more in evergreen_compute. | Dave Airlie | 2016-04-07 | 2 | -26/+21
    No reason to pull the pieces apart here. Also make one of the functions static, as it's unused outside this file.
    Acked-by: Tom Stellard <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600: in evergreen_compute use ctx consistently instead of ctx_ | Dave Airlie | 2016-04-07 | 1 | -25/+25
    Acked-by: Tom Stellard <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600: use rctx consistently in evergreen_compute.c | Dave Airlie | 2016-04-07 | 1 | -74/+74
    Another step towards cleaning this up.
    Acked-by: Tom Stellard <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600: cleanup whitespace in evergreen_compute.c | Dave Airlie | 2016-04-07 | 1 | -87/+75
    This aligns the code with the style of the rest of the driver. Makes editing it a lot less painful.
    Acked-by: Tom Stellard <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600g: Enable ARB_framebuffer_no_attachments | Edward O'Callaghan | 2016-04-07 | 1 | -1/+1
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Enable ARB_framebuffer_no_attachments | Edward O'Callaghan | 2016-04-07 | 1 | -1/+1
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Improve assert info out of si_set_framebuffer_state() | Edward O'Callaghan | 2016-04-07 | 1 | -0/+2
    Let's give the developer a little hand if we are going to assert on a zero literal at the end of a branch.
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Allow 16 samples MSAA mode for PIPE_FORMAT_NONE | Edward O'Callaghan | 2016-04-07 | 1 | -0/+5
    For ARB_framebuffer_no_attachments: an is_format_supported() query with 'PIPE_FORMAT_NONE' passed implies a query of the number of samples supported by a framebuffer with no attachment.
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* softpipe: Set samples and layers in set_framebuffer_state() cb | Edward O'Callaghan | 2016-04-07 | 1 | -0/+2
    Carries across the number of samples and layers state in the 'softpipe_set_framebuffer_state()' callback. This state is part of 'ARB_framebuffer_no_attachments' support.
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/trace: Dump no.of samples and layers in fb state | Edward O'Callaghan | 2016-04-07 | 1 | -0/+2
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium: Put no.of {samples,layers} into pipe_framebuffer_state | Edward O'Callaghan | 2016-04-07 | 3 | -0/+40
    Here we store the number of samples and layers directly in the pipe_framebuffer_state so that in the case of ARB_framebuffer_no_attachments we may make use of them directly. Further, we adjust various gallium/auxiliary helper functions accordingly.
    V2: Convert branches in util_framebuffer_get_num_layers() and util_framebuffer_get_num_samples() to their canonical form.
    V3: 'git stash pop' the typo fix of 'cbufs' which should be 'nr_cbufs' that was missing in V2, woops! Thanks Marek for pointing this out yet again.
    V4: Squash in the following patch: 'gallium/util: Ensure util_framebuffer_get_num_samples() is valid'. Upon context creation, internal driver structures are malloc()'ed and memset() to zero them. This results in an invalid number of samples 'by default'. Handle this in the simplest way to avoid elaborate and probably equally sub-optimal solutions.
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
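
The zero-initialised default mentioned under V4 can be pictured with a small sketch. The names below (`ex_fb_state`, `ex_fb_num_samples`, the `attachment_samples` field) are hypothetical stand-ins and not the actual gallium definitions; the sketch only assumes that a framebuffer without attachments reports its stored sample count, and that a count of 0 left behind by memset() is treated as 1.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a framebuffer state struct. */
struct ex_fb_state {
    unsigned nr_cbufs;           /* number of bound colour buffers */
    int has_zsbuf;               /* non-zero if depth/stencil is bound */
    unsigned samples;            /* used only when nothing is attached */
    unsigned attachment_samples; /* stand-in for what attachments report */
};

/* A sample count of 0 is not meaningful, so clamp it to 1. */
static unsigned ex_at_least_one(unsigned n)
{
    return n ? n : 1;
}

unsigned ex_fb_num_samples(const struct ex_fb_state *fb)
{
    assert(fb);
    /* No attachments: fall back to the count stored in the state. */
    if (fb->nr_cbufs == 0 && !fb->has_zsbuf)
        return ex_at_least_one(fb->samples);
    return ex_at_least_one(fb->attachment_samples);
}
```

With this shape, a freshly zeroed state reports one sample rather than zero, which is the simple handling the V4 note describes.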
* gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT | Edward O'Callaghan | 2016-04-07 | 16 | -0/+23
    Add PIPE_CAP to determine if the GL extension 'GL_ARB_framebuffer_no_attachments' shall be supported. The driver is required to support 'PIPE_FORMAT_NONE' via its 'is_format_supported()' callback in order to determine the MSAA modes the hardware supports so that values requested from the application using 'GL_ARB_framebuffer_no_attachments' may be quantized to what the hardware expects.
    V.2: Fix doc for a more detailed description of the PIPE_CAP and the corresponding GL constant.
    V.3: Renamed and repurposed once again.
    V.4: Remove CAP from cap_mapping array.
    [airlied: fix damaged whitespace]
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: set shader calling conventions | Bas Nieuwenhuizen | 2016-04-06 | 1 | -1/+16
    Note that old mesa + new LLVM or new mesa + old LLVM breaks with this change and the corresponding LLVM change (D18559). For LLVM version <= 3.8 we use the old method, but we can't detect people using a post 3.8 svn version that is still too old.
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Tom Stellard <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* freedreno/ir3: insert extra move into phi | Rob Clark | 2016-04-05 | 1 | -0/+10
    We had an implicit assumption that the phi src was assigned in its source (pred) block leading into the phi. But this is not true with NIR, so we can't just ignore the source block specified in the nir_phi_src. Insert an extra mov in the source block. If it is not required the CP pass will take it back out again.
    Fixes:
        ./tests/spec/glsl-1.10/execution/vs-call-in-nested-loop.shader_test
        ./tests/spec/glsl-1.10/execution/vs-inner-loop-modifies-outer-loop-var.shader_test
    and probably others.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: eliminate unnecessary absneg's | Rob Clark | 2016-04-05 | 2 | -3/+26
    The frontend inserts (abs) and (neg)'s to convert between NIR boolean (~0/0) and native boolean (1/0). So we'd end up with things like:
        cmps.s.ge r1.x, ...
        absneg.s r1.x, (neg)r1.x
        absneg.s r1.x, (abs)r1.x
        sel.b32 r2.x, r0.x, r1.x, r0.y
    The (neg) already gets collapsed due to the following (abs). Now by realizing that r1.x comes from a cmps.s instruction, we can drop the (abs) as well.
    Signed-off-by: Rob Clark <[email protected]>
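
Why the (neg)/(abs) pair is redundant after a compare can be checked with a few lines of arithmetic. This is only a sketch of the reasoning, not ir3 code: the function names are hypothetical, and it assumes the compare result is the native boolean 1 or 0 as described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Native boolean (1/0) to NIR-style boolean (~0/0): this is the (neg). */
static int ex_native_to_nir_bool(int b)
{
    assert(b == 0 || b == 1);
    return -b;          /* 1 -> -1 (all bits set), 0 -> 0 */
}

/* NIR-style boolean (~0/0) back to native boolean (1/0): the (abs). */
static int ex_nir_to_native_bool(int b)
{
    assert(b == 0 || b == -1);
    return abs(b);      /* -1 -> 1, 0 -> 0 */
}

/* Applying both in sequence to a compare result gives the input back. */
int ex_roundtrip(int cmp_result)
{
    return ex_nir_to_native_bool(ex_native_to_nir_bool(cmp_result));
}
```

Since abs(-x) equals x for x in {0, 1}, both modifiers can be dropped when the source is known to be a compare.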
* clover: Fix build against clang SVN >= r265359 | Michel Dänzer | 2016-04-05 | 1 | -0/+3
    Signed-off-by: Michel Dänzer <[email protected]>
    Reviewed-by: Tom Stellard <[email protected]>
* radeonsi: use bounded indexing for samplers | Bas Nieuwenhuizen | 2016-04-05 | 1 | -1/+4
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use bounded indexing for constant buffers | Bas Nieuwenhuizen | 2016-04-05 | 1 | -2/+3
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: allow multiple exports of the same texture with different usage | Marek Olšák | 2016-04-05 | 1 | -21/+33
    Instead of failing an assertion, disable DCC and CMASK on the first export that needs it, and merge the external usage flags.
    v2: clear the EXPLICIT_FLUSH flag if it's not set; whitespace fixes
    Reviewed-by: Michel Dänzer <[email protected]>
* freedreno/ir3: deal with duplicate phi sources | Rob Clark | 2016-04-04 | 1 | -5/+20
    Otherwise we end up with funny things like:
        mov.f32f32 r0.x, r1.y
        mov.f32f32 r0.x, r1.y
    (It doesn't happen as much after fixing the problem w/ CP into phi src, but it can still happen since we aren't too clever about generating phi sources in the first place.)
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix silly brain-fart in RA | Rob Clark | 2016-04-04 | 1 | -2/+1
    We want to consider all the vars, not 1/32nd of them, when extending live-ranges.
    Signed-off-by: Rob Clark <[email protected]>
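
One way a "1/32nd of the vars" mistake can arise is by bounding a per-bit loop with the number of 32-bit words in a bitset rather than the number of bits. The sketch below illustrates that class of bug under that assumption only; the log does not show the actual change, and every name here is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EX_WORD_BITS 32u
#define EX_BITSET_WORDS(nbits) (((nbits) + EX_WORD_BITS - 1) / EX_WORD_BITS)

static unsigned ex_test_bit(const uint32_t *set, unsigned i)
{
    return (set[i / EX_WORD_BITS] >> (i % EX_WORD_BITS)) & 1u;
}

/* Buggy variant: the bound is the word count, so only the first
 * nvars/32 (rounded up) variables are ever examined. */
unsigned ex_count_live_buggy(const uint32_t *set, unsigned nvars)
{
    unsigned n = 0;
    for (unsigned i = 0; i < EX_BITSET_WORDS(nvars); i++)
        n += ex_test_bit(set, i);
    return n;
}

/* Fixed variant: every variable index is examined. */
unsigned ex_count_live(const uint32_t *set, unsigned nvars)
{
    assert(set);
    unsigned n = 0;
    for (unsigned i = 0; i < nvars; i++)
        n += ex_test_bit(set, i);
    return n;
}
```

With 64 variables and only variable 40 live, the buggy loop inspects indices 0 and 1 and misses it, while the fixed loop finds it.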
* freedreno/ir3: don't cp into phi's | Rob Clark | 2016-04-04 | 1 | -0/+6
    The block defining a phi source might not have been executed. If we allow copy propagation, we could end up pointing to a src instruction in the wrong block.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: we can't store immediate values | Rob Clark | 2016-04-04 | 1 | -0/+13
    Fixes some transform-feedback piglits, like:
        bin/ext_transform_feedback-nonflat-integral
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add dumping for use/def/live-in/live-out | Rob Clark | 2016-04-04 | 3 | -10/+42
    Turned out to be useful to debug an issue in RA. Let's keep it.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop unused instr category arg | Rob Clark | 2016-04-04 | 5 | -114/+108
    No longer used, so drop the extra arg to ir3_instr_create().
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove ir3_instruction::category | Rob Clark | 2016-04-04 | 10 | -93/+84
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: encode instruction category in opc_t | Rob Clark | 2016-04-04 | 5 | -192/+201
    Been on my TODO list for a while. If nothing else this will make gdb properly grok the opc_t enum.
    This first step preserves ir3_instruction::category (with an added assert that category matches what is encoded in opc_t). Next step is to drop the category field (and arg to ir3_instr_create()), but that is split into next commit for bisectability and so that we can run piglit in the intermediate state to flush out any problems.
    Signed-off-by: Rob Clark <[email protected]>
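
Packing a category into the opcode enum can be sketched as follows. The bit width, macro, enumerator names and values are all illustrative assumptions and are not taken from ir3; the point is only that the category becomes recoverable from the opcode itself, so a separately stored category can be asserted against it and later removed.

```c
#include <assert.h>

#define EX_NOPC_BITS 6  /* hypothetical width of the per-category opcode */
#define EX_OPC(cat, op) (((cat) << EX_NOPC_BITS) | (op))

/* Illustrative opcodes; each value carries its category in the high bits. */
typedef enum {
    EX_OPC_NOP   = EX_OPC(0, 0),
    EX_OPC_MOV   = EX_OPC(1, 0),
    EX_OPC_ADD_F = EX_OPC(2, 0),
    EX_OPC_MUL_F = EX_OPC(2, 1)
} ex_opc_t;

static inline unsigned ex_opc_cat(ex_opc_t opc)
{
    return (unsigned)opc >> EX_NOPC_BITS;
}

static inline unsigned ex_opc_op(ex_opc_t opc)
{
    return (unsigned)opc & ((1u << EX_NOPC_BITS) - 1);
}

/* Transitional check in the spirit of the commit message: a category
 * stored elsewhere must agree with the one encoded in the opcode. */
void ex_check_category(ex_opc_t opc, unsigned category)
{
    assert(ex_opc_cat(opc) == category);
}
```

Because each enumerator has a distinct value, a debugger can also print the symbolic name unambiguously, which fits the gdb remark above.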
* nvc0: add hardware ETC2 and ASTC support on GK20A and GM107+ | Ilia Mirkin | 2016-04-04 | 3 | -2/+64
    Signed-off-by: Ilia Mirkin <[email protected]>
* gallivm: Introduce lp_format_intrinsic. | Jose Fonseca | 2016-04-04 | 3 | -14/+54
    For adding .v4f32-like suffixes to intrinsics, taking special care of the scalar case, which was often being neglected.
    This fixes invalid IR when doing mipmap filtering on SSE2 (the only case where we'd use intrinsics with scalars.)
    Reviewed-by: Roland Scheidegger <[email protected]>
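
The suffix rule can be illustrated with a tiny formatter. This is a hedged sketch and not the gallivm helper: the function name and its parameters are hypothetical, and it handles float types only. It relies on LLVM's overloaded-intrinsic naming, where a 4 x 32-bit float vector takes the suffix `.v4f32` and a scalar takes `.f32` with no `vN` part.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "<root>.<type suffix>" into out.
 * length is the number of vector elements (1 means scalar) and
 * width is the float width in bits. Returns what snprintf returns.
 */
int ex_format_float_intrinsic(char *out, size_t size, const char *root,
                              unsigned length, unsigned width)
{
    assert(out && root && size > 0 && length >= 1);
    if (length == 1)
        /* Scalar case: no "vN" prefix, e.g. "llvm.sqrt.f32". */
        return snprintf(out, size, "%s.f%u", root, width);
    /* Vector case, e.g. "llvm.sqrt.v4f32". */
    return snprintf(out, size, "%s.v%uf%u", root, length, width);
}
```

Emitting a `v1` prefix for a scalar would name an intrinsic for a one-element vector type instead, which is the kind of mismatch the scalar special case avoids.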
* gallivm: Use llvm.fabs. | Jose Fonseca | 2016-04-03 | 1 | -8/+3
    Exactly the same code.
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Prefer backend agnostic intrinsic for rounding. | Jose Fonseca | 2016-04-03 | 1 | -7/+39
    We could unconditionally use these intrinsics, but performance with SSE2 would suck, as LLVM falls back to calling libm.
    lp_test_arit.
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Add debug option to force SSE2. | Jose Fonseca | 2016-04-03 | 1 | -11/+14
    For simulating less capable machines.
    Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Test abs. | Jose Fonseca | 2016-04-03 | 1 | -0/+1
    Trivial.
* llvmpipe: Build lp_test_arit on MSVC too. | Jose Fonseca | 2016-04-03 | 1 | -3/+1
    It builds fine now. Probably due to C99 support.
    Trivial.
* gallivm: Fix performance regressions due to vector selects. | Jose Fonseca | 2016-04-03 | 1 | -22/+18
    LLVM often can't determine the mask elements are all ones/zeros, and there doesn't seem to be a good way to hint that.
    Thanks to Roland Scheidegger for spotting and analyzing the issue.
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Remove lp_build_load_volatile. | Jose Fonseca | 2016-04-03 | 2 | -12/+0
    No longer needed.
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards. | Jose Fonseca | 2016-04-03 | 9 | -27/+39
    Only provide a fallback for LLVM 3.3.
    One less dependency on LLVM C++ interface.
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gm107/ir: add OP_SELP emission, used in DSQRT lowering | Ilia Mirkin | 2016-04-02 | 1 | -0/+30
    The current DSQRT lowering code emits an OP_SELP, so we have to handle its emission. This will eventually go away, but no harm supporting this op.
    Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: we can't load local memory directly into an output | Ilia Mirkin | 2016-04-02 | 1 | -1/+2
    This fixes piglit tests like
        tests/spec/glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr.shader_test
    and related ones.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "11.1 11.2" <[email protected]>
* nv50/ir: fix envyas variants when building the code lib | Samuel Pitoiset | 2016-04-02 | 1 | -2/+2
    nvc0 and nve4 have been respectively replaced by gf100 and gk104.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* svga: remove unused svga_compile_key::texture_msaa field | Brian Paul | 2016-04-02 | 2 | -2/+0
    Reviewed-by: Jose Fonseca <[email protected]>
* svga: check TXF instruction's target to determine MSAA | Brian Paul | 2016-04-02 | 1 | -1/+1
    Rather than the currently bound texture. This goes along with the earlier patch to get away from examining bound textures and sampler views during shader translation.
    Fixes VMware bug 1632739.
    Reviewed-by: Jose Fonseca <[email protected]>
* tgsi: add simple tgsi_is_msaa_target() helper | Brian Paul | 2016-04-02 | 1 | -0/+8
    Reviewed-by: Jose Fonseca <[email protected]>