path: root/src/gallium/drivers/freedreno
Commit message | Author | Age | Files | Lines
* nir: rename nir_foreach_block*() to nir_foreach_block*_call() | Connor Abbott | 2016-04-20 | 1 | -1/+1
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* freedreno/a4xx: lower srgb in shader for astc textures | Rob Clark | 2016-04-19 | 7 | -6/+62
| | | | | | | | | | | | | | This *seems* like a hw bug, and maybe only applies to certain a4xx variants/revisions. But setting the SRGB bit in sampler view state (texconst0) causes invalid alpha for ASTC textures. Work around this by doing the srgb->linear conversion in the shader instead. This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb* (The remaining fails seem to be a bug w/ ASTC + linear filtering, also possibly a420.0 specific.) Signed-off-by: Rob Clark <[email protected]>
* freedreno: cleanup fd_set_sampler_views | Rob Clark | 2016-04-19 | 1 | -37/+24
| | | | | | | The separate FS/VS entrypoints are no longer used since a3ed98f. So just inline them. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix grouping issue w/ reverse swizzles | Rob Clark | 2016-04-18 | 1 | -1/+17
| | | | | | | | | | | | | | | | | | | | | | | When we have something like: MOV OUT[n], IN[m].wzyx the existing grouping code was missing a potential conflict. Due to input needing to be sequential scalar regs, we have: IN: x <-> y <-> z <-> w which would be grouped to: OUT: w <-> z2 <-> y2 <-> x (where the 2 denotes a copy/mov) but that can't actually work. We need to realize that x and w are already in the same chain, not just that they aren't both already in new chain being built. With this fixed, we probably no longer need the hack from f68f6c0. Signed-off-by: Rob Clark <[email protected]>
* nir/dead_variables: Configurably work with any variable mode | Jason Ekstrand | 2016-04-13 | 1 | -1/+1
| | | | | | | The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone. Reviewed-by: Kenneth Graunke <[email protected]>
* Revert "freedreno/a4xx: better occlusion/sample counting" | Rob Clark | 2016-04-13 | 1 | -6/+1
| | | | | | | | | | This reverts commit 62fa868728c729152af0d7cecd1d3e47e831cb7d. dEQP-GLES3.functional.occlusion_query.* was unhappy about that change. Still not really sure *what* the other slots in the sample results buffer are. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: rasterizer_discard support | Rob Clark | 2016-04-13 | 1 | -0/+17
| | | | | | | | | | | | | | | | | This one is slightly annoying, since trying to write RBRC from draw would clobber values set in the tiling/gmem code. We could do command- stream patching for RBRC, as is done on a3xx. Although since it seems to be a rarely used feature, it is easier just to do RMW to set/clear the bit. Fixes dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_triangles and related tests. a3xx still needs the same feature, although there it probably makes more sense to take advantage of the existing cmdstream patching which is required for RBRC for other reasons. Signed-off-by: Rob Clark <[email protected]>
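The read-modify-write approach mentioned above can be sketched on the CPU side as follows. This is only an illustration, assuming the common RMW semantics of new = (old & and_mask) | or_value; the helper name `reg_rmw` and the bit position `DISCARD_BIT` are hypothetical and not taken from the driver or its register headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define DISCARD_BIT (UINT32_C(1) << 3)

/* Set or clear a single bit while leaving every other bit of the
 * register value untouched, so state written elsewhere (for example
 * by tiling/gmem code) is not clobbered. */
static uint32_t reg_rmw(uint32_t old, uint32_t bit, bool set)
{
    uint32_t and_mask = ~bit;            /* keep everything except 'bit' */
    uint32_t or_value = set ? bit : 0;   /* then optionally turn it on */

    return (old & and_mask) | or_value;
}
```

Because only the masked bit changes, the draw path does not need to know what the other fields of the register currently hold.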
* freedreno/ir3: fix array textures on a4xx | Rob Clark | 2016-04-13 | 1 | -3/+9
| | | | | | | | Seems like a4xx needs offset added to array index for all arrays, whereas a3xx only for cubemap arrays. Fixes a whole swath of dEQP fails (roughly *sampler2darray*). Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix stream-out offset handling for lines/tris | Rob Clark | 2016-04-13 | 1 | -1/+1
| | | | | | | | We need to increment offset by # of vertices, not by # of prims. Fixes a bunch of dEQP fails involving prims other than points. For example, dEQP-GLES3.functional.transform_feedback.position.lines_separate Signed-off-by: Rob Clark <[email protected]>
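A minimal sketch of the difference between counting primitives and counting vertices, assuming list primitives only. The enum and both function names are hypothetical (the driver itself works with gallium's PIPE_PRIM_* values), and the actual one-line change is not shown in this log.

```c
#include <assert.h>

/* Hypothetical primitive enum, for illustration only. */
enum prim_type { PRIM_POINTS, PRIM_LINES, PRIM_TRIANGLES };

/* Number of vertices written to the stream-out buffer per primitive. */
static unsigned verts_per_prim(enum prim_type type)
{
    switch (type) {
    case PRIM_POINTS:    return 1;
    case PRIM_LINES:     return 2;
    case PRIM_TRIANGLES: return 3;
    }
    assert(!"unknown primitive type");
    return 0;
}

/* Advance the stream-out offset (in vertices) after a draw that
 * emitted num_prims primitives. Adding num_prims alone is only
 * correct for points. */
static unsigned advance_so_offset(unsigned offset, enum prim_type type,
                                  unsigned num_prims)
{
    return offset + num_prims * verts_per_prim(type);
}
```

For points the two counts coincide, which would explain why only tests using lines and triangles were affected.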
* freedreno: fix handling for stream-out offsets | Rob Clark | 2016-04-13 | 1 | -1/+2
| | | | | | | | | | | | | | | | | | If changed && append, we shouldn't be resetting the internal offset back to zero. This fixes issues w/ sequences like: glBeginTransformFeedback() glDraw() glPauseTransformFeedback() glDraw() glResumeTransformFeedback() glDraw() glEndTransformFeedback() Fixes dEQP-GLES3.functional.transform_feedback.array.separate.points.lowp_vec3 and related tests. Signed-off-by: Rob Clark <[email protected]>
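The changed/append rule can be sketched like this. It assumes the gallium convention that an offset of (unsigned)-1 requests "append"; `struct so_slot`, `SO_APPEND` and `bind_so_target` are hypothetical names for illustration, not the driver's own.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: an offset of (unsigned)-1 means "append", i.e. continue
 * writing where the previous draw left off. */
#define SO_APPEND ((unsigned)-1)

/* Hypothetical per-target stream-out state. */
struct so_slot {
    const void *target;   /* currently bound buffer, or NULL */
    unsigned offset;      /* internal write offset, in vertices */
};

static void bind_so_target(struct so_slot *slot, const void *target,
                           unsigned offset)
{
    bool changed = (slot->target != target);
    bool append  = (offset == SO_APPEND);

    if (!changed && append)
        return;                 /* same target, keep going: nothing to do */

    slot->target = target;

    /* Only an explicit offset resets the internal offset. When the
     * target changed but append was requested (pause/resume), the
     * running offset must be preserved. */
    if (!append)
        slot->offset = offset;
}
```

With this rule, unbinding on pause and rebinding with append on resume leaves the running offset intact, so the draw after glResumeTransformFeedback() continues after the data already captured.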
* freedreno: fix prims-emitted query | Rob Clark | 2016-04-13 | 3 | -2/+12
| | | | | | This should only count when TF is not paused. Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix max-line-width | Rob Clark | 2016-04-13 | 1 | -0/+10
| | | | | | | | | | | dEQP noticed that we were advertising completely bogus values. The actual maximum is 127.0f. *But* we have to use an artificially low maximum to work around a bug in the dEQP test, which gets confused when the max line width is too large and lines start going off-screen. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add flag to enable dEQP hacks | Rob Clark | 2016-04-13 | 2 | -0/+2
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: hack to avoid getting stuck in a loop | Rob Clark | 2016-04-13 | 1 | -1/+11
| | | | | | | There are still some edge cases which result in a neighbor-loop. That still needs to be fixed, but this hack at least makes dEQP tests finish. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use (ss) instead of (sy) for ldlv | Rob Clark | 2016-04-13 | 1 | -1/+7
| | | | | | | Fixes a bunch of flat-varying failures on a4xx (where we need to use ldlv to read the un-interpolated varying). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cleanup double cmps.s from frontend | Rob Clark | 2016-04-13 | 1 | -0/+31
| | | | | | | | | | | | | Since we cannot mov into a predicate register, the frontend uses a 'cmps.s p0.x, cond, 0' as a stand-in for mov to p0.x. It does this since it has no way to know that the source cond instruction (ie. for a kill, br, etc) will only be used to write the predicate reg. Detect this, and re-write the instruction writing p0.x to skip the original cmps.[sfu]. (It is done like this, rather than re-writing the dest of the first cmps.[sfu] in case the first cmps.[sfu] actually has other users.) Signed-off-by: Rob Clark <[email protected]>
* gallium: Add capability for ARB_robust_buffer_access_behavior. | Bas Nieuwenhuizen | 2016-04-12 | 1 | -0/+1
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add pipe_context::set_active_query_state for pausing queries | Marek Olšák | 2016-04-12 | 1 | -0/+6
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nir/lower_system_values: Add support for several computed values | Jason Ekstrand | 2016-04-11 | 1 | -0/+1
| | | | Reviewed-by: Rob Clark <[email protected]>
* gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT | Edward O'Callaghan | 2016-04-07 | 1 | -0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add PIPE_CAP to determine if the GL extension 'GL_ARB_framebuffer_no_attachments' shall be supported. The driver is required to support 'PIPE_FORMAT_NONE' via its 'is_format_supported()' callback in order to determine the MSAA modes the hardware supports so that values requested from the application using 'GL_ARB_framebuffer_no_attachments' may be quantized to what the hardware expects. V.2: Fix doc for a more detailed description of the PIPE_CAP and the corresponding GL constant. V.3: Renamed and repurposed once again. V.4: Remove CAP from cap_mapping array. [airlied: fix damaged whitespace] Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* freedreno/ir3: insert extra move into phi | Rob Clark | 2016-04-05 | 1 | -0/+10
| | | | | | | | | | | | | | | We had an implicit assumption that the phi src was assigned in its source (pred) block leading into the phi. But this is not true with NIR, so we can't just ignore the source block specified in the nir_phi_src. Insert an extra mov in the source block. If it is not required the CP pass will take it back out again. Fixes: ./tests/spec/glsl-1.10/execution/vs-call-in-nested-loop.shader_test ./tests/spec/glsl-1.10/execution/vs-inner-loop-modifies-outer-loop-var.shader_test and probably others. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: eliminate unnecessary absneg's | Rob Clark | 2016-04-05 | 2 | -3/+26
| | | | | | | | | | | | | | | | The frontend inserts (abs) and (neg)'s to convert between NIR boolean (~0/0) and native boolean (1/0). So we'd end up with things like: cmps.s.ge r1.x, ... absneg.s r1.x, (neg)r1.x absneg.s r1.x, (abs)r1.x sel.b32 r2.x, r0.x, r1.x, r0.y The (neg) already gets collapsed due to the following (abs). Now by realizing that r1.x comes from a cmps.s instruction, we can drop the (abs) as well. Signed-off-by: Rob Clark <[email protected]>
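The two boolean conventions, and why the (abs) becomes redundant, can be modelled in plain C. This is a sketch of the arithmetic only, not of the ir3 pass; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NIR_TRUE    (~INT32_C(0))   /* NIR boolean true: all bits set (-1) */
#define NATIVE_TRUE INT32_C(1)      /* native boolean true: 1 */

/* (neg): native 1/0 -> NIR ~0/0 */
static int32_t native_to_nir_bool(int32_t b)
{
    return -b;
}

/* (abs): NIR ~0/0 -> native 1/0. Inputs are only ever 0, 1 or -1
 * here, so the negation cannot overflow. */
static int32_t nir_to_native_bool(int32_t b)
{
    return b < 0 ? -b : b;
}
```

Since abs(neg(x)) equals abs(x), the (neg) folds into the following (abs). If x is already known to be 0 or 1, as it is for the result of a cmps.s, then abs(x) equals x and the remaining (abs) can be dropped too.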
* freedreno/ir3: deal with duplicate phi sources | Rob Clark | 2016-04-04 | 1 | -5/+20
| | | | | | | | | | | | | Otherwise we end up with funny things like: mov.f32f32 r0.x, r1.y mov.f32f32 r0.x, r1.y (It doesn't happen as much after fixing the problem w/ CP into phi src, but it can still happen since we aren't too clever about generating phi sources in the first place.) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix silly brain-fart in RA | Rob Clark | 2016-04-04 | 1 | -2/+1
| | | | | | | We want to consider all the vars, not 1/32nd of them, when extending live-ranges. Signed-off-by: Rob Clark <[email protected]>
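The log does not show the changed lines, but "1/32nd of them" suggests a loop bounded by the number of 32-bit bitset words rather than the number of bits. The sketch below illustrates that class of bug under that assumption; every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS 32u
/* Number of 32-bit words needed to hold nbits bits. */
#define NUM_WORDS(nbits) (((nbits) + WORD_BITS - 1) / WORD_BITS)

static bool bit_is_set(const uint32_t *set, unsigned i)
{
    return (set[i / WORD_BITS] >> (i % WORD_BITS)) & 1u;
}

/* Count live variables. The loop bound must be num_vars (one
 * iteration per bit). Using NUM_WORDS(num_vars) as the bound while
 * still testing individual bits would visit only the first 1/32nd
 * of the variables. */
static unsigned count_live(const uint32_t *live, unsigned num_vars)
{
    unsigned n = 0;

    for (unsigned i = 0; i < num_vars; i++)
        if (bit_is_set(live, i))
            n++;

    return n;
}
```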
* freedreno/ir3: don't cp into phi's | Rob Clark | 2016-04-04 | 1 | -0/+6
| | | | | | | | The block defining a phi source might not have been executed. If we allow copy propagation, we could end up pointing to a src instruction in the wrong block. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: we can't store immediate values | Rob Clark | 2016-04-04 | 1 | -0/+13
| | | | | | | | Fixes some transform-feedback piglits, like: bin/ext_transform_feedback-nonflat-integral Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add dumping for use/def/live-in/live-out | Rob Clark | 2016-04-04 | 3 | -10/+42
| | | | | | Turned out to be useful to debug an issue in RA. Let's keep it. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop unused instr category arg | Rob Clark | 2016-04-04 | 5 | -114/+108
| | | | | | No longer used, so drop the extra arg to ir3_instr_create() Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove ir3_instruction::category | Rob Clark | 2016-04-04 | 10 | -93/+84
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: encode instruction category in opc_t | Rob Clark | 2016-04-04 | 5 | -192/+201
| | | | | | | | | | | | | Been on my TODO list for a while. If nothing else this will make gdb properly grok the opc_t enum. This first step preserves ir3_instruction::category (with an added assert that category matches what is encoded in opc_t). Next step is to drop the category field (and arg to ir3_instr_create()), but that is split into next commit for bisectability and so that we can run piglit in the intermediate state to flush out any problems. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix for load_front_face intrinsic | Rob Clark | 2016-03-28 | 1 | -1/+8
| | | | | | | Seems like trying to widen in the same instruction as the add.s does a non-sign-extending widen. Signed-off-by: Rob Clark <[email protected]>
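The difference between the two kinds of widening can be shown in plain C. The 16-bit source width is an assumption made for illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Non-sign-extending widen: the upper bits are filled with zeros,
 * so a 16-bit -1 (0xffff) becomes 65535. */
static int32_t widen_zero_extend(int16_t v)
{
    return (int32_t)(uint16_t)v;
}

/* Sign-extending widen: the sign bit is replicated into the upper
 * bits, so a 16-bit -1 stays -1 (0xffffffff). */
static int32_t widen_sign_extend(int16_t v)
{
    return (int32_t)v;
}
```

The two agree for non-negative inputs and differ only for negative ones, which matters when the consumer expects an all-ones value such as a ~0 boolean.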
* freedreno/ir3: fix compiler warn | Rob Clark | 2016-03-28 | 1 | -1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* nir: add a bit_size parameter to nir_ssa_dest_init | Connor Abbott | 2016-03-17 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago) v3: Fix tgsi (Samuel) v4: Fix nir_clone.c (Samuel) v5: Fix vc4 and freedreno (Iago) v6 (Sam) - Fix build errors in nir_lower_indirect_derefs - Use helper to get type size from nir_alu_type. Signed-off-by: Iago Toral Quiroga <[email protected]> Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: rename nir_const_value fields to include bitsize information | Iago Toral Quiroga | 2016-03-17 | 1 | -5/+5
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* freedreno/ir3: lower extract_byte/word | Rob Clark | 2016-03-13 | 1 | -0/+2
| | | | | | | | | | | | | | | | | | | | | | | The following commits broke things by starting to feed us unhandled extract_u16/extract_u8 opcodes: commit 905ff861982450831a56d112036f68a751337441 Author: Matt Turner <[email protected]> AuthorDate: Wed Feb 3 14:28:31 2016 -0800 Commit: Matt Turner <[email protected]> CommitDate: Fri Mar 4 11:52:34 2016 -0800 nir: Recognize open-coded extract_u16. commit 76289fbfa84a06ef4db8ad44ea0eb88ad0be8d5c Author: Matt Turner <[email protected]> AuthorDate: Thu Jan 21 09:09:48 2016 -0800 Commit: Matt Turner <[email protected]> CommitDate: Fri Mar 4 11:52:34 2016 -0800 nir: Recognize open-coded extract_u8. Signed-off-by: Rob Clark <[email protected]>
* freedreno: OUT_RELOC vs OUT_RELOCW fixes | Rob Clark | 2016-03-13 | 3 | -7/+7
| | | | | | Make sure we use OUT_RELOCW() in cases where the buffer is written to. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: hw binning | Rob Clark | 2016-03-13 | 4 | -33/+210
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: use generated headers for draw initiator | Rob Clark | 2016-03-13 | 1 | -3/+4
| | | | | | No need to open-code this. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: remove RB_RENDER_CONTROL patching | Rob Clark | 2016-03-13 | 6 | -41/+8
| | | | | | | | | Bitfields were shuffled around for the better on a4xx, so we don't need any patching on this one. It appears to be something we set entirely in the gmem code so no conflict between tiling and render state like we had in a3xx. Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2016-03-13 | 5 | -11/+32
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: move where we deal w/ binning FS | Rob Clark | 2016-03-13 | 3 | -10/+10
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: move where we deal w/ binning FS | Rob Clark | 2016-03-13 | 3 | -10/+10
| | | | | | | Move where we pick dummy FS for binning pass, so the whole driver sees the same dummy/no-op FS stage. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: constify the shader variants | Rob Clark | 2016-03-13 | 2 | -6/+6
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: constify the shader variants | Rob Clark | 2016-03-13 | 4 | -13/+13
| | | | | | Most of the driver just needs read-only access, so constify.. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: remove duplicate mark of end of binning cmds | Rob Clark | 2016-03-13 | 1 | -3/+0
| | | | Signed-off-by: Rob Clark <[email protected]>
* gallium: add CAPs returning PCI device location | Marek Olšák | 2016-03-09 | 1 | -0/+4
| | | | Reviewed-by: Brian Paul <[email protected]>
* gallium: add external usage flags to resource_from(get)_handle (v2) | Marek Olšák | 2016-03-09 | 1 | -1/+2
| | | | | | | | | This will allow drivers to make better decisions about texture sharing for DRI2, DRI3, Wayland, and OpenCL. v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER Reviewed-by: Axel Davy <[email protected]>
* freedreno/ir3: enable shareable shaders | Rob Clark | 2016-03-01 | 5 | -8/+12
| | | | | | | Now that we are no longer using the pctx reference in the shader, drop it and turn on shareable shaders. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: pass ctx to constant-emit code | Rob Clark | 2016-03-01 | 4 | -25/+21
| | | | | | | Rather than fishing it out of the shader. This removes the other big user of shader->pctx. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add dev ptr to ir3_compiler | Rob Clark | 2016-03-01 | 6 | -8/+10
| | | | | | | | And use this for allocating bo's to hold the shader binary, rather than accessing the dev via ctx ptr. One step towards making shaders shareable across contexts. Signed-off-by: Rob Clark <[email protected]>