path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* nvc0: handle the case where there are no framebuffer attachments | Ilia Mirkin | 2016-04-09 | 4 | -11/+44
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: support sending string markers down into the command stream | Ilia Mirkin | 2016-04-09 | 4 | -2/+52
  This should hopefully make it a little easier to debug with GL
  applications like glretrace and looking at command streams.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: add invalidate_resource support for buffer resources | Ilia Mirkin | 2016-04-09 | 7 | -2/+51
  Provide a callback to reallocate the underlying storage of a resource so
  that it is not bound to any existing fences.

  Signed-off-by: Ilia Mirkin <[email protected]>
* vc4: Move FRAG_X/Y/REV_FLAG to a QFILE like VPM or TLB color writes. | Eric Anholt | 2016-04-08 | 4 | -27/+29
  This gives us one less set of special instruction generation cases, and
  instead just the case for returning the correct register to read.
* vc4: Allow TLB Z/color/stencil writes from any ALU operation in QIR. | Eric Anholt | 2016-04-08 | 5 | -65/+100
  This lets us write the Z directly from the FTOI for computed Z, and may
  let us coalesce color writes in the future.

  No change in my shader-db, but clearly drops an instruction in piglit's
  early-z test.
* vc4: Add a helper function for the construction of qregs. | Eric Anholt | 2016-04-08 | 4 | -12/+13
  The separate declaration of the struct is not helping clarity, and I was
  going to be writing a whole lot more of these in the upcoming patches.
* vc4: Add missing scheduling dependency for MS color writes. | Eric Anholt | 2016-04-08 | 1 | -0/+1
* vc4: Drop the multi_instruction distinction for QIR instructions. | Eric Anholt | 2016-04-08 | 2 | -14/+5
  It wasn't correctly flagged everywhere, and QPU generation now handles
  the only remaining case that was paying attention to it.

  No change on shader-db.
* vc4: Handle SF on instructions that write r4. | Eric Anholt | 2016-04-08 | 1 | -10/+14
  Normal SFU writes couldn't have SF because they were marked as
  multi_instruction, but tex_result and tlb_color_read weren't. This ended
  up not being a problem according to anything in shader-db, but it seems
  possible.
* vc4: Allow multi-instruction QIR nodes to get VPM optimization. | Eric Anholt | 2016-04-08 | 1 | -2/+2
  There used to be multi-instruction operations that would use src[]
  twice, which is why we couldn't do some optimizations on them. This is
  no longer the case.

  total instructions in shared programs: 77973 -> 77969 (-0.01%)
  instructions in affected programs: 84 -> 80 (-4.76%)
  total estimated cycles in shared programs: 234165 -> 234157 (-0.00%)
  estimated cycles in affected programs: 92 -> 84 (-8.70%)
* vc4: Switch to using NIR_PASS macros. | Eric Anholt | 2016-04-08 | 5 | -33/+32
  This gets us better validation of our NIR transformations.
* vc4: Handle nir_intrinsic_load_user_clip_plane as a vec4. | Eric Anholt | 2016-04-08 | 2 | -20/+12
  I liked having all my NIR be scalar, but nir_validate() complains that
  the intrinsic writes 4 components but the destination we set up was only
  1 component. I could generate a new scalar variant, but it's a lot
  easier to just leave it as a vec4.

  This doesn't hurt codegen since we GC unused uniforms, and UCP dot
  products use all the components anyway.
* vc4: Emit a warning and proceed for handling loops in NIR. | Rhys Kidd | 2016-04-08 | 1 | -1/+13
  We don't really support control flow yet, but it's a lot nicer to render
  something and warn on stderr than to crash.

  Fixes the following piglit tests:
  - shaders/complex-loop-analysis-bug
  - shaders/glsl-fs-discard-04

  Converts the following piglit tests from crash to fail:
  - shaders/glsl-fs-continue-inside-do-while
  - shaders/glsl-fs-loop
  - shaders/glsl-fs-loop-continue
  - shaders/glsl-fs-loop-nested
  - shaders/glsl-texcoord-array
  - shaders/glsl-vs-continue-inside-do-while
  - shaders/glsl-vs-loop
  - shaders/glsl-vs-loop-continue
  - shaders/glsl-vs-loop-nested

  No piglit regressions.

  v2 (Eric): Add stronger stderr warning.

  Signed-off-by: Rhys Kidd <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* vc4: Add a stub for NIR->QIR of control flow function nodes | Rhys Kidd | 2016-04-08 | 1 | -0/+11
  We shouldn't have any NIR functions present since all GLSL functions get
  inlined, but this would be a more informative error if it does happen.

  Signed-off-by: Rhys Kidd <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* vc4: Add better debug of NIR->QIR control flow graph failure | Rhys Kidd | 2016-04-08 | 1 | -1/+2
  Ensure NIR control flow graph nodes that are unhandled in QIR are
  reported with sufficient verbosity to aid debugging. This improves
  piglit outputs, amongst other tools.

  There are no other remaining uses of assert(0) as a blunt tool within
  vc4.

  Signed-off-by: Rhys Kidd <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* vc4: Remove unused include from vc4_program.c | Rhys Kidd | 2016-04-08 | 1 | -1/+0
  Found with grep and inspection. Test compiled on RPi hw.
  Assists any future effort to remove TGSI as an intermediate stage.

  Signed-off-by: Rhys Kidd <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* radeonsi: do per-pixel clipping based on viewport states | Marek Olšák | 2016-04-08 | 2 | -11/+85
  In other words, vport scissors are derived from viewport states.
  If the scissor test is enabled, the intersection of both is used.

  The guard band will disable clipping, so we have to clip per-pixel.

  Reviewed-by: Nicolai Hähnle <[email protected]>
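The scissor rule described in that message (use the viewport-derived scissor, and intersect it with the user scissor only when the scissor test is enabled) can be sketched as below. This is only an illustration: `struct rect`, `rect_intersect` and `effective_scissor` are hypothetical names, not the driver's actual types or functions.

```c
#include <assert.h>

/* Hypothetical rectangle type for illustration; not a radeonsi structure. */
struct rect {
    int minx, miny, maxx, maxy;
};

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Intersect two rectangles. If they do not overlap, the result is
 * clamped to a zero-area rectangle (max == min on the empty axis). */
static struct rect rect_intersect(struct rect a, struct rect b)
{
    struct rect r;

    r.minx = imax(a.minx, b.minx);
    r.miny = imax(a.miny, b.miny);
    r.maxx = imin(a.maxx, b.maxx);
    r.maxy = imin(a.maxy, b.maxy);
    if (r.maxx < r.minx)
        r.maxx = r.minx;
    if (r.maxy < r.miny)
        r.maxy = r.miny;
    return r;
}

/* The viewport-derived scissor always applies; the user scissor only
 * narrows it when the scissor test is enabled. */
static struct rect effective_scissor(struct rect vp_scissor,
                                     struct rect user_scissor,
                                     int scissor_enabled)
{
    assert(vp_scissor.minx <= vp_scissor.maxx);
    assert(vp_scissor.miny <= vp_scissor.maxy);
    return scissor_enabled ? rect_intersect(vp_scissor, user_scissor)
                           : vp_scissor;
}
```

With the scissor test disabled the viewport rectangle is returned unchanged, which is the per-pixel clip the message says is still needed once the guard band disables clipping.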
* nv50/ir: do not try to attach JOIN ops to ATOM | Samuel Pitoiset | 2016-04-07 | 1 | -1/+1
  This might result in an INVALID_OPCODE dmesg error in case a join is
  attached to an atomic operation.

  Spotted with arb_shader_image_load_store-host-mem-barrier on GK104.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* radeonsi: raise number of samplers per shader to 32 | Nicolai Hähnle | 2016-04-07 | 1 | -3/+3
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94835
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: expand the compressed color and depth texture masks to 64 bits | Nicolai Hähnle | 2016-04-07 | 3 | -18/+18
  This is in preparation of raising the number of exposed sampler views to
  32, which will raise the total number of sampler views to 33 for the
  polygon stipple texture.

  That texture should never be compressed (and it's certainly not a depth
  texture), but this approach seems cleaner to me than special-casing the
  last slot in all affected code paths.

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
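A small sketch of why a 33rd slot needs a wider mask: bit index 32 does not fit in a 32-bit word, so a per-slot bitmask has to be 64 bits wide. The macro and function names here are hypothetical and are not taken from the radeonsi sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slot counts: 32 user-visible sampler views plus one
 * extra slot for the polygon stipple texture. */
#define NUM_USER_SAMPLER_VIEWS 32
#define NUM_SAMPLER_VIEWS (NUM_USER_SAMPLER_VIEWS + 1)

/* Mark a slot in a 64-bit mask. The constant 1 must be 64 bits wide,
 * otherwise shifting by 32 or more is undefined behavior in C. */
static uint64_t mask_set(uint64_t mask, unsigned slot)
{
    assert(slot < NUM_SAMPLER_VIEWS);
    return mask | (UINT64_C(1) << slot);
}

/* Return 1 if the slot is marked in the mask, 0 otherwise. */
static int mask_test(uint64_t mask, unsigned slot)
{
    assert(slot < NUM_SAMPLER_VIEWS);
    return (int)((mask >> slot) & 1);
}
```

Keeping all 33 slots in one mask avoids special-casing the last slot, which is the trade-off the message describes.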
* radeonsi: replace magic 16 by SI_NUM_USER_SAMPLERS | Nicolai Hähnle | 2016-04-07 | 1 | -1/+1
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* svga: new SVGA_MSAA env var to disable/enable MSAA pixel formats | Brian Paul | 2016-04-07 | 1 | -2/+4
  On by default.

  Reviewed-by: Jose Fonseca <[email protected]>
* svga: add some trivial null pointer checks | Brian Paul | 2016-04-07 | 3 | -0/+9
  These small mallocs will probably never fail, but static analysis tools
  may complain about the missing checks.

  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* trace: add missing set_shader_images() | Samuel Pitoiset | 2016-04-07 | 3 | -0/+81
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable perfect ZPASS counts for PIPE_QUERY_OCCLUSION_PREDICATE | Marek Olšák | 2016-04-07 | 3 | -5/+16
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't use the real barrier instruction in tess ctrl shaders | Marek Olšák | 2016-04-07 | 1 | -0/+8
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* r600: use radeon_emit in a few more places in evergreen_compute | Dave Airlie | 2016-04-07 | 1 | -4/+4
  This is just a cleanup of the code.

  Acked-by: Tom Stellard <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600: make compute global buffer functions static. | Dave Airlie | 2016-04-07 | 2 | -98/+86
  This moves things around so that the global buffer handling functions in
  evergreen_compute.c are static.

  Acked-by: Tom Stellard <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600: make two compute functions static. | Dave Airlie | 2016-04-07 | 2 | -5/+3
  These aren't used outside evergreen_compute.c

  Acked-by: Tom Stellard <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600: using pipe_grid_info more in evergreen_compute. | Dave Airlie | 2016-04-07 | 2 | -26/+21
  No reason to pull the pieces apart here, also make one of the functions
  static as it's unused outside this.

  Acked-by: Tom Stellard <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600: in evergreen_compute use ctx consistently instead of ctx_ | Dave Airlie | 2016-04-07 | 1 | -25/+25
  Acked-by: Tom Stellard <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600: use rctx consistently in evergreen_compute.c | Dave Airlie | 2016-04-07 | 1 | -74/+74
  Another step towards cleaning this up.

  Acked-by: Tom Stellard <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600: cleanup whitespace in evergreen_compute.c | Dave Airlie | 2016-04-07 | 1 | -87/+75
  This aligns the code with the style of the rest of the driver.

  Makes editing it a lot less painful.

  Acked-by: Tom Stellard <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600g: Enable ARB_framebuffer_no_attachments | Edward O'Callaghan | 2016-04-07 | 1 | -1/+1
  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Enable ARB_framebuffer_no_attachments | Edward O'Callaghan | 2016-04-07 | 1 | -1/+1
  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Improve assert info out of si_set_framebuffer_state() | Edward O'Callaghan | 2016-04-07 | 1 | -0/+2
  Let's give the developer a little hand if we are going to assert on a
  zero literal at the end of a branch.

  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Allow 16 samples MSAA mode for PIPE_FORMAT_NONE | Edward O'Callaghan | 2016-04-07 | 1 | -0/+5
  For ARB_framebuffer_no_attachments: an is_format_supported() query with
  'PIPE_FORMAT_NONE' passed implies a query of the number of samples
  supported by a framebuffer with no attachments.

  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* softpipe: Set samples and layers in set_framebuffer_state() cb | Edward O'Callaghan | 2016-04-07 | 1 | -0/+2
  Carries across the number of samples and layers state in the
  'softpipe_set_framebuffer_state()' callback. This state is part of
  'ARB_framebuffer_no_attachments' support.

  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/trace: Dump no.of samples and layers in fb state | Edward O'Callaghan | 2016-04-07 | 1 | -0/+2
  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT | Edward O'Callaghan | 2016-04-07 | 14 | -0/+14
  Add PIPE_CAP to determine if the GL extension
  'GL_ARB_framebuffer_no_attachments' shall be supported.

  The driver is required to support 'PIPE_FORMAT_NONE' via its
  'is_format_supported()' callback in order to determine the MSAA modes
  the hardware supports so that values requested from the application
  using 'GL_ARB_framebuffer_no_attachments' may be quantized to what the
  hardware expects.

  V.2: Fix doc for a more detailed description of the PIPE_CAP and the
       corresponding GL constant.
  V.3: Renamed and repurposed once again.
  V.4: Remove CAP from cap_mapping array.

  [airlied: fix damaged whitespace]

  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: set shader calling conventions | Bas Nieuwenhuizen | 2016-04-06 | 1 | -1/+16
  Note that old mesa + new LLVM or new mesa + old LLVM breaks with this
  change and the corresponding LLVM change (D18559).

  For LLVM version <= 3.8 we use the old method, but we can't detect
  people using a post 3.8 svn version that is still too old.

  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Tom Stellard <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* freedreno/ir3: insert extra move into phi | Rob Clark | 2016-04-05 | 1 | -0/+10
  We had an implicit assumption that the phi src was assigned in its
  source (pred) block leading into the phi. But this is not true with
  NIR, so we can't just ignore the source block specified in the
  nir_phi_src.

  Insert an extra mov in the source block. If it is not required the CP
  pass will take it back out again.

  Fixes:
    ./tests/spec/glsl-1.10/execution/vs-call-in-nested-loop.shader_test
    ./tests/spec/glsl-1.10/execution/vs-inner-loop-modifies-outer-loop-var.shader_test
  and probably others.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: eliminate unnecessary absneg's | Rob Clark | 2016-04-05 | 2 | -3/+26
  The frontend inserts (abs) and (neg)'s to convert between NIR boolean
  (~0/0) and native boolean (1/0). So we'd end up with things like:

    cmps.s.ge r1.x, ...
    absneg.s r1.x, (neg)r1.x
    absneg.s r1.x, (abs)r1.x
    sel.b32 r2.x, r0.x, r1.x, r0.y

  The (neg) already gets collapsed due to the following (abs). Now by
  realizing that r1.x comes from a cmps.s instruction, we can drop the
  (abs) as well.

  Signed-off-by: Rob Clark <[email protected]>
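The arithmetic behind the two boolean conventions can be sketched in plain C. The function names are hypothetical and only model the (neg)/(abs) modifiers on 32-bit integers; they are not part of ir3.

```c
#include <assert.h>
#include <stdint.h>

/* Native boolean (1/0) -> NIR boolean (~0/0): negating 1 gives -1,
 * which is all ones (~0) in two's complement. Models (neg). */
static int32_t native_to_nir_bool(int32_t b)
{
    assert(b == 0 || b == 1);
    return -b;
}

/* NIR boolean (~0/0) -> native boolean (1/0): the absolute value of
 * -1 is 1. Models (abs). */
static int32_t nir_to_native_bool(int32_t b)
{
    assert(b == 0 || b == -1);
    return b < 0 ? -b : b;
}
```

For a value that is already 0 or 1, applying (neg) and then (abs) returns the value unchanged. That is consistent with the message's point that both modifiers are redundant when the source is a compare result, under the assumption that the compare already produces 1/0.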
* radeonsi: use bounded indexing for samplers | Bas Nieuwenhuizen | 2016-04-05 | 1 | -1/+4
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use bounded indexing for constant buffers | Bas Nieuwenhuizen | 2016-04-05 | 1 | -2/+3
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: allow multiple exports of the same texture with different usage | Marek Olšák | 2016-04-05 | 1 | -21/+33
  Instead of failing an assertion, disable DCC and CMASK on the first
  export that needs it, and merge the external usage flags.

  v2: clear the EXPLICIT_FLUSH flag if it's not set; whitespace fixes

  Reviewed-by: Michel Dänzer <[email protected]>
* freedreno/ir3: deal with duplicate phi sources | Rob Clark | 2016-04-04 | 1 | -5/+20
  Otherwise we end up with funny things like:

    mov.f32f32 r0.x, r1.y
    mov.f32f32 r0.x, r1.y

  (It doesn't happen as much after fixing the problem w/ CP into phi src,
  but it can still happen since we aren't too clever about generating phi
  sources in the first place.)

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix silly brain-fart in RA | Rob Clark | 2016-04-04 | 1 | -2/+1
  We want to consider all the vars, not 1/32nd of them, when extending
  live-ranges.

  Signed-off-by: Rob Clark <[email protected]>
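The message does not show the faulty code. One common way to visit only 1/32nd of a bitset is to bound a per-bit loop by the number of 32-bit words instead of the number of bits. The sketch below illustrates that assumed failure mode with hypothetical names; it is not the ir3 register allocator code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 32
/* Number of 32-bit words needed to hold nbits bits. */
#define BITSET_WORDS(nbits) (((nbits) + WORD_BITS - 1) / WORD_BITS)

/* Return 1 if the given bit is set, 0 otherwise. */
static int bitset_test(const uint32_t *set, unsigned bit)
{
    return (int)((set[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1);
}

/* Correct: visits every variable index in [0, nvars). */
static unsigned count_live(const uint32_t *live, unsigned nvars)
{
    unsigned n = 0;

    assert(live != NULL);
    for (unsigned i = 0; i < nvars; i++)
        n += (unsigned)bitset_test(live, i);
    return n;
}

/* Assumed bug: the loop bound is the word count, so only the first
 * nvars/32 variables are ever examined. */
static unsigned count_live_buggy(const uint32_t *live, unsigned nvars)
{
    unsigned n = 0;

    assert(live != NULL);
    for (unsigned i = 0; i < BITSET_WORDS(nvars); i++)
        n += (unsigned)bitset_test(live, i);
    return n;
}
```

With 64 variables the second loop examines only indices 0 and 1, so most live variables are never considered.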
* freedreno/ir3: don't cp into phi's | Rob Clark | 2016-04-04 | 1 | -0/+6
  The block defining a phi source might not have been executed. If we
  allow copy propagation, we could end up pointing to a src instruction
  in the wrong block.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: we can't store immediate values | Rob Clark | 2016-04-04 | 1 | -0/+13
  Fixes some transform-feedback piglits, like:

    bin/ext_transform_feedback-nonflat-integral

  Signed-off-by: Rob Clark <[email protected]>