path: root/src
Commit message | Author | Age | Files | Lines
* v3d: Handle a no-intersection scissor even if it's outside of the VP. | Eric Anholt | 2018-06-15 | 1 | -10/+8
  The min/maxes ended up producing a negative clip width/height for dEQP-GLES3.functional.fragment_ops.scissor.outside_render_line. Just make sure they stay at 0 (or v3d 3.x's workaround) if that happens.
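The clamping described in this commit can be sketched roughly as follows. This is only an illustration under assumptions: `struct clip_rect`, `intersect_clip()` and the helper names are hypothetical and are not the driver's own code, and the v3d 3.x workaround mentioned in the message is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rectangle type for this sketch; not the driver's own struct. */
struct clip_rect {
    uint32_t x, y, width, height;
};

static int32_t max_i32(int32_t a, int32_t b) { return a > b ? a : b; }
static int32_t min_i32(int32_t a, int32_t b) { return a < b ? a : b; }

/* Intersect a viewport box with a scissor box, both given as [min, max) edges. */
static struct clip_rect
intersect_clip(int32_t vp_minx, int32_t vp_miny, int32_t vp_maxx, int32_t vp_maxy,
               int32_t sc_minx, int32_t sc_miny, int32_t sc_maxx, int32_t sc_maxy)
{
    /* Sketch assumption: each input box is well formed on its own. */
    assert(vp_minx <= vp_maxx && vp_miny <= vp_maxy);
    assert(sc_minx <= sc_maxx && sc_miny <= sc_maxy);

    int32_t minx = max_i32(max_i32(vp_minx, sc_minx), 0);
    int32_t miny = max_i32(max_i32(vp_miny, sc_miny), 0);
    int32_t maxx = min_i32(vp_maxx, sc_maxx);
    int32_t maxy = min_i32(vp_maxy, sc_maxy);

    /* When the boxes do not overlap, max < min. Clamp the extent to 0
     * instead of letting the subtraction go negative. */
    struct clip_rect r;
    r.x = (uint32_t)minx;
    r.y = (uint32_t)miny;
    r.width = (uint32_t)max_i32(maxx - minx, 0);
    r.height = (uint32_t)max_i32(maxy - miny, 0);
    return r;
}
```

With a viewport of [0, 100) and a scissor of [200, 300) on both axes, the unclamped width would be 100 - 200 = -100; the clamp turns that into an empty 0x0 clip.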
* v3d: Use the proper depth texture type for sampling. | Eric Anholt | 2018-06-15 | 1 | -3/+3
  Fixes failing tests in dEQP-GLES3.functional.texture.shadow
* v3d: Limit shader threading according to our maximum TMU fifo usage. | Eric Anholt | 2018-06-15 | 1 | -10/+24
  Fixes simulator assertion failures in dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment and similar complicated cases.
* v3d: Fix shaders using pixel center W but no varyings. | Eric Anholt | 2018-06-15 | 4 | -16/+9
  The docs called this field "uses both center W and centroid W", but actually it's "do you need center W even if varyings don't obviously call for it?"

  Fixes dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_w
* intel/aubinator: Use int to store getopt_long flags. | Rafael Antognolli | 2018-06-15 | 1 | -2/+2
  getopt_long flag parameter is an int pointer, so if we use bool to store those values, when getopt_long writes to one of them, it might end up overwriting the next one.

  Reviewed-by: Ian Romanick <[email protected]>
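A minimal sketch of the pattern this commit describes, assuming a POSIX/GNU `getopt_long()`. The `flag` member of `struct option` is an `int *`, so the variables it points at are declared as `int`. The struct, the function name and the option names here are illustrative and are not taken from the aubinator source.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* getopt_long() stores a whole int through option.flag, so these must be
 * int. With a 1-byte bool, that store could spill into the next field. */
struct cli_flags {
    int help;
    int pager;
};

/* Hypothetical parser for this sketch. */
static struct cli_flags
parse_flags(int argc, char **argv)
{
    assert(argc >= 1 && argv != NULL);

    struct cli_flags f = { 0, 1 };
    const struct option opts[] = {
        { "help",     no_argument, &f.help,  1 },
        { "no-pager", no_argument, &f.pager, 0 },
        { NULL, 0, NULL, 0 },
    };

    optind = 1; /* restart scanning so the function can be called more than once */
    while (getopt_long(argc, argv, "", opts, NULL) != -1) {
        /* Options with a non-NULL flag pointer are stored directly. */
    }
    return f;
}
```

Calling `parse_flags()` with `--no-pager` leaves `help` at 0 and sets `pager` to 0, and neither store can touch the neighbouring field because both are full `int`s.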
* Revert "radv: always set/load both depth and stencil clear values" | Samuel Pitoiset | 2018-06-15 | 1 | -5/+28
  This fixes a rendering regression with RoTR.

  This reverts commit 4bdad9faddc82a4560603936ce5ade5707ecb254.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't check for linear images in emit_fast_color_clear() | Samuel Pitoiset | 2018-06-15 | 1 | -2/+0
  We don't enable CMASK for linear surfaces and addrlib only enables DCC for tiling surfaces.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allow RADV_PERFTEST=dccmsaa on GFX9 | Samuel Pitoiset | 2018-06-15 | 1 | -2/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add RADV_DEBUG=checkir | Samuel Pitoiset | 2018-06-15 | 5 | -3/+11
  This allows running the LLVM verifier pass.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: update ZRANGE_PRECISION in radv_update_bound_fast_clear_ds() | Samuel Pitoiset | 2018-06-15 | 1 | -31/+15
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clean up radv_{set,load}_depth_clear_regs() helpers | Samuel Pitoiset | 2018-06-15 | 3 | -32/+44
  And replace _regs by _metadata because it makes more sense.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always set/load both depth and stencil clear values | Samuel Pitoiset | 2018-06-15 | 1 | -28/+5
  I don't think it matters much to emit both values, and that makes the code a bit simpler.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: update the fast ds clear values only if the image is bound | Samuel Pitoiset | 2018-06-15 | 1 | -5/+32
  It's unnecessary to update the fast depth/stencil clear values if the fast cleared depth/stencil image isn't currently bound.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clean up radv_{set,load}_color_clear_regs() helpers | Samuel Pitoiset | 2018-06-15 | 3 | -33/+47
  And replace _regs by _metadata because it makes more sense.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: update the fast color clear values only if the image is bound | Samuel Pitoiset | 2018-06-15 | 1 | -3/+32
  It's unnecessary to update the fast color clear values if the fast cleared color image isn't currently bound.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* util/bitset: include util/macro.h | Christian Gmeiner | 2018-06-15 | 1 | -0/+1
  The BITSET_FFS(x) macro makes use of the ARRAY_SIZE(x) macro, which is defined in util/macro.h. Include it directly to make usage more straightforward.

  Fixes: 692bd4a1ab9 ("util: replace Elements() with ARRAY_SIZE()")
  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* nvc0: add support for programmable sample locations | Rhys Perry | 2018-06-14 | 10 | -46/+299
  Signed-off-by: Rhys Perry <[email protected]>
* st/mesa: add support for ARB_sample_locations | Rhys Perry | 2018-06-14 | 8 | -7/+129
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Brian Paul <[email protected]> (v2)
  Reviewed-by: Marek Olšák <[email protected]> (v2)
* gallium: add support for programmable sample locations | Rhys Perry | 2018-06-14 | 24 | -2/+120
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Brian Paul <[email protected]> (v2)
  Reviewed-by: Marek Olšák <[email protected]> (v2)
* mesa: add support for ARB_sample_locations | Rhys Perry | 2018-06-14 | 12 | -28/+455
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Brian Paul <[email protected]> (v2)
  Reviewed-by: Marek Olšák <[email protected]> (v2)
* v3d: Fix polygon offset for Z16 buffers. | Eric Anholt | 2018-06-14 | 3 | -2/+14
  Fixes:
  dEQP-GLES3.functional.polygon_offset.fixed16_displacement_with_units
  dEQP-GLES3.functional.polygon_offset.fixed16_render_with_units
* v3d: Fix configuration setup of mixed f32 and f16 render targets. | Eric Anholt | 2018-06-14 | 1 | -1/+1
  Fixes dEQP-GLES3.functional.fragment_out.random.26 and 6 others.
* v3d: Don't set the first_ez_state to DISABLED if after only UNDECIDED draws. | Eric Anholt | 2018-06-14 | 1 | -1/+2
  We need to have the RCL start with EZ enabled, since those undecided draws had EZ enabled. But we do need to update from UNDECIDED to LT or GT as necessary still.

  Fixes many simulator assertion fails in deqp fragment_ops/interaction/basic_shader/*
* v3d: Use the right size for v3d 4.x TEXTURE_SHADER_STATE BO. | Eric Anholt | 2018-06-14 | 1 | -2/+2
  This doesn't really matter, since they both get rounded up to 4096.
* v3d: Add static asserts for other packed packet sizes. | Eric Anholt | 2018-06-14 | 2 | -0/+7
* v3d: Fix the size of the packed attribute state. | Eric Anholt | 2018-06-14 | 1 | -1/+1
  Fixes segfaults in dEQP-GLES3.functional.vertex_array_objects.all_attributes.
* v3d: Remove some unused context fields from vc4. | Eric Anholt | 2018-06-14 | 1 | -11/+0
* v3d: Remove unused QUNIFORM_STENCIL left over from vc4. | Eric Anholt | 2018-06-14 | 2 | -11/+0
* v3d: Use our #define for max attributes in shader caps. | Eric Anholt | 2018-06-14 | 1 | -1/+1
* v3d: Fix undefined results for a swap_color_rb RT from a float shader output. | Eric Anholt | 2018-06-14 | 1 | -1/+4
  Fixes segfaults and undefined behavior in dEQP-GLES3.functional.fragment_out.basic.fixed.srgb8_alpha8_lowp_float
* radv: remove multisample bit from shader key. | Dave Airlie | 2018-06-15 | 3 | -4/+0
  This wasn't being used anywhere inside the shader from what I can see.

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* intel/compiler: Properly consider UBO loads that cross 32B boundaries. | Kenneth Graunke | 2018-06-14 | 1 | -2/+14
  The UBO push analysis pass incorrectly assumed that all values would fit within a 32B chunk, and only recorded a bit for the 32B chunk containing the starting offset.

  For example, if a UBO contained the following, tightly packed:

      vec4 a;  // [0, 16)
      float b; // [16, 20)
      vec4 c;  // [20, 36)

  then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, which means that we ought to record two 32B chunks in the bitfield.

  Similarly, dvec4s would suffer from the same problem.

  v2: Rewrite the accounting, my calculations were wrong.
  v3: Write a comment about partial values (requested by Jason).

  Reviewed-by: Rafael Antognolli <[email protected]> [v1]
  Reviewed-by: Jason Ekstrand <[email protected]> [v3]
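The chunk accounting in this message can be illustrated with a small helper. This is a sketch under assumptions, not the analysis pass itself: `chunks_touched()` and `CHUNK_BYTES` are hypothetical names, and the last chunk is computed from the final byte of the range (offset 35 for `c`), which gives the same chunk index 1 as the 36 / 32 in the message.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BYTES 32u

/* Return a bitfield with one bit set for every 32B chunk touched by the
 * byte range [offset, offset + size). Hypothetical helper for this sketch. */
static uint64_t
chunks_touched(uint32_t offset, uint32_t size)
{
    assert(size > 0);

    uint32_t first = offset / CHUNK_BYTES;
    uint32_t last = (offset + size - 1) / CHUNK_BYTES;
    assert(last < 64); /* the sketch tracks at most 64 chunks in one word */

    uint64_t mask = 0;
    for (uint32_t c = first; c <= last; c++)
        mask |= UINT64_C(1) << c;
    return mask;
}
```

For the example UBO, `a` at [0, 16) and `b` at [16, 20) each touch only chunk 0, while `c` at [20, 36) touches chunks 0 and 1, so `chunks_touched(20, 16)` yields the two-bit mask 0x3. Recording only the chunk of the starting offset would have missed chunk 1.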
* glsl: Don't copy propagate elements from SSBO or shared variables either | Ian Romanick | 2018-06-14 | 1 | -0/+8
  Since SSBOs can be written by a different GPU thread, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower.

  The same shader was helped by this patch and the previous.

  Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
  total instructions in shared programs: 14399119 -> 14399113 (<.01%)
  instructions in affected programs: 683 -> 677 (-0.88%)
  helped: 1
  HURT: 0

  total cycles in shared programs: 532973113 -> 532971865 (<.01%)
  cycles in affected programs: 524666 -> 523418 (-0.24%)
  helped: 1
  HURT: 0

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
* glsl: Don't copy propagate from SSBO or shared variables either | Ian Romanick | 2018-06-14 | 1 | -0/+2
  Since SSBOs can be written by other GPU threads, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower.

  Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
  total instructions in shared programs: 14399120 -> 14399119 (<.01%)
  instructions in affected programs: 684 -> 683 (-0.15%)
  helped: 1
  HURT: 0

  total cycles in shared programs: 532978931 -> 532973113 (<.01%)
  cycles in affected programs: 530484 -> 524666 (-1.10%)
  helped: 1
  HURT: 0

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
* meson: only build vl_winsys_dri.c when x11 platform is used | Lukas Rusak | 2018-06-14 | 1 | -1/+1
  This seems to have been missed in the move from autotools

  This fixes the following build issue:

  ../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory
   #include <X11/Xlib-xcb.h>
            ^~~~~~~~~~~~~~~~

  Fixes: b1b65397d0c4978e36a84c0a1c98a4bd6cb9588e ("meson: Build gallium auxiliary")
  Reviewed-by: Dylan Baker <[email protected]>
* st/mesa: add missing switch cases in glsl_to_tgsi_visitor::visit() | Brian Paul | 2018-06-14 | 1 | -0/+2
  To silence compiler warning about unhandled switch cases.

  Reviewed-by: Charmaine Lee <[email protected]>
* radv: Fix output for sparse MRTs. | Bas Nieuwenhuizen | 2018-06-14 | 1 | -9/+10
  We need to init the cb_shader_format correctly with the changed col_format, so this moves the col_format adjustment to before the cb_shader_mask gets generated.

  Fixes: 06d3c650980 "radv: fix a GPU hang when MRTs are sparse"
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
  CC: 18.1 <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: update the ZRANGE_PRECISION value for the TC-compat bug | Samuel Pitoiset | 2018-06-14 | 1 | -0/+108
  On GFX8+, there is a bug that affects TC-compatible depth surfaces when the ZRange is not reset after LateZ kills pixels.

  The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match the last fast clear value. Because the value is set to 1 by default, we only need to update it when clearing Z to 0.0.

  We also need to set the depth clear regs and to update ZRANGE_PRECISION when initializing a TC-compat depth image to 0.

  Original patch from James Legg.

  This fixes random CTS fails with dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.*

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
  CC: <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: reduce maxFragmentInputComponents | Samuel Iglesias Gonsálvez | 2018-06-14 | 1 | -1/+1
  If the application asks for the maximum number of fragment input components (128) and uses all of them plus some builtins that are passed in the VUE, then we exceed the maximum number of used VUE slots (32) and we hit the assert that checks this limit.

  Also, with separate shader objects, we add CLIP_DIST0, CLIP_DIST1 builtins in brw_compute_vue_map() because we don't know if gl_ClipDistance is going to be read or written by an adjacent stage.

  Fixes VK-GL-CTS CL#2569.

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
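The slot arithmetic behind this limit can be sketched as below. The figures 128 and 32 come from the commit message; everything else is an assumption for illustration: that one VUE slot holds four components (which is what 128 / 32 implies), the helper names, and the number of built-in slots passed in. The commit message does not state the new limit, so none is asserted here.

```c
#include <assert.h>

#define MAX_VUE_SLOTS 32       /* limit quoted in the commit message */
#define COMPONENTS_PER_SLOT 4  /* assumption: one slot holds one vec4 */

/* Slots needed for `components` scalar inputs plus `builtin_slots` built-ins.
 * Hypothetical helper, not brw_compute_vue_map(). */
static unsigned
vue_slots_needed(unsigned components, unsigned builtin_slots)
{
    unsigned varying_slots =
        (components + COMPONENTS_PER_SLOT - 1) / COMPONENTS_PER_SLOT;
    return varying_slots + builtin_slots;
}

/* Largest component count that still fits next to `builtin_slots` built-ins. */
static unsigned
max_components_for(unsigned builtin_slots)
{
    assert(builtin_slots <= MAX_VUE_SLOTS);
    return (MAX_VUE_SLOTS - builtin_slots) * COMPONENTS_PER_SLOT;
}
```

With 128 components the varyings alone fill all 32 slots, so any built-in pushes the total past the limit; for instance, two built-in slots (such as CLIP_DIST0 and CLIP_DIST1) would need 34. Under the same assumptions, two built-in slots leave room for 120 components.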
* radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers | Marek Olšák | 2018-06-13 | 1 | -2/+2
  This fixes: GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations

  Cc: 18.0 18.1 <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/gfx9: update & clean up a DPBB heuristic | Marek Olšák | 2018-06-13 | 1 | -9/+5
  Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: set POPS_DRAIN_PS_ON_OVERLAP due to a hw bug | Marek Olšák | 2018-06-13 | 1 | -2/+4
  This may not be needed yet, but let's set it now.

  Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: remove UINT_MAX array terminators in bin size tables | Marek Olšák | 2018-06-13 | 1 | -19/+1
  Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: update bin sizes | Marek Olšák | 2018-06-13 | 1 | -35/+38
  This is based on our docs (recently updated), not amdvlk.

  Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: update primitive binning code for EQAA | Marek Olšák | 2018-06-13 | 1 | -4/+9
  Tested-by: Dieter Nützel <[email protected]>
* radeonsi: assume that rasterizer state is non-NULL in draw_vbo | Marek Olšák | 2018-06-13 | 4 | -75/+61
  Tested-by: Dieter Nützel <[email protected]>
* radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency | Marek Olšák | 2018-06-13 | 4 | -13/+23
  Tested-by: Dieter Nützel <[email protected]>
* radeonsi: move the guardband registers into a separate state atom | Marek Olšák | 2018-06-13 | 5 | -19/+35
  They have a different frequency of updates and don't change when scissors change.

  I think this even fixes something in si_update_vs_viewport_state.

  Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: implement the scissor bug workaround without performance drop | Marek Olšák | 2018-06-13 | 2 | -29/+81
  This might improve performance on Vega10 and Raven.

  Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't set VGT_LS_HS_CONFIG if it doesn't change | Marek Olšák | 2018-06-13 | 3 | -6/+12
  Tested-by: Dieter Nützel <[email protected]>