summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* nouveau/nvc0: Extract common tile mode macroThierry Reding2018-03-091-6/+9
| | | | | | | | | | | Add a new macro that can be used to extract the tiling mode from a tile_mode value. This is will be used to determine the number of GOBs used in block linear mode. Acked-by: Emil Velikov <[email protected]> Tested-by: Andre Heider <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Thierry Reding <[email protected]>
* radeonsi: remove chip_class parameter from si_lower_nirMarek Olšák2018-03-084-10/+6
| | | | | | | We can get it from si_screen. Reviewed-by: Timothy Arceri <[email protected]> Acked-by: Alex Deucher <[email protected]>
* radeonsi: expand constbuf 0 address correctly to fix Vega10 hangsMarek Olšák2018-03-081-4/+17
| | | | | | | | | | This is only required with the latest libdrm. This fixes 32-bit support with high addresses. (and possibly 64-bit support too because the high bits need to be masked out) Acked-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]>
* radeonsi: align command buffer starting address to fix some Raven hangsMarek Olšák2018-03-081-2/+3
| | | | | | Cc: 17.3 18.0 <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* etnaviv: add get_driver_query_group_info(..)Christian Gmeiner2018-03-081-0/+13
| | | | | | | This enables AMD_performance_monitor extension. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Lucas Stach <[email protected]>
* etnaviv: add query_group_info for sw countersChristian Gmeiner2018-03-082-6/+29
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Lucas Stach <[email protected]>
* ac/radeonsi: add emit_kill to the abiTimothy Arceri2018-03-081-0/+1
| | | | | | | | This should fix a regression with Rocket League grass rendering on the NIR backend. Reviewed-by: Marek Olšák <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717
* radeonsi: add si_llvm_emit_kill() helperTimothy Arceri2018-03-082-12/+21
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: make use of if/loop build helpers in acTimothy Arceri2018-03-082-160/+11
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove si_llvm_add_attributeMarek Olšák2018-03-073-25/+16
|
* radeonsi: fix passing address32_hi to LLVM for high valuesMarek Olšák2018-03-071-2/+5
| | | | The old function treats high values as negative, which LLVM interprets as 0.
* radeonsi: assume has_virtual_memory == trueMarek Olšák2018-03-072-34/+18
|
* radeonsi: add/update assertions for 32-bit address spaceMarek Olšák2018-03-072-3/+18
|
* radeonsi: prevent a negative buffer offset in si_upload_descriptorsMarek Olšák2018-03-071-4/+3
|
* radeonsi: properly extract a buffer address from a descriptorMarek Olšák2018-03-071-1/+7
|
* radeonsi: fix vertex buffer address computation with full 64-bit addressesMarek Olšák2018-03-071-3/+3
|
* radeonsi: mask out high VM address bits in registers where neededMarek Olšák2018-03-073-22/+24
|
* ac: add ac_count_scratch_private_memory()Samuel Pitoiset2018-03-061-28/+4
| | | | | | | Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* tgsi/scan: use wrap-around shift behavior explicitly for file_maskRoland Scheidegger2018-03-062-2/+7
| | | | | | | | | | | | | | The comment said it will only represent the lowest 32 regs. This was not entirely true in practice, since at least on x86 you'll get masked shifts (unless the compiler could recognize it already and toss it out). It turns out this actually works out alright (presumably noone uses it for temp regs) when increasing max sampler views, so make that behavior explicit. Albeit it feels a bit hacky (but in any case, explicit behavior there is better than undefined behavior). Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* radeonsi/nir: fix handling of doubles for gs inputsTimothy Arceri2018-03-061-2/+6
| | | | | | | Fixes piglit test: tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move si_nir_load_input_gs() to si_shader.cTimothy Arceri2018-03-063-29/+20
| | | | | | | | All the tess shader and tgsi equivalents are here and it allows use to use llvm_type_is_64bit() in the following patch without exposing it externally. Reviewed-by: Dave Airlie <[email protected]>
* broadcom/vc4: Add support for HW perfmonBoris Brezillon2018-03-055-12/+249
| | | | | | | | | The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension. Signed-off-by: Boris Brezillon <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* r600: fix color export maskRoland Scheidegger2018-03-051-0/+1
| | | | | | | | | | | The r600 code (not the eg one) forgot to copy the ps_color_export_mask in commit 5b14e06d8b42e2b08ebc52b6c314ef8647d87a1f when updating the pixel state, leading to misrenderings (probably with MRT). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262 Tested-by: LoneVVolf <[email protected]> Tested-by: Pavel Vinogradov <[email protected]>
* freedreno/ir3: start dealing with half-precisionRob Clark2018-03-053-30/+81
| | | | | | | | | | | | | | | | | Some instructions, assume src and/or dst is half-precision based on a type field (ie. f32/s32/u32 are full precision but others are half precision). So add some code to sanity check the src/dst registers to catch mixups. Also propagate half-precision flag for SSA sources. The instruction consuming a SSA value needs to be of the same type as the one producing it. This is probably not complete half-precision support, but a useful first step. We do still need to add support for nir alu instructions for converting between half/full precision. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix fixing-up register footprintRob Clark2018-03-052-18/+27
| | | | | | | | | | | | | It isn't just vertex shaders that need to fixup reg footprint for inputs populated before shader starts. This problem showed up with compute shaders. If you have (for example) a localregid sysval, but only the .x component is used, the hw still writes the .yz components, which could overflow into other threads causing corruption. Showed up in cl cts 'basic/test_basic intmath_int'. But in theory the same problem could crop up elsewhere. Signed-off-by: Rob Clark <[email protected]>
* freedreno: surfaces can be PIPE_BUFFERRob Clark2018-03-051-4/+10
| | | | | | At least for clover. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: handle compute resourcesRob Clark2018-03-051-2/+4
| | | | | | Not *entirely* sure why this is a different BIND bit, but it is. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: ignore return jumpRob Clark2018-03-051-0/+1
| | | | | | | I think this should also always only occur at the end of a BB (by definition), and the BB successor should be the end block. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add some more compute capsRob Clark2018-03-052-4/+21
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: don't expose 64b pointers yetRob Clark2018-03-051-2/+5
| | | | | | | Temporary hack, but since we can't do 64b math yet in ir3, pretend that we don't support 64b pointers. Signed-off-by: Rob Clark <[email protected]>
* freedreno: steal handy macro for compute caps from nouveauRob Clark2018-03-051-42/+17
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add global_bindings stateRob Clark2018-03-054-4/+85
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: small cleanupRob Clark2018-03-051-3/+3
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add pctx->memory_barrier()Rob Clark2018-03-051-0/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cmdline compiler updates for spv shadersRob Clark2018-03-051-0/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* ac: add ac_build_fsign()Samuel Pitoiset2018-03-051-11/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac: add ac_build_isign()Samuel Pitoiset2018-03-051-8/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac: add ac_build_fract()Samuel Pitoiset2018-03-051-8/+5
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* virgl: add offset alignment values to to v2 caps struct[email protected]2018-03-053-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | glBindBufferRange(..) in vrend_draw_bind_ubo is failing with more than one uniform block. This is due to improper alignment of the start of the second block. Let's query the proper alignment from the driver and pass it back to Mesa. Let's query for the texture alignment too, even though the Virgl renderer doesn't call glTexBufferRange yet. The default values are the widest workable range possible (for example, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256). Fixes: dEQP-GLES3.functional.ubo.* on Nvidia Example test: dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex Note: This is based on "virgl: reduce some default capset limits.", which hasn't landed in Mesa yet but should relatively soon. Signed-off-by: Dave Airlie <[email protected]>
* virgl: reduce some default capset limits.Dave Airlie2018-03-051-8/+8
| | | | | | | | | Since v2 might take a while to rollout, we should reduce these inside some gathered minimums and then v2 can increase them using host values. Reviewed-by: Stéphane Marchesin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* virgl: handle getting new capsets.Dave Airlie2018-03-051-1/+24
| | | | | | | | This checks the kernel api is new enough and asks for the larger caps size since the kernel won't mess it up now. Reviewed-by: Stéphane Marchesin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/nir: call ac_lower_indirect_derefs()Timothy Arceri2018-03-054-4/+6
| | | | | | | | Fixes piglit tests: tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add chip class to compiler_ctx_stateTimothy Arceri2018-03-053-0/+4
| | | | | | This will be used in the following patch. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* swr/rast: Fix macOS macro.Vinson Lee2018-03-041-2/+2
| | | | | | | Fixes: a25093de7188 ("swr/rast: Implement JIT shader caching to disk") Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-By: George Kyriazis <[email protected]>
* svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gsCharmaine Lee2018-03-021-1/+2
| | | | | | | | Since geometry shader also consumes prescale constants, the geometry shader constant buffer will need to be updated when prescale factor is changed. Reviewed-by: Brian Paul <[email protected]>
* svga: fix blending regressionBrian Paul2018-03-021-11/+24
| | | | | | | | | | | | | | | | | | | The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend state when it's disabled for a colorbuffer") subtly changed the details of gallium's per-RT blend state. In particular, when pipe_rt_blend_state[i].blend_enabled is true, we have to get the src/dst blend terms from pipe_rt_blend_state[i], not [0] as before. We now have to scan the blend targets to find the first one that's enabled (if any). We have to use the index of that target for getting the src/dst blend terms. And note that we have to set identical blend terms for all targets. This fixes the Piglit fbo-drawbuffers2-blend test. VMware bug 2063493. Reviewed-by: Charmaine Lee <[email protected]>
* svga: check svga_have_vgpu10() in svga_delete_blend_state()Brian Paul2018-03-021-1/+1
| | | | | | | We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not enabled (bs->id==0 by default), resulting in lots of device errors. Reviewed-by: Neha Bhende<[email protected]>
* svga: if svga_update_state() fails, skip the draw callBrian Paul2018-03-021-5/+5
| | | | | | | | | | | | | | | | | | | | | If svga_update_state() fails, we flush the command buffer and retry. If it fails again, it likely means we were unable to translate a shader for some reason (uses too many resources, for example). In that case, let's just skip the draw call. The alternative, just disabling the shader stage in question, would certainly lead to bad rendering anyway, and probably device errors. Fixes failed assertion running Piglit glsl-1.50/execution/ variable-indexing/gs-output-array-vec4-index-wr.shader_test since it uses too many GS output registers (though the test still fails). VMware bug 2063492. v2: also call pipe_debug_message() so apps or apitrace can be notified when this issue occurs. v3: use svga_update_state_retry(). Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
* svga: let svga_update_state_retry() return a boolBrian Paul2018-03-022-6/+9
| | | | | | | This will allow minor simplifications elsewhere. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
* svga: s/unsigned/boolean/ for a few local varsBrian Paul2018-03-021-6/+6
| | | | Reviewed-by: Charmaine Lee <[email protected]>