path: root/src
Commit message (Author, Age, Files, Lines)
* radv: copy indirect lowering settings from radeonsi (Timothy Arceri, 2017-10-20, 1 file, -1/+26)
  It looks like the original indirect mask was probably copied from ANV.

  Sascha Willems demo results:
  tessellation ~4000 -> ~4200 fps

  V2: continue lowering local indirects due to LLVM deficiencies.

  Tested-by: Alex Smith <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: stop redundant setting of active_stages (Timothy Arceri, 2017-10-20, 1 file, -2/+0)
  We already set it above, in the NIR compilation loop.

  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move some code out of loop in store_tcs_output() (Timothy Arceri, 2017-10-20, 1 file, -5/+5)
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Modify rsrc1/rsrc2 generation for merged tess. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -7/+16)
  No OC_LDS_EN for HS, and the included LS vgpr_comp_cnt is at a
  different offset.

  Reviewed-by: Dave Airlie <[email protected]>
* radv: Set correct registers for merged shader rings. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -12/+24)
  We need different regs to end up in s0/s1.

  Reviewed-by: Dave Airlie <[email protected]>
* radv: Add GFX9 HS emitting code. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -5/+16)
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Remove remaining hard coded references to VS. (Bas Nieuwenhuizen, 2017-10-19, 3 files, -7/+28)
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Update GFX9 user data regs for GS/tess. (Bas Nieuwenhuizen, 2017-10-19, 4 files, -14/+25)
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Add code to compile merged shaders. (Bas Nieuwenhuizen, 2017-10-19, 4 files, -13/+39)
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add LS-HS input VGPR workaround. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -0/+18)
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Compile the bodies of multiple shaders. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -50/+83)
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Expand user SGPR descriptions a bit. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -3/+3)
  To prevent VS/TCS collisions in merged shaders.

  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Don't write to the dynamic HS word on GFX9. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -11/+16)
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add function creation for merged LS+HS. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -76/+178)
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Make scan_shader_output_decl less dependent on the context. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -14/+17)
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Allow ac_shader_variant_info to contain info about multiple stages. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -1/+1)
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Change interface to allow multiple source shaders. (Bas Nieuwenhuizen, 2017-10-19, 3 files, -39/+48)
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add HS calling convention. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -1/+4)
  Needed for GFX9 merged shaders.

  Reviewed-by: Dave Airlie <[email protected]>
* ac: Parse the new HS RSRC1 register. (Bas Nieuwenhuizen, 2017-10-19, 1 file, -0/+1)
  Reviewed-by: Dave Airlie <[email protected]>
* swr: knob overrides for Intel Xeon Phi (Tim Rowley, 2017-10-19, 5 files, -1/+37)
  Architecture benefits from having more threads/work outstanding.

  Patch by Jan Zielinski.

  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add api to override draws in flight (Tim Rowley, 2017-10-19, 4 files, -19/+31)
  Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO.

  Patch by Jan Zielinski.

  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Widen fetch shader to SIMD16 (disabled for now) (Tim Rowley, 2017-10-19, 1 file, -13/+428)
  Refactored the gather operation to process 16 elements at a time
  via paired SIMD8 operations.

  Reviewed-by: Bruce Cherniak <[email protected]>
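The "paired SIMD8" idea in the message above can be sketched in portable C, with plain loops standing in for vector instructions. The names `gather8` and `gather16` are hypothetical and are not taken from the swr sources; this only illustrates how a 16-lane gather can be built from two 8-lane halves.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-lane gather: dst[lane] = base[idx[lane]] for 8 lanes.
 * A real implementation would use a vector gather instruction. */
static void gather8(float *dst, const float *base, const uint32_t *idx)
{
    for (size_t lane = 0; lane < 8; lane++)
        dst[lane] = base[idx[lane]];
}

/* 16-lane gather expressed as two 8-lane gathers:
 * lanes 0..7 use the low half of idx, lanes 8..15 the high half. */
static void gather16(float *dst, const float *base, const uint32_t *idx)
{
    gather8(dst, base, idx);
    gather8(dst + 8, base, idx + 8);
}
```

Callers see a single 16-wide operation, while the body can keep reusing the existing 8-wide path until a native 16-wide one is enabled.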
* swr/rast: Change DS memory allocation (Tim Rowley, 2017-10-19, 2 files, -2/+3)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix indentation (Tim Rowley, 2017-10-19, 1 file, -1/+1)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Miscellaneous viewport array code changes (Tim Rowley, 2017-10-19, 5 files, -38/+71)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Minor changes for os-x (Tim Rowley, 2017-10-19, 1 file, -2/+4)
  Reviewed-by: Bruce Cherniak <[email protected]>
* i965: Don't disable aux buffers for non-overlapping miplevels. (Kenneth Graunke, 2017-10-19, 1 file, -3/+7)
  Meta's GenerateMipmap implementation binds the same image for both
  sampling and rendering - but it samples from one miplevel while
  rendering the next. This is a false self-dependency, and there's no
  need to disable auxiliary buffers in this case. In fact, we really
  want to leave them enabled so the new miplevels gain color compression.

  Thankfully, the texture object's _MaxLevel is always one shy of the
  miplevel being rendered. So we can simply check if irb->mt_level
  overlaps with the texture's defined levels. If not, there's no
  self-dependency and we can leave the auxiliary buffers enabled.

  Fixes a performance regression in GFXBench4 Car Chase, which
  apparently calls glGenerateMipmap() on every frame.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103247
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
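A minimal sketch of the overlap test described above, assuming a simplified texture description. The struct `tex_levels`, its `base_level` field, and both function names are hypothetical stand-ins; only the idea of comparing the rendered miplevel against the texture's defined levels comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the texture object's level range. */
struct tex_levels {
    unsigned base_level; /* first miplevel the sampler may read (assumed) */
    unsigned max_level;  /* last miplevel the sampler may read */
};

/* True when the miplevel being rendered is one the sampler can also read. */
static bool render_level_overlaps(const struct tex_levels *tex,
                                  unsigned render_level)
{
    return render_level >= tex->base_level &&
           render_level <= tex->max_level;
}

/* Auxiliary buffers need disabling only for a real self-dependency:
 * the same image is bound for both uses AND the levels overlap. */
static bool must_disable_aux(bool same_image, const struct tex_levels *tex,
                             unsigned render_level)
{
    return same_image && render_level_overlaps(tex, render_level);
}
```

In the mipmap-generation case, sampling covers levels up to `max_level` while rendering targets `max_level + 1`, so the check reports no overlap and compression stays on.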
* i965: Remove the intel_miptree_prepare_fb_fetch wrapper. (Kenneth Graunke, 2017-10-19, 3 files, -20/+10)
  Now that intel_miptree_prepare_texture takes levels and layers,
  there's not much use in this anymore.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Only resolve texture levels/layers that are accessed. (Kenneth Graunke, 2017-10-19, 1 file, -2/+16)
  This should avoid unnecessary resolves when working with texture views.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Make intel_miptree_prepare_texture() take level/layer arguments. (Kenneth Graunke, 2017-10-19, 3 files, -21/+13)
  This effectively exports intel_miptree_prepare_texture_slices() as
  intel_miptree_prepare_texture(). The hope is to avoid resolves when
  using texture views that access a subset of the levels/layers.

  For now, we pass the same arguments to separate the mechanical
  change from the one that actually modifies our behavior.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* gallium: add more exceptions to tgsi_util_get_inst_usage_mask (Tim Rowley, 2017-10-19, 1 file, -0/+12)
  A number of double/int64 operations don't have matching read and
  write usage masks, which the fallthrough case of
  tgsi_util_get_inst_usage_mask assumes for componentwise tagged
  instructions.

  No regressions in llvmpipe piglit; fixes a large number of swr
  regressions.

  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* isl: Fix width check in isl_gen7_choose_msaa_layout. (Kenneth Graunke, 2017-10-19, 1 file, -1/+1)
  The restriction is supposed to apply if the width *field* is >= 8192,
  meaning the actual width *value* is >= 8193. The code also
  incorrectly used == for some reason.

  Reviewed-by: Juan A. Suarez Romero <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
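The field-versus-value distinction above can be sketched as follows. The helper names are hypothetical and this is not the isl code; the only assumption carried over from the message is that a field of 8192 corresponds to a width of 8193, i.e. the field stores the width minus one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the hardware field holds the width minus one,
 * as implied by "field >= 8192 means value >= 8193". Expects width_px >= 1. */
static uint32_t width_field(uint32_t width_px)
{
    return width_px - 1;
}

/* The restriction is a range test on the field, not an equality test. */
static bool width_restriction_applies(uint32_t width_px)
{
    return width_field(width_px) >= 8192;
}
```

A surface exactly 8192 pixels wide has a field of 8191 and is therefore not restricted, while 8193 and anything wider is.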
* i965: Use is_scheduling_barrier instead of schedule_node::is_barrier. (Kenneth Graunke, 2017-10-19, 1 file, -22/+10)
  Commit a73116ecc60414ade89802150b tried to make add_barrier_deps()
  walk to the next barrier, and stop. To accomplish that, it added an
  is_barrier flag. Unfortunately, this only works half of the time.

  The issue is that add_barrier_deps() walks both backward (to the
  previous barrier), and forward (to the next barrier). It also sets
  is_barrier. Assuming that we're processing instructions in forward
  order, this means that is_barrier will be set for previous
  instructions, but not future ones. So we'll never see it, and walk
  further than we need to.

  dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
  now compiles its shaders in 3.6 seconds instead of 3.3 minutes.

  Reviewed-by: Matt Turner <[email protected]>
  Tested-by: Pallavi G <[email protected]>
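A rough illustration of why a predicate computed from the instruction itself bounds the walk in both directions, whereas a flag set only on already-processed nodes cannot stop the forward walk. The `struct inst` layout, the predicate's contents, and `barrier_dep_span` are all hypothetical simplifications, not the i965 scheduler's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record. */
struct inst {
    bool has_side_effects;
    bool is_control_flow;
};

/* Derived from the instruction alone, so it is valid for instructions
 * the scheduler has not visited yet (assumed predicate contents). */
static bool is_scheduling_barrier(const struct inst *in)
{
    return in->has_side_effects || in->is_control_flow;
}

/* Counts the neighbours visited while adding dependencies around
 * insts[pos]: walk backward to the previous barrier and forward to the
 * next one, stopping at each (the barrier itself is counted). */
static size_t barrier_dep_span(const struct inst *insts, size_t count,
                               size_t pos)
{
    size_t visited = 0;

    for (size_t i = pos; i-- > 0; ) {
        visited++;
        if (is_scheduling_barrier(&insts[i]))
            break;
    }
    for (size_t i = pos + 1; i < count; i++) {
        visited++;
        if (is_scheduling_barrier(&insts[i]))
            break;
    }
    return visited;
}
```

With barriers spaced a few instructions apart, each call touches only its local window instead of the rest of the block, which is the kind of saving the compile-time numbers above reflect.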
* i965: Move fs_inst::has_side_effects()'s eot check to the parent class. (Kenneth Graunke, 2017-10-19, 5 files, -9/+3)
  This eliminates a layer of wrapping, and makes a backend_instruction
  sufficient. The downside is that it exposes 'eot' to the vec4
  backend, which doesn't need it but can happily ignore it.

  Reviewed-by: Matt Turner <[email protected]>
  Tested-by: Pallavi G <[email protected]>
* tgsi: fix tgsi_util_get_inst_usage_mask (Roland Scheidegger, 2017-10-19, 1 file, -6/+6)
  The logic for handling shadow coords was completely broken.

  Fixes be3ab867bd444594f9d9e0f8e59d305d15769afd.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103265
  Reviewed-by: Marek Olšák <[email protected]>
* glsl/linker: produce error when invalid explicit locations are used (Iago Toral Quiroga, 2017-10-19, 3 files, -5/+43)
  We only need to add a check to validate output locations here. For
  inputs with invalid locations we will fail to link when we can't
  find a matching output in the same (invalid) location.

  v2: compute location slots properly depending on shader stage and
  variable type / direction

  Fixes: KHR-GL45.enhanced_layouts.varying_location_limit

  Reviewed-by: Nicolai Hähnle <[email protected]>
* i965/sbe: fix active components for SSO programs with over 16 inputs (Iago Toral Quiroga, 2017-10-19, 1 file, -8/+2)
  When we have up to 16 FS inputs, the SF unit will reorder our inputs
  to be consecutive. However, when we have more than 16 we need to read
  our inputs from the URB exactly as they have been output from the
  previous stage. This means that for SSO we have to consider if we
  have URB padding due to unused input locations.

  Specifically, this affects gen9 active components programming, since
  for things to work in scenarios with over 16 inputs that have padded
  regions we need to ensure that we program active components for the
  padded regions too. If we don't do this the hardware won't read the
  URB properly for inputs located after padded regions. Found
  empirically.

  Fixes (these also require a patch in CTS):
  KHR-GL45.enhanced_layouts.varying_locations
  KHR-GL45.enhanced_layouts.varying_array_locations

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Do not log a perf warning when mapping an idle bo (Chris Wilson, 2017-10-19, 1 file, -2/+3)
  We only want to scare the user away from causing a GPU stall for
  mapping a busy bo. The time taken to instantiate the set of pages
  for a buffer and their mmapping is unavoidable, and flagging an idle
  bo as being busy is "crying wolf".

  Reported-by: Tvrtko Ursulin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use a union to bitcast a float (Matt Turner, 2017-10-18, 1 file, -1/+2)
  ... which does not break C's aliasing rules.
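For readers unfamiliar with the idiom, a generic sketch of a union bitcast follows. The function name `float_bits` is hypothetical and this is not the i965 change itself; it assumes `float` is a 32-bit IEEE-754 type, which holds on the platforms Mesa targets.

```c
#include <stdint.h>

/* Reinterpret the bits of a float as a 32-bit integer.
 * Writing one union member and reading another is permitted in C,
 * unlike dereferencing a float's address through a uint32_t pointer. */
static uint32_t float_bits(float f)
{
    union {
        float f;
        uint32_t u;
    } fi;

    fi.f = f;
    return fi.u;
}
```

The pointer-cast form `*(uint32_t *)&f` violates the aliasing rules and may be miscompiled under optimization, which is the problem the union avoids.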
* drirc: Group a few games in the glthread whitelist together. (Darren Salt, 2017-10-19, 1 file, -6/+21)
  Signed-off-by: Marek Olšák <[email protected]>
* drirc: Enable glthread for more games (Saints Row 4 & Gat out of Hell). (Darren Salt, 2017-10-19, 1 file, -0/+6)
  “Saints Row: Gat out of Hell” benefits from this on slower CPUs in
  that usage spikes on individual cores are avoided, which in turn
  makes it harder to hit a bug which causes broken audio and the game
  to hang on exit.

  “Saints Row IV” appears to be fine either way, but also exhibits the
  audio breakage bug: glthread is therefore being enabled on the
  grounds that it should make it a little harder to hit that bug.

  Signed-off-by: Marek Olšák <[email protected]>
* radv: reset dirty flags after flushing all states (Samuel Pitoiset, 2017-10-18, 1 file, -2/+2)
  Move it to radv_cmd_buffer_flush_state() because if
  rasterizerDiscardEnable is true, the flags are not cleared.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not re-emit the index buffer for every draw call (Samuel Pitoiset, 2017-10-18, 1 file, -29/+28)
  It can only be changed when CmdBindIndexBuffer() is called or when a
  secondary buffer is used. It does not always change in the latter
  case, but let's re-emit the packets in this situation for now.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
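The pattern described above is ordinary dirty-flag tracking. The sketch below uses hypothetical names (`cmd_state`, `bind_index_buffer`, `execute_secondary`, `draw_indexed`) and a counter in place of real packet emission; it is not radv's command buffer code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified command buffer state. */
struct cmd_state {
    uint64_t index_va;        /* GPU address of the bound index buffer */
    bool index_dirty;         /* set when the binding may have changed */
    unsigned packets_emitted; /* stand-in for writing packets */
};

/* Binding a new index buffer invalidates the emitted state. */
static void bind_index_buffer(struct cmd_state *s, uint64_t va)
{
    s->index_va = va;
    s->index_dirty = true;
}

/* A secondary buffer may have changed the binding: be conservative. */
static void execute_secondary(struct cmd_state *s)
{
    s->index_dirty = true;
}

/* Emit the index buffer packets only when the state is dirty. */
static void draw_indexed(struct cmd_state *s)
{
    if (s->index_dirty) {
        s->packets_emitted++;
        s->index_dirty = false;
    }
    /* The draw packet itself is omitted from this sketch. */
}
```

Three consecutive draws after one bind emit the index buffer state once instead of three times.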
* radv: remove useless mask operation in radv_cs_emit_draw_indexed_packet() (Samuel Pitoiset, 2017-10-18, 1 file, -1/+1)
  This saves a few CPU cycles when CmdDrawIndexed() is used a lot.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Do not read from the disk cache with RADV_DEBUG=nocache. (Bas Nieuwenhuizen, 2017-10-18, 1 file, -1/+2)
  Otherwise the flag is borderline useless.

  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Set active_stages after getting cached shaders (Alex Smith, 2017-10-18, 1 file, -1/+6)
  Fixes: 7d45d22fdd2e ("radv: switch to using radv_create_shaders()")
  Signed-off-by: Alex Smith <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Don't free NIR shaders if tracing (Alex Smith, 2017-10-18, 1 file, -1/+1)
  Fixes a crash while generating a hang report.

  Fixes: 7d45d22fdd2e ("radv: switch to using radv_create_shaders()")
  Signed-off-by: Alex Smith <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* Revert "egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}" (Marek Olšák, 2017-10-18, 5 files, -18/+31)
  This reverts commit 8cb84c8477a57ed05d703669fee1770f31b76ae6.

  This fixes crashing shader-db/run.
* Revert "egl: drop EGL driver `name`" (Marek Olšák, 2017-10-18, 5 files, -1/+10)
  This reverts commit 6414d6bd8d2897f4ba643357fe3037f3acd60879.

  This is needed to apply the next revert.
* st/mesa: set dimension for constants in ATI_fragment_shader (Miklós Máté, 2017-10-18, 1 file, -0/+4)
  This fixes an assertion failure introduced by 30a2f0dfd46de.

  Fixes: 30a2f0dfd46 ("radeonsi: add an assertion that only
  Signed-off-by: Miklós Máté <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>