aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_link.cpp
Commit message (Collapse)AuthorAgeFilesLines
* i965: Add brw_program_serialize_nirJordan Justen2017-12-081-6/+1
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965: Serialize nir later in the linking processJordan Justen2017-12-011-9/+16
| | | | | | | | | | | Fixes MESA_GLSL=cache_fb with piglit tests/spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out.shader_test Fixes: 0610a624a12 i965/link: Serialize program to nir after linking for shader cache Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988 Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use nir_lower_atomics_to_ssbos and delete ABO compiler code.Kenneth Graunke2017-11-151-0/+2
| | | | | | | | | | | | We use the same hardware mechanism for both atomic counters and SSBO atomics, so there's really no benefit to maintaining separate code to handle each case. Instead, we can just use Rob's shiny new NIR pass to convert atomic_uints to SSBOs, and delete piles of code. The ssbo_start section of the binding table becomes a combined ABO and SSBO section, with ABOs first, then SSBOs. Reviewed-by: Jason Ekstrand <[email protected]>
* intel/nir: Break the linking code into a helper in brw_nir.cJason Ekstrand2017-11-081-34/+4
| | | | | Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com> Cc: [email protected]
* i965: disable NIR linking on HSW and belowTimothy Arceri2017-11-071-1/+4
| | | | | | | Fixes: 379b24a40d3d "i965: make use of nir linking" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103537 Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Don't link when the program was found in the disk cacheJordan Justen2017-10-311-0/+3
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Cc: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/link: Serialize program to nir after linking for shader cacheJordan Justen2017-10-311-0/+10
| | | | | | | | | | | | | If the shader cache is enabled, after linking the program, we serialize the program to nir. This will be saved out by the glsl shader cache support. Later, if the same program is found in the cache, we can use the nir for a fallback in the unlikely case that the gen binary program is not found in the cache. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Get rid of nir_shader::stageJason Ekstrand2017-10-201-2/+4
| | | | | | | | It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/link: Use prog->nir instead of creating a temporaryJason Ekstrand2017-09-281-4/+3
| | | | | | | | | This way, when NIR_PASS_V makes a clone of the shader (for testing nir_clone), the new and lowered version gets re-assigned to prog->nir. [[email protected]: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/link: Make more use of NIR_PASSJason Ekstrand2017-09-281-6/+6
| | | | | | [[email protected]: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/link: Make better use of temporary variablesJason Ekstrand2017-09-281-4/+5
| | | | | | | | | | | The way NIR_PASS works (and, by extension, nir_optimize) is that they may clone the shader and throw the old one away. (We use this for testing nir_clone.) It's better if we just make a temporary variable, use it for everything, and re-assign to the gl_program at the end. [[email protected]: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: make use of nir linkingTimothy Arceri2017-09-261-0/+56
| | | | | | | | | | | | | | | | | | | | | For now linking is just removing unused varyings between stages. shader-db results BDW: total instructions in shared programs: 13198288 -> 13191693 (-0.05%) instructions in affected programs: 48325 -> 41730 (-13.65%) helped: 473 HURT: 0 total cycles in shared programs: 541184926 -> 541159260 (-0.00%) cycles in affected programs: 213238 -> 187572 (-12.04%) helped: 435 HURT: 8 V2: - lower indirects on demoted inputs as well as outputs. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: call brw_shader_gather_info() from the callers of brw_create_nir()Timothy Arceri2017-09-261-0/+14
| | | | | | | This will allow us to insert a nir linking step in brw_link_shader(). Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965: Force outputs_written to contain varyings needed by stream-out.Kenneth Graunke2017-09-211-3/+6
| | | | | | | | | | | | | If transform feedback is recording a varying, it needs a slot in the VUE map, regardless of whether or not the shader writes it. Together with the previous patch, this fixes: - KHR-GL45.enhanced_layouts.xfb_capture_struct The test captures a structure where the vertex shader writes the first and third members - but the second still needs a slot. Reviewed-by: Juan A. Suarez Romero <[email protected]>
* i965: drop brw->gen in favor of devinfo->genLionel Landwerlin2017-08-301-3/+6
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: Move SOL PSIZ hacks from draw time to link time.Kenneth Graunke2017-06-011-0/+36
| | | | | | | | | We can just update the gl_transform_feedback_info fields at link time to make the VUE header fields have the right location and component. Then we don't need to handle them specially at draw time, which is expensive. Reviewed-by: Rafael Antognolli <[email protected]>
* nir: Embed the shader_info in the nir_shader againJason Ekstrand2017-05-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info from being embedded into being just a pointer. The idea was that sharing the shader_info between NIR and GLSL would be easier if it were a pointer pointing to the same shader_info struct. This, however, has caused a few problems: 1) There are many things which generate NIR without GLSL. This means we have to support both NIR shaders which come from GLSL and ones that don't and need to have an info elsewhere. 2) The solution to (1) raises all sorts of ownership issues which have to be resolved with ralloc_parent checks. 3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been using nir_gather_info to fill out the final shader_info. Thanks to cloning and the above ownership issues, the nir_shader::info may not point back to the gl_shader anymore and so we have to do a copy of the shader_info from NIR back to GLSL anyway. All of these issues go away if we just embed the shader_info in the nir_shader. There's a little downside of having to copy it back after calling nir_gather_info but, as explained above, we have to do that anyway. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove unused variable 'options'Matt Turner2017-04-251-2/+0
| | | | Should have been removed in commit ad55b1a7701a
* i965: remove GLSL IR optimisation loopTimothy Arceri2017-04-241-16/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IVB is running into some spilling issues in piglit with the loop removed. However those tests are not really reflective of a real world use case, also fp64 is brand new to IVB so we leave the spilling issues to be resolved at a later time. Run time for shader-db on my machine goes from ~795 seconds to ~665 seconds. shader-db results BDW: total instructions in shared programs: 12969459 -> 12968891 (-0.00%) instructions in affected programs: 1463154 -> 1462586 (-0.04%) helped: 3622 HURT: 3326 total cycles in shared programs: 246453572 -> 246504318 (0.02%) cycles in affected programs: 208842622 -> 208893368 (0.02%) helped: 24029 HURT: 35407 total loops in shared programs: 2931 -> 2931 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 14560 -> 14498 (-0.43%) spills in affected programs: 2270 -> 2208 (-2.73%) helped: 17 HURT: 2 total fills in shared programs: 19671 -> 19632 (-0.20%) fills in affected programs: 2060 -> 2021 (-1.89%) helped: 17 HURT: 2 LOST: 17 GAINED: 40 Most of the hurt shaders are 1-2 instructions, with what looks like a max of 7. I've looked at the worst cycles regressions and as far as I can tell its just a scheduling difference. Acked-by: Elie Tournier <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move the back-end compiler to src/intel/compilerJason Ekstrand2017-03-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | Mostly a dummy git mv with a couple of noticable parts: - With the earlier header cleanups, nothing in src/intel depends files from src/mesa/drivers/dri/i965/ - Both Autoconf and Android builds are addressed. Thanks to Mauro and Tapani for the fixups in the latter - brw_util.[ch] is not really compiler specific, so it's moved to i965. v2: - move brw_eu_defines.h instead of brw_defines.h - remove no-longer applicable includes - add missing vulkan/ prefix in the Android build (thanks Tapani) v3: - don't list brw_defines.h in src/intel/Makefile.sources (Jason) - rebase on top of the oa patches [Emil Velikov: commit message, various small fixes througout] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Do int64 lowering in NIRJason Ekstrand2017-03-011-5/+0
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Reduce cross-pollination between the DRI driver and compilerJason Ekstrand2017-03-011-2/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop unused STATE_TEXRECT_SCALE code.Kenneth Graunke2017-03-011-2/+0
| | | | | | | | | | | | | | In the past, we used this on Gen4-5 to transform non-normalized texture coordinates (for sampler2DRect) to normalized ones. We also used it on Gen6-7.5 for sampler2DRect with GL_CLAMP. Jason dropped this code in 6c8ba59cff14a1a86273f4008ff2a8e68335ab25 in favor of using nir_lower_tex(), which just does a textureSize() call. But we were still setting up these state references for useless uniform data. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Enable ARB_gpu_shader_int64 on Gen8+Ian Romanick2017-01-201-0/+5
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Make unify_interfaces not spread VARYING_BIT_TESS_LEVEL_*.Kenneth Graunke2017-01-061-2/+5
| | | | | | | | | | | This is harmless today because gl_TessLevelInner/Outer in the TES is currently treated as system values. However, when we move to treating them as inputs, this would cause a bug: with no TCS present, it would propagate TES reads of VARYING_SLOT_TESS_LEVEL into the VS output VUE map slots. This is totally bogus - those don't even exist in the VS. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/mesa/glsl: set SamplersUsed directly in gl_programTimothy Arceri2017-01-061-1/+0
| | | | Reviewed-by: Eric Anholt <[email protected]>
* i965: stop passing gl_shader_program to the precompile and codegen functionsTimothy Arceri2017-01-061-4/+4
| | | | | | | | We no longer need it. While we are at it we mark the vs, gs, and wm codegen functions as static. Reviewed-by: Eric Anholt <[email protected]>
* mesa/glsl/i965: remove Driver.NewShader()Timothy Arceri2016-12-301-11/+0
| | | | | | | | | After removing brw_shader in the previous commit this is no longer needed. V2: remove use in src/compiler/glsl/test_optpass.cpp Reviewed-by: Eric Anholt <[email protected]>
* i965: move compiled_once flag to brw_programTimothy Arceri2016-12-301-5/+3
| | | | | | | This allows us to delete brw_shader and removes the last use of gl_linked_shader in the codegen paths. Reviewed-by: Eric Anholt <[email protected]>
* i965: make use of nir_lower_returns() for GLTimothy Arceri2016-12-231-6/+0
| | | | | | | | | | | | | | | | | | | | | Fixes two new piglit tests: spec/glsl-1.10/execution/vs-nested-return-sibling-loop.shader_test spec/glsl-1.10/execution/vs-nested-return-sibling-loop2.shader_test shader-db results for BDW: total instructions in shared programs: 12903158 -> 12903134 (-0.00%) instructions in affected programs: 27100 -> 27076 (-0.09%) helped: 32 HURT: 6 total cycles in shared programs: 294922518 -> 294922804 (0.00%) cycles in affected programs: 4372828 -> 4373114 (0.01%) helped: 31 HURT: 8 Reviewed-by: Kenneth Graunke <[email protected]>
* i965: use nir_lower_indirect_derefs() for GLSLTimothy Arceri2016-12-231-15/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so thats is called by both OpenGL and Vulkan and removes that call to the old GLSL IR pass lower_variable_index_to_cond_assign() We want to do this pass in nir to be able to move loop unrolling to nir. There is a increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space program shaders that increase by 32 instructions. The changes seem to be caused be the difference in the GLSL IR vs NIR variable index lowering passes. The GLSL IR pass creates a simple if ladder for arrays of size 4 or less, while the NIR pass implements a binary search for all arrays regardless of size. Shader-db results BDW: total instructions in shared programs: 13021176 -> 13021819 (0.00%) instructions in affected programs: 57693 -> 58336 (1.11%) helped: 20 HURT: 190 total cycles in shared programs: 299805580 -> 299750826 (-0.02%) cycles in affected programs: 2290024 -> 2235270 (-2.39%) helped: 337 HURT: 442 total fills in shared programs: 19984 -> 19984 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 4 GAINED: 0 V2: remove the do_copy_propagation() call from the i965 GLSL IR linking code. This call was added in f7741c52111 but since we are moving the variable index lowering to NIR we no longer need it and can just rely on the nir copy propagation pass. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove brw_lower_texture_gradientsIago Toral Quiroga2016-12-131-1/+0
| | | | | | | This has been ported to NIR now so we don'tneed to keep the GLSL IR lowering any more. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: delay adding built-in uniforms to Parameters listTimothy Arceri2016-12-091-23/+19
| | | | | | | | | | This is a step towards using NIR optimisations over GLSL IR optimisations. Delaying adding built-in uniforms until after we convert to NIR gives it a chance to optimise them away. V2: move the new code back to brw_link_shader() Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Unify shader interfaces explicitly.Kenneth Graunke2016-12-061-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A while ago, I made i965 start compiling shaders independently. The VUE map layouts were based entirely on each shader's input/output bitfields. Assuming the interfaces match, this works out well - both sides will compute the same layout, and outputs are correctly routed to inputs. At the time, I had assumed that the linker would guarantee that the interfaces match. While it usually succeeds, it unfortunately seems to fail in some cases. For example, Piglit's tcs-input-read-array-interface test has a VS output array with two elements, but the TCS only reads one. The linker isn't able to eliminate the unused element from the VS, which makes the interfaces not match. Another case is where a shader other than the last writes clip/cull distances. These should be demoted to ordinary varyings, but they currently aren't - so we think they still have some special meaning, and prevent them from being eliminated. Fixing the linker to guarantee this in all cases is complicated. It needs to be able to optimize out dead code. It's tied into varying packing and other messiness. While we can certainly improve it---and should---I'd rather not rely on it being correct in all cases. This patch ORs adjacent stages' input/output bitfields together, ensuring that their interface (and hence VUE map layout) will be compatible. This should safeguard us against linker insufficiencies. Fixes line rendering in Dolphin, and the Piglit test based on it: spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97232 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* Revert "i965: use nir_lower_indirect_derefs() for GLSL"Jason Ekstrand2016-12-051-0/+13
| | | | | This reverts commit 9404439a754e5640ccd98df40fa694835c0d8759. I didn't intend to push it and it breaks clip and cull distance.
* i965: use nir_lower_indirect_derefs() for GLSLTimothy Arceri2016-12-051-13/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so thats is called by both OpenGL and Vulkan and removes that call to the old GLSL IR pass lower_variable_index_to_cond_assign() We want to do this pass in nir to be able to move loop unrolling to nir. There is a increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space program shaders that increase by 32 instructions. Shader-db results BDW: total instructions in shared programs: 8705873 -> 8706194 (0.00%) instructions in affected programs: 32515 -> 32836 (0.99%) helped: 3 HURT: 79 total cycles in shared programs: 74618120 -> 74583476 (-0.05%) cycles in affected programs: 528104 -> 493460 (-6.56%) helped: 47 HURT: 37 LOST: 2 GAINED: 0
* glsl: create gl_program at the start of linking rather than the endTimothy Arceri2016-11-191-9/+1
| | | | | | | | | | | | | | | | | | This will allow us to directly store metadata we want to retain in gl_program this metadata is currently stored in gl_linked_shader and will be lost if relinking fails even though the program will remain in use and is still valid according to the spec. "If a program object that is active for any shader stage is re-linked unsuccessfully, the link status will be set to FALSE, but any existing executables and associated state will remain part of the current rendering state until a subsequent call to UseProgram, UseProgramStages, or BindProgramPipeline removes them from use." This change will also help avoid the double handing that happens in _mesa_copy_linked_program_data(). Reviewed-by: Emil Velikov <[email protected]>
* st/mesa/i965: simplify gl_program references and stop leakingTimothy Arceri2016-11-191-3/+2
| | | | | | | | | | | | | | In i965 we were calling _mesa_reference_program() after creating gl_program and then later calling it again with NULL as a param to get the refcount back down to 1. This changes things to not use _mesa_reference_program() at all and just have gl_linked_shader take ownership of gl_program since refcount starts at 1. The st and ir_to_mesa linkers were worse as they were both getting in a state were the refcount would never get to 0 and we would leak the program. Reviewed-by: Emil Velikov <[email protected]>
* i965: only try print GLSL IR once when using INTEL_DEBUG to dump irTimothy Arceri2016-11-171-0/+10
| | | | | | | | Since we started releasing GLSL IR after linking the only time we can print GLSL IR is during linking. When regenerating variants only NIR will be available. Reviewed-by: Emil Velikov <[email protected]>
* glsl/lower_if: don't lower branches touching tess control outputsMarek Olšák2016-11-151-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: use nir_shader_gather_info() over do_set_program_inouts()Timothy Arceri2016-11-111-2/+0
| | | | | | | This takes us one step closer to being able to drop the GLSL IR optimisation passes during linking in favour of the NIR passes. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add temporary copy_shader_info() functionTimothy Arceri2016-10-261-4/+4
| | | | | | | | | | | | | This function is added here to ease refactoring towards using the new shared shader_info. Once refactoring is complete and values are set directly it will be removed. We call it from _mesa_copy_linked_program_data() rather than glsl_to_nir() so that the values will be set for all drivers. In order to do this some calls need to be moved around so that we make sure to call do_set_program_inouts() before _mesa_copy_linked_program_data() Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: modify _mesa_copy_linked_program_data() to take gl_linked_shaderTimothy Arceri2016-10-261-3/+4
| | | | | | | | | | This allows us to do some small tidy ups, but will also allow us to call a new function that copies values to a shared shader info from here. In order to make this change this function now requires _mesa_reference_program() to have previously been called. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Silence unused parameter warningsIan Romanick2016-10-171-2/+1
| | | | | | | | | | | | | | | | | | | | brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* glsl: don't crash when dumping shaders if some come from cacheTimothy Arceri2016-09-281-4/+10
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: stop passing stage as a function parameterTimothy Arceri2016-09-261-5/+3
| | | | | | We already pass the shader so we can just get the stage from this. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Rename intelScreen to screen.Kenneth Graunke2016-09-201-2/+2
| | | | | | | | "intelScreen" is wordy and also doesn't fit our style guidelines. "screen" is shorter, which is nice, because we use it fairly often. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: release GLSL IR in LinkShader after it's not neededTapani Pälli2016-09-091-0/+11
| | | | | Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965: Enable GL_KHR_blend_equation_advanced on G45 and later.Kenneth Graunke2016-08-251-0/+2
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Get rid of the do_lower_unnormalized_offsets passJason Ekstrand2016-07-221-1/+0
| | | | | | | | | We can do this in NIR now. No need to keep a GLSL pass lying around for it. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>