Commit message | Author | Age | Files | Lines
* anv,radv: disable StorageImageWriteWithoutFormat for now | Ilia Mirkin | 2016-12-31 | 2 | -2/+2
  The SPIR-V capability isn't even marked as enabled, and there are no tests in Vulkan-CTS. Per Jason Ekstrand, this won't work in anv as such write-only surfaces require additional setup which is currently not performed.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
* i965: Avoid NULL pointer dereference when transform feedback is off. | Kenneth Graunke | 2016-12-30 | 1 | -2/+2
  upload_3dstate_streamout can be called when there's no currently bound transform feedback object. In this case, we get the default object, which has a NULL shader (previously gl_shader_program, now gl_program).

  The old code did something sketchy, but which worked:

      const struct gl_transform_feedback_info *linked_xfb_info =
         &xfb_obj->shader_program->LinkedTransformFeedback;

  Here, if shader_program is NULL, this would be a bogus pointer of 0x60. But we never actually dereferenced it, so it worked out.

  With Timothy's recent reworks, we actually end up dereferencing xfb_obj->program along the way, which crashes since it's NULL.

  The solution is to move this pointer initialization into the "active" block, where we know it actually exists and won't be bogus.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99231
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
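The fix described in the i965 commit above can be illustrated with a minimal sketch. Every type and field name here is hypothetical and is not the real Mesa structure; the 0x60 bytes of padding exist only to mirror the offset mentioned in the message. The point is that the member pointer is formed only inside the "active" branch, where the program is known to exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real Mesa structures. */
struct xfb_info {
    unsigned num_outputs;
};

struct program {
    char other_state[0x60];      /* padding so linked_xfb sits at offset 0x60 */
    struct xfb_info linked_xfb;
};

struct xfb_object {
    bool active;
    struct program *program;     /* NULL for the default object */
};

/* Returns the number of streamout outputs, or 0 when feedback is off. */
static unsigned streamout_outputs(const struct xfb_object *obj)
{
    unsigned n = 0;

    if (obj->active) {
        /* Only inside this branch is obj->program expected to be non-NULL,
         * so the member pointer is formed here and nowhere earlier. */
        assert(obj->program != NULL);
        const struct xfb_info *info = &obj->program->linked_xfb;
        n = info->num_outputs;
    }
    return n;
}
```

Calling streamout_outputs() on an inactive object with a NULL program never touches the program at all, which is the property the commit restores.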
* glsl/mesa: add reference to gl_shader_program_data from gl_program | Timothy Arceri | 2016-12-31 | 5 | -0/+21
  We also add the stubs for the standalone compiler in this change.

  By adding a reference here we can now refactor some code to use gl_program where we were previously awkwardly using gl_shader_program.

  Reviewed-by: Eric Anholt <[email protected]>
* mesa: make union in gl_program a struct and add FIXME | Timothy Arceri | 2016-12-31 | 1 | -1/+5
  i915 is mixing the use of these fields, for now change this to a struct and add a FIXME.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99229
* i965/peephole_ffma: Use nir_builder | Jason Ekstrand | 2016-12-30 | 1 | -29/+14
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* nir/split_var_copies: Use a nir_shader rather than a void *mem_ctx | Jason Ekstrand | 2016-12-30 | 1 | -3/+3
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* nir/opt_peephole_select: Pass around the actual nir_shader | Jason Ekstrand | 2016-12-30 | 1 | -4/+5
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* nir/conditional_if: Properly use the builder | Jason Ekstrand | 2016-12-30 | 1 | -11/+10
  We were passing around a void *mem_ctx and using that to initialize the builder. That was wrong: mem_ctx pointed to ralloc_parent(impl), which is the shader, but the builder is supposed to be initialized with the nir_function_impl.

  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* nir/lower_var_copies: Use a shader rather than a void *mem_ctx | Jason Ekstrand | 2016-12-30 | 2 | -9/+10
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* nir/lower_io: Use the builder instead of carrying a mem_ctx | Jason Ekstrand | 2016-12-30 | 1 | -8/+8
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* nir/from_ssa: Use nir_builder for emit_copy | Jason Ekstrand | 2016-12-30 | 1 | -13/+13
  This lets us get rid of the void *mem_ctx parameter and make things a bit more type safe.

  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* nir: Make nir_copy_deref follow the "clone" pattern | Jason Ekstrand | 2016-12-30 | 13 | -65/+51
  We rename it to nir_deref_clone, re-order the sources to match the other clone functions, and expose nir_deref_var_clone. This last part, in particular, lets us get rid of quite a few lines since we no longer have to call nir_copy_deref and wrap it in deref_as_var.

  Reviewed-by: Jordan Justen <[email protected]>
* freedreno/ir3: rework varying slots (maybe??) | Rob Clark | 2016-12-30 | 1 | -4/+9
  See: dEQP-GLES2.functional.shaders.swizzles.vector_swizzles.mediump_vec2_yyyy_fragment

  If we only access (in FS) varying.y then it ends up in slot zero.. I'm not sure the hw likes that..

  Signed-off-by: Rob Clark <[email protected]>
* spirv: always expose SpvCapabilityStorageImageExtendedFormats | Ilia Mirkin | 2016-12-29 | 3 | -5/+1
  I forgot to do this in commit 76b97d544e ("anv: enable storage image extended formats"). Since both drivers support this now, no need for the conditional enable.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* anv: add support for extended texture gather | Ilia Mirkin | 2016-12-29 | 2 | -2/+1
  Now that the SPIR-V -> NIR translation is in place, no additional logic is required.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
* radv: only allow cmask/dcc in color optimal. | Dave Airlie | 2016-12-30 | 1 | -3/+2
  I had this on transfers due to the clear color cmd, but it seems like that path shouldn't get fast clears.

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: only allow cmask/dcc on exclusive or concurrent with graphics queue. | Dave Airlie | 2016-12-30 | 1 | -3/+6
  Otherwise we don't get the barriers to flush dcc etc.

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* nir: Rewrite lower_regs_to_ssa to use the phi builder | Jason Ekstrand | 2016-12-29 | 1 | -421/+174
  This keeps some of Connor's original code. However, while I was at it, I updated this very old pass to somewhat more modern NIR.
* nir/phi-builder: Set the value in the block when creating a phi | Jason Ekstrand | 2016-12-29 | 1 | -1/+1
  After we figure out the value that we are going to return, we have a loop that walks up the dominance tree and sets the value in each of the blocks that doesn't have one yet. In the case of the phi, the def is set to NEEDS_PHI not NULL, so the last one where the phi node actually goes never gets filled out. This can lead to duplicating the phi node unnecessarily.
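The caching behaviour described in the phi-builder commit above can be sketched roughly as follows. This is a simplified model, not the NIR code: struct block, create_phi() and value_for_block() are hypothetical names, and phi nodes are stood in for by slots in a small static array. The key line is the one that stores the new phi back into the block that was marked NEEDS_PHI, so a second lookup finds it instead of creating a duplicate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a NIR block. */
struct block {
    struct block *imm_dom;  /* immediate dominator, NULL for the entry block */
    void *def;              /* cached value, NULL if unknown, or NEEDS_PHI */
};

static char needs_phi_sentinel;
#define NEEDS_PHI ((void *)&needs_phi_sentinel)

#define MAX_PHIS 16
static char phi_storage[MAX_PHIS];
static int phis_created;

/* Stand-in for building a phi node; returns a unique address per call. */
static void *create_phi(void)
{
    assert(phis_created < MAX_PHIS);
    return &phi_storage[phis_created++];
}

static void *value_for_block(struct block *block)
{
    /* Find the nearest dominator that already has something recorded. */
    struct block *dom = block;
    while (dom && dom->def == NULL)
        dom = dom->imm_dom;

    void *def;
    if (dom == NULL) {
        def = NULL;             /* a real pass would use an undef value here */
    } else if (dom->def == NEEDS_PHI) {
        def = create_phi();
        dom->def = def;         /* record the phi in its own block as well */
    } else {
        def = dom->def;
    }

    /* Cache the result in every block on the way up that lacks a value.
     * This loop skips the NEEDS_PHI block because its def is not NULL. */
    for (dom = block; dom && dom->def == NULL; dom = dom->imm_dom)
        dom->def = def;

    return def;
}
```

Without the `dom->def = def` assignment in the NEEDS_PHI branch, a lookup starting directly at the NEEDS_PHI block would create a second phi for that block.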
* nir: Add foreach_register helper macros | Jason Ekstrand | 2016-12-29 | 1 | -0/+5
* nir: Rename convert_to_ssa lower_regs_to_ssa | Jason Ekstrand | 2016-12-29 | 9 | -13/+12
  This matches the naming of nir_lower_vars_to_ssa, the other to-SSA pass.
* mesa/glsl/i965: remove Driver.NewShader() | Timothy Arceri | 2016-12-30 | 10 | -58/+2
  After removing brw_shader in the previous commit this is no longer needed.

  V2: remove use in src/compiler/glsl/test_optpass.cpp

  Reviewed-by: Eric Anholt <[email protected]>
* i965: move compiled_once flag to brw_program | Timothy Arceri | 2016-12-30 | 8 | -48/+23
  This allows us to delete brw_shader and removes the last use of gl_linked_shader in the codegen paths.

  Reviewed-by: Eric Anholt <[email protected]>
* mesa/glsl: move BlendSupport bitfield to gl_program | Timothy Arceri | 2016-12-30 | 5 | -11/+20
  This will let us make _CurrentFragmentProgram a gl_program pointer, allowing for simplifications to be made.

  We also need to add a field to gl_shader to hold it during parsing.

  In gl_program we put it inside a union in anticipation of moving more fields here that can be only fs or vertex stage fields.

  Reviewed-by: Eric Anholt <[email protected]>
* mesa: store gl_program in gl_transform_feedback_object rather than gl_shader_program | Timothy Arceri | 2016-12-30 | 5 | -23/+21
  This will allow us to make the CurrentProgram array store gl_program which allows us to do a bunch of simplifications.

  Reviewed-by: Eric Anholt <[email protected]>
* mesa/glsl: move LinkedTransformFeedback from gl_shader_program to gl_program | Timothy Arceri | 2016-12-30 | 12 | -48/+62
  This will help allow us to store gl_program in the CurrentProgram array rather than gl_shader_program which will allow a bunch of simplifications.

  Note that we make LinkedTransformFeedback a pointer so we don't waste memory creating a struct for each stage. We also store a pointer to the gl_program that will contain the pointer in gl_shader_program so we can get easy access to the correct stage.

  Reviewed-by: Eric Anholt <[email protected]>
* i965: get LinkedTransformFeedback from gl_transform_feedback_object | Timothy Arceri | 2016-12-30 | 1 | -20/+9
  We have already set the gl_shader_program pointer to the correct shader program in _mesa_BeginTransformFeedback() so use it.

  This is more consistent with how we do it for gen7.

  Reviewed-by: Eric Anholt <[email protected]>
* mesa: move _Used to gl_program | Timothy Arceri | 2016-12-30 | 3 | -6/+6
  We no longer need to initialise it because gl_program is never reused.

  Reviewed-by: Eric Anholt <[email protected]>
* mesa/compiler: add local_size_variable to shader_info | Timothy Arceri | 2016-12-30 | 2 | -0/+3
  This will be used in api_validate.c in a following patch when we switch to using gl_program pointers for the pipeline's CurrentProgram array.

  Reviewed-by: Eric Anholt <[email protected]>
* mesa: pass gl_program to _mesa_append_uniforms_to_file() | Timothy Arceri | 2016-12-30 | 3 | -5/+4
  This now contains everything we need.

  Reviewed-by: Eric Anholt <[email protected]>
* glsl/mesa: set separate_shader directly in shader_info | Timothy Arceri | 2016-12-30 | 2 | -1/+1
  Reviewed-by: Eric Anholt <[email protected]>
* mesa/glsl: move subroutine metadata to gl_program | Timothy Arceri | 2016-12-30 | 5 | -119/+123
  This will allow us to store gl_program rather than gl_shader_program as the current program per stage, which allows us to simplify code that makes use of the CurrentProgram list.

  Reviewed-by: Eric Anholt <[email protected]>
* mesa/compiler: add stage to shader_info | Timothy Arceri | 2016-12-30 | 2 | -0/+4
  This will allow us to simplify the current program logic for SSO. Also, since we aim to detach shader_info from nir_shader, this will come in handy by avoiding passing nir_shader around just to keep track of the stage we are dealing with.

  V2: set stage for arb asm programs also.

  Reviewed-by: Eric Anholt <[email protected]>
* vc4: Rework scheduling of thread switch to cut one more NOP. | Eric Anholt | 2016-12-29 | 1 | -46/+75
  Jonas's patch got us most of the benefit of scheduling instructions into the delay slots of thread switch, but if there had been nothing to pair the thrsw with, it would move the thrsw up and leave a NOP where the thrsw was. Instead, don't pair anything with thrsw through the normal scheduling path, and have a separate helper function that inserts the thrsw earlier if possible and inserts any necessary NOPs.

  total instructions in shared programs: 93027 -> 92643 (-0.41%)
  instructions in affected programs: 14952 -> 14568 (-2.57%)
* vc4: Fill thread switching delay slots | Jonas Pfeil | 2016-12-29 | 1 | -7/+38
  Scan for instructions without a signal set in front of the switching instruction and move the signal up there.

  shader-db results:
  total instructions in shared programs: 94494 -> 93027 (-1.55%)
  instructions in affected programs: 23545 -> 22078 (-6.23%)

  v2: Fix re-emitting of the instruction in the loop trying to emit NOPs, drop a scheduling change from branch delay slots. (by anholt)

  Signed-off-by: Jonas Pfeil <[email protected]>
* vc4: Enable NIR-based loop unrolling. | Eric Anholt | 2016-12-29 | 1 | -0/+5
  This successfully unrolls a new shader in GLB2.7, which also gets that shader to successfully compile in multithreaded mode.
* nir: stop gcc warning about uninitialised variables | Timothy Arceri | 2016-12-29 | 1 | -1/+1
  Reviewed-by: Jason Ekstrand <[email protected]>
* radv: denote support for extended storage image formats. | Dave Airlie | 2016-12-28 | 1 | -2/+4
  I'm sure anv has support for these as well, but this is just a first use of the interface to allow different supported spir-v features.

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* spirv: add interface for drivers to define support extensions. | Dave Airlie | 2016-12-28 | 5 | -4/+24
  I expect over time the struct contents will change as all drivers support stuff etc, but for now this should be a good starting point.

  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* mesa/shaderobj: Fix races on refcounts | Chad Versace | 2016-12-28 | 1 | -10/+4
  Use atomic ops when updating gl_shader::RefCount.

  Fixes intermittent failures and crashes in 'dEQP-EGL.functional.sharing.gles2.multithread.*'. All tests in that group now pass except 'dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.textures.copyteximage2d_texsubimage2d_render'.

  Tested with:
    mesa: branch 'master' at d6545f2
    deqp: branch 'nougat-cts-dev' at 4acf725 with additional local fixes
    DEQP_TARGET: x11_egl
    hw: Intel Broadwell 0x1616

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99085
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
  Cc: [email protected]
  Cc: Mark Janes <[email protected]>
  Cc: Haixia Shi <[email protected]>
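The idea of updating a reference count with atomic operations, as in the shaderobj commit above, can be sketched with C11 <stdatomic.h>. This is an assumption made for a self-contained illustration: the message does not say which atomic helpers the patch uses, and struct shader, shader_ref() and shader_unref() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object shared between threads. */
struct shader {
    atomic_int ref_count;
};

/* Take a reference. The increment is a single atomic read-modify-write,
 * so two threads doing this at once cannot lose an update. */
static void shader_ref(struct shader *sh)
{
    atomic_fetch_add(&sh->ref_count, 1);
}

/* Drop a reference. Returns true when the caller released the last one
 * and is therefore responsible for destroying the object. */
static bool shader_unref(struct shader *sh)
{
    int old = atomic_fetch_sub(&sh->ref_count, 1);
    assert(old > 0);
    return old == 1;
}
```

Because atomic_fetch_sub() returns the value from before the decrement, exactly one caller observes the transition to zero, which avoids both double frees and leaks under contention.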
* freedreno/ir3: fix linkage::var size | Rob Clark | 2016-12-27 | 1 | -1/+1
  It should actually be 32 for a4xx/a5xx.. we still only advertise 16 but for a5xx the linkage map includes position/psize.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: treat clipvertex like a normal varying | Rob Clark | 2016-12-27 | 1 | -3/+1
  We need this in case it is streamed out. Not sure why we were treating it specially before. Having it as a VS out is harmless if FS doesn't have a matching input.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: transform-feedback support | Rob Clark | 2016-12-27 | 7 | -38/+209
  We'll need to revisit when adding hw binning pass support, whether we can still do this in main draw step, as we do w/ a3xx/a4xx, or if we need to move it to the binning stage.

  Still some failing piglits but most tests pass and the common cases seem to work.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2016-12-27 | 7 | -43/+81
  Pull in a5xx streamout related regs. Also fixes a couple incorrect register definitions.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: UBO support for 64b GPUs (a5xx) | Rob Clark | 2016-12-27 | 1 | -3/+24
  Update address calculation to support 64b addresses.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rework location of driver constants | Rob Clark | 2016-12-27 | 6 | -53/+75
  Rework how we lay out driver constants (driver-params, UBO/TFBO buffer addresses, immediates) for more flexibility. For a5xx+ we need to deal with the fact that gpu ptrs are 64b instead of 32b, which makes the fixed offset scheme not work so well. While we are dealing with that we might also make the layout more dynamic to account for varying # of UBOs, etc.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix emit for bo addresses | Rob Clark | 2016-12-27 | 1 | -3/+9
  Reloc for the buffer address is two dwords on 64b devices (a5xx+)

  Signed-off-by: Rob Clark <[email protected]>
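To illustrate why a buffer address takes two dwords on a 64b device, here is a minimal sketch. emit_address() and its parameters are hypothetical and are not the freedreno API; the low-dword-first ordering is also an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a GPU address into a command stream of 32-bit dwords.
 * Returns how many dwords were written: 1 on 32b devices, 2 on 64b ones. */
static unsigned emit_address(uint32_t *ring, uint64_t iova, int is_64b)
{
    unsigned n = 0;

    assert(ring != NULL);
    ring[n++] = (uint32_t)(iova & 0xffffffffu);   /* low 32 bits */
    if (is_64b)
        ring[n++] = (uint32_t)(iova >> 32);       /* high 32 bits */
    return n;
}
```

Any packet whose size was computed assuming a single dword per address ends up one dword short on a 64b device, which is the kind of mismatch the commit addresses.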
* freedreno/a5xx: texture layout | Rob Clark | 2016-12-27 | 2 | -2/+2
  Seems to be similar to a4xx, and sampler state "array-pitch" needs to be aligned to page size.

  Signed-off-by: Rob Clark <[email protected]>
* ttn: set ->info->num_ubos | Rob Clark | 2016-12-27 | 1 | -1/+4
  For dealing w/ 32b vs 64b gpu addresses, I need to rework how we pass UBO buffer addresses to shader, and knowing up front the # of UBOs is useful. But I noticed ttn wasn't setting this.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0 | Chad Versace | 2016-12-27 | 1 | -1/+8
  The spec implicitly allows the incoming count to be 0. From the Vulkan 1.0.38 spec, Section 4.1 Physical Devices:

      If the value referenced by pQueueFamilyPropertyCount is not 0 [then do stuff].

  Cc: [email protected]
  Reviewed-by: Anuj Phogat <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
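The count == 0 case from the anv commit above can be sketched without the Vulkan headers. The struct and function below are hypothetical stand-ins for the real entry point, and the single queue family is an assumption made for the example. The sketch follows the usual two-call pattern: a NULL array queries the count, and a non-NULL array with a count of 0 writes nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VkQueueFamilyProperties. */
struct queue_family_props {
    uint32_t queue_count;
};

/* Assumed for illustration: the device exposes exactly one queue family. */
static const struct queue_family_props the_family = { 1 };

static void get_queue_family_properties(uint32_t *count,
                                        struct queue_family_props *props)
{
    assert(count != NULL);

    if (props == NULL) {
        *count = 1;             /* query mode: report how many exist */
        return;
    }

    if (*count == 0)
        return;                 /* caller has room for none: write nothing */

    props[0] = the_family;
    *count = 1;                 /* report how many entries were written */
}
```

Without the early return for a zero count, the function would write one entry into an array the caller declared to have no room.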