path: root/src/mesa
Commit message | Author | Age | Files | Lines
* mesa/st: Fix leaks of TGSI tokens in VP variants. | Eric Anholt | 2019-03-14 | 1 | -14/+20
    Starting a glxgears and closing it, I was seeing a lot of leaked TGSI for the fixed function VPs.

    v2: drop unused delete_ir() arg.

    Fixes: 3b4929ec6e64 ("st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.")
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/st: Make sure that prog_to_nir NIR gets freed. | Eric Anholt | 2019-03-14 | 1 | -0/+6
    GLSL NIR gets freed on relink by _mesa_delete_program(), but for ARB programs we need to free the old NIR when PSN is used to set up new NIR in the same gl_program.

    Additionally, set the base .nir field so that it will get freed by _mesa_delete_program().

    Fixes: 3d7611e9a6c6 ("st/nir: use NIR for asm programs")
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: add logging function for formatted string | Mark Janes | 2019-03-14 | 2 | -0/+35
    Reviewed-by: Erik Faye-Lund <[email protected]>
* mesa: rename logging functions to reflect that they format strings | Mark Janes | 2019-03-14 | 12 | -92/+92
    In preparation for the definition of a function to log a formatted string.

    Reviewed-by: Erik Faye-Lund <[email protected]>
* mesa: properly report the length of truncated log messages | Mark Janes | 2019-03-14 | 1 | -0/+3
    _mesa_log_msg must provide the length of the string passed into the KHR_debug api. When the string formatted by _mesa_gl_vdebugf exceeds MAX_DEBUG_MESSAGE_LENGTH, the length is incorrectly set to the number of characters that would have been written if enough space had been available.

    Fixes: 30256805784450b8bb9d4dabfb56226271ca9d24 ("mesa: Add support for GL_ARB_debug_output with dynamic ID allocation.")
    Reviewed-by: Erik Faye-Lund <[email protected]>
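The truncation issue described in this commit follows from how C's `vsnprintf()` reports length: it returns the size the full string would have needed, not the size actually stored. A minimal sketch of the clamp, assuming a made-up limit `SKETCH_MAX_MESSAGE_LENGTH` and a hypothetical helper `sketch_format_message()` (neither is Mesa code):

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical limit for this sketch; it stands in for Mesa's
 * MAX_DEBUG_MESSAGE_LENGTH, whose real value is not assumed here. */
#define SKETCH_MAX_MESSAGE_LENGTH 16

/* Formats into buf and returns the number of characters actually stored
 * (excluding the terminating NUL), not the untruncated length. */
static int
sketch_format_message(char buf[SKETCH_MAX_MESSAGE_LENGTH],
                      const char *fmt, ...)
{
   va_list args;
   int len;

   assert(fmt != NULL);

   va_start(args, fmt);
   len = vsnprintf(buf, SKETCH_MAX_MESSAGE_LENGTH, fmt, args);
   va_end(args);

   if (len < 0) {
      /* Encoding error: report an empty message. */
      buf[0] = '\0';
      return 0;
   }

   /* vsnprintf() returns the length the full string would have had,
    * so clamp it to what fits in the buffer. */
   if (len >= SKETCH_MAX_MESSAGE_LENGTH)
      len = SKETCH_MAX_MESSAGE_LENGTH - 1;

   return len;
}
```

A caller would pass the returned value, rather than the raw `vsnprintf()` result, as the message length to whatever consumes the buffer.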
* i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9 | Plamena Manolova | 2019-03-14 | 1 | -1/+24
    ARB_fragment_shader_interlock depends on memory fences to ensure fragment ordering and this ordering guarantee is only supported from GEN9 onwards.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109980
    Fixes: 939312702e35 "i965: Add ARB_fragment_shader_interlock support."
    Signed-off-by: Plamena Manolova <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove scaling factors from P010, P012 | Tapani Pälli | 2019-03-14 | 1 | -2/+2
    This patch removes the scaling factors introduced in 2a2e69f975b but leaves the option to use scaling in place, as it could be useful with other upcoming YUV formats. We did this scaling because ffmpeg was shifting channel bits down. However, it seems this is not the right place, as the compositor wants to flip the same buffers directly to the display as well, and therefore the bitshifting needs to be done by the client when receiving a frame from ffmpeg.

    Now P0x formats are treated the same, e.g. P010 is the same as P016 but with the lower 6 bits set to zeros.

    Fixes: 2a2e69f975b "i965: add P0x formats and propagate required scaling factors"
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
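The statement that P010 is P016 with the lower 6 bits zeroed can be illustrated with a small sketch. It assumes the client starts from a right-justified 10-bit sample and shifts it up itself. The helper names are hypothetical and not part of Mesa or ffmpeg:

```c
#include <assert.h>
#include <stdint.h>

/* Places a right-justified 10-bit sample into a 16-bit P010 container:
 * the value occupies the upper 10 bits and the lower 6 bits are zero. */
static uint16_t
sketch_pack_p010(uint16_t sample10)
{
   assert(sample10 <= 0x3ff);
   return (uint16_t)(sample10 << 6);
}

/* Recovers the 10-bit sample from a P010 container. */
static uint16_t
sketch_unpack_p010(uint16_t stored)
{
   return (uint16_t)(stored >> 6);
}
```

With samples stored this way, a driver can read the container as a plain 16-bit value without applying an extra scaling factor.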
* st/glsl_to_nir: fix incorrect array access | Timothy Arceri | 2019-03-12 | 1 | -2/+5
    This fixes a segfault when we try to access the array using a -1 when the array wasn't allocated in the first place.

    Before 7536af670b75 we would just access a pre-allocated array that was also load/stored to/from the shader cache. But now the cache will no longer allocate these arrays if they are empty. The change resulted in tests such as the following segfaulting when run with a warm shader cache.

    tests/spec/arb_arrays_of_arrays/execution/sampler/fs-struct-const-index.shader_test
* i965: Reimplement all the PIPE_CONTROL rules. | Kenneth Graunke | 2019-03-11 | 1 | -136/+403
    This implements virtually all documented PIPE_CONTROL restrictions in a centralized helper. You now simply ask for the operations you want, and the pipe control "brain" will figure out exactly what pipe controls to emit to make that happen without tanking your system.

    The hope is that this will fix some intermittent flushing issues as well as GPU hangs. However, it also has a high risk of causing GPU hangs and other regressions, as this is a particularly sensitive area and poking the bear isn't always advisable.

    Mark Janes noted that this patch helps with some GPU hangs on Icelake.

    This does re-enable the VF Invalidate => Write Immediate workaround on Gen8, which had been disabled (bug 103787) due to GPU hangs. The old code did this workaround after another which would have added CS stall bits, so it missed a workaround. The new code orders them properly and appears to work.

    v4: Don't pass "bo, offset, imm" to a recursive CS stall (caught by Topi Pohjolainen), drop Gen10 workarounds that are unnecessary for production hardware.

    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use genxml for emitting PIPE_CONTROL. | Kenneth Graunke | 2019-03-11 | 7 | -230/+362
    While this does add a bunch of boilerplate, it also protects us against the hardware moving bits, or changing their meaning. For something as finicky as PIPE_CONTROL, the extra safety seems worth it.

    We turn PIPE_CONTROL_* into a bitfield of arbitrary flags, and then pack them appropriately.

    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rename ISP_DIS to INDIRECT_STATE_POINTERS_DISABLE. | Kenneth Graunke | 2019-03-11 | 2 | -2/+2
    Clearer name.

    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move some genX infrastructure to genX_boilerplate.h. | Kenneth Graunke | 2019-03-11 | 4 | -128/+174
    This will let us make multiple genX_*.c files, without copy and pasting all this boilerplate.

    Reviewed-by: Topi Pohjolainen <[email protected]>
* st/mesa: minor refactoring of texture/sampler delete code | Brian Paul | 2019-03-11 | 3 | -6/+11
    Rename st_texture_free_sampler_views() to st_delete_texture_sampler_views() to align with st_DeleteTextureObject(), its only caller.

    Move the call to st_texture_release_all_sampler_views() from st_DeleteTextureObject() to st_delete_texture_sampler_views() so all the sampler view clean-up code is in one place.

    Reviewed-by: Neha Bhende <[email protected]>
* st/mesa: rename st_texture_release_sampler_view() | Brian Paul | 2019-03-11 | 3 | -5/+5
    To st_texture_release_context_sampler_view() to be more clear that it's context-specific.

    Reviewed-by: Neha Bhende <[email protected]>
* st/mesa: add/improve sampler view comments | Brian Paul | 2019-03-11 | 1 | -2/+8
    Reviewed-by: Neha Bhende <[email protected]>
* st/mesa: move around some code in st_context.c | Brian Paul | 2019-03-11 | 2 | -122/+116
    st_init_driver_functions() is only called in st_context.c so there's no need for the prototype in st_context.h.

    To avoid a forward declaration of st_init_driver_functions() in st_context.c, we need to move around several other functions.

    No functional change.

    Reviewed-by: Neha Bhende <[email protected]>
* st/mesa: move utility functions, macros into new st_util.h file | Brian Paul | 2019-03-11 | 33 | -91/+184
    To de-clutter st_context.h. Clean up remaining function prototypes in st_context.h.

    The st_vp_uses_current_values() helper is only used in st_context.c so move it there.

    The st_get_active_states() function is only used in st_context.c so remove its prototype in st_context.h.

    Reviewed-by: Neha Bhende <[email protected]>
* prog_to_nir: fix write from vps to FOG | Karol Herbst | 2019-03-08 | 1 | -1/+7
    For fragment programs we already treat fog as a single-component value, but for vertex programs we didn't.

    Fixes fog-related piglit tests with my out-of-tree Nouveau NIR patches.

    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: init hash keys with memset(), not designated initializers | Brian Paul | 2019-03-08 | 2 | -5/+17
    Since the compiler may not zero-out padding in the object. Add a couple comments about this to prevent misunderstandings in the future.

    Fixes: 67d96816ff5 ("st/mesa: move, clean-up shader variant key decls/inits")
    Reviewed-by: Roland Scheidegger <[email protected]>
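The padding concern in this commit matters whenever a struct is hashed or compared bytewise. The sketch below uses a hypothetical key type, `struct sketch_variant_key`, rather than Mesa's real variant keys. Clearing the whole object with `memset()` before filling the members is the conventional way to make the padding bytes predictable. Strictly speaking, the C standard leaves padding unspecified after member stores, so this relies on the common compiler behaviour of leaving it untouched:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout. On most ABIs there are padding bytes
 * between 'flag' and 'id'. */
struct sketch_variant_key {
   uint8_t flag;
   uint32_t id;
};

/* Zero the whole object, padding included, then fill in the members.
 * A designated initializer such as { .flag = f, .id = i } is not
 * required to clear the padding bytes. */
static void
sketch_key_init(struct sketch_variant_key *key, uint8_t flag, uint32_t id)
{
   assert(key != NULL);
   memset(key, 0, sizeof(*key));
   key->flag = flag;
   key->id = id;
}

/* Bytewise comparison, as a hash table keyed on the raw bytes would do. */
static int
sketch_keys_equal(const struct sketch_variant_key *a,
                  const struct sketch_variant_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```

Two keys built this way from the same values compare equal regardless of what the memory held beforehand, which is the property a bytewise hash lookup depends on.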
* st/mesa: whitespace, formatting fixes in st_cb_flush.c | Brian Paul | 2019-03-08 | 1 | -14/+19
    Trivial.
* st/mesa: move, clean-up shader variant key decls/inits | Brian Paul | 2019-03-08 | 2 | -10/+7
    Move the variant key declarations inside the scope they're used. Use designated initializers instead of memset() calls.

    Reviewed-by: Neha Bhende <[email protected]>
* isl: Add a swizzle parameter to isl_buffer_fill_state() | Kenneth Graunke | 2019-03-07 | 1 | -0/+1
    This is necessary for legacy texture buffer object formats, where we'll need to use a swizzle to fake e.g. luminance.

    Reviewed-by: Jason Ekstrand <[email protected]>
* intel/decoders: handle decoding MI_BBS from ring | Lionel Landwerlin | 2019-03-07 | 1 | -1/+1
    An MI_BATCH_BUFFER_START in the ring buffer acts as a second level batchbuffer (aka jump back to ring buffer when running into a MI_BATCH_BUFFER_END).

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Rafael Antognolli <[email protected]>
* intel/decoders: add address space indicator to get BOs | Lionel Landwerlin | 2019-03-07 | 1 | -1/+1
    Some commands like MI_BATCH_BUFFER_START have this indicator.

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Rafael Antognolli <[email protected]>
* st/glsl: start splitting out common st glsl conversion code | Timothy Arceri | 2019-03-06 | 7 | -122/+222
    The NIR and TGSI paths are currently intertwined, which not only makes the code hard to follow but also makes it hard to take advantage of the differences in IR. Here we take the first step to splitting that path apart.

    With this we take the opportunity to no longer call the GLSL IR optimisation passes after the final lowering calls for NIR. We can instead just use the NIR passes, which can produce better code and should also result in faster compile times.

    The speed-up can be measured in some dolphin uber shaders due to no longer calling lower_if_to_cond_assign(). For example, dolphin/ubershaders/120.shader_test goes from ~1.63 -> ~1.53 seconds on my machine.

    There are some code changes as a result of not calling lower_if_to_cond_assign(). This is because it flattens ifs that contain UBOs, whereas NIR's peephole select doesn't. This is where most of the regressions in Max Waves happen with shader-db.

    shader-db results (VEGA):

    Totals from affected shaders:
    SGPRS: 2349056 -> 2349640 (0.02 %)
    VGPRS: 1322160 -> 1323300 (0.09 %)
    Spilled SGPRs: 21190 -> 21527 (1.59 %)
    Spilled VGPRs: 99 -> 99 (0.00 %)
    Private memory VGPRs: 0 -> 0 (0.00 %)
    Scratch size: 72 -> 72 (0.00 %) dwords per thread
    Code Size: 57260904 -> 57270932 (0.02 %) bytes
    Compile Time: 1107186 -> 1022942 (-7.61 %) milliseconds
    LDS: 786 -> 786 (0.00 %) blocks
    Max Waves: 391932 -> 391619 (-0.08 %)
    Wait states: 0 -> 0 (0.00 %)

    Reviewed-by: Eric Anholt <[email protected]>
* i965: stop calling nir_lower_returns() | Timothy Arceri | 2019-03-06 | 1 | -3/+1
    We now call this for all drivers in glsl_to_nir() instead.

    Reviewed-by: Eric Anholt <[email protected]>
* glsl: use NIR function inlining for drivers that use glsl_to_nir() | Timothy Arceri | 2019-03-06 | 2 | -2/+2
    glsl_to_nir() is still missing support for converting certain functions to NIR, so for those we use the GLSL IR optimisations to remove the functions.

    Reviewed-by: Eric Anholt <[email protected]>
* st/nir: Move 64-bit lowering later | Jason Ekstrand | 2019-03-06 | 1 | -2/+5
    Now that we have a loop unrolling cost function and loop unrolling isn't going to kill us the moment we have a 64-bit op in a loop, we can go ahead and move 64-bit lowering later. This gives us the opportunity to do more optimizations and actually let the full optimizer run even on 64-bit ops rather than hoping one round of opt_algebraic will fix everything. This substantially reduces both fp64 shader compile times and the resulting code size.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_doubles: Inline functions directly in lower_doubles | Jason Ekstrand | 2019-03-06 | 2 | -23/+8
    Instead of trusting the caller to already have created a softfp64 function shader and added all its functions to our shader, we simply take the softfp64 shader as an argument and do the function inlining ourselves. This means that there are no more nasty functions lying around that the caller needs to worry about cleaning up.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Add a shared helper for building float64 shaders | Jason Ekstrand | 2019-03-06 | 3 | -99/+5
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Compile the fp64 program based on nir options | Jason Ekstrand | 2019-03-06 | 1 | -1/+2
    Instead of looking at the devinfo directly, look at the lowering options we provided to NIR. This is more accurate as it's now checking for "do we need full software lowering" rather than a hardware bit.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc() | Timothy Arceri | 2019-03-06 | 2 | -4/+4
    Replace done using:

       find ./src -type f -exec sed -i -- \
          's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;

    Acked-by: Karol Herbst <[email protected]>
    Acked-by: Jason Ekstrand <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename record_location_offset() -> struct_location_offset() | Timothy Arceri | 2019-03-06 | 2 | -2/+2
    Replace done using:

       find ./src -type f -exec sed -i -- \
          's/record_location_offset(/struct_location_offset(/g' {} \;

    Acked-by: Karol Herbst <[email protected]>
    Acked-by: Jason Ekstrand <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename is_record() -> is_struct() | Timothy Arceri | 2019-03-06 | 3 | -8/+8
    Replace was done using:

       find ./src -type f -exec sed -i -- \
          's/is_record(/is_struct(/g' {} \;

    Acked-by: Karol Herbst <[email protected]>
    Acked-by: Jason Ekstrand <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* nir: Add multiplier argument to nir_lower_uniforms_to_ubo. | Timur Kristóf | 2019-03-05 | 2 | -2/+2
    Note that locations can be set in different units, and the multiplier argument caters to supporting these different units. For example, st_glsl_to_nir uses dwords (4 bytes) so the multiplier should be 4, while tgsi_to_nir uses bytes, so the multiplier should be 16.

    Signed-Off-By: Timur Kristóf <[email protected]>
    Tested-by: Andre Heider <[email protected]>
    Tested-by: Rob Clark <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* nir: Move nir_lower_uniforms_to_ubo to compiler/nir. | Timur Kristóf | 2019-03-05 | 6 | -105/+2
    The nir_lower_uniforms_to_ubo function is useful outside of mesa/state_tracker, and in fact is needed to produce NIR for drivers that have the PIPE_CAP_PACKED_UNIFORMS capability.

    Signed-Off-By: Timur Kristóf <[email protected]>
    Tested-by: Andre Heider <[email protected]>
    Tested-by: Rob Clark <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Implement threaded GL support. | Kenneth Graunke | 2019-03-05 | 3 | -0/+51
    Now i965 supports mesa_glthread=true like Gallium drivers do.

    According to Markus (degasus), the Citra emulator now runs ~30% faster. Emmanuel (linkmauve) also reported that the Dolphin emulator improved by 2.8x on one game. (Both of those still need to be added to drirc.)

    An Intel Mesa CI run with mesa_glthread=true appears to be happy.

    Bioshock Infinite's benchmark mode seems to be around 15-20% faster on my Skylake GT4 at 1920x1080.

    Tested-by: Markus Wick <[email protected]>
    Tested-by: Emmanuel Gil Peyrot <[email protected]>
    Tested-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: [u/i]mulExtended optimization for GLSL | Sagar Ghuge | 2019-03-04 | 2 | -0/+2
    Optimize mulExtended to use 32x32->64 multiplication.

    Drivers which are not based on NIR can set the MUL64_TO_MUL_AND_MUL_HIGH lowering flag in order to keep the old behavior.

    v2: Add missing condition check (Jason Ekstrand)

    Signed-off-by: Sagar Ghuge <[email protected]>
    Suggested-by: Matt Turner <[email protected]>
    Suggested-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* st/mesa: whitespace fixes in st_texture.h | Brian Paul | 2019-03-04 | 1 | -9/+13
    Trivial.
* st/mesa: line wrapping, whitespace fixes in st_cb_texture.c | Brian Paul | 2019-03-04 | 1 | -2/+4
    Trivial.
* st/mesa: whitespace fixes in st_sampler_view.c | Brian Paul | 2019-03-04 | 1 | -6/+10
    Replace tabs w/ spaces. 80-column wrapping.

    Trivial.
* st/mesa: Invalidate the gallium array atom only if needed. | Mathias Fröhlich | 2019-03-04 | 1 | -2/+4
    Now that the buffer object usage history tracks whether a buffer is being used as a vertex buffer object, we can restrict setting the ST_NEW_VERTEX_ARRAYS dirty bit on glBufferData calls to buffers that are potentially used as vertex buffer objects. Also put a note that the same could be done for index arrays used in indexed draws.

    Reviewed-by: Brian Paul <[email protected]>
    Signed-off-by: Mathias Fröhlich <[email protected]>
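The idea of gating a dirty bit on the buffer's usage history can be sketched as follows. All names here (`sketch_usage`, `SKETCH_NEW_VERTEX_ARRAYS`, `struct sketch_buffer`) are hypothetical stand-ins for the Mesa fields and flags the commit mentions, and the flag values are made up:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical usage-history flags, loosely modelled on the idea of
 * gl_buffer_object::UsageHistory. */
enum sketch_usage {
   SKETCH_USAGE_ARRAY_BUFFER         = 1u << 0,
   SKETCH_USAGE_ELEMENT_ARRAY_BUFFER = 1u << 1,
   SKETCH_USAGE_UNIFORM_BUFFER       = 1u << 2
};

/* Hypothetical dirty bit standing in for ST_NEW_VERTEX_ARRAYS. */
#define SKETCH_NEW_VERTEX_ARRAYS (1u << 0)

struct sketch_buffer {
   unsigned usage_history;
};

/* Records that the buffer has been used as a vertex buffer. */
static void
sketch_bind_as_vertex_buffer(struct sketch_buffer *buf)
{
   assert(buf != NULL);
   buf->usage_history |= SKETCH_USAGE_ARRAY_BUFFER;
}

/* Returns the dirty bits to set when the buffer's storage is replaced:
 * vertex arrays are only invalidated if the buffer may back one. */
static unsigned
sketch_dirty_bits_on_buffer_data(const struct sketch_buffer *buf)
{
   assert(buf != NULL);
   if (buf->usage_history & SKETCH_USAGE_ARRAY_BUFFER)
      return SKETCH_NEW_VERTEX_ARRAYS;
   return 0;
}
```

A buffer that has only ever been used for uniforms would then skip the vertex array revalidation entirely.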
* mesa: Track buffer object use also for VAO usage. | Mathias Fröhlich | 2019-03-04 | 4 | -4/+15
    We already track the usage history for buffer objects in a lot of aspects. Add GL_ARRAY_BUFFER and GL_ELEMENT_ARRAY_BUFFER to gl_buffer_object::UsageHistory.

    Reviewed-by: Brian Paul <[email protected]>
    Signed-off-by: Mathias Fröhlich <[email protected]>
* mesa: Expose EXT_texture_query_lod and add support for its use in shaders | Gert Wollny | 2019-03-03 | 1 | -0/+1
    EXT_texture_query_lod provides the same functionality for GLES as the ARB extension with the same name does for GL.

    v2: Set ES 3.0 as minimum GLES version as required by the extension

    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: add support for lowering fp64/int64 for nir drivers | Dave Airlie | 2019-03-02 | 1 | -1/+98
    This might be enough for iris and possibly r600 (when it gets NIR).

    This appears to work for iris.

    v2:
     * change cap return so DOUBLES == 2 means sw emu

    v3:
     * Refactor using int64/doubles lowering options which were added into nir options
     * Remove DOUBLES == 2 added in v2

    [jordan: Remove "2" value on PIPE_CAP_DOUBLES]
    [jordan: Use lowering options added to nir options]
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Acked-by: Jason Ekstrand <[email protected]>
* st/nir: count num_uniforms for FS builtin shader | Caio Marcelo de Oliveira Filho | 2019-02-27 | 1 | -0/+2
    Usually the uniforms will be assigned locations and have their slots counted automatically, but for builtin shaders the location assignment is manual. So count them too, otherwise we get num_uniforms == 0.

    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: fix shader cache for packed param list | Timothy Arceri | 2019-02-28 | 2 | -0/+7
    Some types of params such as some builtins are always padded. We need to keep track of this so we can restore the list correctly.

    Here we also remove a couple of cache entries that are not actually required as they get rebuilt by the _mesa_add_parameter() calls.

    This patch fixes a bunch of arb_texture_multisample and arb_sample_shading piglit tests for the radeonsi NIR backend.

    Fixes: edded1237607 ("mesa: rework ParameterList to allow packing")
    Reviewed-by: Marek Olšák <[email protected]>
* i965: Fix allow_higher_compat_version workaround limited by OpenGL 3.0 | Yevhenii Kolesnikov | 2019-02-28 | 1 | -6/+12
    Added check for higher compat profile being allowed before assigning certain extensions.

    Fixes: 272fe9494232 (mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile)
    Signed-off-by: Danylo Piliaiev <[email protected]>
    Signed-off-by: Yevhenii Kolesnikov <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107052
* mesa: fix display list corner case assertion | Brian Paul | 2019-02-26 | 1 | -0/+2
    This fixes a failed assertion in glDeleteLists() for the following case:

       list = glGenLists(1);
       glDeleteLists(list, 1);

    when those are the first display list commands issued by the application.

    When we generate display lists, we plug in empty lists created with the make_list() helper. This function uses the OPCODE_END_OF_LIST opcode but does not call dlist_alloc() which would set the InstSize[OPCODE_END_OF_LIST] element to non-zero.

    When the empty list was deleted, we failed the InstSize[opcode] > 0 assertion.

    Typically, display lists are created with glNewList/glEndList so we set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc(). That's why this bug wasn't found before.

    To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST] element in make_list().

    The game oolite was hitting this.

    Fixes: https://github.com/OoliteProject/oolite/issues/325
    Reviewed-by: Marek Olšák <[email protected]>
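The failure mode in this commit, a lazily filled size table that the empty-list path never touched, can be modelled in a few lines. This is a simplified sketch with hypothetical names (`sketch_inst_size`, `sketch_make_empty_list()`, `sketch_delete_list()`). It mirrors the shape of the fix described above, not Mesa's actual dlist.c:

```c
#include <assert.h>
#include <stdlib.h>

enum sketch_opcode {
   SKETCH_OPCODE_END_OF_LIST,
   SKETCH_OPCODE_COUNT
};

/* Per-opcode instruction sizes, filled in lazily. Zero means the size
 * has not been recorded yet. */
static unsigned sketch_inst_size[SKETCH_OPCODE_COUNT];

struct sketch_node {
   enum sketch_opcode opcode;
};

/* Creates an empty list consisting only of an END_OF_LIST node. */
static struct sketch_node *
sketch_make_empty_list(void)
{
   struct sketch_node *n = malloc(sizeof(*n));
   if (n == NULL)
      return NULL;

   n->opcode = SKETCH_OPCODE_END_OF_LIST;

   /* The fix: record the size here, since this path bypasses the
    * allocator that normally fills in the table. */
   sketch_inst_size[SKETCH_OPCODE_END_OF_LIST] = 1;
   return n;
}

/* Deleting a list checks that every opcode it meets has a known size. */
static void
sketch_delete_list(struct sketch_node *n)
{
   assert(n != NULL);
   assert(sketch_inst_size[n->opcode] > 0);
   free(n);
}
```

Without the assignment in `sketch_make_empty_list()`, deleting a freshly generated list as the very first operation would trip the size assertion, which matches the sequence the commit message describes.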
* st/mesa: whitespace/formatting fixes in st_cb_texture.c | Brian Paul | 2019-02-26 | 1 | -32/+58
    Remove trailing whitespace, replace tabs w/ spaces, etc.

    Trivial.