aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* glsl_to_tgsi: fix instruction order for bindless texturesMarek Olšák2017-10-061-4/+14
| | | | | | | | | We emitted instructions loading the bindless handle after the memory instruction. Cc: 17.2 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl_to_tgsi: enable copy propagation for tessellation shadersMarek Olšák2017-10-061-4/+3
| | | | | | just don't propagate output reads Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: Use a 565 format for GL_RGB and GL_UNSIGNED_SHORT_5_6_5 textures.Kenneth Graunke2017-10-051-0/+3
| | | | | | | | | | Found while trying to optimize an application. Not observed to help performance on i965, but should at least reduce the memory usage of such textures a bit. Reviewed-by: Eric Anholt <[email protected]> Tested-by: Eero Tamminen <[email protected]>
* i965: Add Atom graphics names to parse_devid_override()Matt Turner2017-10-041-0/+3
|
* automake: add texcompress_s3tc_tmp.h to the sources listEmil Velikov2017-10-041-0/+1
| | | | | | | | Otherwise it will be missing from the tarball. Fixes: f7daa737d17 ("mesa: Combine libtxc_dxtn sources into texcompress_s3tc_tmp.h") Signed-off-by: Emil Velikov <[email protected]>
* mesa: silence 'variable may be used uninitialized' warning in teximage.cBrian Paul2017-10-031-1/+1
| | | | | | Found with MinGW optimized build. Reviewed-by: Charmaine Lee <[email protected]>
* mesa: silence 'variable may be used uninitialized' warning in bufferobj.cBrian Paul2017-10-031-0/+1
| | | | | | Found with MinGW optimized build. Reviewed-by: Charmaine Lee <[email protected]>
* build: Remove HAVE_DLOPENMatt Turner2017-10-021-4/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Delete now unused dlopen.hMatt Turner2017-10-023-99/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Remove force_s3tc_enable driconf variableMatt Turner2017-10-023-5/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa/st: Drop has_lib_dxtc argument from st_init_extensions()Matt Turner2017-10-024-11/+4
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Drop Mesa_DXTn from gl_contextMatt Turner2017-10-0215-95/+19
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Drop function pointer checks in s3tc codeMatt Turner2017-10-021-133/+60
| | | | | | | Now never null! Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Call DXTn functions directlyMatt Turner2017-10-021-92/+25
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Remove fprintf referring to libdxtnMatt Turner2017-10-021-1/+1
| | | | | | | | | When this file is included by Gallium, the fprintf causes it to fail to compile. This is an unreachable error case, and we shouldn't be calling fprintf directly. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Remove prototypes and mark S3TC functions staticMatt Turner2017-10-021-18/+5
| | | | | | | This file will be #included, so the functions should be static. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Remove commented-out DXTn fetch codeMatt Turner2017-10-021-80/+0
| | | | | | | Has been disabled for 12 years. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Combine libtxc_dxtn sources into texcompress_s3tc_tmp.hMatt Turner2017-10-025-306/+245
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: Import libtxc_dxtn sourcesMatt Turner2017-10-024-0/+1144
| | | | | | | | Imported from master (commit ef07298391c6dcad843e0b13e985090c1dd76e76) of https://cgit.freedesktop.org/~mareko/libtxc_dxtn/ Acked-by: Nicolai Hähnle <[email protected]> Acked-by: Emil Velikov <[email protected]>
* st/mesa: don't use pipe_surface for passing information about EGLImageMarek Olšák2017-10-031-46/+50
| | | | | | | | | | Use st_egl_image instead. radeonsi doesn't like when we create a pipe_surface with PIPE_FORMAT_NV12. This fixes NV12 texturing on radeonsi using kmscube. Cc: 17.1 17.2 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: Implement ARB_indirect_parameters.Plamena Manolova2017-10-024-1/+124
| | | | | | | | | | | We can implement ARB_indirect_parameters for i965 by taking advantage of the conditional rendering mechanism. This works by issuing maxdrawcount draw calls and using conditional rendering to predicate each of them with "drawcount > gl_DrawID" Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Refactor brw_try_draw_prims.Plamena Manolova2017-10-021-117/+119
| | | | | | | | | | | In order to add our ARB_indirect_parameters implementation we need to refactor brw_try_draw_prims so that it operates on a per primitive basis and move the loop into brw_draw_prims. This commit refactors the brw_try_draw_prims function and renames it to brw_draw_single_prim. Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Indroduce brw_finish_drawing.Plamena Manolova2017-10-021-7/+14
| | | | | | | | | | | In order to add our ARB_indirect_parameters implementation we need to refactor brw_try_draw_prims so that it operates on a per primitive basis and move the loop into brw_draw_prims. This commit introduces the brw_finish_drawing function where we move the code that executes once after the loop. Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Introduce brw_prepare_drawing.Plamena Manolova2017-10-021-19/+27
| | | | | | | | | | | In order to add our ARB_indirect_parameters implementation we need to refactor brw_try_draw_prims so that it operates on a per primitive basis and move the loop into brw_draw_prims. This commit introduces the brw_prepare_drawing function where we move the code that executes once before the loop. Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: use R10G10B10X2 format where applicableNicolai Hähnle2017-10-021-3/+8
| | | | | | | | This is the last step of fixing dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev for radeonsi. Reviewed-by: Marek Olšák <[email protected]>
* mesa/main: R10G10B10_(A2) formats are not color renderable in ESNicolai Hähnle2017-10-021-2/+5
| | | | | | | | | | | | | The EXT_texture_type_2_10_10_10_REV (ES only) states the following issue: "1. Should textures specified with this type be renderable? UNRESOLVED: No. A separate extension could provide this functionality." This partially fixes dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.{rgb,rgba}_unsigned_int_2_10_10_10_rev Reviewed-by: Marek Olšák <[email protected]>
* mesa/main: select the R10G10B10X2_UNORM internal format based on data typeNicolai Hähnle2017-10-021-1/+3
| | | | | | | ES requires it. This is a partial fix for dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev Reviewed-by: Marek Olšák <[email protected]>
* glsl: do not set the 'smooth' qualifier by default on ES shadersNicolai Hähnle2017-10-021-1/+7
| | | | | | | | | | | | It leads to surprising states with integer inputs and outputs on vertex processing stages (e.g. geometry stages). Instead, rely on the driver to choose smooth interpolation by default. We still allow varyings to match when one stage declares it as smooth and the other declares it without interpolation qualifiers. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* i965: skip reading unused slots at the begining of the URB for the FSIago Toral Quiroga2017-10-021-10/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can start reading the URB at the first offset that contains varyings that are actually read in the URB. We still need to make sure that we read at least one varying to honor hardware requirements. This helps alleviate a problem introduced with 99df02ca26f61 for separate shader objects: without separate shader objects we assign locations sequentially, however, since that commit we have changed the method for SSO so that the VUE slot assigned depends on the number of builtin slots plus the location assigned to the varying. This fixed layout is intended to help SSO programs by avoiding on-the-fly recompiles when swapping out shaders, however, it also means that if a varying uses a large location number close to the maximum allowed by the SF/FS units (31), then the offset introduced by the number of builtin slots can push the location outside the range and trigger an assertion. This problem is affecting at least the following CTS tests for enhanced layouts: KHR-GL45.enhanced_layouts.varying_array_components KHR-GL45.enhanced_layouts.varying_array_locations KHR-GL45.enhanced_layouts.varying_components KHR-GL45.enhanced_layouts.varying_locations which use SSO and the the location layout qualifier to select such location numbers explicitly. This change helps these tests because for SSO we always have to include things such as VARYING_SLOT_CLIP_DIST{0,1} even if the fragment shader is very unlikely to read them, so by doing this we free builtin slots from the fixed VUE layout and we avoid the tests to crash in this scenario. Of course, this is not a proper fix, we'd still run into problems if someone tries to use an explicit max location and read gl_ViewportIndex, gl_LayerID or gl_CullDistancein in the FS, but that would be a much less common bug and we can probably wait to see if anyone actually runs into that situation in a real world scenario before making the decision that more aggresive changes are required to support this without reverting 99df02ca26f61. v2: - Add a debug message when we skip clip distances (Ilia) - we also need to account for this when we compute the urb setup for the fragment shader stage, so add a compiler util to compute the first slot that we need to read from the URB instead of replicating the logic in both places. v3: - Make the util more generic so it can account for all unused slots at the beginning of the URB, that will make it more useful (Ken). - Drop the debug message, it was not what Ilia was asking for. Suggested-by: Kenneth Graunke <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st/glsl_to_tgsi: use LDEXP when availableNicolai Hähnle2017-09-291-3/+7
| | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* st/glsl_to_tgsi: fix conditional assignments to packed shader outputsNicolai Hähnle2017-09-291-1/+9
| | | | | | | | | | | | | | | | Overriding the default (no-op) swizzle is clearly counter-productive, since the whole point is putting the destination register as one of the source operands so that it remains unmodified when the assignment condition is false. Fragment depth and stencil outputs are a special case due to how their source swizzles are manipulated in translate_src when compiling to TGSI. Fixes dEQP-GLES2.functional.shaders.conditionals.if.*_vertex Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* st/glsl_to_tgsi: fix a use-after-free in merge_two_dstsNicolai Hähnle2017-09-291-1/+2
| | | | | | | | | | | | | Found by address sanitizer. The loop here tries to be safe, but in doing so, it ends up doing exactly the wrong thing: the safe foreach is for when the loop variable (inst) could be deleted and nothing else. However, this particular can delete inst's successor, but not inst itself. Fixes: 8c6a0ebaad72 ("st/mesa: add st fp64 support (v7.1)") Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* i965/link: Use prog->nir instead of creating a temporaryJason Ekstrand2017-09-281-4/+3
| | | | | | | | | This way, when NIR_PASS_V makes a clone of the shader (for testing nir_clone), the new and lowered version gets re-assigned to prog->nir. [[email protected]: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/link: Make more use of NIR_PASSJason Ekstrand2017-09-281-6/+6
| | | | | | [[email protected]: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/link: Make better use of temporary variablesJason Ekstrand2017-09-281-4/+5
| | | | | | | | | | | The way NIR_PASS works (and, by extension, nir_optimize) is that they may clone the shader and throw the old one away. (We use this for testing nir_clone.) It's better if we just make a temporary variable, use it for everything, and re-assign to the gl_program at the end. [[email protected]: Tested NIR_TEST_CLONE=1 with valgrind] Tested-by: Jordan Justen <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* st/mesa: don't call close() on WindowsBrian Paul2017-09-281-0/+2
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: fix texture updates for ATI_fragment_shaderMarek Olšák2017-09-281-3/+5
| | | | | | Cc: 17.1 17.2 <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: enable up to 32 inputs for geometry shaders in gen8+Iago Toral Quiroga2017-09-281-1/+2
| | | | | | | | | | | | | | | | | | | | | We have been exposing only 16 since 1e3e72e3054de with arguments based on register pressure and the number of available GRFs, however, our scalar backend will always limit the number of push registers for GS threads to 24 and fallback to pull model for anything else, so there is really no reason to lower the number under those arguments. By bumping this up to 32 we make it the same as all the other stages, which is a nice feature to have that can help applications in some cases (I recently fixed a bug in CTS that assumed that the number of input locations in a stage matches the number of output locations in the previous stage for example). Pre-gen8, we use the vector backend and push model, so in that case the arguments in 1e3e72e3054de are still valid. v2: check if we have scalar GS instead of the hw gen to enable this (Ken). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Convert brw->*_program into a brw->programs[i] array.Kenneth Graunke2017-09-2622-126/+147
| | | | | | This makes it easier to loop over programs. Reviewed-by: Alejandro Piñeiro <[email protected]>
* i965: make use of nir linkingTimothy Arceri2017-09-261-0/+56
| | | | | | | | | | | | | | | | | | | | | For now linking is just removing unused varyings between stages. shader-db results BDW: total instructions in shared programs: 13198288 -> 13191693 (-0.05%) instructions in affected programs: 48325 -> 41730 (-13.65%) helped: 473 HURT: 0 total cycles in shared programs: 541184926 -> 541159260 (-0.00%) cycles in affected programs: 213238 -> 187572 (-12.04%) helped: 435 HURT: 8 V2: - lower indirects on demoted inputs as well as outputs. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: call brw_shader_gather_info() from the callers of brw_create_nir()Timothy Arceri2017-09-262-7/+18
| | | | | | | This will allow us to insert a nir linking step in brw_link_shader(). Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965: create a brw_shader_gather_info() helperTimothy Arceri2017-09-262-7/+16
| | | | | | | | This will help us call gather info at a later point and allow us to do some linking in nir. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965: Rename do_flush_locked to submit_batch().Kenneth Graunke2017-09-251-3/+4
| | | | | | | do_flush_locked isn't a great name - especially given that there's no locking going on in our code relating to execbuf. Reviewed-by: Chris Wilson <[email protected]>
* i965: Use atomic ops in get_new_program_id().Kenneth Graunke2017-09-252-6/+1
| | | | | | | | | | | | | We have a nice utility function for this, which eliminates the need for locking stuff. This isn't really performance critical, but it's less code to use the atomic. p_atomic_inc_return does pre-increment rather than post-increment, so we change screen->program_id to be initialized to 0 instead of 1. At which point, we can just delete the initialization because intel_screen is rzalloc'd. Reviewed-by: Chris Wilson <[email protected]>
* i965: Convert brw_bufmgr to use C11 mutexes instead of pthreads.Kenneth Graunke2017-09-251-18/+17
| | | | | | | There's no real advantage or disadvantage here, it's just for stylistic consistency with the rest of the codebase. Reviewed-by: Chris Wilson <[email protected]>
* i965: Delete dead meta stencil blit program fields from brw_context.Kenneth Graunke2017-09-251-3/+0
| | | | These have been unused for a while now.
* i965: Force outputs_written to contain varyings needed by stream-out.Kenneth Graunke2017-09-211-3/+6
| | | | | | | | | | | | | If transform feedback is recording a varying, it needs a slot in the VUE map, regardless of whether or not the shader writes it. Together with the previous patch, this fixes: - KHR-GL45.enhanced_layouts.xfb_capture_struct The test captures a structure where the vertex shader writes the first and third members - but the second still needs a slot. Reviewed-by: Juan A. Suarez Romero <[email protected]>
* i965: Compute VS/GS output VUE map from the NIR info.Kenneth Graunke2017-09-212-2/+2
| | | | | | | | | | unify_interfaces() only updates the NIR program info, not the copy in the gl_program itself. So, by using the old copy, we were missing out on these updates. The TCS/TES ones already did this correctly. Reviewed-by: Juan A. Suarez Romero <[email protected]>
* i965: Fix brw_finish_batch to grow the batchbuffer.Kenneth Graunke2017-09-211-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | brw_finish_batch emits commands needed at the end of every batch buffer, including any workarounds. In the past, we freed up some "reserved" batch space before calling it, so we would never have to flush during it. This was error prone and easy to screw up, so I deleted it a while back in favor of growing the batch. There were two problems: 1. We're in the middle of flushing, so brw->no_batch_wrap is guaranteed not to be set. Using BEGIN_BATCH() to emit commands would cause a recursive flush rather than growing the buffer as intended. 2. We already recorded the throttling batch before growing, which replaces brw->batch.bo with a different (larger) buffer. So growing would break throttling. These are easily remedied by shuffling some code around and whacking brw->no_batch_wrap in brw_finish_batch(). This also now includes the final workarounds in the batch usage statistics. Found by inspection. Fixes: 2c46a67b4138631217141f (i965: Delete BATCH_RESERVED handling.) Reviewed-by: Chris Wilson <[email protected]>
* i965: Move MI_BATCHBUFFER_END handling into brw_finish_batch().Kenneth Graunke2017-09-211-7/+7
| | | | | | This is, by definition, finishing the batch. Reviewed-by: Chris Wilson <[email protected]>