summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* i965: Implement rasterizer discard via SOL unless required for queries.Kenneth Graunke2016-06-232-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | We currently use CL_INVOCATION_COUNT for the GL_PRIMITIVES_GENERATED query, which involves passing all primitives to the clipper. When rasterizer discard is enabled, we program the clipper in REJECT_ALL mode, rather than using the SOL stage's "Rendering Disable" feature. See commit f09b91f78247409f54c975f56cb10d5f350fe64e for an explanation of why we implement GL_PRIMITIVES_GENERATED this way. Apparently the SOL stage's "Rendering Disable" feature is a lot faster than having the clipper reject all primitives. It's safe to use when no GL_PRIMITIVES_GENERATED query is active, as we don't care about CL_INVOCATION_COUNT incrementing. This patch makes us use SO_RENDERING_DISABLE when no query is active, but continues falling back to the clipper in REJECT_ALL mode when the queries are enabled. It brings back the perf_debug for the clipper case (which I removed in commit 1f9445ff57b, thinking it wasn't useful). Improves performance in Gl32GSCloth by 84.8303% +/- 2.07132% (n = 10) on my Broadwell GT2 laptop. Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Combine 3DSTATE_STREAMOUT emitters and genX_sol_state atoms.Kenneth Graunke2016-06-234-99/+37
| | | | | | | | | | | They're basically the same. Let's avoid the code duplication. v2: Fix SO_BUFFER_ENABLE stuff to only happen on Gen < 8 (caught by Jason Ekstrand). Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Copy propagate before doing variable index lowering.Kenneth Graunke2016-06-231-0/+2
| | | | | | | | | | | | | | | | | | | | | The scalar backend currently doesn't support variable indexing on temporary arrays, but it does support it on uniform arrays, and some stages support it for input arrays. Make sure these are propagated through before exploding indirects into piles of if-ladders unnecessarily. On Broadwell, no instruction count change in shader-db. total cycles in shared programs: 80675652 -> 80674928 (-0.00%) cycles in affected programs: 649972 -> 649248 (-0.11%) helped: 386 HURT: 165 This will help avoid code quality regressions in a future commit. Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965/blorp: Disable vertex element swizzlingTopi Pohjolainen2016-06-232-4/+18
| | | | | | | | Without vertex elements originating directly from vertex fetcher are not passed to wm-state correctly. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Let program data tell if push constants are neededTopi Pohjolainen2016-06-233-15/+35
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Use prog data counters to guide wm/ps setupTopi Pohjolainen2016-06-233-3/+8
| | | | | | | just as core upload logic does. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Use prog data counters to guide sf/sbe setupTopi Pohjolainen2016-06-235-9/+37
| | | | | | | just as core upload logic does. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Avoid division by zero.Ardinartsev Nikita2016-06-231-11/+15
| | | | | | | | | Fixes regression introduced by af5ca43f2676bff7499f93277f908b681cb821d0 Cc: "12.0 11.2" <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419
* glsl/mesa: stop duplicating geom and tcs layout valuesTimothy Arceri2016-06-235-41/+40
| | | | | | | | | | | | We already store these in gl_shader and gl_program here we remove it from gl_shader_program and just use the values from gl_shader. This will allow us to keep the shader cache restore code as simple as it can be while making it somewhat clearer where these values originate from. Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl/mesa: stop duplicating tes layout valuesTimothy Arceri2016-06-233-24/+28
| | | | | | | | | | | | | | | We already store this in gl_shader and gl_program here we remove it from gl_shader_program and just use the values from gl_shader. This will allow us to keep the shader cache restore code as simple as it can be while making it somewhat clearer where these values originate from. V2: remove unnecessary NULL check Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral <[email protected]>
* st/mesa: expose EXT_vertex_array_bgra when supported by backendChristian Gmeiner2016-06-221-1/+2
| | | | | | Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965/blorp: Only set src_z for gen8+ 3D texturesJason Ekstrand2016-06-221-2/+9
| | | | | | | | Otherwise, we end up with a bogus value in the third component. On gen6-7 where we always use 2D textures, this can cause problems if the SurfaceArray bit is set in the SURFACE_STATE. Acked-by: Chad Versace <[email protected]>
* i965/gen7,8: Set SURFACE_IS_ARRAY for all non-3D texture typesJason Ekstrand2016-06-222-2/+2
| | | | | | | | | There's no real reason why we shouldn't set this bit. It does affect how the sampler operates a bit but since you can have a 2D non-array view of a 2D_ARRAY texture that distinction is very weak. Also, this is what ISL will do and we would like this change to be isolated from using ISL. Reviewed-by: Chad Versace <[email protected]>
* i965/gen4: Subtract 1 from buffer sizesJason Ekstrand2016-06-221-3/+3
| | | | | | | | | The PRM states that the values put in Width, Height, and Depth should be various bits from the value size - 1. We seem to have done this wrong more-or-less from the start. Reviewed-by: Chad Versace <[email protected]> Cc: "11.1 11.2 12.0" <[email protected]>
* i965: Remove fake W-tiled render target supportJason Ekstrand2016-06-223-47/+9
| | | | | | | This hasn't been used since 1cfb4bc890b8 where we deleted the meta stencil blit path. Reviewed-by: Chad Versace <[email protected]>
* i965/fs: Use a default Y coordinate of 0 for TXF on gen9+Jason Ekstrand2016-06-221-0/+2
| | | | | | | | | | | Previously, we were incrementing length but not actually putting anything in the Y coordinate. This meant that 1-D TXF operations had a garbage array index. If the surface is emitted as 1-D non-array, the coordinate gets discarded and it works fine. If it happens to be bound as an array surface, it may count as an out-of-bounds array access and you get zero. Reviewed-by: Ian Romanick <[email protected]> Cc: "11.1 11.2 12.0" <[email protected]>
* i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCHJason Ekstrand2016-06-221-2/+2
| | | | | | Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Cc: "11.1 11.2 12.0" <[email protected]>
* i965/blorp/gen8: Use the correct max level and layer in emit_surface_statesJason Ekstrand2016-06-221-3/+2
| | | | | | | | | We were adding in the base which is wrong because the values given in the miptree are relative to zero and not the base layer/level. Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Cc: "11.1 11.2 12.0" <[email protected]>
* i965: Drop the maximum 3D texture size to 512 on Sandy BridgeJason Ekstrand2016-06-221-1/+10
| | | | | | | | | | | | | | The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be set to the depth of a 3-D texture when rendering. Unfortunatley, that field is only 9 bits on Sandy Bridge and prior so we can't actually bind a 3-D texturing for rendering if it has depth > 512. On Ivy Bridge, this field was bumpped to 11 bits so we can go all the way up to 2048. On Iron Lake and prior, we don't support layered rendering and we use OffsetX/Y hacks to render to particular layers so 2048 is ok there too. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: "11.1 11.2 12.0" <[email protected]>
* i965/gen4-6: Handle gl_texture_object::BaseLevel and MinLayer correctlyJason Ekstrand2016-06-221-1/+3
| | | | | | | | | This is basically a direct translation of what we do for gen7. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036 Cc: "11.1 11.2 12.0" <[email protected]>
* i965/gen4: Pull texture formats from the texture object not the miptreeJason Ekstrand2016-06-221-1/+1
| | | | | | | | | | | This makes texture views sort-of work. It doesn't add full texture view support for gen4-5 but it is enough to fix the GL_ARB_copy_image formats piglit test on Iron Lake. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036 Cc: "11.1 11.2 12.0" <[email protected]>
* i965: Fix point size with tessellation/geometry shaders in GLES.Kenneth Graunke2016-06-224-9/+59
| | | | | | | | | | | | | Our previous code worked for desktop GL, and ES without geometry or tessellation shaders. But those features require fancier point size handling. Fortunately, we can use one rule for all APIs. Fixes a number of dEQP tests with EXT_tessellation_shader enabled: dEQP-GLES31.functional.tessellation_geometry_interaction.point_size.* Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: move vs outputs written into a helperTimothy Arceri2016-06-222-31/+47
| | | | | | | We will reuse this for fs key generation for the on disk shader cache. Reviewed-by: Iago Toral Quiroga <[email protected]>
* st/mesa: use a single memcpy in st_ReadPixels when possibleNicolai Hähnle2016-06-221-8/+15
| | | | | | | | This avoids costly address recomputations, function overhead, and may trigger large copy optimizations. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965: Reorganize prog_data->total_scratch code a bit.Kenneth Graunke2016-06-211-16/+19
| | | | | | Cc: "12.0" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* st/mesa: cache staging texture for glReadPixelsNicolai Hähnle2016-06-214-14/+110
| | | | | | v2: add ST_DEBUG flag for disabling (suggested by Ilia) Reviewed-by: Marek Olšák <[email protected]> (v1)
* st/mesa: invalidate readpixels cacheNicolai Hähnle2016-06-2112-0/+17
| | | | | | | Whenever a draw happens or some other function call might change the result of future glReadPixels calls, we must invalidate the cache. Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: add readpix_cache structureNicolai Hähnle2016-06-213-0/+22
| | | | Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: move ReadPixels blit into a separate functionNicolai Hähnle2016-06-211-52/+78
| | | | Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: flush bitmap cache before CopyImageSubDataNicolai Hähnle2016-06-211-0/+3
| | | | | | | Found by inspection. Cc: 11.2 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: flush bitmap cache before texture functionsNicolai Hähnle2016-06-212-0/+12
| | | | | | | | | | | As far as I can tell, a sequence of glBitmap followed by texture functions that refer to a texture bound as the framebuffer is well within what should be allowed. Found by inspection. Cc: 11.2 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: flush bitmap cache before compute dispatchNicolai Hähnle2016-06-211-0/+3
| | | | | | | | | | In the unlikely case that a program uses glBitmap to render to a framebuffer whose texture is bound in a compute shader. Found by inspection. Cc: 11.2 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965: get PrimitiveMode from the program rather than the shader structTimothy Arceri2016-06-211-3/+2
| | | | | | | This is more consistent with what we do elsewhere and will allow us to only cache one of the values in the shader cache. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Fix multiplication of immediates on Cherryview/Broxton.Kenneth Graunke2016-06-201-1/+4
| | | | | | | | | | | | | | | | | | | | Cherryview and Broxton don't support DW x DW multiplication. We have piles of code to handle this, but apparently weren't retyping in the immediate case. For example, tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes makes the simulator angry about instructions such as: mul(8) r18<1>:D r10.0<8;8,1>:D 0x00000003:D Just retype to W or UW. It should be safe on all platforms. Cc: "12.0" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Delete redundant extension enablesIan Romanick2016-06-201-9/+0
| | | | | | | A nearly identical block already exists in the gen >= 6 block above. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Fix incorrect "see also" commentsIan Romanick2016-06-201-1/+1
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Silence unused parameter warningIan Romanick2016-06-201-1/+1
| | | | | | | | | | main/pipelineobj.c: In function ‘delete_pipelineobj_cb’: main/pipelineobj.c:110:30: warning: unused parameter ‘id’ [-Wunused-parameter] delete_pipelineobj_cb(GLuint id, void *data, void *userData) ^ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: add support for GL_EXT_window_rectanglesIlia Mirkin2016-06-1811-1/+176
| | | | | | | | Make sure to pass the requisite information in draws, blits, and clears that work on the context's draw buffer. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: add GL_EXT_window_rectangles state storage/retrieval functionalityIlia Mirkin2016-06-188-3/+120
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* glapi: add GL_EXT_window_rectangles entrypointsIlia Mirkin2016-06-183-0/+16
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* scons: put the generated git_sha1.h file in top-level src/ directoryBrian Paul2016-06-171-43/+2
| | | | | | | | | | To match what's done in the automake build. v2: Use git rev-parse to get a 10-character hash ID Fix Python imports Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* mesa: remove remaining tabs in api_validate.cTimothy Arceri2016-06-171-11/+11
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXTSamuel Iglesias Gonsálvez2016-06-171-3/+25
| | | | | | | | | | | | | | | | | | | | From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions, page 844: "When source or destination datatype is 64b or operation is integer DWord multiply, indirect addressing must not be used." v2: - Fix it for Broxton too. v3: - Simplify code by using subscript() and not creating a new num_components variable (Kenneth). Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "12.0" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXTIago Toral Quiroga2016-06-171-0/+25
| | | | | | | | | | | | | | | | | | | | | | | From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions: "When source or destination is 64b (...), regioning in Align1 must follow these rules: 1. Source and destination horizontal stride must be aligned to the same qword. (...)" v2: - Fix it for Broxton too. v3: - Remove inst->regs_written change as it is not necessary (Ken) Cc: "12.0" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Tested-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix comment about CS scratch space encodings on Broadwell+.Kenneth Graunke2016-06-161-1/+1
| | | | | | I typo'd this. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Drop perf_debug about rasterizer discard in SOL vs. clipper.Kenneth Graunke2016-06-161-3/+5
| | | | | | | | | | | I recently experimented with performing rasterizer discard in the SOL unit instead of the clipper, and as far as I can tell, it's basically the same performance. The clipper comes directly after SOL anyway, and setting the clipper to REJECT_ALL should be pretty darn cheap. Keep the perf_debug on Sandybridge, where the GS actually does work. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Enable GL_ARB_ES3_1_compatibility on Gen8+ if CS are available.Kenneth Graunke2016-06-161-1/+3
| | | | | | | | | | | | There are almost no tests in any test suite, but what little I've found seems to work. Ilia believes everything is in place. v2: Predicate the enable on ES 3.1 being available (Gen8+) and also ARB_compute_shader being available (requested by Ilia). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* mesa: If validation fails in a debug context just emit a debug messageIan Romanick2016-06-161-2/+15
| | | | | | | | | | | | | There are quite a few pipelines that desktop applications (including a bunch of piglit test) can expect to have run but don't meet the GLES requirements. Instead of failing validation, just emit a debug message. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Cc: "12.0" <[email protected]> Cc: Gregory Hainaut <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa/main: Update _mesa_new_shader.Jose Fonseca2016-06-161-1/+1
| | | | | | | Left over from 31dee99e052902bc08ddbb1009748dc982ac3211. It should fix Clang Windows build. Trivial.
* i965: remove remaining tabs in brw_link.cppTimothy Arceri2016-06-161-13/+13
| | | | Acked-by: Kenneth Graunke <[email protected]>