path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* i965: Prevent coordinate overflow in intel_emit_linear_blitChris Wilson2015-09-011-38/+34
Fixes regression from

commit 8c17d53823c77ac1c56b0548e4e54f69a33285f1
Author: Kenneth Graunke <[email protected]>
Date: Wed Apr 15 03:04:33 2015 -0700

    i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions.

which adjusted the coordinates to be relative to the nearest cacheline. However, this offsets the coordinates by up to 63, which may cause them to overflow the BLT limits. For the well-aligned large transfer case, we can use 32bpp pixels and so reduce the coordinates by a factor of 4 (versus the current 8bpp pixels). We also have to be more careful doing the last line, just in case it exceeds the coordinate limit.

Reported-and-tested-by: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90734
Signed-off-by: Chris Wilson <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Cc: Ian Romanick <[email protected]>
Cc: Anuj Phogat <[email protected]>
Cc: [email protected]
Reviewed-by: Anuj Phogat <[email protected]>
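To make the overflow concrete, here is a minimal sketch of the coordinate arithmetic the message describes. The limit `BLT_MAX_COORD`, the helper names, and the single-row model are assumptions for illustration; this is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration only: largest coordinate the blitter accepts. */
#define BLT_MAX_COORD 32767u
#define CACHELINE     64u

/* Rightmost x coordinate (in pixels) of a single-row copy of `size` bytes
 * starting at byte `offset`, with x measured from the enclosing cacheline
 * boundary and each pixel `cpp` bytes wide (1 = 8bpp, 4 = 32bpp). */
static uint32_t
linear_blit_end_x(uint32_t offset, uint32_t size, uint32_t cpp)
{
   assert(cpp == 1 || cpp == 4);
   uint32_t start = offset % CACHELINE;   /* adds up to 63 bytes */
   /* 32bpp pixels are only usable when everything is 4-byte aligned. */
   assert(cpp == 1 || (start % 4 == 0 && size % 4 == 0));
   return (start + size) / cpp;
}

static bool
fits_blt_limit(uint32_t offset, uint32_t size, uint32_t cpp)
{
   return linear_blit_end_x(offset, size, cpp) <= BLT_MAX_COORD;
}
```

Under these assumptions, a 32764-byte copy starting 60 bytes into a cacheline exceeds the limit with 8bpp pixels but ends at x = 8206 with 32bpp pixels.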
* i965/nir: enable the dead control flow optimizationConnor Abbott2015-09-011-0/+2
total instructions in shared programs: 7541551 -> 7541381 (-0.00%)
instructions in affected programs: 3054 -> 2884 (-5.57%)
helped: 29

Reviewed-by: Kenneth Graunke <[email protected]>
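The percentages in shader-db summaries like this one are relative changes in instruction count. A small sketch of that arithmetic follows; the function name is hypothetical and this is not shader-db's own code.

```c
#include <assert.h>

/* Relative change between two instruction counts, in percent.
 * Negative values mean fewer instructions after the change. */
static double
percent_change(unsigned long before, unsigned long after)
{
   assert(before > 0);
   return ((double)after - (double)before) * 100.0 / (double)before;
}
```

For the affected programs above, 3054 -> 2884 is a drop of 170 instructions, about -5.57%. The change across all shared programs is roughly -0.002%, which rounds to the reported -0.00%.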
* i965: advertise ASTC support for SkylakeNanley Chery2015-08-311-0/+5
| | | | | | | v2: remove OES ASTC extension reference. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* i965/fs: Use greater-equal cmod to implement maximum.Matt Turner2015-08-312-4/+6
The docs specifically call out SEL with .l and .ge as the implementations of MIN and MAX respectively. Among other things, SEL with these conditional mods is commutative. See commit 3b7f683f.

Reviewed-by: Jordan Justen <[email protected]>
* i965/chv|skl: Apply sampler bypass w/aBen Widawsky2015-08-312-0/+15
Certain compressed formats require this setting. The docs don't go into much detail as to why it's needed exactly.

This patch introduces no piglit regressions on gen9 (bsw is untested). Note that the SKL "regressions" are fixed tests, and the egl_khr_gl_colorspace tests are WTF. The patch also fixes nothing I can find.

http://otc-mesa-ci.jf.intel.com/job/Leeroy/127820/

v2: Reworded commit message (Matt); Added piglit results link. Restructured condition (Matt). Moved check out to function (Nanley). I left the setting of the bit in the surface state open coded because it seems to go better with the existing code.

v3: Use an inline function only in gen8_emit_texture_surface_state() (Matt).

Cc: Matt Turner <[email protected]>
Cc: Nanley Chery <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Remove fs_visitor::try_replace_with_sel().Matt Turner2015-08-283-92/+0
| | | | | | | No shader-db changes on g4x, snb, hsw, or bdw. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Replace awful variable names.Matt Turner2015-08-281-40/+40
start_to -> dst_start
end_to -> dst_end
start_from -> src_start
end_from -> src_end
var_to -> dst_var
var_from -> src_var
reg_to -> dst_reg
reg_to_offset -> dst_reg_offset
reg_from -> src_reg

Not sure how these made sense to me before.

Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Skip blocks in register coalescing interference check.Matt Turner2015-08-281-14/+20
| | | | | | | | No need to walk through instructions in blocks we know don't contain our registers' live ranges. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Improve register coalescing interference check.Matt Turner2015-08-281-8/+11
I always thought that the is_control_flow() -> return false check was a bad hack, and some previous attempts to remove it have failed and have been reverted.

The previous two patches fix some problems that caused register coalescing to not notice some interference between registers, which the is_control_flow() check apparently works around.

With that fixed, we can calculate interference more accurately.

total instructions in shared programs: 6261319 -> 6257917 (-0.05%)
instructions in affected programs: 346282 -> 342880 (-0.98%)
helped: 1552

Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use overwrites_reg() instead of dst.equals().Matt Turner2015-08-281-2/+2
equals() returns false for registers with different types, so using it isn't appropriate to determine whether an instruction is overwriting a register.

Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Only consider fixed_hw_reg in equals() if file is HW_REG/IMM.Matt Turner2015-08-282-3/+6
Noticed when debugging things that lead to the next patch.

On G45 (and presumably ILK) this helps register coalescing:

total instructions in shared programs: 4077373 -> 4077340 (-0.00%)
instructions in affected programs: 43751 -> 43718 (-0.08%)
helped: 52
HURT: 2

Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Do not set the size for zero-size uniformsMarta Lofstedt2015-08-281-3/+4
Zero-sized uniforms can exist in the list, but they don't get any space allocated in prog_data->params or in the param_size array, so the size should not be set for them.

This was previously fixed in commit 781dc7c0e1f41502f18e07c0940af949a78d2792. However, commit 259f7291de2387aa3ac5f856b39b7b934a1d8e7d removed the fix.

Signed-off-by: Marta Lofstedt <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
* i965/nir: Make use of nir_opt_undefBoyan Ding2015-08-271-0/+2
Shader-db result on Ivy Bridge:

total instructions in shared programs: 145484 -> 145445 (-0.03%)
instructions in affected programs: 225 -> 186 (-17.33%)
helped: 5
HURT: 0

Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
Signed-off-by: Boyan Ding <[email protected]>
* i965/fs: Split VGRFs after lowering pull constantsJason Ekstrand2015-08-271-2/+2
| | | | | | | | | | The split_virtual_grfs code doesn't properly rewrite reladdr so we need to make sure that any uniform indirects are lowered away first. This fixes the glsl-fs-uniform-indexed-by-swizzled-vec4.shader_test in piglit Cc: "10.6" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Refactor assign_constant_locationsJason Ekstrand2015-08-271-46/+40
| | | | | | | | | Now that all constant locations are assigned in a single function, we can refactor it a bit to unify things. In particular, we now handle pull_constant_loc and push_constant_loc more similarly and we only modify stage_prog_data->params[] in one place at the end of the function. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Rename INTEL_DEBUG=vec4vs to INTEL_DEBUG=vec4.Kenneth Graunke2015-08-271-1/+1
| | | | | | | | | | | | | | | driParseDebugString() doesn't have actual code to parse comma separated lists (or any other supported options?); instead it dumbly uses strstr(). This means that INTEL_DEBUG="vec4vs" will trigger both DEBUG_VEC4VS and DEBUG_VS, as "vs" is also a substring. We should probably improve the driconf parsing, but for now, just rename the option so it's usable in the meantime. Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kristian Høgsberg <[email protected]>
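The substring problem described here is easy to reproduce. The sketch below is not `driParseDebugString()` itself: the flag values, option tables, and function name are hypothetical, and only the use of `strstr()` is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values, for illustration only. */
#define DEBUG_VS     (1u << 0)
#define DEBUG_VEC4VS (1u << 1)
#define DEBUG_VEC4   (1u << 2)

struct debug_option {
   const char *name;
   uint32_t flag;
};

/* Substring matching: every option whose name occurs anywhere in the
 * string is enabled, regardless of commas or other separators. */
static uint32_t
parse_debug_substring(const char *str,
                      const struct debug_option *opts, size_t count)
{
   uint32_t flags = 0;
   for (size_t i = 0; i < count; i++) {
      if (strstr(str, opts[i].name))
         flags |= opts[i].flag;
   }
   return flags;
}

/* Option names before and after the rename. */
static const struct debug_option old_options[] = {
   { "vs", DEBUG_VS }, { "vec4vs", DEBUG_VEC4VS },
};
static const struct debug_option new_options[] = {
   { "vs", DEBUG_VS }, { "vec4", DEBUG_VEC4 },
};
```

With the old table, the string "vec4vs" sets both flags because "vs" is a substring of it. With the renamed option, "vec4" no longer contains "vs", so only one flag is set.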
* i965: refactor miptree alignment calculation codeNanley Chery2015-08-261-55/+30
| | | | | | | | | | | | Remove redundant checks and comments by grouping our calculations for align_w and align_h wherever possible. v2: reintroduce brw. don't include functional changes. don't adjust function parameters or create a new function. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* i965: change the meaning of cpp for compressed texturesNanley Chery2015-08-264-35/+15
An ASTC block takes up 16 bytes for all block width and height configurations. This size is not integrally divisible by all ASTC block widths. Therefore cpp is changed to mean bytes per block if the texture is compressed.

Because the original definition was bytes per block divided by block width, all references to the mipmap width must be divided by the block width. This keeps the address calculation formulas consistent. For example, the units for miptree_level x_offset and miptree total_width have changed from pixels to blocks.

v2: reuse preexisting ALIGN_NPOT macro located in an i965 driver file.
v3: move ALIGN_NPOT into separate commit. simplify cpp assignment in copy_image_with_blitter(). update miptree width and offset variables in: intel_miptree_copy_slice(), intel_miptree_map_gtt(), and brw_miptree_layout_texture_3d().

Reviewed-by: Anuj Phogat <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
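A short sketch of the address arithmetic implied by the new meaning of cpp. The helper names are hypothetical; the 16-byte block size comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define ASTC_BYTES_PER_BLOCK 16u   /* stated in the commit message */

/* Number of blocks needed to cover `pixels` texels along one axis. */
static uint32_t
blocks_across(uint32_t pixels, uint32_t block_dim)
{
   assert(block_dim > 0);
   return (pixels + block_dim - 1) / block_dim;
}

/* Bytes in one row of blocks, with cpp meaning bytes per block. */
static uint32_t
compressed_row_bytes(uint32_t width_px, uint32_t block_width, uint32_t cpp)
{
   return blocks_across(width_px, block_width) * cpp;
}
```

For a 17-pixel-wide level with 5-wide blocks this gives 4 blocks and 64 bytes per block row. The old definition would have needed 16 / 5 bytes per pixel, which is not an integer.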
* i965: correct mt->align_h for 2D textures on SkylakeNanley Chery2015-08-261-3/+8
In agreement with commit 4ab8d59a23, vertical alignment values are equal to four times the block height on Gen9+.

v2: add newlines to separate declarations, statements, and comments.

Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
* i965: use ALIGN_NPOT for setting ASTC mipmap layoutsNanley Chery2015-08-262-15/+15
| | | | | | | | | | ALIGN is changed to ALIGN_NPOT because alignment values are sometimes not powers of two when working with ASTC. v2: handle texture arrays and LDR-only systems. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/macros: move ALIGN_NPOT to macros.hNanley Chery2015-08-261-6/+0
| | | | | | | | | | | Aligning with a non-power-of-two number is a general task that can be used in various places. This commit is required for the next one. v2: add greater than 0 assertion (Anuj). convert the macro to a static inline function. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/macros: add power-of-two assertions for alignment macrosNanley Chery2015-08-261-1/+1
The comments for ALIGN and ROUND_DOWN_TO both require that the alignment value passed into the macro be a power of two. Using software assertions verifies this to be the case.

v2: use static inline functions instead of gcc-specific statement expressions (Brian).
v3: fix indentation (Brian).
v4: add greater than zero requirement (Anuj).

Reviewed-by: Anuj Phogat <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
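The difference between the power-of-two and non-power-of-two alignment helpers discussed in these commits can be sketched as follows. The lowercase function names are hypothetical stand-ins, not the definitions in macros.h.

```c
#include <assert.h>
#include <stdint.h>

static inline int
is_power_of_two(uintptr_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Round up to a power-of-two alignment using a bit mask.  The mask
 * trick is only valid for powers of two, hence the assertion. */
static inline uintptr_t
align_pot(uintptr_t value, uintptr_t alignment)
{
   assert(is_power_of_two(alignment));
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Round up to any positive alignment using division. */
static inline uintptr_t
align_npot(uintptr_t value, uintptr_t alignment)
{
   assert(alignment > 0);
   return (value + alignment - 1) / alignment * alignment;
}
```

With an alignment of 6, as can occur for ASTC block dimensions, the mask form would turn 7 into 8, which is not a multiple of 6. The division form gives 12.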
* i965/surface_formats: add support for 2D ASTC surface formatsNanley Chery2015-08-262-0/+119
| | | | | | | | | | | | | | | | | | Define two-thirds of the 2D Intel ASTC surface formats (LDR-only). This allows a 1-to-1 mapping from the mesa format to the Intel format. ASTC textures will default to being processed in LDR mode. If there is hardware support for HDR/Full mode and the texture is not sRGB, add the format bit necessary to process it in HDR/Full mode. v2: remove extra newlines. v3: follow existing coding style in translate_tex_format(). v4: expound on the GEN9_SURFACE_ASTC_HDR_FORMAT_BIT comment. update SF table - ASTC is actually supported in Gen8. v5: conform the ASTC MESA_FORMAT enums to the existing naming convention. Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/formats: remove compressed formats from matching functionNanley Chery2015-08-252-2/+2
| | | | | | | | | | | | | | | All compressed formats return GL_FALSE and there isn't any evidence to support that this behaviour would change. Remove all switch cases for compressed formats. v2. Since the exhaustive switch is removed, add a gtest to ensure all formats are handled. v3. Ensure that GL_NO_ERROR is set before returning. v4. Fix an arg to _mesa_uncompressed_format_to_type_and_comps(); fix formatting and misc improvements (Chad). Reviewed-by: Chad Versace <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* nir: Use nir_shader::stage rather than passing it around.Kenneth Graunke2015-08-251-1/+1
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Combine assign_constant_locations and move_uniform_array_access_to_pull_constantsJason Ekstrand2015-08-252-30/+11
The comment above move_uniform_array_access_to_pull_constants was completely bogus because it has nothing to do with lowering instructions. Instead, it's assigning locations of pull constants.

Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Rework uniform handlingJason Ekstrand2015-08-253-29/+11
| | | | | | | | | | | Previously, we treated the entire UNIFORM file as if it had two elements: One for direct things and one for indirect. This is substantially different from how the old visitor code handled it where each element was effectively its own uniform. This commit makes the NIR path more like the old ir_visitor path where each uniform is separate. This should allow us to more easily make decisions about what to push. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4_nir: Get rid of the uniform_driver_location trackingJason Ekstrand2015-08-252-20/+3
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir/intrinsics: Add a second const index to load_uniformJason Ekstrand2015-08-252-2/+2
| | | | | | | | | | | In the i965 backend, we want to be able to "pull apart" the uniforms and push some of them into the shader through a different path. In order to do this effectively, we need to know which variable is actually being referred to by a given uniform load. Previously, it was completely flattened by nir_lower_io which made things difficult. This adds more information to the intrinsic to make this easier for us. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Pass a type_size() function pointer into nir_lower_io().Kenneth Graunke2015-08-252-18/+11
Previously, there were four type_size() functions in play - the i965 compiler backend defined scalar and vec4 type_size() functions, and nir_lower_io contained its own similar functions.

In fact, the i965 driver used nir_lower_io() and then looped over the components using its own type_size - meaning both were in play. The two are /basically/ the same, but not exactly in obscure cases like subroutines and images.

This patch removes nir_lower_io's functions, and instead makes the driver supply a function pointer. This gives the driver ultimate flexibility in deciding how it wants to count things, reduces code duplication, and improves consistency.

v2 (Jason Ekstrand):
 - One side-effect of passing in a function pointer is that nir_lower_io is now aware of and properly allocates space for image uniforms, allowing us to drop hacks in the backend

Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
v2 Reviewed-by: Kenneth Graunke <[email protected]>
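The callback pattern described here can be sketched in plain C. Everything below is a toy model: `toy_type`, `assign_locations`, and the two counting rules are hypothetical and much simpler than the real GLSL type handling or the actual nir_lower_io() interface.

```c
#include <assert.h>

/* Hypothetical stand-in for a GLSL type: only the component count
 * (1..4 for scalars and vectors) matters in this sketch. */
struct toy_type {
   unsigned components;
};

typedef unsigned (*type_size_fn)(const struct toy_type *);

/* Scalar counting: every component takes one slot. */
static unsigned
type_size_scalar(const struct toy_type *t)
{
   return t->components;
}

/* vec4 counting: any scalar or vector takes one whole vec4 slot. */
static unsigned
type_size_vec4(const struct toy_type *t)
{
   (void)t;
   return 1;
}

/* Assign consecutive locations using whatever counting rule the caller
 * supplies.  Returns the total number of slots used. */
static unsigned
assign_locations(const struct toy_type *types, unsigned count,
                 type_size_fn type_size, unsigned *locations)
{
   unsigned next = 0;
   for (unsigned i = 0; i < count; i++) {
      locations[i] = next;
      next += type_size(&types[i]);
   }
   return next;
}

/* Demo data: a vec3, a float, and a vec4. */
static const struct toy_type demo_types[] = { { 3 }, { 1 }, { 4 } };
static unsigned demo_locations[3];
```

The same lowering loop yields 8 slots under scalar counting and 3 under vec4 counting, which is the kind of flexibility the function pointer is meant to give the driver.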
* i965: Move type_size() methods out of visitor classes.Kenneth Graunke2015-08-258-27/+28
| | | | | | | | I want to use C function pointers to these, and they don't use anything in the visitor classes anyway. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Make setup_vec4_uniform_value and _image_uniform_values take an offsetJason Ekstrand2015-08-257-22/+38
| | | | | | | | This way they don't implicitly increment the uniforms variable and don't have to be called in-sequence during uniform setup. Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Rename setup_vector_uniform_values to setup_vec4_uniform_valueJason Ekstrand2015-08-256-17/+18
| | | | | | | The new name more accurately represents what it does: Set up a single vec4 uniform value. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Always re-emit the pipeline select during invariant state emissionChris Wilson2015-08-241-1/+2
On the older platforms where we don't have logical contexts preserving state across batches, we emit the invariant state setup on every batch using the brw_invariant_state atom. This includes the pipeline selection which is cached with the introduction of

commit 0e0e23ef537c9add672ff322f34e129a07edc55e
Author: Jordan Justen <[email protected]>
Date: Wed Apr 22 11:43:50 2015 -0700

    i965/state: Emit pipeline select when changing pipelines

However, we do not reset the cache between batches on context-less platforms, resulting in us not setting the pipeline selection, which can cause GPU hangs if a media pipeline was loaded in the meantime (e.g. mixing mplayer/gstreamer using libva and gnome-shell). A simple solution is to just forcibly re-emit the pipeline select along with the invariant state and reset the cache at that point.

Reported-and-tested-by: Tomasz C. <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91254
Signed-off-by: Chris Wilson <[email protected]>
Cc: Jordan Justen <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: "10.6 11.0" <[email protected]>
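The caching issue can be modelled with a few lines of C. This is a toy model under stated assumptions: the struct, enum, and function names are hypothetical, and a counter stands in for commands written to the batch.

```c
#include <assert.h>

enum toy_pipeline { TOY_PIPELINE_NONE = -1, TOY_PIPELINE_3D, TOY_PIPELINE_MEDIA };

/* Hypothetical context holding a software cache of the last pipeline
 * selected, plus a counter of select commands actually written. */
struct toy_context {
   enum toy_pipeline last_pipeline;
   unsigned selects_emitted;
};

/* Cached emission: skip the command if the cache says it is redundant. */
static void
emit_pipeline_select(struct toy_context *ctx, enum toy_pipeline p)
{
   if (ctx->last_pipeline == p)
      return;
   ctx->selects_emitted++;
   ctx->last_pipeline = p;
}

/* Start of each batch on a context-less platform: hardware state may
 * have been changed by another client, so forget the cached value and
 * re-emit unconditionally. */
static void
emit_invariant_state(struct toy_context *ctx)
{
   ctx->last_pipeline = TOY_PIPELINE_NONE;
   emit_pipeline_select(ctx, TOY_PIPELINE_3D);
}

/* Two batches, each with one redundant select in the middle. */
static unsigned
demo_selects_after_two_batches(void)
{
   struct toy_context ctx = { TOY_PIPELINE_NONE, 0 };
   emit_invariant_state(&ctx);
   emit_pipeline_select(&ctx, TOY_PIPELINE_3D);   /* cached, skipped */
   emit_invariant_state(&ctx);
   emit_pipeline_select(&ctx, TOY_PIPELINE_3D);   /* cached, skipped */
   return ctx.selects_emitted;
}
```

Without the reset in `emit_invariant_state()`, the second batch would emit nothing and rely on whatever pipeline the hardware was left in.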
* i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is usedNeil Roberts2015-08-221-2/+13
| | | | | | | | | | | | | | | | | | When the edge flag element is enabled then the elements are slightly reordered so that the edge flag is always the last one. This was confusing the code to upload the 3DSTATE_VF_INSTANCING state because that is uploaded with a separate loop which has an instruction for each element. The indices used in these instructions weren't taking into account the reordering so the state would be incorrect. v2: Use nr_elements instead of brw->vb.nr_enabled so that it will cope when gl_VertexID is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91292 Cc: <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Tested-by: Mark Janes <[email protected]>
* i965: Swap the order of the vertex ID and edge flag attributesNeil Roberts2015-08-222-29/+57
| | | | | | | | | | | | | | | | | | | | The edge flag data on Gen6+ is passed through the fixed function hardware as an extra attribute. According to the PRM it must be the last valid VERTEX_ELEMENT structure. However if the vertex ID is also used then another extra element is added to source the VID. This made it so the vertex ID is in the wrong register in the vertex shader and the edge attribute is no longer in the last element. v2: Also implement for BDW+ v3 [by Ben]: Remove 10.5 tag. Too late. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84677 Cc: <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Tested-by: Ben Widawsky <[email protected]> Tested-by: Mark Janes <[email protected]>
* i965: Move control flush into pipelined conditional renderChris Wilson2015-08-222-14/+11
The nv_conditional_render piglits were sporadically failing. Moving the control flush from the write and placing it just before the read was sufficient to make the piglits pass 1000/1000 times.

The bspec says that the flush enable bit "waits until all previous writes of immediate data from post sync circles are complete before executing the next command" - the operative word being previous!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90691
Signed-off-by: Chris Wilson <[email protected]>
Cc: Neil Roberts <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use NIR by default for vertex shadersJason Ekstrand2015-08-201-2/+2
Shader-db results for vec4 on i965:

total instructions in shared programs: 1499894 -> 1502261 (0.16%)
instructions in affected programs: 1414224 -> 1416591 (0.17%)
helped: 2434
HURT: 10543
GAINED: 1
LOST: 0

Shader-db results for vec4 on g4x:

total instructions in shared programs: 1437411 -> 1439779 (0.16%)
instructions in affected programs: 1362402 -> 1364770 (0.17%)
helped: 2434
HURT: 10544
GAINED: 0
LOST: 0

Shader-db results for vec4 on Iron Lake:

total instructions in shared programs: 1437214 -> 1439593 (0.17%)
instructions in affected programs: 1362205 -> 1364584 (0.17%)
helped: 2433
HURT: 10544
GAINED: 1
LOST: 0

Shader-db results for vec4 on Sandy Bridge:

total instructions in shared programs: 2022092 -> 1941570 (-3.98%)
instructions in affected programs: 1886838 -> 1806316 (-4.27%)
helped: 7510
HURT: 10737
GAINED: 0
LOST: 0

Shader-db results for vec4 on Ivy Bridge:

total instructions in shared programs: 1853749 -> 1804960 (-2.63%)
instructions in affected programs: 1686736 -> 1637947 (-2.89%)
helped: 6735
HURT: 11101
GAINED: 0
LOST: 0

Shader-db results for vec4 on Haswell:

total instructions in shared programs: 1853749 -> 1804960 (-2.63%)
instructions in affected programs: 1686736 -> 1637947 (-2.89%)
helped: 6735
HURT: 11101
GAINED: 0
LOST: 0

Signed-off-by: Jason Ekstrand <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Acked-by: Matt Turner <[email protected]>
* i965: Fix "handle nir_intrinsic_image_size"Martin Peres2015-08-201-4/+3
| | | | | | | | | | | | | | I pushed a half-baked version of "i965: handle nir_intrinsic_image_size" by accident. Not having the Reviewed-by: tags on the last two commits should have been a red flag but I somehow missed it after the QA check. This patch should fix image-size for non-int images. I will add support to the piglit test for all the other image types. Sorry for the noise. Signed-off-by: Martin Peres <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: enable GL_ARB_shader_image_sizeMartin Peres2015-08-201-0/+1
| | | | Signed-off-by: Martin Peres <[email protected]>
* i965: handle nir_intrinsic_image_sizeMartin Peres2015-08-201-0/+46
v2, Review from Francisco Jerez:
 - avoid the camelCase for the booleans
 - init the booleans using the sampler type
 - force the initialization of all the components of the output register

v3:
 - Rename a variable from CubeMapArray to CubeArray to re-use GLSL's name (Ilia)
 - Fix some indentation and drop parenthesis (Topi)
 - Fix a signed/unsigned comparison warning

Signed-off-by: Martin Peres <[email protected]>
* mesa: Don't lose track of the shader image layer originally specified by the user.Francisco Jerez2015-08-201-2/+2
The spec requires that all layers of the image starting from the 0-th are bound to the image unit regardless of the Layer parameter when Layered is true, so I was setting gl_image_unit::Layer to zero in that case for the convenience of the driver back-end. However the ES31-CTS.shader_image_load_store.basic-api-bind conformance test checks that the layer value returned by glGetInteger is the same that was originally specified, regardless of the value of layered. Rename Layer to _Layer as is usual for other derived state and keep track of the original layer value as gl_image_unit::Layer.

Reviewed-by: Ian Romanick <[email protected]>
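The split between user-specified and derived state can be sketched like this. The struct is a hypothetical, heavily reduced stand-in for gl_image_unit, and `derived_layer` plays the role the commit gives to `_Layer`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified image unit; the real structure has many
 * more fields. */
struct toy_image_unit {
   bool layered;
   unsigned layer;           /* value the application specified; queries return this */
   unsigned derived_layer;   /* what the back-end binds (the commit's _Layer) */
};

static void
bind_image_layer(struct toy_image_unit *u, bool layered, unsigned layer)
{
   u->layered = layered;
   u->layer = layer;
   /* When layered, all layers starting from the 0-th are bound. */
   u->derived_layer = layered ? 0 : layer;
}

/* Small helpers so both views of the state can be inspected. */
static unsigned
demo_queried_layer(bool layered, unsigned layer)
{
   struct toy_image_unit u;
   bind_image_layer(&u, layered, layer);
   return u.layer;
}

static unsigned
demo_bound_layer(bool layered, unsigned layer)
{
   struct toy_image_unit u;
   bind_image_layer(&u, layered, layer);
   return u.derived_layer;
}
```

Binding layer 3 with layered set keeps 3 visible to queries while the back-end sees 0.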
* mesa: Rename MaxCombinedImageUnitsAndFragmentOutputs to MaxCombinedShaderOutputResources.Francisco Jerez2015-08-201-1/+1
The name of both the GLSL built-in variable and the glGetInteger param with the same value changed in GLSL ES 3.1 and GL 4.5. Its semantics also changed slightly, since the limit now also takes into account the number of SSBs in use. Switch our internal data structures to the up-to-date name.

Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
* i965/bdw: Fix setting the instancing state for the SGVS elementNeil Roberts2015-08-181-1/+1
When gl_VertexID or gl_InstanceID is used a 3DSTATE_VF_SGVS instruction is sent to create a sort of element to store the generated values. The last instruction in this chunk of code looks like it was trying to set the instancing state for the element using the 3DSTATE_VF_INSTANCING instruction. However it was sending brw->vb.nr_buffers instead of the element index. This instruction is supposed to take an element index and that is how it is used further down in the function so the previous code looks wrong. Perhaps previously the number of buffers coincidentally matched the number of enabled elements so the value was generally correct anyway. In a subsequent patch I want to change a bit how it chooses the SGVS element index so this needs to be fixed.

v2 [by Ben]: Remove the 10.5 stable tag (it's too late now). Commit update as follows:

The number of vertex buffers emitted is always <= the number of vertex elements. To maximize reuse (actually, to minimize relocations - according to the code comments), a vertex buffer is only emitted once, even when we setup multiple components (3DSTATE_VERTEX_ELEMENT) from that buffer. This meant that the previous code would use the wrong indexed element for these reuse cases.

This patch by itself prevents hangs on BSW in the linked bug. It doesn't make the test pass, the remaining patches are needed for that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91610
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
Tested-by: Mark Janes <[email protected]>
Cc: <[email protected]>
* util/ra: Make allocating conflict lists optionalJason Ekstrand2015-08-182-2/+2
| | | | | | | | | Since i965 is now using make_reg_conflicts_transitive and doesn't need q-value computations, they are disabled on i965. They are enabled everywhere else so that they get the old behavior. This reduces the time spent in eglInitialize() on BDW by around 10-15%. Reviewed-by: Eric Anholt <[email protected]>
* i965/reg_allocate: Use make_reg_conflicts_transitiveJason Ekstrand2015-08-182-3/+12
Instead of adding transitive conflicts as we go, we now add regular conflicts and then make them all transitive at the end. This reduces screen creation time substantially on BDW. The time spent in eglInitialize is reduced from 27.78 ms/call to 9.92 ms/call in debug mode and from 13.15 ms/call to 4.54 ms/call in release mode (about 65% in either case).

Reviewed-by: Eric Anholt <[email protected]>
* drirc: Add "Unigine Oil Rush" quirk (allow_glsl_extension_directive_midshader).Richard Yao2015-08-181-0/+2
| | | | | | | | | | | | | Appears to fix shader compilation. Tested by starting the client and observing that the screen was correct after the trailers ran when previously, it was blank. Play tested on amd64. This was suggested by "Kuuchan" on the Steam forums: https://steamcommunity.com/app/200390/discussions/0/540731690861139279/?insideModal=1#c594820656479479870 Acked-by: Matt Turner <[email protected]> Signed-off-by: Richard Yao <[email protected]>
* i965/gen7: Resolve GCC sign-compare warning.Rhys Kidd2015-08-181-1/+1
mesa/src/mesa/drivers/dri/i965/gen7_sol_state.c: In function 'gen7_upload_3dstate_so_decl_list':
mesa/src/mesa/drivers/dri/i965/gen7_sol_state.c:119:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (int i = 0; i < linked_xfb_info->NumOutputs; i++) {
    ^

Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
* i965/gen6: Resolve GCC sign-compare warning.Rhys Kidd2015-08-181-1/+1
mesa/src/mesa/drivers/dri/i965/gen6_vs_state.c: In function 'gen6_upload_push_constants':
mesa/src/mesa/drivers/dri/i965/gen6_vs_state.c:85:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < prog_data->nr_params; i++) {
    ^
mesa/src/mesa/drivers/dri/i965/gen6_vs_state.c:92:17: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < prog_data->nr_params; i++) {
    ^

Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
* i965: Resolve GCC sign-compare warning.Rhys Kidd2015-08-181-1/+1
mesa/src/mesa/drivers/dri/i965/brw_vs_surface_state.c: In function 'brw_upload_pull_constants':
mesa/src/mesa/drivers/dri/i965/brw_vs_surface_state.c:84:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < prog_data->nr_pull_params; i++) {
    ^
mesa/src/mesa/drivers/dri/i965/brw_vs_surface_state.c:89:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ALIGN(prog_data->nr_pull_params, 4) / 4; i++) {
    ^

Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
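For readers unfamiliar with -Wsign-compare, the sketch below shows why the warning exists and one common way to silence it: giving the loop index the same unsigned type as its bound. The function names and demo data are hypothetical, and the one-line changes in these commits may differ.

```c
#include <assert.h>

/* What the compiler does implicitly when an int is compared with an
 * unsigned value: the signed operand is converted to unsigned first.
 * The cast here makes that conversion explicit. */
static int
less_than_after_conversion(int i, unsigned n)
{
   return (unsigned)i < n;
}

/* Warning-free loop: the index has the same unsigned type as the bound. */
static unsigned
sum_params(const unsigned *params, unsigned nr_params)
{
   unsigned total = 0;
   for (unsigned i = 0; i < nr_params; i++)
      total += params[i];
   return total;
}

static const unsigned demo_params[] = { 1, 2, 3 };
```

After conversion, -1 becomes the largest unsigned value, so the comparison `-1 < 5u` is false. That surprise is what the warning is meant to flag.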