Commit message | Author | Age | Files | Lines
* mesa: Skip redundant texture completeness checking during image validation. | Francisco Jerez | 2015-10-09 | 1 | -1/+2
  The call to _mesa_test_texobj_completeness() is unnecessary if the texture is already known to be complete. If the texture object is dirtied in the meantime, _BaseComplete and _MipmapComplete will be reset to false. _mesa_is_image_unit_valid() will start to be called more frequently in a future commit, so it seems desirable to avoid the unnecessary work.

  Tested-by: Ye Tian <[email protected]>
  CC: "11.0" <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
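The caching idea in this commit can be sketched as follows. This is a minimal illustration, not the Mesa code: `struct tex_obj`, `test_completeness()`, `dirty_texture()` and `image_texture_is_complete()` are hypothetical stand-ins, and the exact condition used to skip the check is an assumption. Only the flag names `_BaseComplete` and `_MipmapComplete` come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a texture object. Only the two cached
 * flags named in the commit message are modelled. */
struct tex_obj {
   bool _BaseComplete;
   bool _MipmapComplete;
   int test_calls;   /* counts how often the expensive check ran */
};

/* Stand-in for _mesa_test_texobj_completeness(): for the purpose of
 * this sketch the texture always turns out to be complete. */
void test_completeness(struct tex_obj *t)
{
   t->test_calls++;
   t->_BaseComplete = true;
   t->_MipmapComplete = true;
}

/* Any state change that dirties the texture resets the cached flags. */
void dirty_texture(struct tex_obj *t)
{
   t->_BaseComplete = false;
   t->_MipmapComplete = false;
}

/* Re-run the completeness test only when no cached result is available
 * (assumed condition: neither flag is currently set). */
bool image_texture_is_complete(struct tex_obj *t)
{
   if (!t->_BaseComplete && !t->_MipmapComplete)
      test_completeness(t);
   return t->_BaseComplete;
}
```

Calling `image_texture_is_complete()` twice in a row runs the expensive test once; it runs again only after `dirty_texture()` has reset the flags.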
* mesa: Expose function to calculate whether a shader image unit is valid. | Francisco Jerez | 2015-10-09 | 2 | -4/+15
  A future commit will remove all texture object-dependent derived state from the image unit struct to make validation unnecessary on texture state changes. Instead of checking gl_image_unit::_Valid, drivers will be required to call this function when needed to find out whether an image unit is in a valid state and whether access from the shader is allowed.

  Tested-by: Ye Tian <[email protected]>
  CC: "11.0" <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Don't tell the hardware about our UAV access. | Francisco Jerez | 2015-10-09 | 6 | -19/+41
  The hardware documentation relating to the UAV HW-assisted coherency mechanism and UAV access enable bits is scarce and sometimes contradictory, and there's quite some guesswork behind this commit, so let me summarize the background first:

  HSW and later hardware have infrastructure to support a stricter form of data coherency between shader invocations from separate primitives. The mechanism is controlled by the "Accesses UAV" bits on 3DSTATE_VS, _HS, _DS, _GS and _PS (or _PS_EXTRA on BDW+), and the "UAV Coherency Required" bit on the 3DPRIMITIVE command.

  Regardless of whether "UAV Coherency Required" is set, the hardware fixed-function units will increment a per-stage semaphore for each request received if "Accesses UAV" is set for the same or any lower stage. An implicit DC flush is emitted by the lowermost stage with "Accesses UAV" set once it's done processing the request; this also happens regardless of the value of "UAV Coherency Required". The completion of the DC flush will cause the same stage and all previous ones to decrement the semaphore, marking the UAV accesses for the primitive as coherent with L3.

  The "UAV Coherency Required" 3DPRIMITIVE bit will cause a pipeline stall before any threads are dispatched for the first FF stage with "Accesses UAV" set until the semaphore is cleared for the same stage. Effectively this guarantees that UAV memory accesses performed by previous primitives from any stage will be strictly ordered (and thanks to the implicit DC flush visible in memory) with UAV accesses from the following primitives.
  None of this is required by the usual image, atomic counter and SSBO GL APIs, which have very relaxed cross-primitive coherency and ordering requirements, so we don't actually ever set the "UAV Coherency Required" bit. Ordering with respect to shader invocations from previous stages on the same primitive where there is a data dependency is of course already guaranteed as the spec requires, regardless of this mechanism being enabled.

  We do set the "Accesses UAV" bits though since my commit ac7664e493655e290783c23a0412b9c70936da50 (which this patch partially reverts), mainly because of comments like the following from the BDW PRM:

  > 3DSTATE_GS
  > [...]
  > 12 Accesses UAV
  > Format: Enable
  > This field must be set when GS has a UAV access.

  There are similar comments in the documentation for the other 3DSTATE_*S commands. The "must" part is misleading and unjustified AFAIK. Most of the "Accesses UAV" bits don't seem to have any side effects other than the implicit DC flushes and the related book-keeping in anticipation of a subsequent primitive with "UAV Coherency Required" set, so in most cases they are unnecessary and may incur a performance penalty.

  There is an exception though. On Gen8+ the PS_EXTRA UAV access bit influences the calculation of the PS UAV-only and ThreadDispatchEnable signals, which on previous generations were set explicitly by the driver, so we cannot always avoid enabling it on the PS stage.

  The primary motivation for this change is that in fact the hardware coherency mechanism is buggy and will cause a rather non-deterministic hang on Gen8 when VS is the only stage with "Accesses UAV" set and the processing of a request terminates immediately after the implicit DC flush is sent for a previous primitive with no additional vertices being emitted for the second primitive, which will cause the hardware to skip sending a second DC flush and cause the VS to stall indefinitely waiting for a response from the DC (BDWGFX HSD 1912017).
  This hardware bug can be reproduced on current master with the spec@arb_shader_image_load_store@host-mem-barrier@Indirect/RaW piglit subtest (if you have the patience to run it a few dozen times).

  The proposed workaround is to insert CS STALLs speculatively between 3DPRIMITIVE commands when "Accesses UAV" is enabled for the VS stage only. Because this would affect one of the hottest paths in the driver and likely decrease performance even further due to the unnecessary serialization, and because we don't actually need the implicit DC flushes, it seems better to just disable them.

  Cc: 11.0 <[email protected]>
* nir/instr_set: remove unnecessary check in nir_instrs_equal() | Connor Abbott | 2015-10-09 | 1 | -2/+1
  This was originally added to nir_instrs_equal() instead of nir_instr_can_cse() incorrectly, but this was fixed when moving to the instruction set API (as it had to be, otherwise hashing wouldn't work). Now, this is dead code since instr_can_rewrite() will only return true for texture instructions that use an index, so we can turn the check into an assert. This also means that now nir_instrs_equal(instr, instr) will always return true unless it assert-fails.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Connor Abbott <[email protected]>
* nir: make nir_instrs_equal() static | Connor Abbott | 2015-10-09 | 2 | -3/+1
  This was previously tied to CSE, since it would only work for instructions where nir_can_cse() (now instr_can_rewrite()) returned true. Now that CSE uses the instruction set abstraction which only uses this internally, we can make it local to nir_instr_set.c.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Connor Abbott <[email protected]>
* nir/cse: use the instruction set API | Connor Abbott | 2015-10-09 | 1 | -115/+23
  This replaces an O(n^2) algorithm with an O(n) one, while allowing us to import most of the infrastructure required for GVN. The idea is to walk the dominance tree depth-first, similar to when converting to SSA, and remove the instructions from the set when we're done visiting the sub-tree of the dominance tree so that the only instructions in the set are the instructions that dominate the current block.

  No piglit regressions. No shader-db changes.

  Compilation time for full shader-db:

  Difference at 95.0% confidence
      -35.826 +/- 2.16018
      -6.2852% +/- 0.378975%
      (Student's t, pooled s = 3.37504)

  v2:
  - rebase on start_block removal
  - remove useless state struct
  - change commit message

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Connor Abbott <[email protected]>
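The dominance-tree walk described in this commit can be sketched on a toy IR. Everything here is hypothetical and is not the NIR API: `struct instr`, `struct block`, `struct instr_set` and `cse_block()` are made up for illustration, and the set is a plain array with linear lookup, whereas the commit relies on a hashed instruction set to get the O(n) bound. The part that mirrors the description is the push on entry and the pop after the dominated sub-tree has been visited.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SET 64

/* Toy instruction: an opcode and two source ids. */
struct instr {
   int op, src0, src1;
   const struct instr *replaced_by;   /* set when an equal dominating instr exists */
};

/* Toy basic block with its children in the dominance tree. */
struct block {
   struct instr *instrs;
   size_t num_instrs;
   struct block **dom_children;
   size_t num_dom_children;
};

/* Instructions that dominate the block currently being visited. */
struct instr_set {
   const struct instr *entries[MAX_SET];
   size_t count;
};

static bool instrs_equal(const struct instr *a, const struct instr *b)
{
   return a->op == b->op && a->src0 == b->src0 && a->src1 == b->src1;
}

/* Return an equal instruction already in the set, or add instr and return NULL. */
static const struct instr *set_add_or_find(struct instr_set *set,
                                           const struct instr *instr)
{
   for (size_t i = 0; i < set->count; i++)
      if (instrs_equal(set->entries[i], instr))
         return set->entries[i];
   if (set->count < MAX_SET)
      set->entries[set->count++] = instr;
   return NULL;
}

/* Depth-first walk of the dominance tree. Returns true if anything was replaced. */
bool cse_block(struct block *b, struct instr_set *set)
{
   bool progress = false;
   size_t saved = set->count;

   for (size_t i = 0; i < b->num_instrs; i++) {
      const struct instr *match = set_add_or_find(set, &b->instrs[i]);
      if (match) {
         b->instrs[i].replaced_by = match;
         progress = true;
      }
   }

   for (size_t i = 0; i < b->num_dom_children; i++)
      progress |= cse_block(b->dom_children[i], set);

   /* Done with this sub-tree: drop everything this block added, so the
    * set again holds only instructions that dominate the next block. */
   set->count = saved;
   return progress;
}
```

With two sibling blocks under a common dominator, an instruction in one sibling is never matched against the other sibling, because the first sibling's entries are popped before the second is visited.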
* nir: add an instruction set API | Connor Abbott | 2015-10-09 | 2 | -0/+349
  This will replace direct usage of nir_instrs_equal() in the CSE pass, which replaces an O(n^2) algorithm with an effectively O(n) one. It'll also be useful for implementing GVN on top of GCM.

  v2:
  - Add texture support.
  - Add more comments.
  - Rename instr_can_hash() to instr_can_rewrite() since it's really more about whether its uses can be rewritten, and it's implicitly used by nir_instrs_equal() as well.
  - Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason).
  - Make the HASH() macro less magical (Topi).
  - Rewrite the commit message.

  v3:
  - For sorting phi sources, use a VLA, store pointers to the sources, and compare the predecessor pointer directly (Jason).

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Connor Abbott <[email protected]>
* nir: constify instruction comparison functions | Connor Abbott | 2015-10-09 | 2 | -4/+4
  v2: rebase, don't constify nir_srcs_equal() as it's pass-by-value anyways

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Connor Abbott <[email protected]>
* nir: constify nir_ssa_alu_instr_src_components() | Connor Abbott | 2015-10-09 | 1 | -1/+1
  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Connor Abbott <[email protected]>
* nir: split out instruction comparison functions | Connor Abbott | 2015-10-09 | 5 | -181/+237
  Right now nir_instrs_equal() is tied pretty tightly to CSE, but we're going to introduce the idea of an instruction set and tie it to that instead. In anticipation of that, move this into its own file where we'll add the rest of the instruction set implementation later.

  v2: Rebase on texture support.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Connor Abbott <[email protected]>
* i965/fs: Handle non-const sample number in interpolateAtSample | Neil Roberts | 2015-10-09 | 4 | -43/+130
  If a non-const sample number is given to interpolateAtSample it will now generate an indirect send message with the sample ID similar to how non-const sampler array indexing works. Previously non-const values were ignored and instead it ended up using a constant 0 value.

  The generator will try to determine if the sample ID is dynamically uniform via nir_src_is_dynamically_uniform. If not it will query the pixel interpolator in a loop, once for each different live sample number. The next live sample number is found using emit_uniformize. If multiple live channels have the same sample number then they will be handled in a single iteration of the loop. The loop is necessary because the indirect send message doesn't seem to have a way to specify a different value for each fragment.

  This fixes the following two Piglit tests:

  arb_gpu_shader5-interpolateAtSample-nonconst
  arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform

  v2: Handle dynamically non-uniform sample ids.
  v3: Remove the BREAK instruction and predicate the WHILE directly. Make the tokens arrays const. (Matt Turner)
  v4: Iterate over the live channels instead of each possible sample number.
  v5: Don't special case immediate values in brw_pixel_interpolator_query. Make a better wrapper for the function to set up the PI send instruction. Ensure that the SHL instructions are scalar. (Francisco Jerez).

  Reviewed-by: Francisco Jerez <[email protected]>
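The loop over live sample numbers can be modelled on the CPU as below. This is only a scalar simulation of the control flow described above, under the assumption of an 8-channel dispatch. `query_per_sample()` and `NUM_CHANNELS` are hypothetical names, and picking the first live channel's value stands in for what the message attributes to emit_uniformize.

```c
#include <stdint.h>

#define NUM_CHANNELS 8   /* assumed dispatch width for this sketch */

/* Simulates one predicated loop: each iteration takes the sample id of the
 * first live channel, handles every live channel that shares that id, and
 * retires those channels. Returns the number of iterations, which equals the
 * number of distinct sample ids among the live channels. out[c] receives the
 * id that was used for channel c. */
int query_per_sample(const int sample_id[NUM_CHANNELS], uint8_t live_mask,
                     int out[NUM_CHANNELS])
{
   int iterations = 0;

   while (live_mask) {
      /* "Uniformize": find the first channel that is still live. */
      int first = 0;
      while (!(live_mask & (1u << first)))
         first++;
      int id = sample_id[first];

      /* One query serves all live channels with the same sample id. */
      for (int c = 0; c < NUM_CHANNELS; c++) {
         if ((live_mask & (1u << c)) && sample_id[c] == id) {
            out[c] = id;
            live_mask &= (uint8_t)~(1u << c);
         }
      }
      iterations++;
   }
   return iterations;
}
```

If every live channel carries the same sample id, the loop body runs once, which matches the dynamically uniform case where no loop is needed at all.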
* i965: Add a second successor to BRW_OPCODE_WHILE | Neil Roberts | 2015-10-09 | 1 | -0/+4
  It is possible to directly predicate the WHILE instruction. In this case there will be a second successor block because the execution can resume from the instruction after the loop. This will be used in a subsequent patch.

  Reviewed-by: Matt Turner <[email protected]>
* nir: Add a function to determine if a source is dynamically uniform | Neil Roberts | 2015-10-09 | 2 | -0/+30
  Adds nir_src_is_dynamically_uniform which returns true if the source is known to be dynamically uniform. This will be used in a later patch to add a workaround for cases that only work with dynamically uniform sources. Note that the function is not definitive: it can return false negatives (but not false positives). Currently it only detects constants and uniform accesses. It could easily be extended to include more cases.

  Reviewed-by: Matt Turner <[email protected]>
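A conservative predicate of this kind can be sketched as follows. The enum and function are hypothetical and do not reflect the NIR data structures; the sketch only shows the "true means proven, false means unknown" contract stated in the message.

```c
#include <stdbool.h>

/* Hypothetical classification of where a source value comes from. */
enum src_kind {
   SRC_CONSTANT,       /* compile-time constant */
   SRC_UNIFORM_LOAD,   /* direct load of a uniform */
   SRC_OTHER           /* anything else, e.g. the result of arithmetic */
};

/* Returns true only for the cases known to be dynamically uniform.
 * A false result means "not proven", so false negatives are possible
 * (SRC_OTHER may still happen to be uniform), false positives are not. */
bool src_is_dynamically_uniform(enum src_kind kind)
{
   switch (kind) {
   case SRC_CONSTANT:
   case SRC_UNIFORM_LOAD:
      return true;
   default:
      return false;
   }
}
```

A caller can use the fast path when the predicate returns true and fall back to a slower but always-correct path otherwise.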
* nvc0: move HW SM queries to nvc0_query_hw_sm.c/h files | Samuel Pitoiset | 2015-10-09 | 8 | -796/+908
  Global performance counters (PCOUNTER) will be added to nvc0_query_hw_pm.c/h files.

  Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: move HW queries to nvc0_query_hw.c/h files | Samuel Pitoiset | 2015-10-09 | 8 | -1215/+1310
  Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: move SW queries to nvc0_query_sw.c/h files | Samuel Pitoiset | 2015-10-09 | 5 | -84/+204
  Loosely based on freedreno driver.

  Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: move nvc0_so_target_save_offset() to its correct location | Samuel Pitoiset | 2015-10-09 | 3 | -24/+19
  Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: add a header file for nvc0_query | Samuel Pitoiset | 2015-10-09 | 7 | -189/+202
  This will allow splitting SW and HW queries in an upcoming patch. While we are at it, make use of nvc0_query struct instead of pipe_query.

  Signed-off-by: Samuel Pitoiset <[email protected]>
* main: fix length of values written to glGetProgramResourceiv() for ACTIVE_VARIABLES | Samuel Iglesias Gonsalvez | 2015-10-09 | 1 | -4/+10
  Return the number of values written.

  Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* main: buffer array variables can have array size of 0 if they are unsized | Samuel Iglesias Gonsalvez | 2015-10-09 | 1 | -1/+8
  From ARB_program_interface_query:

  For the property ARRAY_SIZE, a single integer identifying the number of active array elements of an active variable is written to <params>. The array size returned is in units of the type associated with the property TYPE. For active variables not corresponding to an array of basic types, the value one is written to <params>. If the variable is a shader storage block member in an array with no declared size, the value zero is written to <params>.

  v2:
  - Unsized arrays of arrays have an array size different than zero

  v3:
  - Arrays and unsized arrays will have an array_stride > 0. Use it instead of is_unsized_array flag (Timothy).

  Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* main: consider that unsized arrays have at least one active element | Samuel Iglesias Gonsalvez | 2015-10-09 | 1 | -1/+7
  From ARB_shader_storage_buffer_object:

  "When using the ARB_program_interface_query extension to enumerate the set of active buffer variables, only the first element of arrays (sized or unsized) will be enumerated"

  _mesa_program_resource_array_size() is used when getting the name (and name length) of the active variables. When it is an unsized array, we want to indicate it has one active element so the returned name would have "[0]" at the end.

  v2:
  - Use array_stride > 0 and array_elements == 0 to detect unsized arrays. Because of that, we don't need is_unsized_array flag (Timothy)

  Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
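The detection rule from the v2 note can be sketched like this. `struct buffer_var` and the three functions are hypothetical and much simpler than the Mesa resource structures. The only parts taken from the message are the test `array_stride > 0 && array_elements == 0` and the goal of ending the enumerated name with "[0]".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced description of a buffer variable. */
struct buffer_var {
   unsigned array_elements;   /* 0 for non-arrays and for unsized arrays */
   unsigned array_stride;     /* > 0 for any array, sized or unsized */
};

/* An unsized array has a stride but no declared element count. */
bool is_unsized_array(const struct buffer_var *v)
{
   return v->array_stride > 0 && v->array_elements == 0;
}

/* Array size used for naming: unsized arrays count as one active element. */
unsigned active_array_size(const struct buffer_var *v)
{
   return is_unsized_array(v) ? 1 : v->array_elements;
}

/* Writes the enumerated name into buf and returns its length (as snprintf does).
 * Arrays, including unsized ones, get "[0]" appended. */
int format_resource_name(const struct buffer_var *v, const char *base,
                         char *buf, size_t len)
{
   if (active_array_size(v) > 0)
      return snprintf(buf, len, "%s[0]", base);
   return snprintf(buf, len, "%s", base);
}
```

For an unsized array named `data` with a 16-byte stride, the name written to the buffer is `data[0]`, the same form a sized array would get.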
* main: fix TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE | Samuel Iglesias Gonsalvez | 2015-10-09 | 1 | -1/+52
  When the active variable is an array which is already a top-level shader storage block member, don't return its array size and stride when querying TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE respectively.

  Fixes the following 12 dEQP-GLES31 tests:

  dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.column_major_mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.row_major_mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.column_major_mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.row_major_mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.column_major_mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat3x4
  dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.column_major_mat3x4

  v2:
  - Fix check when the shader storage block is instanced
  - Write auxiliary function to do the check.

  v3:
  - Check if full_instanced_name is NULL just after allocation (Ilia)
  - Remove () from one strcmp() in the if statement (Ilia)

  Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
  Tested-by: Tapani Pälli <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* main: fix goto in program_resource_top_level_array_stride | Samuel Iglesias Gonsalvez | 2015-10-09 | 1 | -2/+2
  Use found_top_level_array_stride instead of found_top_level_array_size.

  Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span | Tapani Pälli | 2015-10-09 | 1 | -0/+15
  Patch adds the missing type (used with NV_read_depth) so that it gets handled correctly. This fixes errors seen with the following CTS test:

  ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Cc: "11.0" <[email protected]>
* mesa,meta: move gl_texture_object::TargetIndex initializations | Brian Paul | 2015-10-08 | 4 | -10/+29
  Before, we were unconditionally assigning the TargetIndex field in _mesa_BindTexture(), even if it was already set properly. Now we initialize TargetIndex wherever we initialize the Target field, in _mesa_initialize_texture_object(), finish_texture_init(), etc.

  v2: also update the meta_copy_image code. In make_view() the view_tex_obj->Target field was set, but not the TargetIndex field. Also, remove a second, redundant assignment to view_tex_obj->Target. Add sanity check assertions too.

  Reviewed-by: Anuj Phogat <[email protected]>
  Tested-by: Mark Janes <[email protected]>
* mesa: remove unused _mesa_create_nameless_texture() | Brian Paul | 2015-10-08 | 2 | -23/+0
  Reviewed-by: Anuj Phogat <[email protected]>
  Tested-by: Mark Janes <[email protected]>
* mesa: remove unneeded error check in create_textures() | Brian Paul | 2015-10-08 | 1 | -9/+2
  Callers of create_textures() will either pass target=0 or a validated GL texture target enum, so there is no need to do another error check inside the loop.

  Reviewed-by: Anuj Phogat <[email protected]>
  Tested-by: Mark Janes <[email protected]>
* i965: Link compiler unit tests to libi965_compiler.la | Kristian Høgsberg Kristensen | 2015-10-08 | 1 | -6/+2
  We can now link the unit tests against just libi965_compiler.la. This lets us drop a lot of DRI driver dependencies, but we still pull in all of libmesa and more. This also provides a few standalone users of libi965_compiler.la, which will help us avoid accidentally using i965_dri.so functions from the compiler.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Break out backend compiler to its own library | Kristian Høgsberg Kristensen | 2015-10-08 | 2 | -77/+81
  This introduces a new libtool helper library, libi965_compiler.la. This library is moderately self-contained, but still needs to link to all of libmesa.la among other things.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965/cs: Get max_cs_threads from brw_compiler devinfo | Kristian Høgsberg Kristensen | 2015-10-08 | 1 | -2/+3
  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move brw_get_shader_time_index() call out of emit functions | Kristian Høgsberg Kristensen | 2015-10-08 | 11 | -31/+40
  brw_get_shader_time_index() is all tangled up in brw_context state and we can't call it from the compiler. Thanks to Jason's recent refactoring, we can just get the index and pass it to the emit functions instead.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move brw_select_clip_planes() to brw_shader.cpp | Kristian Høgsberg Kristensen | 2015-10-08 | 2 | -25/+26
  We call this from the compiler so move it to brw_shader.cpp.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Use util_next_power_of_two() for brw_get_scratch_size() | Kristian Høgsberg Kristensen | 2015-10-08 | 2 | -13/+6
  This function computes the next power of two, but at least 1024. We can do that by bitwise or'ing in 1023 and calling util_next_power_of_two().

  We use brw_get_scratch_size() from the compiler so we need it out of brw_program.c. We could move it to brw_shader.cpp, but let's make it a small inline function instead.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
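The "or in 1023, then round up" trick can be tried in isolation with the sketch below. `next_power_of_two()` is a local helper written for this example, with the assumed semantics "smallest power of two greater than or equal to x, for x >= 1"; it is not the Mesa utility itself, and `get_scratch_size()` is a hypothetical name.

```c
#include <stdint.h>

/* Local helper (assumed semantics): smallest power of two >= x, for x >= 1.
 * Works by smearing the highest set bit of x - 1 into all lower bits. */
uint32_t next_power_of_two(uint32_t x)
{
   x--;
   x |= x >> 1;
   x |= x >> 2;
   x |= x >> 4;
   x |= x >> 8;
   x |= x >> 16;
   return x + 1;
}

/* OR-ing in 1023 sets the ten low bits, so the argument is never below 1023
 * and the rounded result is never below 1024. */
int get_scratch_size(int size)
{
   return (int)next_power_of_two((uint32_t)size | 1023);
}
```

Under these assumptions, sizes from 0 to 1023 all map to 1024 and 1025 maps to 2048. One property of this formulation worth noting is that a size that is already a power of two at or above 1024 is bumped to the next one, because the OR turns 1024 into 2047 before rounding.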
* i965: Move brw_mark_surface_used() to brw_shader.cpp | Kristian Høgsberg Kristensen | 2015-10-08 | 2 | -10/+10
  brw_program.c won't be part of the compiler library, but we need brw_mark_surface_used() in the compiler. Move to brw_shader.cpp.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965/cs: Split out helper for building local id payload | Kristian Høgsberg Kristensen | 2015-10-08 | 4 | -78/+77
  The initial motivation for this patch was to avoid calling brw_cs_prog_local_id_payload_dwords() in gen7_cs_state.c from the compiler. This commit ends up refactoring things a bit more so as to split out the logic to build the local id payload to brw_fs.cpp. This moves the payload building closer to the compiler code that uses the payload layout and makes it available to other users of the compiler.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move brw_link_shader() and friends to new file brw_link.cpp | Kristian Høgsberg Kristensen | 2015-10-08 | 4 | -249/+284
  We want to use the rest of brw_shader.cpp with the rest of the compiler without pulling in the GLSL linking code.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Configure bufmgr debug options from intel_screen.c | Kristian Høgsberg Kristensen | 2015-10-08 | 3 | -17/+15
  We need the debug flag parsing and INTEL_DEBUG in the compiler, but we don't want the dependency on bufmgr (libdrm_intel) in there. Move to intel_screen.c.

  There are now only two lines left in brw_process_intel_debug_variable(), but we keep it in intel_debug.h to avoid having to expose 'debug_control' as a global variable.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* util: Move DRI parse_debug_string() to util | Kristian Høgsberg Kristensen | 2015-10-08 | 9 | -47/+112
  We want to use intel_debug.c in code that doesn't link to dri common.

  v2: Remove unnecessary stddef.h include (Topi), use util/debug.h in all DRI driver and remove driParseDebugString() (Iago).

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move brw_dump_ir() out of brw_*_emit() functions | Kristian Høgsberg Kristensen | 2015-10-08 | 7 | -23/+12
  We move these calls one level up into the codegen functions.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* gallium/ddebug: add missing dd_util.h to sources list | Emil Velikov | 2015-10-08 | 1 | -1/+2
  Signed-off-by: Emil Velikov <[email protected]>
* gallium/ddebug: automake: sort sources alphabetically | Emil Velikov | 2015-10-08 | 1 | -2/+2
  Signed-off-by: Emil Velikov <[email protected]>
* nir/sweep: Reparent the shader name | Jason Ekstrand | 2015-10-08 | 1 | -0/+2
  Previously the name of the nir shader was being freed prematurely during nir_sweep. Since 756613ed35d the name was later being used to generate filenames for the optimiser debug output and these would end up with garbage from the dangling pointer.

  Co-authored-by: Neil Roberts <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* c11/threads: initialize timeout structure | Jan Vesely | 2015-10-08 | 1 | -0/+6
  Signed-off-by: Jan Vesely <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* docs/relnotes: document EGL_KHR_create_context on llvmpipe and softpipe | Boyan Ding | 2015-10-08 | 1 | -0/+1
  Signed-off-by: Boyan Ding <[email protected]>
* i965/gs/gen6: Maximum allowed size of SEND messages is 15 (4 bits) | Iago Toral Quiroga | 2015-10-08 | 1 | -12/+18
  Commit d48ac9306619 addressed this for VS, but we forgot to do the same for URB writes generated by the gen6 GS.

  Reviewed-by: Kenneth Graunke <[email protected]>
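The consequence of a 4-bit message length field can be illustrated with a small calculation. The helper below is hypothetical and does not reflect how the driver actually emits URB writes; in particular, the assumption that every message repeats a fixed number of header registers is made up for the example. Only the limit of 15 registers per SEND comes from the commit subject.

```c
#include <assert.h>

#define MAX_SEND_MLEN 15   /* largest value stated for the 4-bit length field */

/* Number of SEND messages needed to write data_regs registers when each
 * message also carries header_regs registers of header (assumed layout). */
unsigned urb_write_messages(unsigned data_regs, unsigned header_regs)
{
   assert(header_regs < MAX_SEND_MLEN);

   unsigned per_message = MAX_SEND_MLEN - header_regs;

   /* Round up: a partially filled last message still counts. */
   return (data_regs + per_message - 1) / per_message;
}
```

With one header register per message, 14 data registers fit in a single message and a 15th forces a second one.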
* i965: Define FIRST_SPILL_MRF and FIRST_PULL_LOAD_MRF only once and in one place | Iago Toral Quiroga | 2015-10-08 | 4 | -7/+6
  That should make tracking where we do spills and pull loads a bit easier.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: make pull constant loads in gen6 start at MRFs 16/17 | Iago Toral Quiroga | 2015-10-08 | 2 | -3/+6
  So they do not conflict with our (un)spills (MRF 21..23) or our URB writes (MRF 1..15).

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix remove_duplicate_mrf_writes so it can handle 24 MRFs in gen6 | Iago Toral Quiroga | 2015-10-08 | 1 | -1/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: include bad type in error string of _mesa_pack_depth_span | Tapani Pälli | 2015-10-08 | 1 | -1/+2
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: add varyings to resource list only with SSO | Tapani Pälli | 2015-10-08 | 1 | -4/+7
  Varyings can be considered inputs or outputs of a program only when SSO is in use. With multi-stage programs, inputs contain only the inputs of the first stage and outputs contain only the outputs of the final shader stage.

  I've tested that the fix works for Assault Android Cactus (demo version) and does not cause Piglit or CTS regressions in glGetProgramiv tests.

  The following ES 3.1 CTS separate shader tests that query properties of varyings in SSO shader programs pass:

  ES31-CTS.program_interface_query.separate-programs-vertex
  ES31-CTS.program_interface_query.separate-programs-fragment

  Signed-off-by: Tapani Pälli <[email protected]>
  Tested-by: Dieter Nützel <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92122