Commit message | Author | Age | Files | Lines
* mesa: move no-change glDepthFunc check earlier | Brian Paul | 2015-06-03 | 1 | -3/+3
    If the incoming func matches the current state it must be a legal value so we can do this before the switch statement.
    Signed-off-by: Brian Paul <[email protected]>
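The reordering described above can be sketched as follows. This is a minimal illustration, not Mesa's code: `struct depth_state`, `set_depth_func()` and the `DF_*` names are hypothetical stand-ins, and the sketch assumes that the stored value was validated when it was set.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL depth function enums. */
enum depth_func {
    DF_NEVER = 0x0200, DF_LESS, DF_EQUAL, DF_LEQUAL,
    DF_GREATER, DF_NOTEQUAL, DF_GEQUAL, DF_ALWAYS
};

/* Hypothetical, simplified piece of context state. */
struct depth_state {
    unsigned func;  /* current, already validated value */
    bool dirty;     /* set when the state really changes */
    int error;      /* nonzero after an invalid enum */
};

void set_depth_func(struct depth_state *st, unsigned func)
{
    /* No-change check first: the stored value is known to be legal,
     * so an identical incoming value needs no validation. */
    if (st->func == func)
        return;

    switch (func) {
    case DF_NEVER: case DF_LESS: case DF_EQUAL: case DF_LEQUAL:
    case DF_GREATER: case DF_NOTEQUAL: case DF_GEQUAL: case DF_ALWAYS:
        break;
    default:
        st->error = 1;  /* invalid enum: leave the state untouched */
        return;
    }

    st->func = func;
    st->dirty = true;
}
```

Calling `set_depth_func()` with the value already stored returns before the switch and leaves `dirty` unset.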
* mesa: restore GL_EXT_depth_bounds_test state in glPopAttrib() | Brian Paul | 2015-06-03 | 1 | -0/+5
    Spotted by inspection. Untested (no piglit test).
    Signed-off-by: Brian Paul <[email protected]>
* mesa: fix glPushAttrib(0) / glPopAttrib() error | Brian Paul | 2015-06-03 | 1 | -0/+17
    If the glPushAttrib() mask value was zero we didn't actually push anything onto the attribute stack. A subsequent glPopAttrib() call would generate a GL_STACK_UNDERFLOW error. Now push a dummy attribute in that case to prevent the error.
    Mesa now matches nvidia's behavior.
    Reviewed-by: Jose Fonseca <[email protected]>
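A minimal sketch of the dummy-push idea, under the assumption that the stack stores one mask per level. The types, `DUMMY_MASK`, and the function names are hypothetical and do not reflect Mesa's attribute stack layout.

```c
#include <stddef.h>

#define MAX_DEPTH  16
#define DUMMY_MASK 0x80000000u  /* hypothetical marker for "nothing saved" */

enum { ERR_NONE, ERR_STACK_OVERFLOW, ERR_STACK_UNDERFLOW };

/* Hypothetical attribute stack: one saved mask per level. */
struct attrib_stack {
    unsigned masks[MAX_DEPTH];
    size_t depth;
    unsigned last_restored;  /* mask restored by the latest pop */
    int error;
};

void push_attrib(struct attrib_stack *s, unsigned mask)
{
    if (s->depth >= MAX_DEPTH) {
        s->error = ERR_STACK_OVERFLOW;
        return;
    }
    /* A zero mask saves no state but still takes one stack level,
     * so the matching pop has something to remove. */
    s->masks[s->depth++] = mask ? mask : DUMMY_MASK;
}

void pop_attrib(struct attrib_stack *s)
{
    if (s->depth == 0) {
        s->error = ERR_STACK_UNDERFLOW;
        return;
    }
    unsigned mask = s->masks[--s->depth];
    if (mask == DUMMY_MASK)
        return;                /* dummy level: nothing to restore */
    s->last_restored = mask;   /* a real driver would restore state here */
}
```

With this layout a `push_attrib(&s, 0)` followed by `pop_attrib(&s)` leaves `error` at `ERR_NONE`; only an unmatched pop reports underflow.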
* nir: use src for ssa helper | Timothy Arceri | 2015-06-03 | 1 | -5/+1
    Reviewed-by: Thomas Helland <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]>
* nir: remove extra semicolon | Timothy Arceri | 2015-06-03 | 1 | -1/+1
    Reviewed-by: Thomas Helland <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]>
* prog_to_nir: Remove OPCODE_MOV special case. | Matt Turner | 2015-06-02 | 1 | -1/+1
    OPCODE_MOV is in the op_trans[] array.
    Reviewed-by: Kenneth Graunke <[email protected]>
* prog_to_nir: Remove from op_trans[] opcodes handled in the switch. | Matt Turner | 2015-06-02 | 1 | -7/+7
    Reviewed-by: Kenneth Graunke <[email protected]>
* nir: prevent use-after-free condition in should_lower_phi() | Eduardo Lima Mitev | 2015-06-02 | 1 | -0/+5
    The lower_phis_to_scalar() pass recurses through the instruction dependence graph to determine if all the sources of a given instruction are scalarizable. To prevent cycles, it temporarily marks the phi instruction before recursing in, then updates the entry with the resulting value. However, it does not consider that the entry value may have changed after a recursion pass, hence causing a use-after-free situation and a crash.
    This patch fixes this by reloading the entry corresponding to the 'phi' after recursing and before updating its value.
    The crash can be reproduced ~20% of the time with the dEQP test: dEQP-GLES3.functional.shaders.loops.while_constant_iterations.nested_sequence_fragment
    Reviewed-by: Jason Ekstrand <[email protected]>
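The general pattern behind this fix can be sketched as below. This is not the NIR code: the growable table, `struct node`, and `should_lower()` are hypothetical, and the table deliberately reallocates on insert so that a pointer held across the recursion would go stale.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical growable table; inserting may move every entry. */
struct entry { int key; bool value; };
struct table { struct entry *items; size_t count, cap; };

static struct entry *table_find(struct table *t, int key)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->items[i].key == key)
            return &t->items[i];
    return NULL;
}

static void table_insert(struct table *t, int key, bool value)
{
    if (t->count == t->cap) {
        size_t cap = t->cap ? t->cap * 2 : 1;
        struct entry *items = realloc(t->items, cap * sizeof *items);
        if (!items)
            abort();
        t->items = items;  /* older entry pointers are now invalid */
        t->cap = cap;
    }
    t->items[t->count].key = key;
    t->items[t->count].value = value;
    t->count++;
}

/* Hypothetical instruction node with up to two sources. */
struct node { int id; const struct node *src[2]; bool scalarizable; };

bool should_lower(struct table *t, const struct node *n)
{
    struct entry *e = table_find(t, n->id);
    if (e)
        return e->value;

    /* Temporary mark so that cycles terminate. */
    table_insert(t, n->id, false);

    bool result = n->scalarizable;
    for (int i = 0; i < 2 && result; i++)
        if (n->src[i])
            result = should_lower(t, n->src[i]);

    /* Reload the entry: the recursion may have grown the table, so any
     * pointer obtained before recursing must not be reused. */
    e = table_find(t, n->id);
    e->value = result;
    return result;
}
```

Writing through a pointer captured before the loop would be the use-after-free; looking the entry up again after the recursion avoids it at the cost of one extra lookup.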
* i965: Add Gen8+ VS dispatch_mode assertion. | Kenneth Graunke | 2015-06-01 | 1 | -0/+3
    Suggested by Ben Widawsky.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
* i965: Drop LOAD_PAYLOAD workaround in fs_visitor::emit_urb_writes(). | Kenneth Graunke | 2015-06-01 | 1 | -12/+4
    Now that Jason's LOAD_PAYLOAD improvements have landed, we don't need this. Passing 1 for the number of header registers already takes care of setting force_writemask_all on the header copy.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use proper pitch for scalar GS pull constants and UBOs. | Kenneth Graunke | 2015-06-01 | 1 | -3/+7
    See the corresponding code in brw_vs_surface_state.c.
    v2: const more things (requested by Topi Pohjolainen)
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
* i965: Create a shader_dispatch_mode enum to replace VS/GS fields. | Kenneth Graunke | 2015-06-01 | 8 | -22/+24
    We used to store the GS dispatch mode in brw_gs_prog_data while separately storing the VS dispatch mode in brw_vue_prog_data::simd8. This patch introduces an enum to represent all possible dispatch modes, and stores it in brw_vue_prog_data::dispatch_mode, unifying the two.
    Based on a suggestion by Matt Turner.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
* i965: Drop "Vector Mask Enable" bit from 3DSTATE_GS on Gen8+. | Kenneth Graunke | 2015-06-01 | 1 | -2/+1
    The documentation makes it pretty clear that we shouldn't use this:
    "Under normal conditions SW shall specify DMask, as the GS stage will provide a Dispatch Mask appropriate to SIMD4x2 or SIMD8 thread execution (as a function of dispatch mode). E.g., for SIMD4x2 execution, the GS stage will generate a Dispatch Mask that is equal to what the EU would use as the Vector Mask. For SIMD8 execution there is no known usage model for use of Vector Mask (as there is for PS shaders)."
    I also managed to find descriptions of DMask and VMask, in the "State Register" (sr0.2/3) field descriptions:
    "Dispatch Mask (DMask). This 32-bit field specifies which channels are active at Dispatch time."
    "Vector Mask (VMask). This 32-bit field contains, for each 4-bit group, the OR of the corresponding 4-bit group in the dispatch mask."
    SIMD4x2 shaders process one or two vec4 values, with each 4-bit group corresponding to xyzw channel enables (either all on, or all off). Thus, DMask = VMask in SIMD4x2 mode. But in SIMD8 mode, 4-bit groups are meaningless, so it just messes up your values.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
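The relation between the two masks can be illustrated with a small sketch. It rests on one reading of the quoted VMask description, namely that every bit of a 4-bit group is set when any bit of the matching DMask group is set; `vmask_from_dmask()` is a hypothetical helper, not a driver or hardware function.

```c
#include <stdint.h>

/* Assumed reading of the documentation: each 4-bit group of VMask is
 * all ones if any channel in the matching DMask group is enabled. */
uint32_t vmask_from_dmask(uint32_t dmask)
{
    uint32_t vmask = 0;
    for (int group = 0; group < 8; group++) {
        uint32_t nibble = (dmask >> (group * 4)) & 0xFu;
        if (nibble)
            vmask |= 0xFu << (group * 4);
    }
    return vmask;
}
```

Under that reading, a SIMD4x2 dispatch mask such as 0xFF (both vec4 slots fully enabled) maps to itself, while a SIMD8 mask such as 0x15 (channels 0, 2 and 4) maps to 0xFF and would enable channels that were never dispatched.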
* docs: update GL_ARB_copy_image, GL_ARB_clear_texture gallium status | Brian Paul | 2015-06-01 | 1 | -2/+2
    VMware is working on these.
    Signed-off-by: Brian Paul <[email protected]>
* gallium/util: silence unused var warnings for non-debug build | Brian Paul | 2015-06-01 | 1 | -2/+1
    Reviewed-by: Jose Fonseca <[email protected]>
* egl/dri2: silence uninitialized variable warnings | Brian Paul | 2015-06-01 | 1 | -2/+4
    And update assertions to be more informative.
    Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: silence unused var warnings for non-debug build | Brian Paul | 2015-06-01 | 1 | -0/+1
    Reviewed-by: Jose Fonseca <[email protected]>
* pipebuffer: silence unused var warnings for non-debug build | Brian Paul | 2015-06-01 | 1 | -0/+1
    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: silence unused var warnings for non-debug build | Brian Paul | 2015-06-01 | 1 | -0/+1
    Reviewed-by: Jose Fonseca <[email protected]>
* draw: silence unused var warnings for non-debug build | Brian Paul | 2015-06-01 | 1 | -0/+4
    Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Remove stub disassemblerSymbolLookupCB. | Jose Fonseca | 2015-06-01 | 1 | -13/+1
    It's incomplete -- it wasn't filling ReferenceType so it was causing garbage on the disassembly. Furthermore it seems impossible to get the jump information through this interface.
    The solution for the function size problem is to effectively book-keep the machine code start and end address while JIT'ing.
* i965: Don't add base_binding_table_index if it's zero | Neil Roberts | 2015-05-31 | 2 | -2/+4
    When calculating the binding table index for non-constant sampler array indexing it needs to add the base binding table index which is a constant within the generated code. Often this base is zero so we can avoid a redundant instruction in that case.
    It looks like nothing in shader-db is doing non-constant sampler array indexing so this patch doesn't make any difference but it might be worth having anyway.
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Acked-by: Ben Widawsky <[email protected]>
* i965: Don't use a temporary when generating an indirect sample | Neil Roberts | 2015-05-31 | 2 | -26/+8
    Previously when generating the send instruction for a sample instruction with an indirect sampler it would use the destination register as a temporary store. This breaks when used in combination with the opt_sampler_eot optimisation because that forces the destination to be null. This patch fixes that by avoiding the temp register altogether.
    The reason the temporary register was needed was because it was trying to ensure the binding table index doesn't overflow a byte by and'ing it with 0xff. The result is then or'd with sampler_index<<8. This patch instead just and's the whole thing by 0xfff. This will ensure that a bogus sampler index won't overflow into the rest of the message descriptor but unlike the previous code it won't ensure that the binding table index doesn't overflow into the sampler index. It doesn't seem like that should matter very much though because if the shader is generating a bogus sampler index then it's going to just get garbage out either way.
    Instead of doing sampler_index<<8|(sampler_index+base_table_index) the new code avoids one operation by doing sampler_index*0x101+base_table_index which should be equivalent. However if we wanted to avoid the multiply for some reason we could do this by adding an extra or instruction still without needing the temporary register.
    This fixes a number of Piglit tests on Skylake that were using indirect samplers such as: spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Acked-by: Ben Widawsky <[email protected]>
    Tested-by: Anuj Phogat <[email protected]>
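The arithmetic in this message can be checked with a small sketch. `desc_old()` and `desc_new()` are hypothetical helpers that model only the two formulas quoted above, not the generated EU instructions; the valid-input range (sampler index below 16, binding table index below 256) is an assumption drawn from the 0xff and 0xfff masks.

```c
#include <stdint.h>

/* Old form: binding table index masked to a byte, then OR'ed with the
 * sampler index shifted into bits 8 and up. */
uint32_t desc_old(uint32_t sampler_index, uint32_t base_table_index)
{
    return (sampler_index << 8) |
           ((sampler_index + base_table_index) & 0xffu);
}

/* New form: one multiply-add, then a single 12-bit mask.
 * 0x101 == 0x100 + 1, so sampler_index * 0x101 places the index in
 * both the sampler field and the binding table field. */
uint32_t desc_new(uint32_t sampler_index, uint32_t base_table_index)
{
    return (sampler_index * 0x101u + base_table_index) & 0xfffu;
}
```

For in-range inputs the two agree because the low byte never carries into bit 8. They diverge once the binding table index overflows a byte: with a sampler index of 1 and a base of 255, the old form gives 0x100 and the new form gives 0x200, which is the overflow into the sampler index that the message accepts.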
* vc4: Don't bother with safe list traversal in CSE. | Eric Anholt | 2015-05-29 | 1 | -1/+1
    We don't remove or move instructions.
* vc4: Convert from simple_list.h to list.h | Eric Anholt | 2015-05-29 | 19 | -139/+87
    list.h is a nicer and more familiar set of list functions/macros.
* vc4: Make sure we allocate idle BOs from the cache. | Eric Anholt | 2015-05-29 | 1 | -1/+11
    We were returning the most recently freed BO, without checking if it was idle yet. This meant that we generally stalled immediately on the previous frame when generating a new one. Instead, allocate new BOs when the *oldest* BO is still busy, so that the cache scales with how much is needed to keep some frames outstanding, as originally intended.
    Note that if you don't have some throttling happening, this means that you can accidentally run the system out of memory. The kernel is now applying some throttling on all execs, to hopefully avoid this.
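A minimal sketch of the oldest-first policy described above. The fixed-size array, `struct bo`, and `cache_take_idle()` are hypothetical; the real driver's cache layout and busy check are not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_MAX 8

/* Hypothetical buffer object; 'busy' stands in for a GPU idle query. */
struct bo { int id; bool busy; };

/* Hypothetical cache ordered by free time: slots[0] is the oldest. */
struct bo_cache { struct bo *slots[CACHE_MAX]; size_t count; };

/* Returns the oldest cached BO if it is idle. Returns NULL when the
 * cache is empty or the oldest BO is still busy, in which case the
 * caller is expected to allocate a fresh BO rather than stall. */
struct bo *cache_take_idle(struct bo_cache *c)
{
    if (c->count == 0)
        return NULL;

    struct bo *oldest = c->slots[0];
    if (oldest->busy)
        return NULL;

    /* Remove the oldest entry and keep the rest in free order. */
    for (size_t i = 1; i < c->count; i++)
        c->slots[i - 1] = c->slots[i];
    c->count--;
    return oldest;
}
```

Checking only the oldest entry keeps the lookup cheap: if the BO that has waited longest is still busy, newer ones are unlikely to be idle, so the cache grows until enough buffers exist to cover the frames in flight.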
* vc4: Fix return value handling for BO waits. | Eric Anholt | 2015-05-29 | 1 | -12/+15
    If the wait ever returned -ETIME, we'd abort because the error code was stored in errno and not in drmIoctl()'s return value.
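The convention at issue is that of ioctl-style calls, which return -1 on failure and report the reason through errno. The sketch below uses a hypothetical `fake_wait_ioctl()` in place of the real call so that it stays self-contained; `wait_bo()` is likewise illustrative and not the driver's function.

```c
#include <errno.h>

/* Hypothetical stand-in for an ioctl-style wait: returns 0 on success,
 * or -1 with the reason placed in errno. */
static int fake_wait_ioctl(int timed_out)
{
    if (timed_out) {
        errno = ETIME;
        return -1;
    }
    return 0;
}

/* Returns 0 on success or a negative errno value on failure. */
int wait_bo(int timed_out)
{
    int ret = fake_wait_ioctl(timed_out);
    if (ret == 0)
        return 0;

    /* The return value is only -1; the error code lives in errno.
     * Comparing 'ret' against -ETIME here would never match. */
    return -errno;
}
```

A caller can then treat `-ETIME` as "still busy" and reserve the abort path for other failures.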
* mesa: remove unused function declaration | Timothy Arceri | 2015-05-30 | 1 | -4/+0
    Reviewed-by: Matt Turner <[email protected]>
* dri_util: make version var unsigned to silence warnings | Brian Paul | 2015-05-29 | 1 | -1/+1
    _mesa_override_gl_version_contextless() takes an unsigned version parameter.
    Reviewed-by: Matt Turner <[email protected]>
* i965: Disable compaction for EOT send messages | Ben Widawsky | 2015-05-29 | 1 | -0/+6
    AFAICT, there is no real way to make sure a send message with EOT is properly ignored by compaction, nor can I see a way to actually encode EOT while compacting. Before the single send optimization we'd always bail because we hit the is_immediate && !is_compactable_immediate case. However, with single send, is_immediate is not true, and so we end up trying to compact the uncompactable. Without this, any compacting single send instruction will hang because the EOT isn't there.
    I am not sure how I didn't hit this when I originally enabled the optimization. I didn't check if some surrounding code changed.
    I know Neil and Matt were both looking into this. I did a quick search and didn't see any patches out there to handle this. Please ignore if this has already been sent by someone. (Direct me to it and I will review it).
    Reported-by: Neil Roberts <[email protected]>
    Reported-by: Mark Janes <[email protected]>
    Tested-by: Mark Janes <[email protected]>
    Signed-off-by: Ben Widawsky <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* gallivm: make sampling more robust when the sampler setup is bogus | Roland Scheidegger | 2015-05-29 | 1 | -6/+32
    Pure integer formats cannot be sampled with linear tex / mip filters. In GL such a setup would make the texture incomplete. We shouldn't rely on the state tracker though to filter that out, just return all zeros instead of dying in the lerp.
    Reviewed-by: Jose Fonseca <[email protected]>
* configure.ac: Link mcdisassembler component. | Jose Fonseca | 2015-05-29 | 1 | -1/+1
    gallivm now depends on it. And depending on particular LLVM version / configure options, the build can fail without this change due to undefined reference to `LLVM*Disasm*' symbols.
    Trivial.
* configure.ac: Don't bother checking whether LLVM's MCJIT component is available. | Jose Fonseca | 2015-05-29 | 1 | -4/+1
    Now that we require LLVM 3.3, MCJIT is guaranteed to be available.
    Trivial.
* gallivm: Use the LLVM's C disassembly interface. | Jose Fonseca | 2015-05-29 | 2 | -224/+40
    It doesn't do everything we want. In particular it doesn't allow detecting jumps or return opcodes. Currently we detect the x86's RET opcode.
    Even though it's worse for LLVM 3.3, it's an improvement for LLVM 3.7, which was totally busted.
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Disable frame pointer omission on LLVM 3.7. | Jose Fonseca | 2015-05-29 | 1 | -0/+10
    Reviewed-by: Roland Scheidegger <[email protected]>
* configure.ac: enable building GLES1 and GLES2 by default | Marek Olšák | 2015-05-29 | 1 | -6/+6
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* st/dri: fix postprocessing crash when there's no depth buffer | Marek Olšák | 2015-05-29 | 1 | -5/+4
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89131
    Cc: 10.6 10.5 <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* radeon/llvm: reset temps_count on deallocation | Marek Olšák | 2015-05-29 | 1 | -0/+1
    Reviewed-by: Michel Dänzer <[email protected]>
* radeon/llvm: don't use a static array size for radeon_llvm_context::arrays (v2) | Marek Olšák | 2015-05-29 | 2 | -7/+12
    v2: - don't use realloc (tgsi_shader_info provides the size)
    Reviewed-by: Michel Dänzer <[email protected]>
* softpipe: fix offset wrapping calculations (v2) | Dave Airlie | 2015-05-29 | 1 | -78/+68
    Roland pointed out my previous attempt was lacking, so I enhanced the texwrap piglit test, and tested them.
    This fixes the offset calculations in a number of areas by adding the offset first, it also fixes the fastpaths, which I forgot to address in the previous commit.
    v2: try and avoid divides in most paths, the repeat mirror path really was ugly no matter which way I went, so I left it having the divide. Also fix the gather lod calculation bug.
    Reviewed-by: Roland Scheidegger <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* i965/vs: Rework the logic for generating NIR from ARB vertex programs | Jason Ekstrand | 2015-05-28 | 1 | -12/+11
    Whether or not to use NIR is now equivalent to brw->scalar_vs. We can simplify the logic and make it far less confusing.
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Remove the ir_visitor code | Jason Ekstrand | 2015-05-28 | 3 | -2228/+2
    Now that everything is running through NIR, this is all dead.
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove the old fragment program code | Jason Ekstrand | 2015-05-28 | 3 | -769/+0
    Now that everything is running through NIR, this is all dead.
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make NIR non-optional for scalar shaders | Jason Ekstrand | 2015-05-28 | 2 | -27/+5
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make fs/vec4_visitor inherit from ir_visitor directly | Jason Ekstrand | 2015-05-28 | 3 | -3/+3
    This is using multiple inheritance in C++. However, ir_visitor is really just an interface with no data so it shouldn't be so bad.
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Rename backend_visitor to backend_shader | Jason Ekstrand | 2015-05-28 | 13 | -44/+44
    The backend_shader class really is a representation of a shader. The fact that it inherits from ir_visitor is somewhat immaterial.
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Enable ARB_direct_state_access by default for core profile | Ian Romanick | 2015-05-28 | 1 | -1/+1
    And core profile only.
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Fredrik Höglund <[email protected]>
    Cc: "10.6" <[email protected]>
* dispatch_sanity: Validate the compatibility profile dispatch table too | Ian Romanick | 2015-05-28 | 1 | -0/+493
    Signed-off-by: Ian Romanick <[email protected]>
    Suggested-by: Ilia Mirkin <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
    Cc: "10.6" <[email protected]>
* dispatch_sanity: Split list of GL 3.1 functions into core and common | Ian Romanick | 2015-05-28 | 1 | -71/+342
    The next patch will add a test for compatibility profile dispatch, and it seems to make more sense to share the lists.
    Signed-off-by: Ian Romanick <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
    Cc: "10.6" <[email protected]>
* mesa: Don't install glVertexAttribL* functions in compatibility profile | Ian Romanick | 2015-05-28 | 2 | -1/+3
    GL_ARB_vertex_attrib_64bit is exclusive to core profile, and none of the other functions added by the extension are advertised in other profiles.
    Signed-off-by: Ian Romanick <[email protected]>
    Cc: Dave Airlie <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
    Cc: "10.6" <[email protected]>