aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* mesa/texcompress: correct mapping of S3TC formats in conversion functionNanley Chery2015-08-311-2/+2
| | | | | | | | | MESA_FORMAT_RGBA_DXT5 should actually be reserved for GL_RGBA[4]_DXT5_S3TC. Also, Gallium and other dri drivers (radeon and nouveau) follow this mapping scheme. Reviewed-by: Chad Versace <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* r600/sb: update last_cf for finalize if.Dave Airlie2015-09-011-0/+3
| | | | | | | | | | | | As Glenn did for finalize_loop we need to update_cf when we add a POP at the end of a shader. I think this fixes one of the earlier shader going off end of memory problems we've stopped. Reviewed-by: Glenn Kennard <[email protected]> Cc: "10.6" "11.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* i965/fs: Use greater-equal cmod to implement maximum.Matt Turner2015-08-312-4/+6
| | | | | | | | | | The docs specifically call out SEL with .l and .ge as the implementations of MIN and MAX respectively. Among other things, SEL with these conditional mods are commutative. See commit 3b7f683f. Reviewed-by: Jordan Justen <[email protected]>
* i965/chv|skl: Apply sampler bypass w/aBen Widawsky2015-08-312-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | Certain compressed formats require this setting. The docs don't go into much detail as to why it's needed exactly. This patch introduces no piglit regressions on gen9 (bsw is untested). Note that the SKL "regressions" are fixed tests, and the egl_khr_gl_colorspace tests are WTF. The patch also fixes nothing I can find. http://otc-mesa-ci.jf.intel.com/job/Leeroy/127820/ v2: Reworded commit message (Matt); Added piglit results link. Restructured condition (Matt) Moved check out to function (Nanley). I left the setting of the bit in the surface state open coded because it seems to go better with the existing code. v3: Use and inline function only in gen8_emit_texture_surface_state() (Matt). Cc: Matt Turner <[email protected]> Cc: Nanley Chery <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* st/mesa: move to renumbering registers in a groupDave Airlie2015-08-311-19/+38
| | | | | | | | This can be done with a single pass for the instruction base, and takes renumber_registers out of its spot on the profile. Acked-by: Marek Olšák <[email protected] Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: reduce time spent in calculating temp read/writesDave Airlie2015-08-311-74/+79
| | | | | | | | | | | | | The glsl->tgsi convertor does some temporary register reduction however in profiling shader-db this shows up quite highly, so optimise things to reduce the number of loops through all the instructions we do. This drops merge_registers from 4-5% on the profile to 1%. I think this can be reduced further by possibly optimising the renumber pass. Acked-by: Marek Olšák <[email protected] Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: cache tgsi opcode info in the instructionDave Airlie2015-08-311-23/+16
| | | | | | | | | Instead of looking this up lots, lets just cache it in the instruction translation up front. I just noticed this function what high in a profile of shader-db on radeonsi. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: move prim convert from geom shader to function.Dave Airlie2015-08-312-25/+26
| | | | | | | This should avoid C++ fail including this header. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* glsl: remove specical case subroutine type countingTimothy Arceri2015-08-311-3/+2
| | | | | | | Unlike samplers we can get the correct value for subroutines from component_slots() Reviewed-by: Dave Airlie <[email protected]>
* r600g: Use TGSI parse results instead of manually exfiltratingEdward O'Callaghan2015-08-301-1/+1
| | | | | | | | This makes better use of the work that the TGSI API has done for us. Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* r600g: Set geometry properties in r600_create_shader_state()Edward O'Callaghan2015-08-303-25/+23
| | | | | | | | | | | The selector is shared by all shader variants, so the individual shaders shouldn't change it. Use tgsi_shader_scan() results to set geometry properties within a r600_create_shader_state() call and treat said propertices in the selector as read-only within r600_shader_from_tgsi(). Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* r600g: Move geometry properties state from shader to selectorEdward O'Callaghan2015-08-306-22/+23
| | | | | Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* r600g: Remove dead assigment to 'gs_input_prim' in shader stateEdward O'Callaghan2015-08-302-4/+0
| | | | | | | | Note that 'geometry shader properties' should be carried in the selector state over the shader state in any case. Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: don't use the emit qt keyword in si_init_atomMarek Olšák2015-08-291-2/+2
| | | | It confuses my editor.
* radeonsi: remove no-op 32-bit maskingMarek Olšák2015-08-295-7/+7
| | | | Reviewed-by: Alex Deucher <[email protected]>
* gallium/radeon: fix the ADDRESS_HI mask for EVENT_WRITE CIK packetsMarek Olšák2015-08-291-8/+8
| | | | | Cc: [email protected] Reviewed-by: Alex Deucher <[email protected]>
* winsys/radeon: handle non-zero finite timeout when waiting for buffersMarek Olšák2015-08-292-38/+41
| | | | Reviewed-by: Alex Deucher <[email protected]>
* freedreno/a3xx: implement half-z clippingIlia Mirkin2015-08-293-2/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add basic clip plane supportIlia Mirkin2015-08-293-1/+24
| | | | | | | | | The hardware is capable of dealing with GL1-style user clip planes. No clip vertex, no clip distances. Fixes a number of ucp tests, as well as neverball. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]>
* nvc0: change prefix of MP performance counters to HW_SMSamuel Pitoiset2015-08-292-149/+149
| | | | | | | According to NVIDIA, local performance counters (MP) are prefixed with SM, while global performance counters (PCOUNTER) are called PM. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: sort performance counter queries by nameSamuel Pitoiset2015-08-292-142/+142
| | | | Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: make names of performance counter queries consistentSamuel Pitoiset2015-08-292-56/+56
| | | | Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: use enumerations for driver queriesSamuel Pitoiset2015-08-291-120/+123
| | | | Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: remove commented out code related to PCOUNTER queriesSamuel Pitoiset2015-08-291-20/+0
| | | | Signed-off-by: Samuel Pitoiset <[email protected]>
* r600: port si_conv_prim_to_gs_out from radeonsiDave Airlie2015-08-291-15/+16
| | | | | | | | | This code was broken by the tess merge, and I totally missed it until now. I'm not sure this fixes anything but it stops the assert. Cc: "11.0" <[email protected]> Reviewed-by: Glenn Kennard <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g: use PRIi64 for some compute debug printfsDave Airlie2015-08-291-4/+4
| | | | | | | | Otherwise this will crash on 32-bit, and it gets rid of warnings building on 32-bit. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/util: fix debug_get_flags_option on 32-bitDave Airlie2015-08-291-3/+4
| | | | | | | | | On 32-bit we need to use PRIu64 flags for printfs, otherwise this segfaults in R600_DEBUG=help otherwise. Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* glsl: provide the option of using BFE for unpack builting loweringIlia Mirkin2015-08-285-14/+100
| | | | | | | | | This greatly improves generated code, especially for the snorm variants, since it is able to get rid of the lshift/rshift for sext, as well as replacing each shift + mask with a single op. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: use bitfield_insert instead of and + shift + or for packingIlia Mirkin2015-08-283-4/+30
| | | | | | | | | It is fairly tricky to detect the proper conditions for using bitfield insert, but easy to just use it up front. This removes a lot of instructions on nvc0 when invoking the packing builtins. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Remove fs_visitor::try_replace_with_sel().Matt Turner2015-08-283-92/+0
| | | | | | | No shader-db changes on g4x, snb, hsw, or bdw. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Replace awful variable names.Matt Turner2015-08-281-40/+40
| | | | | | | | | | | | | | | | | start_to -> dst_start end_to -> dst_end start_from -> src_start end_from -> src_end var_to -> dst_var var_from -> src_var reg_to -> dst_reg reg_to_offset -> dst_reg_offset reg_from -> src_reg Not sure how these made sense to me before. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Skip blocks in register coalescing interference check.Matt Turner2015-08-281-14/+20
| | | | | | | | No need to walk through instructions in blocks we know don't contain our registers' live ranges. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Improve register coalescing interference check.Matt Turner2015-08-281-8/+11
| | | | | | | | | | | | | | | | | | | I always thought that the is_control_flow() -> return false check was a bad hack, and some previous attempts to remove it have failed and have been reverted. The previous two patches fix some problems that caused register coalescing to not notice some interference between registers, which the is_control_flow() check apparently works around. With that fixed, we can calculate interference more accurately. total instructions in shared programs: 6261319 -> 6257917 (-0.05%) instructions in affected programs: 346282 -> 342880 (-0.98%) helped: 1552 Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use overwrites_reg() instead of dst.equals().Matt Turner2015-08-281-2/+2
| | | | | | | | equals() returns false for registers with different types, using it isn't appropriate to determine whether an is overwriting a register. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Only consider fixed_hw_reg in equals() if file is HW_REG/IMM.Matt Turner2015-08-282-3/+6
| | | | | | | | | | | | | | Noticed when debugging things that lead to the next patch. On G45 (and presumably ILK) this helps register coalescing: total instructions in shared programs: 4077373 -> 4077340 (-0.00%) instructions in affected programs: 43751 -> 43718 (-0.08%) helped: 52 HURT: 2 Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Do not set the size for zero-size uniformsMarta Lofstedt2015-08-281-3/+4
| | | | | | | | | | | | | | | | | | Zero sized uniforms can exist in the list, but they don't get get any space allocated in prog_data->params or in the param_size array, so the size should not be set for them. This was previously fixed in: commit: 781dc7c0e1f41502f18e07c0940af949a78d2792. However, commit: 259f7291de2387aa3ac5f856b39b7b934a1d8e7d removed the fix. Signed-off-by: Marta Lofstedt <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: return old name for deleted samplers for SAMPLER_BINDING queriesDaniel Scharrer2015-08-281-10/+1
| | | | | | | | | | | | | If the sampler object has been deleted in the same context the binding will have been cleared. If it has been deleted in another context, the spec does not say what should returned. None of the other binding point queries check for deletion in another context. Also, as names of deleted objects are free for reuse, the current code didn't even work reliably. Reviewed-by: Fredrik Höglund <[email protected]> Signed-off-by: Fredrik Höglund <[email protected]>
* mesa: add missing queries for ARB_direct_state_accessDaniel Scharrer2015-08-282-0/+98
| | | | | | | | | | | This adds index queries (glGet*i_v) for GL_TEXTURE_BINDING_* and GL_SAMPLER_BINDING, as well as textue queries (glGetTex{,ture}Parameter*) for GL_TEXTURE_TARGET. CC: "10.6 11.0" <[email protected]> Reviewed-by: Fredrik Höglund <[email protected]> Signed-off-by: Fredrik Höglund <[email protected]>
* docs: Fix a typo in GL3.txt concerning GL_KHR_context_flush_controlNeil Roberts2015-08-281-1/+1
|
* mesa: fix dispatch sanity with GL_OES_texture_storage_multisample_2d_arrayIlia Mirkin2015-08-281-0/+3
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91785 Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Matt Turner <[email protected]>
* ABI-check: Use more portable bash invocation.Vinson Lee2015-08-272-2/+2
| | | | | | | Fixes 'make check' on FreeBSD. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/nir: Make use of nir_opt_undefBoyan Ding2015-08-271-0/+2
| | | | | | | | | | | | Shader-db result on Ivy Bridge: total instructions in shared programs: 145484 -> 145445 (-0.03%) instructions in affected programs: 225 -> 186 (-17.33%) helped: 5 HURT: 0 Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Thomas Helland <[email protected]> Signed-off-by: Boyan Ding <[email protected]>
* glapi: Remove _x86_64_get_get_dispatch symbol from x86-64 assembly.Matt Turner2015-08-271-6/+0
| | | | | | Never used. Reviewed-by: Mark Janes <[email protected]>
* glsl: clean up textureSize prototypeIlia Mirkin2015-08-271-4/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* r600g/sb: Don't crash on empty if jump targetGlenn Kennard2015-08-281-1/+4
| | | | | | Signed-off-by: Glenn Kennard <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g/sb: Don't read junk after EOPGlenn Kennard2015-08-283-1/+6
| | | | | | | | | | | | | Shaders that contain instruction data after an instruction with EOP could end up parsing that as an instruction, leading to various crashes and asserts in SB as it gets very confused if it sees for instance a loop start instruction jumping off to some random point. Add a couple of asserts, and print EOP bit if set in old asm printer. Signed-off-by: Glenn Kennard <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g/sb: Handle undef in read port trackerGlenn Kennard2015-08-281-1/+1
| | | | | | | | | e8e443 missed adding check for undef values also in unreserve function, leading to an assert triggering. Signed-off-by: Glenn Kennard <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: rename rowStride to imageStride in texturesubimage()Brian Paul2015-08-271-4/+4
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: only copy the requested teximage facesIlia Mirkin2015-08-271-2/+2
| | | | | | | | | | | | | | Cube maps are special in that they have separate teximages for each face. We handled that by copying the data to them separately, but in case zoffset != 0 or depth != 6 we would read off the end of the client array or modify the wrong images. zoffset/depth have already been verified by the time the code gets to this stage, so no need to double-check. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.6 11.0" <[email protected]>
* nir: Convert the builder to use the new NIR cursor API.Kenneth Graunke2015-08-2711-61/+38
| | | | | | | | | | | | | | | | | | The NIR cursor API is exactly what we want for the builder's insertion point. This simplifies the API, the implementation, and is actually more flexible as well. This required a bit of reworking of TGSI->NIR's if/loop stack handling; we now store cursors instead of cf_node_lists, for better or worse. v2: Actually move the cursor in the after_instr case. v3: Take advantage of nir_instr_insert (suggested by Connor). v4: vc4 build fixes (thanks to Eric). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1] Reviewed-by: Jason Ekstrand <[email protected]> [v4] Acked-by: Connor Abbott <[email protected]> [v4]