| Commit message | Author | Age | Files | Lines |
| |
It's not used by anything anymore.
Reviewed-by: Iago Toral Quiroga <[email protected]>
| |
Reading this output was really confusing. reg represents attribute
slots; reg_offset is the x/y/z/w component (0..3) within a vec4 slot.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
If glDispatchComputeIndirect is used, then the value for this variable
must be read from the indirect BO.
To allow the same generated code to support both glDispatchCompute and
glDispatchComputeIndirect, we will also set up a BO for the number of work
groups using the intel_upload_data mechanism. This will only be
required if the gl_NumWorkGroups variable is accessed.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
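For context, the GL-side pattern that makes the indirect path necessary
looks roughly like this sketch (GL context, function loader and compute
program setup are assumed to exist elsewhere):

    // The three group counts live in a buffer object rather than in a
    // glDispatchCompute() call, so the driver only sees them at draw time.
    struct DispatchIndirectCommand { GLuint num_groups_x, num_groups_y, num_groups_z; };

    void dispatch_from_buffer(void)
    {
        DispatchIndirectCommand cmd = { 64, 64, 1 };
        GLuint bo;
        glGenBuffers(1, &bo);
        glBindBuffer(GL_DISPATCH_INDIRECT_BUFFER, bo);
        glBufferData(GL_DISPATCH_INDIRECT_BUFFER, sizeof(cmd), &cmd, GL_STATIC_DRAW);
        // If the CS reads gl_NumWorkGroups, the generated code has to load
        // these values from the bound BO at the given byte offset.
        glDispatchComputeIndirect(0);
    }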
| |
Previously, our VUE map code always assigned slots to varyings
sequentially, in one contiguous block.
This was a bad fit for separate shaders - the GS input layout depended
on the VS output layout, so if we swapped out vertex shaders, we might
have to recompile the GS on the fly - which rather defeats the point of
using separate shader objects. (Tessellation would suffer from this
as well - we could have to recompile the HS, DS, and GS.)
Instead, this patch makes the VUE map for separate shaders use a fixed
layout, based on the input/output variable's location field. (This is
either specified by layout(location = ...) or assigned by the linker.)
Corresponding inputs/outputs will match up by location; if there's a
mismatch, we're allowed to have undefined behavior.
This may be less efficient - depending on what locations were chosen, we
may have empty padding slots in the VUE. But applications presumably
use small consecutive integers for locations, so it hopefully won't be
much worse in practice.
3% of Dota 2 Reborn shaders are hurt, but only by 2 instructions.
This seems like a small price to pay for avoiding recompiles.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
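As a toy illustration of the two strategies (this is not the actual VUE
map code, just the idea):

    // Sequential packing: the slot depends on which varyings exist, so it
    // changes whenever the other stage's variable set changes.
    int slot_sequential(const int *locations, int count, int location)
    {
        int slot = 0;
        for (int i = 0; i < count; i++) {
            if (locations[i] == location)
                return slot;
            slot++;                    // every existing varying takes a slot
        }
        return -1;
    }

    // Fixed layout for separate shader objects: the slot is a pure function
    // of the location, so producer and consumer agree without ever seeing
    // each other.  Unused locations become padding slots.
    int slot_fixed(int location)
    {
        return location;
    }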
| |
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
| |
When we assign hw regs to attributes, we don't incorporate the stride
and subreg_offset from the fs_reg. They're rarely used, but the integer
multiplication lowering uses an unusual stride and subreg_offset
combination, which breaks when one source is an attribute.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970
Cc: "10.6 11.0" <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
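For reference, the byte a given channel reads is a function of all three
fields, roughly as below (illustrative formula assuming 32-byte GRFs, not
the driver's code):

    // Byte offset of logical channel i of a source region.
    unsigned chan_byte_offset(unsigned reg, unsigned subreg_offset,
                              unsigned stride, unsigned type_size, unsigned i)
    {
        const unsigned REG_SIZE = 32;              /* bytes per GRF */
        return reg * REG_SIZE + subreg_offset + i * stride * type_size;
    }
    // Dropping subreg_offset and stride here is what mis-addressed the
    // lowered integer multiplication's sources.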
| |
There are some bug reports about shaders failing to compile in gen6
because MRF 14 is used when we need to spill. For example:
https://bugs.freedesktop.org/show_bug.cgi?id=86469
https://bugs.freedesktop.org/show_bug.cgi?id=90631
Discussion in bugzilla pointed to the fact that gen6 might actually have
24 MRF registers available instead of 16, so we could use other MRF
registers and avoid these conflicts (we still need to investigate why
some shaders need up to MRF 14 anyway, since this is not expected).
Notice that the hardware docs are not clear about this fact:
SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device
Hardware" says "Number per Thread" - "24 registers"
However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says:
"Normal threads should construct their messages in m1..m15. (...)
Regardless of actual hardware implementation, the thread should
not assume that MRF addresses above m15 wrap to legal MRF registers."
Therefore experimentation was necessary to evaluate whether we had these
extra MRF registers available. This was tested in gen6 using MRF
registers 21..23 for spilling and doing a full piglit run (all.py) forcing
spilling of everything on the FS backend. It was also tested by doing
spilling of everything on both the FS and the VS backends with a piglit run
of shader.py. In both cases no regressions were observed. In fact, many of
these tests were helped in the cases where we forced spilling, since that
triggered the same underlying problem described in the bug reports. Here are
some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on
gen6 hardware:
Using MRFs 13..15 for spilling:
crash: 2, fail: 113, pass: 6621, skip: 5461
Using MRFs 21..23 for spilling:
crash: 2, fail: 12, pass: 6722, skip: 5461
This patch lays the groundwork for later patches to implement spilling
using MRF registers 21..23 in gen6.
Reviewed-by: Kenneth Graunke <[email protected]>
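The follow-up change then boils down to picking a different spill base,
along the lines of this hypothetical helper (name and structure made up):

    // Hypothetical: first MRF used for spill/unspill messages.
    unsigned spill_base_mrf(int gen)
    {
        // Gen6 turned out to have 24 MRFs, so spills can live in m21..m23,
        // away from the m13..m15 range that ordinary messages may reach.
        return gen == 6 ? 21 : 13;
    }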
| |
When preparing the barrier payload, the instructions should operate in
SIMD8 mode, since we only use one payload register.
fs_inst::regs_read is also updated to indicate that it only reads one
register for SHADER_OPCODE_BARRIER.
These issues were flagged by:
commit cadd7dd384b33a779d46bd664f456bed4a21a5b7
Author: Jason Ekstrand <[email protected]>
Date: Thu Jul 2 15:41:02 2015 -0700
i965/fs: Add a very basic validation pass
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
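The regs_read change amounts to a special case like this toy model (the
real method lives on fs_inst; the general case is simplified here):

    // Toy model of registers read by an instruction's payload source.
    enum toy_opcode { SHADER_OPCODE_BARRIER, TOY_OPCODE_OTHER };

    int regs_read(toy_opcode op, int exec_size, int type_size)
    {
        if (op == SHADER_OPCODE_BARRIER)
            return 1;   // the barrier message uses a single payload register
        // Otherwise a source covers exec_size channels, rounded up to
        // whole 32-byte GRFs.
        return (exec_size * type_size + 31) / 32;
    }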
| |
Currently the validation pass only validates that regs_read and
regs_written are consistent with the sizes of VGRFs. We can add more as
we find it to be useful.
Reviewed-by: Matt Turner <[email protected]>
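The check is conceptually no more than this sketch (toy types, not the
actual pass):

    #include <cassert>
    #include <vector>

    struct toy_inst { int dst_vgrf, dst_reg_offset, regs_written;
                      int src_vgrf, src_reg_offset, regs_read; };

    // Validate that every access stays within its VGRF's allocated size.
    void validate(const std::vector<toy_inst> &insts,
                  const std::vector<int> &vgrf_sizes)
    {
        for (const toy_inst &inst : insts) {
            assert(inst.dst_reg_offset + inst.regs_written <=
                   vgrf_sizes[inst.dst_vgrf]);
            assert(inst.src_reg_offset + inst.regs_read <=
                   vgrf_sizes[inst.src_vgrf]);
        }
    }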
| |
We're trying to avoid a libdrm dependency in the core compiler, so let's
move the perf_debug code one level up from the brw_*_emit() helpers to
the brw_codegen_*_prog() helpers.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
| |
All other precompile functions live in the brw_<stage>.c files; make fs
follow the convention.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
| |
This moves the compute shader code around in order to make the way the
code is split up more consistent. There should be no functional changes.
Typically we have a few files per stage:
brw_vs.c, brw_wm.c, brw_gs.c:
code to drive code generation and implement precompiling and
cache search.
genX_<stage>_state.c
gen-specific implementation of the state emission for the shader
stage.
The brw_*_emit() functions are all in the same files as the visitor
classes they use (with the exception of VS, which may use either vec4 or
fs).
To make compute follow this convention, we move the brw_cs_emit()
function into brw_fs.cpp. We can then rename brw_cs.cpp to brw_cs.c and
do this in C like the other similar files. Finally, move state setup
and atoms to gen7_cs_state.c.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
| |
first_non_payload_grf may be updated in assign_urb_setup for FS or
assign_vs_urb_setup for VS.
We need to set this in assign_curb_setup for compute shaders since cs
does not have an assign_cs_urb_setup like assign_urb_setup (fs) or
assign_vs_urb_setup (vs).
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
| |
Signed-off-by: Ilia Mirkin <[email protected]>
[v2: kayden-supplied code in fs_nir replacing need for logical opcode]
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This living in brw_fs.{h,cpp} is a historical artifact of us supporting
texturing for fragment shaders before any other stages. It's kind of
awkward given that we use it for all stages.
This avoids having to include brw_fs.h in geometry shader code in order
to access this function.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
| |
The lowered code reads from the destination, which isn't possible from
message registers.
Fixes the following dEQP tests on SNB:
dEQP-GLES3.functional.shaders.precision.int.highp_mul_fragment
dEQP-GLES3.functional.shaders.precision.int.mediump_mul_fragment
dEQP-GLES3.functional.shaders.precision.int.lowp_mul_fragment
Cc: "10.6 11.0" <[email protected]>
Tested-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
| |
Noticed when debugging things that led to the next patch.
On G45 (and presumably ILK) this helps register coalescing:
total instructions in shared programs: 4077373 -> 4077340 (-0.00%)
instructions in affected programs: 43751 -> 43718 (-0.08%)
helped: 52
HURT: 2
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
The split_virtual_grfs code doesn't properly rewrite reladdr, so we need to
make sure that any uniform indirects are lowered away first.
This fixes glsl-fs-uniform-indexed-by-swizzled-vec4.shader_test in piglit.
Cc: "10.6" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Now that all constant locations are assigned in a single function, we can
refactor it a bit to unify things. In particular, we now handle
pull_constant_loc and push_constant_loc more similarly and we only modify
stage_prog_data->params[] in one place at the end of the function.
Reviewed-by: Kenneth Graunke <[email protected]>
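Schematically, the unified assignment looks like this sketch (the push
budget is a placeholder, not the hardware limit):

    #include <cstddef>
    #include <vector>

    // Toy version: every live uniform gets either a push slot or a pull
    // slot, and nothing else is touched until the very end.
    void assign_constant_locations(const std::vector<bool> &live,
                                   std::vector<int> &push_loc,
                                   std::vector<int> &pull_loc)
    {
        const int MAX_PUSH = 32;          // placeholder budget
        int num_push = 0, num_pull = 0;
        push_loc.assign(live.size(), -1);
        pull_loc.assign(live.size(), -1);
        for (std::size_t u = 0; u < live.size(); u++) {
            if (!live[u])
                continue;                 // dead uniforms get no slot at all
            if (num_push < MAX_PUSH)
                push_loc[u] = num_push++;
            else
                pull_loc[u] = num_pull++;
        }
        // Only now, with all locations final, would the real code rewrite
        // stage_prog_data->params[] in a single place.
    }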
| |
move_uniform_array_access_to_pull_constants
The comment above move_uniform_array_access_to_pull_constants was
completely bogus because it has nothing to do with lowering instructions.
Instead, it's assigning locations of pull constants.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
I want to use C function pointers to these, and they don't use anything
in the visitor classes anyway.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
| |
This way they don't implicitly increment the uniforms variable and don't
have to be called in sequence during uniform setup.
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
The new name more accurately represents what it does: Set up a single vec4
uniform value.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Matt Turner <[email protected]>
| |
width.
This extends the SIMD lowering pass to enforce the hardware limitation
that no directly-addressed source may read more than 2 physical GRFs.
One can easily go over this limit when doing 64-bit arithmetic
(e.g. FP64 or extended-precision integer MULs) or SIMD32, so it's nice
to be able to just emit an instruction of the intended execution size
from the visitor and let the lowering pass deal with this restriction
transparently.
Some hardware arithmetic instructions are not handled here, including
all instructions that use the accumulator implicitly (which the SIMD
lowering pass deliberately doesn't handle), instructions with
non-per-channel sources (e.g. LINE or PLANE) and SEND-like
instructions, which need special handling most likely as virtual
opcodes.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
| |
Otherwise it would crash on Gen8 with scalar VS. The issue can easily
be reproduced with the following patch, but I don't see any reason why
it wouldn't be possible to end up with an ATTR argument here even
without it.
CC: [email protected]
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
| |
Translate MULH into the MUL/MACH sequence. This does roughly the same
thing that nir_emit_alu() used to do, but we can now handle 16-wide by
taking advantage of the SIMD lowering pass. The force_sechalf
workaround near the bottom is required because the SIMD lowering pass
will emit instructions with non-zero quarter control and we need to
make sure we avoid that on integer arithmetic instructions with
implicit accumulator access due to a known hardware bug on IVB.
Reviewed-by: Matt Turner <[email protected]>
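Per channel, the MUL/MACH sequence computes the high half of the full
product, i.e.:

    #include <cstdint>

    // Scalar semantics of the MULH virtual opcode (signed variant).
    int32_t mulh(int32_t a, int32_t b)
    {
        return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
    }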
| |
In order to make room for the code that will lower the MULH virtual
instruction. Also move the hardware generation and execution type
checks into the same branch; they are going to have to be different
for MULH.
Reviewed-by: Matt Turner <[email protected]>
| |
AFAIK BXT has the same annoying alignment limitation as CHV on the
source register regions of 32x32 bit MULs, so give it the same treatment.
Reviewed-by: Matt Turner <[email protected]>
| |
Literals without an f/F suffix are of type double, and implicit
conversion rules specify that the float in (float op double) be
converted to a double before the operation is performed. I believe float
execution was intended (in nearly all cases) or is sufficient (in the
case of gen7_urb.c).
Removes a lot of float <-> double conversion instructions and replaces
many double instructions with float instructions, which are cheaper.
text data bss dec hex filename
4928659 195160 26192 5150011 4e953b i965_dri.so before
4928315 195152 26192 5149659 4e93db i965_dri.so after
Reviewed-by: Iago Toral Quiroga <[email protected]>
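The pattern being fixed, in miniature:

    // 0.5 is a double, so the multiply is performed in double precision
    // and the result converted back: float -> double -> float.
    float half_slow(float x) { return x * 0.5; }

    // With the f suffix the whole operation stays in float.
    float half_fast(float x) { return x * 0.5f; }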
| |
Not a typo. Replace the default builder with one of bogus width to
catch cases in which optimization passes assume that the default
dispatch width is good enough. The execution controls of instructions
emitted during optimization should in general match the original code
that is being manipulated. Many of the problems fixed in this series
were caught by the assertions introduced in this patch.
Reviewed-by: Jason Ekstrand <[email protected]>
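The trick, reduced to a toy (fs_builder itself is more involved):

    #include <cassert>

    // Make the default execution width invalid so any pass that forgets to
    // pick a width trips an assertion instead of silently emitting code at
    // the wrong width.
    struct toy_builder {
        unsigned width;

        explicit toy_builder(unsigned w = 0) : width(w) {}   // 0 == bogus

        toy_builder at_width(unsigned w) const { return toy_builder(w); }

        void emit_mov() const
        {
            // Emitting with the bogus default means some pass assumed the
            // default dispatch width was good enough.
            assert(width != 0);
        }
    };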
| |
lower_simd_width.
This could have led to somewhat increased bandwidth usage for lowered
texturing instructions on Gen4 (which is the only case in which
lower_width may be greater than inst->exec_size). After the previous
patches the invariant mentioned in the comment should no longer be
assumed by any of the other optimization and lowering passes, so the
exec_all() call shouldn't be necessary anymore.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
work-arounds.
Instead of relying on the default one. This shouldn't lead to any
functional changes because DEP_RESOLVE_MOV overrides the execution
size of the instruction anyway and other execution controls are
irrelevant.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
instruction.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
instruction.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
opt_sampler_eot() was relying on the default builder to have the same
width as the sampler and FB write opcodes it was eliminating; the
channel selects didn't matter because the builder was only being used
to allocate registers - no new instructions were being emitted with it.
A future commit will change the width of the default builder, which will
break this assumption, so initialize it explicitly here.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
lower_integer_multiplication() was ignoring the execution controls of
the original MUL instruction. Fix it by using the new fs_builder
constructor.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
demote_pull_constants() was ignoring the execution size and channel
selects of the instruction that wanted the constant, which doesn't
matter for uniform pull constant loads because all channels get the
same scalar value, but it might for varying pull constant loads. Fix
it by using the new fs_builder() constructor that takes care of
setting execution controls compatible with the instruction passed as
argument.
Reviewed-by: Jason Ekstrand <[email protected]>
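Reduced to toy types, the new constructor simply inherits the
instruction's execution controls:

    // Derive a builder whose emitted instructions execute on exactly the
    // channels of `inst`, instead of the shader's full dispatch width.
    struct toy_inst { unsigned exec_size, group; bool force_writemask_all; };

    struct toy_builder {
        unsigned exec_size, group;
        bool force_writemask_all;

        explicit toy_builder(const toy_inst &inst)
            : exec_size(inst.exec_size), group(inst.group),
              force_writemask_all(inst.force_writemask_all) {}
    };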
| |
Reviewed-by: Jason Ekstrand <[email protected]>
| |
width.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
Each logical variant is largely equivalent to the original opcode but
instead of taking a single payload source it expects its arguments
separately as individual sources, like:
typed_surface_write_logical null, coordinates, source, surface,
num_coordinates, num_components
This patch defines the opcodes and usual instruction boilerplate,
including a placeholder lowering function provided mainly as
documentation for their source registers.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
This cleans up the VEC4 implementation of setup_uniform_values()
somewhat and will avoid duplication of the image uniform upload code
by having a common interface to upload a vector of uniforms on either
back-end.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This should match the set of cases in which we currently call fail()
or no16() from the emit_texture_*() methods and the ones in which
emit_texture_gen4() enables the SIMD16 workaround.
Hint for reviewers: It's not a big deal if I happen to have missed
some case here; it will just lead to an assertion failure down the
road which is easily fixable. However, being stricter than necessary
won't cause any visible breakage - it would just decrease performance
silently due to the unnecessary message splitting - so feel free to
double-check that all cases listed here already cause a SIMD8/16
fall-back with the current texturing code. (You may want to skip over
the Gen5-6 cases if you don't have pencil and paper at hand.)
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Unlike its Gen5 and Gen7 counterparts this patch isn't a plain
refactor of the previous Gen4 texturing code, it's more of a rewrite
largely based on emit_texture_gen4_simd16(). The reason is that on
the one hand the original emit_texture_gen4() code didn't seem easily
fixable to be SIMD width-invariant and had plenty of clutter to
support SIMD-width workarounds which are no longer required. On the
other hand emit_texture_gen4_simd16() was missing a number of
SIMD8-only opcodes. This should generalize both and roughly match
their current behaviour where there is overlap.
Incidentally this will fix the following piglits on Gen4:
arb_shader_texture_lod.execution.arb_shader_texture_lod-texgrad
arb_shader_texture_lod.execution.tex-miplevel-selection *gradarb 2d
arb_shader_texture_lod.execution.tex-miplevel-selection *gradarb 3d
arb_shader_texture_lod.execution.tex-miplevel-selection *projgradarb 2d
arb_shader_texture_lod.execution.tex-miplevel-selection *projgradarb 2d_projvec4
arb_shader_texture_lod.execution.tex-miplevel-selection *projgradarb 3d
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This should be largely equivalent to emit_texture_gen5() except for
slight codestyle changes and the use of i965 opcodes instead of the
ir_texture_opcode enum; see "i965/fs: Implement lowering of logical
texturing opcodes on Gen7+." for the mapping between them.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
These weren't being handled by emit_texture_gen7() but we can easily
lower them here for consistency with other texturing opcodes.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This should be largely equivalent to emit_texture_gen7() except that
we now get i965 sampling opcodes directly rather than
ir_texture_opcode enum values. The mapping is as follows:
- ir_tex -> SHADER_OPCODE_TEX
- ir_txb -> FS_OPCODE_TXB
- ir_txl -> SHADER_OPCODE_TXL
- ir_txd -> SHADER_OPCODE_TXD
- ir_txf -> SHADER_OPCODE_TXF
- ir_txf_ms -> SHADER_OPCODE_TXF_CMS
- ir_txs -> SHADER_OPCODE_TXS
- ir_query_levels -> SHADER_OPCODE_TXS too; the visitor will make sure
  that the provided lod value is zero in this case.
- ir_lod -> SHADER_OPCODE_LOD
- ir_tg4 -> SHADER_OPCODE_TG4_OFFSET if the offset value is not
immediate, SHADER_OPCODE_TG4 otherwise.
Other than that there are only minor changes and style fixes like the
implementation now being factored out into static functions to improve
encapsulation.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
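Transcribed into code, the mapping above is a single switch; this sketch
uses only the names listed in the log (the enum definitions here are
placeholders, not the driver's):

    enum ir_op { ir_tex, ir_txb, ir_txl, ir_txd, ir_txf, ir_txf_ms,
                 ir_txs, ir_query_levels, ir_lod, ir_tg4 };
    enum brw_op { SHADER_OPCODE_TEX, FS_OPCODE_TXB, SHADER_OPCODE_TXL,
                  SHADER_OPCODE_TXD, SHADER_OPCODE_TXF, SHADER_OPCODE_TXF_CMS,
                  SHADER_OPCODE_TXS, SHADER_OPCODE_LOD, SHADER_OPCODE_TG4,
                  SHADER_OPCODE_TG4_OFFSET };

    brw_op map_opcode(ir_op op, bool offset_is_immediate)
    {
        switch (op) {
        case ir_tex:          return SHADER_OPCODE_TEX;
        case ir_txb:          return FS_OPCODE_TXB;
        case ir_txl:          return SHADER_OPCODE_TXL;
        case ir_txd:          return SHADER_OPCODE_TXD;
        case ir_txf:          return SHADER_OPCODE_TXF;
        case ir_txf_ms:       return SHADER_OPCODE_TXF_CMS;
        case ir_txs:          return SHADER_OPCODE_TXS;
        case ir_query_levels: return SHADER_OPCODE_TXS;  // lod forced to 0
        case ir_lod:          return SHADER_OPCODE_LOD;
        case ir_tg4:          return offset_is_immediate ?
                                     SHADER_OPCODE_TG4 :
                                     SHADER_OPCODE_TG4_OFFSET;
        }
        return SHADER_OPCODE_TEX;  // unreachable
    }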
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>