mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	intel: decoder: expose helper to test header fields	Lionel Landwerlin	2017-11-01	2	-3/+4
	These fields are of little importance as they're used to recognize instructions. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
*	intel: decoder: don't read qword outside instruction/struct limit	Lionel Landwerlin	2017-11-01	2	-3/+9
	We used to print invalid data when the last field was being clamped to 32 bits due to the Dword Length of the whole instruction. Here is an example where the decoder read part of the next instruction instead of stopping at the 32-bit limit:

	0x000ce0b4: 0x10000002: MI_STORE_DATA_IMM
	0x000ce0b4: 0x10000002 : Dword 0
		DWord Length: 2
		Store Qword: 0
		Use Global GTT: false
	0x000ce0b8: 0x00045010 : Dword 1
		Core Mode Enable: 0
		Address: 0x00045010
	0x000ce0bc: 0x00000000 : Dword 2
	0x000ce0c0: 0x00000000 : Dword 3
		Immediate Data: 8791026489807077376

	With this change we have the proper value:

	0x000ce0b4: 0x10000002: MI_STORE_DATA_IMM (4 Dwords)
	0x000ce0b4: 0x10000002 : Dword 0
		DWord Length: 2
		Store Qword: 0
		Use Global GTT: false
	0x000ce0b8: 0x00045010 : Dword 1
		Core Mode Enable: 0
		Address: 0x00045010
	0x000ce0bc: 0x00000000 : Dword 2
	0x000ce0c0: 0x00000000 : Dword 3
		Immediate Data: 0

	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
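	The clamping described in this commit can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the decoder's actual code: `read_field_clamped` and its parameters are hypothetical names, and the idea that the dword following the instruction was `0x7a000004` is an inference (read as the high half of a qword, it reproduces the bogus Immediate Data value printed above).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): read a field starting at dword
 * index `dw`. If the field is a qword, the high half is only read when it
 * still lies inside the instruction, whose length is `len_dw` dwords. */
static uint64_t
read_field_clamped(const uint32_t *p, unsigned dw, unsigned len_dw,
                   int is_qword)
{
   assert(dw < len_dw);

   uint64_t value = p[dw];
   if (is_qword && dw + 1 < len_dw)
      value |= (uint64_t)p[dw + 1] << 32;

   return value;
}
```

	For a 4-dword instruction whose last dword is 0 and which is followed by 0x7a000004, the clamped read at dword 3 returns 0. Passing a limit of 5 dwords instead lets the read spill into the next instruction and returns 8791026489807077376.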
*	intel: decoder: split out getting the next field and decoding it	Lionel Landwerlin	2017-11-01	1	-10/+21
	Due to the new way we handle fields, we must not forget the first field when decoding instructions. The issue was that the advance function was called first and skipped the first field. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
*	intel: decoder: move field name copy	Lionel Landwerlin	2017-11-01	1	-2/+7
	This should be inside the function that actually decodes fields. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
*	intel: decoder: reorder iterator init function	Lionel Landwerlin	2017-11-01	1	-14/+14
	Making the next change more readable. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
*	intel: common: print out all dword with field spanning multiple dwords	Lionel Landwerlin	2017-11-01	1	-4/+6
	For example, we were skipping Dword 3 in this PIPE_CONTROL:

	0x000ce130: 0x7a000004: PIPE_CONTROL
		DWord Length: 4
	0x000ce134: 0x00000010 : Dword 1
		Flush LLC: false
		Destination Address Type: 0 (PPGTT)
		LRI Post Sync Operation: 0 (No LRI Operation)
		Store Data Index: 0
		Command Streamer Stall Enable: false
		Global Snapshot Count Reset: false
		TLB Invalidate: false
		Generic Media State Clear: false
		Post Sync Operation: 0 (No Write)
		Depth Stall Enable: false
		Render Target Cache Flush Enable: false
		Instruction Cache Invalidate Enable: false
		Texture Cache Invalidation Enable: false
		Indirect State Pointers Disable: false
		Notify Enable: false
		Pipe Control Flush Enable: false
		DC Flush Enable: false
		VF Cache Invalidation Enable: true
		Constant Cache Invalidation Enable: false
		State Cache Invalidation Enable: false
		Stall At Pixel Scoreboard: false
		Depth Cache Flush Enable: false
	0x000ce138: 0x00000000 : Dword 2
		Address: 0x00000000
	0x000ce140: 0x00000000 : Dword 4
		Immediate Data: 0

	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
*	intel: decoder: build sorted linked lists of fields	Lionel Landwerlin	2017-11-01	2	-25/+34
	The xml files don't always have fields in order. This might confuse our parsing of the commands. Let's have the fields in order. To do this, the easiest way is to use a linked list. It also helps a bit with the iterator. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
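	Keeping fields ordered while they are parsed can be sketched as a sorted insert into a singly linked list. This is only an illustration of the idea: `struct field`, its `start` member, and `insert_sorted` are hypothetical names, and sorting by start bit is an assumption about what "in order" means here.

```c
#include <stddef.h>

/* Hypothetical field record: only the start bit and the list link. */
struct field {
   unsigned start;
   struct field *next;
};

/* Insert `f` so that the list stays sorted by ascending start bit.
 * Fields with equal start bits keep their insertion order. */
static void
insert_sorted(struct field **head, struct field *f)
{
   struct field **link = head;

   while (*link != NULL && (*link)->start <= f->start)
      link = &(*link)->next;

   f->next = *link;
   *link = f;
}
```

	Walking through a pointer to the link, rather than a pointer to the node, avoids a special case for inserting at the head of the list.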
*	intel: common: expose gen_spec fields	Lionel Landwerlin	2017-11-01	2	-13/+13
	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
*	intel/compiler: Add functions to get prog_data and prog_key sizes for a stage	Jordan Justen	2017-10-31	2	-0/+42
	v2: * Return unsigned instead of size_t. (Ken) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/compiler: Add union types for prog_data and prog_key stages	Jordan Justen	2017-10-31	1	-0/+22
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: Remove final_program_size from brw_compile_*	Jordan Justen	2017-10-31	11	-71/+40
	The caller can now use brw_stage_prog_data::program_size which is set by the brw_compile_* functions. Cc: Jason Ekstrand <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: add new field for storing program size	Carl Worth	2017-10-31	6	-14/+35
	This will be used by the on disk shader cache. v2: * Set in brw_compile_* rather than brw_codegen_*. (Jason) Signed-off-by: Timothy Arceri <[email protected]> [[email protected]: Only add to brw_stage_prog_data] Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/isl: Disable some gen10 CCS_E formats for now	Nanley Chery	2017-10-31	1	-0/+24
	CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2 formats with CCS_E. None of these formats fit within the current blorp_copy framework, so disable them until support is added. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/genxml: Fix decoding of groups with fields smaller than a DWord.	Kenneth Graunke	2017-10-30	2	-10/+16
	Groups containing fields smaller than a DWord were not being decoded correctly. For example:

	<group count="32" start="32" size="4">
		<field name="Vertex Element Enables" start="0" end="3" type="uint"/>
	</group>

	gen_field_iterator_next would properly walk over each element of the array, incrementing group_iter, and calling iter_group_offset_bits() to advance to the proper DWord. However, the code to print the actual values only considered iter->field->start/end, which are 0 and 3 in the above example. So it would always fetch bits 3:0 of the current DWord when printing values, instead of advancing to each element of the array, printing bits 0-3, 4-7, 8-11, and so on.

	To fix this, we add new iter->start/end tracking, which properly advances for each instance of a group's field.

	Caught by Matt Turner while working on 3DSTATE_VF_COMPONENT_PACKING, with a patch to convert it to use an array of bitfields (the example above).

	This also fixes the decoding of 3DSTATE_SBE's "Attribute Active Component Format" fields.

	Reviewed-by: Jordan Justen <[email protected]>
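	The per-element bit positions this commit talks about can be worked out with a little arithmetic. The sketch below is an illustration only: `struct bit_range` and `group_field_range` are hypothetical names rather than the decoder's iterator, and it assumes a field never straddles a DWord boundary.

```c
#include <assert.h>

/* Where one instance of a group's field lives: which DWord, and the
 * low/high bit inside that DWord. */
struct bit_range {
   unsigned dword;
   unsigned lo;
   unsigned hi;
};

/* Hypothetical helper: compute the location of element `i` of a group that
 * begins at bit `group_start` with `group_size` bits per element, for a
 * field occupying bits field_start..field_end within each element. */
static struct bit_range
group_field_range(unsigned group_start, unsigned group_size,
                  unsigned field_start, unsigned field_end, unsigned i)
{
   unsigned abs_lo = group_start + i * group_size + field_start;
   unsigned abs_hi = group_start + i * group_size + field_end;

   /* Assumption of this sketch: the field stays within one DWord. */
   assert(abs_lo / 32 == abs_hi / 32);

   struct bit_range r = { abs_lo / 32, abs_lo % 32, abs_hi % 32 };
   return r;
}
```

	With the XML example above (start="32", size="4", field bits 0 to 3), element 0 lands on bits 0-3 of DWord 1, element 1 on bits 4-7 of the same DWord, and element 8 on bits 0-3 of DWord 2.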
*	intel: common: silence compiler warning	Lionel Landwerlin	2017-10-30	1	-1/+1
	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: remove unused variable	Eric Engestrom	2017-10-30	1	-3/+0
	Fixes: 2c873060d3578c7004c0 "i965: Delete unused brw_vs_prog_data::nr_attributes field." Cc: Kenneth Graunke <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
*	glsl: Remove ir_binop_greater and ir_binop_lequal expressions	Ian Romanick	2017-10-30	1	-4/+0
	NIR does not have these instructions. TGSI and Mesa IR both implement them using < and >=, respectively. Removing them deletes a bunch of code and means I don't have to add code to the SPIR-V generator for them.

	v2: Rebase on 2+ years of change... and fix a major bug added in the rebase.

	text	data	bss	dec	hex	filename
	8255291	268856	294072	8818219	868e2b	32-bit i965_dri.so before
	8254235	268856	294072	8817163	868a0b	32-bit i965_dri.so after
	7815339	345592	420592	8581523	82f193	64-bit i965_dri.so before
	7813995	345560	420592	8580147	82ec33	64-bit i965_dri.so after

	Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	i965: fix blorp stage_prog_data->param leak	Tapani Pälli	2017-10-30	1	-1/+1
	Patch uses mem_ctx for allocation to ensure param array gets freed later.

	==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193
	==6164==    at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
	==6164==    by 0x12E31C6C: ralloc_size (ralloc.c:121)
	==6164==    by 0x130189F1: fs_visitor::assign_constant_locations() (brw_fs.cpp:2095)
	==6164==    by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715)
	==6164==    by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229)
	==6164==    by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570)
	==6164==    by 0x130C4B07: blorp_compile_fs (blorp.c:194)
	==6164==    by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79)
	==6164==    by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332)
	==6164==    by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261)
	==6164==    by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326)
	==6164==    by 0x12EFF72B: brw_clear (brw_clear.c:297)

	Fixes: 8d90e28839 ("intel/compiler: Allocate pull_param in assign_constant_locations") Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
*	i965: Delete brw_wm_prog_key::drawable_height.	Kenneth Graunke	2017-10-29	1	-1/+0
	This has been unused since we switched to nir_lower_wpos_ytransform. Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler/gen9: Pixel shader header only workaround	Topi Pohjolainen	2017-10-28	1	-0/+29
	Fixes intermittent GPU hangs on Broxton with an Intel internal test case.

	There are plenty of similar fragment shaders in piglit that do not use any varyings or uniforms. According to the documentation, special timing is needed between pipeline stages. Apparently we just don't hit that with piglit. Even with the failing test case one doesn't always get the hang.

	Moreover, according to the error states, the hang happens significantly later than the execution of the problematic shader. There are multiple render cycles (primitive submissions) in between. I've also seen error states where the ACTHD points outside the batch, almost as if the hardware writes somewhere that gets used later on. That would also explain why piglit doesn't suffer from this: most tests kick off one render cycle and any corruption is left unseen.

	v2 (Ken): Instead of enabling push constants, enable one of the inputs (PSIZ).
	v3 (Ken, Jason): Use LAYER instead, making Vulkan's emit_3dstate_sbe() happy.

	Cc: "17.3 17.2" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
*	anv: Fix assert about source attrs.	Kenneth Graunke	2017-10-27	1	-1/+1
	Asserting slot >= 2 made sense when the URB read offset was always 1 (pair of slots). Commit 566a0c43f0b9fbf5106161471dd5061c7275f761 made it possible to read from the VUE header in slot 0, by adjusting the offset to be 0. So, this assert is now bogus. Use the one from GL. Reviewed-by: Jason Ekstrand <[email protected]>
*	anv: Drop URB entry output read handling in 3DSTATE_XS.	Kenneth Graunke	2017-10-27	1	-26/+0
	Commit 566a0c43f0b9fbf5106161471dd5061c7275f761 started setting the 3DSTATE_SBE bit to override these values with the one calculated there. So, they're dead. Stop setting them. Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Delete unused brw_vs_prog_data::nr_attributes field.	Kenneth Graunke	2017-10-27	2	-2/+0
	Reviewed-by: Matt Turner <[email protected]>
*	intel/tools/disasm: correctly observe FILE *out parameter	Kevin Rogovin	2017-10-26	1	-2/+2
	Signed-off-by: Kevin Rogovin <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/compiler: brw_validate_instructions to take const void* instead of void*	Kevin Rogovin	2017-10-26	2	-2/+2
	The disassembler does not (and should not) be modifying the data. Signed-off-by: Kevin Rogovin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	anv/entrypoints: Dump useful data if mako throws an exception	Jason Ekstrand	2017-10-25	1	-5/+17
	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/compiler: Call nir_lower_system_values in brw_preprocess_nir	Jason Ekstrand	2017-10-25	2	-2/+2
	Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir	Jason Ekstrand	2017-10-25	1	-1/+2
	We currently have a bug where nir_lower_system_values gets called before nir_lower_var_copies, so it will miss any system value uses which come from a copy_var intrinsic. Moving it to after brw_preprocess_nir fixes this problem. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
*	anv/pipeline: Drop nir_lower_clip_cull_distance_arrays	Jason Ekstrand	2017-10-25	1	-2/+0
	We already handle it in brw_preprocess_nir. Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv/pipeline: Dump shader immedately after spirv_to_nir	Jason Ekstrand	2017-10-25	1	-0/+15
	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/eu: Use EXECUTE_1 for JMPI	Jason Ekstrand	2017-10-25	2	-2/+1
	The PRM says "The execution size must be 1." In 73137997e23ff6c11, the execution size was set to 1 when it should have been BRW_EXECUTE_1 (which maps to 0). Later, in dc2d3a7f5c217a7cee9, JMPI was used for line AA on gen6 and earlier and we started manually stomping the execution size to BRW_EXECUTE_1 in the generator. This commit fixes the original bug and makes brw_JMPI just do the right thing. Reviewed-by: Matt Turner <[email protected]> Fixes: 73137997e23ff6c1145d036315d1a9ad96651281
*	i965/fs: Add brw_reg_type_from_bit_size utility method	Alejandro Piñeiro	2017-10-25	1	-5/+64
	Returns the brw_type for a given ssa.bit_size and a reference type. So if bit_size is 64 and the reference type is BRW_REGISTER_TYPE_F, it returns BRW_REGISTER_TYPE_DF. Similarly, if bit_size is 32 and the reference type is BRW_REGISTER_TYPE_HF, it returns BRW_REGISTER_TYPE_F.

	v2 (Jason Ekstrand):
	- Use better unreachable() messages
	- Add Q types

	Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
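	The mapping this commit describes can be sketched as a nested switch: the reference type selects a family, and the bit size selects the member of that family. The enum and function below are stand-ins invented for this sketch (they are not the BRW_REGISTER_TYPE_* values or the real helper's signature), and only a float family and a signed-integer family are modelled.

```c
#include <assert.h>

/* Stand-in register types for this sketch only. */
enum reg_type {
   REG_TYPE_HF, REG_TYPE_F, REG_TYPE_DF,   /* 16/32/64-bit float */
   REG_TYPE_W,  REG_TYPE_D, REG_TYPE_Q,    /* 16/32/64-bit signed int */
};

/* Pick the type of the same family as `ref` that is `bit_size` bits wide. */
static enum reg_type
reg_type_from_bit_size(unsigned bit_size, enum reg_type ref)
{
   switch (ref) {
   case REG_TYPE_HF:
   case REG_TYPE_F:
   case REG_TYPE_DF:
      switch (bit_size) {
      case 16: return REG_TYPE_HF;
      case 32: return REG_TYPE_F;
      case 64: return REG_TYPE_DF;
      }
      break;
   case REG_TYPE_W:
   case REG_TYPE_D:
   case REG_TYPE_Q:
      switch (bit_size) {
      case 16: return REG_TYPE_W;
      case 32: return REG_TYPE_D;
      case 64: return REG_TYPE_Q;
      }
      break;
   }

   assert(!"unsupported bit size for this reference type");
   return ref;
}
```

	In this sketch a 64-bit size with a float reference yields the 64-bit float type, and a 32-bit size with a half-float reference yields the 32-bit float type, matching the two cases named in the commit message.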
*	i965/fs/nir: Use the nir_src_bit_size helper	Jason Ekstrand	2017-10-25	1	-9/+3
	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/fs: Handle flag read/write aliasing in needs_src_copy	Jason Ekstrand	2017-10-25	1	-1/+3
	In order to implement the ballot intrinsic, we do a MOV from the flag register to some GRF. If that GRF is used in a SEL, cmod propagation helpfully changes it into a MOV from the flag register with a cmod. This is perfectly valid, but when lower_simd_width comes along, it simply splits it into two instructions which both have conditional modifiers. This is a problem since we're reading the flag register. This commit makes us check whether or not flags_written() overlaps with the flag values that we are reading via the instruction source and, if we have any interference, will force us to emit a copy of the source. Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
*	intel/nir: Zero local index const struct for valgrind & nir_serialize	Jordan Justen	2017-10-25	1	-0/+1
	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	meson: extract out variable for nir_algebraic.py	Rob Clark	2017-10-24	1	-1/+1
	Also needed in freedreno/ir3. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
*	i965: Fix memmem compiler warnings.	Eric Anholt	2017-10-24	1	-1/+2
	gcc is throwing this warning in my meson build:

	../src/intel/compiler/brw_eu_validate.c:50:11: warning argument 1 null where non-null expected [-Wnonnull]
	   return memmem(haystack.str, haystack.len,
	          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	                 needle.str, needle.len) != NULL;
	                 ~~~~~~~~~~~~~~~~~~~~~~~

	The first check for CONTAINS has a NULL error_msg.str and 0 len. The glibc implementation will exit without looking at any haystack bytes if haystack.len < needle.len, so this was safe, but silence the warning anyway by guarding against implementation variability.

	Fixes: 122ef3799d56 ("i965: Only insert error message if not already present") Reviewed-by: Matt Turner <[email protected]>
*	anv: don't assert on device init on Cannonlake	Lionel Landwerlin	2017-10-21	1	-2/+4
	v2: Warn that support is still in alpha (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv: disable stencil pma fix on Gen > 9	Lionel Landwerlin	2017-10-21	1	-0/+2
	This workaround isn't listed on Gen10. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	blorp: enable R32G32B32X32 blorp ccs copies	Lionel Landwerlin	2017-10-21	1	-0/+1
	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/fs: Use align1 mode on ternary instructions on Gen10+	Matt Turner	2017-10-20	1	-4/+8
	Align1 mode offers some nice features over align16, like access to more data types and the ability to use a 16-bit immediate. This patch does not start using any new features. It just emits ternary instructions in align1 mode. Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Add align1 ternary instruction emission support	Matt Turner	2017-10-20	1	-55/+160
	Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Add align1 ternary instruction disassembler support	Matt Turner	2017-10-20	2	-75/+288
	Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Add align1 ternary instruction-word support	Matt Turner	2017-10-20	1	-0/+108
	Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Add align1 ternary instruction support to conversion functions	Matt Turner	2017-10-20	4	-34/+101
	Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Add align1 ternary instruction field encodings	Matt Turner	2017-10-20	1	-0/+35
	Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Add functions to abstract access to 3src register types	Matt Turner	2017-10-20	2	-20/+23
	Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Rename brw_inst's functions that access the 3src register type	Matt Turner	2017-10-20	3	-18/+18
	Put hw_ in the name so that it's clear these are the hardware encodings. Similar to commit 9fb832332868 ("i965: Rename brw_inst's functions that access the register type"). Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Rename brw_inst 3src functions in preparation for align1	Matt Turner	2017-10-20	4	-86/+92
	Reviewed-by: Scott D Phillips <[email protected]>
*	i965: Print subreg in units of type-size on ternary instructions	Matt Turner	2017-10-20	1	-5/+26
	The instruction word contains SubRegNum[4:2] so it's in units of dwords (hence the * 4 to get it in terms of bytes). Before this patch, the subreg would have been wrong for DF arguments. Reviewed-by: Scott D Phillips <[email protected]>
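	The unit conversion mentioned here is small enough to show directly. The sketch below is an assumption-laden illustration, not the disassembler's code: `subreg_in_type_units` is a hypothetical name, and the caller is assumed to supply the operand type's size in bytes.

```c
#include <assert.h>

/* Hypothetical helper: the encoded field holds SubRegNum[4:2], i.e. a
 * count of dwords. Multiply by 4 to get bytes, then divide by the size of
 * the operand type to express the subregister in units of that type. */
static unsigned
subreg_in_type_units(unsigned subreg_field, unsigned type_size_bytes)
{
   assert(type_size_bytes != 0);

   unsigned subreg_bytes = subreg_field * 4;
   return subreg_bytes / type_size_bytes;
}
```

	For a 4-byte type the result equals the encoded field, which is why the difference only shows up for 8-byte types such as DF: an encoded value of 2 means byte offset 8, which is element 1 of a DF register region rather than element 2.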