mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	ac: add ac_build_isign()	Samuel Pitoiset	2018-03-05	4	-32/+30
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	ac: add ac_build_fract()	Samuel Pitoiset	2018-03-05	4	-34/+39
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	virgl: add offset alignment values to to v2 caps struct	[email protected]	2018-03-05	3	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	glBindBufferRange(..) in vrend_draw_bind_ubo is failing with more than one uniform block. This is due to improper alignment of the start of the second block. Let's query the proper alignment from the driver and pass it back to Mesa. Let's query for the texture alignment too, even though the Virgl renderer doesn't call glTexBufferRange yet. The default values are the widest workable range possible (for example, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256). Fixes: dEQP-GLES3.functional.ubo.* on Nvidia Example test: dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex Note: This is based on "virgl: reduce some default capset limits.", which hasn't landed in Mesa yet but should relatively soon. Signed-off-by: Dave Airlie <[email protected]>
*	virgl: reduce some default capset limits.	Dave Airlie	2018-03-05	1	-8/+8
\| \| \| \| \| \| \| \| \|	Since v2 might take a while to rollout, we should reduce these inside some gathered minimums and then v2 can increase them using host values. Reviewed-by: Stéphane Marchesin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	virgl: handle getting new capsets.	Dave Airlie	2018-03-05	6	-31/+52
\| \| \| \| \| \| \| \|	This checks the kernel api is new enough and asks for the larger caps size since the kernel won't mess it up now. Reviewed-by: Stéphane Marchesin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radeonsi/nir: call ac_lower_indirect_derefs()	Timothy Arceri	2018-03-05	4	-4/+6
\| \| \| \| \| \| \| \|	Fixes piglit tests: tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi: add chip class to compiler_ctx_state	Timothy Arceri	2018-03-05	3	-0/+4
\| \| \| \| \| \|	This will be used in the following patch. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c	Timothy Arceri	2018-03-05	5	-48/+44
\| \| \| \| \| \| \|	Until llvm handles indirects better we will need to use these workarounds in the radeonsi backend also. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: Fix copying from 3D images starting at non-zero depth.	Bas Nieuwenhuizen	2018-03-05	1	-0/+3
\| \| \| \| \|	Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
*	swr/rast: Fix macOS macro.	Vinson Lee	2018-03-04	1	-2/+2
\| \| \| \| \| \| \|	Fixes: a25093de7188 ("swr/rast: Implement JIT shader caching to disk") Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-By: George Kyriazis <[email protected]>
*	vbo: Try to reuse the same VAO more often for successive dlists.	Mathias Fröhlich	2018-03-03	1	-3/+14
\| \| \| \| \| \| \| \| \| \| \|	The change tries to catch more opportunities to reuse the same set of VAO's when building up display lists. Instead of checking the offset with respect to the beginning of the vertex buffer object the change tries to apply this same optimization with respect to the previous display list node. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
*	mesa: Silence unused parameter warnings from TEXSTORE_PARAMS	Ian Romanick	2018-03-02	2	-13/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 1717 warnings to 1547 warnings by silencing 170 instances of things like In file included from ../../SOURCE/master/src/mesa/main/texcompress_bptc.h:30:0, from ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:31: ../../SOURCE/master/src/mesa/main/texcompress_bptc.c: In function ‘_mesa_texstore_bptc_rgba_unorm’: ../../SOURCE/master/src/mesa/main/texstore.h:60:14: warning: unused parameter ‘dstFormat’ [-Wunused-parameter] mesa_format dstFormat, \ ^ ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:1276:32: note: in expansion of macro ‘TEXSTORE_PARAMS’ _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS) ^~~~~~~~~~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	i965: Silence unused parameter warnings in genX_state_upload	Ian Romanick	2018-03-02	1	-20/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 1772 warnings to 1717 warnings by silencing 55 instances of things like ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_vertex_buffer_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:313:41: warning: unused parameter ‘end_offset’ [-Wunused-parameter] unsigned end_offset, ^~~~~~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_sampler_state_pointers_xs’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4689:58: warning: unused parameter ‘brw’ [-Wunused-parameter] genX(emit_sampler_state_pointers_xs)(struct brw_context brw, ^~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4690:62: warning: unused parameter ‘stage_state’ [-Wunused-parameter] struct brw_stage_state stage_state) ^~~~~~~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_upload_default_color’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4730:40: warning: unused parameter ‘format’ [-Wunused-parameter] mesa_format format, GLenum base_format, ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘translate_wrap_mode’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4906:41: warning: unused parameter ‘brw’ [-Wunused-parameter] translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest) ^~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_update_sampler_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4972:37: warning: unused parameter ‘batch_offset_for_sampler_state’ [-Wunused-parameter] uint32_t batch_offset_for_sampler_state) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	isl: Silence unused parameter warnings in __gen_combine_address implementations	Ian Romanick	2018-03-02	2	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 1808 warnings to 1772 warnings by silencing 36 instances of things like ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’: ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter] __gen_combine_address(void data, void loc, uint64_t addr, uint32_t delta) ^~~~ ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter] __gen_combine_address(void data, void loc, uint64_t addr, uint32_t delta) ^~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	genxml: Silence unused parameter warnings in generated pack code	Ian Romanick	2018-03-02	1	-3/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 1960 warnings to 1808 warnings by silencing 152 instances of things like In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0, from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36: src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’: src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_uint(uint64_t v, uint32_t start, uint32_t end) ^~~ src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’: src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter] __gen_offset(uint64_t v, uint32_t start, uint32_t end) ^~~~~ src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_offset(uint64_t v, uint32_t start, uint32_t end) ^~~ src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’: src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits) ^~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	i965: Silence unused parameter warnings in blorp	Ian Romanick	2018-03-02	3	-17/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 2023 warnings to 1960 warnings by silencing 63 instances of things like In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params params) ^~~~~~ ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params params) ^~~~~~ In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0: ../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params params) ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch batch, void start, size_t size) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch batch, void start, size_t size) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch batch, void *start, size_t size) ^~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	nir: Silence unused parameter warnings in generated nir_constant_expressions ↵	Ian Romanick	2018-03-02	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	code Reduces my build from 2075 warnings to 2023 warnings by silencing 52 instances of things like src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’: src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter] evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size, ^~~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	i965: Silence unused parameter warnings in generated OA code	Ian Romanick	2018-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 6301 warnings to 2075 warnings by silencing 4226 instances of things like src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c: In function ‘hsw__render_basic__gpu_core_clocks__read’: src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c:41:62: warning: unused parameter ‘brw’ [-Wunused-parameter] hsw__render_basic__gpu_core_clocks__read(struct brw_context *brw, ^~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	i965: Silence warnings about mixing enum and non-enum in conditional	Ian Romanick	2018-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 6451 warnings to 6301 warnings by silencing 150 instances of ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info, const brw_inst)’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra] unsigned file = __builtin_strcmp("dst", #reg) == 0 ? \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~ BRW_GENERAL_REGISTER_FILE : \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ brw_inst_##reg##_reg_file(devinfo, inst); \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’ REG_TYPE(src1) ^~~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	intel/compiler: Silence unused parameter warnings in release builds	Ian Romanick	2018-03-02	2	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 7005 warnings to 6451 warnings by silencing 554 instances of In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0: ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_3src_a1_src0_imm(const struct gen_device_info devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_3src_a1_src2_imm(const struct gen_device_info devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_set_3src_a1_src0_imm(const struct gen_device_info devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_set_3src_a1_src2_imm(const struct gen_device_info devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_imm_uq(const struct gen_device_info devinfo, const brw_inst insn) ^~~~~~~ In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0, from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29: ../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’: ../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’: ../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter] unsigned _reg_file, ^~~~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	i965: Silence unused parameter warnings	Ian Romanick	2018-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduces my build from 7119 warnings to 7005 warnings by silencing 114 instances of In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_context.h:46:0, from ../../SOURCE/master/src/mesa/drivers/dri/i965/intel_pixel_read.c:38: ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h: In function ‘brw_bo_unmap’: ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h:258:47: warning: unused parameter ‘bo’ [-Wunused-parameter] static inline int brw_bo_unmap(struct brw_bo *bo) { return 0; } ^~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	intel: Drop program size pointer from vec4/fs assembly getters.	Kenneth Graunke	2018-03-02	9	-25/+17
\| \| \| \| \| \| \| \| \|	These days, we're just passing a pointer to a prog_data field, which we already have access to. We can just use it directly. (In the past, it was a pointer to a separate value.) Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT.	Kenneth Graunke	2018-03-02	1	-1/+3
\| \| \| \| \| \| \| \| \|	This should have no practical impact. For the default uploader, we don't really care, but for others, we may want to append more data as the GPU is reading existing data, which means we need async and persistent flags. Reviewed-by: Chris Wilson <[email protected]>
*	i965: Generalize intel_upload.c to support multiple uploaders.	Kenneth Graunke	2018-03-02	9	-91/+101
\| \| \| \| \| \| \| \| \| \| \| \| \|	I'd like to reuse the upload logic for a new program cache, but the buffers will need to have a different lifetime than the default uploader, and also some address space restrictions. So, we can't use a single uploader for both situations - we'll need two of them. This creates a public 'uploader' structure, and adjusts the interface to take an uploader rather than always using brw->upload. It should have no functional change at the moment. Reviewed-by: Chris Wilson <[email protected]>
*	intel/compiler: Memory fence commit must always be enabled for gen10+	Anuj Phogat	2018-03-02	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Commit bit in the message descriptor (Bit 13) must be always set to true in CNL+ for memory fence messages. It also fixes a piglit GPU hang on cnl+ in simulation environment. Piglit test: arb_shader_image_load_store-shader-mem-barrier See HSD ES # 1404612949 Signed-off-by: Anuj Phogat <[email protected]> Cc: [email protected] Reviewed-by: Francisco Jerez <[email protected]>
*	Revert "i965/fs: Predicate byte scattered writes if needed"	Francisco Jerez	2018-03-02	1	-14/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit a4031bdfa927fb4c3c5d0bdadc70634f3c1a5eac. It's redundant with the sample mask predication done at this point by the common logical send lowering infrastructure, and rather buggy because it wasn't applying the correct sample mask in shaders using discard, since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS doesn't reflect samples discarded by the shader, so it could have led to data corruption in fragment shader invocations that execute discard based on a non-dynamically uniform condition. Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/fs: Handle surface opcode sample masks via predication.	Francisco Jerez	2018-03-02	1	-1/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main motivation is to enable HDC surface opcodes on ICL which no longer allows the sample mask to be provided in a message header, but this is enabled all the way back to IVB when possible because it decreases the instruction count of some shaders using HDC messages significantly, e.g. one of the SynMark2 CSDof compute shaders decreases instruction count by about 40% due to the removal of header setup boilerplate which in turn makes a number of send message payloads more easily CSE-able. Shader-db results on SKL: total instructions in shared programs: 15325319 -> 15314384 (-0.07%) instructions in affected programs: 311532 -> 300597 (-3.51%) helped: 491 HURT: 1 Shader-db results on BDW where the optimization needs to be disabled in some cases due to hardware restrictions: total instructions in shared programs: 15604794 -> 15598028 (-0.04%) instructions in affected programs: 220863 -> 214097 (-3.06%) helped: 351 HURT: 0 The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL laptop with this change. According to Eero this improves performance of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2 desktop. Reviewed-by: Kenneth Graunke <[email protected]> Tested-By: Eero Tamminen <[email protected]>
*	intel/eu: Plumb header present bit to codegen helpers for HDC messages.	Francisco Jerez	2018-03-02	4	-29/+50
\| \| \| \| \| \| \| \| \| \|	This makes sure that the header-present bit of the message descriptor is in sync with the IR instruction fields, which gives the optimizer more control to avoid the overhead of setting up a message header when it's possible to do so. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/ir: Allow arbitrary scratch flag registers for ↵	Francisco Jerez	2018-03-02	3	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	SHADER_OPCODE_FIND_LIVE_CHANNEL. This shouldn't cause any functional change at this point, it changes SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at the IR level instead of the hard-coded f1.0, now that it can be represented in backend_instruction::flag_subreg. This will be necessary for scheduling to behave correctly once more things start making use of f1.0. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/ir: Allow representing additional flag subregisters in the IR.	Francisco Jerez	2018-03-02	7	-14/+24
\| \| \| \| \| \| \| \| \|	This allows representing conditional mods and predicates on f1.0-f1.1 at the IR level by adding an extra bit to the flag_subreg backend_instruction field. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/l3: Don't allocate SLM partition on ICL+.	Francisco Jerez	2018-03-02	1	-1/+1
\| \| \| \| \| \| \| \|	SLM has a chunk of special-purpose memory separate from L3 on ICL+, we shouldn't allocate a partition for it on L3 anymore. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs	Charmaine Lee	2018-03-02	1	-1/+2
\| \| \| \| \| \| \| \|	Since geometry shader also consumes prescale constants, the geometry shader constant buffer will need to be updated when prescale factor is changed. Reviewed-by: Brian Paul <[email protected]>
*	svga: fix blending regression	Brian Paul	2018-03-02	1	-11/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend state when it's disabled for a colorbuffer") subtly changed the details of gallium's per-RT blend state. In particular, when pipe_rt_blend_state[i].blend_enabled is true, we have to get the src/dst blend terms from pipe_rt_blend_state[i], not [0] as before. We now have to scan the blend targets to find the first one that's enabled (if any). We have to use the index of that target for getting the src/dst blend terms. And note that we have to set identical blend terms for all targets. This fixes the Piglit fbo-drawbuffers2-blend test. VMware bug 2063493. Reviewed-by: Charmaine Lee <[email protected]>
*	svga: check svga_have_vgpu10() in svga_delete_blend_state()	Brian Paul	2018-03-02	1	-1/+1
\| \| \| \| \| \| \|	We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not enabled (bs->id==0 by default), resulting in lots of device errors. Reviewed-by: Neha Bhende<[email protected]>
*	svga: if svga_update_state() fails, skip the draw call	Brian Paul	2018-03-02	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If svga_update_state() fails, we flush the command buffer and retry. If it fails again, it likely means we were unable to translate a shader for some reason (uses too many resources, for example). In that case, let's just skip the draw call. The alternative, just disabling the shader stage in question, would certainly lead to bad rendering anyway, and probably device errors. Fixes failed assertion running Piglit glsl-1.50/execution/ variable-indexing/gs-output-array-vec4-index-wr.shader_test since it uses too many GS output registers (though the test still fails). VMware bug 2063492. v2: also call pipe_debug_message() so apps or apitrace can be notified when this issue occurs. v3: use svga_update_state_retry(). Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
*	svga: let svga_update_state_retry() return a bool	Brian Paul	2018-03-02	2	-6/+9
\| \| \| \| \| \| \|	This will allow minor simplifications elsewhere. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
*	svga: s/unsigned/boolean/ for a few local vars	Brian Paul	2018-03-02	1	-6/+6
\| \| \| \|	Reviewed-by: Charmaine Lee <[email protected]>
*	meson: install vulkan_intel.h header	Dylan Baker	2018-03-02	1	-0/+4
\| \| \| \| \| \| \|	Fixes: d1992255bb29054fa51763376d125183a9f602f3 ("meson: Add build Intel "anv" vulkan driver") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
*	st/omx_bellagio: add picture profile and entry point	Boyuan Zhang	2018-03-02	1	-0/+2
\| \| \| \| \| \| \| \| \|	Profile and entry point were missing in the picture structure. Therefore, add them back. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]>
*	radeonsi: fix radeon create encoder return	Boyuan Zhang	2018-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Previous patch missed a "return" when trying to modify the create encoder function, which made the whole logic fail. Therefore, add the return back. Fixes: b38b208ff8886e799d6a2 "radeonsi:create uvd hevc enc entry" Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
*	loader: Add support for platform and host1x busses	Thierry Reding	2018-03-02	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ARM SoCs usually have their DRM/KMS devices on the platform bus, so add support for this bus in order to allow use of the DRI_PRIME environment variable with those devices. While at it, also support the host1x bus, which is effectively the same but uses an additional layer in the bus hierarchy. Note that it isn't enough to support the bus that has the rendering GPU because the loader code will also try to construct an ID path tag for a scanout-only device if it is the default that is being opened. The ID path tag for a device can be obtained by running udevadm info on the device node, as shown in this example on NVIDIA Tegra: $ udevadm info /dev/dri/card0 \| grep ID_PATH_TAG E: ID_PATH_TAG=platform-50000000_host1x The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is constructed, can be found in the sysfs "uevent" attribute for the card0 device's parent: $ grep OF_FULLNAME /sys/devices/platform/50000000.host1x/drm/uevent OF_FULLNAME=/host1x@50000000 Similarily, /dev/dri/card1 corresponds to the GPU: $ udevadm info /dev/dri/card1 \| grep ID_PATH_TAG E: ID_PATH_TAG=platform-57000000_gpu and: $ grep OF_FULLNAME /sys/devices/platform/57000000.gpu/uevent OF_FULLNAME=/gpu@57000000 Changes in v2: - avoid confusing pre-increment in strdup() - add examples of tags to commit message Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Signed-off-by: Thierry Reding <[email protected]>
*	disk cache: Link with -latomic if necessary	Thierry Reding	2018-03-02	2	-1/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The disk cache implementation uses 64-bit atomic operations. For some architectures, such as 32-bit ARM, GCC will not be able to translate these operations into atomic, lock-free instructions and will instead rely on the external atomics library to provide these operations. Check at configuration time whether or not linking against libatomic is necessary and if so, create a dependency that can be used while linking the mesautil library. This is the meson equivalent of 2ef7f23820a6 ("configure: check if -latomic is needed for __atomic_*"). For some background information on this, see: https://gcc.gnu.org/wiki/Atomic/GCCMM Changes in v2: - clarify meaning of lock-free in commit message - fix build if -latomic is not necessary Acked-by: Matt Turner <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Signed-off-by: Thierry Reding <[email protected]>
*	radv: do not set pending_reset_query in BeginCommandBuffer()	Samuel Pitoiset	2018-03-02	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is just useless for two reasons: 1) flush_bits is not set accordingly, so nothing will be flushed in BeginQuery(). 2) we always flush caches in EndCommandBuffer(), so if a reset is done in a previous command buffer we are safe. Cc: "18.0" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	r600/cayman: fix fragcood loading recip generation.	Dave Airlie	2018-03-02	1	-1/+1
\| \| \| \| \| \| \| \|	This fixes some hangs seen where the recip_ieee opcodes would end up split across the wrong slots. Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	i965: Allow 48-bit addressing on Gen8+.	Kenneth Graunke	2018-03-01	7	-18/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows most GPU objects to use the full 48-bit address space offered by Gen8+ platforms, rather than being stuck with 32-bit. This expands the available GPU memory from 4G to 256TB or so. A few objects - instruction, scratch, and vertex buffers - need to remain pinned in the low 4GB of the address space for various reasons. We default everything to 48-bit but disable it in those cases. Thanks to Jason Ekstrand for blazing this trail in anv first and finding the nasty undocumented hardware issues. This patch simply rips off all of his findings. Reviewed-by: Jordan Justen <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965: Shorten the name of the workaround BO.	Kenneth Graunke	2018-03-01	1	-3/+1
\| \| \| \| \|	This makes the name shorter in debug printouts. If "workaround_bo" is good enough for the code, it's probably good enough for debugging.
*	i965: Add debugging code to dump the validation list.	Kenneth Graunke	2018-03-01	1	-0/+22
\| \| \| \| \|	When anything goes wrong with this code, dumping the validation list is a useful way to figure out what's happening.
*	intel/fs: Set up sampler message headers in the visitor on gen7+	Jason Ekstrand	2018-03-01	2	-22/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This gives the scheduler visibility into the headers which should improve scheduling. More importantly, however, it lets the scheduler know that the header gets written. As-is, the scheduler thinks that a texture instruction only reads it's payload and is unaware that it may write to the first register so it may reorder it with respect to a read from that register. This is causing issues in a couple of Dota 2 vertex shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923 Cc: [email protected] Reviewed-by: Francisco Jerez <[email protected]>
*	ac: fix nir_intrinsic_shared_atomic_comp_swap handling	Timothy Arceri	2018-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Following on from 49879f377870 this makes sure we use the correct src index. Fixes cts test: KHR-GL46.compute_shader.atomic-case3 Reviewed-by: Dave Airlie <[email protected]>
*	st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs	Timothy Arceri	2018-03-02	1	-17/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We only need to check for previously processed location on user defined varyings as they are the only ones that support component packing. Therefore a single instance of processed_locs can be shared by regular varyings and patches. For simplicity we make processed_locs an array in order to handle dual source bleanding. Fixes the follow piglit test on radeonsi: tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test Reviewed-by: Dave Airlie <[email protected]>