Commit message | Author | Age | Files | Lines
* spirv: reduce array size in vtn_handle_constant | Karol Herbst | 2019-04-14 | 1 | -1/+1
  we already assert above that there are no more than 3 sources, so it doesn't make sense to use an array of 4 sources

  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/loop_analyze: use nir_const_value.b for boolean results, not u32 | Karol Herbst | 2019-04-14 | 1 | -1/+1
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/print: Use nir_src_as_int for array indices | Jason Ekstrand | 2019-04-14 | 1 | -3/+2
  Reviewed-by: Karol Herbst <[email protected]>
* nir/builder: Add a nir_imm_zero helper | Jason Ekstrand | 2019-04-14 | 4 | -17/+18
  v2: replace nir_zero_vec with nir_imm_zero (Karol Herbst)

  Reviewed-by: Karol Herbst <[email protected]>
* nir/builder: Move nir_imm_vec2 from blorp into the builder | Karol Herbst | 2019-04-14 | 2 | -12/+12
  While we're here, fix a typo which caused it to actually return a vec4 with the third and fourth components zero.

  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* lima: use nir_src_as_float | Karol Herbst | 2019-04-14 | 2 | -9/+2
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
* freedreno/ir3: use nir_src_as_uint in a few places | Karol Herbst | 2019-04-14 | 5 | -51/+20
  v2 (Jason Ekstrand):
   - Add even more places

  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/nir: use nir_src_is_const and nir_src_as_uint | Karol Herbst | 2019-04-14 | 1 | -6/+4
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/nir: Take a nir_tex_instr and src index in brw_texture_offset | Jason Ekstrand | 2019-04-14 | 4 | -27/+21
  This makes things a bit simpler and it's also more robust because it no longer has a hard dependency on the offset being a 32-bit value.
* radv: use nir constant helpers | Karol Herbst | 2019-04-14 | 2 | -20/+10
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* amd/nir: some cleanups | Karol Herbst | 2019-04-14 | 1 | -20/+9
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* panfrost/midgard: Use shared nir_lower_viewport_transform | Alyssa Rosenzweig | 2019-04-14 | 1 | -101/+4
  v2: Run before lowering I/O.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
* nir: Add nir_lower_viewport_transform | Alyssa Rosenzweig | 2019-04-14 | 4 | -0/+105
  On Mali hardware (supported by Panfrost and Lima), the fixed-function transformation from world-space to screen-space coordinates is done in the vertex shader prior to writing out the gl_Position varying, rather than in dedicated hardware. This commit adds a shared NIR pass for implementing coordinate transformation and lowering gl_Position writes into screen-space gl_Position writes.

  v2: Run directly on derefs before io/vars are lowered to cleanup the code substantially. Thank you to Qiang for this suggestion!

  v3: Bikeshed continues.

  v4: Add to Makefile.sources (per Jason's comment). Bikeshed comment.

  Ian and Qiang's reviews are from v3, but no real functional changes from v4. Rob's review is from v4.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Suggested-by: Qiang Yu <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
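For illustration, the transformation this pass describes can be sketched in plain C, outside of NIR. This is a minimal sketch, assuming the usual viewport mapping (perspective divide, then scale and translate). The `vec4` and `viewport` types and the function name are hypothetical and not taken from the Mesa source, and storing 1/w in the last component is likewise an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical types for this example only; not Mesa's definitions. */
struct vec4 { float x, y, z, w; };
struct viewport { float scale[3]; float translate[3]; };

/*
 * Map a clip-space position to screen space:
 *   ndc    = clip.xyz / clip.w
 *   screen = ndc * scale + translate
 * The reciprocal of w is kept in the last component (an assumption).
 */
static struct vec4
viewport_transform(struct vec4 clip, const struct viewport *vp)
{
    assert(clip.w != 0.0f);
    float inv_w = 1.0f / clip.w;

    struct vec4 out;
    out.x = clip.x * inv_w * vp->scale[0] + vp->translate[0];
    out.y = clip.y * inv_w * vp->scale[1] + vp->translate[1];
    out.z = clip.z * inv_w * vp->scale[2] + vp->translate[2];
    out.w = inv_w;
    return out;
}
```

With scale and translate both set to (400, 300, 0.5), the clip-space point (1, -1, 0, 2) maps to the screen-space position (600, 150, 0.5).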
* panfrost: Cleanup indexed draw handling | Alyssa Rosenzweig | 2019-04-14 | 1 | -52/+28
  As part of this cleanup, we use the newly-exposed u_vbuf_get_minmax_index, deduplicating quite a bit of bookkeeping. We also centralize the draw_flags tracking to make this code cleaner / futureproofed; we have already had bugs regarding this field so we might as well get it right now.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Drop dependence on mesa/st | Alyssa Rosenzweig | 2019-04-14 | 2 | -9/+1
  This was used as a workaround for uniform sizing which was fixed in 771adffe ("st: Lower uniforms in st in the...")

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* draw: fix building error in draw_gs_init() | Mauro Rossi | 2019-04-14 | 1 | -1/+1
  Fixes the following build error seen with the Android build system:

  external/mesa/src/gallium/auxiliary/draw/draw_gs.c:740:79: error: address of array 'draw->gs.tgsi.machine->PrimitiveOffsets' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
      if (!draw->gs.tgsi.machine->Primitives[i] || !draw->gs.tgsi.machine->PrimitiveOffsets)
                                                   ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
  1 error generated.

  Fixes: 7720ce3 ("draw: add support to tgsi paths for geometry streams. (v2)")
  Signed-off-by: Mauro Rossi <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
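The warning is easy to reproduce in isolation. A minimal sketch, assuming a struct shaped like the one named in the error: the `machine` type, `MAX_STREAMS`, and the function name below are hypothetical stand-ins, not the real TGSI definitions. A member that is an array never converts to a null pointer, so negating it is always false; checking the individual element is presumably what was intended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STREAMS 4 /* hypothetical stream count */

/* Hypothetical stand-in for the structure named in the diagnostic. */
struct machine {
    unsigned *Primitives[MAX_STREAMS];
    unsigned *PrimitiveOffsets[MAX_STREAMS];
};

/*
 * Buggy form (kept as a comment so this file builds with -Werror):
 *     if (!m->Primitives[i] || !m->PrimitiveOffsets)
 * `m->PrimitiveOffsets` is the array itself, whose address is never NULL.
 *
 * Checked form: test the element that was allocated for stream i.
 */
static bool
stream_alloc_failed(const struct machine *m, unsigned i)
{
    assert(i < MAX_STREAMS);
    return !m->Primitives[i] || !m->PrimitiveOffsets[i];
}
```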
* lima/gpir: fix alu check miss last store slot | Qiang Yu | 2019-04-14 | 1 | -2/+2
  Fixes: 92d7ca4b1cd "gallium: add lima driver"
  Signed-off-by: Qiang Yu <[email protected]>
  Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: fix compile fail when two slot node | Qiang Yu | 2019-04-14 | 3 | -3/+25
  Comes from the glmark2-es2 jellyfish test.

  Fixes: 92d7ca4b1cd "gallium: add lima driver"
  Signed-off-by: Qiang Yu <[email protected]>
  Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima: add support for depth/stencil fbo attachments and textures | Vasily Khoruzhick | 2019-04-14 | 7 | -24/+120
  Hardware supports writing back Z/S buffers and sampling from them, so add support for that.

  Signed-off-by: Vasily Khoruzhick <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Tested-by: Icenowy Zheng <[email protected]>
* lima: use individual tile heap for each GP job. | Vasily Khoruzhick | 2019-04-14 | 5 | -19/+15
  Looks like it's somehow used by the subsequent PP job, so we have to preserve its contents until the PP job is done.

  Signed-off-by: Vasily Khoruzhick <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Tested-by: Icenowy Zheng <[email protected]>
* nir: add lower_ftrunc | Christian Gmeiner | 2019-04-13 | 2 | -0/+3
  Port TGSI TRUNC lowering to nir

  Signed-off-by: Christian Gmeiner <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* android: fix LLVM version string related building errors | Mauro Rossi | 2019-04-13 | 1 | -4/+4
  Adding \ prior to " in the LLVM version string fixes the following build errors:

  external/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1290:14: error: expected ')'
      ", LLVM " MESA_LLVM_VERSION_STRING
      ^
  <command line>:8:34: note: expanded from here
  ^
  external/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1287:10: note: to match this '('
      snprintf(rscreen->renderer_string, sizeof(rscreen->renderer_string),
      ^
  1 error generated.

  Fixes: 05b114e ("simplify LLVM version string printing")
  Signed-off-by: Mauro Rossi <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
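The failure comes from string-literal concatenation: adjacent literals are joined only if the macro actually expands to a quoted string. A minimal sketch, assuming the version is passed on the compiler command line. The fallback value `"8.0.0"`, the helper name `format_renderer`, and the flag spelling in the comment are placeholders for illustration, not what the Android makefiles contain.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Normally defined by the build system. The quotes must survive the shell,
 * for example -DMESA_LLVM_VERSION_STRING=\"8.0.0\" (illustrative flag only).
 * Without the backslashes the macro expands to the bare tokens 8.0.0, which
 * cannot be concatenated with the neighbouring string literal.
 */
#ifndef MESA_LLVM_VERSION_STRING
#define MESA_LLVM_VERSION_STRING "8.0.0" /* placeholder value */
#endif

/* Hypothetical helper: builds a renderer string by literal concatenation. */
static int
format_renderer(char *buf, size_t size, const char *chip)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s, LLVM " MESA_LLVM_VERSION_STRING, chip);
}
```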
* anv: leave the top 4Gb of the high heap VMA unused | Lionel Landwerlin | 2019-04-13 | 1 | -5/+5
  In 628c9ca9089789 I forgot to apply the same -4Gb of the high address of the high heap VMA. This was previously computed in the HIGH_HEAP_MAX_ADDRESS.

  Many thanks to James for pointing this out.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reported-by: Xiong, James <[email protected]>
  Fixes: 628c9ca9089789 ("anv: store heap address bounds when initializing physical device")
  Reviewed-by: Jason Ekstrand <[email protected]>
* v3d: Use the new lower_to_scratch implementation for indirects on temps. | Eric Anholt | 2019-04-12 | 8 | -11/+193
  We can use the same register spilling infrastructure for our loads/stores of indirect access of temp variables, instead of doing an if ladder.

  Cuts 50% of instructions and max-temps from 2 KSP shaders in shader-db. Also causes several other KSP shaders with large bodies and large loop counts to not be force-unrolled.

  The change was originally motivated by NOLTIS slightly modifying register pressure in piglit temp mat4 array read/write tests, triggering register allocation failures.
* nir: Add a pass for selectively lowering variables to scratch space | Jason Ekstrand | 2019-04-12 | 9 | -1/+216
  This commit adds new nir_load/store_scratch opcodes which read and write a virtual scratch space. It's up to the back-end to figure out what to do with it and where to put the actual scratch data.

  v2: Drop const_index comments (by anholt)

  Reviewed-by: Eric Anholt <[email protected]>
* v3d: Detect the correct number of QPUs and use it to fix the spill size. | Eric Anholt | 2019-04-12 | 3 | -4/+13
  We were missing a * 4 even if the particular hardware matched our assumption.
* v3d: Add missing dumping for the spill offset/size uniforms. | Eric Anholt | 2019-04-12 | 1 | -0/+8
* v3d: Add missing base offset to CS shared memory accesses. | Eric Anholt | 2019-04-12 | 1 | -9/+20
  This code is so touchy, trying to emit the minimum amount of address math. Some day we'll move it all to NIR, I hope.
* v3d: Add Compute Shader compilation support. | Eric Anholt | 2019-04-12 | 9 | -83/+302
  While waiting for the CSD UABI to get reviewed, I keep having to rebase the CS patch. Just land the compiler side for now to keep it from diverging.

  For now this covers just GLES 3.1 compute shaders, not CL kernels.
* v3d: Replace the old shader-db env var output with the ARB_debug_output. | Eric Anholt | 2019-04-12 | 3 | -30/+4
  We're using ARB_debug_output for the main shader-db, but I had this env var left around from the shader-db-2 support (vc4 apitrace-based). Keep the env var around since it's nice sometimes to get the stats on a shader you're optimizing without having to do a shader-db run, but drop the old formatting that's not useful and keeps tricking me when I go to add another measurement to the shader-db output.
* v3d: Include the number of max temps used in the shader-db output. | Eric Anholt | 2019-04-12 | 1 | -1/+29
  This gives us finer-grained feedback on how we're doing on register pressure than "did we trigger a new shader to spill or drop thread count?"
* v3d: Drop a note for the future about PIPE_CAP_PACKED_UNIFORMS. | Eric Anholt | 2019-04-12 | 1 | -0/+7
* v3d: Add and use a define for the number of channels in a QPU invocation. | Eric Anholt | 2019-04-12 | 3 | -3/+9
  A shader invocation always executes 16 channels together, so we often end up multiplying things by this magic 16 number. Give it a name.
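As a small illustration of why a named constant helps, here is a sketch of one place such a multiplication might appear: sizing scratch space so that every channel gets its own 32-bit slot. The macro name `QPU_CHANNELS` and the helper are hypothetical, chosen for this example rather than copied from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the "magic 16": channels executed together. */
#define QPU_CHANNELS 16

/* Bytes needed when each channel stores `slots` 32-bit values. */
static size_t
scratch_bytes_for_slots(unsigned slots)
{
    /* Guard against overflow of the size computation. */
    assert(slots <= SIZE_MAX / (QPU_CHANNELS * sizeof(uint32_t)));
    return (size_t)slots * QPU_CHANNELS * sizeof(uint32_t);
}
```

Spelling the factor as `QPU_CHANNELS` rather than a bare 16 makes it clear at each call site which hardware property the multiplication depends on.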
* nir: Add a comment about how intrinsic definitions work. | Eric Anholt | 2019-04-12 | 1 | -0/+11
  I was thinking about a refactor, and needed to read this first.

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Drop remaining references to const_index in favor of the call to use. | Eric Anholt | 2019-04-12 | 1 | -5/+5
  Please don't make me read a const_index[] expression ever again.

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Drop comments about the constant_index slots for load/stores. | Eric Anholt | 2019-04-12 | 1 | -21/+15
  The constant_index slots are named right there in the intrinsic definition, and the comment is just a chance to get out of sync. Noticed while reviewing the lower_to_scratch changes that copy-and-pasted wrong comments, and load_ubo and load_per_vertex_output had incorrect comments currently.

  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/fs: Remove unused condition from opt_algebraic case | Sagar Ghuge | 2019-04-12 | 1 | -5/+0
  We will never hit a condition where we have src1 and src2 as immediate operands.

  Signed-off-by: Sagar Ghuge <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Set location on structure-split sampler uniform variables | Kenneth Graunke | 2019-04-12 | 1 | -0/+1
  gl_nir_lower_samplers_as_deref splits structure uniform variables, creating new variables for individual fields. As part of that, it calculates a new location. It then never set this on the new variables.

  Thanks to Michael Fiano for finding this bug. Fixes crashes on i965 with Piglit's new tests/spec/glsl-1.10/execution/samplers/uniform-struct test, which was reduced from the failing case in Michael's app.

  Fixes: f003859f97c nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref
  Reviewed-by: Timothy Arceri <[email protected]>
* panfrost: use os_mmap and os_munmap | Mateusz Krzak | 2019-04-12 | 1 | -3/+4
  32-bit needs mmap64 for 64-bit offsets. We get 64-bit offsets from kernel.

  Signed-off-by: Mateusz Krzak <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
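To see why the wider call matters, consider what happens to an offset above 4 GiB when it is squeezed through a 32-bit parameter. A minimal sketch, assuming the kernel hands back a 64-bit offset. The helper names are hypothetical, and the unsigned truncation only models the problem (a real 32-bit `off_t` is signed).

```c
#include <assert.h>
#include <stdint.h>

/* Models what a 32-bit offset parameter keeps of a 64-bit value. */
static uint32_t
lossy_32bit_offset(uint64_t offset)
{
    return (uint32_t)offset; /* the high 32 bits are silently dropped */
}

/* Narrowing that refuses to lose information instead of mapping garbage. */
static uint32_t
checked_32bit_offset(uint64_t offset)
{
    assert(offset <= UINT32_MAX);
    return (uint32_t)offset;
}
```

An offset of 0x100001000 would come out as 0x1000 after the lossy conversion, so the mapping would target the wrong location without any error being reported.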
* panfrost: cast bo_handles pointer to uintptr_t first | Mateusz Krzak | 2019-04-12 | 1 | -1/+1
  Required for 64-bit kernel to interpret the pointer from 32-bit userspace.

  Signed-off-by: Mateusz Krzak <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
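The cast order matters because converting a pointer straight to a 64-bit integer on a 32-bit build is implementation-defined, and some compilers sign-extend the value. Going through `uintptr_t` first gives a zero-extended result. A minimal sketch of the idiom, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a userspace pointer into a 64-bit field of a kernel interface. */
static uint64_t
ptr_to_u64(const void *ptr)
{
    /* uintptr_t first: an unsigned integer of pointer width, then widen. */
    return (uint64_t)(uintptr_t)ptr;
}

/* Reverse direction, for completeness. */
static void *
u64_to_ptr(uint64_t value)
{
    assert(value <= UINTPTR_MAX); /* must fit in a pointer on this build */
    return (void *)(uintptr_t)value;
}
```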
* anv/pipeline: Fix MEDIA_VFE_STATE::PerThreadScratchSpace on gen7 | Jason Ekstrand | 2019-04-12 | 1 | -3/+23
  We were always programming it with the Broadwell convention which is too large by a factor of two on Haswell and just plain wrong on IVB and BYT.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Cc: [email protected]
* gitlab-ci: add lima to the build | Eric Engestrom | 2019-04-12 | 1 | -1/+1
  Suggested-by: Karol Herbst <[email protected]>
  Signed-off-by: Eric Engestrom <[email protected]>
* ac: use the common helper ac_apply_fmask_to_sample | Marek Olšák | 2019-04-12 | 1 | -64/+5
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: set AC_FUNC_ATTR_READNONE for image opcodes where it was missing | Marek Olšák | 2019-04-12 | 2 | -0/+5
  Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa: don't overwrite existing shader files with MESA_SHADER_CAPTURE_PATH | Marek Olšák | 2019-04-12 | 1 | -3/+17
  Reviewed-by: Tapani Pälli <[email protected]>
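One way to avoid clobbering an earlier capture is to create files exclusively and fall back to a numbered name when the first choice is taken. The sketch below is an assumption about how this could be done, not the code from this commit: the helper name, the `-N` suffix scheme, the `.shader_test` extension, and the retry limit are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Open a new file for shader `id` under `dir` without replacing an existing
 * one. The chosen name is written to `path`. Returns NULL on failure.
 */
static FILE *
open_without_overwrite(const char *dir, unsigned id, char *path, size_t size)
{
    assert(dir != NULL && path != NULL && size > 0);

    for (unsigned attempt = 0; attempt < 1000; attempt++) {
        int len;

        if (attempt == 0)
            len = snprintf(path, size, "%s/%u.shader_test", dir, id);
        else
            len = snprintf(path, size, "%s/%u-%u.shader_test", dir, id, attempt);
        if (len < 0 || (size_t)len >= size)
            return NULL; /* path did not fit in the buffer */

        FILE *f = fopen(path, "wx"); /* C11 "x": fail if the file exists */
        if (f != NULL)
            return f;
        if (errno != EEXIST)
            return NULL; /* unrelated error, e.g. missing directory */
    }
    return NULL;
}
```

Relying on exclusive creation rather than a separate existence check avoids a race between checking for the file and opening it.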
* glsl: allow the #extension directive within code blocks for the dri option | Marek Olšák | 2019-04-12 | 1 | -0/+9
  for Viewperf 13

  Acked-by: Timothy Arceri <[email protected]>
* ac/nir: remove some useless integer casts for ALU operations | Samuel Pitoiset | 2019-04-12 | 1 | -16/+0
  Sources are always cast to integers.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: remove useless integer cast in visit_image_load() | Samuel Pitoiset | 2019-04-12 | 1 | -1/+1
  ac_build_image_opcode() casts if necessary and buffer images are cast too.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: remove useless integer cast in adjust_sample_index_using_fmask() | Samuel Pitoiset | 2019-04-12 | 1 | -1/+0
  It's already cast if necessary in ac_build_image_opcode().

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: remove useles LLVMGetUndef for nir_op_pack_64_2x32_split | Samuel Pitoiset | 2019-04-12 | 1 | -2/+1
  Trivial.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>