Commit message (Author, Age, Files, Lines)
* radeonsi: extract DB->CB copy logic into its own function (Nicolai Hähnle, 2016-07-06, 1 file, -36/+61)
| | | | | | Also clean up some of the looping. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: sample from flushed depth texture when required (Nicolai Hähnle, 2016-07-06, 2 files, -8/+46)
| | | | | | | Note that this has no effect yet. A case where can_sample_z/s can be false in radeonsi will be added in a later patch. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: replace is_flushing_texture with db_compatible (Nicolai Hähnle, 2016-07-06, 9 files, -19/+24)
| | | | | | | | | | | This is a left-over of when I considered generalizing the separate stencil support. I do prefer the new name since it emphasizes what flushing vs. non-flushing means from a functional point-of-view, namely special handling of the texture format. v2: adjust r600_init_color_surface as well Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add can_sample_z/s flags for textures (Nicolai Hähnle, 2016-07-06, 5 files, -24/+34)
| | | | | | v2: adjust r600_init_color_surface as well Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: correctly mark levels of 3D textures as fully decompressed (Nicolai Hähnle, 2016-07-06, 1 file, -2/+2)
| | | | | | Account for the fact that max_layer is minified for higher levels. Reviewed-by: Marek Olšák <[email protected]>
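The minification mentioned above can be illustrated with a minimal sketch, assuming the usual mipmap rule that each level halves a dimension with a floor of one. The names `minify` and `last_layer_for_level` are hypothetical and are not the driver's own functions.

```python
def minify(size: int, level: int) -> int:
    """Size of one dimension at a mip level: halved per level, never below 1."""
    return max(size >> level, 1)


def last_layer_for_level(depth0: int, level: int) -> int:
    """Index of the last depth slice that exists at `level` of a 3D texture."""
    return minify(depth0, level) - 1


# A 3D texture with 8 slices at level 0 has fewer slices at higher levels,
# so "every layer is decompressed" has to be judged against the minified count.
layers = [last_layer_for_level(8, lvl) for lvl in range(5)]
```

Comparing a level's decompressed range against the level-0 layer count would never match for higher levels of a 3D texture, which is presumably the situation the fix addresses.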
* gallium/radeon/winsyses: remove unused stencil_offset (Nicolai Hähnle, 2016-07-06, 3 files, -5/+0)
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: remove redundant null-pointer check (Nicolai Hähnle, 2016-07-06, 1 file, -2/+1)
| | | | | | v2: keep using r600_texture_reference Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: print StencilLayout only once (Nicolai Hähnle, 2016-07-06, 1 file, -2/+2)
| | | | | | It is the same for all levels. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: flush stdout after printing texture information (Nicolai Hähnle, 2016-07-06, 1 file, -0/+1)
| | | | Reviewed-by: Marek Olšák <[email protected]>
* glsl: don't try to lower non-gl builtins as if they were gl_FragData (Ilia Mirkin, 2016-07-05, 1 file, -1/+2)
| | | | | | | | | | | | | | If a shader has an output array, it will get treated as though it were gl_FragData and rewritten into gl_out_FragData instances. We only want this to happen on the actual gl_FragData and not everything else. This is a small part of the problem pointed out by the below bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "11.2 12.0" <[email protected]>
* glsl: Document and enforce restriction on type values (Ian Romanick, 2016-07-05, 2 files, -0/+10)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* glsl: Pack integer and double varyings as flat even if interpolation mode is none (Ian Romanick, 2016-07-05, 3 files, -6/+15)
| | | | | | | | | | | | | | v2: Also update varying_matches::compute_packing_class(). Suggested by Timothy Arceri. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]> Cc: Gregory Hainaut <[email protected]> Cc: Ilia Mirkin <[email protected]>
* mesa: Strip arrayness from interface block names in some IO validation (Ian Romanick, 2016-07-05, 1 file, -8/+90)
| | | | | | | | | | | | | | | | | | | Outputs from the vertex shader need to be able to match per-vertex-arrayed inputs of later stages. Accomplish this by stripping one level of arrayness from the names and types of outputs going to a per-vertex-arrayed stage. v2: Add missing checks for TESS_EVAL->GEOMETRY. Noticed by Timothy Arceri. v3: Use a slightly simpler stage check suggested by Ilia. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]> Cc: Gregory Hainaut <[email protected]> Cc: Ilia Mirkin <[email protected]>
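As a rough illustration of what "stripping one level of arrayness" from a name can mean, the sketch below drops only the first array subscript of a resource name. It is an assumption-laden toy on plain strings: the helper name `strip_one_array_level` and the example names are hypothetical, and the real linker works on GLSL types rather than text.

```python
import re


def strip_one_array_level(name: str) -> str:
    """Remove the first `[...]` subscript from a name, leaving later ones alone."""
    return re.sub(r"\[[^\]]*\]", "", name, count=1)


# Hypothetical names: the outer per-vertex dimension is dropped,
# while an inner array on the member itself is preserved.
stripped = strip_one_array_level("blk[3].color[2]")
```

With the outer dimension removed, two names that differ only in the per-vertex subscript compare equal.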
* svga: avoid emitting redundant DXSetRenderTargets command (Charmaine Lee, 2016-07-05, 2 files, -18/+32)
| | | | | | | Tested with Lightsmark2008, MTT piglit, glretrace, conform. Reviewed-by: Sinclair Yeh <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* radeon/vce: update encRefPic addr and array mode to tiled (Leo Liu, 2016-07-05, 1 file, -0/+1)
| | | | | Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]>
* radeon/vce: increase cpb height alignment (Leo Liu, 2016-07-05, 1 file, -1/+1)
| | | | | | | | Height should be aligned to 2 macroblocks, which makes it safer for tiled mode. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]>
* i965: Remove trailing whitespace (Iago Toral Quiroga, 2016-07-05, 1 file, -1/+1)
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make inline function static (Iago Toral Quiroga, 2016-07-05, 1 file, -1/+1)
| | | | | | Without this the i965 driver fails to load. Reviewed-by: Topi Pohjolainen <[email protected]>
* anv: install the intel_icd.json to ${datarootdir} by default (Emil Velikov, 2016-07-05, 1 file, -1/+1)
| | | | | | | | | | As mentioned by the spec (and used by Archlinux and Debian) default to ${datarootdir} as opposed to ${sysconfdir} for the default location. Cc: Jason Ekstrand <[email protected]> Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* swr: automake: don't ship LLVM version specific generated sources (Emil Velikov, 2016-07-05, 1 file, -2/+43)
| | | | | | | | | | | | | | | Otherwise things will fail to build, if the builder is using another version of LLVM. v2: annotate all the dependencies of builder_gen.h v3: clean the generated files as needed v4: comment cleanups (Tim) Cc: "12.0" <[email protected]> Tested-by: Tim Rowley <[email protected]> Tested-by: Chuck Atkins <[email protected]> (v2) Reported-by: Chuck Atkins <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* automake: don't mandate git_sha1.h/MESA_GIT_SHA1 (Emil Velikov, 2016-07-05, 1 file, -10/+3)
| | | | | | | | | | | | | | | | | | | | | | | | | | It has proven subtle to get it right both from the build side POV (see commit list below) and builders due to their varying workflows. Furthermore it does not fully fulfil the reason why it was enforced - to detect uniqueness between different builds, in order to distinguish and invalidate Vulkan/GL caches. With that having a much better solution (previous commit) we can drop this solution. This effectively reverts the following commits: 359d9dfec33 ("mesa: automake: add directory prefix for git_sha1.h") 2c424e00c39 ("mesa: automake: ensure that git_sha1.h.tmp has the right attributes") b7f7ec78435 ("mesa: automake: distclean git_sha1.h when building OOT") 8229fe68b5d ("automake: get in-tree `make distclean' working again.") Cc: Timo Aaltonen <[email protected]> Cc: Haixia Shi <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* anv: automake: indent with tabs and not spaces (Emil Velikov, 2016-07-05, 1 file, -4/+4)
| | | | Signed-off-by: Emil Velikov <[email protected]>
* anv: use cache uuid based on the build timestamp. (Emil Velikov, 2016-07-05, 3 files, -3/+13)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not rely on the git sha1: - its current truncated form makes it less unique - it does not account for local (Vulkan or otherwise) changes Use a timestamp produced at the time of build. It's perfectly unique, unless someone explicitly tinkers with their system clock. Even then chances of producing the exact same one are very small, if not zero. v2: Remove .tmp rule. It's not needed since we want the header to be regenerated each time we call make (Eric). v3: - Honour SOURCE_DATE_EPOCH, to make the build reproducible (Michel) - Replace the generated header with a define, to prevent needless builds on consecutive `make' and/or `make install' calls. (Dave) v4: - Keep the timestamp generation at make time. (Jason) v5: - Ensure that file is regenerated on incremental builds. Cc: Michel Dänzer <[email protected]> Cc: Dave Airlie <[email protected]> Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
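The idea of a timestamp-derived cache identifier that still honours SOURCE_DATE_EPOCH can be sketched as follows. This is only an illustration: the function names and the byte layout of the identifier are hypothetical, and the actual commit generates the value from the build system at make time rather than in Python. The 16-byte length matches Vulkan's `VK_UUID_SIZE`.

```python
import os
import time


def build_timestamp(environ=None) -> int:
    """Return SOURCE_DATE_EPOCH if it is set, else the current time in seconds."""
    env = os.environ if environ is None else environ
    value = env.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)  # reproducible builds pin the timestamp
    return int(time.time())


def cache_uuid(timestamp: int) -> bytes:
    """Pack the timestamp into a 16-byte identifier (hypothetical layout)."""
    return timestamp.to_bytes(8, "little").ljust(16, b"\0")


# With SOURCE_DATE_EPOCH pinned, two builds yield the same identifier.
pinned = cache_uuid(build_timestamp({"SOURCE_DATE_EPOCH": "1467676800"}))
```

Two builds made at different times get different identifiers, so caches written by one are not reused by the other; pinning SOURCE_DATE_EPOCH deliberately gives that property up in exchange for reproducibility.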
* clover: conditionally use MESA_GIT_SHA1 (Emil Velikov, 2016-07-05, 2 files, -2/+8)
| | | | | | | | | | | | | | | Considering how hard/annoying it was for many people's workflows to properly generate the macro, it will be demoted to conditionally available with follow-up commits. v2: Kill off gratuitous blank line (Vedran). Cc: [email protected] Cc: Vedran Miletić <[email protected]> Cc: Francisco Jerez <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v1) Reviewed-by: Vedran Miletić <[email protected]>
* mesa: stop copying SamplerUnits twice (Timothy Arceri, 2016-07-05, 1 file, -4/+0)
| | | | | | | The call to _mesa_update_shader_textures_used() already takes care of copying for us. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* mesa: make attribute binding message more useful (Timothy Arceri, 2016-07-05, 1 file, -1/+2)
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: make more effective use of SamplersUsed (Timothy Arceri, 2016-07-05, 7 files, -19/+10)
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* glsl: stop allocating memory for UBOs during linking (Timothy Arceri, 2016-07-05, 1 file, -5/+8)
| | | | | | | | | This just stops counting and assigning a storage location for these uniforms, the count is only used to create the uniform storage. These uniform types don't use this storage. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* glsl: mark link_uniform_blocks_are_compatible() as static (Timothy Arceri, 2016-07-05, 2 files, -5/+1)
| | | | | | Missed this when doing 6d1a59d15b. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* mesa: fix build error (Timothy Arceri, 2016-07-05, 1 file, -1/+1)
| | | | Fix build error caused by 6a524c76f5.
* mesa: faster validation of sampler unit mapping for SSO (Gregory Hainaut, 2016-07-05, 1 file, -38/+31)
| | | | | | | | | | | | | | | | The code was inspired by _mesa_update_shader_textures_used. However, unlike _mesa_update_shader_textures_used, which only checks a single stage, it checks all stages. It avoids looping over all uniforms; only active samplers are checked. For my use case: high FS frequency switches with few samplers. Perf event (relative to nouveau_dri.so) goes from 5.01% to 1.68% for the _mesa_sampler_uniforms_pipeline_are_valid function. Signed-off-by: Gregory Hainaut <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
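The validation rule being sped up is that two active samplers of different types must not refer to the same texture unit. A minimal sketch of checking that across all stages while visiting only active samplers, assuming each stage is given as a list of `(unit, target)` pairs (a hypothetical representation, not Mesa's data structures):

```python
def sampler_units_are_valid(stages) -> bool:
    """Check that no texture unit is claimed by samplers of different targets.

    `stages` is an iterable of per-stage lists of (unit, target) pairs,
    one pair per active sampler. Inactive uniforms are never visited.
    """
    target_for_unit = {}
    for active_samplers in stages:
        for unit, target in active_samplers:
            # The first sampler seen on a unit fixes its target; later ones must agree.
            if target_for_unit.setdefault(unit, target) != target:
                return False
    return True


# Hypothetical pipelines: unit 0 is shared by two 2D samplers (fine),
# then by a 2D and a cube sampler in different stages (invalid).
ok = sampler_units_are_valid([[(0, "2D")], [(0, "2D"), (1, "CUBE")]])
bad = sampler_units_are_valid([[(0, "2D")], [(0, "CUBE")]])
```

The cost is proportional to the number of active samplers rather than to the total number of uniforms, which is where the saving described in the message would come from.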
* Revert "st/glsl_to_tgsi: don't increase immediate index by 1." (Dave Airlie, 2016-07-05, 1 file, -1/+1)
| | | | | | | | | | | | | | | This reverts commit 27d456cc87a01998c6fe1dbf45937e2ca6128495. DOH, what seems right and what is right with fp64 are always two different things. This regressed: spec@arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-mixed-shader on radeonsi Reported-by: Michel Dänzer <[email protected]> Cc: "11.2 12.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nvc0/ir: rename NVE4_SU_INFO_XXX to NVC0_SU_INFO_XXX (Samuel Pitoiset, 2016-07-05, 1 file, -49/+49)
| | | | | | | | While we are at it, fix a typo inside the comment which describes what those constants are for. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: reset the base offset for indirect images accesses (Samuel Pitoiset, 2016-07-05, 1 file, -2/+4)
| | | | | | | | | | In presence of an indirect image access, the base offset should be zeroed because the stride will be computed twice. This is a pretty rare situation but it can happen when tex.r > 0. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.2 12.0" <[email protected]>
* gm107/ir: fix sign bit emission for FADD32I (Samuel Pitoiset, 2016-07-05, 1 file, -3/+6)
| | | | | | | | | | | | When emitting OP_SUB, the sign bit for FADD and FADD32I is not at the same position. It's at position 45 for FADD but 51 for FADD32I. This fixes the following piglit test: tests/spec/arb_fragment_program/fdo30337b.shader_test Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
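The point of the fix, that the same logical flag lives at a different bit position depending on the instruction form, can be sketched as below. The two bit positions come from the commit message; the helper `set_sign_bit` and the idea of treating the encoding as a plain integer are illustrative assumptions, not the emitter's actual code.

```python
FADD_SIGN_BIT = 45      # position stated in the commit message
FADD32I_SIGN_BIT = 51   # position stated in the commit message


def set_sign_bit(encoding: int, is_fadd32i: bool) -> int:
    """Set the sign bit at the position that matches the instruction form."""
    bit = FADD32I_SIGN_BIT if is_fadd32i else FADD_SIGN_BIT
    return encoding | (1 << bit)


# Using the FADD position for an FADD32I would flip an unrelated bit.
fadd = set_sign_bit(0, is_fadd32i=False)
fadd32i = set_sign_bit(0, is_fadd32i=True)
```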
* vc4: Regularize instruction emit macros (Eric Anholt, 2016-07-04, 2 files, -39/+50)
| | | | | | ALU0 didn't have the _dest variant, and ALU2 didn't unset the def the way ALU1 did. This should make the ALU[012] macros much clearer, by moving most of their contents to vc4_qir.c
* vc4: Enable dead CF elimination. (Eric Anholt, 2016-07-04, 1 file, -0/+1)
| | | | | | Now that we're about to start generating control flow in our NIR, we want this in place. It optimizes things frequently in the CS, when the GL VS has control flow that doesn't affect the vertex position.
* vc4: Optimize out redundant SF updates. (Eric Anholt, 2016-07-04, 2 files, -6/+78)
| | | | | | | | | | | Tiny change on shader-db currently, but it will be important when we start emitting a lot of SFs from the same variable as part of control flow support. total instructions in shared programs: 89463 -> 89430 (-0.04%) instructions in affected programs: 1522 -> 1489 (-2.17%) total estimated cycles in shared programs: 250060 -> 250015 (-0.02%) estimated cycles in affected programs: 8568 -> 8523 (-0.53%)
* vc4: Move SF removal to a separate peephole pass. (Eric Anholt, 2016-07-04, 5 files, -17/+85)
| | | | | | | | | The DCE pass is going to change significantly to handle control flow, while we don't really need to change it for the SF handling. We also need to add some more SF peephole optimization for SF updates generated by control flow support. No change on shader-db.
* vc4: DCE instructions with a NULL destination. (Eric Anholt, 2016-07-04, 1 file, -2/+3)
| | | | | | | | I'm going to add an optimization for redundant SF update removal, which will just remove the SF and leave us (in many cases) with an instruction with a NULL destination and no side effects. Rather than teaching that pass whether the whole instruction can be removed, leave that responsibility to this pass.
* vc4: Mark texturing setup instructions as having side effects. (Eric Anholt, 2016-07-04, 1 file, -5/+5)
| | | | | | | We need to not DCE them even though they don't have a destination in QIR. We also shouldn't relocate them in vc4_opt_vpm. Neither of these things happen, but I'm about to make DCE consider instructions with a NULL destination.
* vc4: Fix a pasteo in scheduling condition flag usage. (Eric Anholt, 2016-07-04, 1 file, -1/+1)
| | | | | | | Noticed by code inspection. This hasn't been too big of a deal, because our cond usages all start out as adder ops, either MOVs or the FTOI for Z writes. MOVs *can* get converted to mul ops during scheduling, but apparently we hadn't hit this.
* vc4: Drop the dead QIR_PACK() macro. (Eric Anholt, 2016-07-04, 1 file, -8/+0)
| | | | | This isn't used since we switched to using the dst.pack field instead of custom instructions.
* radeonsi: do compilation from si_create_shader_selector asynchronously (Marek Olšák, 2016-07-05, 4 files, -7/+58)
| | | | | | | | | | | | | | | | | | | | | | | | | | Main shader parts and geometry shaders are compiled asynchronously by util_queue. si_create_shader_selector doesn't wait and returns. si_draw_vbo(si_shader_select) waits for completion. This has the best effect when shaders are compiled at app-loading time. It doesn't help much for shaders compiled on demand, even though VS+PS compilation should take as much time as the bigger one of the two. If an app creates more shaders, at most 4 threads will be used to compile them. Debug output disables this for shader stats to be printed in the correct order. (We could go even further and build variants asynchronously too, then emit draw calls without waiting and emit incomplete shader states, then force IB chaining to give the compiler more time, then sync the compilation at the IB flush and patch the IB with correct shader states. This is great for compilation before draw calls, but there are some difficulties such as scratch and tess states requiring the compiler output, and an on-disk shader cache will likely be a much better and simpler solution.) Reviewed-by: Nicolai Hähnle <[email protected]>
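The create-now, wait-at-draw pattern described above can be sketched with a thread pool standing in for util_queue. The class and method names are hypothetical, and the limit of 4 workers mirrors the "at most 4 threads" figure in the message; this is not the driver's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the driver's compile queue; 4 workers as stated in the message.
_compile_queue = ThreadPoolExecutor(max_workers=4)


class ShaderSelector:
    """Starts compilation at creation time and blocks only when the result is needed."""

    def __init__(self, source, compile_fn):
        # Returns immediately; the compile runs on a worker thread.
        self._future = _compile_queue.submit(compile_fn, source)

    def wait_for_binary(self):
        # Called at draw time: blocks until compilation has finished.
        return self._future.result()


# Hypothetical usage: two shaders are submitted back to back and compile in parallel.
vs = ShaderSelector("vs source", str.upper)
ps = ShaderSelector("ps source", str.upper)
binaries = (vs.wait_for_binary(), ps.wait_for_binary())
```

Submitting both shaders before waiting on either is what lets VS+PS compilation cost roughly as much as the slower of the two.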
* radeonsi: don't lock shader cache mutex during compilation (Marek Olšák, 2016-07-05, 1 file, -6/+16)
| | | | | | | | | | to allow multiple shaders to be compiled simultaneously. Also, shader-db can again use all 4 cores. v2: Remove the pipe_mutex_unlock call in the error path. Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
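Holding a cache lock only around the lookup and the insert, and not around the slow compile, can be sketched as follows. `ShaderCache` and its methods are hypothetical names used for illustration; the sketch accepts that two threads may occasionally compile the same key and keeps whichever result was stored first.

```python
import threading


class ShaderCache:
    """Cache whose lock protects the table but is released while compiling."""

    def __init__(self, compile_fn):
        self._lock = threading.Lock()
        self._entries = {}
        self._compile = compile_fn

    def get(self, key):
        with self._lock:                 # short critical section: lookup only
            hit = self._entries.get(key)
        if hit is not None:
            return hit
        binary = self._compile(key)      # slow step, no lock held
        with self._lock:                 # short critical section: insert only
            return self._entries.setdefault(key, binary)


# Record whether the lock was held while the (fake) compiler ran.
lock_states = []
cache = ShaderCache(lambda key: (lock_states.append(cache._lock.locked()), key.upper())[1])
first = cache.get("vs")
second = cache.get("vs")
```

Because the lock is free during compilation, other threads can look up or compile different shaders at the same time.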
* radeonsi: separate the compilation chunk of si_create_shader_selector (Marek Olšák, 2016-07-05, 3 files, -80/+110)
| | | | | | | The function interface is ready to be used by util_queue. Also, si_shader_select_with_key can no longer accept si_context. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move LLVMTargetMachineRef creation to a separate function (Marek Olšák, 2016-07-05, 1 file, -14/+18)
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add and use radeon_info::max_alloc_size (v2) (Marek Olšák, 2016-07-05, 6 files, -10/+16)
| | | | | | | | | | v2: - squashed the patches - use INT_MAX - clamp max_const_buffer_size - check the DRM version in radeon Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Vedran Miletić <[email protected]>
* radeonsi: print LLVM IRs to ddebug logs (Marek Olšák, 2016-07-05, 6 files, -1/+26)
| | | | | | | Getting LLVM IRs of hanging shaders has never been easier. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable string markers and record apitrace call numbers (Marek Olšák, 2016-07-05, 3 files, -1/+24)
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>