path: root/src/gallium/auxiliary
Commit message | Author | Age | Files | Lines
* tgsi/scan: add an assert for the size of the samplers_declared bitfield | Nicolai Hähnle | 2016-04-07 | 1 | -1/+2
  The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u.
  Reviewed-by: Brian Paul <[email protected]>
* draw/aaline: stronger guard against no free samplers (v2) | Nicolai Hähnle | 2016-04-07 | 1 | -2/+4
  Line anti-aliasing will fail when there is no free sampler available. Make the corresponding guard more robust in preparation for raising PIPE_MAX_SAMPLERS to 32.
  The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u.
  v2: add an assert for bitfield size and use 1u << idx
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]> (v1)
  Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1)
  Reviewed-by: Marek Olšák <[email protected]> (v1)
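The shift issue described in the message above can be illustrated with a small C sketch. It is not the code from draw/aaline: `SKETCH_MAX_SAMPLERS` and `find_free_sampler` are hypothetical names, and the sketch assumes `unsigned` is at least 32 bits wide, which the runtime assert checks.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the raised sampler limit (32 in the message). */
#define SKETCH_MAX_SAMPLERS 32

/* Return the index of the first sampler whose bit is clear in `used`,
 * or -1 when every sampler is taken. */
static int
find_free_sampler(unsigned used)
{
   /* The bitfield must have one bit per sampler unit. */
   assert(sizeof(used) * CHAR_BIT >= SKETCH_MAX_SAMPLERS);

   for (unsigned idx = 0; idx < SKETCH_MAX_SAMPLERS; idx++) {
      /* 1u keeps the shift unsigned; with a plain int literal,
       * shifting into bit 31 would be undefined behaviour. */
      if (!(used & (1u << idx)))
         return (int)idx;
   }
   return -1;
}
```

A caller would treat -1 as "no free sampler" and skip the anti-aliasing setup instead of indexing past the end of the sampler array.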
* util/pstipple: stronger guard against no free samplers (v2) | Nicolai Hähnle | 2016-04-07 | 1 | -2/+4
  When hasFixedUnit is false, polygon stippling will fail when there is no free sampler available. Make the corresponding guard more robust in preparation for raising PIPE_MAX_SAMPLERS to 32.
  The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u.
  v2: add an assert for bitfield size and use 1u << idx
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]> (v1)
  Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1)
  Reviewed-by: Marek Olšák <[email protected]> (v1)
* gallium: Put no.of {samples,layers} into pipe_framebuffer_state | Edward O'Callaghan | 2016-04-07 | 2 | -0/+32
  Here we store the number of samples and layers directly in the pipe_framebuffer_state so that in the case of ARB_framebuffer_no_attachment we may make use of them directly. Further, we adjust various gallium/auxiliary helper functions accordingly.
  V2: Convert branches in util_framebuffer_get_num_layers() and util_framebuffer_get_num_samples() to their canonical form.
  V3: 'git stash pop' the typo fix of 'cbufs' which should be 'nr_cbufs' that was missing in V2, whoops! Thanks Marek for pointing this out yet again.
  V4: Squash in the following patch: 'gallium/util: Ensure util_framebuffer_get_num_samples() is valid'. Upon context creation, internal driver structures are malloc()'ed and then memset() to zero. This results in an invalid number of samples 'by default'. Handle this in the simplest way to avoid elaborate and probably equally sub-optimal solutions.
  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
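The V4 note can be sketched as follows. This is an assumption-laden illustration, not the actual util_framebuffer_get_num_samples(): the struct `fb_state_sketch`, its `attachment_samples` field, and the function name are hypothetical, and the real helper inspects the attached surfaces rather than a plain integer.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for pipe_framebuffer_state. */
struct fb_state_sketch {
   unsigned nr_cbufs;            /* number of colour attachments */
   unsigned attachment_samples;  /* hypothetical: samples of the first attachment */
   unsigned samples;             /* consulted only when nothing is attached */
};

/* Never report fewer than one sample, even for a zero-filled struct. */
static unsigned
sketch_get_num_samples(const struct fb_state_sketch *fb)
{
   unsigned n = fb->nr_cbufs > 0 ? fb->attachment_samples : fb->samples;

   /* A memset()-to-zero state would otherwise yield 0 samples. */
   return n > 1 ? n : 1;
}
```

Clamping at the point of use means callers do not have to special-case a freshly zeroed state.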
* gallivm: Introduce lp_format_intrinsic. | Jose Fonseca | 2016-04-04 | 3 | -14/+54
  For adding .v4f32-like suffixes to intrinsics, taking special care of the scalar case, which was often neglected.
  This fixes invalid IR when doing mipmap filtering on SSE2 (the only case where we'd use intrinsics with scalars).
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use llvm.fabs. | Jose Fonseca | 2016-04-03 | 1 | -8/+3
  Exactly the same code.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Prefer backend agnostic intrinsic for rounding. | Jose Fonseca | 2016-04-03 | 1 | -7/+39
  We could unconditionally use these intrinsics, but performance with SSE2 would suck, as LLVM falls back to calling libm.
  lp_test_arit.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Add debug option to force SSE2. | Jose Fonseca | 2016-04-03 | 1 | -11/+14
  For simulating less capable machines.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Fix performance regressions due to vector selects. | Jose Fonseca | 2016-04-03 | 1 | -22/+18
  LLVM often can't determine that the mask elements are all ones/zeros, and there doesn't seem to be a good way to hint that.
  Thanks to Roland Scheidegger for spotting and analyzing the issue.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Remove lp_build_load_volatile. | Jose Fonseca | 2016-04-03 | 2 | -12/+0
  No longer needed.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards. | Jose Fonseca | 2016-04-03 | 8 | -25/+37
  Only provide a fallback for LLVM 3.3. One less dependency on LLVM C++ interface.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi: add simple tgsi_is_msaa_target() helper | Brian Paul | 2016-04-02 | 1 | -0/+8
  Reviewed-by: Jose Fonseca <[email protected]>
* gallium: add threads per block TGSI property | Bas Nieuwenhuizen | 2016-04-02 | 1 | -0/+3
  The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct.
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* gallivm: Prevent disassembly debug output from being truncated. | Jose Fonseca | 2016-04-01 | 1 | -9/+9
  Use os_log_message directly, as _debug_vprintf truncates messages to 4K.
  Also clean up the disassemble interface.
  Spotted by Roland.
  Trivial.
* gallivm: Use vector selects on LLVM 3.3+. | Jose Fonseca | 2016-04-01 | 1 | -3/+5
  This is an old patch I had around. Vector selects seem to work well from LLVM 3.3 onwards. Using them should improve code quality, as it might make the constant propagation pass more effective.
  Tested lp_test_*
  Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi: silence compiler warning in fetch_sampler_unit() | Samuel Pitoiset | 2016-04-01 | 1 | -1/+1
  The unit variable can be used uninitialized.
  Fixes: 24e77cb09 ("tgsi: handle indirect sampler arrays. (v2)")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* tgsi: fix out of bounds access in exec_atomop() | Samuel Pitoiset | 2016-04-01 | 1 | -1/+1
  The number of channels must be 4 for all RGBA components.
  Fixes: 22d129601 ("tgsi: add support for image operations to tgsi_exec. (v2.1)")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* tgsi: split tgsi_util_get_texture_coord_dim() function into two | Brian Paul | 2016-03-31 | 3 | -41/+38
  It was kind of overloaded, returning two different things. Now get the index of the shadow reference src register with a new tgsi_util_get_shadow_ref_src_index() function.
  To verify the new code, I added some temp/debug code which looped over all TGSI_TEXTURE_x values, calling the old function and the new one and checking that the returned indexes matched. Also tested piglit "shadow" tests with softpipe/llvmpipe.
  No testing of ilo and radeonsi changes.
  Reviewed-by: Dave Airlie <[email protected]>
* tgsi: skip texture query opcodes when examining texture targets | Brian Paul | 2016-03-31 | 1 | -1/+15
  Should fix the assertion in piglit spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the TXQ instruction specifies a 2D target but the sampler view was declared as SHADOW2D.
  Reviewed-by: Michel Dänzer <[email protected]>
  Tested-by: Michel Dänzer <[email protected]>
* softpipe: add image support to softpipe (v3) | Dave Airlie | 2016-03-31 | 1 | -1/+3
  This adds support for ARB_shader_image_load_store to softpipe.
  v2: add RESQ support (Ilia)
  v3: constify, cleanup internals, add some comments (Brian).
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* draw: add support for passing images to vs/gs shaders. | Dave Airlie | 2016-03-31 | 5 | -2/+29
  This just adds support for passing through images to the tgsi execution stage.
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* tgsi: add support for image operations to tgsi_exec. (v2.1) | Dave Airlie | 2016-03-31 | 4 | -4/+317
  This adds support for load/store/atomic operations on images along with image tracking support.
  v2: add RESQ support. (Ilia)
  v2.1: constify interface (Brian); split get_image_coord_dim (Brian)
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* tgsi: introduce NonHelperMask | Dave Airlie | 2016-03-31 | 2 | -0/+5
  This is a mask of which invocations in the current 2x2 grid are non-helper invocations. This allows us to mask off the helper invocations later for the image operations.
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* tgsi_exec: handle execmask when doing indirect lookups | Dave Airlie | 2016-03-31 | 1 | -3/+9
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* tgsi_exec: add support for up to 3 address registers (v2) | Dave Airlie | 2016-03-31 | 1 | -2/+3
  v2: be consistent with other definitions.
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* tgsi: (trivial) only verify target for is_tex instructions | Roland Scheidegger | 2016-03-30 | 1 | -8/+7
  The d3d10 state tracker does not encode a (valid) target (only offsets are really used from the texture bits), since that information always comes from the sview dcl, and not the instruction (note the meaning of target is actually slightly different between gl and d3d10 in any case, because the d3d10 target never includes the shadow bit).
  Also move the msaa sampler identification as well. That would need to be set on the sview, not the sampler, so while this does not fix it, it at least makes it obvious that it won't work with sample instructions.
* tgsi: simplify tgsi_shader_info::is_msaa_sampler checking | Brian Paul | 2016-03-29 | 1 | -3/+2
  We assert that fullinst->Instruction.Texture != 0 above so no need to check it in the conditional. We also have the fullinst->Texture.Texture value in a local variable, so use it.
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi: collect texture sampler target info in tgsi_scan_shader() | Brian Paul | 2016-03-29 | 2 | -2/+37
  Texture sample instructions specify a sampler unit and texture target such as "1D", "2D", "CUBE", etc. Sampler view declarations also specify the sampler unit and texture target.
  This patch checks that the texture instructions agree with the declarations and collects the texture target type for each sampler unit.
  v2: only compare instruction's texture target to the sampler view declaration target if the instruction is a TEX instruction, not a SAMPLE instruction.
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
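The bookkeeping described in this message can be sketched roughly as below. This is an illustrative model only, not the tgsi_scan_shader() implementation: the enum, struct, and function names are all hypothetical, and the real scanner works on TGSI tokens rather than plain integers.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_UNITS 32

/* Hypothetical subset of texture targets; 0 means "nothing recorded". */
enum sketch_target { TARGET_UNKNOWN = 0, TARGET_1D, TARGET_2D, TARGET_CUBE };

/* Hypothetical per-shader record: one target per sampler unit. */
struct scan_sketch {
   enum sketch_target target[SKETCH_MAX_UNITS];
};

/* A sampler view declaration fixes the target for its unit. */
static void
declare_sampler_view(struct scan_sketch *info, unsigned unit,
                     enum sketch_target t)
{
   assert(unit < SKETCH_MAX_UNITS);
   info->target[unit] = t;
}

/* Record a texture instruction; returns false on a mismatch with the
 * declaration. Only TEX-style instructions are compared (the v2 rule);
 * SAMPLE-style ones take their target from the declaration alone. */
static bool
note_texture_use(struct scan_sketch *info, unsigned unit,
                 enum sketch_target t, bool is_tex)
{
   assert(unit < SKETCH_MAX_UNITS);
   if (!is_tex)
      return true;
   if (info->target[unit] == TARGET_UNKNOWN) {
      info->target[unit] = t;   /* no declaration seen: adopt the target */
      return true;
   }
   return info->target[unit] == t;
}
```

In a debug build the boolean result would typically feed an assert, which matches the "checks that the texture instructions agree" wording above.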
* gallium: Format code in pb_buffer_fenced.c according to style guide. | Rovanion Luckey | 2016-03-29 | 1 | -129/+97
  This is a tiny housekeeping patch which does the following, according to the mesa coding style guidelines:
    * Replaced tabs with three spaces.
    * Formatted oneline and multiline code comments. Some doxygen comments weren't marked as such and some code comments were marked as doxygen comments.
    * Added spaces between if- and while-statements and their parentheses.
  Reviewed-by: Brian Paul <[email protected]>
* gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2) | Marek Olšák | 2016-03-28 | 1 | -5/+5
  v2: move the nr_cbufs check above the loop
  Reviewed-by: Ilia Mirkin <[email protected]> (v1)
* ttn: remove stray global from header | Rob Clark | 2016-03-24 | 1 | -2/+0
  Signed-off-by: Rob Clark <[email protected]>
* tgsi: drop unused set_exec/kill_mask interfaces. | Dave Airlie | 2016-03-22 | 3 | -37/+0
  These don't get used, and haven't been anywhere in git history from what I can see, so drop them.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* tgsi/scan: add writes_memory to flag presence of stores or atomics | Nicolai Hähnle | 2016-03-21 | 2 | -4/+9
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: track which shader images are really buffers | Nicolai Hähnle | 2016-03-21 | 2 | -0/+7
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: add images_writemask | Nicolai Hähnle | 2016-03-21 | 2 | -2/+21
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: declare sampler view in util_make_fs_blit_msaa_depthstencil() | Brian Paul | 2016-03-21 | 1 | -1/+2
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* postprocess: declare sampler views in shaders | Brian Paul | 2016-03-21 | 2 | -0/+9
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* hud: add sampler view declaration in text fragment shader | Brian Paul | 2016-03-21 | 1 | -0/+1
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* gallium/tgsi: pass TGSI tex target to tgsi_transform_tex_inst() | Brian Paul | 2016-03-21 | 3 | -17/+20
  Instead of the hard-coded 2D tex target in tgsi_transform_tex_2d_inst().
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* tgsi: Add support for global / private / input MEMORY | Hans de Goede | 2016-03-21 | 5 | -20/+38
  Extend the MEMORY file support to differentiate between global, private and shared memory, as well as "input" memory.
  "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters. A special memory type is added for this, since the actual storage of these (e.g. UBO-s) may differ per implementation. The uploading of kernel parameters is handled by launch_grid; "MEMORY[x], INPUT" allows drivers to use an access mechanism for parameter reads which matches the upload method.
  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]> (v1)
  Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi text | Hans de Goede | 2016-03-21 | 1 | -0/+6
  When support for decl.Atomic and .Shared was added, tgsi_build_declaration was not updated to propagate these properly.
  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]> (v1)
  Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* tgsi: Fix return of uninitialized memory in tgsi_*_instruction_memory | Hans de Goede | 2016-03-20 | 1 | -0/+4
  tgsi_default_instruction_memory / tgsi_build_instruction_memory were returning uninitialized memory for tgsi_instruction_memory.Texture and tgsi_instruction_memory.Format. Note 0 means not set, and thus is a correct default initializer for these.
  Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory")
  Cc: Nicolai Hähnle <[email protected]>
  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add TGSI property NEXT_SHADER | Marek Olšák | 2016-03-19 | 3 | -0/+22
  Radeonsi needs to know which shader stage will execute after a shader in order to make the best decision about which shader variant to compile first.
  This is only set for VS and TES, because we don't need it elsewhere.
  VS has 3 variants:
    - next shader is FS
    - next shader is GS
    - next shader is TCS
  TES has 2 variants:
    - next shader is FS
    - next shader is GS
  Currently, radeonsi always assumes the next shader is FS, which is suboptimal, since st/mesa always knows which shader is next if the GLSL program is not a "separate shader".
  By default, ureg always sets "next shader is FS".
  Reviewed-by: Nicolai Hähnle <[email protected]>
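The variant table in this message can be written down as a small lookup. The sketch below is purely illustrative: `sketch_stage`, `next_shader_is_valid`, and `default_next_shader` are hypothetical names, not TGSI or radeonsi identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stage enum, used only for this sketch. */
enum sketch_stage { STAGE_VS, STAGE_TCS, STAGE_TES, STAGE_GS, STAGE_FS };

/* True when `next` is one of the variants listed for `stage`:
 * VS -> FS, GS or TCS; TES -> FS or GS; nothing for other stages. */
static bool
next_shader_is_valid(enum sketch_stage stage, enum sketch_stage next)
{
   switch (stage) {
   case STAGE_VS:
      return next == STAGE_FS || next == STAGE_GS || next == STAGE_TCS;
   case STAGE_TES:
      return next == STAGE_FS || next == STAGE_GS;
   default:
      return false;   /* the property is only set for VS and TES */
   }
}

/* The message says the default is "next shader is FS". */
static enum sketch_stage
default_next_shader(void)
{
   return STAGE_FS;
}
```

A driver following this scheme could compile the variant matching the property first and build the remaining variants lazily.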
* tgsi: add tgsi_transform_op3_inst() function | Brian Paul | 2016-03-18 | 1 | -0/+34
  Reviewed-by: Charmaine Lee <[email protected]>
* nir: add a bit_size parameter to nir_ssa_dest_init | Connor Abbott | 2016-03-17 | 1 | -7/+7
  v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago)
  v3: Fix tgsi (Samuel)
  v4: Fix nir_clone.c (Samuel)
  v5: Fix vc4 and freedreno (Iago)
  v6 (Sam):
    - Fix build errors in nir_lower_indirect_derefs
    - Use helper to get type size from nir_alu_type.
  Signed-off-by: Iago Toral Quiroga <[email protected]>
  Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: rename nir_const_value fields to include bitsize information | Iago Toral Quiroga | 2016-03-17 | 1 | -1/+1
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* draw: fix line stippling | Roland Scheidegger | 2016-03-15 | 1 | -15/+15
  The logic was comparing actual ints, not true/false values. This meant that it was always emitting multiple line segments instead of just one, even if the stipple test had the same result, which looks inefficient, and the segments also overlapped, thus breaking line aa as well. (In practice, with the no-op default line stipple pattern, for a 10-pixel long line from 0-9 it was emitting 10 segments, with the individual segments ranging from 0-1, 0-2, 0-3 and so on.)
  This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94193
  Reviewed-by: Jose Fonseca <[email protected]>
  CC: <[email protected]>
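The class of bug described here, comparing raw masked integers where booleans were intended, can be reproduced in a few lines. This is a standalone sketch under stated assumptions, not the draw module's stipple code: `count_on_runs` and its parameters are hypothetical, and the pattern is assumed to be 16 bits wide.

```c
#include <assert.h>
#include <stdbool.h>

/* Count how many "on" segments a line of `length` pixels would start.
 * With normalise == false the raw masked bit (1, 2, 4, ...) is compared
 * against the previous one, so every pixel looks like a state change.
 * With normalise == true the test result is reduced to 0 or 1 first. */
static unsigned
count_on_runs(unsigned pattern, unsigned length, bool normalise)
{
   unsigned runs = 0;
   unsigned prev = 0;   /* "off" before the line starts */

   for (unsigned i = 0; i < length; i++) {
      unsigned cur = pattern & (1u << (i % 16));

      if (normalise)
         cur = (cur != 0);
      if (cur != prev && cur != 0)
         runs++;        /* a new segment begins at this pixel */
      prev = cur;
   }
   return runs;
}
```

For the all-ones pattern and a 10-pixel line, the unnormalised comparison starts 10 segments, in line with the count quoted in the message, while the normalised one starts a single segment.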
* tgsi: add tgsi_full_src_register_from_dst helper function | Nicolai Hähnle | 2016-03-14 | 2 | -0/+20
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_inlines: add util_copy_image_view | Nicolai Hähnle | 2016-03-14 | 1 | -0/+10
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL | Nicolai Hähnle | 2016-03-14 | 1 | -0/+1
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>