summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* svga: remove unused svga_compile_key::texture_msaa fieldBrian Paul2016-04-022-2/+0
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* svga: check TXF instruction's target to determine MSAABrian Paul2016-04-021-1/+1
| | | | | | | | | | Rather than the currently bound texture. This goes along with the earlier patch to get away from examining bound textures and sampler views during shader translation. Fixes VMware bug 1632739. Reviewed-by: Jose Fonseca <[email protected]>
* tgsi: add simple tgsi_is_msaa_target() helperBrian Paul2016-04-021-0/+8
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* gallium: distinguish between shader IR in get_compute_paramBas Nieuwenhuizen2016-04-0211-33/+47
| | | | | | | | | | | | | For radeonsi, native and TGSI use different compilers and this results in different limits for different IR's. The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few others as shader related that seemed like they would also typically depend on the compiler. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add global buffer memory barrier bitBas Nieuwenhuizen2016-04-022-0/+3
| | | | | | | | | Currently radeonsi synchronizes after every dispatch and Clover does nothing to synchronize. This is overzealous, especially with GL compute, so add a barrier for global buffers. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add threads per block TGSI propertyBas Nieuwenhuizen2016-04-023-1/+13
| | | | | | | | | | The value 0 for unknown has been chosen to so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add compute shader IR typeBas Nieuwenhuizen2016-04-024-1/+6
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* nvc0: enable compute shaders on GK104 and GM107+Samuel Pitoiset2016-04-011-1/+2
| | | | | | | | | Compute support on GK110 is still unstable for weird reasons, but this can be fixed later as the NVF0_COMPUTE envvar prevent using compute. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bump the maximum number of UBOs for compute on KeplerSamuel Pitoiset2016-04-012-3/+0
| | | | | | | | The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS) per compute program must be at least 12. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: do not lower shared+atomics on GM107+Samuel Pitoiset2016-04-011-7/+10
| | | | | | | | For Maxwell, the ATOMS instruction can be used to perform atomic operations on shared memory instead of this load/store lowering pass. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add atomics support on shared memory for KeplerSamuel Pitoiset2016-04-012-1/+108
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix wrong pred emission for ld lock on GK104Samuel Pitoiset2016-04-011-1/+4
| | | | | | | | This fixes 84b9b8f (nvc0/ir: add missing emission of locked load predicate). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for compute UBOs on KeplerSamuel Pitoiset2016-04-012-1/+57
| | | | | | | | Make sure to avoid out of bounds access in presence of indirect array indexing by loading the size from the driver constant buffer. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add indirect compute support on KeplerSamuel Pitoiset2016-04-011-34/+77
| | | | | | | | | | The grid size is stored as three 32-bits integers in the indirect buffer but the launch descriptor uses a 32-bits integer for both griddim_y and griddim_z like this (z << 16) | y. To make it work, the 16 high bits of griddim_y are overwritten by griddim_z. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: reduce likelihood of collision for real buffers on KeplerSamuel Pitoiset2016-04-011-2/+2
| | | | | | | | | | | Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: store ubo info to the driver constbuf on KeplerSamuel Pitoiset2016-04-014-1/+30
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind user uniforms for compute on KeplerSamuel Pitoiset2016-04-012-27/+55
| | | | | | | | | | | Uniform buffer objects will be sticked to the driver constant buffer like buffers because the launch descriptor only allows 8 CBs. Input kernel parameters for OpenCL are still uploaded to screen->parm which is bound on c0, but this will be changed later with a new series. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind shader buffers for compute on KeplerSamuel Pitoiset2016-04-012-3/+39
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind driver cb for compute on c7[] for KeplerSamuel Pitoiset2016-04-014-45/+37
| | | | | | | | | | | | Instead of using the screen->parm buffer object which will be removed, upload auxiliary constants to uniform_bo to be consistent regarding what we already do for Fermi. This breaks surfaces support (for compute only) but this will be properly re-introduced later for ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallivm: Prevent disassembly debug output from being truncated.Jose Fonseca2016-04-011-9/+9
| | | | | | | | | | | By using os_log_message directly, as _debug_vprintf truncates messages to 4K. Also cleanup the disassemble interface. Spotted by Roland. Trivial.
* radeonsi: use util_strchrnul() to fix android build errorMauro Rossi2016-04-011-1/+2
| | | | | | | | | | | | | | Android Bionic does not support strchrnul() string function, gallium auxiliary util/u_string.h provides util_strchrnul() This change avoids the following building error: external/mesa/src/gallium/drivers/radeonsi/si_shader.c:3863: error: undefined reference to 'strchrnul' collect2: error: ld returned 1 exit status Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]>
* gallivm: Use vector selects on LLVM 3.3+.Jose Fonseca2016-04-011-3/+5
| | | | | | | | | | | | This is an old patch I had around. Vector selects seem to work well from LLVM 3.3. Using them should improve code quality, as it might make constant propagation pass more effective. Tested lp_test_* Reviewed-by: Roland Scheidegger <[email protected]>
* nv50,nvc0: add PIPE_BIND_LINEAR support to is_format_supportedIlia Mirkin2016-03-312-0/+18
| | | | | | | vdpau has recently come to rely on this, so make sure to check it properly. Signed-off-by: Ilia Mirkin <[email protected]>
* tgsi: silence compiler warning in fetch_sampler_unit()Samuel Pitoiset2016-04-011-1/+1
| | | | | | | | | The unit variable can be used uninitialized. Fixes: 24e77cb09 ("tgsi: handle indirect sampler arrays. (v2)") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* tgsi: fix out of bounds access in exec_atomop()Samuel Pitoiset2016-04-011-1/+1
| | | | | | | | | The number of channels must be 4 for all RGBA components. Fixes: 22d129601 ("tgsi: add support for image operations to tgsi_exec. (v2.1)") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* tgsi: split tgsi_util_get_texture_coord_dim() function into twoBrian Paul2016-03-316-47/+46
| | | | | | | | | | | | | | | It was kind of overloaded, returning two different things. Now get the index of the shadow reference src register with a new tgsi_util_get_shadow_ref_src_index() function. To verify the new code, I added some temp/debug code which looped over all TGSI_TEXTURE_x values, calling the old function and new and checking that the returned indexes matched. Also tested piglit "shadow" tests with softpipe/llvmpipe. No testing of ilo and radeonsi changes. Reviewed-by: Dave Airlie <[email protected]>
* tgsi: skip texture query opcodes when examining texture targetsBrian Paul2016-03-311-1/+15
| | | | | | | | | | Should fix the assertion in piglit spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the TXQ instruction specifies a 2D target but the sampler view was declared as SHADOW2D. Reviewed-by: Michel Dänzer <[email protected]> Tested-by: Michel Dänzer <[email protected]>
* nv50/ir: Check for valid insn instead of def sizePierre Moreau2016-03-311-2/+2
| | | | | | | | | | | | | This fixes a null pointer dereference during the register allocation pass, if a function had arguments. Functions arguments get a definition from the function itself, a definition which is therefore not linked to any instruction. If a value ends up having a definition but no linked instruction, the register allocation pass doesn't need to consider whether that value is generated by an instruction that can only handle "short" registers (on nv50). Signed-off-by: Pierre Moreau <[email protected]>
* softpipe: add image support to softpipe (v3)Dave Airlie2016-03-3114-12/+928
| | | | | | | | | | This adds support for ARB_shader_image_load_store to softpipe. v2: add RESQ support (Ilia) v3: constify, cleanup internals, add some comments (Brian). Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* draw: add support for passing images to vs/gs shaders.Dave Airlie2016-03-315-2/+29
| | | | | | | | This just adds support for passing through images to the tgsi execution stage. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi: add support for image operations to tgsi_exec. (v2.1)Dave Airlie2016-03-315-6/+319
| | | | | | | | | | | | This adds support for load/store/atomic operations on images along with image tracking support. v2: add RESQ support. (Ilia) v2.1: constify interface (Brian) split get_image_coord_dim (Brian) Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* softpipe: add support for explicit early depth testingDave Airlie2016-03-316-12/+21
| | | | | | | | | | | | ARB_shader_image_load_store adds support for explicit early depth testing. However we need to make sure we don't overwrite values using the shader written values in this case. This fixes early depth testing in softpipe to conform with those requirements. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi: introduce NonHelperMaskDave Airlie2016-03-312-0/+5
| | | | | | | | | This is a mask of which of the current 2x2 grid are non-helper invocations. This allows us to mask off the helper invocations later for the image operations. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi_exec: handle execmask when doing indirect lookupsDave Airlie2016-03-311-3/+9
| | | | | Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi_exec: add support for up to 3 address registers (v2)Dave Airlie2016-03-311-2/+3
| | | | | | | v2: be consistent with other definitions. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: ignore PIPE_BIND_LINEAR in *_is_format_supportedChristian König2016-03-302-0/+10
| | | | | | | | Similar to radeonsi linear layout should work for all not compressed or depth/stencil formats. Fixes issues with VDPAU on r600. Signed-off-by: Christian König <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* st/vdpau: correct null checkThomas Hindoe Paaboel Andersen2016-03-301-4/+4
| | | | | | | The null check of result was the wrong way around. Also, move memset and dereference of result after the null check. Reviewed-by: Christian König <[email protected]>
* tgsi: (trivial) only verify target for is_tex instructionsRoland Scheidegger2016-03-301-8/+7
| | | | | | | | | | | d3d10 state tracker does not encode (valid) target (only offsets are really used from the texture bits), since that information always comes from the sview dcl, and not the instruction (note the meaning of target is actually slightly different between gl and d3d10 in any case, because d3d10 target does never include shadow bit). Also move the msaa sampler identification as well - would need to set that on the sview not sampler, so while this does not fix it make it at least obvious it won't work with sample instructions.
* tgsi: simplify tgsi_shader_info::is_msaa_sampler checkingBrian Paul2016-03-291-3/+2
| | | | | | | | | We assert that fullinst->Instruction.Texture != 0 above so no need to check it in the conditional. We also have the fullinst->Texture.Texture value in a local variable, so use it. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi: collect texture sampler target info in tgsi_scan_shader()Brian Paul2016-03-292-2/+37
| | | | | | | | | | | | | | | Texture sample instructions specify a sampler unit and texture target such as "1D", "2D", "CUBE", etc. Sampler view declarations also specify the sampler unit and texture target. This patch checks that the texture instructions agree with the declarations and collects the texture target type for each sampler unit. v2: only compare instruction's texture target to the sampler view declaration target if the instruction is a TEX instruction, not a SAMPLE instruction. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/docs: s/gven/given/Brian Paul2016-03-291-1/+1
|
* gallium: Format code in pb_buffer_fenced.c according to style guide.Rovanion Luckey2016-03-291-129/+97
| | | | | | | | | | | | | | This is a tiny housekeeping patch which does the following: * Replaced tabs with three spaces. * Formatted oneline and multiline code comments. Some doxygen comments weren't marked as such and some code comments were marked as doxygen comments. * Spaces between if- and while-statements and their parenthesis. According to the mesa coding style guidelines. Reviewed-by: Brian Paul <[email protected]>
* svga: emit sampler declarations in the helper function for non vgpu10Charmaine Lee2016-03-293-3/+23
| | | | | | | | | | | | | With commit dc9ecf58c0c5c8a97cd41362e78c2fcd9f6e3b80, we are now getting the sampler target from the sampler view declaration. But since a sampler view declaration can be defined after a sampler declaration, we need to emit the sampler declarations in the pre-helpers function, otherwise, the sampler target might not have defined yet for the sampler declaration. Fixes viewperf maya-03 and various gl trace regressions in hwv11. Reviewed-by: Brian Paul <[email protected]>
* svga: avoid freeing non-malloced memoryBrian Paul2016-03-291-10/+2
| | | | | | | | | | | svga_shader_expand() will fall back to using non-malloced memory for emit.buf if malloc fails. We should check if the memory is malloced before freeing it in the error path of svga_tgsi_vgpu9_translate. Original patch by Thomas Hindoe Paaboel Andersen <[email protected]>. Remove trivial svga_destroy_shader_emitter() function, by BrianP. Signed-off-by: Brian Paul <[email protected]>
* nvc0/ir: move load/store lowering pass to handleLDST()Samuel Pitoiset2016-03-292-54/+61
| | | | | | | Having all this code in a big switch is not really a good pratice. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* st/vdpau: implement the new DMA-buf based interop v2Christian König2016-03-294-3/+116
| | | | | | | | | That should allow us to get away from passing internal structures around. v2: rebased Signed-off-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* st/vdpau: move FormatRGBAToPipe into the interopChristian König2016-03-295-28/+73
| | | | | | | We are going to need that in the Mesa state tracker as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* st/vdpau: add new interop interfaceChristian König2016-03-292-1/+100
| | | | | | | | Use DMA-buf for the VDPAU interop interface instead of using internal structures. Signed-off-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* st/vdpau: use linear layout for output surfacesChristian König2016-03-291-1/+2
| | | | | | | | Works around a bug in radeonsi and tiling is actually not very beneficial in this use case. Signed-off-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* radeonsi: ignore PIPE_BIND_LINEAR in si_is_format_supported v2Christian König2016-03-291-0/+5
| | | | | | | | | Linear layout should work for all not compressed or depth/stencil formats. v2: restrict it a bit more Signed-off-by: Christian König <[email protected]> Reviewed-by: Marek Olšák <[email protected]>