summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: Lower TGSI_OPCODE_LOAD down to LLVM op (v3)Nicolai Hähnle2016-03-211-0/+139
| | | | | | | v2: new signature style for buffer intrinsics (offsets) v3: new signature style for llvm.amdgcn.buffer.load.format (overloaded return) Reviewed-by: Marek Olšák <[email protected]> (v2)
* radeonsi: extract the LLVM type name construction into its own functionNicolai Hähnle2016-03-211-7/+19
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Lower TGSI_OPCODE_RESQ down to LLVM opNicolai Hähnle2016-03-211-0/+129
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract TXQ buffer size computation into its own functionNicolai Hähnle2016-03-211-20/+35
| | | | | | This will allow it to be reused for RESQ. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: decompress shader imagesNicolai Hähnle2016-03-211-3/+33
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: update shader image descriptor for invalidated bufferNicolai Hähnle2016-03-211-1/+21
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement set_shader_images (v2)Nicolai Hähnle2016-03-216-29/+254
| | | | | | | | | Whether DCC is disabled depends on the access flags with which the image is bound: image_load supports DCC, but store and atomic don't. v2: remove an unnecessary masking of images->desc.enabled_mask Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: make r600_texture_disable_dcc externally accessibleNicolai Hähnle2016-03-212-2/+4
| | | | | | We will need it in radeonsi for shader images. Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: track which shader images are really buffersNicolai Hähnle2016-03-212-0/+7
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: add images_writemaskNicolai Hähnle2016-03-212-2/+21
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add additional PIPE_BARRIER_* bitsNicolai Hähnle2016-03-211-0/+7
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* svga: add svga_winsys_context::pipe_debug_callback pointerBrian Paul2016-03-212-2/+9
| | | | | | | | The svga winsys modules can use this to send debug messages to the state tracker and Mesa. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* svga: Fix the index buffer rebind regressionCharmaine Lee2016-03-211-7/+2
| | | | | | | | | | | | | The index buffer handle saved in the hw_state structure could be invalid after the index buffer is destroyed. Instead of rebinding the index buffer using the saved index buffer handle, we will reset the index buffer handle in the hw_state structure to force resending of the index buffer. Fixes bug 1593320 Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* svga: rebind stream output targetsCharmaine Lee2016-03-213-0/+27
| | | | | | | | | To ensure stream output target surfaces are available for the draw commands, we need to rebind the current stream output targets at the first draw in the command buffer. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* svga: rebind index bufferCharmaine Lee2016-03-212-0/+9
| | | | | | | | | | | Similar to other resources, current index buffer needs to be rebound at the first draw of the current command buffer to make sure the buffer is available for the draw command. Fixes bug 1587263. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* svga: minor formatting fix, comment additionBrian Paul2016-03-211-1/+4
| | | | | | To sync with our internal tree. Signed-off-by: Brian Paul <[email protected]>
* svga: optimize constant buffer uploadsCharmaine Lee2016-03-211-1/+11
| | | | | | | | | | | | | | | | | When a constant buffer slot is allocated in the upload buffer, the allocated slot size is always in multiple of 256. But the actual buffer size might not be in multiple of 256. This causes a gap between the ending offset of a slot and the starting offset of the next slot. The gap will prevent the two slots to be updated in a single update command. In order to maximize the chance of merging the contiguous dirty ranges, when a slot is to be allocated in the constant upload buffer, specify a buffer size in multiple of 256. There is about 10% performance improvement with Lightsmark2008 and 30% with Cinebench R11. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Thomas Hellstrom <[email protected]>
* svga: add a few more resource updates HUD queryCharmaine Lee2016-03-216-22/+91
| | | | | | | | | | | | This patch adds the following HUD queries: .num-resource-updates -- number of resource update. Commands include UPDATE_SUBRESOURCE, UPDATE_GB_IMAGE. .num-buffer-uploads -- number of buffer uploads. .num-const-buf-updates -- number of set constant buffer. .num-const-updates -- number of set shader constant. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Thomas Hellstrom <[email protected]>
* svga: add new num-readbacks HUD queryCharmaine Lee2016-03-215-7/+24
| | | | | | | To find out how many image readback command is issued. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Thomas Hellstrom <[email protected]>
* svga: use shader sampler view declarationsBrian Paul2016-03-216-47/+74
| | | | | | | | | | | | | | Previously, we looked at the bound textures (via the pipe_sampler_views) to determine texture dimensions (1D/2D/3D/etc) and datatype (float vs. int). But this could fail in out of memory conditions. If we failed to allocate a texture and didn't create a pipe_sampler_view, we'd default to using 0 (PIPE_BUFFER) as the texture type. This led to device errors because of inconsistent shader code. This change relies on all TGSI shaders having an SVIEW declaration for each SAMP declaration. The previous patch series does that. Reviewed-by: Charmaine Lee <[email protected]>
* gallium/tests: declare sampler views in shadersBrian Paul2016-03-213-0/+3
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* gallium/util: declare sampler view in util_make_fs_blit_msaa_depthstencil()Brian Paul2016-03-211-1/+2
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* postprocess: declare sampler views in shadersBrian Paul2016-03-212-0/+9
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* hud: add sampler view declaration in text fragment shaderBrian Paul2016-03-211-0/+1
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* gallium/tgsi: pass TGSI tex target to tgsi_transform_tex_inst()Brian Paul2016-03-213-17/+20
| | | | | | | Instead of hard-coded 2D tex target in tgsi_transform_tex_2d_inst() Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* gallium: Remove unused TGSI_RESOURCE_ definesHans de Goede2016-03-211-9/+0
| | | | | | | | | These magic file-index defines where only ever used in the nouveau code and that no longer uses them. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (v2) Reviewed-by: Marek Olšák <[email protected]> (v2)
* nouveau: codegen: Do not silently fail in handeLOAD / handleSTORE / handleATOMHans de Goede2016-03-211-9/+18
| | | | | | | | | handeLOAD / handleSTORE / handleATOM can only handle TGSI_FILE_BUFFER and TGSI_FILE_MEMORY. Make things fail explictly when another register-file is used in these functions. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* nouveau: codegen: Disable more old resource handling codeHans de Goede2016-03-211-3/+12
| | | | | | | | | | | | | | Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses") disabled / commented out some of the old resource handling code, but not all of it. Effectively all of it is dead already, if we ever enter the old code paths in handeLOAD / handleSTORE / handleATOM we will get an exception due to trying to access the now always zero-sized resources vector. Disable all the dead code. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* nouveau: codegen: gk110: Make emitSTORE offset handling identical to emitLOADHans de Goede2016-03-211-3/+1
| | | | | | | | | | Make the store offset handling in CodeEmitterGK110::emitSTORE identical to the one in CodeEmitterGK110::emitLOAD handling. This is just a cleanup, it does not cause any functional changes. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: codegen: Slightly refactor Source::scanInstruction() dst handlingHans de Goede2016-03-211-6/+6
| | | | | | | | | | | | Use the dst temp variable which was used in the TGSI_FILE_OUTPUT case everywhere. This makes the code somewhat easier to reads and helps avoiding going over 80 chars with upcoming changes. This also brings the dst handling more in line with the src handling. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: codegen: Add support for clover / OpenCL kernel input parametersHans de Goede2016-03-211-3/+15
| | | | | | | | Add support for clover / OpenCL kernel input parameters. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v1) Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* tgsi: Add support for global / private / input MEMORYHans de Goede2016-03-217-25/+50
| | | | | | | | | | | | | | | | Extend the MEMORY file support to differentiate between global, private and shared memory, as well as "input" memory. "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a special memory type is added for this, since the actual storage of these (e.g. UBO-s) may differ per implementation. The uploading of kernel parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers to use an access mechanism for parameter reads which matches with the upload method. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v1) Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi textHans de Goede2016-03-211-0/+6
| | | | | | | | | When support for decl.Atomic and .Shared was added, tgsi_build_declaration was not updated to propagate these properly. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v1) Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* tgsi: Fix return of uninitialized memory in tgsi_*_instruction_memoryHans de Goede2016-03-201-0/+4
| | | | | | | | | | | | | tgsi_default_instruction_memory / tgsi_build_instruction_memory were returning uninitialized memory for tgsi_instruction_memory.Texture and tgsi_instruction_memory.Format. Note 0 means not set, and thus is a correct default initializer for these. Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory") Cc: Nicolai Hähnle <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/omx/dec: Correct the timestampingNishanth Peethambaran2016-03-204-8/+34
| | | | | | | | | Attach the timestamp to the dpb buffer and use that timestamp while pushing buffer from dpb list to the omx client. Reviewed-by: Christian König <[email protected]> Signed-off-by: Nishanth Peethambaran <[email protected]> Cc: "11.1 11.2" <[email protected]>
* st/omx: Remove trailing spacesNishanth Peethambaran2016-03-203-31/+31
| | | | | | Reviewed-by: Christian König <[email protected]> Signed-off-by: Nishanth Peethambaran <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50/ir: fix indirect texturing for non-array textures on nvc0Ilia Mirkin2016-03-201-3/+7
| | | | | | | | | | | | | | | | | | | If a layer parameter is provided, we want to flip it to position 0 (and combine it with any indirect params). However if the target is not an array, there is no layer, so we have to shift all of the arguments down by one to make room for it. This fixes situations where there were non-coordinate parameters, such as bias, lod, depth compare, explicit derivatives. Instead of adding a new parameter at the front for the indirect reference, we would swap one of those in its place. Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.uniform.compute.*shadow Signed-off-by: Ilia Mirkin <[email protected]> Reported-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50/ir: normalize cube coordinates after derivatives have been computedIlia Mirkin2016-03-204-15/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In "manual" derivative mode (always used on nv50 and sometimes on nvc0 but always for cube), the idea is that using the quadop instruction, we set up the "other" quads to have values such that the derivatives work out, and then run the texture instruction as if nothing were strange. It pulls values from the other lanes, and does its magic. However cube coordinates have to be normalized - one of the 3 coords has to be 1, to determine which is the major axis, to say which face is being sampled. We were normalizing the coordinates first, and then adding the derivatives. This is wrong for two reasons: - the coordinates got normalized by a scaling factor but the derivatives didn't - the result of the addition didn't end up normalized To resolve this, we flip the logic around to normalize *after* the per-lane coordinates are set up. This fixes a bunch of textureGrad cube dEQP tests. NOTE: nv50 cube arrays with explicit derivatives are still broken, to be resolved at a later date. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* gallium/radeon: remove remnants of R600 TGSI->LLVMMarek Olšák2016-03-202-20/+0
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: flatten if (1) statement after removal of TGSI->LLVMMarek Olšák2016-03-201-81/+79
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: remove TGSI->LLVM translationMarek Olšák2016-03-209-1107/+89
| | | | | | | | | | It was useful for testing and as a prototype for radeonsi bringup, but it's not used anymore and doesn't support OpenGL 3.3 even. v2: try to fix OpenCL build Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Jan Vesely <[email protected]>
* gallium/radeon: remove old CS tracingMarek Olšák2016-03-2020-476/+23
| | | | | | | | | | | | | | Cons: - it was only integrated in r600g - it doesn't work with GPUVM - it records buffer contents at the end of IBs instead of at the beginning, so the replay isn't exact - it lacks an IB parser and user-friendliness A better solution is apitrace in combination with gallium/ddebug, which has a complete IB parser and can pinpoint hanging CP packets. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: process TGSI property NEXT_SHADERMarek Olšák2016-03-192-3/+33
| | | | | | | | | | | | This allows compiling the main shader part as ES or LS. If we get the correct hint, non-separable GLSL shaders no longer have to be compiled as VS first, followed by LS or ES compiled on demand. The result is that fewer shaders are compiled by piglit, but it doesn't improve piglit running time. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add TGSI property NEXT_SHADERMarek Olšák2016-03-195-1/+32
| | | | | | | | | | | | | | | | | | | | | | | | | Radeonsi needs to know which shader stage will execute after a shader in order to make the best decision about which shader variant to compile first. This is only set for VS and TES, because we don't need it elsewhere. VS has 3 variants: - next shader is FS - next shader is GS - next shader is TCS TES has 2 variants: - next shader is FS - next shader is GS Currently, radeonsi always assumes the next shader is FS, which is suboptimal, since st/mesa always knows which shader is next if the GLSL program is not a "separate shader". By default, ureg always sets "next shader is FS". Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0/ir: Use double constant in handleSQRTPierre Moreau2016-03-191-1/+1
| | | | | | Fixes: a100d89d0998 (nv50,nvc0: Fix invalid constant.) Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: force-enable derivatives on TXD opsIlia Mirkin2016-03-192-1/+4
| | | | | | | | | This matters especially in vertex shaders, where derivatives are disabled by default. This fixes textureGrad in vertex shaders on nv50. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50: reset TFB bufctx when we no longer hold a reference to the buffersIlia Mirkin2016-03-192-3/+3
| | | | | | | | | | | | | This fix is analogous to commit ff085d014. This fixes some use-after-free situations in dEQP when an xfb state is removed, and then a clear is triggered, which only does a partial validation. It would attempt to read the no-longer-valid buffers, resulting in crashes. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0: avoid using magic numbers for the uniform_bo offsetsSamuel Pitoiset2016-03-198-44/+73
| | | | | | | | | | | | | | | Instead make use of constants to improve readability. The first 32 bytes of the driver constant buffer are unknown... This doesn't seem to be used in the codegen part, but if the texBindBase offset is shifted from 0x20 to 0x00, this breaks the universe for really weird reasons. This sounds like to be related to textures. Anyway, name this NVC0_CB_AUX_UNK_INFO and add a todo should be enough for now. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nv50/ir: make use of auxCBSlot instead of magic numbersSamuel Pitoiset2016-03-192-2/+4
| | | | | | | | | | | | | This avoids using magic numbers for the driver constbuf slot which is always 15 except for compute shaders on gk104+ where the slot 0 is used. For gk104+, some special compute-related values like the thread index are uploaded to screen->parm which is currently bound on c0. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Acked-by: Pierre Moreau <[email protected]>
* nv50,nvc0: replace resInfoCBSlot by auxCBSlotSamuel Pitoiset2016-03-196-16/+10
| | | | | | | | | Having two different variables for the driver constant buffer slot is confusing and really useless. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Acked-by: Pierre Moreau <[email protected]>