summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nouveau: codegen: Slightly refactor Source::scanInstruction() dst handlingHans de Goede2016-03-211-6/+6
| | | | | | | | | | | | Use the dst temp variable which was used in the TGSI_FILE_OUTPUT case everywhere. This makes the code somewhat easier to reads and helps avoiding going over 80 chars with upcoming changes. This also brings the dst handling more in line with the src handling. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: codegen: Add support for clover / OpenCL kernel input parametersHans de Goede2016-03-211-3/+15
| | | | | | | | Add support for clover / OpenCL kernel input parameters. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v1) Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* tgsi: Add support for global / private / input MEMORYHans de Goede2016-03-218-26/+51
| | | | | | | | | | | | | | | | Extend the MEMORY file support to differentiate between global, private and shared memory, as well as "input" memory. "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a special memory type is added for this, since the actual storage of these (e.g. UBO-s) may differ per implementation. The uploading of kernel parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers to use an access mechanism for parameter reads which matches with the upload method. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v1) Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi textHans de Goede2016-03-211-0/+6
| | | | | | | | | When support for decl.Atomic and .Shared was added, tgsi_build_declaration was not updated to propagate these properly. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v1) Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* doc: document spilling options accepted by INTEL_DEBUGIago Toral Quiroga2016-03-211-0/+2
| | | | Reviewed-by: Jordan Justen <[email protected]>
* tgsi: Fix return of uninitialized memory in tgsi_*_instruction_memoryHans de Goede2016-03-201-0/+4
| | | | | | | | | | | | | tgsi_default_instruction_memory / tgsi_build_instruction_memory were returning uninitialized memory for tgsi_instruction_memory.Texture and tgsi_instruction_memory.Format. Note 0 means not set, and thus is a correct default initializer for these. Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory") Cc: Nicolai Hähnle <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: report correct precision information for low/medium/high intsIlia Mirkin2016-03-201-0/+7
| | | | | | | | | | | | | | | | | When we have native integers, these have full precision. Whether they're low/medium/high isn't piped through the TGSI yet, but eventually those might have differing precisions. For now they're just 32-bit ints. Fixes the following dEQP tests: dEQP-GLES3.functional.state_query.shader.precision_vertex_highp_int dEQP-GLES3.functional.state_query.shader.precision_fragment_highp_int which expected highp ints to have full 32-bit precision, not the default 23-bit float precision. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* st/omx/dec: Correct the timestampingNishanth Peethambaran2016-03-204-8/+34
| | | | | | | | | Attach the timestamp to the dpb buffer and use that timestamp while pushing buffer from dpb list to the omx client. Reviewed-by: Christian König <[email protected]> Signed-off-by: Nishanth Peethambaran <[email protected]> Cc: "11.1 11.2" <[email protected]>
* st/omx: Remove trailing spacesNishanth Peethambaran2016-03-203-31/+31
| | | | | | Reviewed-by: Christian König <[email protected]> Signed-off-by: Nishanth Peethambaran <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50/ir: fix indirect texturing for non-array textures on nvc0Ilia Mirkin2016-03-201-3/+7
| | | | | | | | | | | | | | | | | | | If a layer parameter is provided, we want to flip it to position 0 (and combine it with any indirect params). However if the target is not an array, there is no layer, so we have to shift all of the arguments down by one to make room for it. This fixes situations where there were non-coordinate parameters, such as bias, lod, depth compare, explicit derivatives. Instead of adding a new parameter at the front for the indirect reference, we would swap one of those in its place. Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.uniform.compute.*shadow Signed-off-by: Ilia Mirkin <[email protected]> Reported-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2" <[email protected]>
* st/mesa: only minify depth for 3d targetsIlia Mirkin2016-03-201-1/+4
| | | | | | | | | | | | | | | | We make sure that that image depth matches the level's depth before copying it into place. However we should only be minifying the first level's depth for 3d textures - array textures have the same depth for all levels. This fixes tests such as dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.* and I suspect account for a number of other odd situations I've run into where level > 0 of array textures was messed up. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50/ir: normalize cube coordinates after derivatives have been computedIlia Mirkin2016-03-204-15/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In "manual" derivative mode (always used on nv50 and sometimes on nvc0 but always for cube), the idea is that using the quadop instruction, we set up the "other" quads to have values such that the derivatives work out, and then run the texture instruction as if nothing were strange. It pulls values from the other lanes, and does its magic. However cube coordinates have to be normalized - one of the 3 coords has to be 1, to determine which is the major axis, to say which face is being sampled. We were normalizing the coordinates first, and then adding the derivatives. This is wrong for two reasons: - the coordinates got normalized by a scaling factor but the derivatives didn't - the result of the addition didn't end up normalized To resolve this, we flip the logic around to normalize *after* the per-lane coordinates are set up. This fixes a bunch of textureGrad cube dEQP tests. NOTE: nv50 cube arrays with explicit derivatives are still broken, to be resolved at a later date. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* gallium/radeon: remove remnants of R600 TGSI->LLVMMarek Olšák2016-03-202-20/+0
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: flatten if (1) statement after removal of TGSI->LLVMMarek Olšák2016-03-201-81/+79
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: remove TGSI->LLVM translationMarek Olšák2016-03-2010-1121/+90
| | | | | | | | | | It was useful for testing and as a prototype for radeonsi bringup, but it's not used anymore and doesn't support OpenGL 3.3 even. v2: try to fix OpenCL build Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Jan Vesely <[email protected]>
* gallium/radeon: remove old CS tracingMarek Olšák2016-03-2020-476/+23
| | | | | | | | | | | | | | Cons: - it was only integrated in r600g - it doesn't work with GPUVM - it records buffer contents at the end of IBs instead of at the beginning, so the replay isn't exact - it lacks an IB parser and user-friendliness A better solution is apitrace in combination with gallium/ddebug, which has a complete IB parser and can pinpoint hanging CP packets. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: process TGSI property NEXT_SHADERMarek Olšák2016-03-192-3/+33
| | | | | | | | | | | | This allows compiling the main shader part as ES or LS. If we get the correct hint, non-separable GLSL shaders no longer have to be compiled as VS first, followed by LS or ES compiled on demand. The result is that fewer shaders are compiled by piglit, but it doesn't improve piglit running time. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: set TGSI property NEXT_SHADERMarek Olšák2016-03-191-0/+36
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add TGSI property NEXT_SHADERMarek Olšák2016-03-195-1/+32
| | | | | | | | | | | | | | | | | | | | | | | | | Radeonsi needs to know which shader stage will execute after a shader in order to make the best decision about which shader variant to compile first. This is only set for VS and TES, because we don't need it elsewhere. VS has 3 variants: - next shader is FS - next shader is GS - next shader is TCS TES has 2 variants: - next shader is FS - next shader is GS Currently, radeonsi always assumes the next shader is FS, which is suboptimal, since st/mesa always knows which shader is next if the GLSL program is not a "separate shader". By default, ureg always sets "next shader is FS". Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0/ir: Use double constant in handleSQRTPierre Moreau2016-03-191-1/+1
| | | | | | Fixes: a100d89d0998 (nv50,nvc0: Fix invalid constant.) Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: Disallow GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME on winsys FBO.Kenneth Graunke2016-03-191-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | Fixes: dEQP-GLES3.functional.negative_api.state.get_framebuffer_attachment_parameteriv Apparently, GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME is not allowed when GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is GL_FRAMEBUFFER_DEFAULT, and is expected to result in a GL_INVALID_ENUM error. No GL specification actually defines what GL_FRAMEBUFFER_DEFAULT means. It probably means the window system FBO. It also doesn't mention the behavior of any queries for that type. Various ARB folks seem fairly confused about it too. For now, just do something vaguely like what dEQP expects. I think we probably need to check the visual bits against 0 for the attachment, but we haven't been doing that thusfar, and given how confusingly this is specified, I can't imagine anyone relying on it. v2: Improve comments, move error condition above the _mesa_get_fb0_attachment call, add forgotten "return" (all suggested/caught by Jordan Justen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* nv50/ir: force-enable derivatives on TXD opsIlia Mirkin2016-03-192-1/+4
| | | | | | | | | This matters especially in vertex shaders, where derivatives are disabled by default. This fixes textureGrad in vertex shaders on nv50. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50: reset TFB bufctx when we no longer hold a reference to the buffersIlia Mirkin2016-03-192-3/+3
| | | | | | | | | | | | | This fix is analogous to commit ff085d014. This fixes some use-after-free situations in dEQP when an xfb state is removed, and then a clear is triggered, which only does a partial validation. It would attempt to read the no-longer-valid buffers, resulting in crashes. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0: avoid using magic numbers for the uniform_bo offsetsSamuel Pitoiset2016-03-198-44/+73
| | | | | | | | | | | | | | | Instead make use of constants to improve readability. The first 32 bytes of the driver constant buffer are unknown... This doesn't seem to be used in the codegen part, but if the texBindBase offset is shifted from 0x20 to 0x00, this breaks the universe for really weird reasons. This sounds like to be related to textures. Anyway, name this NVC0_CB_AUX_UNK_INFO and add a todo should be enough for now. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nv50/ir: make use of auxCBSlot instead of magic numbersSamuel Pitoiset2016-03-192-2/+4
| | | | | | | | | | | | | This avoids using magic numbers for the driver constbuf slot which is always 15 except for compute shaders on gk104+ where the slot 0 is used. For gk104+, some special compute-related values like the thread index are uploaded to screen->parm which is currently bound on c0. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Acked-by: Pierre Moreau <[email protected]>
* nv50,nvc0: replace resInfoCBSlot by auxCBSlotSamuel Pitoiset2016-03-196-16/+10
| | | | | | | | | Having two different variables for the driver constant buffer slot is confusing and really useless. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Acked-by: Pierre Moreau <[email protected]>
* nv50/ir: fix compilation warning in handleSharedATOM()Samuel Pitoiset2016-03-191-0/+1
| | | | | | | | In release build mode only, op may be used uninitialized because the assertion has been removed. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: Fix invalid constant.Vinson Lee2016-03-181-1/+1
| | | | | | | | | | | | | Fix clang build error. CXX codegen/nv50_ir_lowering_nvc0.lo codegen/nv50_ir_lowering_nvc0.cpp:1783:42: error: invalid suffix 'd' on floating constant Value *zero = bld.loadImm(NULL, 0.0d); ^ Fixes: c1e4a6bfbf01 ("nv50,nvc0: handle SQRT lowering inside the driver") Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: Do proper format error checks for GenerateMipmap in ES 3.x.Kenneth Graunke2016-03-181-0/+14
| | | | | | | | | | | | | | | | | | | | | | | According to the OpenGL ES 3.2 spec's description of GenerateMipmap: "An INVALID_OPERATION error is generated if the levelbase array was not specified with an unsized internal format from table 8.3 or a sized internal format that is both color-renderable and texture-filterable according to table 8.10." Similar text exists in the ES 3.0 specification as well. Our existing rules are pretty close, but miss a few things. The OpenGL specification actually doesn't have any text about internal format checking - our existing code comes from a Khronos bug report. The ES 3.x spec provides a clearer description. Fixes dEQP-GLES3.functional.negative_api.texture.generatemipmap and dEQP-GLES2.functional.negative_api.texture.generatemipmap_zero_level _array_compressed. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* mesa: Add color renderable/texture filterable format info for ES 3.x.Kenneth Graunke2016-03-182-0/+90
| | | | | | | | | | | | OpenGL ES 3.x contains a table of sized internal formats and their required properties. In particular, each format is marked as "Color Renderable" or "Texture Filterable". This patch introduces two functions that can be used to query the information from that table. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Stop XY clipping point and line primitives.Kenneth Graunke2016-03-181-1/+7
| | | | | | | | | | | | | | | | | | | | | | | Wide points and lines are not supposed to be clipped by the viewport. Rather, they should be rendered, and any fragments outside of the viewport should be discarded. The traditional use case for this behavior is rendering moving wide point particles. When the center of the point approaches the viewport edge, clipping would make it pop out of view early. Fixes: - dEQP-GLES2.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_center - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_corner Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Scissor to the viewport when rendering points/lines.Kenneth Graunke2016-03-182-5/+8
| | | | | | | | | | | | | | We're about to start allowing wide points/lines whose vertices are outside the viewport past the clipper. This scissoring hack ensures that any fragments generated are still restricted to the viewport. It is not necessary on Gen8+ as those platforms already discard fragments which are outside the viewport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Include the viewport in the scissor rectangle.Kenneth Graunke2016-03-181-4/+4
| | | | | | | | | | We'll need to use scissoring to restrict fragments to the viewport soon. It seems harmless to include it generally, so let's do that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Introduce an is_drawing_lines() helper.Kenneth Graunke2016-03-181-0/+30
| | | | | | | | | Similar to is_drawing_points(). v2: Account for isoline tessellation output topology. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Move is_drawing_points to brw_state.h.Kenneth Graunke2016-03-182-24/+24
| | | | | | | | | I need to use this in multiple source files. v2: Rebase on TES output domain fix. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Fix gl_TessLevelOuter[] for isolines.Kenneth Graunke2016-03-182-6/+22
| | | | | | | | | | | | | | | | | | | | | | | Thanks to James Legg for finding this! From the ARB_tessellation_shader spec: "The number of isolines generated is derived from the first outer tessellation level; the number of segments in each isoline is derived from the second outer tessellation level." According to the PRM, "TF.LineDensity determines # lines" while "TF.LineDetail determines # segments". Line Density is stored at DWord 6, while Line Detail is at DWord 7. So, they're not reversed like they are for triangles and quads. Fixes Piglit's spec/arb_tessellation_shader/execution/isoline, and about 24 dEQP isoline tests (with GL_EXT_tessellation_shader hacked on - it's not normally enabled). Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94524 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Decode non-normalized coordinates bit in SAMPLER_STATE.Kenneth Graunke2016-03-181-2/+3
| | | | | | | We weren't printing this for some reason. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965: Account for TES in is_drawing_points().Kenneth Graunke2016-03-181-1/+8
| | | | | | | | | | | | | Now that we implement tessellation shaders, the TES might be the last stage enabled. If it's outputting points, then the primitive type reaching the SF is points. We need to account for this. Caught by Ilia Mirkin. v2: Update dirty bit comment above caller (caught by Iago) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nv50: Mark compute states as dirty on context switchPierre Moreau2016-03-191-0/+1
| | | | | | Signed-off-by: Pierre Moreau <[email protected]> [ Samuel Pitoiset: Trivial rebase conflict ] Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: print SUBFM subopsSamuel Pitoiset2016-03-191-0/+9
| | | | | | | Only 3d subop is currently emitted. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50: add a new validation path for computeSamuel Pitoiset2016-03-191-12/+13
| | | | | | | | | This makes use of the new state validation interface to be consistent with 3d. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Tested-by: Pierre Moreau <[email protected]>
* nv50: rework nv50_compute_validate_program()Samuel Pitoiset2016-03-193-31/+17
| | | | | | | | | | | Reduce the amount of duplicated code by re-using nv50_program_validate(). While we are at it, change the prototype to return void. We don't check anymore if the translation fails but improving the state validation is a long process. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Tested-by: Pierre Moreau <[email protected]>
* nv50: rework the validation path for 3DSamuel Pitoiset2016-03-194-16/+36
| | | | | | | | | This exposes an interface for state validation that will be also used to rework the compute validation path. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Tested-by: Pierre Moreau <[email protected]>
* nv50: rename 3d binding points to NV50_BIND_3D_XXXSamuel Pitoiset2016-03-198-46/+46
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Tested-by: Pierre Moreau <[email protected]>
* nv50: rename 3d dirty flags to NV50_NEW_3D_XXXSamuel Pitoiset2016-03-198-112/+112
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Tested-by: Pierre Moreau <[email protected]>
* nv50: rename NV50_COMPUTE to NV50_CPSamuel Pitoiset2016-03-193-52/+52
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Tested-by: Pierre Moreau <[email protected]>
* nv50: rename nv50_context::dirty to nv50_context::dirty_3dSamuel Pitoiset2016-03-198-57/+57
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Tested-by: Pierre Moreau <[email protected]>
* st/mesa: clean up st_translate_texture_target()Brian Paul2016-03-181-25/+44
| | | | | | Reformat code. Improve assertion. Reviewed-by: Charmaine Lee <[email protected]>
* st/mesa: simplify drawpixels shader code with tgsi transform helper functionsBrian Paul2016-03-181-64/+18
| | | | Reviewed-by: Charmaine Lee <[email protected]>
* st/mesa: simplify bitmap shader code with tgsi transform helper functionsBrian Paul2016-03-181-37/+8
| | | | Reviewed-by: Charmaine Lee <[email protected]>