path: root/src
Commit message | Author | Age | Files | Lines
* main: array stride for unsized arrays of arrays are calculated like records | Samuel Iglesias Gonsalvez | 2015-10-06 | 1 | -1/+1
    Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* glsl: add std430 layout support for AoA | Samuel Iglesias Gonsalvez | 2015-10-06 | 1 | -5/+7
    Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* i965: add EXT_polygon_offset_clamp support to gen4/gen5 | Ilia Mirkin | 2015-10-05 | 7 | -9/+30
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Ilia Mirkin <[email protected]>
* meta: Update comment about unsupported texture types. | Matt Turner | 2015-10-05 | 1 | -2/+1
    Ken added support for 2DArray (commit ec23d5197e) and 1DArray (commit 14ca61125) last year.

    Reviewed-by: Anuj Phogat <[email protected]>
* glx: Drop CRAY support. | Matt Turner | 2015-10-05 | 2 | -102/+0
    It couldn't have worked anyway. There were calls to undefined functions.

    Reviewed-by: Emil Velikov <[email protected]>
* glsl: Remove CSE pass. | Matt Turner | 2015-10-05 | 4 | -475/+0
    With NIR, it actually hurts things.

    total instructions in shared programs: 6529329 -> 6528888 (-0.01%)
    instructions in affected programs: 14833 -> 14392 (-2.97%)
    helped: 299
    HURT: 1

    In all affected programs I inspected (including the single hurt one) the pass CSE'd some multiplies and caused some reassociation (e.g., caused (A * B) * C to be A * (B * C)) when the original intermediate result was reused elsewhere.

    Acked-by: Kenneth Graunke <[email protected]>
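The percentages in the entry above follow from the instruction counts. A minimal sketch of that arithmetic, assuming each figure is the relative change from the old count to the new one (the name `percent_change` is hypothetical and not part of any Mesa tool):

```cpp
// Relative change from `before` to `after`, in percent.
// A negative result means the instruction count went down.
double percent_change(unsigned long before, unsigned long after)
{
    return (static_cast<double>(after) - static_cast<double>(before))
           / static_cast<double>(before) * 100.0;
}
```

For the affected programs, `percent_change(14833, 14392)` comes out at roughly -2.97, which agrees with the figure quoted in the commit message.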
* i965: Generalize predicated break pass for use in vec4 backend. | Matt Turner | 2015-10-05 | 5 | -12/+16
    instructions in affected programs: 44204 -> 43762 (-1.00%)
    helped: 221

    Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Use backend_instruction in predicated break peephole. | Matt Turner | 2015-10-05 | 1 | -4/+4
    We're not using any fs_inst fields, and the next commit will make the peephole used by the vec4 backend.

    Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Remove SNB embedded-comparison support from optimizations. | Matt Turner | 2015-10-05 | 2 | -32/+3
    We never emit IF instructions with an embedded comparison (lost in the switch to NIR), so this code is not used.

    If we want to readd support, we should have a pass that merges a CMP instruction with an IF or a WHILE instruction after other optimizations have run.

    Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: Add missing _mm_mfence() before streaming loads. | Matt Turner | 2015-10-05 | 1 | -0/+3
    According to the Intel Software Development Manual (Volume 1: Basic Architecture, 12.10.3 Streaming Load Hint Instruction):

        Streaming loads may be weakly ordered and may appear to software to execute out of order with respect to other memory operations. Software must explicitly use fences (e.g. MFENCE) if it needs to preserve order among streaming loads or between streaming loads and other memory operations.

    That is, a memory fence is needed to preserve the order between the GPU writing the buffer and the streaming loads reading it back.

    Reported-by: Joseph Nuzman <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
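The ordering requirement in the entry above can be illustrated with a small copy routine that issues `_mm_mfence()` before any `_mm_stream_load_si128()`. This is only a sketch under stated assumptions: `streaming_copy` is a hypothetical helper rather than Mesa's function, the source must be 16-byte aligned, the length must be a multiple of 16, and the `target` attribute assumes GCC or Clang.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>  // SSE4.1 intrinsics; also pulls in the SSE2 header declaring _mm_mfence

// Hypothetical helper, not Mesa's function. Assumes `src` is 16-byte aligned
// and `len` is a multiple of 16.
__attribute__((target("sse4.1")))
void streaming_copy(void *dst, void *src, std::size_t len)
{
    // Order earlier memory operations before the weakly ordered streaming loads below.
    _mm_mfence();

    std::uint8_t *d = static_cast<std::uint8_t *>(dst);
    std::uint8_t *s = static_cast<std::uint8_t *>(src);
    for (std::size_t i = 0; i < len; i += 16) {
        __m128i v = _mm_stream_load_si128(reinterpret_cast<__m128i *>(s + i));
        _mm_storeu_si128(reinterpret_cast<__m128i *>(d + i), v);
    }
}
#else
// Portable fallback so the sketch still builds on non-x86 targets.
void streaming_copy(void *dst, void *src, std::size_t len)
{
    std::memcpy(dst, src, len);
}
#endif
```

On ordinary write-back memory the streaming load behaves like a regular load, so the sketch only demonstrates where the fence goes, not the performance benefit on write-combined buffers.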
* i965: Fix intel_miptree_is_fast_clear_capable() | Chad Versace | 2015-10-05 | 1 | -5/+19
    There are three types of fast clears:
      a. fast depth clears
      b. fast singlesample color clears
      c. fast multisample color clears

    Function intel_miptree_is_fast_clear_capable() checks if a miptree supports fast clears of type (b). Rename the function to disambiguate what it does:
      old: intel_miptree_is_fast_clear_capable
      new: intel_miptree_supports_non_msrt_fast_clear

    The function accidentally rejected multisampled color surfaces because it thought they were singlesample array surfaces. Fix that by explicitly rejecting surfaces with samples > 1.

    This fix would have been needed before we enabled layered fast singlesample color clears (introduced in gen8), which we want to do eventually. For now, though, this patch changes no behavior; it just fixes how the driver chooses its behavior.

    Reviewed-by: Anuj Phogat <[email protected]>
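The fix described above amounts to checking the sample count explicitly instead of relying on a later check to reject the surface for the wrong reason. A rough sketch, assuming a simplified surface description (`miptree_desc` and its fields are hypothetical, and the layered-surface rejection is an assumption based on the message saying layered clears are not enabled yet):

```cpp
// Hypothetical, simplified stand-in for a miptree description.
struct miptree_desc {
    unsigned num_samples;   // 0 or 1 means singlesample
    unsigned array_layers;  // 1 means not layered
};

// Sketch of a type (b) check: fast singlesample color clears only.
bool supports_non_msrt_fast_clear(const miptree_desc &mt)
{
    // Reject multisampled surfaces explicitly, for the right reason.
    if (mt.num_samples > 1)
        return false;

    // Assumption: layered singlesample fast clears are not enabled yet.
    if (mt.array_layers > 1)
        return false;

    return true;
}
```

Both before and after such a change a multisampled surface is rejected, which matches the commit's statement that no behavior changes; only the reason for the rejection differs.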
* i965/mt: Declare some functions as static | Chad Versace | 2015-10-05 | 2 | -7/+3
    intel_tiling_supports_non_msrt_mcs() and intel_miptree_is_fast_clear_capable() are not used outside of intel_mipmap_tree.c.

    Reviewed-by: Anuj Phogat <[email protected]>
* i965: Make vec4_visitor's destructor virtual | Iago Toral Quiroga | 2015-10-05 | 1 | -1/+1
    We need a virtual destructor when at least one of the class' methods is virtual. Failure to do so might lead to undefined behavior when destructing derived classes. Fixes the following warning:

        brw_vec4_gs_visitor.cpp: In function 'const unsigned int* brw::brw_gs_emit(brw_context*, gl_shader_program*, brw_gs_compile*, void*, unsigned int*)':
        brw_vec4_gs_visitor.cpp:703:11: warning: deleting object of polymorphic class type 'brw::vec4_gs_visitor' which has non-virtual destructor might cause undefined behaviour [-Wdelete-non-virtual-dtor]
            delete gs;

    Curro: This shouldn't be causing any actual bugs at the moment because gen6_gs_visitor is the only subclass of vec4_visitor destroyed through a pointer of a base class (vec4_gs_visitor *) and its destructor is basically the same as its parent's. Anyway it seems sensible to change this so it doesn't bite us in the future.

    Reviewed-by: Francisco Jerez <[email protected]>
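The rule behind the entry above is that deleting a derived object through a base-class pointer only runs the derived destructor if the base destructor is virtual. A minimal sketch with hypothetical stand-in classes (`base_visitor` and `derived_visitor` are not the Mesa types):

```cpp
// Counts how many times the derived destructor has run.
static int derived_dtor_calls = 0;

// Hypothetical base class with at least one virtual method.
struct base_visitor {
    // Virtual, so `delete` through a base pointer dispatches to the derived destructor.
    virtual ~base_visitor() {}
    virtual void run() {}
};

struct derived_visitor : base_visitor {
    ~derived_visitor() override { ++derived_dtor_calls; }
    void run() override {}
};

// Destroys a derived object through a pointer to its base class and
// returns how many derived destructor calls that triggered.
int destroy_through_base()
{
    int before = derived_dtor_calls;
    base_visitor *v = new derived_visitor();
    delete v;
    return derived_dtor_calls - before;
}
```

Without `virtual` on `~base_visitor()`, the `delete v` above would be undefined behavior, which is what `-Wdelete-non-virtual-dtor` warns about.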
* glsl: set glsl error if binding qualifier used on global scope | Tapani Pälli | 2015-10-05 | 1 | -0/+11
    Fixes following Piglit test:
      global-scope-binding-qualifier.frag

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Assert on the number of combined UBO and SSBO binding table entries | Iago Toral Quiroga | 2015-10-05 | 2 | -0/+4
    In theory we can't break this assertion since the compiler frontend checks that we don't exceed any of the individual limits, but it does not hurt to be extra safe.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Reserve binding table space for SSBO surfaces | Iago Toral Quiroga | 2015-10-05 | 1 | -0/+1
    These share the space with UBO surfaces but we need to make sure we allocate enough space for both sets (12 of each)

    Reviewed-by: Kenneth Graunke <[email protected]>
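The UBO/SSBO entries in this part of the log describe named limits, a shared binding-table region sized for both sets, and an assertion on the combined count. A small sketch of how those pieces fit together, assuming 12 entries of each kind as stated above (the constant and function names are hypothetical, not the driver's identifiers):

```cpp
#include <cassert>

// Hypothetical named limits replacing hard-coded values; 12 of each per the log.
constexpr unsigned MAX_UBO = 12;
constexpr unsigned MAX_SSBO = 12;

// UBO and SSBO surfaces share one region, so reserve room for both sets.
constexpr unsigned MAX_COMBINED_UBO_SSBO = MAX_UBO + MAX_SSBO;

// Returns the number of binding table entries used by a shader's
// uniform and shader-storage blocks.
unsigned combined_block_entries(unsigned num_ubos, unsigned num_ssbos)
{
    // The frontend is expected to enforce the individual limits already;
    // this assertion is an extra safety net on the combined count.
    assert(num_ubos + num_ssbos <= MAX_COMBINED_UBO_SSBO);
    return num_ubos + num_ssbos;
}
```

Deriving the combined size from the two named constants means that changing either limit automatically resizes the reserved region.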
* i965: Define BRW_MAX_SSBO | Iago Toral Quiroga | 2015-10-05 | 2 | -7/+10
    Instead of using hard-coded values.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Define BRW_MAX_UBO | Iago Toral Quiroga | 2015-10-05 | 2 | -3/+6
    Instead of using hard-coded values.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Remove more dead visitor/vertex program code. | Matt Turner | 2015-10-04 | 3 | -23/+0
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't print line numbers with INTEL_DEBUG=optimizer. | Matt Turner | 2015-10-04 | 1 | -2/+4
    The thing you want to do with the output files is diff them, which is made more difficult by line numbers changing.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* nv30: always go through translate module on big-endian | Ilia Mirkin | 2015-10-04 | 1 | -0/+4
    It seems like things are either coming in slightly wrong, or perhaps uploaded incorrectly, but either way passing them through the translate module seems to fix everything. Eventually we should figure out what's going wrong and fix it "for real", but this should do for now.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: [email protected]
* nv30: pretend to have packed texture/surface formats | Ilia Mirkin | 2015-10-04 | 1 | -12/+12
    This puts us in line with what the DDX/DRI2 st are expecting. It also happens to work... no idea why, but seems better to have it work than to ask lots of questions.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: [email protected]
* st/dri: Use packed RGB formats | Michel Dänzer | 2015-10-04 | 2 | -17/+17
    Fixes Gallium based DRI drivers failing to load on big endian hosts because they can't find any matching fbconfigs.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789
    Signed-off-by: Michel Dänzer <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Tested-by: Ilia Mirkin <[email protected]>
* glsl: reduce memory footprint of uniform_storage struct | Timothy Arceri | 2015-10-05 | 11 | -51/+43
    The uniform will only be of a single type so store the data for opaque types in a single array.

    Cc: Francisco Jerez <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
* i965: Remove shader_prog from vec4_gs_visitor. | Kenneth Graunke | 2015-10-04 | 3 | -9/+9
    Unfortunately it has to stay in gen6_gs_visitor.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Use nir->has_transform_feedback_varyings to avoid shader_prog. | Kenneth Graunke | 2015-10-04 | 1 | -1/+1
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: Add a nir_shader_info::has_transform_feedback_varyings flag. | Kenneth Graunke | 2015-10-04 | 2 | -0/+5
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics. | Kenneth Graunke | 2015-10-04 | 5 | -43/+86
    Geometry and tessellation shaders process multiple vertices; their inputs are arrays indexed by the vertex number. While GLSL makes this look like a normal array, it can be very different behind the scenes.

    On Intel hardware, all inputs for a particular vertex are stored together - as if they were grouped into a single struct. This means that consecutive elements of these top-level arrays are not contiguous. In fact, they may sometimes be in completely disjoint memory segments.

    NIR's existing load_input intrinsics are awkward for this case, as they distill everything down to a single offset. We'd much rather keep the vertex ID separate, but build up an offset as normal beyond that.

    This patch introduces new nir_intrinsic_load_per_vertex_input intrinsics to handle this case. They work like ordinary load_input intrinsics, but have an extra source (src[0]) which represents the outermost array index.

    v2: Rebase on earlier refactors.
    v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
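The addressing scheme described above keeps the vertex index apart from the offset within that vertex's data. A toy model of the idea, assuming each vertex's inputs live in their own block that need not be adjacent to any other (the types and the function below are hypothetical illustrations, not NIR or driver code):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: one separately allocated block of inputs per vertex,
// so consecutive vertices are not contiguous in memory.
struct per_vertex_inputs {
    std::vector<std::vector<float>> blocks;  // blocks[vertex][offset]
};

// Analogue of a per-vertex load: the vertex index is an extra, separate
// source, and `offset` is built up as usual within that vertex's block.
float load_per_vertex_input(const per_vertex_inputs &in,
                            std::size_t vertex, std::size_t offset)
{
    return in.blocks.at(vertex).at(offset);
}
```

A single flattened offset could not express this layout, because there is no fixed stride that leads from one vertex's block to the next.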
* nir/lower_io: Make get_io_offset() return a nir_ssa_def * for indirects. | Kenneth Graunke | 2015-10-04 | 1 | -42/+20
    get_io_offset() already walks the dereference chain and discovers whether or not we have an indirect; we can just return that rather than computing it a second time via deref_has_indirect(). This means moving the call a bit earlier.

    By returning a nir_ssa_def *, we can pass back both an existence flag (via NULL checking the pointer) and the value in one parameter. It also simplifies the code somewhat.

    nir_lower_samplers works in a similar fashion.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
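The pattern described above, where one returned pointer serves as both an existence flag and a value, can be sketched as follows. The types are simplified, hypothetical stand-ins (`ssa_def`, `deref_step`, `get_offset`), and keeping only the last indirect found is a simplification rather than what the real pass does:

```cpp
#include <vector>

// Hypothetical stand-in for an SSA value.
struct ssa_def {
    int id;
};

// One step of a simplified dereference chain.
struct deref_step {
    unsigned const_offset;  // constant contribution of this step
    ssa_def *indirect;      // non-null when this step is indexed indirectly
};

// Walks the chain once. Accumulates the constant offset into *const_offset
// and returns the indirect value, or nullptr when there is none.
ssa_def *get_offset(const std::vector<deref_step> &chain, unsigned *const_offset)
{
    ssa_def *indirect = nullptr;
    *const_offset = 0;
    for (const deref_step &step : chain) {
        *const_offset += step.const_offset;
        if (step.indirect)
            indirect = step.indirect;  // simplification: keep the last one found
    }
    return indirect;
}
```

A caller tests the returned pointer against `nullptr` instead of walking the chain a second time to ask whether an indirect exists.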
* glsl: fix whitespace | Timothy Arceri | 2015-10-04 | 1 | -1/+1
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* radeonsi: enable PIPE_CAP_FORCE_PERSAMPLE_INTERP | Marek Olšák | 2015-10-03 | 1 | -1/+1
    Now st/mesa won't generate 2 variants for this state.

    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: do force_persample_interp in shaders for non-trivial cases | Marek Olšák | 2015-10-03 | 3 | -19/+117
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement the simple case of force_persample_interp | Marek Olšák | 2015-10-03 | 4 | -1/+37
    Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move SPI_PS_INPUT_ENA/ADDR registers to a separate state | Marek Olšák | 2015-10-03 | 4 | -14/+29
    This will be a derived state used for changing center->sample and centroid->sample at runtime.

    Reviewed-by: Michel Dänzer <[email protected]>
* tgsi/scan: add interpolation info into tgsi_shader_info | Marek Olšák | 2015-10-03 | 2 | -3/+101
    Reviewed-by: Michel Dänzer <[email protected]>
* st/mesa: automatically set per-sample interpolation if using SampleID/Pos | Marek Olšák | 2015-10-03 | 2 | -2/+10
    Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: set force_persample_interp if ARB_sample_shading is used | Marek Olšák | 2015-10-03 | 4 | -0/+12
    This is only a half of the work. The next patch will handle gl_SampleID/SamplePos, which is the other half of ARB_sample_shading.

    Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: add per-sample interpolation control into rasterizer state | Marek Olšák | 2015-10-03 | 16 | -0/+24
    Required by ARB_sample_shading for drivers that don't want a shader variant in st/mesa.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Acked-by: Roland Scheidegger <[email protected]>
* st/mesa: add ST_DEBUG=precompile support for tessellation shaders | Marek Olšák | 2015-10-03 | 1 | -0/+20
    Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: remove Driver.BindImageTexture | Marek Olšák | 2015-10-03 | 2 | -15/+0
    Nothing sets it.

    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: remove Driver.DeleteSamplerObject | Marek Olšák | 2015-10-03 | 2 | -20/+10
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: remove Driver.EndCallList | Marek Olšák | 2015-10-03 | 6 | -22/+2
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
* mesa: remove Driver.BeginCallList | Marek Olšák | 2015-10-03 | 6 | -12/+2
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
* mesa: remove Driver.EndList | Marek Olšák | 2015-10-03 | 6 | -11/+2
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
* mesa: remove Driver.NewList | Marek Olšák | 2015-10-03 | 6 | -11/+2
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
* mesa: remove Driver.NotifySaveBegin | Marek Olšák | 2015-10-03 | 7 | -16/+3
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
* mesa: remove Driver.SaveFlushVertices | Marek Olšák | 2015-10-03 | 7 | -10/+5
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
* mesa: remove Driver.FlushVertices | Marek Olšák | 2015-10-03 | 7 | -16/+14
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
* mesa: remove Driver.BeginVertices | Marek Olšák | 2015-10-03 | 3 | -8/+2
    Nothing overrides it.

    Reviewed-by: Brian Paul <[email protected]>
* mesa: remove Driver.BindArrayObject | Marek Olšák | 2015-10-03 | 3 | -18/+0
    Nothing sets it.

    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>