path: root/src
Commit message | Author | Age | Files | Lines
* gallium/util: Don't implement u_bit_scan64 on MSVC. | Jose Fonseca | 2015-02-04 | 1 | -0/+2
| ffsll doesn't exist in MSVC yet, and u_bit_scan64 is only used by radeonsi, which is never built with MSVC.
|
| This is just a stop-gap fix to unbreak the MSVC build until we refactor these mathematical portability wrappers into src/util.
|
| Trivial.
* gallium/util: Define ffsll on MinGW. | Jose Fonseca | 2015-02-04 | 1 | -0/+1
| Trivial.
|
| (Fixing MSVC will be far less so, as _BitScanForward64 is only supported on x64.)
* radeonsi: implement polygon stippling | Marek Olšák | 2015-02-04 | 7 | -5/+79
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add polygon stipple texture slot | Marek Olšák | 2015-02-04 | 1 | -5/+8
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: deduce rasterizer primitive type at the beginning of draw_vbo | Marek Olšák | 2015-02-04 | 2 | -13/+17
| I will need this for polygon stippling.
|
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: allow 64 descriptors per array | Marek Olšák | 2015-02-04 | 2 | -34/+34
| We need a slot for the stipple texture and the pixel shader already uses 32 textures (16 API slots + 16 FMASK slots).
|
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add support for sampler views where resource = NULL | Marek Olšák | 2015-02-04 | 2 | -6/+22
| The hardware obeys swizzles even if the resource is NULL. This will be used by set_polygon_stipple.
|
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add support for NULL texture sampler views that return (0,0,0,1) | Marek Olšák | 2015-02-04 | 1 | -2/+28
| This used to hang.
|
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix a crash when binding a NULL sampler view list | Marek Olšák | 2015-02-04 | 1 | -1/+1
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move the buffer descriptor to the end of the image descriptor | Marek Olšák | 2015-02-04 | 4 | -7/+9
| This will allow supporting NULL textures.
|
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't use tgsi_parse_context to get processor type | Marek Olšák | 2015-02-04 | 1 | -7/+1
| Also remove unused "tokens".
|
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix instanced arrays with non-zero start instance | Marek Olšák | 2015-02-04 | 1 | -3/+3
| Fixes piglit ARB_base_instance/arb_base_instance-drawarrays.
|
| Cc: 10.3 10.4 <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: don't append to streamout buffers that haven't been used yet | Marek Olšák | 2015-02-04 | 2 | -1/+4
| The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it. Instead, use offset = 0, which is what we always do when not appending.
|
| This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*. Yes, the test does use transform feedback.
|
| Cc: 10.3 10.4 <[email protected]>
| Reviewed-by: Glenn Kennard <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* gallium: set PIPE_MAX_SAMPLERS to 18 | Marek Olšák | 2015-02-04 | 1 | -1/+1
| This keeps drivers that use higher slots from crashing in tgsi_shader_info.
|
| Reviewed-by: Glenn Kennard <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* gallium/u_pstipple: add ability to specify a fixed texture unit | Marek Olšák | 2015-02-04 | 3 | -9/+21
| E.g. r600g can use slot 17, which is outside of the API range.
|
| Reviewed-by: Glenn Kennard <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* gallium/util: add u_bit_scan64 | Marek Olšák | 2015-02-04 | 1 | -0/+7
| Same as u_bit_scan, but for uint64_t.
|
| Reviewed-by: Glenn Kennard <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
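A minimal sketch of what a 64-bit bit scan does, assuming the usual "return the index of the lowest set bit and clear it" behaviour. The names `sketch_ffsll` and `sketch_bit_scan64` are hypothetical and are not Mesa's code; the loop-based `sketch_ffsll` stands in for `ffsll`, which the later commits above note is missing on MSVC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable stand-in for ffsll(): returns the 1-based index
 * of the lowest set bit, or 0 if no bit is set. */
static int sketch_ffsll(uint64_t v)
{
    int i;

    if (v == 0)
        return 0;
    for (i = 1; (v & 1) == 0; i++)
        v >>= 1;
    return i;
}

/* Return the 0-based index of the lowest set bit in *mask and clear
 * that bit. The mask must not be zero. */
static int sketch_bit_scan64(uint64_t *mask)
{
    int i;

    assert(*mask != 0);
    i = sketch_ffsll(*mask) - 1;
    *mask &= ~(UINT64_C(1) << i);
    return i;
}
```

Calling `sketch_bit_scan64` repeatedly on a mask of `0x5` returns 0, then 2, and leaves the mask at zero, which is the usual way such a helper drives a `while (mask)` loop over set bits.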
* tgsi: add tgsi_get_processor_type helper from radeon | Marek Olšák | 2015-02-04 | 3 | -11/+14
| Reviewed-by: Glenn Kennard <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* i965/fs: Fix saturate on MAD and LRP with the NIR backend. | Kenneth Graunke | 2015-02-04 | 1 | -2/+4
| Fixes misrendering in "Witcher 2" with INTEL_USE_NIR=1, and probably many other programs.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* mesa: Fix _mesa_format_convert fallback path when src is not an array format | Iago Toral Quiroga | 2015-02-04 | 1 | -2/+2
| When a rebase swizzle is provided and we call _mesa_swizzle_and_convert after unpacking the source format, we were always passing normalized=false. We should pass true or false depending on the formats involved in the conversion for the byte and float paths (the integer path cannot ever be normalized).
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Tested-by: Mark Janes <[email protected]>
* st/osmesa: Fix osbuffer->textures indexing | Park, Jeongmin | 2015-02-03 | 1 | -1/+1
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930
| Cc: 10.4 <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* i965/nir: use redundant phi optimization | Connor Abbott | 2015-02-03 | 1 | -0/+2
| Reviewed-by: Jason Ekstrand <[email protected]>
| Tested-by: Jason Ekstrand <[email protected]>
| Signed-off-by: Connor Abbott <[email protected]>
* nir: add an optimization to remove useless phi nodes | Connor Abbott | 2015-02-03 | 3 | -0/+112
| This removes phi nodes whose sources all point to the same thing.
|
| Shader-db results:
|
| total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%)
| NIR instructions in affected programs: 126564 -> 122480 (-3.23%)
| helped: 615
| HURT: 0
|
| total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%)
| FS instructions in affected programs: 24622 -> 23174 (-5.88%)
| helped: 138
| HURT: 0
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Tested-by: Jason Ekstrand <[email protected]>
| Signed-off-by: Connor Abbott <[email protected]>
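The core test behind "sources all point to the same thing" can be sketched as follows. This is an illustration only: `struct sketch_phi`, the integer SSA ids, and `sketch_phi_replacement` are hypothetical and do not reflect NIR's data structures. Skipping sources that refer back to the phi's own destination (as can happen on loop back-edges) is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified phi node: SSA values are non-negative ids. */
struct sketch_phi {
    int dest;          /* id of the value defined by this phi */
    size_t num_srcs;   /* one source per predecessor block */
    const int *srcs;   /* ids of the incoming values */
};

/* Return the id every use of the phi could be rewritten to, or -1 if
 * the sources disagree and the phi has to stay. */
static int sketch_phi_replacement(const struct sketch_phi *phi)
{
    int def = -1;
    size_t i;

    assert(phi->num_srcs > 0);
    for (i = 0; i < phi->num_srcs; i++) {
        int src = phi->srcs[i];

        if (src == phi->dest)
            continue;          /* assumption: self-references are ignored */
        if (def == -1)
            def = src;         /* first real source seen */
        else if (src != def)
            return -1;         /* two different sources: phi is needed */
    }
    return def;
}
```

When the function returns a valid id, a pass along these lines would rewrite all uses of the phi's destination to that id and delete the phi.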
* nir/validate: Ensure that phi sources are SSA-only | Jason Ekstrand | 2015-02-03 | 1 | -10/+3
| Reviewed-by: Connor Abbott <[email protected]>
* nir/validate: Validate that only float ALU outputs are saturated | Jason Ekstrand | 2015-02-03 | 1 | -0/+8
| Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_source_mods: Don't lower saturate for non-float outputs | Jason Ekstrand | 2015-02-03 | 1 | -0/+4
| Reviewed-by: Connor Abbott <[email protected]>
* i965/fs_nir: Get rid of get_alu_src | Jason Ekstrand | 2015-02-03 | 2 | -59/+75
| Originally, get_alu_src was supposed to handle resolving swizzles and things like that. However, now that basically every instruction we have only takes scalar sources, we don't really need it anymore. The only case where it's still marginally useful is for the mov and vecN operations that are left over from SSA form. We can handle those cases as a special case easily enough. As a side-effect, we don't need the vec_to_movs pass anymore.
|
| v2 Jason Ekstrand <[email protected]>:
|  - Rework the way we detect if we need an extra copy for swizzling. The old code involved a pile of confusing switch fall-throughs; we now use a loop.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use NIR's scalarizing abilities and stop handling vectors | Jason Ekstrand | 2015-02-03 | 2 | -349/+161
| Now that we can scalarize with NIR, there's no need for all this code anymore. Let's get rid of it and just do scalar operations.
|
| v2: run copy prop before lowering phi nodes
| v3: Get rid of the "emit(...)->saturate = foo" pattern
| v4: Run alu_to_scalar as an optimization pass
|
| total instructions in shared programs: 5998321 -> 5974070 (-0.40%)
| instructions in affected programs: 732075 -> 707824 (-3.31%)
| helped: 3137
| HURT: 191
| GAINED: 18
| LOST: 0
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add a pass to lower vector phi nodes to scalar phi nodes | Jason Ekstrand | 2015-02-03 | 3 | -0/+293
| v2 Jason Ekstrand <[email protected]>:
|  - Add better comments
|  - Use nir_ssa_dest_init and nir_src_for_ssa more places
|  - Fix some void * casts
|
| v3 Jason Ekstrand <[email protected]>:
|  - Rework the way we determine whether or not to scalarize a phi node to make the recursion non-bogus
|  - Treat load_const instructions as scalarizable
|
| v4 Jason Ekstrand <[email protected]>:
|  - Allow uniform and input loads to be scalarizable
|
| v5 Jason Ekstrand <[email protected]>:
|  - Also consider loads of inputs (varying, uniform, or ubo) to be scalarizable. We were already doing this for load_var on uniforms and inputs.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for constant propagating into sources with modifiers. | Matt Turner | 2015-02-03 | 1 | -6/+12
| All but 16 of the programs helped were ARB fp programs.
|
| total instructions in shared programs: 5949286 -> 5945470 (-0.06%)
| instructions in affected programs: 275162 -> 271346 (-1.39%)
| helped: 1197
| GAINED: 1
|
| Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Use abs/negate functions in const propagation. | Matt Turner | 2015-02-03 | 1 | -13/+5
| No changes in shader-db.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add function to take the abs of immediates. | Matt Turner | 2015-02-03 | 2 | -0/+40
| Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add function to negate immediates. | Matt Turner | 2015-02-03 | 2 | -0/+40
| Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Mark UB/B immediates as unreachable. | Matt Turner | 2015-02-03 | 1 | -4/+1
| Reviewed-by: Jason Ekstrand <[email protected]>
* gallium/util: Don't use __builtin_clrsb in util_last_bit(). | Matt Turner | 2015-02-03 | 1 | -4/+0
| Unclear circumstances lead to undefined symbols on x86.
|
| Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916
| Cc: [email protected]
| Reviewed-by: Ilia Mirkin <[email protected]>
* glsl/list: Note that exec_lists may not be realloc'd. | Matt Turner | 2015-02-03 | 1 | -0/+4
| Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: mark constant array of swizzles as static const | Nils Wallménius | 2015-02-04 | 1 | -1/+1
| This saves about 0.5k in the text section for a gallium driver on amd64.
|
| Reviewed-by: Chris Forbes <[email protected]>
* mesa: Returns a GL_INVALID_VALUE error on several APIs when buffer size is negative | Eduardo Lima Mitev | 2015-02-03 | 3 | -12/+36
| Section 2.3.1 (Errors) of the OpenGL 4.5 spec says:
|
|   "If a negative number is provided where an argument of type sizei or sizeiptr is specified, an INVALID_VALUE error is generated."
|
| This patch adds checks for negative buffer size values passed to different APIs. It also moves up the check on other APIs that already had it, making it the first error check performed in the function, for consistency.
|
| While there may be other APIs throughout the code lacking this check (or at least not at the beginning of the function), this patch focuses on the cases that break the dEQP tests listed below. It could be a good exercise for the future to check all other cases, and improve consistency in the order of the checks throughout the whole Mesa code base.
|
| This fixes 5 dEQP tests:
| * dEQP-GLES3.functional.negative_api.state.get_attached_shaders
| * dEQP-GLES3.functional.negative_api.state.get_shader_source
| * dEQP-GLES3.functional.negative_api.state.get_active_uniform
| * dEQP-GLES3.functional.negative_api.state.get_active_attrib
| * dEQP-GLES3.functional.negative_api.shader.program_binary
|
| Reviewed-by: Ian Romanick <[email protected]>
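The ordering described in this message, with the negative-size check performed before any other validation, might look roughly like the sketch below. Everything here is hypothetical: `sketch_get_source`, its parameters, and the second check are invented for illustration and are not Mesa entry points. Only the numeric values 0x0501 and 0x0502 match the real `GL_INVALID_VALUE` and `GL_INVALID_OPERATION` enums.

```c
/* Error codes for the sketch; the numeric values mirror GL_NO_ERROR,
 * GL_INVALID_VALUE and GL_INVALID_OPERATION. */
enum {
    SKETCH_NO_ERROR          = 0,
    SKETCH_INVALID_VALUE     = 0x0501,
    SKETCH_INVALID_OPERATION = 0x0502
};

/* Hypothetical query taking a buffer size. The size check comes first,
 * so a negative size is reported even when other arguments are bad too. */
static int sketch_get_source(int object_is_valid, int buf_size)
{
    if (buf_size < 0)
        return SKETCH_INVALID_VALUE;      /* checked before anything else */

    if (!object_is_valid)
        return SKETCH_INVALID_OPERATION;  /* placeholder for later checks */

    return SKETCH_NO_ERROR;
}
```

With this ordering, a call that passes both an invalid object and a negative size reports the size error, which is the consistency the message asks for.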
* mesa: fix error value in GetFramebufferAttachmentParameteriv for OpenGL ES 3.0 | Samuel Iglesias Gonsalvez | 2015-02-03 | 1 | -1/+1
| Section 6.1.13 "Framebuffer Object Queries" of OpenGL ES 3.0 spec:
|
|   "If the default framebuffer is bound to target, then attachment must be BACK, identifying the color buffer; DEPTH, identifying the depth buffer; or STENCIL, identifying the stencil buffer."
|
| OpenGL ES 3.0, section 2.5 (GL Errors):
|
|   "If a command that requires an enumerated value is passed a symbolic constant that is not one of those specified as allowable for that command, an INVALID_ENUM error is generated."
|
| Then change the returned error to INVALID_ENUM.
|
| Fixes:
|
| dEQP-GLES3.functional.fbo.api.attachment_query_default_fbo
|
| Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: Improve precision of mod(x,y) | Iago Toral Quiroga | 2015-02-03 | 8 | -34/+47
| Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a downside though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965:
|
| Operation                          Precision error
| -----------------------------------------------------
| mod(-1.951171875, 1.9980468750)    0.0000000447
| mod(121.57, 13.29)                 0.0000023842
| mod(3769.12, 321.99)               0.0000762939
| mod(3769.12, 1321.99)              0.0001220703
| mod(-987654.125, 123456.984375)    0.0160663128
| mod( 987654.125, 123456.984375)    0.0312500000
|
| This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages:
|
|   mod(x,y) = x - y * floor(x/y)
|
| This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation this new implementation would come at no additional cost.
|
| v2 (Ian Romanick)
|  - Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code.
|
| Fixes the following 16 dEQP tests:
| dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_*
| dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*
|
| Reviewed-by: Ian Romanick <[email protected]>
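The two lowerings named in this message can be compared on the CPU with a small sketch. The function names are hypothetical, and the hand-written `sketch_floorf` (valid only for values that fit in a `long`) is used purely to keep the example free of a math-library dependency. Results on a host CPU will not necessarily reproduce the i965 error figures quoted above.

```c
#include <assert.h>

/* Hypothetical floor for floats that fit in a long. */
static float sketch_floorf(float v)
{
    float t = (float)(long)v;      /* truncates toward zero */

    return (t > v) ? t - 1.0f : t; /* step down for negative fractions */
}

/* Old lowering (MOD_TO_FRACT): y * fract(x/y), fract(q) = q - floor(q).
 * The rounding error in fract() is scaled up by y. */
static float sketch_mod_fract(float x, float y)
{
    float q;

    assert(y != 0.0f);
    q = x / y;
    return y * (q - sketch_floorf(q));
}

/* New lowering (MOD_TO_FLOOR): x - y * floor(x/y). */
static float sketch_mod_floor(float x, float y)
{
    assert(y != 0.0f);
    return x - y * sketch_floorf(x / y);
}
```

Both forms agree on exactly representable cases such as `mod(7, 2)` and `mod(-1, 4)`; they differ in how rounding error in `x/y` propagates once `y` is large.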
* mesa: Allow querying for GL_PRIMITIVE_RESTART_FIXED_INDEX under GLES 3 | Eduardo Lima Mitev | 2015-02-03 | 1 | -0/+1
| GLES 3.0.0 spec introduces context state PRIMITIVE_RESTART_FIXED_INDEX (2.8.1 Transferring Array Elements, page 26) which is not currently possible to query using glGet*() funcs.
|
| Fixes 4 dEQP tests:
| * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getboolean
| * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger
| * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger64
| * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getfloat
|
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: can't have 'const' qualifier used with struct or interface block members | Iago Toral Quiroga | 2015-02-03 | 1 | -0/+7
| Fixes the following 2 dEQP tests:
| dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_vertex
| dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_fragment
|
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: interface blocks must be declared at global scope | Iago Toral Quiroga | 2015-02-03 | 1 | -0/+8
| Fixes the following 2 dEQP tests:
| dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_vertex
| dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_fragment
|
| Reviewed-by: Ian Romanick <[email protected]>
* i965: Fix negate with unsigned integers | Iago Toral Quiroga | 2015-02-03 | 2 | -8/+10
| For code such as:
|
|   uint tmp1 = uint(in0);
|   uint tmp2 = -tmp1;
|   float out0 = float(tmp2);
|
| We produce code like:
|
|   mov(8) g5<1>.xF -g9<4,4,1>.xUD
|
| which does not produce correct results. This code produces the results we would expect if tmp1 and tmp2 were signed integers instead.
|
| It seems that a similar problem was detected and addressed when using negations with unsigned integers as part of conditionals, but it looks like the problem has a wider impact than that.
|
| This patch fixes the problem by preventing copy-propagation of negated UD registers in all scenarios, not only in conditionals.
|
| Fixes the following 24 dEQP tests:
|
| dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uint_*
| dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec2_*
| dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec3_*
| dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec4_*
|
| Reviewed-by: Anuj Phogat <[email protected]>
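The difference between the expected and the observed result can be reproduced in plain C, where unsigned negation also wraps modulo 2^32. This is only a CPU-side illustration with hypothetical function names; the second function models the "as if signed" behaviour the message describes and is not a description of what the hardware does internally.

```c
#include <stdint.h>

/* Expected semantics: negate as an unsigned value (wraps modulo 2^32),
 * then convert that unsigned result to float. */
static float sketch_negate_uint_then_float(uint32_t v)
{
    uint32_t neg = 0u - v;

    return (float)neg;
}

/* Model of the incorrect result: the value is treated as a signed
 * integer, so the negation yields a negative float. Only meaningful
 * here for inputs that fit in int32_t. */
static float sketch_negate_as_signed(uint32_t v)
{
    return -(float)(int32_t)v;
}
```

For an input of 1, the first function yields 4294967296.0 (4294967295 rounded to the nearest float) while the second yields -1.0, which is the kind of mismatch the listed dEQP tests detect.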
* st/mesa: add EXT_polygon_offset_clamp support | Ilia Mirkin | 2015-02-02 | 2 | -0/+2
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Glenn Kennard <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add a cap to determine whether the driver supports offset_clamp | Ilia Mirkin | 2015-02-02 | 15 | -1/+20
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Glenn Kennard <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* i965/gen6+: enable EXT_polygon_offset_clamp | Ilia Mirkin | 2015-02-02 | 4 | -3/+4
| Replace the hard-coded 0's with the context clamp value.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: add support for GL_EXT_polygon_offset_clamp | Ilia Mirkin | 2015-02-02 | 12 | -16/+73
| Nothing enables the extension yet, but the values are now available.
|
| The spec calls for it to only be exposed for GL 3.3+, which is core-only in mesa. Instead we allow any driver to enable it, including in a compat context for any GL version.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Glenn Kennard <[email protected]>
* glapi: add GL_EXT_polygon_offset_clamp | Ilia Mirkin | 2015-02-02 | 4 | -1/+24
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Glenn Kennard <[email protected]>
* glsl: Pick ast_conditional branch regardless of op1/2 being constant. | Kenneth Graunke | 2015-02-02 | 1 | -4/+2
| If the ?: operator's condition is a constant value, and both branches were pure expressions, we can just make the resulting value one or the other.
|
| Previously, we only did this if op[1] and op[2] were also constant values - but there's no actual reason for that restriction.
|
| No changes in shader-db, probably because we usually optimize this later anyway. But it does make us generate less stupid code up front.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
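The relaxed rule can be sketched with a toy expression node. The struct and function below are hypothetical and much simpler than the GLSL compiler's AST-to-HIR code; they only show that the decision depends on the condition being constant and the branches being pure, not on the branches being constant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified expression node. */
struct sketch_expr {
    bool is_constant;  /* value is known at compile time */
    bool is_pure;      /* evaluating it emits no side-effecting code */
    int value;         /* only meaningful when is_constant is true */
};

/* Return the branch that a constant-condition ?: collapses to, or NULL
 * when a real conditional still has to be emitted. */
static const struct sketch_expr *
sketch_fold_conditional(const struct sketch_expr *cond,
                        const struct sketch_expr *then_expr,
                        const struct sketch_expr *else_expr)
{
    if (cond->is_constant && then_expr->is_pure && else_expr->is_pure)
        return cond->value ? then_expr : else_expr;

    return NULL;
}
```

Under the earlier restriction described in the message, the condition in this sketch would also have required `then_expr->is_constant` and `else_expr->is_constant`.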
* i965: Add a better PRM citation for the IMS dimension mangling. | Kenneth Graunke | 2015-02-02 | 1 | -1/+22
| Paul originally had to reverse engineer these formulas based on the description about how the sampler works. The description here is not the easiest to follow - especially given that it's from the Sandybridge era, when the hardware only did 4x multisampling.
|
| Jordan and I recently found another part of the documentation where they simply state that IMS dimensions must be adjusted by a set of formulas. Quoting this section provides an easy-to-follow explanation for the code, including 2x/4x/8x/16x.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Chad Versace <[email protected]>