path: root/src
Commit message | Author | Age | Files | Lines
* i965/fs: do SEL optimization only when src type for MOV matchesTapani Pälli2014-01-081-0/+6
| | | | | | | | | Fixes a bug where the then branch operates on ivec4 while the else branch uses vec4. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72379 Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize pow(2, x) --> exp2(x).Kenneth Graunke2014-01-071-0/+11
| On Haswell, POW takes 24 cycles, while EXP2 only takes 14. Plus, using POW requires putting 2.0 in a register, while EXP2 doesn't.
|
| I believe that EXP2 will be faster than POW on basically all GPUs, so it makes sense to optimize it.
|
| Looking at the savage2 subset of shader-db:
|
| total instructions in shared programs: 113225 -> 113179 (-0.04%)
| instructions in affected programs:     2139 -> 2093 (-2.15%)
| instances of 'math pow':               795 -> 749 (-6.14%)
| instances of 'math exp':               389 -> 435 (11.8%)
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
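The rewrite described in this commit can be illustrated with a minimal sketch. The `struct expr` node and the `simplify_pow2()` helper below are hypothetical names made up for illustration; they are not Mesa's IR classes or its algebraic optimization pass.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal expression node; Mesa's real IR is different. */
enum op { OP_CONST, OP_VAR, OP_POW, OP_EXP2 };

struct expr {
    enum op op;
    float value;        /* used when op == OP_CONST */
    struct expr *lhs;   /* base for OP_POW, operand for OP_EXP2 */
    struct expr *rhs;   /* exponent for OP_POW, unused otherwise */
};

/* True if e is a constant node holding exactly v. */
static bool is_const(const struct expr *e, float v)
{
    return e != NULL && e->op == OP_CONST && e->value == v;
}

/* Rewrite pow(2, x) in place as exp2(x). Returns true if rewritten. */
bool simplify_pow2(struct expr *e)
{
    if (e == NULL || e->op != OP_POW || !is_const(e->lhs, 2.0f))
        return false;

    e->op = OP_EXP2;
    e->lhs = e->rhs;    /* the old exponent becomes the only operand */
    e->rhs = NULL;      /* the constant 2.0 is no longer referenced */
    return true;
}
```

After the rewrite the constant base is simply dropped from the tree, which matches the commit's point that EXP2 does not need 2.0 in a register.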
* glsl: Refactor is_zero/one/negative_one into an is_value() method.Kenneth Graunke2014-01-072-68/+23
| | | | | | | | | | | | | | | This patch creates a new generic is_value() method, which checks if an ir_constant has a particular value. (For vectors, it must have the single value repeated across all components.) It then rewrites the is_zero/is_one/is_negative_one methods to use this generic helper. All three were basically identical except for the value they checked for. The other difference is that is_negative_one rejects boolean types. The new is_value function maintains this behavior, only allowing boolean types when checking for 0 or 1. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
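A rough sketch of the refactoring described above, assuming a simplified constant representation: `struct constant` and its fields are hypothetical stand-ins, not Mesa's `ir_constant`. The generic helper checks every component against one value and only accepts booleans when the value is 0 or 1; the three older predicates become thin wrappers.

```c
#include <stdbool.h>

/* Hypothetical constant representation, for illustration only. */
enum base_type { TYPE_FLOAT, TYPE_INT, TYPE_BOOL };

struct constant {
    enum base_type type;
    unsigned components;                      /* 1..4 */
    union { float f; int i; bool b; } v[4];
};

/* True if every component of c equals value. */
bool is_value(const struct constant *c, int value)
{
    /* Booleans are only meaningful when checking for 0 or 1. */
    if (c->type == TYPE_BOOL && value != 0 && value != 1)
        return false;
    if (c->components == 0 || c->components > 4)
        return false;

    for (unsigned n = 0; n < c->components; n++) {
        switch (c->type) {
        case TYPE_FLOAT:
            if (c->v[n].f != (float)value) return false;
            break;
        case TYPE_INT:
            if (c->v[n].i != value) return false;
            break;
        case TYPE_BOOL:
            if (c->v[n].b != (value != 0)) return false;
            break;
        }
    }
    return true;
}

/* The three former predicates reduce to one-line wrappers. */
bool is_zero(const struct constant *c)         { return is_value(c, 0); }
bool is_one(const struct constant *c)          { return is_value(c, 1); }
bool is_negative_one(const struct constant *c) { return is_value(c, -1); }
```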
* glsl: Optimize pow(1.0, X) --> 1.0.Kenneth Graunke2014-01-071-0/+6
| | | | | | | Surprisingly, this helps one vertex shader in 3DMMES. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: Use get_local_param_pointer in glProgramLocalParameters4fvEXT().Kenneth Graunke2014-01-071-19/+11
| | | | | | | | | | | | | | | | | | | Using the get_local_param_pointer helper ensures that the LocalParams arrays have actually been allocated before attempting to use them. glProgramLocalParameters4fvEXT needs to do a bit of extra checking, but it can be simplified since the helper has already validated the target. Fixes crashes in programs that use Cg (for example, Awesomenauts, Rocketbirds: Hardboiled Chicken, and Tiny and Big: Grandpa's Leftovers) since commit e5885c119de1e508099cc1111e1c9f8ff00fab88 (mesa: Dynamically allocate the storage for program local parameters.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73136 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]> Tested-by: Laurent Carlier <[email protected]>
* llvmpipe: Basic implementation of pipe_context::set_sample_mask.José Fonseca2014-01-075-7/+20
| | | | | | | | | | | | | | | | | We don't support MSAA (ie, number of samples is always one) therefore sample_mask boils down to a synonym of the rasterizer_discard flag. Also, this change makes setup actually use the value received in lp_setup_set_rasterizer_discard instead of reaching out to llvmpipe upper layers to re-fetch it. Based on Si Chen's draft. With this patch `wgf11multisample Coverage passes 100%` on the UMD D3D10 state tracker. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Si Chen <[email protected]>
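To illustrate why the sample mask collapses into a discard flag when there is a single sample, here is a small sketch. The function name is hypothetical, and the choice to look only at bit 0 of the mask is an assumption based on the commit's statement that the number of samples is always one; it is not taken from the llvmpipe sources.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: with exactly one sample per pixel, only bit 0 of
 * the sample mask can matter. If that bit is clear, nothing can be
 * written, which has the same effect as rasterizer discard.
 */
bool effective_discard(bool rasterizer_discard, uint32_t sample_mask)
{
    return rasterizer_discard || (sample_mask & 1u) == 0;
}
```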
* cso_context: Fix cso_context::sample_mask initial value.José Fonseca2014-01-071-1/+1
| | | | | | | | | | | | The initial value of cso_context::sample_mask_saved is irrelevant as it will be overwritten with cso_context::sample_mask in cso_save_sample_mask. Therefore it is cso_context::sample_mask that needs to be properly initialized. This fixes regressions in blits and mipmap generation after adding support for sample_mask to llvmpipe. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Implement alpha_to_coverage for non-MSAA framebuffers.Si Chen2014-01-073-1/+59
| | | | | | | | Implement Alpha to Coverage by discarding a fragment when its alpha component is less than 0.5. This is joint work by Jose and Si. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
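The rule stated in this commit message is simple enough to write down directly. The sketch below is plain scalar C with a hypothetical function name; the actual change lives in llvmpipe's generated fragment code and will look nothing like this.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: with a single sample, alpha-to-coverage can only
 * produce full coverage or none. The fragment survives unless its alpha
 * is below 0.5.
 */
bool alpha_to_coverage_keeps(float alpha)
{
    return !(alpha < 0.5f);
}
```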
* swrast: fix delayed texel buffer allocation regression for OpenMPAndreas Fänger2014-01-071-0/+12
| | | | | | | | | | | Commit 9119269ca14ed42b51c7d8e2e662500311b29fa3 moved the texel buffer allocation to _swrast_texture_span(), however, when compiled with OpenMP support this code already runs multi-threaded so a critical section is required to prevent multiple allocations and rendering errors. Cc: "10.0" <[email protected]> Reviewed-by: Brian Paul <[email protected]>
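The kind of guard this commit describes might look roughly like the following. `struct span_ctx`, its fields, and `get_texel_buffer()` are hypothetical names for illustration; only the `#pragma omp critical` construct itself is standard OpenMP. Without OpenMP enabled the pragma is ignored and the code behaves as ordinary single-threaded lazy allocation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-context state holding a lazily allocated buffer. */
struct span_ctx {
    float *texel_buffer;
    int alloc_count;      /* only here to make the behavior observable */
};

/* Return the shared buffer, allocating it at most once. */
float *get_texel_buffer(struct span_ctx *ctx, size_t n)
{
    float *buf;

    /* Only one thread at a time may test and allocate the buffer. */
    #pragma omp critical (texel_buffer_alloc)
    {
        if (ctx->texel_buffer == NULL) {
            ctx->texel_buffer = malloc(n * sizeof(float));
            if (ctx->texel_buffer != NULL)
                ctx->alloc_count++;
        }
        buf = ctx->texel_buffer;
    }
    return buf;
}
```

The pointer is copied into a local inside the critical section so that the function never reads the shared field outside the protected region.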
* gallium/draw: remove double semicolonDave Airlie2014-01-071-1/+1
| | | | | | code cleanup. Signed-off-by: Dave Airlie <[email protected]>
* glsl: rename min(), max() functions to fix MSVC buildBrian Paul2014-01-063-7/+7
| | | | | | | | Evidently, there's some other definition of "min" and "max" that causes MSVC to choke on these function names. Renaming to min2() and max2() fixes things. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove unused PIPE_CONTROL defines.Kenneth Graunke2014-01-061-10/+0
| | | | | | | | | | | | | Both brw_defines.h and intel_reg.h defined PIPE_CONTROL fields, which had similar names, but couldn't be used in the same way. (One had built-in shifts, and the other didn't...) Delete the unused set to preserve sanity. (Eric wrote an almost identical patch back in August, so I believe he approves.) Signed-off-by: Kenneth Graunke <[email protected]>
* mesa: enable AMD_shader_trinary_minmaxMaxence Le Doré2014-01-062-1/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: implement mid3 built-in functionMaxence Le Doré2014-01-061-0/+38
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: implement max3 built-in functionMaxence Le Doré2014-01-061-0/+38
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Implement min3 built-in functionMaxence Le Doré2014-01-061-0/+38
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add min() and max() functions to builder.cppMaxence Le Doré2014-01-062-0/+13
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add a shader_trinary_minmax predicateMaxence Le Doré2014-01-061-0/+6
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add extension tracking for AMD_shader_trinary_minmaxMaxence Le Doré2014-01-064-0/+7
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* haiku libGL: Move from gallium target to src/hglAlexander von Gluck IV2014-01-0611-1/+10
| * The Haiku renderers need to link to libGL to function properly in all usage contexts. As mesa drivers build before gallium targets, we couldn't properly link the mesa swrast driver to the gallium libGL target for Haiku.
| * This is likely better as it mimics how glx is laid out ensuring the Haiku libGL is better understood.
| * All renderers properly link in libGL now.
|
| Acked-by: Brian Paul <[email protected]>
* haiku: Fix missing HaikuGL header pathsAlexander von Gluck IV2014-01-063-0/+3
| | | | Acked-by: Brian Paul <[email protected]>
* mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) queryBrian Paul2014-01-064-0/+47
| | | | | | | | | This is part of the GL_EXT_packed_float extension. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096 Cc: 10.0 <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Warning fixEric Anholt2014-01-061-2/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Delete unused INTEL_WRITE_{PART,FULL} and INTEL_READ #defines.Kenneth Graunke2014-01-061-4/+0
| | | | | | | These are just software flag values (not hardware specific values), and aren't used anywhere. Delete them to avoid confusion. Signed-off-by: Kenneth Graunke <[email protected]>
* radeonsi: calculate NUM_BANKS for DB correctly on CIKMarek Olšák2014-01-063-4/+27
| | | | | | NUM_BANKS is not constant on CIK. Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: set correct pipe config for Hawaii in DBMarek Olšák2014-01-063-15/+26
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: disable HTILE for 1D-tiled depth-stencil buffersMarek Olšák2014-01-061-0/+5
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* glx: check memory allocations in __glXInitVertexArrayState()Juha-Pekka Heikkila2014-01-061-4/+23
| | | | | Signed-off-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* glx: Add missing null check in __glXNewIndirectAPI()Juha-Pekka Heikkila2014-01-061-0/+2
| | | | | | | | Add extra null check in auto generated indirect_init.c via src/mapi/glapi/gen/glX_proto_send.py Signed-off-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: set size of txf_mcs payload vgrf properlyChris Forbes2014-01-041-0/+1
| | | | | | | | | | | | | | | | | | | | | Previously we left the size of this vgrf as 1, which caused register allocation to be subtly broken. If we were lucky we would explode in the post-alloc instruction scheduler; if we were unlucky we'd just stomp on someone else and get broken rendering. Fixes crash when running `tesseract` with the following settings: msaa 4 glineardepth 0 Also fixes the piglit test: arb_sample_shading-builtin-gl-sample-id Signed-off-by: Chris Forbes <[email protected]> Cc: Anuj Phogat <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72859 Reviewed-by: Kenneth Graunke <[email protected]>
* glcpp: error on multiple #else/#elif directivesErik Faye-Lund2014-01-026-1/+51
| | | | | | | | | | | | | | | | | | | | | The preprocessor currently accepts multiple else/elif-groups per if-section. The GLSL-preprocessor is defined by the C++ specification, which defines the following parse-rule: if-section: if-group elif-groups(opt) else-group(opt) endif-line This clearly only allows a single else-group, that has to come after any elif-groups. So let's modify the code to follow the specification. Add test to prevent regressions. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Carl Worth <[email protected]> Cc: 10.0 <[email protected]>
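One way to enforce the grammar rule quoted above is to remember, per nesting level, whether an else-group has already been seen. The sketch below is a hypothetical stand-alone tracker, not the glcpp parser's actual data structure.

```c
#include <stdbool.h>

enum directive { DIR_IF, DIR_ELIF, DIR_ELSE, DIR_ENDIF };

#define MAX_DEPTH 64

/* Hypothetical tracker: one flag per open if-section. */
struct if_tracker {
    bool seen_else[MAX_DEPTH];
    int depth;
};

/* Returns true if the directive is allowed at this point. */
bool track_directive(struct if_tracker *t, enum directive d)
{
    switch (d) {
    case DIR_IF:
        if (t->depth >= MAX_DEPTH)
            return false;
        t->seen_else[t->depth++] = false;
        return true;
    case DIR_ELIF:
        /* elif-groups must come before the else-group. */
        return t->depth > 0 && !t->seen_else[t->depth - 1];
    case DIR_ELSE:
        /* Only a single else-group per if-section. */
        if (t->depth == 0 || t->seen_else[t->depth - 1])
            return false;
        t->seen_else[t->depth - 1] = true;
        return true;
    case DIR_ENDIF:
        if (t->depth == 0)
            return false;
        t->depth--;
        return true;
    }
    return false;
}
```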
* glcpp: Replace multi-line comment with a space (even as part of macro definition)Carl Worth2014-01-028-9/+48
| | | | The preprocessor has always replaced multi-line comments with a single space character, (as required by the specification), but as of commit bd55ba568b301d0f764cd1ca015e84e1ae932c8b the lexer also emitted a NEWLINE token for each newline within the comment, (in order to preserve line numbers). The emitting of NEWLINE tokens within the comment broke the rule of "replace a multi-line comment with a single space" as could be exposed by code like the following: #define FOO a/* */b FOO Prior to commit bd55ba568b301d0f764cd1ca015e84e1ae932c8b, this code defined the macro FOO as "a b" as desired. Since that commit, this code instead defines FOO as "a" and leaves a stray "b" in the output. In this commit, we fix this by not emitting the NEWLINE tokens while lexing the comment, but instead merely counting them in the commented_newlines variable. Then, when the lexer next encounters a non-commented newline it switches to a NEWLINE_CATCHUP state to emit as many NEWLINE tokens as necessary (so that subsequent parsing stages still generate correct line numbers). Of course, it would have been more clear if we could have written a loop to emit all the newlines, but flex conventions prevent that, (we must use "return" for each token we emit). It similarly would have been clear to have a new rule restricted to the <NEWLINE_CATCHUP> state with an action much like the body of this if condition. The problem with that is that this rule must not consume any characters. It might be possible to write a rule that matches a single lookahead of any character, but then we would also need an additional rule to ensure for the <EOF> case where there are no additional characters available for the lookahead to match. 
Given those considerations, and given that the SKIP-state manipulation already involves a code block at the top of the lexer function, before any rules, it seems best to me to go with the implementation here which adds a similar pre-rule code block for the NEWLINE_CATCHUP. Finally, this commit also changes the expected output of a few, existing glcpp tests. The change here is that the space character resulting from the multi-line comment is now emitted before the newlines corresponding to that comment. (Previously, the newlines were emitted first, and the space character afterward.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686 Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
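The counting-then-catching-up idea can be shown outside of flex with a plain character loop. This is only a sketch of the behavior described in the commit message, under the assumption that output is a character buffer rather than a token stream; `strip_comments()` is a hypothetical function, and unlike the real lexer it silently drops newlines still pending at end of input.

```c
#include <stddef.h>

/*
 * Replace each C-style comment with one space. Newlines inside a comment
 * are only counted, then emitted right after the next newline that is
 * outside any comment, so line numbers stay in sync.
 * 'out' must have room for at least as many bytes as 'in' (including
 * the terminator). Returns the number of characters written.
 */
size_t strip_comments(const char *in, char *out)
{
    size_t o = 0;
    size_t commented_newlines = 0;
    int in_comment = 0;

    for (size_t i = 0; in[i] != '\0'; i++) {
        if (in_comment) {
            if (in[i] == '*' && in[i + 1] == '/') {
                in_comment = 0;
                i++;                    /* skip the closing '/' */
                out[o++] = ' ';         /* the comment becomes one space */
            } else if (in[i] == '\n') {
                commented_newlines++;   /* defer this newline */
            }
        } else if (in[i] == '/' && in[i + 1] == '*') {
            in_comment = 1;
            i++;                        /* skip the opening '*' */
        } else {
            out[o++] = in[i];
            if (in[i] == '\n') {
                /* Catch up on the newlines swallowed by comments. */
                while (commented_newlines > 0) {
                    out[o++] = '\n';
                    commented_newlines--;
                }
            }
        }
    }
    out[o] = '\0';
    return o;
}
```

With the macro example from the commit message as input, the space lands between `a` and `b` on the `#define` line and the deferred newline follows the real one, which is the ordering the updated glcpp tests expect.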
* glcpp: Add a more descriptive comment for the SKIP state manipulationCarl Worth2014-01-021-5/+36
| | | | | | | | | | | | | | | | | Two things make this code confusing: 1. The uncharacteristic manipulation of lexer start state outside of flex rules. 2. The confusing semantics of the skip_stack (including the "lexing_if" override and the SKIP_NO_SKIP state). This new comment is intended to bring a bit more clarity for any readers. There is no intended behavioral change to the code here. The actual code changes include better indentation to avoid an excessively-long line, and using the more descriptive INITIAL rather than 0. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Enhance intel_texsubimage_tiled_memcpy() to support all levelsCourtney Goeltzenleuchter2013-12-301-2/+5
| Support all levels of a supported texture format.
|
| Using 1024x1024, RGBA 8888 source, mipmap
| internal-format   Before (MB/sec)   mipmap (MB/sec)
| GL_RGBA           627.15            615.90
| GL_RGB            456.35            611.53
|
| 512x512
| GL_RGBA           597.00            619.95
| GL_RGB            440.62            611.28
|
| 256x256
| GL_RGBA           487.80            587.42
| GL_RGB            376.63            585.00
|
| Benchmark has been sent to mesa-dev list: teximage_enh
|
| Signed-off-by: Courtney Goeltzenleuchter <[email protected]>
| Reviewed-by: Chad Versace <[email protected]>
* i965: Add XRGB to intel_texsubimage_tiled_memcpy()Courtney Goeltzenleuchter2013-12-301-1/+2
| MESA_FORMAT_XRGB8888 is equivalent to MESA_FORMAT_ARGB8888 in terms of storage on the device, so it is okay to use this optimized copy routine.
|
| This series builds on work from Frank Henigman to optimize the process of uploading a texture to the GPU. This series adds support for MESA_XRGB_8888 and full miptrees, which were found to be common activities in the Smokin' Guns game. The issue was found while profiling the app but that part is not benchmarked. Smokin-Guns uses mipmap textures with an internal format of GL_RGB (MESA_XRGB_8888 in the driver).
|
| These changes need a performance tool to run against to show how they improve execution performance for specific texture formats. Using this benchmark I've measured the following improvement on my Ivybridge Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz.
|
| 1024x1024 texture size
| internal-format   Before (MB/sec)   XRGB (MB/sec)
| GL_RGBA           628.15            627.15
| GL_RGB            265.95            456.35
|
| 512x512 texture size
| internal-format   Before (MB/sec)   XRGB (MB/sec)
| GL_RGBA           600.23            597.00
| GL_RGB            255.50            440.62
|
| 256x256 texture size
| internal-format   Before (MB/sec)   XRGB (MB/sec)
| GL_RGBA           489.08            487.80
| GL_RGB            229.03            376.63
|
| Benchmark has been sent to mesa-dev list: teximage
|
| Signed-off-by: Courtney Goeltzenleuchter <[email protected]>
| Reviewed-by: Chad Versace <[email protected]>
* glsl: Fix gl_type of usamplerCube built-in type.Paul Berry2013-12-301-1/+1
| | | | | | | I'm not aware of any piglit tests that this fixes, but the old code was obviously wrong. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add an assertion to _mesa_program_index_to_target().Paul Berry2013-12-301-2/+3
| | | | | | | Only a Mesa bug could cause this function to be called with an out-of-range index, so raise an assertion if that ever happens. Reviewed-by: Brian Paul <[email protected]>
* mesa: Improve static error checking of arrays sized by MESA_SHADER_TYPES.Paul Berry2013-12-303-8/+16
| | | | | | | | | | | | | | | | | | | | | This patch replaces the following pattern: foo bar[MESA_SHADER_TYPES] = { ... }; With: foo bar[] = { ... }; STATIC_ASSERT(Elements(bar) == MESA_SHADER_TYPES); This way, when a new shader type is added in a future version of Mesa, we will get a compile error to remind us that the array needs to be updated. Reviewed-by: Brian Paul <[email protected]>
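The pattern from this commit can be reproduced in standalone C11 using `_Static_assert`. The enum, the `ELEMENTS` macro, and the array below are hypothetical stand-ins for `MESA_SHADER_TYPES`, Mesa's `Elements()` and `STATIC_ASSERT()` macros, which are not redefined here.

```c
#include <stddef.h>

/* Hypothetical enum; the last entry doubles as the element count. */
enum shader_type {
    SHADER_VERTEX,
    SHADER_GEOMETRY,
    SHADER_FRAGMENT,
    SHADER_TYPES
};

#define ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

/* Size is deduced from the initializer, not from SHADER_TYPES. */
static const char *const shader_names[] = {
    "vertex",
    "geometry",
    "fragment",
};

/* Fails to compile if a shader type is added without a matching name. */
_Static_assert(ELEMENTS(shader_names) == SHADER_TYPES,
               "shader_names must have one entry per shader type");

const char *shader_name(enum shader_type t)
{
    return t < SHADER_TYPES ? shader_names[t] : "unknown";
}
```

Declaring the array as `foo bar[SHADER_TYPES]` would instead zero-fill any missing trailing entries silently, which is the failure mode the commit is guarding against.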
* glsl: Remove extraneous shader_type argument from analyze_clip_usage().Paul Berry2013-12-301-4/+5
| | | | | | | | This argument was carrying the name of the shader target (as a string). We can get this just as easily by calling _mesa_shader_enum_to_string(). Reviewed-by: Brian Paul <[email protected]>
* glsl: Get rid of hardcoded arrays of shader target names.Paul Berry2013-12-302-15/+9
| | | | | | | We already have a function for converting a shader type index to a string: _mesa_shader_type_to_string(). Reviewed-by: Brian Paul <[email protected]>
* main: Remove unused function _mesa_shader_index_to_type().Paul Berry2013-12-301-15/+0
| | | | Reviewed-by: Brian Paul <[email protected]>
* Rename overloads of _mesa_glsl_shader_target_name().Paul Berry2013-12-3010-38/+38
| | | | | | | | | | | | Previously, _mesa_glsl_shader_target_name() had an overload for GLenum and an overload for the gl_shader_type enum, each of which behaved differently. However, since GLenum is a synonym for unsigned int, and unsigned ints are often used in place of gl_shader_type (e.g. in loop indices), there was a big risk of calling the wrong overload by mistake. This patch gives the two overloads different names so that it's always clear which one we mean to call. Reviewed-by: Brian Paul <[email protected]>
* i965: Remove unused depth_mode parameter from translate_tex_format().Kenneth Graunke2013-12-294-4/+0
| | | | | | | | | | | | According to git blame, this hasn't been used in over two years: commit d2235b0f4681f75d562131d655a6d7b7033d2d8b Author: Eric Anholt <[email protected]> Date: Thu Nov 17 17:01:58 2011 -0800 i965: Always handle GL_DEPTH_TEXTURE_MODE through the shader. Signed-off-by: Kenneth Graunke <[email protected]>
* i965/blorp: unit test compiling integer typed texture fetchesTopi Pohjolainen2013-12-271-0/+86
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling simple gen6 zero-src sampledTopi Pohjolainen2013-12-271-0/+51
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling gen6 msaa-8 cms alpha blendTopi Pohjolainen2013-12-271-0/+57
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling bilinear filteredTopi Pohjolainen2013-12-271-0/+49
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling simple zero-src sampledTopi Pohjolainen2013-12-271-0/+56
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling unaligned msaa-8Topi Pohjolainen2013-12-271-0/+135
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling msaa-8 cms alpha blendTopi Pohjolainen2013-12-271-0/+145
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Matt Turner <[email protected]>