path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: Avoid segfault in gen6_upload_state | Carl Worth | 2013-02-21 | 1 | -1/+1
  This fixes a bug introduced in commit 258453716f001eab1288d99765213 and triggered whenever "rb" is NULL.
  Fixes at least one cause of bug #59445: [SNB/IVB/HSW Bisected]Oglc draw-buffers2(advanced.blending.none) segfault
  https://bugs.freedesktop.org/show_bug.cgi?id=59445
  (Segfaults are still possible in that test case, but they have been present since before commit 258453716f, which is what's being fixed here.)
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Consign COORD_REPLACE VS hacks to Pre-Gen6. | Paul Berry | 2013-02-20 | 3 | -11/+34
  Pre-Gen6, the SF thread requires exact matching between VS output slots (aka VUE slots) and FS input slots, even when the corresponding VS output slot is unused due to being overwritten by point coordinate replacement (glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)). As a result, we have a special hack in the VS to ensure when any texture coordinate is subject to point coordinate replacement, it is always allocated space in the VUE, even if it isn't written to by the VS.
  This hack isn't needed from Gen6 onwards, since SF (Gen7: SBE) swizzling has the ability to insert the point coordinate into gl_TexCoord[] without needing a corresponding unused VUE slot.
  Note that no modification of SF setup code is required for this patch--get_attr_override() already does the right thing. However, we make a slight comment change to clarify why this works.
  In addition to eliminating unnecessary VS recompiles and saving precious URB space on Gen6+, this will save us the trouble of having to adjust this hack when we implement geometry shaders.
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Don't install glEvalMesh in the beginend dispatch table | Ian Romanick | 2013-02-20 | 3 | -9/+16
  NOTE: This is a candidate for the 9.1 branch.
  Signed-off-by: Ian Romanick <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59740
  Reviewed-by: Eric Anholt <[email protected]>
* gles2: a stub implementation for GL_EXT_discard_framebuffer | Tapani Pälli | 2013-02-20 | 6 | -1/+64
  This patch implements a stub for GL_EXT_discard_framebuffer with required checks listed by the extension specification. This extension is required by GLBenchmark 2.5 when compiled with OpenGL ES 2.0 as the rendering backend.
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-and-tested-by: Chad Versace <[email protected]>
* i965/fs: Enable CSE on uniform pull constant loads. | Eric Anholt | 2013-02-19 | 1 | -0/+3
  Improves on a major performance regression for the dolphin wii emulator from its move to using UBOs. Performance in the UBO codepath (as replayed through apitrace) is up 21.1% +/- 2.3% (n=26/29).
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Only do CSE when the dst types match. | Eric Anholt | 2013-02-19 | 1 | -1/+2
  We could potentially do some CSE even when the dst types aren't the same on gen6 where there is no implicit dst type conversion iirc, or in the case of uniform pull constant loads where the dst type doesn't impact what's stored. But it's not worth worrying about.
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: This is a candidate for the 9.1 branch.
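The rule in the entry above can be sketched roughly as follows. This is only an illustration under assumptions: `struct inst`, `enum reg_type`, and `cse_candidates_match()` are hypothetical stand-ins, not the i965 backend's real types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register types; the real backend has its own set. */
enum reg_type { TYPE_F, TYPE_D, TYPE_UD };

/* Hypothetical, heavily simplified instruction record. */
struct inst {
   int opcode;
   enum reg_type dst_type;
   int src[2];
};

/* Two instructions may share one result only if they compute the same
 * thing AND write it with the same destination type, because a differing
 * dst type can imply a type conversion on the way out. */
static bool
cse_candidates_match(const struct inst *a, const struct inst *b)
{
   assert(a != NULL && b != NULL);
   return a->opcode == b->opcode &&
          a->dst_type == b->dst_type &&
          a->src[0] == b->src[0] &&
          a->src[1] == b->src[1];
}
```

The dst-type comparison is the conservative choice the message describes: it may miss some opportunities (for example where no implicit conversion exists), in exchange for never merging results that differ.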
* i965/fs: Delay setup of uniform loads until after pre-regalloc scheduling. | Eric Anholt | 2013-02-19 | 3 | -27/+66
  This should fix the register allocation explosion on the GLES 3.0 test on gen6. It also gives us an instruction that will fit our CSE handling.
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: This is a candidate for the 9.1 branch.
* i965/fs: Fix copy propagation with smearing. | Eric Anholt | 2013-02-19 | 1 | -1/+2
  We were correctly relaying the smear from MOV's src, but if the MOV didn't do a smear, we don't want to smash the smear value from the instruction being propagated into. Prevents a regression in the upcoming UBO change.
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: This is a candidate for the 9.1 branch.
* i965/fs: Add a bit more instruction dumping useful for upcoming work. | Eric Anholt | 2013-02-19 | 1 | -1/+30
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove unused userclip flags. | Paul Berry | 2013-02-19 | 3 | -5/+0
  brw_vs_prog_data::userclip hasn't been used since commit f0cecd4 (i965: Move VUE map computation to once at VS compile time).
  brw_gs_prog_key::userclip_active hasn't been used since commit 9f3d321 (i965: Make the userclip flag for the VUE map come from VS prog data).
  Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: implement glBitmap unpacking from a PBO, for the cache path | Brian Paul | 2013-02-19 | 1 | -2/+11
  We weren't mapping the PBO when using the bitmap cache (but we had the PBO code for the non-cache path.)
  Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61026
  Note: This is a candidate for the stable branches.
* st/mesa: remove what is left from u_blit | Marek Olšák | 2013-02-18 | 6 | -29/+0
  Reviewed-by: Brian Paul <[email protected]>
* st/mesa: simplify and improve CopyTexSubImage | Marek Olšák | 2013-02-18 | 3 | -260/+99
  It has become a bit messy.
  Changes:
  - finally correct checking for transfer ops depending on the base format
  - making sure the base internal format and the texture format match (we were ignoring it, but it's important for correctness)
  - the way-too-strict rule that both src and dst base formats must be the same was dropped; ensuring the simpler and more permissive rule mentioned above is enough
  - stop using util_blit_pixels; pipe->blit is flexible enough, and now that we have RGBX and red-alpha formats, pipe->blit can be used for more cases
  Reviewed-by: Brian Paul <[email protected]>
* st/mesa: don't do sRGB conversion in CopyTexSubImage | Marek Olšák | 2013-02-18 | 1 | -2/+2
  Assuming I understand EXT_texture_sRGB correctly.
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Brian Paul <[email protected]>
* st/mesa: implement blit-based TexImage and TexSubImage | Marek Olšák | 2013-02-18 | 3 | -4/+239
  A temporary texture is created such that it matches the format and type combination and pixels are copied to it using memcpy. Then the blit is used to copy the temporary texture to the texture image being modified by TexImage or TexSubImage. The blit takes care of the format and type conversion and swizzling. The result is a very fast texture upload involving as little CPU as possible.
  This improves performance in apps which upload textures during rendering. An example is the Wine OpenGL backend for DirectDraw, which I used to test the game StarCraft. Profiling had shown that TexSubImage was taking 50% of CPU time without this patch, which was the main motivation for this work, and now TexSubImage only takes 14% of CPU time.
  I had to underclock my CPU to see any difference in the game and this patch does make the game a lot faster if the CPU is slow (or using the powersave cpufreq profile).
  Reviewed-by: Brian Paul <[email protected]>
* st/mesa: fix blit-based GetTexImage for 1D array textures | Marek Olšák | 2013-02-18 | 1 | -19/+52
  This is not easy to hit, because we have 3 code paths now (tried in this order):
  - memcpy-based (skips the blit) -> _mesa_tex_getimage
  - blit-based
  - slow pixel packing -> _mesa_tex_getimage
  The main difference later in the code is the parameters of _mesa_image_address3d.
  Reviewed-by: Brian Paul <[email protected]>
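The ordering of the three code paths listed above can be pictured with a small sketch. The two boolean inputs and the function name are assumptions made for illustration; the actual conditions in the state tracker are more involved.

```c
#include <assert.h>
#include <stdbool.h>

enum getteximage_path {
   PATH_MEMCPY,     /* plain copy, skips the blit */
   PATH_BLIT,       /* blit-based conversion */
   PATH_SLOW_PACK   /* slow pixel packing as the last resort */
};

/* Hypothetical selector: the paths are tried in the listed order and the
 * first one whose precondition holds wins. */
enum getteximage_path
choose_getteximage_path(bool memcpy_possible, bool blit_possible)
{
   if (memcpy_possible)
      return PATH_MEMCPY;
   if (blit_possible)
      return PATH_BLIT;
   return PATH_SLOW_PACK;
}

/* Small demonstration of the ordering. */
void
example_usage(void)
{
   assert(choose_getteximage_path(true, true) == PATH_MEMCPY);
   assert(choose_getteximage_path(false, true) == PATH_BLIT);
   assert(choose_getteximage_path(false, false) == PATH_SLOW_PACK);
}
```

Because the blit path only runs when the memcpy path is ruled out, a bug confined to it (as with 1D array textures here) shows up only for a narrow set of inputs.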
* st/mesa: fix blit-based GetTexImage for depth/stencil formats | Marek Olšák | 2013-02-18 | 1 | -1/+1
  BTW, we have 0 tests for glGetTexImage(format=GL_DEPTH*).
  Reviewed-by: Brian Paul <[email protected]>
* st/mesa: factor out code for determining blit.mask from CopyTexSubImage | Marek Olšák | 2013-02-18 | 1 | -42/+66
  I'll need this later.
  Reviewed-by: Brian Paul <[email protected]>
* i965: Fix leak in blorp CopyTexSubImage2D | Christopher James Halse Rogers | 2013-02-16 | 1 | -2/+2
  _mesa_delete_renderbuffer does not call the driver-specific renderbuffer delete function, so the blorp code was leaking the Intel-specific bits, including some GEM objects.
  Call the renderbuffer's ->Delete() method instead, which does the right thing.
  Fixes Unity rapidly sending the machine into the arms of the OOM-killer
  Note: This is a candidate for the 9.1 branch.
  Reviewed-by: Eric Anholt <[email protected]>
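The leak pattern described above can be sketched in isolation. Everything here is a hypothetical model: `struct renderbuffer`, `struct driver_renderbuffer`, and the counter standing in for GEM objects are not Mesa's real definitions, only an illustration of why the object's own `->Delete()` hook has to be used.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical base object with a driver-supplied destructor hook. */
struct renderbuffer {
   void (*Delete)(struct renderbuffer *rb);
};

/* Hypothetical driver subclass that owns an extra resource. */
struct driver_renderbuffer {
   struct renderbuffer base;   /* first member, so the cast below is valid */
   int *resource;              /* stand-in for a driver-owned buffer object */
};

static int live_resources = 0;

/* Driver destructor: releases the driver-owned part, then the object. */
static void
driver_delete(struct renderbuffer *rb)
{
   struct driver_renderbuffer *drb = (struct driver_renderbuffer *) rb;
   free(drb->resource);
   live_resources--;
   free(drb);
}

static struct renderbuffer *
driver_new(void)
{
   struct driver_renderbuffer *drb = calloc(1, sizeof(*drb));
   if (!drb)
      return NULL;
   drb->resource = malloc(sizeof(int));
   if (!drb->resource) {
      free(drb);
      return NULL;
   }
   live_resources++;
   drb->base.Delete = driver_delete;
   return &drb->base;
}

/* Going through the hook releases everything; a generic free(rb) would
 * leave drb->resource allocated. */
void
example_usage(void)
{
   struct renderbuffer *rb = driver_new();
   if (rb) {
      assert(live_resources == 1);
      rb->Delete(rb);
   }
   assert(live_resources == 0);
}
```

A core-only delete function cannot know about fields the driver added, which is why calling the per-object hook is the safer default whenever an object may have been created by a driver.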
* mesa: Use PROGRAM_ERROR_STRING_ARB instead of the _NV name | Matt Turner | 2013-02-15 | 1 | -1/+1
  Since NV_fragment_program is now gone. No functional change, since the values are identical.
  Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: fix format query for GL_ARB_texture_rg | Brian Paul | 2013-02-15 | 1 | -3/+4
  The GL_ARB_texture_rg spec says that we need to support both texturing and rendering for the GL_RED and GL_RG formats. So move the format check up into the rendertarget_mapping[] list. Also, add PIPE_FORMAT_R8_UNORM to the list of formats required.
  Note: This is a candidate for the stable branches.
  Reviewed-by: Marek Olšák <[email protected]>
* i965/fs: Do a general SEND dependency workaround for the original 965. | Eric Anholt | 2013-02-15 | 3 | -42/+229
  We'd been ad-hoc inserting instructions in some SEND messages with no knowledge of when it was required (so extra instructions), but not all SENDs (so not often enough). This should do much better than that, though it's still flow-control-ignorant.
  v2: Use BRW_MAX_MRF instead of magic numbers.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58960
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: Candidate for the stable branches.
* i965/gen7: Set up all samplers even if samplers are sparsely used. | Eric Anholt | 2013-02-14 | 1 | -1/+1
  In GLSL, sampler indices are allocated contiguously from 0. But in the case of ARB_fragment_program (and possibly fixed function), an app that uses texture 0 and 2 will use sampler indices 0 and 2, so we were only allocating space for samplers 0 and 1 and setting up sampler 0. We would read garbage for sampler 2, resulting in flickering textures and an angry simulator.
  Fixes bad rendering in 0 A.D. and ETQW.
  This was fixed for pre-gen7 by 28f4be9eb91b12a2c6b1db6660cca71a98c486ec
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: This is a candidate for stable branches.
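The difference between "how many samplers are used" and "how many sampler slots must exist" can be shown with a bitmask. The function names and the idea of a 32-bit used-sampler mask are assumptions for this sketch, not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Number of samplers actually referenced (population count). */
static unsigned
samplers_in_use(uint32_t used_mask)
{
   unsigned n = 0;
   for (; used_mask; used_mask &= used_mask - 1)
      n++;
   return n;
}

/* Number of slots that must be allocated: one past the highest used
 * index, so that sparse indices such as {0, 2} still get a slot 2. */
static unsigned
sampler_slots_needed(uint32_t used_mask)
{
   unsigned slots = 0;
   while (used_mask) {
      slots++;
      used_mask >>= 1;
   }
   return slots;
}

/* Samplers 0 and 2 in use: two samplers, but three slots. */
void
example_usage(void)
{
   assert(samplers_in_use(0x5) == 2);
   assert(sampler_slots_needed(0x5) == 3);
}
```

Sizing by the count works only while indices are contiguous from 0, as in the GLSL case; sizing by the highest index covers the sparse case as well.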
* st/mesa: try to find exact format matching user format and type for DrawPixels | Marek Olšák | 2013-02-14 | 4 | -37/+59
  Reviewed-by: Brian Paul <[email protected]>
* intel: Allow blit readpixels even when the pack alignment is set. | Eric Anholt | 2013-02-13 | 1 | -9/+4
  The default alignment is 4, so this fast path was rarely hit. Rather than introduce logic to handle alignment, just use the Mesa core function.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46632
  Cc: [email protected]
  Reviewed-by: Kenneth Graunke <[email protected]>
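For context, the effect of the pack alignment on a destination row can be sketched as below. This is a generic illustration of GL_PACK_ALIGNMENT rounding with a hypothetical helper name; it is not the Mesa core function the commit refers to, and it ignores other pack state such as row length and skip values.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes from the start of one packed row to the start of the next.
 * GL allows pack alignments of 1, 2, 4 or 8; rows are padded up to the
 * next multiple of the alignment. */
static size_t
packed_row_stride(size_t width, size_t bytes_per_pixel, size_t alignment)
{
   assert(alignment == 1 || alignment == 2 ||
          alignment == 4 || alignment == 8);

   size_t row_bytes = width * bytes_per_pixel;
   return (row_bytes + alignment - 1) / alignment * alignment;
}
```

With 4-byte pixels every row is already a multiple of 4, so the default alignment changes nothing; with 3-byte pixels and a width of 3, the 9-byte row is padded to 12.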
* i965: Remove writemask support from brw_SAMPLE(). | Eric Anholt | 2013-02-13 | 5 | -109/+18
  The code was rather broken for non-XYZW on 8-wide, but all of our callers were using XYZW anyway. For my experiments with using writemask on texturing, I've been using manual header setup in the compiler backends, since we want to actually know what registers are written for optimization and register allocation.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use a helper function for checking for flow control instructions. | Eric Anholt | 2013-02-13 | 3 | -23/+22
  In 2 of our checks, we were missing BREAK and CONTINUE.
  NOTE: Candidate for the stable branches.
  Reviewed-by: Kenneth Graunke <[email protected]>
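The benefit of a single helper is that every call site sees the same opcode list. A minimal sketch, assuming a made-up `enum opcode` (the backend's real opcode names and helper differ):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode set for illustration only. */
enum opcode {
   OP_MOV, OP_ADD,
   OP_IF, OP_ELSE, OP_ENDIF,
   OP_DO, OP_WHILE,
   OP_BREAK, OP_CONTINUE
};

/* One central predicate: hand-written per-site checks are how BREAK and
 * CONTINUE can get forgotten in some places but not others. */
bool
is_control_flow(enum opcode op)
{
   switch (op) {
   case OP_IF:
   case OP_ELSE:
   case OP_ENDIF:
   case OP_DO:
   case OP_WHILE:
   case OP_BREAK:
   case OP_CONTINUE:
      return true;
   default:
      return false;
   }
}

void
example_usage(void)
{
   assert(is_control_flow(OP_BREAK));
   assert(is_control_flow(OP_CONTINUE));
   assert(!is_control_flow(OP_MOV));
}
```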
* shaderapi: Fix AttachShader error | bma | 2013-02-13 | 1 | -0/+14
  Detect a duplicate shader type as an error instead of silently allowing it; restrict this to the ES2 API.
  v2: Tapani Pälli <[email protected]>
  - make the check run time instead of compile time
  v3: chadv
  - Quote spec on which error to generate.
  Signed-off-by: bma <[email protected]>
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-and-tested-by: Chad Versace <[email protected]>
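A run-time duplicate-type check of the kind described above might look like the following sketch. The types, the fixed-size attachment array, and the boolean return are all assumptions; the real code reports a GL error rather than returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ATTACHED 8   /* hypothetical limit for this sketch */

enum shader_type { VERTEX_SHADER, FRAGMENT_SHADER };

struct program {
   enum shader_type attached[MAX_ATTACHED];
   unsigned num_attached;
};

/* Returns false (standing in for "generate an error") when the API is
 * ES2 and a shader of the same type is already attached. The API check
 * is an ordinary run-time argument, not a compile-time switch. */
static bool
attach_shader(struct program *prog, enum shader_type type, bool api_is_es2)
{
   assert(prog != NULL);

   if (api_is_es2) {
      for (unsigned i = 0; i < prog->num_attached; i++) {
         if (prog->attached[i] == type)
            return false;
      }
   }
   if (prog->num_attached == MAX_ATTACHED)
      return false;

   prog->attached[prog->num_attached++] = type;
   return true;
}
```

On a non-ES2 API the same call sequence succeeds, which matches the message's note that the restriction is limited to ES2.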
* i965: Re-enable the -RHW workaround for original gen4 chips. | Eric Anholt | 2013-02-13 | 1 | -12/+8
  Fixes broken clipping in supertuxkart and presumably many other applications.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51471
  NOTE: Candidate for the stable branches.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen4: Work around missing sRGB RGB DXT1 support. | Eric Anholt | 2013-02-13 | 3 | -4/+20
  The hardware just doesn't support it. I suspect this was a regression from the move to fixed MESA_FORMATs for compressed textures and that previously we were storing uncompressed for this or something.
  Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor swizzled" on my GM965.
  Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: fix texture buffer objects | Marek Olšák | 2013-02-13 | 1 | -4/+10
  Broken by 624528834f53f54c7a934f929769b7e6b230a0b1.
  Reviewed-by: Brian Paul <[email protected]>
* i965: Use derived state for Haswell's 3DSTATE_VF packet. | Kenneth Graunke | 2013-02-12 | 1 | -2/+2
  Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX.
  Fixes gles3conform's primitive_restart_mode test.
  NOTE: This is a candidate for the 9.1 branch.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: accelerate glGetTexImage for all formats using a blit | Marek Olšák | 2013-02-13 | 2 | -49/+152
  This commit allows using glGetTexImage during rendering and still maintain interactive framerates.
  This improves performance of WarCraft 3 under Wine. The framerate is improved from 25 fps to 39 fps in the main menu, and from 0.5 fps to 32 fps in the game.
  v2: fix choosing the format for decompression
* CopyTexImage: Don't check sRGB vs LINEAR for desktop GL | Jordan Justen | 2013-02-12 | 1 | -18/+10
  In OpenGL 4.3, new language was added that would require this check. But, if this check results in broken applications then perhaps it will be reversed.
  For now, remove this check and re-evaluate when desktop GL 4.3 is closer.
  NOTE: This is a candidate for the 9.1 branch.
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* radeon: Remove dead STANDALONE_MMIO defines | Matt Turner | 2013-02-11 | 2 | -3/+0
  These were, at some point in the past, used to request that Xorg's compiler.h export a static inline xf86ReadMmio32 instead of a function pointer. compiler.h only has this option for DEC Alpha.
  But Xorg's compiler.h isn't being included by either of these two files and the radeon driver still works on Alpha, so the definitions are dead and not needed.
  Reviewed-by: Michel Dänzer <[email protected]>
* i965: Add missing dirty bits to INTEL_DEBUG=state arrays. | Kenneth Graunke | 2013-02-11 | 1 | -0/+7
  These are more recent additions, and no one remembered to update the INTEL_DEBUG=state code.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Reorganize brw_bits to match the order in brw_context.h. | Kenneth Graunke | 2013-02-11 | 1 | -5/+5
  This reorders the "brw_bits" array in brw_state_upload.c to match the order of the #defines in brw_context.h. Otherwise, it's really hard to see if any are missing.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Use BRW_NEW_CONTEXT for gen7_disable rather than BRW_NEW_BATCH. | Kenneth Graunke | 2013-02-11 | 1 | -1/+1
  These don't need to be re-disabled on every batch if we're using hardware contexts. (If we're not, this is equivalent.)
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* vbo: Merge GL_QUADS drawing requests in display lists. | Eric Anholt | 2013-02-11 | 1 | -0/+43
  minecraft apparently has its piles of display lists each contain 6 instances of glBegin(GL_QUADS)/verts/glEnd(), which appear in the compiled list as 6 prims of 4 verts each in one draw call. We can reduce driver overhead even more by making that one prim of 24 verts.
  Improves minecraft performance by 1.6% +/- .25% (n=446)
  Reviewed-by: Jordan Justen <[email protected]>
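The merge described above can be sketched as an in-place pass over a primitive list. `struct prim` and `merge_quads()` are hypothetical names for this illustration; the vbo module's real structures carry more state, and the conditions it checks may differ.

```c
#include <assert.h>
#include <stddef.h>

#define MODE_QUADS 0x0007   /* numeric value of GL_QUADS */

/* Hypothetical, simplified primitive record. */
struct prim {
   int mode;         /* primitive type */
   unsigned start;   /* index of the first vertex */
   unsigned count;   /* number of vertices */
};

/* Merges runs of adjacent GL_QUADS prims whose vertices are contiguous.
 * Quads are independent groups of 4 vertices, so two complete quad prims
 * laid out back to back draw the same thing as one longer prim.
 * Returns the new number of prims. */
static unsigned
merge_quads(struct prim *prims, unsigned num_prims)
{
   assert(prims != NULL || num_prims == 0);

   unsigned out = 0;
   for (unsigned i = 0; i < num_prims; i++) {
      if (out > 0) {
         struct prim *prev = &prims[out - 1];
         if (prev->mode == MODE_QUADS &&
             prims[i].mode == MODE_QUADS &&
             prev->count % 4 == 0 &&
             prev->start + prev->count == prims[i].start) {
            prev->count += prims[i].count;
            continue;
         }
      }
      prims[out++] = prims[i];
   }
   return out;
}
```

For the case in the message, six contiguous prims of 4 vertices collapse into a single prim with a count of 24.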
* vbo: Print display list debug using printf() like dlist.c does. | Eric Anholt | 2013-02-11 | 1 | -8/+8
  Otherwise, the stderr and stdout debug end up interleaved wrong when I pipe them to a file.
  Reviewed-by: Jordan Justen <[email protected]>
* i965: Remove some stale comments about the brw_constant_buffer atom. | Eric Anholt | 2013-02-11 | 2 | -12/+0
  These have been wrong since f428255bde93a452a7cdd48fba21839c99beb6cb back in 2009!
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Simplify VS push constant upload code since removal of old path. | Eric Anholt | 2013-02-11 | 1 | -7/+11
  We used to have clip planes optionally included in the push constants, resulting in a variable amount of data uploaded, but no more. This also means less wasted space in the batch for our push constants.
  v2: Update _NEW_TRANSFORM state bit information.
  Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* i965: Add perf debug for a corner case. | Eric Anholt | 2013-02-11 | 1 | -0/+3
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix access mode of index buffer rebase. | Eric Anholt | 2013-02-11 | 1 | -1/+1
  It doesn't matter with our current implementation of MapBufferRange, but it was wrong -- the result pointer is read by intel_upload_data().
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix indentation of index buffer rebase code. | Eric Anholt | 2013-02-11 | 1 | -9/+9
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: fix GetTexImage if mesa format and internal format don't match | Marek Olšák | 2013-02-11 | 2 | -0/+71
  Tested with softpipe only exposing RGBA formats.
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Brian Paul <[email protected]>
* mesa: don't use memcpy fast path for GetTexImage if base format is different | Marek Olšák | 2013-02-11 | 1 | -4/+6
  The Mesa format can be RGBA8888_REV, the format/type can be GL_RGBA/GL_UNSIGNED_BYTE, but the actual texture internal format can be LUMINANCE_ALPHA, INTENSITY, etc. Therefore we should look at the base internal format as well.
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Brian Paul <[email protected]>
* mesa: don't use _mesa_base_tex_format for format parameter of GetTexImage | Marek Olšák | 2013-02-11 | 1 | -1/+36
  _mesa_base_tex_format doesn't accept GL_BGR and GL_ABGR_EXT, etc.
  v2: add a (now hopefully complete) helper function to deal with this
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Brian Paul <[email protected]>
* mesa: adjust usage of swapBytes/littleEndian in format_matches_format_and_type | Marek Olšák | 2013-02-11 | 1 | -25/+17
  - swapBytes has no effect on 8-bit single-component formats
  - GL_SHORT is in host byte order, so checking for littleEndian is unnecessary; I decided to make the change for single-component formats only
  Based on suggestions from Michel Dänzer.
  Reviewed-by: Michel Dänzer <[email protected]>
* mesa: remove per-format memcpy codepaths from texstore functions | Marek Olšák | 2013-02-11 | 1 | -590/+64
  It's obsoleted by the common function _mesa_texstore_memcpy.
  Reviewed-by: Brian Paul <[email protected]>