path: root/src
Commit message | Author | Age | Files | Lines
* intel: Avoid making tiled miptrees we won't be able to blit. | Eric Anholt | 2013-04-08 | 1 | -14/+21
  Doing so was breaking miptree mapping, which we really need to be able to handle. With this change, intel_miptree_map_direct() falls through to doing a CPU mapping on the buffer like we need.
  With the previous 2 patches, all of these should be fixed:
  piglit max-texture-size (all 3 patches required!)
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37871
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44958
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53494
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Do temporary CPU maps of textures that are too big to GTT map. | Eric Anholt | 2013-04-08 | 1 | -0/+21
  This still fails, since 8192*4bpp == 32768, which is too big to use the blitter on.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
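The arithmetic in the commit above can be sketched in C. This is only an illustration: the helper names and the `HYPOTHETICAL_BLIT_PITCH_LIMIT` constant are assumptions inferred from the commit message (which says a 32768-byte pitch is already too big), not values taken from the driver source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the commit message only says 32768 bytes is too big,
 * so this sketch treats 32768 as the first pitch that does not fit. */
#define HYPOTHETICAL_BLIT_PITCH_LIMIT 32768u

/* Bytes per row of a linear (untiled) texture level. */
static uint32_t row_pitch_bytes(uint32_t width_px, uint32_t bytes_per_pixel)
{
    return width_px * bytes_per_pixel;
}

/* True when a row pitch is below the assumed blitter limit. */
static bool pitch_fits_blitter(uint32_t pitch_bytes)
{
    return pitch_bytes < HYPOTHETICAL_BLIT_PITCH_LIMIT;
}
```

With these helpers, an 8192-pixel-wide row at 4 bytes per pixel has a pitch of 32768 bytes and is rejected, while a 4096-pixel-wide row (16384 bytes) is accepted.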
* intel: Add support for writing to our linear-temporary-CPU-map case. | Eric Anholt | 2013-04-08 | 1 | -2/+23
  This will be used for handling updates of large textures.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* intel: Remove check for kernel 2.6.29. | Kenneth Graunke | 2013-04-08 | 1 | -7/+0
  Now that we require 2.6.39, there's no need to also check for 2.6.29. Calling drm_intel_bufmgr_gem_enable_fenced_relocs() without checking should be safe, as it simply sets a flag.
  This does remove the check for zero fences available, but that doesn't seem worth checking.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* intel: Require kernel 2.6.39 for relaxed relocation support. | Kenneth Graunke | 2013-04-08 | 2 | -5/+4
  Chris Wilson's relaxed relocation patch landed in March 2011. Anyone running pre-3.0 kernels probably isn't going to get the latest Mesa anyway.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove a few BRW_STATE_... enum values. | Kenneth Graunke | 2013-04-08 | 1 | -2/+0
  These were likely used for BRW_NEW_... dirty bit flags at one point, but they're unused now.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove brw->vb.info and struct brw_vertex_info. | Kenneth Graunke | 2013-04-08 | 2 | -13/+0
  Nobody uses this value, so there's no need to set it.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove the BRW_NEW_INPUT_DIMENSIONS flag. | Kenneth Graunke | 2013-04-08 | 3 | -8/+0
  When I removed the proj_attrib_mask optimization, I also removed the last consumer of this bit without realizing it. Since nobody uses it, there's no point in flagging it.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* register_allocate: Fix the type of best_benefit. | Matt Turner | 2013-04-08 | 1 | -1/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* r600g/llvm: Add support for native isa for pre EG | Vincent Lejeune | 2013-04-08 | 2 | -2/+6
  This fixes bug 62756: https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12
* gallium/util: add const to a parameter of util_max_layer | Marek Olšák | 2013-04-06 | 1 | -1/+1
* st/mesa: don't expose ARB_color_buffer_float without driver support in GL core | Marek Olšák | 2013-04-06 | 1 | -0/+11
  Reviewed-by: Brian Paul <[email protected]>
* mesa: allow drivers not to expose ARB_color_buffer_float in GL core profile | Marek Olšák | 2013-04-06 | 7 | -15/+48
  Reviewed-by: Brian Paul <[email protected]>
* mesa: move updating clamp control derived state out of mesa_update_state_locked | Marek Olšák | 2013-04-06 | 4 | -36/+40
  It has 2 dependencies: glClampColor and the framebuffer, so we might just as well do the update where those two are changed.
  v2: cosmetic changes from Brian's email
  Reviewed-by: Brian Paul <[email protected]>
* mesa: don't set _ClampFragmentColor to TRUE if it has no effect | Marek Olšák | 2013-04-06 | 13 | -21/+37
  This should reduce shader recompilations with drivers that emulate fragment color clamping, because we want the clamping to be enabled only if there is a signed normalized or floating-point colorbuffer.
  Reviewed-by: Brian Paul <[email protected]>
* mesa: refactor clamping controls, get rid of _ClampReadColor | Marek Olšák | 2013-04-06 | 7 | -30/+58
  v2: cosmetic changes from Brian's email
  Reviewed-by: Brian Paul <[email protected]>
* mesa: don't memcmp() off the end of a cache key. | Chris Forbes | 2013-04-06 | 1 | -2/+9
  Reported-by: `per` in #intel-gfx
  The size of the cache key varies, so store the actual size as well as the key blob itself, rather than just assuming it's the same as the size passed in.
  NOTE: This is a candidate for stable branches.
  V2: Don't leave silly holes in structure; use unsigned instead of GLuint.
  V3: Fix missing case for `last` match.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
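The idea in the commit above, storing the key size next to the key blob and comparing sizes before calling memcmp(), can be sketched as follows. The struct and function names here are hypothetical and chosen for illustration; they are not the names used in Mesa's cache code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache entry: the key blob plus its real size. */
struct cache_entry_sketch {
    const void *key;
    size_t key_size;
};

/* Compare sizes first, so memcmp() never reads past the end of the
 * shorter stored key when the caller passes in a larger one. */
static bool cache_key_matches(const struct cache_entry_sketch *entry,
                              const void *key, size_t key_size)
{
    return entry->key_size == key_size &&
           memcmp(entry->key, key, key_size) == 0;
}
```

A lookup with a 5-byte probe against a 3-byte stored key is rejected on the size check alone, without touching memory beyond the stored blob.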
* radeonsi: Add compute support v3 | Tom Stellard | 2013-04-05 | 11 | -49/+378
  v2:
    - Only dump shaders when env variable is set.
  v3:
    - Don't emit VGT registers
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Set TCL1_ACTION_ENA when invalidating the texture cache | Tom Stellard | 2013-04-05 | 1 | -0/+1
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Remove si_pm4_inval_vertex_cache() | Tom Stellard | 2013-04-05 | 3 | -8/+1
  This function is a holdover from r600g and is identical to si_pm4_inval_texture_cache(), so it is not needed.
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2 | Tom Stellard | 2013-04-05 | 10 | -80/+105
  This target string now contains four values instead of three. The old processor field (which was really being interpreted as arch) has been split into two fields: processor and arch. This allows drivers to pass a more detailed description of the hardware to compiler frontends.
  v2:
    - Adapt to libclc changes
  Reviewed-by: Francisco Jerez <[email protected]>
* util: add ETC as compressed format | Wladimir | 2013-04-05 | 1 | -0/+1
  Add UTIL_FORMAT_LAYOUT_ETC to util_format_is_compressed. It was missing.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: fix is_blit_generic_supported() stencil checking | Brian Paul | 2013-04-05 | 1 | -12/+14
  Don't check if there's sampler support for stencil if we're not going to actually blit/copy stencil values. Fixes the case where we mistakenly said we can't support a blit of depth values from S8Z24 to X8Z24.
  Also, rename the is_stencil variable to dst_has_stencil to improve readability.
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
* Honor GLX_DONT_CARE in MATCH_MASK | Alexander Monakov | 2013-04-05 | 1 | -1/+3
  NOTE: This is a candidate for stable branches.
  Reviewed-by: Ian Romanick <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47478
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62999
  Bugzilla: http://bugs.winehq.org/show_bug.cgi?id=26763
* freedreno: use autogenerated register defs | Rob Clark | 2013-04-05 | 23 | -1617/+2116
  Switch to use the envytools generated headers for register/bitfield definitions. This is the first step in preparing to add a3xx support, since it avoids having conflicting names for a3xx and a2xx registers. And since I'm using envytools for a3xx it is simpler to just use it for everything.
  This shouldn't cause any functional change; it is really just a lot of renaming.
  Signed-off-by: Rob Clark <[email protected]>
* st/wgl: Install our windows message hook to threads created before the ICD is loaded. | José Fonseca | 2013-04-05 | 2 | -26/+196
  Otherwise we will not receive destroy windows events, causing framebuffers to leak. This happens particularly with java and jogl.
  Tested with java + jogl, MATLAB.
  VMware Internal Bug Number: 1013086.
  Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: Work without sse2 if llvm is new enough | Adam Jackson | 2013-04-05 | 1 | -2/+3
  At least on llvm 3.2 this appears to work fine. Tested on an Athlon XP 2600+, which has sse and 3dnow but not sse2.
  Reviewed-by: Jose Fonseca <[email protected]>
  Signed-off-by: Adam Jackson <[email protected]>
* winsys/radeon: add command stream replay dump for faulty lockup v3 | Jerome Glisse | 2013-04-05 | 7 | -37/+443
  Build time option, set RADEON_CS_DUMP_ON_LOCKUP to 1 in radeon_drm_cs.h to enable it.
  When enabled, after each cs submission the code will try to detect a lockup by waiting on one of the buffers of the cs to become idle. After a timeout it will consider that the cs triggered a lockup and will write a radeon_lockup.c file in the current directory that has all the information for replaying the cs.
  To build this file:
      gcc -O0 -g radeon_lockup.c -ldrm -o radeon_lockup -I/usr/include/libdrm
  v2: Add radeon_ctx.h file to mesa git tree
  v3: Slightly improve dumped file for easier editing, only dump first faulty cs
  Signed-off-by: Jerome Glisse <[email protected]>
* st/xlib: add HUD support for xlib/GLX | Brian Paul | 2013-04-04 | 4 | -0/+34
  For the softpipe and llvmpipe drivers.
  Reviewed-by: Jose Fonseca <[email protected]>
* gallium/hud: add GALLIUM_HUD_PERIOD env var | Brian Paul | 2013-04-04 | 1 | -1/+16
  To set the graph update rate, in seconds. The default update rate has also been changed to 1/2 second.
  Reviewed-by: Marek Olšák <[email protected]>
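One way an environment variable like this could be turned into an update period is sketched below. The function name and the fallback rules (unparsable or non-positive values fall back to the default) are assumptions made for this illustration; only the variable's meaning (seconds) and the 1/2-second default come from the commit message.

```c
#include <stdlib.h>

/* Default taken from the commit message: 1/2 second. */
#define DEFAULT_HUD_PERIOD_SECONDS 0.5

/* Hypothetical helper: parse the period from a string such as the
 * value of GALLIUM_HUD_PERIOD. NULL, unparsable, or non-positive
 * input falls back to the default (an assumption of this sketch). */
static double hud_period_from_string(const char *value)
{
    char *end;
    double period;

    if (value == NULL)
        return DEFAULT_HUD_PERIOD_SECONDS;

    period = strtod(value, &end);
    if (end == value || period <= 0.0)
        return DEFAULT_HUD_PERIOD_SECONDS;

    return period;
}
```

A caller would typically pass `getenv("GALLIUM_HUD_PERIOD")`, which is NULL when the variable is unset.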
* gallium/hud: initialize sampler state | Brian Paul | 2013-04-04 | 1 | -0/+6
  The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with unnormalized texcoords (at least for softpipe).
  v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE
  Reviewed-by: Marek Olšák <[email protected]>
* glsl: Add an optimization pass to flatten simple nested if blocks. | Kenneth Graunke | 2013-04-04 | 4 | -0/+106
  GLBenchmark 2.7's shaders contain conditional blocks like:

      if (x) {
         if (y) {
            ...
         }
      }

  where the outer conditional's then clause contains exactly one statement (the nested if) and there are no else clauses. This can easily be optimized into:

      if (x && y) {
         ...
      }

  This saves a few instructions in GLBenchmark 2.7:
  total instructions in shared programs: 11833 -> 11649 (-1.55%)
  instructions in affected programs: 8234 -> 8050 (-2.23%)
  It also helps CS:GO slightly (-0.05%/-0.22%). More importantly, however, it simplifies the control flow graph, which could enable other optimizations.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
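The equivalence this pass relies on can be checked with a small C sketch. The two functions below are hypothetical stand-ins for the shader code before and after flattening; they illustrate the source-level transformation only, not the GLSL IR pass itself.

```c
#include <stdbool.h>

/* Before: the outer then-clause holds exactly one statement (the
 * nested if) and neither conditional has an else clause. */
static int nested_form(bool x, bool y)
{
    int result = 0;
    if (x) {
        if (y) {
            result = 1;
        }
    }
    return result;
}

/* After: one conditional on the conjunction of both conditions. */
static int flattened_form(bool x, bool y)
{
    int result = 0;
    if (x && y) {
        result = 1;
    }
    return result;
}
```

Both forms return the same value for all four combinations of `x` and `y`, which is why the rewrite is safe when there are no else clauses.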
* i965: Use a variable for the push constant size in kB. | Kenneth Graunke | 2013-04-04 | 1 | -2/+3
  This clarifies that the offset of 2 is actually 16 kB / 8 kB units. It also keys both computations off of a single variable, which should make it easier to change in the future.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
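The unit conversion mentioned in the commit above can be written out as a short C sketch. The function name is hypothetical; only the 16 kB size and the 8 kB unit come from the commit message.

```c
/* Convert a size in kB into 8 kB units, the unit the commit message
 * says the offset is expressed in. */
static unsigned size_in_8kB_units(unsigned size_kB)
{
    return size_kB / 8;
}
```

With a push constant size of 16 kB this gives the offset of 2 that the commit message refers to.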
* i965: Turn brw->urb.vs_size and gs_size into local variables. | Kenneth Graunke | 2013-04-04 | 3 | -22/+12
  These variables are only used within a single function, so we may as well make them local variables.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Remove BRW_NEW_WM_INPUT_DIMENSIONS dirty bit. | Kenneth Graunke | 2013-04-04 | 3 | -4/+0
  This was only produced by the brw_wm_input_dimensions atom, which was removed in the previous commit. So there's no need for the dirty bit.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Delete brw_vs_constval.c and the brw_wm_input_sizes atom. | Kenneth Graunke | 2013-04-04 | 5 | -279/+0
  This was only used to compute proj_attrib_mask, which was removed by the previous commit. That makes this dead code.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove now dead brw_wm_prog_key::proj_attrib_mask field. | Kenneth Graunke | 2013-04-04 | 3 | -29/+0
  The previous commit removed the last user of this field, so there's no longer any point in setting it. Removing this should eliminate state-dependent recompiles, and make the precompile more reliable.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove fixed-function texture projection avoidance optimization. | Kenneth Graunke | 2013-04-04 | 1 | -25/+18
  This optimization attempts to avoid extra attribute interpolation instructions for texture coordinates where the W-component is 1.0.
  Unfortunately, it requires a lot of complexity: the brw_wm_input_sizes state atom (all the brw_vs_constval.c code) needs to run on each draw. It computes the input_size_masks array, then uses that to compute proj_attrib_mask. Differences in proj_attrib_mask can cause state-dependent fragment shader recompiles. We also often fail to guess proj_attrib_mask for the fragment shader precompile, causing us to needlessly compile it twice.
  Furthermore, this optimization only applies to fixed-function programs; it does not help modern GLSL-based programs at all. Generally, older fixed-function programs run fine on modern hardware anyway.
  The optimization has existed in some form since the initial commit. When we rewrote the fragment shader backend, we dropped it for a while. Eric readded it in commit eb30820f268608cf451da32de69723036dddbc62 as part of an attempt to cure a ~1% performance regression caused by converting the fixed-function fragment shader generation code from Mesa IR to GLSL IR. However, no performance data was included in the commit message, so it's unclear whether or not it was successful.
  Time has passed, so I decided to re-measure this. Surprisingly, Eric's OpenArena timedemo actually runs /faster/ after removing this and the brw_wm_input_sizes atom. On Ivybridge at 1024x768, I measured a 1.39532% +/- 0.91833% increase in FPS (n = 55). On Ironlake, there was no statistically significant difference (n = 37).
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Use ctx->Stencil._WriteEnabled in DEPTH_STENCIL_STATE. | Kenneth Graunke | 2013-04-04 | 1 | -5/+1
  This is the same computation as the _WriteEnabled flag, so we may as well use it.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Fix stencil write enable flag in 3DSTATE_DEPTH_BUFFER on Gen7+. | Kenneth Graunke | 2013-04-04 | 1 | -1/+1
  ctx->Stencil.WriteMask is a statically sized array of 3 elements. Checking it against 0 actually is a NULL check, and can never fail, which meant that we always said stencil writes were enabled.
  Use the new core Mesa derived state flag to fix this.
  NOTE: This is a candidate for stable branches.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
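The bug class described above can be illustrated with a simplified C sketch. The struct and function below are hypothetical and are not Mesa's actual state layout or its exact `_WriteEnabled` computation; the point is only that an array member must be tested element by element, because the array name itself decays to a non-NULL address.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the stencil state. */
struct stencil_state_sketch {
    bool enabled;
    unsigned write_mask[3];
};

/* The buggy pattern was equivalent to `s->write_mask != 0`, which
 * compares the array's address against NULL and is always true.
 * This sketch inspects the mask values instead (an assumed rule:
 * writes count as enabled if any mask element is non-zero). */
static bool stencil_writes_enabled(const struct stencil_state_sketch *s)
{
    int i;

    if (!s->enabled)
        return false;

    for (i = 0; i < 3; i++) {
        if (s->write_mask[i] != 0)
            return true;
    }
    return false;
}
```

With all three masks set to zero the function reports that writes are disabled, which is the case the address comparison could never detect.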
* mesa: Add new ctx->Stencil._WriteEnabled derived state flag. | Kenneth Graunke | 2013-04-04 | 2 | -0/+6
  i965 needs to know whether stencil writes are enabled in several places, and gets the test wrong sometimes. While we could create a function to compute this, it seems generally useful enough to warrant a new piece of derived state. Also, all the plumbing is already in place.
  NOTE: This is a candidate for stable branches.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* gallivm: some minor cube map cleanup | Roland Scheidegger | 2013-04-04 | 1 | -10/+15
  The ar_ge_as_at variable was just very very confusing since the condition was actually the other way around (as_at_ge_ar). So change the condition (and the selects depending on it) to match the variable name.
  And also change the chosen major axis in case the coord values are the same. OpenGL doesn't care one bit which one is chosen in this case but it looks like dx10 would require z chosen over y, and y chosen over x (previously did x chosen over y, y chosen over z). Since it's all the same effort just honor dx10's wishes.
  (Though actually, for some preferred orderings, we could save one (or two with derivatives) selects since the tnewx and tnewz (and the corresponding dmax values) are the same.)
  Reviewed-by: Jose Fonseca <[email protected]>
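The tie-breaking order described above (z chosen over y, y chosen over x) can be sketched as a scalar C function. This is only an illustration: the enum and function names are hypothetical, and the real gallivm code emits vectorized LLVM IR selects rather than C branches.

```c
/* Hypothetical names, for illustration only. */
enum major_axis_sketch { AXIS_X, AXIS_Y, AXIS_Z };

static float abs_f(float v)
{
    return v < 0.0f ? -v : v;
}

/* Pick the cube-map major axis from the coordinate magnitudes.
 * Using >= makes ties resolve as z over y, and y over x. */
static enum major_axis_sketch choose_major_axis(float s, float t, float r)
{
    const float as = abs_f(s);
    const float at = abs_f(t);
    const float ar = abs_f(r);

    if (ar >= as && ar >= at)
        return AXIS_Z;
    if (at >= as)
        return AXIS_Y;
    return AXIS_X;
}
```

For example, when all three magnitudes are equal the function picks z, and when only the x and y magnitudes tie it picks y.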
* i965: Ask the register allocator to round-robin through registers. | Eric Anholt | 2013-04-04 | 3 | -3/+31
  The way we were allocating registers before, packing into low register numbers for Ironlake, resulted in an overly-constrained dependency graph for instruction scheduling.
  Improves GLBenchmark 2.1 performance by 4.5% +/- 0.7% (n=26). No difference on my old GLSL demo (n=20). No difference on nexuiz (n=15).
  v2: Fix off-by-one bug that made the change only work for 16-wide on i965.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* llvmpipe: implement ucmp | Zack Rusin | 2013-04-04 | 2 | -0/+32
  and add a test for it
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
* Avoid spurious GCC warnings in STATIC_ASSERT() macro. | Paul Berry | 2013-04-04 | 2 | -2/+2
  GCC 4.8 now warns about typedefs that are local to a scope and not used anywhere within that scope. This produced spurious warnings with the STATIC_ASSERT() macro (which used a typedef to provoke a compile error in the event of an assertion failure).
  This patch switches to a simpler technique that avoids the warning.
  v2: Avoid GCC-specific syntax. Also update p_compiler.h.
  Reviewed-by: Kenneth Graunke <[email protected]>
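One typedef-free way to write a compile-time assertion in portable C is sketched below. The macro name is hypothetical and this is not necessarily the exact form the patch adopted; it simply shows a technique that provokes a compile error through a negative array size without declaring an unused local typedef.

```c
/* Hypothetical sketch: when COND is false the array size becomes -1,
 * which is a compile error. sizeof does not evaluate its operand, so
 * no code is generated and no local typedef is left unused. */
#define STATIC_ASSERT_SKETCH(COND) \
    do { (void) sizeof(char [1 - 2 * !(COND)]); } while (0)

/* Example use inside a function body. */
static int check_char_size(void)
{
    STATIC_ASSERT_SKETCH(sizeof(char) == 1);
    return 1;
}
```

Changing the condition to something false, such as `sizeof(char) == 2`, makes the translation unit fail to compile.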
* freedreno: document debug flag | Erik Faye-Lund | 2013-04-04 | 1 | -0/+4
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Signed-off-by: Brian Paul <[email protected]>
* st/wgl: add HUD support | Brian Paul | 2013-04-04 | 5 | -0/+42
  v2: fix a few minor issues spotted by Jose.
  Reviewed-by: José Fonseca <[email protected]>
* st/wgl: make stw_current_context() non-static | Brian Paul | 2013-04-04 | 2 | -1/+3
  Reviewed-by: José Fonseca <[email protected]>
* util: add debug_memory_check_block(), debug_memory_tag() | Brian Paul | 2013-04-04 | 2 | -0/+61
  The former just checks that the given block is valid by checking the header and footer.
  The latter sets the memory block's tag. With extra debug code, we can use that for monitoring/checking particular allocations.
  Reviewed-by: José Fonseca <[email protected]>
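The header-and-footer check and the per-block tag described above can be sketched in stand-alone C. This is a simplified illustration, not the actual Mesa implementation: every `sketch_*` name, the struct layout, and the guard value are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAGIC 0x1234abcdu   /* hypothetical guard value */

/* Hypothetical header placed directly before the payload. */
struct sketch_header {
    size_t   size;    /* payload size in bytes */
    unsigned magic;   /* guard word before the payload */
    unsigned tag;     /* caller-chosen label for this block */
};

/* Allocate header + payload + footer guard word. */
static void *sketch_malloc(size_t size)
{
    const unsigned magic = SKETCH_MAGIC;
    struct sketch_header *hdr = malloc(sizeof *hdr + size + sizeof magic);

    if (hdr == NULL)
        return NULL;
    hdr->size = size;
    hdr->magic = magic;
    hdr->tag = 0;
    /* The footer may be unaligned, so copy it byte-wise. */
    memcpy((unsigned char *)(hdr + 1) + size, &magic, sizeof magic);
    return hdr + 1;
}

/* Valid only if both guard words are intact. */
static bool sketch_check_block(const void *ptr)
{
    const struct sketch_header *hdr = (const struct sketch_header *)ptr - 1;
    unsigned footer;

    if (hdr->magic != SKETCH_MAGIC)
        return false;
    memcpy(&footer, (const unsigned char *)ptr + hdr->size, sizeof footer);
    return footer == SKETCH_MAGIC;
}

/* Set and read back the block's tag. */
static void sketch_tag(void *ptr, unsigned tag)
{
    ((struct sketch_header *)ptr - 1)->tag = tag;
}

static unsigned sketch_get_tag(const void *ptr)
{
    return ((const struct sketch_header *)ptr - 1)->tag;
}

static void sketch_free(void *ptr)
{
    if (ptr != NULL)
        free((struct sketch_header *)ptr - 1);
}
```

Writing one byte past the end of the payload corrupts the footer guard, so a later `sketch_check_block()` call on that block returns false.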
* gallium/hud: replace malloc w/ MALLOC | Brian Paul | 2013-04-04 | 1 | -1/+1
  To match the FREE() call used later. Fixes things on Windows.
  Reviewed-by: Marek Olšák <[email protected]>