summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* i965: When emitting a src/dst write of an output, keep the write maskIan Romanick2011-07-231-1/+5
| | | | | | | | | | | | | | Fixes i965 piglit: vs-varying-array-mat[234]-col-row-wr vs-varying-array-mat[234]-index-col-row-wr vs-varying-array-mat[234]-index-row-wr vs-varying-array-mat[234]-row-wr vs-varying-mat[234]-col-row-wr vs-varying-mat[234]-row-wr Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* prog_optimize: Set unused regs to PROGRAM_UNDEFINED after CMP->MOV conversionIan Romanick2011-07-231-0/+9
| | | | | | | | | | | | | | | | | | | Leaving the unused registers with other values caused assertion failures and other problems in places that blindly iterate over all sources. brw_vs_emit.c:1381: get_src_reg: Assertion `c->regs[file][index].nr != 0' failed. Fixes i965 piglit: vs-uniform-array-mat[234]-col-row-rd vs-uniform-array-mat[234]-index-col-row-rd vs-uniform-array-mat[234]-index-row-rd vs-uniform-mat[234]-col-row-rd Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Copy reladdr in src_reg(dst_reg) constructorIan Romanick2011-07-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | Fixes i965 piglit: vs-temp-array-mat[234]-col-row-wr vs-temp-array-mat[234]-index-col-row-wr vs-temp-array-mat[234]-index-row-wr vs-temp-mat[234]-col-row-wr Fixes swrast piglit: fs-temp-array-mat[234]-col-row-wr fs-temp-array-mat[234]-index-col-row-wr fs-temp-array-mat[234]-index-row-wr fs-temp-mat[234]-col-row-wr vs-temp-array-mat[234]-col-row-wr vs-temp-array-mat[234]-index-col-row-wr vs-temp-array-mat[234]-index-row-wr vs-temp-mat[234]-col-row-wr Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Add each relative address to the previousIan Romanick2011-07-231-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes many cases of accessing arrays of matrices using non-constant indices at each level. Fixes i965 piglit: vs-temp-array-mat[234]-index-col-rd vs-temp-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-wr vs-uniform-array-mat[234]-index-col-rd Fixes swrast piglit: fs-temp-array-mat[234]-index-col-rd fs-temp-array-mat[234]-index-col-row-rd fs-temp-array-mat[234]-index-col-wr fs-uniform-array-mat[234]-index-col-rd fs-uniform-array-mat[234]-index-col-row-rd fs-varying-array-mat[234]-index-col-rd fs-varying-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-rd vs-temp-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-wr vs-uniform-array-mat[234]-index-col-rd vs-uniform-array-mat[234]-index-col-row-rd vs-varying-array-mat[234]-index-col-rd vs-varying-array-mat[234]-index-col-row-rd vs-varying-array-mat[234]-index-col-wr Reviewed-by: Eric Anholt <[email protected]>
* glsl: When lowering non-constant vector indexing, respect existing conditionsIan Romanick2011-07-231-5/+24
| | | | | | | If the non-constant index was in the LHS of an assignment, any existing condititon on that assignment would be lost. Reviewed-by: Eric Anholt <[email protected]>
* glsl: When lowering non-constant array indexing, respect existing conditionsIan Romanick2011-07-231-3/+18
| | | | | | | | | | | | | | | If the non-constant index was in the LHS of an assignment, any existing condititon on that assignment would be lost. Fixes i965 piglit: fs-temp-array-mat[234]-col-row-wr fs-temp-array-mat[234]-index-col-row-wr fs-temp-array-mat[234]-index-col-wr fs-temp-array-mat[234]-index-row-wr vs-varying-array-mat[234]-index-col-wr Reviewed-by: Eric Anholt <[email protected]>
* glsl: Rework lowering of non-constant array indexingIan Romanick2011-07-231-19/+116
| | | | | | | | | | | | | | | | | | | | | | | | | The previous implementation could easily get tricked if the LHS of an assignment included a non-constant index that was "inside" another dereference. For example: mat4 m[2]; m[0][i] = vec4(0.0); Due to the way it tracked whether the array was being assigned, it would think that the non-constant index was in an r-value. The new code fixes that by tracking l-values and r-values differently. The index is also replaced by cloning the IR and replacing the index variable instead of the odd way it was done before. v2: Apply some simplifications suggested by Eric Anholt. Making assignment_generator::rvalue be ir_dereference instead of ir_rvalue simplified the code a bit. Fixes i965 piglit fs-temp-array-mat[234]-index-wr and vs-varying-array-mat[234]-index-wr. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34691 Reviewed-by: Eric Anholt <[email protected]>
* glsl: Split out part of variable_index_to_cond_assign_visitor::needs_loweringIan Romanick2011-07-231-5/+10
| | | | | | | Other code will soon need to know if an array needs lowering based exclusively on the storage mode. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Move is_array_or_matrix outside visitor classIan Romanick2011-07-231-5/+6
| | | | | | | There's no reason for it to be there, and another class that may not have access to the visitor will need it soon. Reviewed-by: Eric Anholt <[email protected]>
* gallivm: Add a note about log2 computation and denormalized numbers.José Fonseca2011-07-221-0/+6
|
* gallivm: Fix lp_build_exp2 order 4-5 polynomial coefficients and bump order.José Fonseca2011-07-221-12/+12
| | | | | | | Not sure how I computed these, but they were wrong (which explains why bumping the polynomial order before never improved precision). This allows to pass the EXP test cases of PSPrecision/VSPrecision DCTs.
* gallivm: Increase lp_build_rsqrt() precision.José Fonseca2011-07-221-1/+1
| | | | | | | Add an iteration step, which makes rqsqrt precision go from 12bits to 24, and fixes RSQ/NRM test case of PSPrecision/VSPrevision DCTs. There are no uses of this function outside shader translation.
* gallivm: Update minimax comments.José Fonseca2011-07-221-6/+17
|
* gallivm: Fix lp_build_exp/lp_build_log.José Fonseca2011-07-221-2/+2
| | | | | Never used so far -- we only used the base 2 variants -- which is why it went unnoticed so far.
* llvmpipe: Unit tests for arithmetic functions.José Fonseca2011-07-223-2/+298
| | | | | | Conflicts: src/gallium/drivers/llvmpipe/SConscript
* util: Store alpha value too.José Fonseca2011-07-221-1/+1
|
* glsl: Add standalone_scaffolding.cpp to SConscript.Vinson Lee2011-07-221-0/+1
|
* glsl: Add unit tests for lower_jumps.cppPaul Berry2011-07-2253-0/+1538
| | | | | | | | | | | | | | These tests invoke do_lower_jumps() in isolation (using the glsl_test executable) and verify that it transforms the IR in the expected way. The unit tests may be run from the top level directory using "make check". For reference, I've also checked in the Python script create_test_cases.py, which was used to generate these tests. It is not necessary to run this script in order to run the tests. Acked-by: Chad Versace <[email protected]>
* glsl: Create a standalone executable for testing optimization passes.Paul Berry2011-07-225-3/+403
| | | | | | | | | | | | | This patch adds a new build artifact, glsl_test, which can be used for testing optimization passes in isolation. I'm hoping that we will be able to add other useful standalone tests to this executable in the future. Accordingly, it is built in a modular fashion: the main() function uses its first argument to determine which test function to invoke, removes that argument from argv[], and then calls that function to interpret the rest of the command line arguments and perform the test. Currently the only test function is "optpass", which tests optimization passes.
* glsl: Move functions into standalone_scaffolding.cpp for later reuse.Paul Berry2011-07-224-58/+150
| | | | | | | | | | | | | | | | This patch moves the following functions from main.cpp (the main cpp file for the standalone executable that is used to create the built-in functions) to standalone_scaffolding.cpp, so that they can be re-used in other standalone executables: - initialize_context()* - _mesa_new_shader() - _mesa_reference_shader() *initialize_context contained some code that was specific to main.cpp, so it was split into two functions: initialize_context() (which remains in main.cpp), and initialize_context_from_defaults() (which is in standalone_scaffolding.cpp).
* mesa: Add an ifndef guard around the definition of the INLINE macroPaul Berry2011-07-221-20/+22
| | | | | | | | | Several Mesa headers redundantly define the INLINE macro. Adding this guard prevents the compiler from complaining about macro redefinition. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* Revert "g3dvl: Preserve previously rendered components for MC output."Younes Manton2011-07-211-4/+4
| | | | | | This reverts commit b56daf71d2f63d044d4c53ab49c6f87e02991a28. The bug is actually in softpipe's blend and writemask interaction.
* Merge branch 'gallium-polygon-stipple'Brian Paul2011-07-2119-168/+447
|\
| * softpipe: use the polygon stipple utility moduleBrian Paul2011-07-219-14/+131
| | | | | | | | | | | | | | | | | | | | This is an alternative to the draw module's polygon stipple stage. The softpipe implementation here is just a test. The advantange of using the new polygon stipple utility module (with other drivers) is we can avoid software vertex processing in the draw module and get much better performance. Polygon stipple doesn't require special vertex processing like the other draw module stage.
| * softpipe: implement fragment shader variantsBrian Paul2011-07-2113-117/+251
| | | | | | | | We'll need shader variants to accomodate the new polygon stipple utility.
| * util: assorted updates to polygon stipple helperBrian Paul2011-07-211-10/+33
| |
| * softpipe: use tgsi_shader_info fields for fragcoord origin, center, etc.Brian Paul2011-07-214-17/+5
| |
| * tgsi: add info fields for fragcoord origin, center, etcBrian Paul2011-07-212-10/+31
| |
| * softpipe: remove obsolete commentBrian Paul2011-07-211-4/+0
| |
| * softpipe: rename a functionBrian Paul2011-07-211-7/+7
| |
* | Merge branch 'remove-copyteximage-hook'Brian Paul2011-07-2115-411/+35
|\ \
| * | st/mesa: get rid of redundant clipping code in st_copy_texsubimage()Brian Paul2011-07-191-28/+0
| | |
| * | mesa: remove unused dd_function_table::CopyTexImage1D/2D() hooksBrian Paul2011-07-191-18/+0
| | |
| * | meta: remove _mesa_meta_CopyTexImage1D/2D()Brian Paul2011-07-193-125/+0
| | |
| * | st/mesa: remove st_CopyTexImage1D/2D()Brian Paul2011-07-191-55/+0
| | |
| * | radeon: remove radeonCopyTexImage2D()Brian Paul2011-07-197-65/+0
| | |
| * | intel: remove intelCopyTexImage1D/2D()Brian Paul2011-07-191-97/+0
| | |
| * | mesa: remove comments referring to Driver.TexImage1D/2DBrian Paul2011-07-191-6/+3
| | |
| * | mesa: stop using ctx->Driver.CopyTexImage1D/2D() hooksBrian Paul2011-07-191-17/+32
| | |
* | | u_vbuf_mgr: restore buffer offsetsChia-I Wu2011-07-211-0/+10
| | | | | | | | | | | | | | | | | | | | | u_vbuf_upload_buffers modifies the buffer offsets. If they are not restored, and any of the vertex formats is not supported natively, the next u_vbuf_mgr_draw_begin call will translate the vertex buffers with incorrect buffer offsets.
* | | mesa: GLES2 should return different error enums for invalid fbo queriesMarek Olšák2011-07-211-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ES 2.0.25 page 127 says: If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then querying any other pname will generate INVALID_ENUM. See also: b9e9df78a03edb35472c2e231aef4747e09db792 NOTE: This is a candidate for the 7.10 and 7.11 branches. Reviewed-by: Ian Romanick <[email protected]>
* | | nouveau: hook up video decoding with nouveau_contextChristoph Bumiller2011-07-218-1/+56
| | | | | | | | | | | | | | | This doesn't include nvfx since its context struct is not derived from common nouveau_context (yet).
* | | glsl: Add ir_function_detect_recursion.cpp to SConscript.Vinson Lee2011-07-201-0/+1
| | |
* | | glsl: Reject shaders that contain static recursionIan Romanick2011-07-205-0/+404
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GLSL 1.20 and later specs say: "Recursion is not allowed, not even statically. Static recursion is present if the static function call graph of the program contains cycles." Recursion is detected and rejected both a compile-time and at link-time. The complie-time check happens to detect some cases that may be removed by various optimization passes. The spec doesn't seem to allow this, but other vendors (e.g., NVIDIA) appear to only check at link-time after all optimizations. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33885 Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* | | glsl: Make prototype_string publicly availableIan Romanick2011-07-202-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | Also clarify the documentation for one of the parameters. Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* | | g3dvl: remove unused vertex shader inputsMarek Olšák2011-07-202-4/+4
| | | | | | | | | | | | See also comments in the code.
* | | i965: Apply a homebrew workaround for GPU hang in OGLC api-texcoord.Eric Anholt2011-07-201-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The behavior of flushes in the hardware is a maze of twisty passages, and strangely the VS constants appear to be loaded during a pipeline flush instead of at the time of the packet emit according to the simulator. On moving the STATE_BASE_ADDRESS packet to where it really needed to live (in order for data loads by other packets to be correct), we sometimes no longer got a flush between those packets where we apparently needed it. This replicates the flushes implied by a STATE_BASE_ADDRESS update, fixing the GPU hangs in OGLC and the "engine" demo. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36821 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39257 Tested-by: Keith Packard <[email protected]> (bzflag and etracer fixed) Acked-by: Kenneth Graunke <[email protected]>
* | | i965: Enable the PIPE_CONTROL workaround workaround out of paranoia.Eric Anholt2011-07-202-3/+29
| | | | | | | | | | | | | | | | | | | | | | | | There's scary stuff going on in PIPE_CONTROL internals, and if the BSpec says to do this to make PIPE_CONTROL work, I'll go ahead and do it because we'll probably never be able to debug it after the fact. v2: Use stall at scoreboard instead of depth stall, as noted by Ken.
* | | i965: Avoid kernel BUG_ON if we happen to wait on the pipe_control w/a BO.Eric Anholt2011-07-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For this and occlusion queries, we're trying to avoid setting I915_GEM_DOMAIN_RENDER for the write domain, because the data written is definitely not going through the render cache, but we do need to tell the kernel that the object has been written. However, with using I915_GEM_DOMAIN_GTT, the kernel on retiring the batchbuffer sees that the w/a BO has a write domain of GTT, and puts it on the flushing list. If something tries to wait for that BO to finish rendering (such as the AUB dumper reading the contents of BOs), we get into wait_request (since obj->active) but with a 0 seqno (since the object is on the flushing list, not actually on a ringbuffer), and BUG_ONs. To avoid the kernel bug (which I'm hoping to delete soon anyway), just use I915_GEM_DOMAIN_INSTRUCTION like occlusion queries do. This doesn't result in more flushing, because we invalidate INSTRUCTION on every batchbuffer now that we're state streaming, anyway. Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Kenneth Graunke <[email protected]>
* | | intel: Use the GLSL-based meta clear when available.Eric Anholt2011-07-201-1/+4
| | | | | | | | | | | | | | | | | | | | | Improves firefox-talos-gfx performance under GL when 3D clears are enabled: [ 0] gl-before firefox-talos-gfx 20.193 20.251 0.27% 3/3 [ 0] gl-after firefox-talos-gfx 18.013 18.040 0.19% 3/3