path: root/src
Commit message Author Age Files Lines
* r600g: Invalidate texture cache when creating vertex buffers for compute v2Tom Stellard2012-09-191-1/+3
| | | | | | | | | | | Compute shaders fetch data from vertex buffers via the texture cache, so we need to make sure the texture cache is flushed. v2: - Fix rebase mistake - Fix spelling in comment Reviewed-by: Marek Olšák <[email protected]>
* r600g: Use LOOP_START_DX10 for loopsTom Stellard2012-09-193-2/+11
| | | | | | | | | | LOOP_START_DX10 ignores the LOOP_CONFIG* registers, so it is not limited to 4096 iterations like the other LOOP_* instructions. Compute shaders need to use this instruction, and since we aren't optimizing loops with the LOOP_CONFIG* registers for pixel and vertex shaders, it seems like we should just use it for everything. Reviewed-by: Marek Olšák <[email protected]>
* r600g: Set the correct value of COLOR*_DIM for RATsTom Stellard2012-09-191-2/+2
| | | | | | | | | For buffers (which is what is being used for RATs), the COLOR*_DIM.WIDTH_MASK field needs to be set to the low 16 bits of the buffer size, and the COLOR*_DIM.HEIGHT_MAX field needs to be set to the high bits. Reviewed-by: Marek Olšák <[email protected]>
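The split described in this commit message can be sketched as follows. This is a minimal illustration, not the r600g code: the struct and function names are hypothetical, and the real register packing (field shifts and macros) is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two halves of a RAT buffer size. */
struct rat_dim {
    uint32_t low16; /* low 16 bits of the buffer size */
    uint32_t high;  /* remaining high bits of the buffer size */
};

/* Split a buffer size the way the commit message describes for COLOR*_DIM. */
static struct rat_dim rat_dim_from_size(uint32_t size)
{
    struct rat_dim dim;
    dim.low16 = size & 0xffffu;
    dim.high = size >> 16;
    return dim;
}
```

Recombining the halves as `(high << 16) | low16` gives back the original size, which is a quick way to check that no bits were lost.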
* r600g: Make sure to initialize DB_DEPTH_CONTROL register for computeTom Stellard2012-09-191-1/+3
| | | | | | The kernel CS checker will fail if this register is not initialized. Reviewed-by: Marek Olšák <[email protected]>
* r600g: Add some comments and debug printfs to compute codeTom Stellard2012-09-192-5/+53
| | | | Reviewed-by: Marek Olšák <[email protected]>
* r600g: Add missing break to case statementTom Stellard2012-09-191-0/+1
|
* radeon/llvm: Emit ISA for ALU instructions in the R600 code emitterMichal Sciubidlo2012-09-1910-167/+359
| | | | Signed-off-by: Tom Stellard <[email protected]>
* radeon/llvm: Only support 512 constant registers on R600Tom Stellard2012-09-191-1/+1
| | | | | This is necessary for upcoming encoding changes, since we will only be using 9 bits for register encoding.
* Revert "mesa: consolidate subtexture x/y/width/height error checking code"Brian Paul2012-09-191-73/+84
| | | | | | This reverts commit 5b807400a87d5efefc481017eb420b772933e1da. Accidentally pushed.
* Revert "more comment"Brian Paul2012-09-191-2/+4
| | | | | | This reverts commit 5205db6a7ce623a7fca72e6dc6391bd12be3f6aa. Accidentally pushed.
* Revert "mesa: clean-up and fix glCompressedTexSubImage error checking"Brian Paul2012-09-191-81/+64
| | | | | | This reverts commit 0c67fe5d2dc6d8066fc23c39184d9614abf63992. Accidentally pushed.
* mesa: clean-up and fix glCompressedTexSubImage error checkingBrian Paul2012-09-191-64/+81
|
* more commentBrian Paul2012-09-191-4/+2
|
* mesa: consolidate subtexture x/y/width/height error checking codeBrian Paul2012-09-191-84/+73
| | | | | This is the code that checks if a subtexture region is aligned to the compressed format's block size.
* winsys/radeon: fix relocs cachingVadim Girlin2012-09-192-8/+6
| | | | | | | | | | | Don't cache pointers to elements of a reallocatable array. In some circumstances it caused false cache hits, resulting in an incorrect command stream and a GPU lockup. Note: This is a candidate for the stable branches. Signed-off-by: Vadim Girlin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
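The hazard named here is general: a pointer into an array that grows with realloc() can go stale when the storage moves. One common way around it, sketched below under the assumption of a simple growable list, is to cache an index instead of a pointer. All names are hypothetical and this is not the winsys/radeon code.

```c
#include <assert.h>
#include <stdlib.h>

struct reloc {
    unsigned handle;
    unsigned flags;
};

struct reloc_list {
    struct reloc *items; /* may move whenever the array grows */
    size_t count;
    size_t capacity;
    size_t cached_index; /* index of the last hit; still valid after realloc() */
    int cache_valid;
};

/* Append a handle, growing the storage when needed.
 * Returns the new element's index, or (size_t)-1 on allocation failure. */
static size_t reloc_add(struct reloc_list *l, unsigned handle)
{
    if (l->count == l->capacity) {
        size_t cap = l->capacity ? l->capacity * 2 : 4;
        struct reloc *p = realloc(l->items, cap * sizeof(*p));
        if (!p)
            return (size_t)-1;
        l->items = p;
        l->capacity = cap;
    }
    l->items[l->count].handle = handle;
    l->items[l->count].flags = 0;
    return l->count++;
}

/* Look up a handle, trying the cached index first.
 * Returns the element's index, or (size_t)-1 if it is not in the list. */
static size_t reloc_lookup(struct reloc_list *l, unsigned handle)
{
    if (l->cache_valid && l->items[l->cached_index].handle == handle)
        return l->cached_index;
    for (size_t i = 0; i < l->count; i++) {
        if (l->items[i].handle == handle) {
            l->cached_index = i;
            l->cache_valid = 1;
            return i;
        }
    }
    return (size_t)-1;
}
```

Because the cache stores a position rather than an address, it is re-resolved against the current `items` pointer on every lookup, so growth of the array cannot produce a false hit on freed memory.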
* radeon/llvm: Add a fdiv pattern.Vincent Lejeune2012-09-181-3/+10
| | | | Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* radeon/llvm: reserve also corresponding 128bits regVincent Lejeune2012-09-181-0/+1
| | | | Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* softpipe: implement the new can_create_resource() functionBrian Paul2012-09-172-5/+29
| | | | | | And define a SP_MAX_TEXTURE_SIZE value as we do in llvmpipe. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: implement the new can_create_resource() functionBrian Paul2012-09-171-5/+23
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: implement new proxy texture codeBrian Paul2012-09-172-1/+73
| | | | | | | If the gallium driver implements the can_create_resource() function, call it to do proxy texture size checks. Reviewed-by: Jose Fonseca <[email protected]>
* gallium: add new pipe_screen::can_create_resource() functionBrian Paul2012-09-172-0/+18
| | | | | | | | | | Used to implement proxy textures. If a gallium driver doesn't implement this function we'll just continue to use the core Mesa fallback code. Without this hook we really have no good way to implement OpenGL proxy textures with gallium drivers. Reviewed-by: Jose Fonseca <[email protected]>
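The optional-hook pattern described above can be sketched like this. The types and function names are simplified stand-ins invented for illustration; they are not the gallium `pipe_screen` definitions, and the fallback rule is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of a texture to be created. */
struct tex_template {
    uint32_t width, height, depth, bytes_per_texel;
};

/* Hypothetical screen object with an optional driver hook. */
struct screen {
    /* NULL when the driver does not implement the hook. */
    bool (*can_create_resource)(struct screen *s, const struct tex_template *t);
    uint64_t max_texture_bytes;
};

static uint64_t template_bytes(const struct tex_template *t)
{
    return (uint64_t)t->width * t->height * t->depth * t->bytes_per_texel;
}

/* Example driver hook: reject anything larger than the driver's byte limit. */
static bool driver_can_create(struct screen *s, const struct tex_template *t)
{
    return template_bytes(t) <= s->max_texture_bytes;
}

/* Stand-in for the generic fallback: a per-dimension limit only (assumed). */
static bool fallback_size_ok(const struct tex_template *t)
{
    return t->width <= 16384 && t->height <= 16384 && t->depth <= 16384;
}

/* Proxy texture check: ask the driver if it can answer, else fall back. */
static bool proxy_texture_ok(struct screen *s, const struct tex_template *t)
{
    if (s->can_create_resource)
        return s->can_create_resource(s, t);
    return fallback_size_ok(t);
}
```

The point of the hook is visible in the 16384 x 16384 x 2 case: a per-dimension fallback accepts it, while a driver that knows its memory limit can refuse it.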
* mesa: take cube faces into account in _mesa_test_proxy_teximage()Brian Paul2012-09-171-0/+1
| | | | | There will always be six cube faces so take that into consideration when computing the texture size and comparing against the limit.
* mesa: handle GL_PROXY_TEXTURE_CUBE_MAP in _mesa_num_tex_faces()Brian Paul2012-09-171-1/+7
|
* llvmpipe: set max cube texture size to 4K x 4KBrian Paul2012-09-172-1/+2
| | | | | | | | Before, the limit was 8K. For 32-bit RGBA that would require 1.5 GB of memory (without mipmaps). That's well beyond the LP_MAX_TEXTURE_SIZE of 1GB. Reviewed-by: Jose Fonseca <[email protected]>
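The arithmetic behind the 1.5 GB figure is short enough to write out. The helper below is a hypothetical illustration that counts only the base mipmap level of all six faces.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for the base level of a cube map: six square faces. */
static uint64_t cube_texture_bytes(uint32_t size, uint32_t bytes_per_texel)
{
    return (uint64_t)size * size * bytes_per_texel * 6u;
}
```

With 4 bytes per texel, an 8K cube map needs 8192 * 8192 * 4 * 6 = 1,610,612,736 bytes (1.5 GiB), while a 4K cube map needs 402,653,184 bytes (384 MiB), which fits under a 1 GiB limit.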
* mesa: move/fix levels check for glTexStorage()Brian Paul2012-09-171-8/+8
| | | | | | Fix copy&paste error and move min levels check closer to max levels check. Reviewed-by: Jose Fonseca <[email protected]>
* mesa: rewrite glTexStorage() codeBrian Paul2012-09-171-74/+79
| | | | | | | | Simplify the code and make it more like the other glTexImage commands. Call _mesa_legal_texture_dimensions() to validate width, height, depth. Call ctx->Driver.TestProxyTexImage() to make sure texture is not too large. Reviewed-by: Jose Fonseca <[email protected]>
* mesa: rework texture size error checkingBrian Paul2012-09-174-225/+161
| There are two aspects to texture image size checking:
|
| 1. Are the width, height, depth legal values (not negative, not larger than the max size for the mipmap level, etc)?
| 2. Is the texture just too large to handle? For example, we might not be able to really allocate memory for a 3D texture of maxSize x maxSize x maxSize.
|
| Previously, we did (1) via the ctx->Driver.TestProxyTexImage() hook but those tests are really device-independent. Now we do (2) via that hook since the max texture memory and texture shape are device-dependent.
|
| Also, (1) is now done outside the general texture parameter error checking functions because of the special interaction with proxy textures. The recently introduced PROXY_ERROR token is removed.
|
| The teximage() and copyteximage() functions are a bit simpler now (less if-then nesting, etc.)
|
| Reviewed-by: Jose Fonseca <[email protected]>
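Check (1) is device-independent and can be sketched as a pure function. The version below is an assumption-laden illustration, not Mesa's _mesa_legal_texture_dimensions(): it ignores borders, non-power-of-two rules, and per-target differences, and it assumes the maximum size halves with each mipmap level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check for one dimension of a texture image.
 * max_levels is the number of mipmap levels supported, so the largest
 * base-level size is 1 << (max_levels - 1). */
static bool legal_dimension(int size, int level, int max_levels)
{
    if (max_levels < 1 || max_levels > 30)
        return false; /* keep the shift below well defined */
    if (level < 0 || level >= max_levels)
        return false;
    int max_size = (1 << (max_levels - 1)) >> level;
    return size >= 0 && size <= max_size;
}
```

Check (2), whether the whole texture is too large to allocate, stays with the driver hook because it depends on device memory and texture shape.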
* mesa: refactor _mesa_test_proxy_teximage() codeBrian Paul2012-09-172-30/+56
| | | | | | | Basically, move the body into a new _mesa_legal_texture_dimensions() function. More refactoring to come. Reviewed-by: Jose Fonseca <[email protected]>
* mesa: move glTexImage 'level' error checkingBrian Paul2012-09-171-22/+10
| | | | | | | Move level checking out of _mesa_test_proxy_teximage() and into the other error-checking functions. Reviewed-by: Jose Fonseca <[email protected]>
* mesa: change create_version_string() return type to voidBrian Paul2012-09-171-1/+1
| | | | Fixes "warning: no return statement in function returning non-void"
* glsl: make _mesa_builtin_uniform_desc staticDave Airlie2012-09-182-3/+1
| | | | | | | I can't see any reason this is global (unless for debugging) Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeon/llvm: Initial flow control support for SITom Stellard2012-09-177-2/+168
| | | | | | This adds basic flow control support for If-Then-Else blocks using predicates (stored in the EXEC register) and a predicate stack for nested flow control.
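As a conceptual model of predicated If-Then-Else with a stack, the sketch below tracks a per-lane execution mask in software. It is only an illustration of the idea; the names are hypothetical and it does not reflect how the SI backend emits or schedules these operations.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEPTH 16

struct flow_state {
    uint64_t exec;             /* one bit per lane; set means the lane is active */
    uint64_t saved[MAX_DEPTH]; /* exec mask saved when each IF was entered */
    uint64_t taken[MAX_DEPTH]; /* lanes that entered the IF side */
    int depth;
};

/* Enter an IF: only lanes that are active and satisfy pred keep running. */
static void flow_if(struct flow_state *s, uint64_t pred)
{
    assert(s->depth < MAX_DEPTH);
    s->saved[s->depth] = s->exec;
    s->taken[s->depth] = s->exec & pred;
    s->exec = s->taken[s->depth];
    s->depth++;
}

/* Switch to the ELSE side: lanes active at the IF that did not take it. */
static void flow_else(struct flow_state *s)
{
    assert(s->depth > 0);
    int top = s->depth - 1;
    s->exec = s->saved[top] & ~s->taken[top];
}

/* Leave the block: restore the mask from before the IF. */
static void flow_endif(struct flow_state *s)
{
    assert(s->depth > 0);
    s->depth--;
    s->exec = s->saved[s->depth];
}
```

Nesting works because each IF pushes the mask it found on entry, so an inner block can never re-enable a lane that an outer block turned off.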
* r600g: Close a memory leak of llvm byte streamsXinya Zhang2012-09-171-0/+1
| | | | | | | No regressions found in the tests of opencl-example/run_tests.sh. Signed-off-by: Xinya Zhang <[email protected]> Signed-off-by: Tom Stellard <[email protected]>
* radeon/llvm: Fix unused variable warningTom Stellard2012-09-171-1/+0
|
* radeon/llvm: Move kernel arg lowering into R600TargetLowering classTom Stellard2012-09-176-470/+35
|
* main/version: consolidate version string creation for ES/Desktop GLJordan Justen2012-09-171-34/+24
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Stop putting 8 NOPs after each program.Eric Anholt2012-09-171-8/+0
| | | | | | | | | | | | As far as I can see, the intention of the requirement that we do so is to prevent instruction prefetch from wandering out into either unmapped memory or memory with a different caching type, and hanging the chip. The kernel makes sure that the page after your BO has a valid page of the same caching type, which meets this requirement, so there's no need to waste space between our programs (and in instruction cache) on this. Saves another 9kb of instructions in l4d2 shaders. Acked-by: Kenneth Graunke <[email protected]>
* i965: Test instruction compaction on gen7Eric Anholt2012-09-172-10/+23
|
* i965: Add support for instruction compaction on Gen7.Kenneth Graunke2012-09-173-33/+220
| | | | | | | | | | Reduces l4d2 program size from 1195kb to 919kb. Improves performance by 0.22% +/- 0.11% (n=70). v2: Rebase on compaction v2, fix up flag reg handling (by anholt). v3: Fix uncompaction of the flag register number. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Support instruction compaction between control flow.Eric Anholt2012-09-171-28/+92
| | | | Reviewed-by: Paul Berry <[email protected]>
* i965: Add support for instruction compaction.Eric Anholt2012-09-1710-8/+946
| | | | | | | | | | | | | | | This reduces program size by using some smaller encodings for common bit patterns in the Gen ISA, with the hope of making programs fit in the instruction cache better. v2: Use larger bitshifts for the uncompressed field setups, in line with the way it's described in the spec. Consistently name a brw_compile "p" like all other code. Add a couple more tests. Consistently call things "compacted" not "compressed" (which is a different feature). Drop the explicit check for not compacting SENDs, which is unjustified and already implied by our lack of support for immediate values. Reviewed-by: Paul Berry <[email protected]>
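The general idea of a smaller encoding for common bit patterns can be sketched as a table lookup: if a field's bits appear in a small table, the field is replaced by the table index. The table contents and names below are made up for illustration; the real tables, field boundaries, and index widths come from the hardware documentation and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of frequently seen bit patterns for one field. */
static const uint32_t common_patterns[] = { 0x00000, 0x00001, 0x00008, 0x40001 };
#define NUM_PATTERNS (sizeof(common_patterns) / sizeof(common_patterns[0]))

/* Try to replace a field's bits with a short table index.
 * Returns false when the pattern is not in the table, in which case the
 * instruction has to stay in its full-size encoding. */
static bool compact_field(uint32_t bits, uint8_t *index_out)
{
    for (size_t i = 0; i < NUM_PATTERNS; i++) {
        if (common_patterns[i] == bits) {
            *index_out = (uint8_t)i;
            return true;
        }
    }
    return false;
}

/* Expand a table index back into the original field bits. */
static uint32_t uncompact_field(uint8_t index)
{
    assert(index < NUM_PATTERNS);
    return common_patterns[index];
}
```

An instruction can be compacted only if every field that needs a table has a match, which is why uncommon encodings remain at full size.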
* i965: Prepare the break/cont uip/jip setting for compacted instructions.Eric Anholt2012-09-171-14/+43
| | | | | | | | | The first cut at instruction compaction won't compact things that would change control flow jump distances, but we do need to still be able to walk the instruction stream, which involves jumping by 8 or 16 bytes between instructions. Reviewed-by: Paul Berry <[email protected]>
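Walking a stream that mixes 8-byte and 16-byte instructions means reading a flag from each instruction before stepping. The sketch below assumes the flag lives in bit 29 of the first dword; treat that position, and all names, as assumptions made for illustration rather than as the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed position of the "this instruction is compacted" flag. */
#define COMPACTED_BIT (1u << 29)

/* Size in bytes of the instruction starting at insn. */
static size_t insn_size(const uint8_t *insn)
{
    uint32_t dw0;
    memcpy(&dw0, insn, sizeof(dw0)); /* avoids unaligned access */
    return (dw0 & COMPACTED_BIT) ? 8 : 16;
}

/* Count the complete instructions in a program of the given byte length. */
static size_t count_insns(const uint8_t *prog, size_t bytes)
{
    size_t n = 0;
    size_t off = 0;
    while (off + 8 <= bytes) {
        size_t sz = insn_size(prog + off);
        if (off + sz > bytes)
            break; /* truncated trailing instruction */
        off += sz;
        n++;
    }
    return n;
}
```

The same walk is what jump-distance fixups need: a distance expressed in instructions has to be converted by summing the actual sizes along the way rather than multiplying by 16.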
* i965: Move program dump to a helper function in brw_eu.c.Eric Anholt2012-09-177-55/+40
| | | | | | | | | It's going to get more complicated when we do instruction compaction. This also introduces putting the program offset in the output. v2: Use next_insn_offset in brw_get_program(), too. Reviewed-by: Paul Berry <[email protected]>
* i965: Make a linkable library for the contents of i965_dri.so.Eric Anholt2012-09-172-5/+13
| | | | | | | | To do unit testing of i965, we want to be able to link against the driver's symbols and prod them. If we don't have a separate lib from our loadable module, libtool gets super whiny. Acked-by: Paul Berry <[email protected]>
* dri: Reuse dri_test.c for stub glapi symbols for unit testing.Eric Anholt2012-09-172-1/+9
| | | | | | | | This file is used to provide stubs for the link test in gallium dri drivers. But the same stubs without the main can be used for making unit tests for code in a dri driver. Acked-by: Paul Berry <[email protected]>
* i965: Clear brw_compile on setup.Eric Anholt2012-09-171-0/+2
| | | | | | | | I noticed in valgrind that p->single_program_flow was used while uninitialized. Everything else zeroed out brw_compile, but this is better API. Reviewed-by: Paul Berry <[email protected]>
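The API choice described here, having the init function zero the whole struct rather than relying on each caller to do so, looks roughly like the sketch below. The struct and its fields are hypothetical stand-ins, not the real brw_compile layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a compile-state struct. */
struct compile_state {
    bool single_program_flow;
    unsigned nr_insn;
    void *mem_ctx;
};

/* Zero everything first, then set the fields the caller supplies, so no
 * field can be read uninitialized regardless of how the struct was
 * allocated. */
static void compile_state_init(struct compile_state *p, void *mem_ctx)
{
    memset(p, 0, sizeof(*p));
    p->mem_ctx = mem_ctx;
}
```

With this shape, a stack-allocated or malloc'ed struct is safe to use right after the init call, which is the kind of uninitialized read valgrind flagged.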
* radeon/llvm: Match integer add/sub for SI.Michel Dänzer2012-09-171-2/+8
| | | | | Signed-off-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeon/llvm: Complete integer comparison patterns for SI.Michel Dänzer2012-09-171-4/+12
| | | | | Signed-off-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeon/llvm: Match AMDGPUfract on SI.Michel Dänzer2012-09-171-1/+3
| | | | | Signed-off-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeon/llvm: Match int_AMDGPU_floor for SI.Michel Dänzer2012-09-171-1/+3
| | | | | Signed-off-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>