path: root/src
Commit message | Author | Age | Files | Lines
* Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNS | Matt Turner | 2012-09-24 | 1 | -19/+2
    signbit() appears to be available everywhere (even MSVC according to MSDN), so let's use it instead of open-coding some messy and confusing bit twiddling macros.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54805
    Reviewed-by: Paul Berry <[email protected]>
    Suggested-by: Ian Romanick <[email protected]>
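The idea in this commit can be sketched as follows. This is a minimal illustration, not the patch itself: the macro bodies are an assumption, and only the names IS_NEGATIVE, DIFFERENT_SIGNS and signbit() come from the commit message.

```c
#include <assert.h>
#include <math.h>

/* Assumed macro bodies, for illustration only.
 * signbit() returns a nonzero value (not necessarily 1) when the sign bit
 * is set, so both results are normalized to 0 or 1 before comparing. */
#define IS_NEGATIVE(x) (signbit(x) != 0)
#define DIFFERENT_SIGNS(x, y) ((signbit(x) != 0) != (signbit(y) != 0))

/* Small self-check showing the behavior, including negative zero. */
void signbit_examples(void)
{
    assert(IS_NEGATIVE(-2.5f));
    assert(!IS_NEGATIVE(2.5f));
    assert(IS_NEGATIVE(-0.0f));          /* the sign bit is set on -0.0 */
    assert(DIFFERENT_SIGNS(-1.0f, 1.0f));
    assert(!DIFFERENT_SIGNS(1.0f, 2.0f));
}
```

Because signbit() inspects the sign bit directly, -0.0f counts as negative here, which a plain `x < 0` comparison would not report.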
* clover: Silence narrowing conversion warnings in resource.cpp. | Francisco Jerez | 2012-09-24 | 1 | -3/+3
* clover: Handle NULL value for clEnqueueNDRangeKernel local_work_size | Tom Stellard | 2012-09-24 | 1 | -7/+6
    [ Francisco Jerez: Slight simplification. ]
* i965/blorp: Increase Y alignment for multisampled stencil blits. | Paul Berry | 2012-09-24 | 1 | -2/+7
    This patch is a band-aid fix for a bug in commit 5fd67fa (i965/blorp: Reduce alignment restrictions for stencil blits), which causes multisampled stencil blits to work incorrectly on Sandy Bridge.
    When blitting to or from a normal stencil buffer, we have to use a coordinate transformation that swizzles coordinates to account for the fact that stencil buffers use W tiling, but the most similar tiling format available for textures and render targets is Y tiling. The differences between W and Y tiling cause pixels to be scrambled within a block of size 8x4 (width x height) as measured relative to a W tile, or 16x2 as measured relative to a Y tile. So in order to make sure that pixels at the edges of the blit aren't lost, we need to align the rendering rectangle (and the buffer sizes) to multiples of the 8x4 block size. This alignment happens in the brw_blorp_blit_params constructor, whereas the determination of how to swizzle the coordinates happens during code generation, in the brw_blorp_blit_program class.
    When blitting to or from a multisampled stencil buffer, the coordinate swizzling is more complex, because it has to account for the interleaving pattern of samples, which uses 4x4 blocks for 4x MSAA and 8x4 blocks for 8x MSAA. The end result is that if multisampling is in use, the 16x2 block size (relative to a Y tile) needs to be expanded to 16x4, and the corresponding size relative to a W tile expands to 8x8.
    The problem doesn't affect Ivy Bridge severely enough to crop up in Piglit tests because on Ivy Bridge we have to disable multisampling when blitting *to* a multisampled stencil buffer (the blorp compiler generates code to compensate for the fact that multisampling is disabled). However, I suspect a bug is still present because we don't disable multisampling when blitting *from* a multisampled stencil buffer.
    This patch fixes the problem by doubling the vertical alignment requirement when blitting to or from a multisampled stencil buffer and multisampling has not been disabled.
    In the long run I would like to rework the brw_blorp_blit_params constructor; it's difficult to follow and has had several subtle bugs like this one. However, this band-aid fix should be suitable for cherry-picking to release branches.
    Fixes Piglit tests "unaligned-blit {2,4} stencil {msaa,upsample}" on Sandy Bridge.
    NOTE: This is a candidate for stable release branches.
    Reviewed-by: Kenneth Graunke <[email protected]>
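The alignment rule described above can be sketched like this. It is a simplified illustration in W-tile coordinates, not the brw_blorp_blit_params code; the struct and function names are hypothetical, and only the 8x4 and 8x8 block sizes come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle type; coordinates are measured relative to a
 * W tile, and (x1, y1) is exclusive. */
struct blit_rect { unsigned x0, y0, x1, y1; };

/* Round v down or up to a multiple of a (a must be a power of two). */
unsigned align_down(unsigned v, unsigned a) { return v & ~(a - 1); }
unsigned align_up(unsigned v, unsigned a)   { return (v + a - 1) & ~(a - 1); }

/* Grow the rectangle to whole scramble blocks: 8x4 for a single-sampled
 * stencil buffer, 8x8 when multisampling is in use. */
struct blit_rect align_stencil_rect(struct blit_rect r, bool multisampled)
{
    const unsigned x_align = 8;
    const unsigned y_align = multisampled ? 8 : 4;

    assert(r.x0 <= r.x1 && r.y0 <= r.y1);
    r.x0 = align_down(r.x0, x_align);
    r.y0 = align_down(r.y0, y_align);
    r.x1 = align_up(r.x1, x_align);
    r.y1 = align_up(r.y1, y_align);
    return r;
}
```

With a rectangle of (3,5)-(10,9), the single-sampled case grows to (0,4)-(16,12), while the multisampled case grows to (0,0)-(16,16), which is the doubled vertical alignment the patch describes.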
* st/mesa: check for zero-size image in st_TestProxyTexImage() | Brian Paul | 2012-09-24 | 1 | -0/+5
    Fixes divide by zero issue in llvmpipe driver.
    Reviewed-by: José Fonseca <[email protected]>
* mesa: Silence narrowing warnings in ff_fragment_shader's emit_texenv(). | Kenneth Graunke | 2012-09-23 | 1 | -4/+4
    Recent versions of GCC report a warning for the implicit conversion from int to float:
    ff_fragment_shader.cpp:897:3: warning: narrowing conversion of '(1 << ((int)rgb_shift))' from 'int' to 'float' inside { } is ill-formed in C++11 [-Wnarrowing]
    This is because floats cannot precisely represent all possible 32-bit integer values. However, texenv code is all expected to be floating point, so this should not be a problem.
    Signed-off-by: Kenneth Graunke <[email protected]>
* radeon/llvm: support for interpolation intrinsics | Vincent Lejeune | 2012-09-22 | 10 | -2/+318
    Reviewed-by: Tom Stellard <[email protected]>
* draw: fix non-indexed draw calls if there's an index buffer | Marek Olšák | 2012-09-22 | 3 | -8/+6
    pipe_draw_info::indexed, not the presence of an index buffer, determines whether a draw call is indexed.
    This fixes crashes in r300g.
    NOTE: This is a candidate for the stable branches.
    Tested-by: Michel Dänzer <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
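A minimal sketch of the distinction this fix relies on, assuming a simplified stand-in for the draw state. Only the `indexed` field name comes from the commit message; the struct layout and the helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the draw state. */
struct draw_info_sketch {
    bool indexed;                       /* the only field taken from the commit */
    const unsigned short *index_buffer; /* may stay bound between draws */
    unsigned start;
};

/* Resolve which vertex the i-th element of a draw call fetches. */
unsigned fetch_vertex_index(const struct draw_info_sketch *info, unsigned i)
{
    /* Test the flag, not the pointer: an index buffer left bound by an
     * earlier draw must not turn a non-indexed draw into an indexed one. */
    if (info->indexed) {
        assert(info->index_buffer != NULL);
        return info->index_buffer[info->start + i];
    }
    return info->start + i;
}
```

Checking `info->index_buffer != NULL` instead of `info->indexed` would read indices for a draw that never asked for them, which is the kind of mistake the commit describes.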
* r600g: Fix build with LLVM compiler | Tom Stellard | 2012-09-21 | 1 | -1/+1
* r600g: set QUANT_MODE on Cayman too | Marek Olšák | 2012-09-22 | 1 | -1/+2
    This fixes piglit/fbo-blit-stretched.
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: use CS helpers to emit streamout state | Marek Olšák | 2012-09-22 | 2 | -33/+14
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: remove initialization of unused loop register tables | Marek Olšák | 2012-09-22 | 2 | -38/+0
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: remove now-unused SURFACE_BASE_UPDATE logic | Marek Olšák | 2012-09-22 | 3 | -9/+3
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: remove unused CB registers from register lists | Marek Olšák | 2012-09-22 | 2 | -87/+0
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: atomize framebuffer state | Marek Olšák | 2012-09-22 | 11 | -868/+664
    Tested on RS880, Evergreen and Cayman.
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: don't snoop context state while building shaders | Marek Olšák | 2012-09-22 | 3 | -28/+43
    Let's use the shader key describing the state.
    Reviewed-by: Alex Deucher <[email protected]>
* meta: Add on demand compilation of per target shader programs | Anuj Phogat | 2012-09-21 | 1 | -57/+84
    A call to glGenerateMipmap() follows the generation of a relevant shader program in setup_glsl_generate_mipmap(). To support all texture targets and to avoid compiling shaders every time, per-target shader programs are compiled on demand and saved for the next call.
    Fixes float-texture(mipmap.manual): See Comment 6: https://bugs.freedesktop.org/show_bug.cgi?id=54296
    NOTE: This is a candidate for stable branches.
    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
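The compile-on-demand scheme can be illustrated with a small cache keyed by texture target. This is only a sketch under assumed names: the enum, the handle values and the helper functions are hypothetical and do not mirror the meta code.

```c
#include <assert.h>

/* Hypothetical texture targets, for illustration only. */
enum tex_target { TARGET_1D, TARGET_2D, TARGET_3D, TARGET_CUBE, TARGET_COUNT };

static unsigned programs[TARGET_COUNT]; /* 0 means "not compiled yet" */
static unsigned compile_count;          /* counts expensive compiles */

/* Stand-in for an expensive shader compile; returns a nonzero handle. */
static unsigned compile_program_for(enum tex_target target)
{
    compile_count++;
    return 100u + (unsigned)target;
}

/* Compile the program for this target on first use, then reuse it. */
unsigned get_mipmap_program(enum tex_target target)
{
    assert(target < TARGET_COUNT);
    if (programs[target] == 0)
        programs[target] = compile_program_for(target);
    return programs[target];
}

unsigned get_compile_count(void) { return compile_count; }
```

Calling get_mipmap_program(TARGET_2D) twice compiles once and returns the same handle both times; a different target triggers its own single compile.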
* clover: Initialize height and depth to 1 for transfers | Tom Stellard | 2012-09-21 | 1 | -1/+1
    Reviewed-by: Francisco Jerez <[email protected]>
* pipe-loader: Remove a few debug_printfs | Tom Stellard | 2012-09-21 | 2 | -4/+0
    On debug builds these were always being printed.
    Reviewed-by: Francisco Jerez <[email protected]>
* radeon/llvm: Handle loads from the constants address space. | Tom Stellard | 2012-09-21 | 2 | -0/+10
    Reading from constant memory is not supported yet, so constant reads use global memory.
* radeon/llvm: Add support for v4f32 stores on R600 | Tom Stellard | 2012-09-21 | 3 | -9/+27
* radeon/llvm: Add support for i8 reads on R600 | Tom Stellard | 2012-09-21 | 3 | -0/+25
* radeon/llvm: Expand vector fadd and fmul on R600 | Tom Stellard | 2012-09-21 | 1 | -0/+3
* radeon/llvm: Add optimization for FP_ROUND | Tom Stellard | 2012-09-21 | 2 | -0/+27
* radeon/llvm: Replace AMDGPU pow intrinsic with the llvm version | Tom Stellard | 2012-09-21 | 4 | -7/+26
* i965/blorp: Fix narrowing warnings. | Paul Berry | 2012-09-21 | 1 | -3/+3
    Blorp has to convert rectangle coordinates from integers to floats in order to send them down the GPU pipeline. Recent versions of GCC issue a warning for this, since a float is not capable of precisely representing all possible 32-bit integer values. Suppress the warning with an explicit type cast in the case of blorp, since rectangle coordinates will never be large enough to cause a loss of precision.
    Reviewed-by: Eric Anholt <[email protected]>
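The precision argument can be made concrete with a short sketch. It assumes IEEE 754 single-precision floats, whose 24-bit significand represents every integer up to 2^24 exactly; the helper name and the assertion are illustrative and are not part of the patch.

```c
#include <assert.h>

/* Every integer with magnitude up to 2^24 is exactly representable in an
 * IEEE 754 single-precision float (assumed float format). */
#define FLOAT_EXACT_INT_MAX (1 << 24)

/* Convert a rectangle coordinate with an explicit cast, which states that
 * the int-to-float conversion is intentional. The assert documents the
 * range in which the conversion is lossless. */
float coord_to_float(int coord)
{
    assert(coord >= -FLOAT_EXACT_INT_MAX && coord <= FLOAT_EXACT_INT_MAX);
    return (float)coord;
}
```

The first integer that does not survive the round trip is 2^24 + 1 = 16777217, far beyond any plausible rectangle coordinate, which is why the explicit cast is safe here.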
* i965: Remove brw_set_predicate_inverse(p, true) from scratch offset code | Kenneth Graunke | 2012-09-21 | 1 | -1/+0
    Given that it exists between a push/pop of instruction state, this call can only affect the MOV or ADD instruction generated just below it. Neither of those instructions is predicated, so it makes no sense to ask for the inverse predicate.
    This fixes grumblings from the simulator debugger, which was complaining about an invalid predicate.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* mesa: Don't override S3TC internalFormat if data is pre-compressed. | Kenneth Graunke | 2012-09-20 | 1 | -1/+1
    Commit 42723d88d intended to override an S3TC internalFormat to a generic compressed format when the application requested online compression of uncompressed data. Unfortunately, it also broke pre-compressed textures when libtxc_dxtn isn't installed but the extensions are forced on.
    Both glCompressedTexImage2D() and glTexImage2D() call teximage(), which calls _mesa_choose_texture_format(), hitting this override code. If we have actual S3TC source data, we can't treat it as any other format, and need to avoid the override.
    Since glCompressedTexImage2D() passes in a format of GL_NONE (which is illegal for glTexImage), we can use that to detect the pre-compressed case and avoid the overrides.
    Fixes a regression since 42723d88d370a7599398cc1c2349aeb951ba1c57.
    NOTE: This is a candidate for the 9.0 branch.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-and-tested-by: Jordan Justen <[email protected]>
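The detection trick in the S3TC fix can be sketched as below. All enum and function names are hypothetical stand-ins for the GL constants and for _mesa_choose_texture_format(), and the compressor-availability flag is an assumption about when the override applies; only the use of a GL_NONE source format to detect pre-compressed data comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GL source formats and internal formats. */
enum src_fmt { SRC_NONE, SRC_RGBA };
enum internal_fmt { IFMT_S3TC_DXT1, IFMT_GENERIC_COMPRESSED_RGB };

/* Pick the internal format to store. src_format is SRC_NONE when the
 * caller supplies pre-compressed data (the compressed-upload path). */
enum internal_fmt choose_internal_format(enum internal_fmt requested,
                                         enum src_fmt src_format,
                                         bool have_s3tc_compressor)
{
    bool precompressed = (src_format == SRC_NONE);

    /* Fall back to a generic format only when online compression would be
     * needed and no compressor is available (assumed condition).
     * Pre-compressed S3TC data must keep its exact format. */
    if (requested == IFMT_S3TC_DXT1 && !precompressed && !have_s3tc_compressor)
        return IFMT_GENERIC_COMPRESSED_RGB;
    return requested;
}
```

The key point is the `precompressed` test: data that is already S3TC cannot be reinterpreted as another format, so the override has to be skipped for it.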
* i965/blorp: Add support for blits between SRGB and linear formats. | Kenneth Graunke | 2012-09-20 | 2 | -4/+8
    Fixes colorspace issues in L4D2 when multisampling is enabled (the scene was far too dark, but the flashlight area was way too bright).
    The nVidia and AMD binary drivers both allow this kind of blit.
    NOTE: This is a candidate for the 9.0 branch.
    Reviewed-by: Paul Berry <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
* mesa: Ignore SRGB when determining compatible resolve formats. | Kenneth Graunke | 2012-09-20 | 1 | -1/+2
    MSAA resolves and other blit-like operations ignore SRGB state anyway, so we should be able to safely allow resolves between compatible SRGB/linear formats like SRGBA8 and RGBA8888.
    This matches the behavior of the nVidia and AMD binary drivers.
    Fixes completely black rendering when using multisampling in L4D2.
    NOTE: This is a candidate for the 9.0 branch.
    Reviewed-by: Paul Berry <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
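One way to picture "ignoring SRGB" in a compatibility check is to map each format to its linear equivalent before comparing. The sketch below uses a hypothetical format enum and helper names; it is not the Mesa implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pixel formats, for illustration only. */
enum pix_format { PF_RGBA8, PF_SRGBA8, PF_RGB8, PF_SRGB8, PF_R8 };

/* Map an sRGB format to the linear format with the same layout. */
enum pix_format linear_equivalent(enum pix_format f)
{
    switch (f) {
    case PF_SRGBA8: return PF_RGBA8;
    case PF_SRGB8:  return PF_RGB8;
    default:        return f;
    }
}

/* Two formats are resolve-compatible if they match once sRGB is ignored. */
bool resolve_formats_compatible(enum pix_format src, enum pix_format dst)
{
    return linear_equivalent(src) == linear_equivalent(dst);
}
```

Under this rule an SRGBA8 source can resolve into an RGBA8 destination, while formats with different layouts still compare as incompatible.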
* gallium: mention PIPE_TIMEOUT_INFINITE in the fence_finish() comment | Brian Paul | 2012-09-20 | 1 | -1/+1
* llvmpipe: fix overflow bug in total texture size computation | Brian Paul | 2012-09-20 | 1 | -2/+16
    v2: use uint64_t for the total_size variable, per Jose. Also add two earlier checks for exceeding the max texture size. For example, a 1K^3 RGBA volume would overflow the lpr->image_stride variable.
    Use simple algebra to avoid overflow in intermediate values. So instead of "x * y > z" use "x > z / y". This should work if we happen to be on a platform that doesn't have 64-bit types.
    Reviewed-by: Jose Fonseca <[email protected]>
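The "x > z / y" rewrite from the message can be sketched as a small helper. The function name is hypothetical; the technique is the one the commit describes, using only 32-bit arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when x * y would exceed limit, without computing x * y,
 * so no intermediate value can overflow 32 bits. */
bool product_exceeds(uint32_t x, uint32_t y, uint32_t limit)
{
    if (y == 0)
        return false;       /* x * 0 == 0 never exceeds an unsigned limit */
    /* For positive integers, x * y > limit is equivalent to
     * x > limit / y with truncating division. */
    return x > limit / y;
}
```

For the example in the message, a 1K^3 volume has 2^30 texels, and product_exceeds(1u << 30, 4, UINT32_MAX) reports that multiplying by 4 bytes per RGBA texel would not fit in 32 bits.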
* r600g/llvm: rs780/rs880 are r600 asics | Alex Deucher | 2012-09-20 | 1 | -2/+2
    Signed-off-by: Alex Deucher <[email protected]>
* mesa: Allow glGetTexParameter of GL_TEXTURE_SRGB_DECODE_EXT | Ian Romanick | 2012-09-20 | 1 | -0/+12
    This was already (correctly) supported for glGetSamplerParameter paths.
    NOTE: This is a candidate for stable branches.
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* r300/compiler: Use precomputed q values in the register allocator | Tom Stellard | 2012-09-19 | 1 | -1/+69
* r300g: Init regalloc state during context creation | Tom Stellard | 2012-09-19 | 8 | -155/+204
    Initializing the regalloc state is expensive, and since it is always the same for every compile we only need to initialize it once per context. This should help improve shader compile times for the driver.
* r300/compiler: Don't create register classes for inputs | Tom Stellard | 2012-09-19 | 1 | -14/+1
* ra: Add q_values parameter to ra_set_finalize() | Tom Stellard | 2012-09-19 | 5 | -5/+18
    This allows the user to pass precomputed q values to the allocator.
    Reviewed-by: Kenneth Graunke <[email protected]>
* ra: Clarify usage of ra_set_node_reg() | Tom Stellard | 2012-09-19 | 1 | -0/+2
    Reviewed-by: Kenneth Graunke <[email protected]>
* r600g: Invalidate texture cache when creating vertex buffers for compute v2 | Tom Stellard | 2012-09-19 | 1 | -1/+3
    Compute shaders fetch data from vertex buffers via the texture cache, so we need to make sure the texture cache is flushed.
    v2:
    - Fix rebase mistake
    - Fix spelling in comment
    Reviewed-by: Marek Olšák <[email protected]>
* r600g: Use LOOP_START_DX10 for loops | Tom Stellard | 2012-09-19 | 3 | -2/+11
    LOOP_START_DX10 ignores the LOOP_CONFIG* registers, so it is not limited to 4096 iterations like the other LOOP_* instructions. Compute shaders need to use this instruction, and since we aren't optimizing loops with the LOOP_CONFIG* registers for pixel and vertex shaders, it seems like we should just use it for everything.
    Reviewed-by: Marek Olšák <[email protected]>
* r600g: Set the correct value of COLOR*_DIM for RATs | Tom Stellard | 2012-09-19 | 1 | -2/+2
    For buffers (which is what is being used for RATs), the COLOR*_DIM.WIDTH_MASK field needs to be set to the low 16 bits of the buffer size, and the COLOR*_DIM.HEIGHT_MAX field needs to be set to the high bits.
    Reviewed-by: Marek Olšák <[email protected]>
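The split of a buffer size into the two fields can be sketched as plain bit arithmetic. The struct and function below are hypothetical and do not reproduce the actual register encoding or field macros; only the "low 16 bits" and "high bits" split comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the two values written into COLOR*_DIM. */
struct color_dim_sketch {
    uint32_t width_field;   /* low 16 bits of the buffer size */
    uint32_t height_field;  /* remaining high bits of the buffer size */
};

/* Split a buffer size into the low 16 bits and the bits above them. */
struct color_dim_sketch split_buffer_size(uint32_t size)
{
    struct color_dim_sketch f;
    f.width_field = size & 0xffffu;
    f.height_field = size >> 16;
    return f;
}
```

For a size of 0x12345, the width field receives 0x2345 and the height field receives 0x1; recombining them as (height << 16) | width gives back the original size.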
* r600g: Make sure to initialize DB_DEPTH_CONTROL register for compute | Tom Stellard | 2012-09-19 | 1 | -1/+3
    The kernel CS checker will fail if this register is not initialized.
    Reviewed-by: Marek Olšák <[email protected]>
* r600g: Add some comments and debug printfs to compute code | Tom Stellard | 2012-09-19 | 2 | -5/+53
    Reviewed-by: Marek Olšák <[email protected]>
* r600g: Add missing break to case statement | Tom Stellard | 2012-09-19 | 1 | -0/+1
* radeon/llvm: Emit ISA for ALU instructions in the R600 code emitter | Michal Sciubidlo | 2012-09-19 | 10 | -167/+359
    Signed-off-by: Tom Stellard <[email protected]>
* radeon/llvm: Only support 512 constant registers on R600 | Tom Stellard | 2012-09-19 | 1 | -1/+1
    This is necessary for upcoming encoding changes, since we will only be using 9 bits for register encoding.
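The link between 9 bits and 512 registers is simply 2^9 = 512. A small sketch with hypothetical macro and function names, not taken from the backend:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; 9 encoding bits give 2^9 = 512 distinct indices. */
#define CONST_REG_BITS  9
#define CONST_REG_COUNT (1u << CONST_REG_BITS)

/* True when a constant register index fits in the 9-bit field. */
bool const_reg_encodable(unsigned index)
{
    return index < CONST_REG_COUNT;
}

/* Return the 9-bit field value for an encodable register index. */
unsigned encode_const_reg(unsigned index)
{
    assert(const_reg_encodable(index));
    return index & (CONST_REG_COUNT - 1u);
}
```

Index 511 is the largest value that fits; index 512 would need a tenth bit, which is why the register count has to be capped.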
* Revert "mesa: consolidate subtexture x/y/width/height error checking code" | Brian Paul | 2012-09-19 | 1 | -73/+84
    This reverts commit 5b807400a87d5efefc481017eb420b772933e1da.
    accidentally pushed.
* Revert "more comment" | Brian Paul | 2012-09-19 | 1 | -2/+4
    This reverts commit 5205db6a7ce623a7fca72e6dc6391bd12be3f6aa.
    accidentally pushed
* Revert "mesa: clean-up and fix glCompressedTexSubImage error checking" | Brian Paul | 2012-09-19 | 1 | -81/+64
    This reverts commit 0c67fe5d2dc6d8066fc23c39184d9614abf63992.
    accidentally pushed.