path: root/src
Commit message | Author | Age | Files | Lines
* st/nine: Split NineSurface9_CopySurface | Axel Davy | 2015-08-21 | 5 | -125/+169
  NineSurface9_CopySurface supported more cases than we needed and performed checks that were inappropriate for some of its use cases. This patch splits it in two, one function per use case, and moves the checks to the caller. It also adds a few checks to NineDevice9_UpdateSurface.
  Signed-off-by: Axel Davy <[email protected]>
* st/nine: Simplify Volume9 dirty region tracking | Axel Davy | 2015-08-21 | 3 | -67/+35
  Similar to what was done for Surface9, track the dirty region only in VolumeTexture9.
  Signed-off-by: Axel Davy <[email protected]>
* util/u_blitter: implement alpha blending for pipe->blit | Marek Olšák | 2015-08-21 | 7 | -23/+49
* gallium: Add blending to pipe blit | Christoph Bumiller | 2015-08-21 | 4 | -0/+5
  This type of blending is used for the gallium nine software cursor.
  Signed-off-by: David Heidelberg <[email protected]>
* st/nine: Revert to sw cursor in case of failure to set hw cursor | Axel Davy | 2015-08-21 | 1 | -2/+2
  Signed-off-by: Axel Davy <[email protected]>
  Reviewed-by: David Heidelberg <[email protected]>
* st/nine: Do not call ID3DPresent_GetCursorPos for sw cursor | Axel Davy | 2015-08-21 | 1 | -3/+4
  For the sw cursor we do not tell wine the cursor position (the app tells us directly), so we shouldn't use ID3DPresent_GetCursorPos. device->cursor.pos already contains the coordinates the app gave us.
  Signed-off-by: Axel Davy <[email protected]>
  Reviewed-by: David Heidelberg <[email protected]>
* st/nine: Force hw cursor for Windowed mode | Axel Davy | 2015-08-21 | 1 | -3/+9
  According to the spec, Windowed mode must use the hw cursor.
  Signed-off-by: Axel Davy <[email protected]>
  Reviewed-by: David Heidelberg <[email protected]>
* st/nine: Hide hardware cursor when we don't use it | Axel Davy | 2015-08-21 | 1 | -6/+12
  We use either the hardware cursor or the software cursor. When we use the software cursor, we should hide the hardware cursor.
  Signed-off-by: Axel Davy <[email protected]>
  Reviewed-by: David Heidelberg <[email protected]>
* st/nine: fix D3DRS_DITHERENABLE wrong state group | Axel Davy | 2015-08-21 | 1 | -1/+1
  D3DRS_DITHERENABLE was assigned to the rasterizer state group, but it is used by the blend group. Assign it to the blend group.
  Signed-off-by: Axel Davy <[email protected]>
* st/nine: Account POINTSIZE_MIN and POINTSIZE_MAX for point size | Patrick Rudolph | 2015-08-21 | 3 | -14/+19
  When using D3DRS_POINTSIZE, make sure the value is at least D3DRS_POINTSIZE_MIN and no greater than D3DRS_POINTSIZE_MAX. Fixes some Wine tests.
  Reviewed-by: Axel Davy <[email protected]>
  Signed-off-by: Patrick Rudolph <[email protected]>
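The clamping described in this commit can be sketched roughly as follows. This is only an illustration: `clamp_point_size` and its parameters are hypothetical names, not the identifiers used in the patch.

```c
/* Hypothetical helper (not from the patch): clamp a requested point size
 * into the range given by the POINTSIZE_MIN / POINTSIZE_MAX render states. */
float
clamp_point_size(float size, float size_min, float size_max)
{
    if (size < size_min)
        return size_min;
    if (size > size_max)
        return size_max;
    return size;
}
```

The lower bound is checked first, so if an application sets a minimum above the maximum, this sketch returns the minimum; whether the real driver resolves that case the same way is not stated in the commit message.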
* st/nine: Align texture memory | Patrick Rudolph | 2015-08-21 | 4 | -6/+6
  Align texture memory on a 32-byte boundary to allow SSE/AVX memcpy to work on locked rects. This fixes some crashes with games using SSE.
  Reviewed-by: David Heidelberg <[email protected]>
  Reviewed-by: Axel Davy <[email protected]>
  Signed-off-by: Patrick Rudolph <[email protected]>
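As a rough illustration of what 32-byte alignment means for an allocation, here is a minimal sketch using C11 `aligned_alloc` as a stand-in. The patch most likely uses Mesa's own aligned allocation helpers instead; `alloc_texture_memory` and `TEX_ALIGNMENT` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed constant name; 32 bytes matches the width of an AVX register. */
#define TEX_ALIGNMENT 32

/* Hypothetical helper: return a buffer whose address is a multiple of
 * TEX_ALIGNMENT. aligned_alloc() requires the size to be a multiple of
 * the alignment, so the size is rounded up first. Free with free(). */
void *
alloc_texture_memory(size_t size)
{
    size_t padded = (size + TEX_ALIGNMENT - 1) & ~(size_t)(TEX_ALIGNMENT - 1);
    return aligned_alloc(TEX_ALIGNMENT, padded);
}
```

Aligned SSE/AVX loads and stores fault on addresses that are not suitably aligned, which is consistent with the crashes the commit message mentions.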
* st/nine: Always set point_quad_rasterization to 1 | Axel Davy | 2015-08-21 | 1 | -1/+1
  Both points and point sprites are rasterized like quads, according to the d3d9 doc and the gallium rasterizer doc.
  Signed-off-by: Axel Davy <[email protected]>
* st/nine: Fix Swizzle for ATI2 format | Axel Davy | 2015-08-21 | 1 | -0/+5
  We had red and green in the wrong channels for the ATI2 format (RGTC2). Found thanks to wine tests.
  Signed-off-by: Axel Davy <[email protected]>
  Reviewed-by: David Heidelberg <[email protected]>
* target/d3dadapter9: Return Windows like card names | Patrick Rudolph | 2015-08-21 | 3 | -42/+359
  Add support for multiple cards and fill in Windows-like card name, driver name and version info. Use a fallback for unknown vendors and unknown card names.
  Reviewed-by: Axel Davy <[email protected]>
  Signed-off-by: Patrick Rudolph <[email protected]>
* glsl: fix error message when validating tcs output decls | Ilia Mirkin | 2015-08-21 | 1 | -1/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: pass through 4th opcode argument in bitmap/pixel visitors | Ilia Mirkin | 2015-08-21 | 1 | -6/+6
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "10.6" <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: fix assignments with 4-operand arguments (i.e. BFI) | Ilia Mirkin | 2015-08-21 | 1 | -1/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "10.6" <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* i965: allow image_size on float images | Martin Peres | 2015-08-21 | 1 | -1/+2
  This got missed because the piglit test only tested int images, to avoid a combinatorial explosion of formats, targets, stages and sizes that takes more than 5 minutes to test on nvidia's driver.
  This patch also drops IMAGE_FUNCTION_AVAIL_ATOMIC, which is not applicable to the image_size codepath but was not hurting in any way.
  Signed-off-by: Martin Peres <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* clover: fix llvm 3.5 build error | Zoltan Gilian | 2015-08-21 | 1 | -12/+21
  There is no MDOperand in llvm 3.5.
  v2: Check if kernel metadata is present to avoid crash (EdB).
  v3: Second attempt to avoid crash: switch off metadata query for llvm < 3.6.
  Reviewed-by: Serge Martin (EdB) <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* mesa: update fbo state in glTexStorage | Tapani Pälli | 2015-08-21 | 1 | -0/+15
  We have to re-validate FBOs rendering to the texture, as is done with TexImage and CopyTexImage.
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91673
  Cc: "10.6" <[email protected]>
* vc4: Add algebraic opt for rcp(1.0). | Eric Anholt | 2015-08-20 | 1 | -0/+8
  We're generating rcps as part of backend lowering of the packed coordinate in the CS, and we don't want to lower them in NIR because of the extra Newton-Raphson steps in the common case. However, GLB2.7 is moving a vertex attribute with a 1.0 W component to the position, and that makes us produce some silly RCPs.
  total instructions in shared programs: 97590 -> 97580 (-0.01%)
  instructions in affected programs: 74 -> 64 (-13.51%)
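The idea behind folding rcp(1.0) can be sketched like this. The `struct operand` type and `fold_rcp_one` function are hypothetical simplifications for illustration and do not reflect the actual QIR data structures.

```c
#include <stdbool.h>

/* Hypothetical, simplified operand: either a known constant or an
 * unknown register value. */
struct operand {
    bool is_const;
    float value; /* only meaningful when is_const is true */
};

/* If the source of a reciprocal is the constant 1.0, the result is also
 * 1.0, so the RCP can be replaced by a plain move of that constant.
 * Returns true and fills *out when the fold applies. */
bool
fold_rcp_one(struct operand src, struct operand *out)
{
    if (src.is_const && src.value == 1.0f) {
        out->is_const = true;
        out->value = 1.0f;
        return true;
    }
    return false;
}
```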
* vc4: Allow unpack_8[abcd]_f's src to stay in r4. | Eric Anholt | 2015-08-20 | 1 | -1/+15
  I had QPU emit code to do it, but forgot to flag the register class.
  total instructions in shared programs: 97974 -> 97590 (-0.39%)
  instructions in affected programs: 25291 -> 24907 (-1.52%)
* vc4: Pack the unorm-packing bits into a src MUL instruction when possible. | Eric Anholt | 2015-08-20 | 5 | -16/+104
  Now that we do non-SSA QIR instructions, we can take a NIR SSA src that's only used by the unorm packing and just stuff the pack bits into it.
  total instructions in shared programs: 98136 -> 97974 (-0.17%)
  instructions in affected programs: 4149 -> 3987 (-3.90%)
* vc4: Add a QIR helper for whether the op is a MUL type. | Eric Anholt | 2015-08-20 | 3 | -4/+16
* vc4: Drop an unused algebraic op. | Eric Anholt | 2015-08-20 | 1 | -9/+0
  NIR now handles this optimization for us.
* vc4: Switch QPU_PACK_SCALED to be two non-SSA instructions. | Eric Anholt | 2015-08-20 | 5 | -21/+19
  total instructions in shared programs: 98159 -> 98136 (-0.02%)
  instructions in affected programs: 12279 -> 12256 (-0.19%)
* vc4: Make the pack-to-unorm instructions be non-SSA. | Eric Anholt | 2015-08-20 | 4 | -42/+36
  This helps ensure that the register allocator doesn't force the later pack operations to insert extra MOVs.
  total instructions in shared programs: 98170 -> 98159 (-0.01%)
  instructions in affected programs: 2134 -> 2123 (-0.52%)
* vc4: Allow QIR registers to be non-SSA. | Eric Anholt | 2015-08-20 | 4 | -4/+10
  Now that we have NIR, most of the optimization we still need to do is peepholes on instruction selection rather than general dataflow operations. This means we want to be able to have QIR be a lot closer to the actual QPU instructions, just with virtual registers. Allowing multiple instructions writing the same register opens up a lot of possibilities.
* vc4: We can now move TEX_RESULT accesses across other r4 ops. | Eric Anholt | 2015-08-20 | 1 | -16/+0
  No difference on shader-db.
* glsl: fix binding validation for interface blocks | Timothy Arceri | 2015-08-21 | 1 | -12/+18
  V2: rebase on SSBO changes
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: interleave constant propagation and folding | Timothy Arceri | 2015-08-21 | 1 | -2/+43
  The constant folding pass can take a long time to complete, so rather than running through the entire pass each time a new constant is propagated (and vice versa), interleave them.
  This change helps ES31-CTS.arrays_of_arrays.InteractionFunctionCalls1 go from around 2 min to 23 sec.
  Reviewed-by: Ian Romanick <[email protected]>
* nv50/ir: pre-compute BFE arg when both bits and offset are imm | Ilia Mirkin | 2015-08-20 | 1 | -3/+9
  Due to a quirk in how the nv50 opt passes run, the algebraic optimization that looks for these BFE's happens before the constant folding pass. Rearranging these passes isn't a great idea, but this is easy enough to fix.
  Allows a following cvt to eliminate the bfe in certain situations.
  Signed-off-by: Ilia Mirkin <[email protected]>
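When both the bit count and the offset are immediates, the combined extract argument can be computed once up front instead of being assembled by a separate instruction. The sketch below assumes a packing of the offset in bits 0..7 and the width in bits 8..15; that layout and the name `pack_bfe_arg` are assumptions for illustration, not details confirmed by the commit message.

```c
#include <stdint.h>

/* Assumed encoding: field offset in bits 0..7, field width in bits 8..15.
 * Hypothetical helper that builds the combined operand from two
 * immediates known at compile time. */
uint32_t
pack_bfe_arg(uint32_t offset, uint32_t bits)
{
    return ((bits & 0xffu) << 8) | (offset & 0xffu);
}
```

With the operand available as a plain immediate, later peephole passes can see the exact field being extracted without waiting for constant folding.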
* glsl: expose textureQueryLod in GLSL 4.00+ fragment shaders | Ilia Mirkin | 2015-08-20 | 1 | -37/+82
  See issue from the ARB_texture_query_lod spec for LOD vs Lod confusion:
  (3) The core specification uses the "Lod" spelling, not "LOD". Should this extension be modified to use "Lod"?
  RESOLVED: The "Lod" spelling is the correct spelling for the core specification and the preferred spelling for use. However, use of "LOD" also exists, as the extension predated the core specification, so this extension won't remove use of "LOD".
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* Revert "mesa/formats: refactor by collapsing cases in switch statement by type" | Nanley Chery | 2015-08-20 | 1 | -17/+135
  This reverts commit ffe6c6ad5f719dedd1b6b95e8590e3f20b23d340.
  _mesa_format_num_components() does not include the padding bits in mesa formats containing 'X' channels. This could cause mipmap generation for certain uncompressed formats to underestimate the number of channels in the source image by 1.
  Signed-off-by: Nanley Chery <[email protected]>
* r600g: Fix handling of TGSI_OPCODE_ARR with SB | Glenn Kennard | 2015-08-21 | 1 | -1/+1
  FLT_TO_INT goes in the vector pipes on evergreen/NI, not the trans unit as on earlier chips.
  Signed-off-by: Glenn Kennard <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* r600: Turn 'r600_shader_key' struct into union | Edward O'Callaghan | 2015-08-21 | 4 | -38/+42
  This struct was getting a bit crowded. Following the lead of radeonsi, mirror the idea of having sub-structures for each shader type. Turning 'r600_shader_key' into a union saves some trivial memory and CPU cycles for the shader keys.
  [airlied: drop as_ls, and reorder so larger fields at start.]
  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
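A minimal sketch of the pattern, with one sub-structure per shader type sharing the same storage. The type, field and function names below are illustrative assumptions and are not copied from the r600 sources.

```c
#include <string.h>

/* Illustrative shader stages; the real driver has more. */
enum shader_type { SHADER_VERTEX, SHADER_FRAGMENT };

/* One sub-structure per shader type. Only the member matching the
 * shader's type is meaningful, so all members can share storage. */
union shader_key {
    struct {
        unsigned nr_cbufs : 4;
        unsigned color_two_side : 1;
    } ps;
    struct {
        unsigned as_es : 1;
    } vs;
};

/* Hypothetical key builder that selects the sub-structure with a switch
 * on the shader type. */
union shader_key
make_shader_key(enum shader_type type, unsigned nr_cbufs, unsigned as_es)
{
    union shader_key key;
    memset(&key, 0, sizeof(key));
    switch (type) {
    case SHADER_FRAGMENT:
        key.ps.nr_cbufs = nr_cbufs;
        break;
    case SHADER_VERTEX:
        key.vs.as_es = as_es;
        break;
    }
    return key;
}
```

Zeroing the whole union before filling one member keeps the unused bits deterministic, which matters if keys are compared bytewise.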
* r600: Rewrite r600_shader_selector_key() to use a switch stmt | Edward O'Callaghan | 2015-08-21 | 1 | -7/+17
  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* i965: Use NIR by default for vertex shaders | Jason Ekstrand | 2015-08-20 | 1 | -2/+2
  Shader-db results for vec4 on i965:
    total instructions in shared programs: 1499894 -> 1502261 (0.16%)
    instructions in affected programs: 1414224 -> 1416591 (0.17%)
    helped: 2434
    HURT: 10543
    GAINED: 1
    LOST: 0
  Shader-db results for vec4 on g4x:
    total instructions in shared programs: 1437411 -> 1439779 (0.16%)
    instructions in affected programs: 1362402 -> 1364770 (0.17%)
    helped: 2434
    HURT: 10544
    GAINED: 0
    LOST: 0
  Shader-db results for vec4 on Iron Lake:
    total instructions in shared programs: 1437214 -> 1439593 (0.17%)
    instructions in affected programs: 1362205 -> 1364584 (0.17%)
    helped: 2433
    HURT: 10544
    GAINED: 1
    LOST: 0
  Shader-db results for vec4 on Sandy Bridge:
    total instructions in shared programs: 2022092 -> 1941570 (-3.98%)
    instructions in affected programs: 1886838 -> 1806316 (-4.27%)
    helped: 7510
    HURT: 10737
    GAINED: 0
    LOST: 0
  Shader-db results for vec4 on Ivy Bridge:
    total instructions in shared programs: 1853749 -> 1804960 (-2.63%)
    instructions in affected programs: 1686736 -> 1637947 (-2.89%)
    helped: 6735
    HURT: 11101
    GAINED: 0
    LOST: 0
  Shader-db results for vec4 on Haswell:
    total instructions in shared programs: 1853749 -> 1804960 (-2.63%)
    instructions in affected programs: 1686736 -> 1637947 (-2.89%)
    helped: 6735
    HURT: 11101
    GAINED: 0
    LOST: 0
  Signed-off-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
  Acked-by: Matt Turner <[email protected]>
* glsl: check if return_deref in lower_subroutine_visitor::visit_leave isn't NULL | Kai Wasserbäch | 2015-08-21 | 1 | -1/+1
  Fixes a crash in Piglit's spec@arb_shader_subroutine@[email protected] for me.
  Signed-off-by: Kai Wasserbäch <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* nv50/ir: Handle OP_CVT when folding constant expressions | Tobias Klausmann | 2015-08-20 | 1 | -0/+78
  [imirkin: handle more type combinations, use macro]
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: undo more shifts still by allowing a pre-SHL to occur | Ilia Mirkin | 2015-08-20 | 1 | -15/+33
  This happens with unpackSnorm lowering. There's yet another bitfield-extract behind it, but there's too much variation to be worth cutting through.
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: don't require AND when the high byte is being addressed | Ilia Mirkin | 2015-08-20 | 1 | -0/+12
  unpackUnorm* lowering doesn't AND the high byte/word as it's unnecessary. Detect that situation as well.
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: detect i2f/i2i which operate on specific bytes/words | Ilia Mirkin | 2015-08-20 | 4 | -4/+82
  Some Unigine shaders have been observed to unpack bytes out of 32-bit integers and convert them to floats. I2F/I2I can handle this sort of thing directly. Detect the handleable situations.
  This misses 16-bit word capabilities in nv50, but I haven't seen shaders that would actually make use of that.
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: detect AND/SHR pairs and convert into EXTBF | Ilia Mirkin | 2015-08-20 | 1 | -20/+46
  Some shaders appear to extract bits using shift/and combos. Detect some of those and convert them to EXTBF instead.
  Signed-off-by: Ilia Mirkin <[email protected]>
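The equivalence this kind of peephole relies on can be sketched in plain C. A shift right followed by an AND with a mask of the form (1 << w) - 1 is the same as extracting a w-bit field at the shift offset. The function names are hypothetical and the actual pass matches IR instructions rather than C expressions.

```c
#include <stdint.h>

/* Return the field width w if mask has the form (1 << w) - 1 with
 * w >= 1, that is, a contiguous run of ones starting at bit 0.
 * Return 0 if the mask cannot be expressed as a bitfield width. */
unsigned
mask_to_width(uint32_t mask)
{
    unsigned width = 0;

    if (mask == 0 || (mask & (mask + 1)) != 0)
        return 0;
    while (mask) {
        width++;
        mask >>= 1;
    }
    return width;
}

/* Unsigned bitfield extract: width bits of x starting at bit offset.
 * Assumes offset < 32 and 1 <= width <= 32. */
uint32_t
bitfield_extract(uint32_t x, unsigned offset, unsigned width)
{
    uint32_t mask = (width >= 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (x >> offset) & mask;
}
```

For example, `(x >> 8) & 0xff` has a mask of width 8, so it corresponds to `bitfield_extract(x, 8, 8)`; a mask such as `0xf0` is rejected by `mask_to_width` because its ones do not start at bit 0.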
* nv50/ir: support different unordered_set implementations | Chih-Wei Huang | 2015-08-20 | 5 | -12/+57
  If built with the C++11 standard, use std::unordered_set.
  Otherwise, if built on an old Android version with stlport, use std::tr1::unordered_set with a wrapper class.
  Otherwise, use std::tr1::unordered_set.
  Signed-off-by: Chih-Wei Huang <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* i965: Fix "handle nir_intrinsic_image_size" | Martin Peres | 2015-08-20 | 1 | -4/+3
  I pushed a half-baked version of "i965: handle nir_intrinsic_image_size" by accident. Not having the Reviewed-by: tags on the last two commits should have been a red flag but I somehow missed it after the QA check.
  This patch should fix image-size for non-int images. I will add support to the piglit test for all the other image types.
  Sorry for the noise.
  Signed-off-by: Martin Peres <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* i965: enable GL_ARB_shader_image_size | Martin Peres | 2015-08-20 | 1 | -0/+1
  Signed-off-by: Martin Peres <[email protected]>
* i965: handle nir_intrinsic_image_size | Martin Peres | 2015-08-20 | 1 | -0/+46
  v2, Review from Francisco Jerez:
  - avoid the camelCase for the booleans
  - init the booleans using the sampler type
  - force the initialization of all the components of the output register
  v3:
  - Rename a variable from CubeMapArray to CubeArray to re-use GLSL's name (Ilia)
  - Fix some indentation and drop parenthesis (Topi)
  - Fix a signed/unsigned comparison warning
  Signed-off-by: Martin Peres <[email protected]>
* nir: convert the glsl intrinsic image_size to nir_intrinsic_image_size | Martin Peres | 2015-08-20 | 2 | -6/+17
  v2, review from Francisco Jerez:
  - make the destination variable as large as what the nir intrinsic defines (4) instead of the size of the return variable of glsl. This is still safe for the already existing code because all the intrinsics affected returned the same number of components as expected by glsl IR. In the case of image_size, it is not possible to do so because the number of components returned depends on the image type, and this case is not well handled by nir.
  v3:
  - Style fix
  Signed-off-by: Martin Peres <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* glsl: add support for the imageSize builtin | Martin Peres | 2015-08-20 | 1 | -16/+92
  The code is heavily inspired by Francisco Jerez's code supporting the image_load_store extension.
  Backends willing to support this builtin should handle __intrinsic_image_size.
  v2: Based on the review of Ilia Mirkin
  - Enable the extension for GLES 3.1
  - Fix indentation
  - Fix the return type (float to int, number of components for CubeImages)
  - Add a warning related to GLES 3.1
  v3: Based on the review of Francisco Jerez
  - Refactor the code to share both add_image_function and _image with the other image-related functions
  v4: Based on Topi Pohjolainen's comments
  - Do not add parenthesis for the return value
  v5: Based on Francisco Jerez's comments
  - Fix a few indent issues
  - Reduce the size of a condition by testing the dimension and array properties instead of enumerating all the formats.
  Signed-off-by: Martin Peres <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>