path: root/src/gallium
Commit message (Author, Age, Files, Lines)
* radeonsi: use compute shaders for clear_buffer & copy_buffer (Marek Olšák, 2018-10-16, 8 files, -203/+350)
| Fast color clears should be much faster. Also, fast color clears on evicted buffers should be 200x faster on GFX8 and older.
* radeonsi: use copy_buffer in buffer_do_flush_region directly (Marek Olšák, 2018-10-16, 1 file, -11/+4)
|
* radeonsi: use faster integer division for instance divisors (Marek Olšák, 2018-10-16, 3 files, -36/+83)
| We know the divisors when we upload them, so instead we can precompute and upload division factors derived from each divisor.
| This fast division consists of add, mul_hi, and two shifts, and we have to load 4 dwords instead of 1.
| This probably won't affect any apps.
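The idea of precomputing division factors on the CPU can be sketched as follows. This is not the code from the commit: the log does not show the driver's factor layout, so the sketch swaps in the classic Granlund-Montgomery scheme, which uses mul_hi, a subtract, an add, and two shifts, and stores three values rather than four dwords. The names `udiv_factors`, `compute_udiv_factors`, `mul_hi`, and `fast_udiv32` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the precomputed factors (illustrative names). */
struct udiv_factors {
   uint32_t multiplier; /* m' = floor(2^32 * (2^l - d) / d) + 1 */
   uint32_t shift1;     /* min(l, 1) */
   uint32_t shift2;     /* max(l - 1, 0) */
};

/* Run once on the CPU, when the divisor d is known. */
static struct udiv_factors
compute_udiv_factors(uint32_t d)
{
   struct udiv_factors f;
   uint32_t l = 0;

   assert(d != 0);

   /* l = ceil(log2(d)) */
   while (((uint64_t)1 << l) < d)
      l++;

   /* 2^l - d is smaller than d, so the shifted value fits in 64 bits
    * and the resulting multiplier fits in 32 bits. */
   uint64_t diff = ((uint64_t)1 << l) - d;
   f.multiplier = (uint32_t)((diff << 32) / d + 1);
   f.shift1 = l < 1 ? l : 1;
   f.shift2 = l > 1 ? l - 1 : 0;
   return f;
}

/* High 32 bits of the 64-bit product, as a GPU mul_hi would return. */
static uint32_t
mul_hi(uint32_t a, uint32_t b)
{
   return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Run per element: computes n / d without a divide instruction. */
static uint32_t
fast_udiv32(uint32_t n, struct udiv_factors f)
{
   uint32_t t = mul_hi(f.multiplier, n);
   return (t + ((n - t) >> f.shift1)) >> f.shift2;
}
```

For example, `fast_udiv32(20, compute_udiv_factors(7))` yields 2, matching `20 / 7`. The trade-off is the one the commit message describes: more data to load per divisor in exchange for a cheaper per-element operation.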
* radeonsi: use higher subpixel precision (QUANT_MODE) for smaller viewports (Marek Olšák, 2018-10-16, 3 files, -9/+53)
|
* radeonsi: move emission of PA_SU_VTX_CNTL into emit_guardband (Marek Olšák, 2018-10-16, 4 files, -6/+11)
| We'll modify the quant mode there, which also affects the guardband computation.
* radeonsi: don't re-upload the sample position constant buffer repeatedly (Marek Olšák, 2018-10-16, 4 files, -16/+33)
|
* radeonsi: set PA_SU_PRIM_FILTER_CNTL optimally (Marek Olšák, 2018-10-16, 3 files, -4/+13)
|
* radeonsi: center viewport to improve guardband clipping for high resolutions (Marek Olšák, 2018-10-16, 4 files, -14/+62)
| This will be more useful when we change the quant mode to increase subpixel precision and decrease the viewport range (which might not be possible if the viewport is not centered in the viewport range).
* radeonsi: save raster config in screen, add se_tile_repeat (Marek Olšák, 2018-10-16, 3 files, -7/+17)
|
* radeonsi: switch back to standard DX sample positions (Marek Olšák, 2018-10-16, 1 file, -17/+26)
| Apps may rely on them.
* radeonsi: add GDS support to CP DMA (Marek Olšák, 2018-10-16, 3 files, -21/+89)
|
* radeonsi: rename si_gfx_* functions to si_cp_* (Marek Olšák, 2018-10-16, 5 files, -59/+59)
| and write_event_eop -> release_mem
* radeonsi: make si_gfx_write_event_eop more configurable (Marek Olšák, 2018-10-16, 5 files, -15/+29)
|
* intel/nir, freedreno/ir3: Use the separated dead write vars pass (Caio Marcelo de Oliveira Filho, 2018-10-15, 1 file, -0/+1)
| No changes to shader-db for intel. No changes to shader-db expected for freedreno.
| Reviewed-by: Jason Ekstrand <[email protected]>
* v3d: Add support for hardware pack/unpack of half floats. (Eric Anholt, 2018-10-15, 1 file, -0/+1)
| Cuts the formerly 7-minute simulation time of fs-packHalf2x16.shader_test in half.
* gallium/ttn: Convert inputs and outputs to derefs of variables. (Eric Anholt, 2018-10-15, 4 files, -69/+64)
| This means that TTN shaders more closely resemble GTN shaders: they have inputs and outputs as variable derefs, with the variables having their .driver_location already set up for you.
| This will be useful for v3d to do input variable DCE in NIR, which we can't do when the TTN shaders never have a pre-nir_lower_io stage.
| Acked-by: Rob Clark <[email protected]>
* gallium/ttn: Fix the type of gl_FragDepth. (Eric Anholt, 2018-10-15, 1 file, -0/+1)
| In TGSI we have a vec4 of which only .z is used, but for NIR we should be using a float, the same as other NIR IR. We were already moving TGSI's .z to the .x channel.
| Acked-by: Rob Clark <[email protected]>
* freedreno/a6xx: Enable blitter (Kristian H. Kristensen, 2018-10-15, 5 files, -0/+623)
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Update headers (Kristian H. Kristensen, 2018-10-15, 1 file, -16/+30)
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Remove unnecessary GRAS_2D_BLIT_INFO write (Kristian H. Kristensen, 2018-10-15, 1 file, -2/+0)
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* gallium/u_transfer_helper: Add support for separate Z24/S8 as well. (Kenneth Graunke, 2018-10-14, 5 files, -22/+60)
| u_transfer_helper already had code to handle treating packed Z32_S8 as separate Z32_FLOAT and S8_UINT resources, since some drivers can't handle that interleaved format natively.
| Other hardware needs depth and stencil as separate resources for all formats. For example, V3D3 needs this for 24-bit depth as well.
| This patch adds a new flag to lower all depth/stencil formats, and implements support for Z24_UNORM_S8_UINT. (S8_UINT_Z24_UNORM is left as an exercise to the reader, preferably someone who has access to a machine that uses that format.)
| Reviewed-by: Eric Anholt <[email protected]>
* gallium/format: Add a helper to combine separate Z24 and S8 stencil. (Kenneth Graunke, 2018-10-14, 2 files, -0/+22)
| This new function takes separate Z24 depth and S8 stencil sources, and packs them into a single combined Z24S8 buffer.
| Reviewed-by: Eric Anholt <[email protected]>
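A minimal sketch of what combining separate depth and stencil sources can look like, for one row of pixels. The function name `pack_z24_s8_row` is hypothetical and is not the helper added by the commit; the sketch also assumes depth occupies the low 24 bits of each 32-bit word and stencil the high 8 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack one row of separate depth and stencil values
 * into combined 32-bit words.
 * Assumed layout: depth in bits 0..23, stencil in bits 24..31. */
static void
pack_z24_s8_row(uint32_t *dst, const uint32_t *z24, const uint8_t *s8,
                size_t width)
{
   assert(dst && z24 && s8);

   for (size_t x = 0; x < width; x++) {
      /* Mask off anything above the 24 depth bits, then place the
       * stencil byte in the top 8 bits. */
      dst[x] = (z24[x] & 0x00ffffffu) | ((uint32_t)s8[x] << 24);
   }
}
```

A real helper would also take per-surface strides and a height so it can walk a whole 2D region; the per-pixel packing step would stay the same under the layout assumed here.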
* gallium/auxiliary: Add util_format_get_depth_only() helper. (Kenneth Graunke, 2018-10-14, 1 file, -0/+21)
| This will be used by u_transfer_helper.c shortly, in order to split packed depth-stencil into separate resources.
| Reviewed-by: Eric Anholt <[email protected]>
* r600/sb: Fix constant-logical-operand warning. (Vinson Lee, 2018-10-12, 1 file, -1/+1)
| sb/sb_bc_parser.cpp:620:27: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
|         if (cf->bc.op_ptr->flags && FF_GDS)
|                                  ^  ~~~~~~
| sb/sb_bc_parser.cpp:620:27: note: use '&' for a bitwise operation
|         if (cf->bc.op_ptr->flags && FF_GDS)
|                                  ^~
|                                  &
| sb/sb_bc_parser.cpp:620:27: note: remove constant to silence this warning
|         if (cf->bc.op_ptr->flags && FF_GDS)
|                                 ~^~~~~~~~~
| Fixes: da977ad90747 ("r600/sb: start adding GDS support")
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Dave Airlie <[email protected]>
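The difference between the logical and bitwise forms in the warning above can be illustrated with a small sketch. The flag values and function names here are hypothetical; the real value of FF_GDS is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define EXAMPLE_FF_MEM 0x1u
#define EXAMPLE_FF_GDS 0x2u

/* Same truth table as `flags && EXAMPLE_FF_GDS`: true whenever ANY flag
 * bit is set, because the constant itself is always non-zero. */
static bool
has_gds_logical(uint32_t flags)
{
   const uint32_t mask = EXAMPLE_FF_GDS;
   return flags != 0 && mask != 0;
}

/* Bitwise test: true only when the GDS bit itself is set. */
static bool
has_gds_bitwise(uint32_t flags)
{
   return (flags & EXAMPLE_FF_GDS) != 0;
}

/* Small self-check showing where the two forms disagree. */
static void
demo_flag_checks(void)
{
   /* Only the MEM bit is set, yet the logical form reports "GDS". */
   assert(has_gds_logical(EXAMPLE_FF_MEM));
   assert(!has_gds_bitwise(EXAMPLE_FF_MEM));
   assert(has_gds_bitwise(EXAMPLE_FF_MEM | EXAMPLE_FF_GDS));
}
```

The logical form compiles cleanly on many compilers, which is why a diagnostic such as -Wconstant-logical-operand is useful for catching it.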
* scons: Allow building with custom MSVC_USE_SCRIPT script. (Jose Fonseca, 2018-10-12, 1 file, -0/+1)
| SCons MSVC support relies on vcvarsall.bat to extract the PATH, CPP includes, library paths, etc. SCons also has a build env var named MSVC_USE_SCRIPT, which one can use to point to an alternative vcvarsall.bat script.
| This change exposes this MSVC_USE_SCRIPT build env variable as a SCons command line variable. This will enable using MSVC outside Program Files (e.g., network shares).
| This change also links the advapi32 library, necessary for the Windows Registry API used by the WGL state tracker, avoiding missing symbols.
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* st/va: use provided sizes and coords for vlVaGetImage (Boyuan Zhang, 2018-10-11, 1 file, -3/+28)
| vlVaGetImage should respect the width, height, and coordinates x and y that are passed in. Therefore, pipe_box should be created with the passed-in values instead of the surface width/height.
| v2: add input size check, return error when size out of bounds
| v3: fix the size check for vaimage
| v4: add size adjustment for x and y coordinates
| Signed-off-by: Boyuan Zhang <[email protected]>
| Cc: "18.2" <[email protected]>
| Reviewed-by: Leo Liu <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
| Acked-by: Christian König <[email protected]>
* svga: change svga_destroy_shader_variant() to return void (Brian Paul, 2018-10-09, 5 files, -23/+6)
| svga_destroy_shader_variant() itself flushes and retries the command if there's a failure. So no need for the callers to do it. Other callers of the function were already ignoring the return value.
| This also fixes a corner-case double-free reported by Coverity (and reported by Dave Airlie).
| Tested with various OpenGL apps.
| Reviewed-by: Charmaine Lee <[email protected]>
* nvc0: fix blitting red to srgb8_alpha (Ilia Mirkin, 2018-10-09, 1 file, -0/+4)
| For some reason the 2d engine can't handle this. Red formats get special treatment there, so perhaps related.
| Fixes dEQP-GLES3 tests of the form:
|   dEQP-GLES3.functional.fbo.blit.conversion.r{8,16f,32f}_to_srgb8_alpha8
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Karol Herbst <[email protected]>
| Cc: [email protected]
* nv50,nvc0: guard against zero-size blits (Ilia Mirkin, 2018-10-09, 2 files, -0/+14)
| The current state tracker can generate these sometimes. Fixing this is more involved, and due to some integer math we can generate divisions-by-zero.
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Karol Herbst <[email protected]>
| Cc: [email protected]
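A rough sketch of the kind of guard described above, assuming the blit path derives a fixed-point scale factor by dividing the source width by the destination width. The struct `example_box` and the function `blit_du_dx` are hypothetical simplifications, not the driver's actual types or code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified blit rectangle. */
struct example_box {
   int x, y, width, height;
};

/* Computes a 32.32 fixed-point horizontal scale factor (source texels per
 * destination pixel). Returns false when the blit covers no pixels and
 * should be skipped, which also avoids the division by zero. */
static bool
blit_du_dx(const struct example_box *src, const struct example_box *dst,
           int64_t *du_dx)
{
   assert(src && dst && du_dx);

   /* Guard: nothing to do for zero-size source or destination. */
   if (src->width == 0 || src->height == 0 ||
       dst->width == 0 || dst->height == 0)
      return false;

   /* Multiply rather than shift so that negative (flipped) widths
    * stay well-defined in C. */
   *du_dx = (int64_t)src->width * ((int64_t)1 << 32) / dst->width;
   return true;
}
```

Returning early keeps the guard in one place, so later integer math in the same path can assume non-zero dimensions.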
* nv50,nvc0: mark RGBX_UINT formats as renderable (Ilia Mirkin, 2018-10-09, 1 file, -4/+4)
| This helps st/mesa avoid some (apparently) buggy fallbacks. Specifically the CopyTexSubImage fallback tries to read texture A as RGBA_FLOAT and write back that data into the target format, which fails for integer formats which have no appropriate logic to do the conversion.
| Since integer formats don't blend, there's no harm in the fact that the "A" component gets written anyways.
| Fixes, among others:
|   https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/canvas/tex-2d-rgb8ui-rgb_integer-unsigned_byte.html
| Signed-off-by: Ilia Mirkin <[email protected]>
| Cc: [email protected]
* st/dri: Handle BGRA5551 format (Michel Dänzer, 2018-10-09, 1 file, -0/+13)
| Reviewed-by: Marek Olšák <[email protected]>
* freedreno/a5xx+a6xx: fix LRZ pitch alignment (Rob Clark, 2018-10-08, 1 file, -1/+1)
| Both RB_2D_DST_SIZE.PITCH (a6xx) and RB_MRT[n].PITCH (a5xx) need alignment to 64.
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: add LRZ support (Rob Clark, 2018-10-08, 8 files, -132/+104)
| As with a5xx, hidden behind FD_MESA_DEBUG=lrz due to being paranoid about z-fighting issues with some games (in particular, this was observed with 0ad on a5xx, but I think the proper solution to enable this by default is to figure out how to do driver-specific driconf options).
| Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers (Rob Clark, 2018-10-08, 7 files, -38/+120)
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: add helper for various CP_EVENT_WRITE (Rob Clark, 2018-10-08, 5 files, -38/+30)
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: remove unused fxns (Rob Clark, 2018-10-08, 2 files, -19/+0)
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: remove fd6_shader_stateobj (Rob Clark, 2018-10-08, 3 files, -23/+10)
| Earlier gens already got this cleanup, but a6xx was still off on a branch then.
| Signed-off-by: Rob Clark <[email protected]>
* util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY (Marek Olšák, 2018-10-06, 1 file, -1/+3)
| Initial version discussed with Rob Clark under a different patch name. This approach leaves his driver unaffected.
* radeonsi: fix a typo at CS_PARTIAL_FLUSH (Marek Olšák, 2018-10-06, 1 file, -1/+1)
| harmless
* ac: add ac_build_round (Marek Olšák, 2018-10-06, 1 file, -3/+1)
|
* ac: correct PKT3_COPY_DATA definitions (Marek Olšák, 2018-10-06, 4 files, -6/+6)
|
* ac: define all address spaces properly (Marek Olšák, 2018-10-06, 1 file, -3/+3)
|
* gallivm: Make it possible to disable some optimization shortcuts in release builds (Gert Wollny, 2018-10-06, 4 files, -21/+32)
| For testing it is of interest that all tests of dEQP pass, e.g. to test virglrenderer on a host only providing software rendering like in a CI. Hence make it possible to disable certain optimizations that make tests fail.
| While we are there also add some documentation to the flags to make it clear that this is opt-out.
| Setting the environment variable "GALLIVM_PERF=no_filter_hacks" can be used to make the following tests pass in release mode:
|   dEQP-GLES2.functional.texture.mipmap.2d.affine.*_linear_*
|   dEQP-GLES2.functional.texture.mipmap.cube.generate.*
|   dEQP-GLES2.functional.texture.vertex.2d.filtering.*_mipmap_linear_*
|   dEQP-GLES2.functional.texture.vertex.2d.wrap.*
| Related: https://bugs.freedesktop.org/show_bug.cgi?id=94957
| v2: rename optimization disabling flag to 'safemath' and also move the nopt flag to the perf flags.
| v3: rename flag "safemath" to "no_filter_hacks" since safemath is usually associated with floating point operations (Roland)
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* virgl: Pass resource size and transfer offsets (Tomeu Vizoso, 2018-10-06, 4 files, -28/+208)
| Pass the size of a resource when creating it so a backing can be kept on the other side. Also pass the required offset to transfer commands.
| This moves vtest closer to how virtio-gpu works, making it more useful for testing.
| v2: - Use new messages for creation and transfers, as changing the behavior of the existing messages would be messy given that we don't want to break compatibility with older servers.
| v3: - Use correct strides: The resource corresponding to the output display might have a different line stride than the IOVs, so when reading back to this resource take the resource stride and the IOV stride into account.
| v4: Fix transfer size calculation (Andrey Simiklit)
| v5: Add comment about transfer size value in the PUT command (Gurchetan). Add a comment about the size correction for transfers for reading and writing the resource. Fixing this by correctly evaluating the size upfront will need some work also on the virglrenderer side.
| Signed-off-by: Tomeu Vizoso <[email protected]> (v2)
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Gurchetan Singh <[email protected]>
* virgl, vtest: Correct the transfer size calculation (Gert Wollny, 2018-10-06, 1 file, -1/+3)
| The transfer size used in virglrenderer refers to uint32_t, so one must add 3 and then divide by 4 instead of adding 3/4 which is a no-op with integers.
| Fixes: b3b82fe8ea virgl/vtest: add vtest driver
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Gurchetan Singh <[email protected]>
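The arithmetic in this fix can be shown with a short sketch. The function names are hypothetical and the code is only an illustration of the rounding rule, not the vtest driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte count up to whole 32-bit words: add 3, then divide by 4. */
static uint32_t
bytes_to_dwords(uint32_t size_bytes)
{
   /* Overflow guard for this sketch. */
   assert(size_bytes <= UINT32_MAX - 3);
   return (size_bytes + 3) / 4;
}

/* The mistaken form: operator precedence evaluates 3 / 4 first, which is 0
 * in integer arithmetic, so the size is returned unchanged (still in bytes). */
static uint32_t
bytes_plus_three_quarters(uint32_t size_bytes)
{
   return size_bytes + 3 / 4;
}
```

For a 5-byte transfer, `bytes_to_dwords(5)` gives 2 words, while `bytes_plus_three_quarters(5)` gives 5, which is neither rounded nor converted to words.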
* radeonsi:optimizing SET_CONTEXT_REG for shaders vgt_vertex_reuse (Sonny Jiang, 2018-10-05, 4 files, -2/+18)
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi:optimizing SET_CONTEXT_REG for shaders Tessellation (Sonny Jiang, 2018-10-05, 4 files, -5/+26)
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi:optimizing SET_CONTEXT_REG for shaders PS (Sonny Jiang, 2018-10-05, 3 files, -14/+60)
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi:optimizing SET_CONTEXT_REG for shaders VS (Sonny Jiang, 2018-10-05, 3 files, -33/+77)
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>
* radeonsi:optimizing SET_CONTEXT_REG for shaders GS (Sonny Jiang, 2018-10-05, 4 files, -24/+154)
| Signed-off-by: Sonny Jiang <[email protected]>
| Signed-off-by: Marek Olšák <[email protected]>