path: root/src/gallium
Commit message | Author | Age | Files | Lines
* freedreno/a6xx: texture state obj | Rob Clark | 2018-10-17 | 6 | -33/+251
  Unfortunately gallium doesn't match what the hw wants perfectly here, in using a separate CSO for each texture/sampler. So we have to use a hash table to map the collection of texture/samplers to hw state object.
  We probably could use separate hw state objects for texture and sampler state, but mesa/st tends to update the tex and samp state together.
  Signed-off-by: Rob Clark <[email protected]>
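A minimal sketch of the lookup idea in the commit above, assuming each bound texture view and sampler carries a small sequence number. Every name here (`tex_state_key`, `tex_cache`, `tex_cache_lookup`) is hypothetical and not taken from the freedreno source, and a tiny open-addressing table stands in for whatever hash table the driver actually uses.

```c
#include <stdint.h>
#include <string.h>

#define MAX_TEX     16   /* hypothetical limit on bound textures */
#define CACHE_SLOTS 64   /* hypothetical table size, must be a power of two */

/* Key: the sequence numbers of every bound texture view and sampler. */
struct tex_state_key {
    uint16_t view_seqno[MAX_TEX];
    uint16_t samp_seqno[MAX_TEX];
};

struct tex_cache_entry {
    int used;
    struct tex_state_key key;
    int state_id;            /* stand-in for a hw state object */
};

struct tex_cache {
    struct tex_cache_entry slots[CACHE_SLOTS];
    int next_state_id;
};

/* FNV-1a over the raw key bytes (the struct has no padding). */
static uint32_t key_hash(const struct tex_state_key *key)
{
    const uint8_t *p = (const uint8_t *)key;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < sizeof(*key); i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Return the state object id for this texture/sampler collection,
 * creating one on first use. Returns -1 if the table is full. */
static int tex_cache_lookup(struct tex_cache *cache,
                            const struct tex_state_key *key)
{
    uint32_t idx = key_hash(key) & (CACHE_SLOTS - 1);
    for (uint32_t probe = 0; probe < CACHE_SLOTS; probe++) {
        struct tex_cache_entry *e =
            &cache->slots[(idx + probe) & (CACHE_SLOTS - 1)];
        if (!e->used) {
            e->used = 1;
            e->key = *key;
            e->state_id = cache->next_state_id++;
            return e->state_id;
        }
        if (memcmp(&e->key, key, sizeof(*key)) == 0)
            return e->state_id;
    }
    return -1;
}
```

Looking up the same collection twice returns the same id, so the state object only needs to be built once per distinct combination.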
* freedreno: add resource seqno | Rob Clark | 2018-10-17 | 4 | -3/+11
  Intended to be something more compact than a 64b pointer, which could be used as a key into hashtables. Prep work for texture state objects.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: move const emit to state group | Rob Clark | 2018-10-17 | 4 | -15/+70
  Eventually we want to move nearly everything, but no other state depends on const state, so this is the easiest one to move first.
  For webgl aquarium, this reduces GPU load by about 10%, since for each fish it does a uniform upload plus draw. Fish frequently are visible in only a single tile, so this skips the uniform uploads for other tiles.
  The additional step of avoiding WFI's when using CP_SET_DRAW_STATE seems to be worth an additional 10% gain for aquarium.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: add infrastructure for CP_DRAW_STATE | Rob Clark | 2018-10-17 | 2 | -0/+46
  Add helper to add state-groups to emit, and code to emit CP_DRAW_STATE packet if we have any state-groups.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: reduce resource dependency tracking overhead | Rob Clark | 2018-10-17 | 1 | -42/+67
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: Remove the Emacs mode lines | Neil Roberts | 2018-10-17 | 113 | -226/+0
  These are not necessary because the corresponding settings are set via the .dir-locals.el file anyway. Most of them were missing a ‘:’ after “tab-width” which was making Emacs display an annoying warning whenever you open the file.
  This patch was made with:
    sed -ri '/-\*- mode:/,/^$/d' \
      $(find src/gallium/{drivers,winsys} -name \*.\[ch\] \
        -exec grep -l -- '-\*- mode:' {} \+)
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: Fix the Emacs indentation configuration file | Neil Roberts | 2018-10-17 | 1 | -1/+1
  The .dir-locals.el had the wrong name for the truthy value so it wasn’t setting indent-tabs-mode.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: allocate batches from the cache in launch_grid | Hyunjun Ko | 2018-10-17 | 1 | -1/+2
  Needs to allocate batches from the cache so that it can get a valid index and get resource dependency tracking right.
  In addition, this fixes an assertion on debug builds since commit 1a40faa8 landed.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: adds nondraw param to fd_bc_alloc_batch | Hyunjun Ko | 2018-10-17 | 4 | -6/+6
  Callers need to specify nondraw when creating a batch through fd_bc_alloc_batch, since it is better to create a batch through it than through fd_batch_create.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: remove fd6_emit_render_cntl() | Rob Clark | 2018-10-17 | 2 | -34/+0
  It was dead code carried over from a5xx
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix broken texcoord inputs | Rob Clark | 2018-10-17 | 1 | -21/+1
  TODO not sure if this is best solution, but current logic is broken for texcoord inputs. It is definitely the simplest solution.
  Fixes: 1a24f519663 freedreno/ir3: ignore unused inputs
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix off-by-one error in BEGIN_RING() | Rob Clark | 2018-10-17 | 1 | -1/+1
  Signed-off-by: Rob Clark <[email protected]>
* radeonsi: track context rolls better for the Vega scissor bug workaround | Marek Olšák | 2018-10-16 | 7 | -34/+80
  We should get fewer context rolls with the SET_CONTEXT_REG optimization, but it would have been for nothing if the scissor state rolled the context anyway. Don't emit the scissor state if there is no context roll.
* radeonsi: emit sample locations for 1xAA only when the hw bug is present | Marek Olšák | 2018-10-16 | 1 | -4/+2
* radeonsi: use compute shaders for clear_buffer & copy_buffer | Marek Olšák | 2018-10-16 | 8 | -203/+350
  Fast color clears should be much faster. Also, fast color clears on evicted buffers should be 200x faster on GFX8 and older.
* radeonsi: use copy_buffer in buffer_do_flush_region directly | Marek Olšák | 2018-10-16 | 1 | -11/+4
* radeonsi: use faster integer division for instance divisors | Marek Olšák | 2018-10-16 | 3 | -36/+83
  We know the divisors when we upload them, so instead we can precompute and upload division factors derived from each divisor.
  This fast division consists of add, mul_hi, and two shifts, and we have to load 4 dwords instead of 1.
  This probably won't affect any apps.
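A sketch of how division factors can be precomputed per divisor, under stated assumptions: this is the generic multiply-by-reciprocal technique with a round-up or round-down magic number, not necessarily the exact scheme radeonsi uploads. The names `fast_udiv` and `fast_udiv_init` are hypothetical, and this variant needs only a post-shift where the commit describes two shifts.

```c
#include <stdint.h>

/* Hypothetical factor set, computed once on the CPU for each divisor. */
struct fast_udiv {
    uint32_t multiplier;
    uint32_t increment;    /* 0 or 1, added to the numerator */
    uint32_t post_shift;
};

/* Precompute factors for dividing any 32-bit value by d (d must be non-zero). */
static struct fast_udiv fast_udiv_init(uint32_t d)
{
    struct fast_udiv f;
    uint32_t p = 0;

    while ((d >> p) > 1)   /* p = floor(log2(d)) */
        p++;
    f.post_shift = p;

    if ((d & (d - 1)) == 0) {
        /* Power of two: ((n + 1) * (2^32 - 1)) >> 32 == n, then shift by p. */
        f.multiplier = UINT32_MAX;
        f.increment = 1;
        return f;
    }

    uint64_t m = (UINT64_C(1) << (32 + p)) / d;   /* floor(2^(32+p) / d) */
    uint64_t r = (UINT64_C(1) << (32 + p)) % d;

    if (d - r <= (UINT64_C(1) << p)) {
        /* Rounding the magic number up is exact for all 32-bit numerators. */
        f.multiplier = (uint32_t)(m + 1);
        f.increment = 0;
    } else {
        /* Otherwise round down and compensate by incrementing the numerator. */
        f.multiplier = (uint32_t)m;
        f.increment = 1;
    }
    return f;
}

/* The per-element work: an add, a high multiply, and a shift. */
static uint32_t fast_udiv(uint32_t n, struct fast_udiv f)
{
    uint64_t prod = ((uint64_t)n + f.increment) * f.multiplier;
    return (uint32_t)(prod >> 32) >> f.post_shift;
}
```

The 64-bit product stands in for a hardware mul_hi; a shader would need its own way of handling the carry when the numerator is incremented.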
* radeonsi: use higher subpixel precision (QUANT_MODE) for smaller viewports | Marek Olšák | 2018-10-16 | 3 | -9/+53
* radeonsi: move emission of PA_SU_VTX_CNTL into emit_guardband | Marek Olšák | 2018-10-16 | 4 | -6/+11
  We'll modify the quant mode there, which also affects the guardband computation.
* radeonsi: don't re-upload the sample position constant buffer repeatedly | Marek Olšák | 2018-10-16 | 4 | -16/+33
* radeonsi: set PA_SU_PRIM_FILTER_CNTL optimally | Marek Olšák | 2018-10-16 | 3 | -4/+13
* radeonsi: center viewport to improve guardband clipping for high resolutions | Marek Olšák | 2018-10-16 | 4 | -14/+62
  This will be more useful when we change the quant mode to increase subpixel precision and decrease the viewport range (which might not be possible if the viewport is not centered in the viewport range).
* radeonsi: save raster config in screen, add se_tile_repeat | Marek Olšák | 2018-10-16 | 3 | -7/+17
* radeonsi: switch back to standard DX sample positions | Marek Olšák | 2018-10-16 | 1 | -17/+26
  Apps may rely on them.
* radeonsi: add GDS support to CP DMA | Marek Olšák | 2018-10-16 | 3 | -21/+89
* radeonsi: rename si_gfx_* functions to si_cp_* | Marek Olšák | 2018-10-16 | 5 | -59/+59
  and write_event_eop -> release_mem
* radeonsi: make si_gfx_write_event_eop more configurable | Marek Olšák | 2018-10-16 | 5 | -15/+29
* intel/nir, freedreno/ir3: Use the separated dead write vars pass | Caio Marcelo de Oliveira Filho | 2018-10-15 | 1 | -0/+1
  No changes to shader-db for intel. No changes to shader-db expected for freedreno.
  Reviewed-by: Jason Ekstrand <[email protected]>
* v3d: Add support for hardware pack/unpack of half floats. | Eric Anholt | 2018-10-15 | 1 | -0/+1
  Cuts the formerly 7-minute simulation time of fs-packHalf2x16.shader_test in half.
* gallium/ttn: Convert inputs and outputs to derefs of variables. | Eric Anholt | 2018-10-15 | 4 | -69/+64
  This means that TTN shaders more closely resemble GTN shaders: they have inputs and outputs as variable derefs, with the variables having their .driver_location already set up for you.
  This will be useful for v3d to do input variable DCE in NIR, which we can't do when the TTN shaders never have a pre-nir_lower_io stage.
  Acked-by: Rob Clark <[email protected]>
* gallium/ttn: Fix the type of gl_FragDepth. | Eric Anholt | 2018-10-15 | 1 | -0/+1
  In TGSI we have a vec4 of which only .z is used, but for NIR we should be using a float the same as other NIR IR. We were already moving TGSI's .z to the .x channel.
  Acked-by: Rob Clark <[email protected]>
* freedreno/a6xx: Enable blitter | Kristian H. Kristensen | 2018-10-15 | 5 | -0/+623
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Update headers | Kristian H. Kristensen | 2018-10-15 | 1 | -16/+30
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Remove unnecessary GRAS_2D_BLIT_INFO write | Kristian H. Kristensen | 2018-10-15 | 1 | -2/+0
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* gallium/u_transfer_helper: Add support for separate Z24/S8 as well. | Kenneth Graunke | 2018-10-14 | 5 | -22/+60
  u_transfer_helper already had code to handle treating packed Z32_S8 as separate Z32_FLOAT and S8_UINT resources, since some drivers can't handle that interleaved format natively.
  Other hardware needs depth and stencil as separate resources for all formats. For example, V3D3 needs this for 24-bit depth as well.
  This patch adds a new flag to lower all depth/stencil formats, and implements support for Z24_UNORM_S8_UINT. (S8_UINT_Z24_UNORM is left as an exercise to the reader, preferably someone who has access to a machine that uses that format.)
  Reviewed-by: Eric Anholt <[email protected]>
* gallium/format: Add a helper to combine separate Z24 and S8 stencil. | Kenneth Graunke | 2018-10-14 | 2 | -0/+22
  This new function takes separate Z24 depth and S8 stencil sources, and packs them into a single combined Z24S8 buffer.
  Reviewed-by: Eric Anholt <[email protected]>
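The packing step can be sketched roughly as follows. This is not the helper the commit adds: `pack_z24_s8` is a hypothetical name, it works on flat arrays rather than strided rows, and it assumes depth occupies the low 24 bits of each 32-bit word with stencil in the top 8 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine separate depth and stencil sources into one packed buffer.
 * Assumed layout: bits 0-23 depth, bits 24-31 stencil. */
static void pack_z24_s8(uint32_t *dst, const uint32_t *z24,
                        const uint8_t *s8, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* Mask off whatever sits in the unused top byte of the depth word. */
        dst[i] = (z24[i] & 0x00ffffffu) | ((uint32_t)s8[i] << 24);
    }
}
```

A real helper would also take source and destination strides plus a width and height, since the two sources generally have different row pitches.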
* gallium/auxiliary: Add util_format_get_depth_only() helper. | Kenneth Graunke | 2018-10-14 | 1 | -0/+21
  This will be used by u_transfer_helper.c shortly, in order to split packed depth-stencil into separate resources.
  Reviewed-by: Eric Anholt <[email protected]>
* r600/sb: Fix constant-logical-operand warning. | Vinson Lee | 2018-10-12 | 1 | -1/+1
  sb/sb_bc_parser.cpp:620:27: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
          if (cf->bc.op_ptr->flags && FF_GDS)
                                   ^  ~~~~~~
  sb/sb_bc_parser.cpp:620:27: note: use '&' for a bitwise operation
          if (cf->bc.op_ptr->flags && FF_GDS)
                                   ^~
                                   &
  sb/sb_bc_parser.cpp:620:27: note: remove constant to silence this warning
          if (cf->bc.op_ptr->flags && FF_GDS)
                                  ~^~~~~~~~~
  Fixes: da977ad90747 ("r600/sb: start adding GDS support")
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
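To illustrate why the warning above points at a real bug rather than a style issue, here is a small sketch. The bit value chosen for `FF_GDS` is an assumption made for illustration, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define FF_GDS (1u << 3)   /* assumed bit value, for illustration only */

/* What `flags && FF_GDS` computes: FF_GDS is a non-zero constant,
 * so the expression is true whenever any flag at all is set. */
static bool has_gds_logical(uint32_t flags)
{
    return flags != 0;
}

/* What was intended: test the FF_GDS bit itself. */
static bool has_gds_bitwise(uint32_t flags)
{
    return (flags & FF_GDS) != 0;
}
```

With `flags == 1u` the logical form reports true even though the FF_GDS bit is clear, which is the misclassification the one-character fix removes.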
* scons: Allow building with custom MSVC_USE_SCRIPT script. | Jose Fonseca | 2018-10-12 | 1 | -0/+1
  SCons MSVC support relies on vcvarsall.bat to extract the PATH, CPP includes, library paths, etc. And SCons also has a build env var named MSVC_USE_SCRIPT which one can use to point to an alternative vcvarsall.bat script.
  This change exposes this MSVC_USE_SCRIPT build env variable as a SCons command line variable. This will enable using MSVC outside Program Files (e.g., network shares, etc.)
  This change also links advapi32 library, necessary for the Windows Registry API used by WGL state tracker, avoiding missing symbols.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* st/va: use provided sizes and coords for vlVaGetImage | Boyuan Zhang | 2018-10-11 | 1 | -3/+28
  vlVaGetImage should respect the width, height, and coordinates x and y that are passed in. Therefore, pipe_box should be created with the passed-in values instead of the surface width/height.
  v2: add input size check, return error when size out of bounds
  v3: fix the size check for vaimage
  v4: add size adjustment for x and y coordinates
  Signed-off-by: Boyuan Zhang <[email protected]>
  Cc: "18.2" <[email protected]>
  Reviewed-by: Leo Liu <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Acked-by: Christian König <[email protected]>
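The kind of bounds check mentioned in the v2 note might look something like this. It is only a sketch: `region_in_bounds` is a hypothetical helper, and the actual checks and adjustments in st/va may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the requested region lies entirely inside a surface of
 * surf_w x surf_h. The sums are done in 64 bits so that a huge
 * width or height cannot wrap around and slip past the check. */
static bool region_in_bounds(int x, int y, unsigned w, unsigned h,
                             unsigned surf_w, unsigned surf_h)
{
    if (x < 0 || y < 0)
        return false;
    if ((uint64_t)x + w > surf_w)
        return false;
    if ((uint64_t)y + h > surf_h)
        return false;
    return true;
}
```

A caller would return an error for an out-of-bounds request and otherwise build its copy box from x, y, w and h rather than from the full surface size.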
* svga: change svga_destroy_shader_variant() to return void | Brian Paul | 2018-10-09 | 5 | -23/+6
  svga_destroy_shader_variant() itself flushes and retries the command if there's a failure. So no need for the callers to do it. Other callers of the function were already ignoring the return value.
  This also fixes a corner-case double-free reported by Coverity (and reported by Dave Airlie).
  Tested with various OpenGL apps.
  Reviewed-by: Charmaine Lee <[email protected]>
* nvc0: fix blitting red to srgb8_alpha | Ilia Mirkin | 2018-10-09 | 1 | -0/+4
  For some reason the 2d engine can't handle this. Red formats get special treatment there, so perhaps related.
  Fixes dEQP-GLES3 tests of the form:
    dEQP-GLES3.functional.fbo.blit.conversion.r{8,16f,32f}_to_srgb8_alpha8
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
  Cc: [email protected]
* nv50,nvc0: guard against zero-size blits | Ilia Mirkin | 2018-10-09 | 2 | -0/+14
  The current state tracker can generate these sometimes. Fixing this is more involved, and due to some integer math we can generate divisions-by-zero.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
  Cc: [email protected]
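A rough illustration of the kind of guard involved, assuming the division in question is a source-to-destination scale factor. The struct and function names are hypothetical and not from the nouveau drivers, and the 16.16 fixed-point scale is chosen only for the example.

```c
#include <stdbool.h>
#include <stdint.h>

struct blit_box {
    int x, y, width, height;   /* width/height may be negative for flips */
};

/* Compute 16.16 fixed-point scale factors for a blit. Returns false,
 * leaving the outputs untouched, when either box is empty, which is
 * exactly the case that would otherwise divide by zero. */
static bool blit_scale(const struct blit_box *src, const struct blit_box *dst,
                       int64_t *scale_x, int64_t *scale_y)
{
    if (src->width == 0 || src->height == 0 ||
        dst->width == 0 || dst->height == 0)
        return false;

    *scale_x = (int64_t)src->width * 65536 / dst->width;
    *scale_y = (int64_t)src->height * 65536 / dst->height;
    return true;
}
```

Checking for zero rather than for non-positive sizes keeps mirrored blits, which use negative extents, working.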
* nv50,nvc0: mark RGBX_UINT formats as renderable | Ilia Mirkin | 2018-10-09 | 1 | -4/+4
  This helps st/mesa avoid some (apparently) buggy fallbacks. Specifically the CopyTexSubImage fallback tries to read texture A as RGBA_FLOAT and write back that data into the target format, which fails for integer formats which have no appropriate logic to do the conversion.
  Since integer formats don't blend, there's no harm in the fact that the "A" component gets written anyways.
  Fixes, among others:
    https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/canvas/tex-2d-rgb8ui-rgb_integer-unsigned_byte.html
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* st/dri: Handle BGRA5551 format | Michel Dänzer | 2018-10-09 | 1 | -0/+13
  Reviewed-by: Marek Olšák <[email protected]>
* freedreno/a5xx+a6xx: fix LRZ pitch alignment | Rob Clark | 2018-10-08 | 1 | -1/+1
  Both RB_2D_DST_SIZE.PITCH (a6xx) and RB_MRT[n].PITCH (a5xx) need alignment to 64.
  Signed-off-by: Rob Clark <[email protected]>
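For reference, rounding a pitch up to a power-of-two alignment such as 64 is usually done with a mask. The helper below is a generic sketch with a hypothetical name (`align_pot`), not the driver's own macro, and the commit does not say which units the pitch is measured in.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a non-zero power of two. */
static uint32_t align_pot(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, a pitch of 100 becomes 128, while a pitch that is already a multiple of 64 is left unchanged.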
* freedreno/a6xx: add LRZ support | Rob Clark | 2018-10-08 | 8 | -132/+104
  As with a5xx, hidden behind FD_MESA_DEBUG=lrz due to being paranoid about z-fighting issues with some games (in particular, this was observed with 0ad on a5xx, but I think the proper solution to enable this by default is to figure out how to do driver-specific driconf options).
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2018-10-08 | 7 | -38/+120
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: add helper for various CP_EVENT_WRITE | Rob Clark | 2018-10-08 | 5 | -38/+30
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: remove unused fxns | Rob Clark | 2018-10-08 | 2 | -19/+0
  Signed-off-by: Rob Clark <[email protected]>