path: root/src/gallium
Commit message | Author | Age | Files | Lines
* llvmpipe: Honour pipe_rasterizer::point_quad_rasterization. | José Fonseca | 2014-01-09 | 1 | -10/+57
  Commit eda21d2a3010d9fc5a68b55a843c5e44b2abf8dd fixed the rasterization of points for Direct3D but ended up breaking the rasterization of OpenGL non-sprite points, in particular conform's pntrast.c test.
  The only way to get both working is to properly honour pipe_rasterizer::point_quad_rasterization, and follow the weird OpenGL rule when it is false.
  Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: Work around internal compiler error | Thomas Sondergaard | 2014-01-08 | 1 | -2/+2
  This small rearrangement avoids an MSVC 2013 ICE. It should also give a better memory access order.
  Cc: "10.0" <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* freedreno: add basic query support | Rob Clark | 2014-01-08 | 8 | -1/+275
  Add for now some simple/basic query support (i.e. things not actually requiring the GPU). Might change around a bit when I actually add GPU queries, but for now this enables some useful performance info in the GALLIUM_HUD. For example:
    GALLIUM_HUD=fps+batches+batches-sysmem+batches-gmem+restores,draw-calls
  The driver-specific queries are:
    + draw-calls
    + batches - number of batches per second, sum of batches-sysmem plus batches-gmem
    + batches-gmem - render a set of tiles in GMEM, for each tile (optionally) system mem -> gmem (restore), plus N draws, plus gmem -> system mem (resolve) per second
    + batches-sysmem - N draws to system memory (GMEM bypass) per second
    + restores - number of GMEM batches that required restore per second
  Ideally for GMEM rendering, you want batches-gmem to equal fps. If the app is doing something that triggers multiple passes (i.e. requires an extra round trip gmem <-> system memory) then the number of batches per second will go up relative to fps.
  Signed-off-by: Rob Clark <[email protected]>
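The relationship between these counters can be sketched in C. This is only an illustration under stated assumptions: the struct and function names below are hypothetical and are not taken from the freedreno driver; only the counter meanings come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-second counters mirroring the HUD query names. */
struct hud_counters {
    uint64_t draw_calls;
    uint64_t batches_sysmem; /* batches drawn straight to system memory (GMEM bypass) */
    uint64_t batches_gmem;   /* batches rendered tile by tile through GMEM */
    uint64_t restores;       /* GMEM batches that needed a system mem -> gmem restore */
};

/* "batches" is the sum of batches-sysmem and batches-gmem. */
static uint64_t total_batches(const struct hud_counters *c)
{
    /* Only GMEM batches can require a restore. */
    assert(c->restores <= c->batches_gmem);
    return c->batches_sysmem + c->batches_gmem;
}

/* Returns 1 when there were more GMEM batches than frames, which
 * suggests the app triggered extra gmem <-> system memory round trips. */
static int has_extra_passes(const struct hud_counters *c, uint64_t frames)
{
    return c->batches_gmem > frames;
}
```

With 60 GMEM batches over 30 frames, `has_extra_passes()` reports 1; at 60 frames it reports 0, which is the "batches-gmem equals fps" ideal mentioned above.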
* freedreno/a3xx: use cs patch instead of RFI+RMW | Rob Clark | 2014-01-08 | 8 | -52/+46
  Since we now have the cmdstream patch mechanism needed for hw binning, might as well also use it for RB_RENDER_CONTROL updates. This avoids the need to use RMW (and associated WFI) to update RB_RENDER_CONTROL.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: support for hw binning pass | Rob Clark | 2014-01-08 | 15 | -158/+706
  The binning pass sorts vertices into which bins/tiles they apply to. The visibility information generated during the binning pass can be used to speed up the rendering pass by filtering out vertices which do not apply to the current tile. See:
    https://github.com/freedreno/freedreno/wiki/Adreno-tiling#optimized-approach
  This brings a significant fps boost. A rough assortment of tests (supertuxkart, etracer, tremulous, glmark2 'build' test, etc) seems to yield a ~35-45% fps improvement.
  For now, to be conservative, the binning pass is not enabled yet by default. To enable it use:
    FD_MESA_DEBUG=binning
  So far I haven't found anything that breaks with binning enabled, but I'd like a bit more testing before I enable it as default.
  Signed-off-by: Rob Clark <[email protected]>
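The idea of binning can be sketched on the CPU as follows. This is a simplified illustration, not the hardware's algorithm: the Adreno binning pass runs on the GPU and its visibility stream format is not described here. The bounding-box test, the 32-tile limit, and all names are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical screen-space bounding box of one primitive, in pixels (inclusive). */
struct bbox { int x0, y0, x1, y1; };

/* Binning: return a mask with bit (ty * tiles_x + tx) set for every
 * tile the box overlaps. Limited to 32 tiles to keep the sketch small. */
static uint32_t bin_primitive(struct bbox b, int tile_w, int tile_h,
                              int tiles_x, int tiles_y)
{
    assert(tile_w > 0 && tile_h > 0 && tiles_x * tiles_y <= 32);
    uint32_t mask = 0;
    for (int ty = 0; ty < tiles_y; ty++) {
        for (int tx = 0; tx < tiles_x; tx++) {
            int tx0 = tx * tile_w, ty0 = ty * tile_h;
            int tx1 = tx0 + tile_w - 1, ty1 = ty0 + tile_h - 1;
            if (b.x0 <= tx1 && b.x1 >= tx0 && b.y0 <= ty1 && b.y1 >= ty0)
                mask |= UINT32_C(1) << (ty * tiles_x + tx);
        }
    }
    return mask;
}

/* Rendering pass: skip primitives whose mask does not include the current tile. */
static int visible_in_tile(uint32_t mask, int tile_index)
{
    return (int)((mask >> tile_index) & 1u);
}
```

The rendering pass then only processes, per tile, the primitives for which `visible_in_tile()` is nonzero, which is where the saving comes from.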
* freedreno: be more clever about gmem usage | Rob Clark | 2014-01-08 | 2 | -9/+18
  Only need to leave room for depth/stencil if it is actually used, etc.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: resync generated headers | Rob Clark | 2014-01-08 | 5 | -24/+214
  Signed-off-by: Rob Clark <[email protected]>
* llvmpipe: Fix the bottom_edge_rule adjustment for points. | José Fonseca | 2014-01-08 | 1 | -4/+4
  The adjustment needs to be applied to the y coordinates and not the x coordinates, just like the equivalent code for lines and triangles in lp_setup_line.c and lp_setup_tri.c.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: Respect bottom_edge_rule when computing the rasterization bounding boxes. | José Fonseca | 2014-01-08 | 3 | -3/+3
  This was inadvertently forgotten when replacing gl_rasterization_rules with lower_left_origin and half_pixel_center (commit 2737abb44efebfa10ac84b183c20fc5818d1514e).
  This makes a difference when lower_left_origin != half_pixel_center, e.g., D3D10.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Zack Rusin <[email protected]>
* ilo: enable HiZ | Chia-I Wu | 2014-01-08 | 4 | -7/+45
  The support is still early. Fast depth buffer clear is not enabled yet. HiZ can be forced off with ILO_DEBUG=nohiz.
* ilo: resolve Z/HiZ correctly | Chia-I Wu | 2014-01-08 | 5 | -1/+234
  When the depth buffer is to be read, perform a Depth Buffer Resolve if it has been rendered to. When the depth buffer is to be rendered to, perform a HiZ Buffer Resolve if the depth buffer has been modified externally.
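The two rules can be sketched as a small piece of state tracking in C. The flag and function names are hypothetical and do not come from the ilo driver; the sketch only assumes that each texture slice remembers whether it was last written by rendering or by something outside rendering.

```c
#include <assert.h>

/* Hypothetical per-slice dirty flags. */
enum slice_flags {
    SLICE_RENDER_WRITE   = 1 << 0, /* depth was written by rendering */
    SLICE_EXTERNAL_WRITE = 1 << 1  /* depth was written outside rendering (e.g. CPU or BLT) */
};

enum resolve_op { RESOLVE_NONE, RESOLVE_DEPTH, RESOLVE_HIZ };

/* Before reading the depth buffer: a Depth Buffer Resolve is needed
 * only if the buffer has been rendered to since the last resolve. */
static enum resolve_op resolve_before_read(unsigned *flags)
{
    assert(flags);
    if (*flags & SLICE_RENDER_WRITE) {
        *flags &= ~(unsigned)SLICE_RENDER_WRITE;
        return RESOLVE_DEPTH;
    }
    return RESOLVE_NONE;
}

/* Before rendering to the depth buffer: a HiZ Buffer Resolve is needed
 * only if the depth data was modified behind the HiZ buffer's back. */
static enum resolve_op resolve_before_render(unsigned *flags)
{
    assert(flags);
    if (*flags & SLICE_EXTERNAL_WRITE) {
        *flags &= ~(unsigned)SLICE_EXTERNAL_WRITE;
        return RESOLVE_HIZ;
    }
    return RESOLVE_NONE;
}
```

Clearing the flag after each resolve means a second read or render with no intervening write costs nothing.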
* ilo: add flags to texture slices | Chia-I Wu | 2014-01-08 | 1 | -0/+29
  The flags are used to mark who (CPU, BLT, or RENDER) has accessed the resource and how (READ or WRITE).
* ilo: rename and add an accessor for texture slices | Chia-I Wu | 2014-01-08 | 4 | -19/+41
  Rename ilo_texture::slice_offsets to ilo_texture::slices and add an accessor, ilo_texture_get_slice().
* ilo: add HiZ op support to the pipelines | Chia-I Wu | 2014-01-08 | 11 | -4/+1070
  Add blitter functions to perform Depth Buffer Clear, Depth Buffer Resolve, and Hierarchical Depth Buffer Resolve. Those functions set ilo_blitter up and pass it to the pipelines to emit the commands.
* ilo: add support for HiZ allocation | Chia-I Wu | 2014-01-08 | 2 | -1/+82
  Add tex_create_hiz() to create HiZ bo. It is not really called yet.
* ilo: refactor separate stencil allocation | Chia-I Wu | 2014-01-08 | 1 | -20/+27
  Move separate stencil allocation code to tex_create_separate_stencil to keep tex_create sane.
* ilo: assorted GPE fixes for HiZ | Chia-I Wu | 2014-01-08 | 5 | -69/+67
  Allow HiZ op to be specified in 3DSTATE_WM.
  Pass depth format directly in gen7_emit_3DSTATE_SF.
  Use tex->hiz.bo to determine if HiZ exists.
  Fix 3DSTATE_SF for the case when there is no ilo_rasterizer_state.
  Fix 3DSTATE_PS for the case when there is no ilo_shader_state.
* ilo: no layer offsetting on GEN7+ | Chia-I Wu | 2014-01-08 | 1 | -1/+5
  Even though the Ivy Bridge PRM lists some restrictions that require layer offsetting as the Sandy Bridge PRM does, it seems they are actually lifted.
* ilo: offset to layers only when necessary | Chia-I Wu | 2014-01-08 | 4 | -20/+137
  GEN6 has several requirements regarding the LOD/Depth/Width/Height of the render targets and the depth buffer. We used to offset to the layers in question unconditionally to meet the requirements. With this commit, offsetting is done only when the requirements are not met.
* ilo: allow ilo_zs_surface to skip layer offsetting | Chia-I Wu | 2014-01-08 | 3 | -19/+18
  Make offset to layer optional in ilo_gpe_init_zs_surface.
* ilo: allow ilo_view_surface to skip layer offsetting | Chia-I Wu | 2014-01-08 | 4 | -88/+72
  Make offset to layer optional in ilo_gpe_init_view_surface_for_texture. render_cache_rw is always the same as is_rt and is replaced.
* llvmpipe: Basic implementation of pipe_context::set_sample_mask. | José Fonseca | 2014-01-07 | 5 | -7/+20
  We don't support MSAA (i.e., number of samples is always one) therefore sample_mask boils down to a synonym of the rasterizer_discard flag.
  Also, this change makes setup actually use the value received in lp_setup_set_rasterizer_discard instead of reaching out to llvmpipe upper layers to re-fetch it.
  Based on Si Chen's draft.
  With this patch `wgf11multisample Coverage passes 100%` on the UMD D3D10 state tracker.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Si Chen <[email protected]>
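Why the sample mask collapses into a discard flag when there is only one sample can be shown with a short C sketch. The function name is hypothetical; the only assumption is that bit 0 of the mask corresponds to the single sample.

```c
#include <assert.h>
#include <stdbool.h>

/* With one sample per pixel, only bit 0 of the sample mask matters:
 * if it is clear, nothing can be written, which has the same effect
 * as rasterizer_discard. */
static bool effective_discard(bool rasterizer_discard, unsigned sample_mask)
{
    return rasterizer_discard || (sample_mask & 1u) == 0;
}
```

For example, `effective_discard(false, 0xfffffffe)` is true because the only sample that exists is masked out, while `effective_discard(false, ~0u)` is false.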
* cso_context: Fix cso_context::sample_mask initial value. | José Fonseca | 2014-01-07 | 1 | -1/+1
  The initial value of cso_context::sample_mask_saved is irrelevant as it will be overwritten with cso_context::sample_mask in cso_save_sample_mask. Therefore it is cso_context::sample_mask that needs to be properly initialized.
  This fixes regressions in blits and mipmap generation after adding support for sample_mask to llvmpipe.
  Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Implement alpha_to_coverage for non-MSAA framebuffers. | Si Chen | 2014-01-07 | 3 | -1/+59
  Implement Alpha to Coverage by discarding a fragment when its alpha component is less than 0.5.
  This is joint work by Jose and Si.
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
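The rule stated above fits in a one-line predicate. The sketch below is scalar C with a hypothetical function name; llvmpipe itself generates vectorized code through LLVM, which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-sample alpha-to-coverage: the one sample is either covered
 * or not, so the fragment is discarded when alpha < 0.5. */
static bool alpha_to_coverage_keeps(float alpha)
{
    return !(alpha < 0.5f);
}
```

A fragment with alpha exactly 0.5 is kept, since the commit message only discards values strictly below the threshold.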
* gallium/draw: remove double semicolon | Dave Airlie | 2014-01-07 | 1 | -1/+1
  code cleanup.
  Signed-off-by: Dave Airlie <[email protected]>
* haiku libGL: Move from gallium target to src/hgl | Alexander von Gluck IV | 2014-01-06 | 9 | -1241/+2
  * The Haiku renderers need to link to libGL to function properly in all usage contexts. As mesa drivers build before gallium targets, we couldn't properly link the mesa swrast driver to the gallium libGL target for Haiku.
  * This is likely better as it mimics how glx is laid out ensuring the Haiku libGL is better understood.
  * All renderers properly link in libGL now.
  Acked-by: Brian Paul <[email protected]>
* haiku: Fix missing HaikuGL header paths | Alexander von Gluck IV | 2014-01-06 | 2 | -0/+2
  Acked-by: Brian Paul <[email protected]>
* radeonsi: calculate NUM_BANKS for DB correctly on CIK | Marek Olšák | 2014-01-06 | 3 | -4/+27
  NUM_BANKS is not constant on CIK.
  Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: set correct pipe config for Hawaii in DB | Marek Olšák | 2014-01-06 | 3 | -15/+26
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: disable HTILE for 1D-tiled depth-stencil buffers | Marek Olšák | 2014-01-06 | 1 | -0/+5
  Reviewed-by: Michel Dänzer <[email protected]>
* freedreno/a3xx: fix blend state corruption issue | Rob Clark | 2013-12-26 | 8 | -33/+66
  Using RMW on banked context registers is not safe. The value read could be the wrong one. So if there has been a DRAW_IDX launched, the RMW must be preceded by a WAIT_FOR_IDLE to ensure the read part of RMW sees the correct value.
  To avoid unnecessary WFIs, keep track of whether a WFI is needed, and only emit one if it is. Furthermore, keep track of whether we even need to update the register in the first place.
  And to cut down on the amount of RMW to avoid excessive WFIs, at the tiling/GMEM level we can always overwrite RB_RENDER_CONTROL, as the state at the beginning of draw/clear cmds (which we IB to) is always undefined. In the draw/clear commands, we still always use RMW (with WFI if needed), but only if the register value actually changes. (At points where the current value cannot be known, the saved value is reset to ~0, which includes bits outside of RBRC_DRAW_STATE, so there is never a chance for confusion.)
  Signed-off-by: Rob Clark <[email protected]>
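The bookkeeping described here can be sketched in C as follows. All names are hypothetical, and the counters merely stand in for packets that a real driver would emit into the command stream; the sketch assumes the tracked field never covers all 32 bits, so ~0 can serve as the "unknown" marker as in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sentinel: sets bits outside any real field, so it never matches a valid value. */
#define VALUE_UNKNOWN (~UINT32_C(0))

struct reg_tracker {
    uint32_t saved;     /* last field value written, or VALUE_UNKNOWN */
    bool needs_wfi;     /* a draw was launched since the last wait-for-idle */
    unsigned wfi_count; /* stand-in for emitted WAIT_FOR_IDLE packets */
    unsigned rmw_count; /* stand-in for emitted read-modify-write packets */
};

/* Call where the register contents cannot be known. */
static void tracker_reset(struct reg_tracker *t)
{
    t->saved = VALUE_UNKNOWN;
}

/* Call after launching a draw: a later RMW could read a stale bank. */
static void tracker_draw(struct reg_tracker *t)
{
    t->needs_wfi = true;
}

/* Update the tracked field, emitting as little as possible. */
static void tracker_update(struct reg_tracker *t, uint32_t value, uint32_t field_mask)
{
    assert((value & ~field_mask) == 0);
    if (t->saved == value)
        return;                /* register already holds this value */
    if (t->needs_wfi) {
        t->wfi_count++;        /* make the read half of the RMW safe */
        t->needs_wfi = false;
    }
    t->rmw_count++;
    t->saved = value;
}
```

Repeated updates with an unchanged value emit nothing, and a WFI is only paid for when a draw is actually in flight and the value changes.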
* freedreno: prepare for hw binning | Rob Clark | 2013-12-26 | 9 | -142/+159
  Actually assign VSC_PIPE's properly, which will be needed for tiling. And introduce fd_tile for per-tile state (including the assignment of tile to VSC_PIPE). This gives us the proper pipe setup that we'll need for hw binning pass, and also cleans things up a bit by not having to pass so many parameters around. And will also make it easier to introduce different tiling patterns (since we may no longer render tiles in a simple left-to-right top-to-bottom pattern).
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: resync generated headers | Rob Clark | 2013-12-26 | 6 | -76/+131
  Signed-off-by: Rob Clark <[email protected]>
* r600/pipe: Stop leaking context->start_compute_cs_cmd.buf on EG/CM | Aaron Watry | 2013-12-23 | 1 | -0/+2
  Found while tracking down memory leaks in VDPAU playback
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
* st/vdpau: Destroy context when initialization fails | Aaron Watry | 2013-12-23 | 1 | -0/+1
  Prevents a potential memory leak found when tracking down something else.
  Reviewed-by: Christian König <[email protected]>
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
* radeon/llvm: Free target data at end of optimization | Aaron Watry | 2013-12-23 | 1 | -0/+1
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
* r600/compute: Use the correct FREE macro when deleting compute state | Aaron Watry | 2013-12-23 | 1 | -1/+1
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
* r600/compute: Free compiled kernels when deleting compute state | Aaron Watry | 2013-12-23 | 1 | -0/+2
  v2: Remove unnecessary null pointer check
  CC: "10.0" <[email protected]>
* radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode | Aaron Watry | 2013-12-23 | 5 | -18/+41
  Previously we were creating a new LLVMContext every time that we called radeon_llvm_parse_bitcode, which caused us to leak the context every time that we compiled a CL program.
  Sadly, we can't dispose of the LLVMContext at the point that it was being created because evergreen_launch_grid (and possibly the SI equivalent) was assuming that the context used to compile the kernels was still available.
  Now, we'll create a new LLVMContext when creating EG/SI compute state, store it there, and pass it to all of the places that need it.
  The LLVM Context gets destroyed when we delete the EG/SI compute state.
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
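The ownership change can be illustrated with a self-contained C sketch. It deliberately uses a toy stand-in type instead of the real LLVM C API, and every name below is hypothetical; the point is only that the context is created and destroyed together with the compute state rather than once per compile.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts contexts that are still allocated, purely for illustration. */
static int live_contexts;

/* Toy stand-in for an LLVM context. */
struct toy_context { int unused; };

static struct toy_context *toy_context_create(void)
{
    struct toy_context *c = malloc(sizeof(*c));
    if (c)
        live_contexts++;
    return c;
}

static void toy_context_dispose(struct toy_context *c)
{
    if (c) {
        free(c);
        live_contexts--;
    }
}

/* The compute state owns the context for its whole lifetime. */
struct compute_state {
    struct toy_context *ctx;
    int kernels_compiled;
};

static struct compute_state *compute_state_create(void)
{
    struct compute_state *s = calloc(1, sizeof(*s));
    if (!s)
        return NULL;
    s->ctx = toy_context_create();
    if (!s->ctx) {
        free(s);
        return NULL;
    }
    return s;
}

/* Compiling reuses the stored context instead of creating a new one. */
static void compile_kernel(struct compute_state *s)
{
    assert(s && s->ctx);
    s->kernels_compiled++;
}

static void compute_state_delete(struct compute_state *s)
{
    if (!s)
        return;
    toy_context_dispose(s->ctx);
    free(s);
}
```

However many kernels are compiled, exactly one context exists per compute state, and it is released when the state is deleted, so later users such as a grid launch can still rely on it in between.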
* pipe_loader/sw: close dev->lib when initialization fails | Aaron Watry | 2013-12-23 | 1 | -1/+4
  Prevents a memory leak.
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
* clover: Remove unused variable | Aaron Watry | 2013-12-23 | 1 | -1/+0
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
* llvmpipe: use pipe_sampler_view_release() to avoid segfault | Jonathan Liu | 2013-12-22 | 1 | -0/+6
  This fixes another case of faulting when freeing a pipe_sampler_view that belongs to a previously destroyed context.
  Cc: "10.0" <[email protected]>
  Signed-off-by: Jonathan Liu <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* radeonsi: Use htile_buffer for depth only when there is no stencil. | Andreas Hartmetz | 2013-12-22 | 1 | -0/+8
  Signed-off-by: Marek Olšák <[email protected]>
* winsys/radeon: remove superfluous distinction of cases | Niels Ole Salscheider | 2013-12-22 | 1 | -15/+5
  Signed-off-by: Niels Ole Salscheider <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: Only scan pixel shaders for TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS | Michel Dänzer | 2013-12-20 | 1 | -4/+7
  It's not relevant for other shader types.
  Reviewed-by: Marek Olšák <[email protected]>
* r600g: Fix spelling error | Aaron Watry | 2013-12-19 | 1 | -1/+1
  Trivial change, testing commit access
* clover: Append buffers that use CL_MEM_USE_HOST_PTR. | Jan Vesely | 2013-12-18 | 1 | -1/+1
  Specs say it's legal for implementations to use internal copies, and the write synchronization seems to work.
  Fixes clCreateBuffer (together with previous patches) and buffer-flags piglits.
  Signed-off-by: Jan Vesely <[email protected]>
  Acked-by: Francisco Jerez <[email protected]>
* clover: Add parameter checks to clCreateBuffer. | Jan Vesely | 2013-12-18 | 1 | -1/+13
  v2: Use fewer if statements and functional tricks instead of single-use method, suggested by Francisco Jerez. Squash two small patches into one.
  Signed-off-by: Jan Vesely <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* st/dri: Check for kernel support before enabling fd sharing v2 | Thomas Hellstrom | 2013-12-18 | 2 | -8/+11
  The dri2 state tracker is checking for driver support before enabling dri2ImageExtension version 7. This commit adds a check that the kernel driver also supports fd sharing through prime.
  Note that this adds a libdrm dependency to dri2.c.
  v2: Removed unnecessary clamping of bool expression
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Christopher James Halse Rogers <[email protected]>
* radeonsi: set CB_DISABLE if the color mask is 0 | Marek Olšák | 2013-12-18 | 1 | -3/+8
  Also needed for the DB in-place decompression according to hw docs.
  Reviewed-by: Alex Deucher <[email protected]>