summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* amd/common: add ac_build_waitcnt()Samuel Pitoiset2017-12-143-15/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: make use of ac_build_fdiv()Samuel Pitoiset2017-12-141-7/+1
| | | | | | | And move the comment to amd/common. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: make use of ac_get_spi_shader_z_format()Samuel Pitoiset2017-12-143-23/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* swr: Correct texture allocation and limit max size to 2GBBruce Cherniak2017-12-132-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes piglit tex3d-maxsize by correcting 4 things: The total_size calculation was using 32-bit math, therefore a >4GB allocation request overflowed and was not returning false (unsupported). Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle >4GB allocations. Added error checking on texture allocations to fail gracefully. Finally, temporarily decreased supported max texture size from 4GB to 2GB. The gallivm texture-sampler needs some additional work to correctly handle larger than 2GB textures (offsets to LLVMBuildGEP are signed). I'm working on a follow-on patch to allow up to 4GB textures, as this is useful in HPC visualization applications. Fixes piglit tex3d-maxsize. v2: Updated patch description to clarify ">4GB". Reviewed-By: George Kyriazis <[email protected]>
* swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.Bruce Cherniak2017-12-131-2/+1
| | | | | | | | | | | | | Environment variable KNOB_MAX_WORKER_THREADS allows the user to override default thread creation and thread binding. Previous commit to adjust linux cpu topology caused setting this KNOB to bind all threads to a single core. This patch restores correct functionality of override. Cc: <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* gallium/docs: document behavior of set_sample_mask()Brian Paul2017-12-131-1/+4
| | | | | | | | The sample mask is used even if msaa is not explicity enabled when we have a framebuffer with multisampled surfaces. That's DX behavior and what the Radeon drivers do. Not sure about other drivers at this point. Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: create get_tcs_tes_buffer_address helperTimothy Arceri2017-12-131-12/+32
| | | | | | This will be shared between the NIR and TGSI backends. Reviewed-by: Nicolai Hähnle <[email protected]>
* cso: add point rasterization sanity check assertionBrian Paul2017-12-121-0/+5
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/u_blitter: replace tabs with spacesBrian Paul2017-12-121-18/+18
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: don't pass a pipe_resource to util_resource_is_array_texture()Brian Paul2017-12-122-4/+4
| | | | | | | | | No need to pass a pipe_resource when we can just pass the target. This makes the function potentially more usable. Rename it too. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/aux: include nr_samples in util_resource_size() computationBrian Paul2017-12-121-1/+2
| | | | | | | | | | | | | This function is only used in two places: 1. VMware driver, but only for HUD reporting 2. st/nine state tracker, used for texture memory accounting Fixes: a69efa9482d ("util: add new util_resource_size() function in u_resource.[ch]") Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* svga: trivial whitespace/formatting fixes in svga_pipe_rasterizer.cBrian Paul2017-12-121-9/+5
|
* gallivm: fix texture wrapping for texture gather for mirror modesRoland Scheidegger2017-12-121-74/+171
| | | | | | | | | | | | | | | | | | | Care must be taken that all coords end up correct, the tests are very sensitive that everything is correctly rounded. This doesn't matter for bilinear filter (since picking a wrong texel with weight zero is ok), and we could also switch the per-sample coords mistakenly. While here, also optimize the coord_mirror helper a bit (we can do the mirroring directly by exploiting float rounding, no need for fixing up odd/even manually). I did not touch the mirror_clamp and mirror_clamp_to_border modes. In contrast to mirror_clamp_to_edge and mirror_repeat these are legacy modes. They are specified against old gl rules, which actually does the mirroring not per sample (so you get swapped order if the coord is in the mirrored section). I think the idea though is that they should follow the respecified mirror_clamp_to_edge rules so the order would be correct. Reviewed-by: Jose Fonseca <[email protected]>
* winsys/amdgpu: disable local BOs again due to worse performanceMarek Olšák2017-12-111-2/+3
| | | | | Cc: 17.3 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon/vce: move destroy command before feedback commandLeo Liu2017-12-081-1/+1
| | | | | | | | | | | | | | VCE processing IBs starts from session and task info at first level, other commands processed subsequently. The task info for destroy is embedded to destroy command, resulting that feedback command is not properly procoessed. This is causing kernel spin VM fault messages on Polaris and Vega10 card when running ends at encode application. The fix is also verified on VCE physical mode card. Signed-off-by: Leo Liu <[email protected]> Cc: [email protected] Acked-by: Christian König <[email protected]>
* meson: Add lmsensors to gallium libgl-xlib target.Dylan Baker2017-12-071-1/+3
| | | | | | Fixes: 5e71efef44b992b5d70b ("meson: Add lmsensors support") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* meson: add dep_thread to every lib that includes threads.hEric Engestrom2017-12-074-3/+4
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104141 Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* meson: fix pl111 dependency on vc4Eric Engestrom2017-12-072-5/+6
| | | | | | | | src/gallium/winsys/pl111/drm/libpl111winsys.a(pl111_drm_winsys.c.o): In function `pl111_drm_screen_create': pl111_drm_winsys.c:(.text+0x33): undefined reference to `vc4_drm_screen_create_renderonly' Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* r600/sb: do not convert if-blocks that contain indirect array accessGert Wollny2017-12-073-2/+5
| | | | | | | | | | | | | | | | | | | | If an array is accessed within an if block, then currently it is not known whether the value in the address register is involved in the evaluation of the if condition, and converting the if condition may actually result in out-of-bounds array access. Consequently, if blocks that contain indirect array access should not be converted. Fixes piglits on r600/BARTS: spec/glsl-1.10/execution/variable-indexing/ vs-output-array-float-index-wr vs-output-array-vec3-index-wr vs-output-array-vec4-index-wr Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104143 Signed-off-by: Gert Wollny <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: add support for compute grid/block sizes. (v2)Dave Airlie2017-12-064-3/+100
| | | | | | | | | | We just pass these in from outside in a constant buffer. The shader side stores them once they are accessed once. v2: fix to not use a temp_reg. Signed-off-by: Dave Airlie <[email protected]>
* r600: handle image/buffer sizes correctly.Dave Airlie2017-12-063-4/+21
| | | | | | This adds support to compute for the resq workarounds (buffer/cube sizes) Signed-off-by: Dave Airlie <[email protected]>
* r600/compute: add support for emitting compute image/buffer atomsDave Airlie2017-12-061-1/+9
| | | | Signed-off-by: Dave Airlie <[email protected]>
* r600/compute: handle atomic counters in compute state.Dave Airlie2017-12-061-0/+9
| | | | Signed-off-by: Dave Airlie <[email protected]>
* r600/compute: add support for TGSI compute shaders. (v1.1)Dave Airlie2017-12-062-28/+103
| | | | | | | | | | | This add paths to handle TGSI compute shaders and shader selection. It also avoids emitting certain things on tgsi paths, CBs, vertex buffers, config reg init (not required). v1.1: fix rat mask calc Signed-off-by: Dave Airlie <[email protected]>
* r600/shader: add compute support to shader assemblerDave Airlie2017-12-061-0/+14
| | | | Signed-off-by: Dave Airlie <[email protected]>
* r600/texture: drop lowering 1d/2d images to linear.Dave Airlie2017-12-061-8/+0
| | | | | | | This appears to cause hangs with compute images. Unless we can find more specifics, just don't do this for now. Signed-off-by: Dave Airlie <[email protected]>
* meson: fix keyword argument in declare_dependency()Eric Engestrom2017-12-062-2/+2
| | | | | | | | | `declare_dependency()` takes `compile_args`, not `c_args`. It was correct in all the other `declare_dependency()` from that commit. Fixes: 0bbecc5a8548883f76a71 "meson: define driver dependencies" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* swr/scons: Fix another intermittent build failureGeorge Kyriazis2017-12-061-0/+1
| | | | | | | gen_BackendPixelRate*.cpp depends on gen_ar_eventhandler.hpp. Fix missing dependency. Reviewed-by: Bruce Cherniak <[email protected]>
* radeonsi: make const and stream uploaders allocate read-only memoryMarek Olšák2017-12-061-2/+5
| | | | | | | and anything that clones these uploaders, like u_threaded_context. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a separate allocator for fine fencesMarek Olšák2017-12-063-1/+9
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: make shader binaries use read-only memoryMarek Olšák2017-12-065-3/+13
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: make IBs use read-only memoryMarek Olšák2017-12-061-0/+1
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print the buffer list for CHECK_VMMarek Olšák2017-12-061-0/+1
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow DMABUF exports for local buffersMarek Olšák2017-12-061-1/+4
| | | | | | Cc: 17.3 <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: always place sparse buffers in VRAMNicolai Hähnle2017-12-062-2/+6
| | | | | | | | | | | | Together with "radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check", this ensures that sparse buffers are placed in VRAM. Noticed by an assertion that started triggering with commit d4fac1e1d7 ("gallium/radeon: enable suballocations for VRAM with no CPU access") Fixes KHR-GL45.sparse_buffer_tests.BufferStorageTest in debug builds. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE checkNicolai Hähnle2017-12-061-1/+1
| | | | | | | | | | | The flag is on the pipe_resource, not the r600_resource. I don't see an obvious bug related to this, but it could potentially lead to suboptimal placement of some resources. Fixes: a41587433c4d ("gallium/radeon: add R600_RESOURCE_FLAG_UNMAPPABLE") Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* freedreno/a5xx: hide ARB_base_instanceRob Clark2017-12-051-1/+8
| | | | | | Grrr.. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: handle input/output componentRob Clark2017-12-051-4/+6
| | | | | | | | | | | | | | | | | After the mesa/st nir linking support, we start to see inputs/outputs like: decl_var shader_out INTERP_MODE_NONE float packed:uv (VARYING_SLOT_VAR9.x, 1, 0) decl_var shader_out INTERP_MODE_NONE float packed:uv@0 (VARYING_SLOT_VAR9.y, 1, 0) (ie. were location_frac != .x) Unfortunately I overlooked the addition of the component parameter to load_input/store_output, so when we started encountering inputs/outputs with component other than .x, we'd end up loading/storing the wrong input/output. Signed-off-by: Rob Clark <[email protected]>
* r600: refactor and export some shader selector code for computeDave Airlie2017-12-052-7/+27
| | | | | | This just moves some code around to make it easier to add compute. Signed-off-by: Dave Airlie <[email protected]>
* r600: add compute support to compressed resource handling.Dave Airlie2017-12-052-6/+26
| | | | | | This just adds support for decompressing compute resources. Signed-off-by: Dave Airlie <[email protected]>
* r600: update max threads per block for evergreen computeDave Airlie2017-12-051-0/+4
| | | | Signed-off-by: Dave Airlie <[email protected]>
* r600/shader: add local memory support to shader assembler.Dave Airlie2017-12-051-0/+165
| | | | | | | | This is needed for compute shaders. v1.1: make work for vectors, fix missing lds ops. Signed-off-by: Dave Airlie <[email protected]>
* r600/cs: add support for compute to image/buffers/atomics stateDave Airlie2017-12-054-19/+79
| | | | | | | This just adds the compute paths to state handling for the main objects Signed-off-by: Dave Airlie <[email protected]>
* r600: handle compute null key shader stateDave Airlie2017-12-051-0/+2
| | | | Signed-off-by: Dave Airlie <[email protected]>
* r600: add some missing cayman register definesDave Airlie2017-12-051-0/+4
| | | | | | These are just taken from the kernel, and were seen in some fglrx dumps. Signed-off-by: Dave Airlie <[email protected]>
* r600: don't set EOP on pop or loop endDave Airlie2017-12-051-1/+1
| | | | | | This appears to bad, compute shaders hang without it. Signed-off-by: Dave Airlie <[email protected]>
* r600/ssbo: refactor out buffer coord calcs and use for atomic path.Dave Airlie2017-12-051-34/+37
| | | | | | | | The atomic rat path has a bug in the ssbo path, refactor out the address calcs from the load/store paths and reuse to fix the bug in the buffer rat atomic path. Signed-off-by: Dave Airlie <[email protected]>
* r600/ssbo: fix multi-dword buffer loads.Dave Airlie2017-12-051-5/+7
| | | | | | This fixes loading from different channels. Signed-off-by: Dave Airlie <[email protected]>
* r600/ssbo: use r32ui format for ssbo resources.Dave Airlie2017-12-051-3/+3
| | | | | | | This works best for returning the correct values and sizes in tests. Signed-off-by: Dave Airlie <[email protected]>
* r600: refactor out the immediate setup code.Dave Airlie2017-12-051-38/+28
| | | | | | This just refactors the same code out of the images/buffers paths. Signed-off-by: Dave Airlie <[email protected]>