summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* svga: scons: remove unused HAVE_SYS_TYPES_H defineEmil Velikov2015-07-292-2/+0
| | | | | | | | | There isn't a single instance in mesa that mentions HAVE_SYS_TYPES_H, other than this file. Cc: Jose Fonseca <[email protected]> Acked-by: Brian Paul <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* gallium/auxiliary: Avoid double promotion.Matt Turner2015-07-292-2/+2
| | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* gallium/auxiliary: Use exp2(x) instead of pow(2.0, x).Matt Turner2015-07-292-4/+4
|
* nvc0/ir: cache vertex out base so that we don't recompute againIlia Mirkin2015-07-291-8/+15
| | | | | | | | The global CSE pass stinks and is unable to pull this out. Easy enough to handle it here and avoid generating unnecessary special register loads (which can allegedly be quite slow). Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: output base for reading is based on laneidIlia Mirkin2015-07-291-0/+25
| | | | | | | | | | | PFETCH retrieves the address for incoming vertices, not output vertices in TCS. For output vertices, we must use the laneid as a base. Fixes barrier piglit test, which was failing for entirely non-barrier reasons, but rather that it was (a) trying to draw multiple patches and (b) the incoming patch size was not the same as the outgoing patch size. Signed-off-by: Ilia Mirkin <[email protected]>
* Revert "pipe-loader: simplify pipe_loader_drm_probe"Francisco Jerez2015-07-291-4/+9
| | | | | | | | | This reverts commit a27ec5dc460b91dc44675f48cddbbb2631ee824f. It breaks the intended behaviour of pipe_loader_probe() with ndev==0 as relied upon by clover to query the number of devices available to the pipe loader in the system. Acked-by: Emil Velikov <[email protected]>
* radeon: add support for streams to the common streamout code. (v2)Dave Airlie2015-07-296-15/+50
| | | | | | | | | | | | This adds to the common radeon streamout code, support for multiple streams. It updates radeonsi/r600 to set the enabled mask up. v2: update for changes in previous patch. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeon: move streamout buffer config to streamout enable function. (v2)Dave Airlie2015-07-292-9/+15
| | | | | | | | | | This will be used here later. v2: update atom sizes add check for old vs new enabled mask Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* vc4: Skip re-emitting the shader_rec if it's unchanged.Eric Anholt2015-07-285-43/+158
| | | | | | | | It's a bunch of work for us to emit it (and its uniforms), more work for the kernel to validate it, and additional work for the CLE to read it. Improves es2gears framerate by about 50%. Signed-off-by: Eric Anholt <[email protected]>
* vc4: Drop unused vpm_offset value.Eric Anholt2015-07-281-3/+0
| | | | | It's been dead since we started doing VS/CS attr offset setup during shader compile.
* vc4: Simplify vc4_use_bo and make sure it's not a shader.Eric Anholt2015-07-284-39/+26
| | | | | | | Since the conversion to keeping validated shaders around for the BO's lifetime, we haven't been checking that rendering doesn't happen to shaders. Make vc4_use_bo check that always, and just don't use it for the VC4_MODE_SHADER case (so now modes are unused)
* vc4: Keep the validated shader around for the simulator execution.Eric Anholt2015-07-283-13/+17
| | | | This more closely matches the kernel behavior on shader validation now.
* vc4: Make the object be the return value from vc4_use_bo().Eric Anholt2015-07-283-23/+25
| | | | Drops 40 bytes of code from validation.
* vc4: Ensure that the bin CL is properly capped by increment/flush.Eric Anholt2015-07-283-26/+36
| | | | | | | | We don't want anything to appear after we've kicked off the render (and thus job flush), since that might then get written out to the tile allocation state. Signed-off-by: Eric Anholt <[email protected]>
* vc4: Drop NV shader reloc validation.Eric Anholt2015-07-282-120/+60
| | | | It wasn't validating enough, and we don't need the packet.
* vc4: Fix raster surface shadow updates under DRI2.Eric Anholt2015-07-281-0/+6
| | | | | | | Glamor asks GBM for the handle of the BO, then flinks it itself. We were marking the bo non-private in the flink and dmabuf (DRI3) paths, but not the GEM handle path. As a result, non-pageflipping DRI2 swapbuffers (EGL apps, in particular) were never updating the texture.
* vc4: Fix bus errors on dumping CL on hardware.Eric Anholt2015-07-281-1/+1
| | | | | The kernel can't fixup unaligned float traps for us, so deref as a uint32_t first.
* radeon: add streamout status 1-3 queries.Dave Airlie2015-07-292-2/+19
| | | | | | | This adds support for queries against the non-0 vertex streams. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600,radeonsi: GL_ARB_conditional_render_invertedEdward O'Callaghan2015-07-293-11/+15
| | | | | | | | | | By using 'Tobias Klausmann' piglit test-suite patch. We obtain a full 12/12 passes using this patch. By 'faking' to claim support for this extension we obtain 7 fails and 5 passes. Signed-off-by: Edward O'Callaghan <[email protected]> Tested-by: Furkan Alaca <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: add support for interpolateAt functions (v2)Dave Airlie2015-07-281-1/+240
| | | | | | | | | | | This is part of ARB_gpu_shader5, and this passes all the piglit tests currently available. v2: use macros from the fine derivs commit. add comments. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nvc0/ir: trim out barrier sync for non-compute shadersIlia Mirkin2015-07-281-0/+6
| | | | | | | It seems like they're never necessary, and actively cause harm. This fixes some of the barrier-related piglits. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix barrier emissionIlia Mirkin2015-07-281-0/+2
| | | | | | immediate arguments require a flag to be set for each one Signed-off-by: Ilia Mirkin <[email protected]>
* vc4: Add support for ARB_draw_elements_base_vertex.Eric Anholt2015-07-271-1/+3
| | | | | | Gallium exposes it unconditionally, so do our best to support it. It fails on the negative index cases, but those seem unlikely to be used in the wild.
* freedreno/ir3: add transform-feedback supportRob Clark2015-07-274-4/+230
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: track "keeps" in irRob Clark2015-07-274-23/+17
| | | | | | | | | | | | Previously we had a fixed array to track kills, since they don't generate an SSA value, and then cheated by stuffing them in the outputs array before sending things through depth/sched/etc. But store instructions will need similar treatment. So convert this over to a more general array of instructions that must be kept and fix up the places that were previously relying on kills being in the output array. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add support for store instructionsRob Clark2015-07-273-6/+43
| | | | | | | | | | For store instructions, the "dst" register is a read register, not a written register. (Ie. it is the address to store to.) Lets not confuse register allocation, scheduling, etc, with these details. Instead just leave a dummy instr->regs[0], and take "dst" from instr->regs[1] and srcs following. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cleanup driver-param stuffRob Clark2015-07-273-10/+31
| | | | | | | | | Add 'enum ir3_driver_param' to track driver-param slots, and a create_driver_param() helper to avoid having the knowledge about where driver params are placed in const regs spread throughout the code as we add additional driver-params. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add transform-feedback stateRob Clark2015-07-274-3/+95
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add resource tracking support for written buffersRob Clark2015-07-274-12/+17
| | | | | | | | With stream-out (transform-feedback) we have the case where resources are *written* by the gpu, which needs basically the same tracking to figure out when rendering must be flushed. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx+a4xx: add support for vtxcnt semanticRob Clark2015-07-274-14/+31
| | | | | | This will be used for stream-out (transform-feedback) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add stream-output support to cmdline compilerRob Clark2015-07-271-4/+22
| | | | | | A bit hard-coded configuration at the moment, but sufficient for now. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop unused create_input() argRob Clark2015-07-271-11/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: move emit_const to ir3Rob Clark2015-07-2712-229/+263
| | | | | | | | | | | | | | | Details of the cmdstream packets are different between a3xx and a4xx, but the logic about the layout of const registers is the same, as that is dictated by the ir3 shader compiler. So rather than duplicating logic that is tightly coupled to ir3 between a3xx and a4xx, move this into ir3 and use per-generation callbacks for to build the cmdstream packets. This should make it easier to pass additional const regs (such as for transform feedback). And it also keeps the layout internal to ir3 in case we want to make the layout more dynamic some day. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: bit of shader API refactoringRob Clark2015-07-277-22/+25
| | | | | | | | | | | | Since for transform-feedback, we'll need more than just the TGSI tokens from the state object, just pass the entire state object to ir3_shader_create(). This also cleans things up a bit for some day in the future when we could take shader either as TGSI or directly NIR (for ex, glsl2nir or spirv2nir paths). In the same spirit, drop extra args from ir3_compile_shader_nir() (since it can anyways get what it needs from the ir3_shader_variant). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: updated cat6 encodingRob Clark2015-07-275-113/+230
| | | | | | | Sync updated cat6 encoding from freedreno.git, needed to properly encode store instructions. Signed-off-by: Rob Clark <[email protected]>
* radeonsi: add fine derivate control (v2.1)Dave Airlie2015-07-252-6/+48
| | | | | | | | | | | | | This adds support for fine derivatives and enables ARB_derivative_control on radeonsi. (just fell out of my working out interpolation) v2: cleanup some bits, write a comment v2.1: take Michel's comment from the mailing list Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: fix GLSL textureGrad(samplerCube*) functionsMarek Olšák2015-07-253-34/+90
| | | | | | +4 piglits Reviewed-by: Michel Dänzer <[email protected]>
* nvc0: fix geometry program revalidation of clipping paramsIlia Mirkin2015-07-251-1/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* radeonsi: ubo indexing support (v2)Dave Airlie2015-07-251-3/+12
| | | | | | | | | | | | This is required as part of ARB_gpu_shader5. no backend changes are required for this, or if any are, it's the same ones as for samplers. v2: use get_indirect_index (Marek) Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: add support for indirect samplers (v2)Dave Airlie2015-07-251-8/+41
| | | | | | | | | | | | This adds the frontend support, however the llvm backend produces the wrong pattern, however we can conditionalise enabling ARB_gpu_shader5 on whatever version of llvm we fix this in. v2: drop unneeded sampler_indirect checks (Marek) Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: split out interpolation input selectionDave Airlie2015-07-252-26/+38
| | | | | | | | | | | This is prep work for using it in the interpolation code later. Also add storage for the input interpolation mode so we can pick it up later. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: separate out load sample positionDave Airlie2015-07-251-18/+26
| | | | | | | | This is prep work for reusing this in the interpolation code later. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nvc0/ir: per-patch vars are in a separate address spaceIlia Mirkin2015-07-242-11/+9
| | | | | | | | | | | | | There's no need to attempt to avoid overlapping generic i/o with patch i/o. By the same token, we can't merge patch and non-patch loads/stores. This fixes at least the tes-both-input-array-*-index-rd tessellation variable-indexing tests. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: kepler can't do indirect shader input/output loads directlyIlia Mirkin2015-07-238-6/+75
| | | | | | | | | | | | | | There's a special AL2P instruction (called AFETCH in nv50 ir) which computes a "physical" value to be used with indirect addressing with ALD. Fixes tcs-input-array-*-index-rd tcs-output-array-*-index-wr varying-indexing tessellation tests on Kepler. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: tess factors are now sysvals, adapt codegen to expect thatIlia Mirkin2015-07-236-11/+24
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallivm: Fix profile build.Jose Fonseca2015-07-231-1/+1
|
* gallium/util: Stop bundling our snprintf implementation.Jose Fonseca2015-07-233-1485/+31
| | | | | | | | | | | Use MSVCRT functions instead. Their semantics are slightly different but they can be made to work as expected. Also, use the same code paths for both MSVCRT and MinGW. https://bugs.freedesktop.org/show_bug.cgi?id=91418 Reviewed-by: Brian Paul <[email protected]>
* gallivm: Add ifdefs so raw_debug_stream is only defined when usedTom Stellard2015-07-231-0/+2
| | | | | | | Its only use is to implement a custom version of LLVMDumpValue on some Windows and embedded platforms. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Don't use raw_debug_ostream for dissasemblingTom Stellard2015-07-231-14/+13
| | | | | | | | All LLVM API calls that require an ostream object have been removed from the disassemble() function, so we don't need to use this class to wrap _debug_printf() we can just call this function directly. Reviewed-by: Jose Fonseca <[email protected]>
* gk110/ir: fake BAR supportIlia Mirkin2015-07-231-0/+12
| | | | | | Makes things sorta work until we figure out the real way to do this. Signed-off-by: Ilia Mirkin <[email protected]>