summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* gallivm: don't try to use fast rcp for fdivRoland Scheidegger2017-01-241-1/+3
| | | | | | | | The use of fast rcp instruction is disabled, and will always fall back to use a division instead (1 / x). Hence, if we get a division opcode, it doesn't make much sense trying to split that into rcp/mul. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: (trivial) fix ddiv cpu implementationRoland Scheidegger2017-01-241-1/+0
| | | | | | | | | | | we can't use the cpu implementation of fdiv, as this one uses different lp_build_context, which causes assertion failure. Just use default fdiv action (there is no fast rcp for doubles which we could potentially use anyway). Cc: 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* tgsi: implement ddiv opcodeRoland Scheidegger2017-01-241-0/+14
| | | | | | | | | softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64, so we really need to support that opcode. Cc: 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium/radeon: undef the very specific UPDATE_COUNTER macroSamuel Pitoiset2017-01-241-5/+9
| | | | | | | Also, wrap this into a do { ... } while (0). Suggested by Nicolai. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50: add support for MUL_ZERO_WINS propertyIlia Mirkin2017-01-235-2/+11
| | | | | | | This is simply keyed off the vertex shader, as that's guaranteed to be present in any pipeline. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add support for MUL_ZERO_WINS propertyIlia Mirkin2017-01-234-9/+25
| | | | | | | | This sets the dnz flag on all the relevant multiplication operations. At emission time, this will only be supported by nvc0+, so nv50 will need a different solution. Signed-off-by: Ilia Mirkin <[email protected]>
* st/nine: set the MUL_ZERO_WINS flag when supportedIlia Mirkin2017-01-231-0/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* gallium: add PIPE_CAP_TGSI_MUL_ZERO_WINSIlia Mirkin2017-01-2318-0/+19
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* gallium: add TGSI_PROPERTY_MUL_ZERO_WINSIlia Mirkin2017-01-233-3/+18
| | | | | | | | | This will be useful for proper D3D9 emulation, where this behavior is expected by some shaders. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* radeonsi: always set the TCL1_ACTION_ENA when invalidating L2Marek Olšák2017-01-231-1/+2
| | | | | | | | Some CIK-VI docs say this is the default behavior on SI. That doesn't answer whether it's also the default behavior on CIK-VI. Cc: 17.0 13.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't declare LDS in TESMarek Olšák2017-01-231-2/+1
| | | | | | not used since we started using the offchip tess ring Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: preload PS inputs only if KILL is usedMarek Olšák2017-01-231-2/+6
| | | | | | | | | | | so that most shaders can get lower VGPR usage thanks to lazy input loading. I think this is a more accurate constraint that prevents the black transitions in Witcher 2. Affected shaders (7758): Max Waves: 57437 -> 58231 (1.38 %) Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: adjust the rule for using the LINEAR_ALIGNED layoutMarek Olšák2017-01-231-1/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: drop all IBs if at least one was rejected within the contextMarek Olšák2017-01-231-1/+7
| | | | | | The corruption is inevitable and hangs are possible too. Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: report a rejected IB as a lost contextMarek Olšák2017-01-233-0/+14
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add HUD queries for monitoring some hw blocksSamuel Pitoiset2017-01-234-1/+110
| | | | | | | | | | | | It's also possible to monitor them via performance counters but the hardware can only use two counters simultaneously. It seems easier to re-use the existing code which reads from MMIO instead of writing a multi-pass approach. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: refactor the GRBM counters pathSamuel Pitoiset2017-01-233-43/+47
| | | | | | | | | | This will allow to expose more queries in order to know which blocks are busy/idle. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* swr: Align query results allocationGeorge Kyriazis2017-01-232-4/+5
| | | | | | | | | | | Some query results struct contents are declared as cache line aligned. Use aligned malloc, and align the whole struct, to be safe. Fixes crash when compiling with clang. CC: <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: Prune empty nodes in CalculateProcessorTopology.Bruce Cherniak2017-01-231-0/+9
| | | | | | | | | | | | CalculateProcessorTopology tries to figure out system topology by parsing /proc/cpuinfo to determine the number of threads, cores, and NUMA nodes. There are some architectures where the "physical id" begins with 1 rather than 0, which was creating and empty "0" node and causing a crash in CreateThreadPool. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102 Reviewed-By: George Kyriazis <[email protected]> CC: <[email protected]>
* st/va: make sure that we call begin_frame() only once v2Christian König2017-01-232-3/+9
| | | | | | | | | | This fixes "st/va: delay calling begin_frame until we have all parameters". v2: call begin frame after decoder (re)creation as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Nayan Deshmukh <[email protected]> Tested-by: Andy Furniss <[email protected]>
* r600: implement DDIVNicolai Hähnle2017-01-231-0/+59
| | | | | | Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* r600: factor out cayman_emit_unary_double_rawNicolai Hähnle2017-01-231-20/+42
| | | | | | | | We will use it for DDIV. Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* r600: double multiply can handle only one multiply at a timeNicolai Hähnle2017-01-231-17/+19
| | | | | | | | | | It seems clear that trying to multiply two pairs of doubles would result in the temporary register getting overwritten by the second pair. So make the code more explicit. Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* freedreno/a5xx: set frag shader threadsizeRob Clark2017-01-221-2/+7
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: set fragcoordxy properlyRob Clark2017-01-221-1/+1
| | | | | | | | | | | What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into bary.f. We were incorrectly setting both this and gl_FragCoord.xy to the same register resulting in all sorts of hilarity. Fixes stk, vdrift, 0ad, probably a bunch others. Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/ir3: setup var locations in standalone compilerRob Clark2017-01-221-1/+69
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix psizeRob Clark2017-01-222-8/+5
| | | | | | | | Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on a5xx. Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: srgb fixRob Clark2017-01-221-1/+2
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix int vbosRob Clark2017-01-221-1/+3
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix clear for uint/sint formatsRob Clark2017-01-221-19/+28
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix cull stateRob Clark2017-01-221-5/+5
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno: update generated headersRob Clark2017-01-226-13/+36
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* Revert "radeonsi: reject invalid vertex element formats"Marek Olšák2017-01-201-5/+0
| | | | | | | This reverts commit 9e4d1d8a7c0d60a6975d186944cd870e06f94773. It broke arb_vertex_type_10f_11f_11f_rev-draw-vertices, which has first_non_void == -1.
* gallium: add pipe_screen::resource_changed callback wrappersPhilipp Zabel2017-01-203-0/+47
| | | | | | | | | | Add resource_changed to the ddebug, rbug, and trace wrappers. Since it is optional, there is no need to add it to noop. Signed-off-by: Philipp Zabel <[email protected]> Suggested-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* etnaviv: implement resource_changed to invalidate internal resources derived ↵Philipp Zabel2017-01-201-0/+13
| | | | | | | | | | | | from imported buffers Implement the resource_changed pipe callback to invalidate internal resources derived from imported buffers. This is needed to update the texture for re-imported renderables. Signed-off-by: Philipp Zabel <[email protected]> Reviewed-by: Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* etnaviv: initialize seqno of imported resourcesPhilipp Zabel2017-01-201-0/+2
| | | | | | | | | | Imported resources already have contents that we want to be copied to texture resources derived from them. Set initial seqno of imported resources to 1, just as if it had already been rendered to. Signed-off-by: Philipp Zabel <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* st/dri: ask the driver to update its internal copies on reimportPhilipp Zabel2017-01-201-0/+4
| | | | | | | | | | | | For imported buffers that can't be used directly as a source to the texture samplers, the pipe driver might need to create an internal copy, for example in a different tiling layout. When buffers are reimported they may contain new image data, so the driver internal copies need to be recreated. Signed-off-by: Philipp Zabel <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* gallium: add pipe_screen::resource_changedPhilipp Zabel2017-01-202-0/+22
| | | | | | | | | | Add a hook to tell drivers that an imported resource may have changed and they need to update their internal derived resources. Signed-off-by: Philipp Zabel <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* gallium/hud: add missing break in hud_cpufreq_graph_install()Samuel Pitoiset2017-01-201-0/+1
| | | | | | | Fixes: e99b9395bef "gallium/hud: Add support for CPU frequency monitoring" Cc: [email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* radeonsi: reject invalid vertex element formatsMarek Olšák2017-01-191-0/+5
| | | | | | | | This should fix a coverity defect. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't forget to add HTILE to the buffer list for texturingMarek Olšák2017-01-191-6/+13
| | | | | | | | | | | | This fixes VM faults. Discovered by Samuel Pitoiset. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450 Cc: 17.0 13.0 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* st/vdpau: only send buffers with B8G8R8A8 format to XNayan Deshmukh2017-01-193-3/+11
| | | | | | | | | | | | | | | | PresentPixmap only works if the pixmap depth matches with the window depth, otherwise it returns a BadMatch protocol error. Even if the depths match, the result won't look correctly if the VDPAU RGB component order doesn't match the X11 one so we only allow the X11 format. For other buffers we copy them to a buffer which is send to X. v2: only send buffers with format VDP_RGBA_FORMAT_B8G8R8A8 v3: reword commit message v4: add comment explaining the code Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* radeonsi: fix texture gather on stencil texturesNicolai Hähnle2017-01-191-2/+16
| | | | | | | | | | | | | | | | | | | | | | | At least on VI, texture gather doesn't work with a 24_8 data format, so use 8_8_8_8 and a modified swizzle instead. A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select the X24S8 pipe format because we don't support stencil-only render targets properly. With mip-mapping this can lead to a setup where the tiling is incompatible with stencil texturing, and a flushed stencil texture is used. For the flushed stencil, a literal X24S8 is used because there were issues with an 8bpp DB->CB copy. Longer term, it would be good if we could get away from these workarounds, i.e. properly support an S8 format for stencil-only rendering and flushed stencil. Since stencil texturing is somewhat rare, it's not a high priority. Fixes GL45-CTS.texture_cube_map_array.sampling. Cc: 17.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* radeonsi: Always leave poly_offset in a valid stateZachary Michaels2017-01-191-1/+3
| | | | | | | | | | This commit makes si_update_poly_offset set poly_offset to NULL if uses_poly_offset is false. This way poly_offset either points into the currently queued rasterizer, or it is NULL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451 Cc: "13.0 17.0" <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallivm: use #ifdef not #if for PIPE_ARCH_BIG_ENDIANDave Airlie2017-01-191-1/+1
| | | | | | | | This fixes the build on ppc/s390. Reviewed-by: Roland Scheidegger <[email protected]> Cc: "17.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: determine in advance which VBOs should be added to the buffer listMarek Olšák2017-01-183-4/+11
| | | | | | v2: now it should be correct Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use fewer pointer dereferences in upload_vertex_buffer_descriptorsMarek Olšák2017-01-181-8/+9
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: reject invalid vertex buffer indices at state creationMarek Olšák2017-01-182-5/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a global dirty mask for shader pointersMarek Olšák2017-01-184-41/+51
| | | | | | Only vertex buffers use a separate bool flag. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a bitmask-based loop in si_decompress_texturesMarek Olšák2017-01-183-7/+31
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>