summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* glsl: move lowering after matching validationTimothy Arceri2016-01-061-11/+11
| | | | | | | | After lowering the matching flag is_unmatched_generic_inout is lost so we need to move this validation before lowering. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* glsl: only add outward facing varyings to resourse list for SSOTimothy Arceri2016-01-061-7/+10
| | | | | | | | | An SSO program can have multiple stages and we only want to add the externally facing varyings. The current code was adding both the packed inputs and outputs for the first and last stage of each program. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* i965/gen9: Modify the conditions to use blitter on skl+Anuj Phogat2016-01-051-3/+9
| | | | | | | | | Conditions modified allow skl+ to use blitter: - for all tiling formats - to write data to YF/YS tiled surfaces Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/gen9: Return false in place of assert in intelEmitCopyBlit()Anuj Phogat2016-01-051-3/+4
| | | | | | | This allows the fallback paths to handle it correctly. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/gen9: Remove regions overlap check in fast copy blitAnuj Phogat2016-01-051-5/+0
| | | | | | | | Overlapping blits are anyway undefined in OpenGL. So no need of overlap check here. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/gen9: Don't use fast copy blit in case of non power of 2 cppAnuj Phogat2016-01-051-2/+4
| | | | | | | Fast copy blit is currently enabled for use only with Yf/Ys tiling. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i915/i965: Fix typo in perf_debug messageIan Romanick2016-01-052-2/+2
| | | | | | Trivial Signed-off-by: Ian Romanick <[email protected]>
* st/mesa: minor indentation fixesBrian Paul2016-01-051-4/+4
|
* draw: minor indentation fixBrian Paul2016-01-051-1/+1
|
* mesa: minor clean-up of some memcpy/sizeof() calls in m_matrix.cBrian Paul2016-01-051-7/+7
| | | | Reviewed-by: Charmaine Lee <[email protected]>
* util: add debug_dump_ubyte_rgba_bmp()Brian Paul2016-01-052-0/+63
| | | | | | Like debug_dump_float_rgba_bmp() but takes ubyte values. Reviewed-by: Charmaine Lee <[email protected]>
* mesa: check for z=0 in _mesa_Vertex3dv()Brian Paul2016-01-051-1/+4
| | | | | | | | | | | It's very rare that a GL app calls glVertex3dv(), but one in particular calls it lot, always with Z = 0. Check for that condition and convert the call into glVertex2f. This reduces VBO memory used and reduces the number of times we have to switch between float[2] and float[3] vertex formats in the svga driver. This results in a small but measurable performance improvement. Reviewed-by: Charmaine Lee <[email protected]>
* svga: fix test for SVGA_NEW_STIPPLEBrian Paul2016-01-052-4/+8
| | | | | | | | | | | | | We only want to set the SVGA_NEW_STIPPLE dirty flag when the polygon stipple state changes. Before, we only set the flag when we were enabling stipple, but not disabling. We don't really have to add SVGA_NEW_STIPPLE to the dirty FS state set since it's a subset of SVGA_NEW_RAST, but let's be explicit. This doesn't fix any known bugs. Reviewed-by: Charmaine Lee <[email protected]>
* svga: add some comments in svga_state_vs.cBrian Paul2016-01-051-0/+3
| | | | Reviewed-by: Charmaine Lee <[email protected]>
* svga: change svga_hw_view_state::dirty to booleanBrian Paul2016-01-051-1/+1
| | | | | | Since it's a true/false value. Reviewed-by: Charmaine Lee <[email protected]>
* svga: avoid emitting redundant SetVertexBuffers() commandsBrian Paul2016-01-052-5/+26
| | | | Reviewed-by: Charmaine Lee <[email protected]>
* svga: check for no-ops in svga_bind_sampler_states()Brian Paul2016-01-051-1/+15
| | | | | | | and svga_set_sampler_views(). If there's no change, return early and don't set a SVGA_NEW_x dirty state flag. Reviewed-by: Charmaine Lee <[email protected]>
* i965: quieten compiler warning about out-of-bounds accessIlia Mirkin2016-01-051-0/+1
| | | | | | | | | | | | | | | gcc 4.9.3 shows the following error: brw_vue_map.c:260:20: warning: array subscript is above array bounds [-Warray-bounds] return brw_names[slot - VARYING_SLOT_MAX]; This is because BRW_VARYING_SLOT_COUNT is a valid value for the enum type. Adding an assert will generate no additional code but will teach the compiler to not complain. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* build: enable st/va with nouveau driverJulien Isorce2016-01-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vainfo fails in vaDriverInit because "dd_create_screen" does not reach strcmp(driver_name, "nouveau") code. Indeed when compiling the va target.c, the macro GALLIUM_NOUVEAU is not defined. This patch define the macro the same it is done for dri and vdpau targets. Tested with: ./autogen.sh --enable-glx --enable-gles2 --enable-egl --enable-vdpau --enable-glx-tls=yes --enable-va --with-gallium-drivers=swrast,nouveau --with-dri-drivers=swrast,nouveau --with-egl-platforms=x11 LIBVA_DRIVER_NAME=gallium vainfo Output: vainfo: Driver version: mesa gallium vaapi vainfo: Supported profile and entrypoints VAProfileMPEG2Simple : VAEntrypointVLD VAProfileMPEG2Main : VAEntrypointVLD VAProfileMPEG4Simple : VAEntrypointVLD VAProfileMPEG4AdvancedSimple : VAEntrypointVLD VAProfileVC1Simple : VAEntrypointVLD VAProfileVC1Main : VAEntrypointVLD VAProfileVC1Advanced : VAEntrypointVLD VAProfileH264Baseline : VAEntrypointVLD VAProfileH264Main : VAEntrypointVLD VAProfileH264High : VAEntrypointVLD VAProfileNone : VAEntrypointVideoProc Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add support for st/vaJulien Isorce2016-01-053-50/+133
| | | | | | | | | | | - split nvc0_decoder_bsp in begin/next/end - preserve content buffer when calling nvc0_decoder_bsp_next - implement pipe_video_codec::begin_frame/end_frame https://bugs.freedesktop.org/show_bug.cgi?id=89969 Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: split nouveau_vp3_bsp in begin/next/endJulien Isorce2016-01-054-41/+77
| | | | | | | | | | | | | It allows to call nouveau_vp3_bsp_next multiple times between one begin/end. It is required to support st/va. https://bugs.freedesktop.org/show_bug.cgi?id=89969 Signed-off-by: Julien Isorce <[email protected]> [imirkin: create strparm_bsp function, simplified w0 calculation] Signed-off-by: Ilia Mirkin <[email protected]>
* st/va: count number of slicesJulien Isorce2016-01-055-0/+25
| | | | | | | | The counter was not set but used by the nouveau driver. It is required otherwise visual output is garbage. Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Christian Koenig <[email protected]>
* i965/wm: use binding size for ubo/ssbo when automatic size is unsetIlia Mirkin2016-01-051-4/+10
| | | | | | | | | | | | | | This fixes the same tests that commit 8cf2e892f was attempting to fix: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize as confirmed by Samuel. Signed-off-by: Ilia Mirkin <[email protected]> Cc: Samuel Iglesias Gonsálvez <[email protected]> Cc: Marta Lofstedt <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* Revert "i965/wm: use proper API buffer size for the surfaces."Ilia Mirkin2016-01-054-13/+5
| | | | | | | | | | | This reverts commit 8cf2e892fca20c4776b4a07c39918343cb2d4e0e. It's entirely bogus to attempt to store anything about the binding in the buffer object itself, which might be bound any number of times. Signed-off-by: Ilia Mirkin <[email protected]> Cc: Samuel Iglesias Gonsálvez <[email protected]> Cc: Marta Lofstedt <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* st/mesa: make KHR_debug output independent of context creation flags (v2)Nicolai Hähnle2016-01-044-57/+98
| | | | | | | | | | | | | Instead, keep track of GL_DEBUG_OUTPUT and (un)install the pipe_debug_callback accordingly. Hardware drivers can still use the absence of the callback to skip more expensive operations in the normal case, and users can no longer be surprised by the need to set the debug flag at context creation time. v2: - re-add the proper initialization of debug contexts (Ilia Mirkin) - silence a potential warning (Ilia Mirkin) Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: scale up inter_bo size so that it's 16M for a 4K videoIlia Mirkin2016-01-041-2/+5
| | | | | | | | Experimentally, 4M causes corruption and slowness, try to ramp it up with size instead. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* nv50,nvc0: fix crash when increasing bsp bo size for h264Ilia Mirkin2016-01-042-4/+4
| | | | | | | | H264 doesn't have a bitplane bo. We just need a device reference, so use the one from the client. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* i965/wm: use proper API buffer size for the surfaces.Samuel Iglesias Gonsálvez2016-01-044-5/+13
| | | | | | | | | | | | | | | Commit 5bb5eeea fixes a bug indicating that the surfaces should have the API buffer size. Hovewer it picked the wrong value. This patch adds a new variable, which takes into account glBindBufferRange() values. This patch fixes the following CTS regressions: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Marta Lofstedt <[email protected]>
* radeonsi: remove unused parameter from si_shader_binary_read_configMarek Olšák2016-01-033-10/+7
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_shader_binary_upload out of si_shader_binary_readMarek Olšák2016-01-033-11/+10
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: dump LLVM module outside of radeon_llvm_compileMarek Olšák2016-01-034-9/+12
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: always add +DumpCode to the LLVM target machine for LLVM <= 3.5Marek Olšák2016-01-034-6/+5
| | | | | | | It's the same behavior that we use for later LLVM. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: r600_can_dump_shader should get TGSI processor type directlyMarek Olšák2016-01-034-15/+10
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pass TGSI processor type to si_shader_binary_read for dumpingMarek Olšák2016-01-033-4/+5
| | | | | | | the parameter will be used later Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pass TGSI processor type to si_compile_llvm for dumpingMarek Olšák2016-01-033-5/+5
| | | | | | | the parameter will be used later Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rename shader parameter definitions and variables for more clarityMarek Olšák2016-01-033-43/+43
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0/ir: add support for PK2H/UP2HIlia Mirkin2016-01-034-2/+28
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* st/mesa: use PK2H/UP2H when supportedIlia Mirkin2016-01-033-5/+14
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H supportIlia Mirkin2016-01-0316-0/+17
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi: update PK2H/UP2H channel behavior infoIlia Mirkin2016-01-031-8/+8
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: document PK2H/UP2HIlia Mirkin2016-01-031-2/+14
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: fix parameter names for tesseval/tessctrl prototypesSamuel Pitoiset2016-01-031-4/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: fix double-const qualifierIlia Mirkin2016-01-032-2/+2
| | | | | | | Reported by Tom^ on IRC. The original intent was to mark the pointer constant as well as the data being pointed to, so move the *. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: use NIR_PASS helper macrosRob Clark2016-01-031-19/+28
| | | | Signed-off-by: Rob Clark <[email protected]>
* nir: extract out helper macros for running passesRob Clark2016-01-032-36/+43
| | | | | | | | | Note these are a bit uglier, due to avoidance of GNU C extensions. But drivers which do not need to be built with compilers that don't support the extension can wrap these macros with their own. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* freedreno/ir3: we require block_index metadataRob Clark2016-01-031-0/+2
| | | | | | | | Found during NIR_TEST_CLONE=1 piglit run. We were using block->index but forgetting to require it. Causing things to not work with a cloned shader which didn't preserve block_index. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: refactor NIR IR handlingRob Clark2016-01-037-111/+202
| | | | | | | | | Immediately convert into NIR and do an initial key-agnostic lowering/ optimization pass. This should let us share most of the per-variant transformations between each variant, and hopefully minimize the draw- time variant creation part of the compilation process. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop unnecessary unreachable() caseRob Clark2016-01-031-2/+0
| | | | | | | It will still hit a compile_assert() in emit_tex, which has the advantage of dumping out the offending shader. Signed-off-by: Rob Clark <[email protected]>
* gallium/tests: fix build with clang compilerSamuel Pitoiset2016-01-031-273/+330
| | | | | | | | | | | | Nested functions are supported as an extension in GNU C, but Clang don't support them. This fixes compilation errors when (manually) building compute.c, or by setting --enable-gallium-tests to the configure script. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* nv50,nvc0: optimize coherent buffer checking at draw timeSamuel Pitoiset2016-01-036-68/+82
| | | | | | | | | | Instead of iterating over all the buffer resources looking for coherent buffers, we keep track of a context-wide count. This will save some iterations (and CPU cycles) in 99.99% case because usually coherent buffers are not so used. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>