summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nir/builder: Add a helper for storing to variable derefsJason Ekstrand2016-03-281-0/+16
| | | | Reviewed-by: Rob Clark <[email protected]>
* nir/builder: Add a helper for building fdot instructionsJason Ekstrand2016-03-281-0/+17
| | | | Reviewed-by: Rob Clark <[email protected]>
* nir: Add a variable_foreach_safe helperJason Ekstrand2016-03-281-0/+3
| | | | Reviewed-by: Rob Clark <[email protected]>
* nir/Makefile: Fix alphabetizationJason Ekstrand2016-03-282-6/+6
| | | | Reviewed-by: Rob Clark <[email protected]>
* mesa: add OES_texture_buffer and EXT_texture_buffer supportIlia Mirkin2016-03-2810-47/+94
| | | | | | | | Allow ES 3.1 contexts to access the texture buffer functionality. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: add OES_texture_buffer and EXT_texture_buffer supportIlia Mirkin2016-03-286-24/+46
| | | | | | | | | Expose the samplerBuffer/imageBuffer types, and allow the various functions to operate on them. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: add OES_texture_buffer and EXT_texture_buffer extension to tableIlia Mirkin2016-03-282-0/+3
| | | | | | | | | | | We need to add a new bit since the GL ES exts require functionality from a combination of texture buffer extensions as well as images (for imageBuffer) support. Additionally, not all GPUs support all the texture buffer functionality (e.g. rgb32 isn't supported by nv50). Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: properly return GetTexLevelParameter queries for buffer texturesIlia Mirkin2016-03-281-2/+52
| | | | | | | | | | | | | This fixes all failures with dEQP tests in this area. While ARB_texture_buffer_object explicitly says that GetTexLevelParameter & co should not be supported, GL 3.1 reverses this decision and allows all of these queries there. Conversely, there is no text that forbids the buffer-specific queries from being used with non-buffer images. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* glsl: Delete initialized field from uniform storage test.Kenneth Graunke2016-03-281-19/+0
| | | | | | | Timothy deleted this field. Fixes "make check". Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa: remove initialized field from uniform storageTimothy Arceri2016-03-296-53/+1
| | | | | | | | The only place this was used was in a gallium debug function that had to be manually enabled. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* nvc0: use a different offset for buffers and surfacesSamuel Pitoiset2016-03-294-28/+74
| | | | | | | | | | To not overwrite buffers and surfaces information, we need to use a different offset in the driver constant buffer. Currently, OP_SUQ is only supported for buffers but this will be slightly updated for images support. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* i965: Set address rounding bits for GL_NEAREST filtering as well.Kenneth Graunke2016-03-281-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yuanhan Liu decided these were useful for linear filtering in commit 76669381 (circa 2011). Prior to that, we never set them; it seems he tried to preserve that behavior for nearest filtering. It turns out they're useful for nearest filtering, too: setting these fixes the following dEQP-GLES3 tests: functional.fbo.blit.rect.nearest_consistency_mag functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_y functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_min functional.fbo.blit.rect.nearest_consistency_min_reverse_src_x functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_y Apparently, BLORP has always set these bits unconditionally. However, setting them unconditionally appears to regress tests using texture projection, 3D samplers, integer formats, and vertex shaders, all in combination, such as: functional.shaders.texture_functions.textureprojlod.isampler3d_vertex Setting them on Gen4-5 appears to regress Piglit's tests/spec/arb_sampler_objects/framebufferblit. Honestly, it looks like the real problem here is a lack of precision. I'm just hacking around problems here (as embarassing as it is). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Always use BRW_TEXCOORDMODE_CUBE when seamless filtering.Kenneth Graunke2016-03-281-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | When using seamless cube map mode and NEAREST filtering, we explicitly overrode the wrap modes to CLAMP_TO_EDGE. This was to implement the following spec text: "If NEAREST filtering is done within a miplevel, always apply apply wrap mode CLAMP_TO_EDGE." However, textureGather() ignores the sampler's filtering mode, and instead returns the four pixels that would be blended by LINEAR filtering. This implies that we should do proper seamless filtering, and include pixels from adjacent cube faces. It turns out that we can simply delete the NEAREST -> CLAMP_TO_EDGE overrides. Normal cube map sampling works by first selecting the face, and then nearest filtering fetches the closest texel. If the nearest texel was on a different face, then that face would have been chosen. So it should always be within the face anyway, which effectively performs CLAMP_TO_EDGE. Fixes 86 dEQP-GLES31.texture.gather.basic.cube.* tests. Signed-off-by: Kenneth Graunke <[email protected]> Suggested-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs.Kenneth Graunke2016-03-282-3/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our driver uses the brw_render_cache mechanism to track buffers we've rendered to and are about to sample from. Previously, we did a single PIPE_CONTROL with the following bits set: - Render Target Flush - Depth Cache Flush - Texture Cache Invalidate - VF Cache Invalidate - Instruction Cache Invalidate - CS Stall This combined both "top of pipe" invalidations and "bottom of pipe" flushes, which isn't how the hardware is intended to be programmed. The "top of pipe" invalidations may happen right away, without any guarantees that rendering using those caches has completed. That rendering may continue altering the caches. The "bottom of pipe" flushes do wait for the rendering to complete. The CS stall also prevents further work from happening until data is flushed out. What we wanted to do was wait for rendering complete, flush the new data out of the render and depth caches, wait, then invalidate any stale data in read-only caches. We can accomplish this by doing the "bottom of pipe" flushes with a CS stall, then the "top of pipe" flushes as a second PIPE_CONTROL. The flushes will wait until the rendering is complete, and the CS stall will prevent the second PIPE_CONTROL with the invalidations from executing until the first is done. Fixes dEQP-GLES3.functional.texture.specification.teximage2d_pbo subtests on Braswell and Skylake. These tests hit the meta PBO texture upload path, which binds the PBO as a texture and samples from it, while rendering to the destination texture. The tests then sample from the texture. For now, we leave Gen4-5 alone. It probably needs work too, but apparently it hasn't even been setting the (G45+) TC invalidation bit at all... v2: Add Sandybridge post-sync non-zero workaround, for safety. Cc: [email protected] Suggested-by: Francisco Jerez <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Whack UAV bit when FS discards and there are no color writes.Kenneth Graunke2016-03-281-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no framebuffer attachments, using a shader that discards based on gl_FragCoord. It uses occlusion queries to inspect whether pixels are rendered or not. Unfortunately, the hardware is not dispatching any pixel shaders, so discards never happen, and the full quad of pixels increments PS_DEPTH_COUNT, making the occlusion query results bogus. To understand why, we have to delve into the WM_INT internal signalling mechanism's formulas. The "WM_INT::Pixel Shader Kill Pixel" signal is defined as: 3DSTATE_WM::ForceKillPixel == ON || (3DSTATE_WM::ForceKillPixel != Off && !WM_INT::WM_HZ_OP && 3DSTATE_WM::EDSC_Mode != PREPS && (WM_INT::Depth Write Enable || WM_INT::Stencil Write Enable) && ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (3DSTATE_PS_EXTRA::PixelShaderKillsPixels || 3DSTATE_PS_EXTRA:: oMask Present to RenderTarget || 3DSTATE_PS_BLEND::AlphaToCoverageEnable || 3DSTATE_PS_BLEND::AlphaTestEnable || 3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable)) Because there is no depth or stencil buffer, writes to those buffers are disabled. So the highlighted condition is false, making the whole "Kill Pixel" condition false. This then feeds into the following "WM_INT::ThreadDispatchEnable" condition: 3DSTATE_WM::ForceThreadDispatch != OFF && !WM_INT::WM_HZ_OP && 3DSTATE_PS_EXTRA::PixelShaderValid && (3DSTATE_PS_EXTRA::PixelShaderHasUAV || WM_INT::Pixel Shader Kill Pixel || WM_INT::RTIndependentRasterizationEnable || (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT && 3DSTATE_PS_BLEND::HasWriteableRT) || (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF && (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable)) || (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) || (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable || WM_INT::Stencil Test Enable))) Given that there's no depth/stencil testing, no writeable render target, and the hardware thinks kill pixel doesn't happen, all of these conditions are false. We have to whack some bit to make PS invocations happen. There are many options. Curro suggested using the UAV bit. There's some precedence in doing that - we set it for fragment shaders that do SSBO/image/atomic writes when no color buffer writes are enabled. We can simply include discard here too. Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests. v2: Add a comment suggested and written by Jason Ekstrand. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* vc4: Remove unused include from vc4_nir_lower_txf_ms.cRhys Kidd2016-03-281-1/+0
| | | | | | | | Found with grep and inspection. Test compiled on RPi hw. Assists any future effort to remove TGSI as an intermediate stage. Signed-off-by: Rhys Kidd <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* glapi/glx: Treat xserver generated targets as .PHONYAdam Jackson2016-03-281-0/+2
| | | | | | | | Meaning, always rebuild them when asked instead of bothering to look at timestamps (and then wondering why nothing happened when you said make). Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Adam Jackson <[email protected]>
* glapi/glx: Thunk non-ABI calls through GetProcAddressAdam Jackson2016-03-281-2/+8
| | | | | Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Adam Jackson <[email protected]>
* glapi/glx: Emit direct GL calls instead of dispatch lookupAdam Jackson2016-03-282-34/+10
| | | | | Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Adam Jackson <[email protected]>
* glx: Unbreak generating some of the xorg glx headersAdam Jackson2016-03-281-2/+2
| | | | | | | | | | | | | | | | Broken by: commit 9ace0b542241c77ae82a0835ac8a09e2a7510eaf Author: Dylan Baker <[email protected]> Date: Wed May 20 15:49:11 2015 -0700 glapi: glX_proto_size.py: use argparse instead of getopt Which changed most, but not all, callers to use --header-tag instead of -h. Reviewed-by: Dylan Baker <[email protected]> Signed-off-by: Adam Jackson <[email protected]>
* mesa/st: Fix NULL access if no fragment shader is boundBas Nieuwenhuizen2016-03-281-2/+2
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* freedreno/ir3: fix for load_front_face intrinsicRob Clark2016-03-281-1/+8
| | | | | | | Seems like trying to widen in the same instruction as the add.s does a non-sign-extending widen. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix compiler warnRob Clark2016-03-281-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* nvc0: make sure to disable fetches from previously-set VBOs when blittingIlia Mirkin2016-03-281-0/+2
| | | | | | | We disable the vertex attributes, but also disable the VBO fetch details as well, just in case. Not known to fix anything. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: disable primitive restart and index bias during blitsIlia Mirkin2016-03-281-0/+11
| | | | | | | | | | | | | | | | Back in the dawn of time, we used to do immediate uploads for the vertex data, and all was well. However Maxwell dropped support for immediate vertex data, so we started feeding in a VBO (in all cases). But we forgot to disable some things that apply in such cases, specifically primitive restart and index bias. The latter was causing WoW and other Blizzard games trouble as they use a pattern where they draw with a base vertex (aka index bias), followed by texture uploads (aka blits, internally). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526 Cc: "11.1 11.2" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Karol Herbst <[email protected]>
* nvc0/ir: fix picking of coordinates from tex instruction for textureGradIlia Mirkin2016-03-281-1/+11
| | | | | | | | | | | On Fermi, there's an argument in front of the coords that combines array and indirect handle, while on Kepler the array and the indirect handle are separate (and in front of the coords). We were previously only accounting for the array bit of it, if there were an indirect access it wouldn't be counted in the formula. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50/ir: saturate depth writesIlia Mirkin2016-03-281-1/+4
| | | | | | | | | | | Apparently there's no post-FS clamping logic, so we have to do this by hand. The depth will never be outside of the 0..1 range, even on floating point zeta buffers, so this should be safe. Fixes dEQP-GLES3.functional.fbo.depth.*clamp.* which tests writing invalid values on various zeta buffer formats. Signed-off-by: Ilia Mirkin <[email protected]>
* gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2)Marek Olšák2016-03-281-5/+5
| | | | | | v2: move the nr_cbufs check above the loop Reviewed-by: Ilia Mirkin <[email protected]> (v1)
* st/mesa: only minify height if target != 1D array in st_finalize_textureMarek Olšák2016-03-281-1/+6
| | | | | | | | | | | The st_texture_object documentation says: "the number of 1D array layers will be in height0" We can't minify that. Spotted by luck. No app is known to hit this issue. Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: optimize out the realloc from glCopyTexImagexD()Miklós Máté2016-03-271-0/+36
| | | | | | | | | | | v2: comment about the purpose of the code v3: also compare texFormat, add a perf debug message, formatting fixes Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* st/mesa: fix handling the fallback textureMiklós Máté2016-03-271-3/+4
| | | | | | | | | | This fixes crash when post-processing is enabled in SW:KotOR. v2: fix const-ness v3: move assignment into the if() block Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* st/mesa: enable GL_ATI_fragment_shaderMiklós Máté2016-03-272-0/+2
| | | | | Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* st/mesa: implement GL_ATI_fragment_shaderMiklós Máté2016-03-2710-4/+1064
| | | | | | | | | | | | | v2: fix arithmetic for special opcodes, fix fog state, cleanup v3: simplify handling of special opcodes, fix rebinding with different textargets or fog equation, lots of formatting fixes v4: adapt to the compile early, fix later architecture, formatting fixes Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* program: add ATI_fragment_shader to shader stages listMiklós Máté2016-03-271-0/+2
| | | | | Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* mesa: optionally associate a gl_program to ATI_fragment_shaderMiklós Máté2016-03-275-2/+34
| | | | | | | | | the state tracker will use it Acked-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium/p_context.h: Make comment more readableEdward O'Callaghan2016-03-271-1/+1
| | | | | Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* mesa/st: Remove GLSLVersion clampingEdward O'Callaghan2016-03-271-10/+5
| | | | | | | While here, remove itermediate glsl_feature_level variable. Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeon/r600: Fix return type in failure branchEdward O'Callaghan2016-03-271-1/+1
| | | | | | | | Commit `d4e847ea` introduced a warning about making an integer from a pointer without a cast, fix it here. Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeon/r600_query.c: Minor style fixEdward O'Callaghan2016-03-271-1/+1
| | | | | | Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* virgl: drop next shader property for now.Dave Airlie2016-03-261-0/+1
| | | | Signed-off-by: Dave Airlie <[email protected]>
* glsl: reduce buffer block duplicationTimothy Arceri2016-03-266-82/+57
| | | | | | | | | | | | | This reduces some of the craziness required for handling buffer blocks. The problem is each shader stage holds its own information about a block in memory, we were copying that information to a program wide list but the per stage information remained meaning when a binding was updated we needed to update all versions of it. This changes the per stage blocks to instead point to a single version of the block information in the program list. Acked-by: Kenneth Graunke <[email protected]>
* st/xa: emit sampler view declarations in shadersBrian Paul2016-03-251-0/+19
| | | | | | | Fixes recent regressions with the VMware gallium driver. Reviewed-by: Charmaine Lee <[email protected]> Tested-by: Charmaine Lee <[email protected]>
* swr: [rasterizer jitter] Fix MASKLOADD AVX prototype (float -> i32)Tim Rowley2016-03-251-1/+1
|
* swr: [rasterizer core] NUMA optimizations...Tim Rowley2016-03-255-65/+105
| | | | | - Affinitize hot-tile memory to specific NUMA nodes. - Only do BE work for macrotiles assoicated with the numa node
* swr: [rasterizer jitter] Fix logic bug for alpha-to-coverage.Tim Rowley2016-03-251-2/+11
|
* swr: [rasterizer core] Fix Compute workitem retirementTim Rowley2016-03-254-31/+22
|
* swr: [rasterizer core] Cleanup state ring arena after last draw that ↵Tim Rowley2016-03-253-2/+14
| | | | | | references it completes Rather than waiting for the API thread to re-use it.
* swr: [rasterizer jitter] add missing include for llvm jiteventsTim Rowley2016-03-251-0/+4
|
* swr: [rasterizer core] Reduce Arena blocksize to 128KB (from 1MB).Tim Rowley2016-03-251-3/+7
| | | | | With global allocator this doesn't seem to affect performance at all. Overall memory consumption drops by up to 85%.
* swr: [rasterizer core] One last pass at Arena optimizationsTim Rowley2016-03-251-15/+15
|