Commit message, Author, Age, Files, Lines
* Revert "st/nir: use NIR for asm programs"Eric Anholt2018-05-281-58/+7
| | | | | | | | | This reverts commit 5c33e8c7729edd5e16020ebb8703be96523e04f2. It broke fixed function vertex programs on vc4 and v3d, and apparently caused trouble for radeonsi's NIR paths as well. Acked-by: Timothy Arceri <[email protected]> https://bugs.freedesktop.org/show_bug.cgi?id=106673
* anv: move canonical_address calculation into a separate functionScott D Phillips2018-05-275-11/+47
| | | | | | | | | | | A later patch will make use of this in other places. Also, remove dependency on undefined behavior of left-shifting a signed value. v2: - move function into a separate header (Chris) v3: (by Ken) Add new header to the various build systems. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
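The undefined-behavior point in this message can be illustrated with a minimal sketch. It assumes 48-bit virtual addresses in which bit 47 is sign-extended into bits 48..63. The name `canonical_address_48` and the constants are hypothetical and are not taken from the anv sources.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend bit 47 into bits 48..63 using only
 * unsigned arithmetic, so no signed value is ever left-shifted. */
static inline uint64_t canonical_address_48(uint64_t addr)
{
    const uint64_t low_mask = (UINT64_C(1) << 48) - 1;  /* bits 0..47 */
    const uint64_t sign_bit = UINT64_C(1) << 47;
    uint64_t low = addr & low_mask;

    /* If bit 47 is set, fill the upper 16 bits with ones. */
    return (low & sign_bit) ? (low | ~low_mask) : low;
}
```

Keeping every intermediate value unsigned avoids both the left shift of a signed value and any reliance on how the compiler implements a signed right shift.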
* r600: Fix SSG when not all components are writtenGert Wollny2018-05-281-4/+10
| | | | | | | | | | | | | | | | | | | | | Make sure only those components are written to that are specified in the write mask. Fixes: dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_fragment Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
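As a rough illustration of what honoring the write mask means, the sketch below applies a sign operation only to the components selected by a mask. The `vec4` type, the bit layout of `write_mask` (x=1, y=2, z=4, w=8) and the function names are assumptions made for this example; they do not mirror the r600 code.

```c
/* Hypothetical 4-component register. */
typedef struct { float c[4]; } vec4;

/* Returns -1.0f, 0.0f or 1.0f depending on the sign of v. */
static float sign_of(float v)
{
    return (float)((v > 0.0f) - (v < 0.0f));
}

/* Write sign(src) into dst, touching only the components whose bit is
 * set in write_mask; all other components of dst keep their old value. */
static void ssg_masked(vec4 *dst, const vec4 *src, unsigned write_mask)
{
    for (int i = 0; i < 4; i++) {
        if (write_mask & (1u << i))
            dst->c[i] = sign_of(src->c[i]);
    }
}
```

With a mask of `0x7` (xyz), the w component of the destination is left untouched, which is the behavior the vec3 test cases listed in the message depend on.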
* r600: Correct IDIV if DST and SRC use the same temporaryGert Wollny2018-05-281-3/+49
| | | | | | | | | | | | | | | | | In cases like IDIV TEMP[0].xy TEMP[0].xx TEMP[1].yy the result will be written to the same register that is also a source register. Since the components are evaluated one by one, this may result in overwriting the source value for a later operation. Work around this by adding another temporary to store the result if the destination temporary index is equal to one of the source temporary indices. Fixes: dEQP-GLES2.functional.shaders.operator.binary_operator.div.* Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
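The aliasing hazard described here can be reproduced in a few lines of C. The sketch models `IDIV TEMP[0].xy TEMP[0].xx TEMP[1].yy` with a hypothetical `ivec2` type; both function names are invented for the example. Unlike the commit, which adds the extra temporary only when the destination index matches a source index, the safe variant below always uses one.

```c
/* Hypothetical two-component integer register. */
typedef struct { int x, y; } ivec2;

/* Naive lowering: each component is written straight into dst.
 * If dst aliases a, the second division reads the overwritten a->x. */
static void idiv_xx_yy_naive(ivec2 *dst, const ivec2 *a, const ivec2 *b)
{
    dst->x = a->x / b->y;
    dst->y = a->x / b->y;
}

/* Workaround: compute into a separate temporary, then copy the result. */
static void idiv_xx_yy_safe(ivec2 *dst, const ivec2 *a, const ivec2 *b)
{
    ivec2 tmp;
    tmp.x = a->x / b->y;
    tmp.y = a->x / b->y;
    *dst = tmp;
}
```

With `TEMP[0] = {8, 0}` and `TEMP[1] = {0, 2}`, the naive version yields `{4, 2}` when the destination is `TEMP[0]`, while the safe version yields the expected `{4, 4}`.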
* i965: Revert recent tiled memcpy changes.Kenneth Graunke2018-05-265-186/+9
| | | | | | | | | This reverts commit 79fe00efb474b3f3f0ba4c88826ff67c53a02aef. This reverts commit f5e8b13f78a085bc95a1c0895e4a38ff6b87b375. This reverts commit d21c086d819d78fb3f6abcbb14aa492970f442aa. They broke the Android build and I'd rather not leave it broken for the long holiday weekend.
* i965/miptree: Use cpu tiling/detiling when mappingScott D Phillips2018-05-251-4/+98
| | | | | | | | | | | | | | | | | | | | | | | | | Rename the (un)map_gtt functions to (un)map_map (map by returning a map) and add new functions (un)map_tiled_memcpy that return a shadow buffer populated with the intel_tiled_memcpy functions. Tiling/detiling with the cpu will be the only way to handle Yf/Ys tiling, when support is added for those formats. v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson) v3: Add units to parameter names of tile_extents (Nanley Chery) Use _mesa_align_malloc for the shadow copy (Nanley) Continue using gtt maps on gen4 (Nanley) v4: Use streaming_load_memcpy when detiling v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it takes precedence. Add intel_miptree_access_raw, needed after rebasing on commit b499b85b0f2cc0c82b7c9af91502c2814fdc8e67. Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i915: Fix streaming loads for intel_tiled_memcpyChris Wilson2018-05-251-5/+5
| | | | | | | | We stream from a tiled and aligned source into an unaligned user buffer, so we need to use _mm_storeu_si128. Fixes: d21c086d819d78fb3f6abcbb14aa492970f442aa (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear) Reviewed-by: Kenneth Graunke <[email protected]>
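A minimal sketch of the load/store pairing this fix relies on: the source is 16-byte aligned, the destination is not, so the store must be the unaligned variant. This example substitutes an ordinary aligned load (`_mm_load_si128`) for the streaming load used in the commit, and falls back to `memcpy` when SSE2 is unavailable. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy `bytes` (assumed to be a multiple of 16) from a 16-byte-aligned
 * source to a destination of arbitrary alignment. */
static void copy_aligned_to_unaligned(void *dst, const void *src, size_t bytes)
{
#if defined(__SSE2__)
    const __m128i *s = (const __m128i *)src;
    uint8_t *d = (uint8_t *)dst;
    for (size_t i = 0; i < bytes / 16; i++) {
        __m128i v = _mm_load_si128(&s[i]);             /* aligned load */
        _mm_storeu_si128((__m128i *)(d + 16 * i), v);  /* unaligned store */
    }
#else
    memcpy(dst, src, bytes);  /* portable fallback without SSE2 */
#endif
}
```

Using `_mm_store_si128` in place of `_mm_storeu_si128` here would fault whenever the user buffer is not 16-byte aligned.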
* radeonsi: remove unused variable addr_vecMarek Olšák2018-05-251-1/+1
| | | | trivial
* intel/blorp: Support blits and clears on surfaces with offsetsJason Ekstrand2018-05-255-1/+39
| | | | | | | | | | | | | For certain EGLImage cases, we represent a single slice or LOD of an image with a byte offset to a tile and X/Y intratile offsets to the given slice. Most of i965 is fine with this but it breaks blorp. This is a terrible way to represent slices of a surface in EGL and we should stop some day but that's a very scary and thorny path. This gets blorp to start working with those surfaces and fixes some dEQP EGL test bugs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]>
* radeonsi: fix passing gl_ClipVertex for GS and tessMarek Olšák2018-05-253-4/+8
| | | | | | Also add the fprintf call. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix color inputs/outputs for GS and tessMarek Olšák2018-05-253-20/+34
| | | | | | | | | | GS is tested, tessellation is untested. Have outputs_written_before_ps for HW VS and outputs_written for other stages. The reason is that COLOR and BCOLOR alias for HW VS, which drives elimination of VS outputs based on PS inputs. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix incorrect parentheses around VS-PS varying eliminationMarek Olšák2018-05-251-2/+2
| | | | | | | I don't know if it caused issues. Cc: 18.0 18.1 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: simplify lastLevel determination in st_finalize_textureMarek Olšák2018-05-251-13/+4
| | | | | | | | | | | | | This fixes shader images where we always bind stObj->pt and not individual gl_texture_images. Roughly based on i965 commit 845ad2667ab2466752f06ea30bdb9c837116c308 which does a similar thing but for a different reason. This fixes GL CTS assertion failures introduced by Ilia. Cc: 18.0 18.1 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* i965/tiled_memcpy: inline movntdqa loads in tiled_to_linearScott D Phillips2018-05-254-5/+88
| | | | | | | | | | | | | | | | | | | | The reference for MOVNTDQA says: For WC memory type, the nontemporal hint may be implemented by loading a temporary internal buffer with the equivalent of an aligned cache line without filling this data to the cache. [...] Subsequent MOVNTDQA reads to unread portions of the WC cache line will receive data from the temporary internal buffer if data is available. This hidden cache line sized temporary buffer can improve the read performance from wc maps. v2: Add mfence at start of tiled_to_linear for streaming loads (Chris) Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Matt Turner <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* swr/rast: Adjusted avx512 primitive assembly for msvc codegenAlok Hota2018-05-251-49/+90
| | | | | | | | | Optimize AVX-512 PA Assemble (PA_STATE_OPT). This reduces the generated code by about 4x; the MSVC compiler was creating temporaries and split-loading inputs onto the stack unless explicit AVX-512 load ops were added. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Moved memory init out of core swr initAlok Hota2018-05-257-7/+86
| | | | | | | | Added two new files for a wrapper function for initialization v2: added missing include for single architecture builds Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Removed superfluous JitManager argument from passesAlok Hota2018-05-256-14/+13
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Renamed MetaData callsAlok Hota2018-05-252-87/+87
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Use metadata to communicate between passesAlok Hota2018-05-251-0/+28
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Check gCoreBuckets/CORE_BUCKETS equal length at compile timeAlok Hota2018-05-251-0/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
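A compile-time length check of this kind might look like the sketch below. It uses C11 `_Static_assert`; the swr sources are C++, where `static_assert` would be the equivalent. The enum, the array and the `ARRAY_LEN` macro are hypothetical stand-ins for `CORE_BUCKETS` and `gCoreBuckets`.

```c
#include <stddef.h>

/* Hypothetical bucket IDs; the last enumerator doubles as the count. */
enum { BUCKET_A, BUCKET_B, BUCKET_C, NUM_BUCKETS };

/* Hypothetical table that must stay in sync with the enum above. */
static const char *const bucket_names[] = { "a", "b", "c" };

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Fails the build if an enum value is added without a table entry. */
_Static_assert(ARRAY_LEN(bucket_names) == NUM_BUCKETS,
               "bucket_names must have one entry per bucket enum value");

/* Number of entries in the table, for callers that iterate over it. */
static size_t bucket_count(void)
{
    return ARRAY_LEN(bucket_names);
}
```

The check costs nothing at run time and turns a mismatch between the two lists into a build error rather than an out-of-bounds read.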
* swr/rast: Added in-place building to SCATTERPSAlok Hota2018-05-251-9/+20
| | | | | | | SCATTERPS previously assumed it was being used with an existing basic block Reviewed-by: Bruce Cherniak <[email protected]>
* radv: run the EarlyCSEMemSSA LLVM passSamuel Pitoiset2018-05-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It's recommended by the instruction combining pass, and RadeonSI also runs it. This pass used to segfault with one shader of F12017 in the past, but it no longer crashes. Maybe the LLVM IR generated by RADV has changed. Polaris10: Totals from affected shaders: SGPRS: 441352 -> 441648 (0.07 %) VGPRS: 310888 -> 300784 (-3.25 %) Spilled SGPRs: 13576 -> 12983 (-4.37 %) Code Size: 22560328 -> 22420544 (-0.62 %) bytes Max Waves: 40755 -> 41366 (1.50 %) Vega10: Totals from affected shaders: SGPRS: 442848 -> 442000 (-0.19 %) VGPRS: 310396 -> 300460 (-3.20 %) Spilled SGPRs: 13708 -> 12906 (-5.85 %) Code Size: 22479428 -> 22336216 (-0.64 %) bytes Max Waves: 45783 -> 46506 (1.58 %) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix dumping compute shader on the graphics queueSamuel Pitoiset2018-05-251-5/+8
| | | | | | | The graphics pipeline can be NULL. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_dump_pipeline_state() helperSamuel Pitoiset2018-05-251-6/+11
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: rework how shaders are dumped when generating a hang reportSamuel Pitoiset2018-05-251-26/+15
| | | | | | | Use a flag for the active stages instead. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused parameter in radv_dump_annotated_shader()Samuel Pitoiset2018-05-251-8/+6
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* mesa: do not leak ctx->Shader.ReferencedProgram referencesJose Dapena Paz2018-05-251-0/+3
| | | | | | | | | | | | When glUseProgram is used, references to the included shaders are added in ctx->Shader.ReferencedProgram. But those references are not decreased when the shader data is deallocated. Thus, those shaders are leaked. Explicitly remove the pending references to these shaders. Fixes: e6506b3cd23 ("mesa: retain gl_shader_programs after glDeleteProgram if they are in use") Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: set DB_EQAA.MAX_ANCHOR_SAMPLES correctlyMarek Olšák2018-05-241-4/+12
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: round ps_iter_samples in set_min_samplesMarek Olšák2018-05-242-3/+5
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove redundant ps_iter_samples clampMarek Olšák2018-05-241-1/+0
| | | | | | | si_get_ps_iter_samples already does this. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove some old gfx 9.x registersMarek Olšák2018-05-241-48/+0
| | | | | | | Leftover from bring up. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable primitive binning for all blitter opsMarek Olšák2018-05-243-2/+12
| | | | | | | same as amdvlk. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface/gfx6: don't overallocate mipmapped HTILEMarek Olšák2018-05-241-2/+11
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* egl/x11: deduplicate depth-to-format logicEric Engestrom2018-05-243-33/+26
| | | | | | Suggested-by: Emil Velikov <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* i965: enable OES_texture_view for gen8+Tapani Pälli2018-05-241-1/+2
| | | | | Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: changes to expose OES_texture_view extensionTapani Pälli2018-05-246-6/+32
| | | | | | | | | | | Functionality already covered by ARB_texture_view, patch also adds missing 'gles guard' for enums (added in f1563e6392). Tested via arb_texture_view.*_gles3 tests and individual app utilizing texture view with ETC2. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* docs: update release calendar for 18.1 seriesJuan A. Suarez Romero2018-05-241-26/+19
| | | | | | | | | | | v2: extend 18.1 series (Andres) v3: fix copy/paste typo (Engestrom) CC: Andres Gomez <[email protected]> CC: Emil Velikov <[email protected]> CC: Dylan Baker <[email protected]> Reviewed-by: Andres Gomez <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* radv: call nir_lower_io_to_temporaries for VS, GS, TES and FSSamuel Pitoiset2018-05-241-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not lower FS inputs because this moves all load_var instructions at beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might be eventually enabled later on if we really want to preload all FS inputs at beginning. Polaris10: Totals from affected shaders: SGPRS: 54072 -> 54264 (0.36 %) VGPRS: 38580 -> 38124 (-1.18 %) Spilled SGPRs: 652 -> 652 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2128116 -> 2127380 (-0.03 %) bytes Max Waves: 8048 -> 8086 (0.47 %) Vega10: Totals from affected shaders: SGPRS: 52616 -> 52656 (0.08 %) VGPRS: 37536 -> 37116 (-1.12 %) Spilled SGPRs: 828 -> 828 (0.00 %) Code Size: 2043756 -> 2042672 (-0.05 %) bytes Max Waves: 9176 -> 9254 (0.85 %) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: call nir_split_var_copies() before nir_lower_var_copies()Samuel Pitoiset2018-05-241-0/+3
| | | | | | | | This does nothing special currently because we don't create any copy_var instructions, but this is needed for the next patch. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.Francisco Jerez2018-05-231-3/+5
| | | | | | | | | | | | | | | Instead of directly using intel_obj->buffer. Among other things intel_bufferobj_buffer() will update intel_buffer_object:: gpu_active_start/end, which are used by glBufferSubData() to decide which path to take. Fixes a failure in the Piglit ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests, which could be reproduced with a non-standard glGetTexSubImage implementation (see bug report). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351 Reported-by: Nanley Chery <[email protected]> Cc: [email protected] Reviewed-by: Nanley Chery <[email protected]>
* i965: Handle non-zero texture buffer offsets in buffer object range calculation.Francisco Jerez2018-05-231-1/+3
| | | | | | | | | | Otherwise the specified surface state will allow the GPU to access memory up to BufferOffset bytes past the end of the buffer. Found by inspection. v2: Protect against out-of-range BufferOffset (Nanley). Cc: [email protected] Reviewed-by: Nanley Chery <[email protected]>
* i965: Move buffer texture size calculation into a common helper function.Francisco Jerez2018-05-231-23/+32
| | | | | | | | | | | | | The buffer texture size calculations (should be easy enough, right?) are repeated in three different places, each of them subtly broken in a different way. E.g. the image load/store path was never fixed to clamp to MaxTextureBufferSize, and none of them are taking into account the buffer offset correctly. It's easier to fix it all in one place. Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481 Reviewed-by: Nanley Chery <[email protected]>
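The kind of shared helper this message argues for could be sketched as follows. It is not the i965 function: the name, the parameters and the convention of passing `SIZE_MAX` for "rest of the buffer" are assumptions made for the example. It covers the two issues named above, the buffer offset and the clamp to a maximum texel count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of texels addressable through a buffer
 * texture. `requested_size` is the bound range in bytes, or SIZE_MAX to
 * mean "everything from `offset` to the end of the buffer". */
static uint32_t buffer_texture_texels(size_t buffer_size, size_t offset,
                                      size_t requested_size,
                                      uint32_t texel_size, uint32_t max_texels)
{
    /* An out-of-range offset leaves nothing to access. */
    if (texel_size == 0 || offset >= buffer_size)
        return 0;

    /* Never expose bytes past the end of the buffer. */
    size_t avail = buffer_size - offset;
    size_t bytes = requested_size < avail ? requested_size : avail;

    /* Round down to whole texels, then clamp to the implementation limit. */
    size_t texels = bytes / texel_size;
    return texels < max_texels ? (uint32_t)texels : max_texels;
}
```

Routing every caller through one function like this means an offset or clamping fix only has to be made once.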
* Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"Francisco Jerez2018-05-231-13/+12
| | | | | | | | | | | | | | | | | | This reverts commit c0ed52f6146c7e24e1275451773bd47c1eda3145. It was preventing the image format validation from being done on buffer textures, which is required to ensure that the application doesn't attempt to bind a buffer texture with an internal format incompatible with the image unit format (e.g. of different texel size), which is not allowed by the spec (it's not allowed for *any* texture target, whether or not there is spec wording restricting this behavior specifically for buffer textures) and will cause the driver to calculate texel bounds incorrectly and potentially crash instead of the expected behavior. Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465 Reviewed-by: Nanley Chery <[email protected]>
* ac: Use DPP for build_ddxy where possible.Bas Nieuwenhuizen2018-05-231-1/+15
| | | | | | | | | | | | WQM is pretty reliable now on LLVM 7, so let us just use DPP + WQM. This gives approximately a 1.5% performance increase on the vrcompositor built-in benchmark. v2: Use ac_build_quad_swizzle. Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: add {X,A}BGR2101010 to 'intel_image_formats'Miguel Casas2018-05-231-0/+6
| | | | | | | | | This patch adds {X,A}BGR2101010 entries to the list of supported 'intel_image_formats'. Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format.Miguel Casas2018-05-231-0/+8
| | | | | | | | | Add R10G10B10{A,X}2 translation between mesa_format and DRI format to driGLFormatToImageFormat() and driImageFormatToGLFormat(). Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* bin/get-pick-list.sh: force git --pretty=mediumDylan Baker2018-05-231-1/+1
| | | | | | Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Andres Gomez <[email protected]>
* bin/bugzilla_mesa.sh: explicitly set the --pretty argumentDylan Baker2018-05-231-1/+1
| | | | | | Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Andres Gomez <[email protected]>
* docs: drop unnecessary out-of-frame targetEric Engestrom2018-05-231-12/+9
| | | | | | | | | I'm guessing an earlier version of the website used to have the page contents in <frames>, but this isn't the case anymore so just drop the unnecessary `target="_main"` :) Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* docs: fix various html tags mistakesEric Engestrom2018-05-233-1/+4
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Ian Romanick <[email protected]>