Commit message | Author | Age | Files | Lines
* radeonsi: Add missing error-checking to si_create_compute_state (v2) | Mun Gwan-gyeong | 2016-11-21 | 1 | -1/+5
  When uploading the shader fails, si_shader_binary_upload() returns -ENOMEM. Handle that failure path in si_create_compute_state().
  CID 1394027
  v2: Fixes from Edward O'Callaghan's review
  a) Check the return value explicitly with "si_shader_binary_upload() < 0"
  b) Update commit message.
  Signed-off-by: Mun Gwan-gyeong <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
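The error check described above can be sketched as follows. This is only an illustration of testing a negative errno-style return value and bailing out; `upload_binary`, `create_state` and `struct compute_state` are hypothetical stand-ins, not the radeonsi code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an upload routine that returns 0 on success
 * or a negative errno value (here -ENOMEM) on failure. */
static int upload_binary(int simulate_failure)
{
    return simulate_failure ? -ENOMEM : 0;
}

struct compute_state {
    int uploaded;
};

/* Hypothetical create function: returns NULL if the upload fails,
 * releasing what was allocated so far. */
static struct compute_state *create_state(int simulate_failure)
{
    struct compute_state *state = calloc(1, sizeof(*state));
    if (!state)
        return NULL;

    /* Explicit "< 0" check, as suggested in the v2 review notes. */
    if (upload_binary(simulate_failure) < 0) {
        free(state);
        return NULL;
    }

    state->uploaded = 1;
    return state;
}
```

A caller would treat a NULL result as a creation failure and free a non-NULL result when done.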
* draw: drop some overflow computations | Roland Scheidegger | 2016-11-21 | 1 | -65/+46
  It turns out that no one actually cares if the address computations overflow, be it the stride mul or the offset adds. Wraparound seems to be explicitly permitted even by some other API (which is a _very_ surprising result, as these overflow computations were added just for that and made some tests pass at that time - I suspect some later fixes fixed the actual root cause...). So the requirements in that other API were actually sane all along...
  Still need to make sure the computed buffer size needed is valid, of course.
  This ditches the shiny new widening mul from these codepaths, ah well...
  And now that I really understand this, change the fishy min limiting indices to what it really should have done, which is simply to prevent fetching more values than valid for the last loop iteration. (This makes the code path in the loop minimally more complex for the non-indexed case, as we have to skip the optimization combining two adds. I think it should be safe to skip this actually there, but I don't care much about this, especially since skipping that optimization actually makes the code easier to read elsewhere.)
  Reviewed-by: Jose Fonseca <[email protected]>
* draw: simplify fetch some more | Roland Scheidegger | 2016-11-21 | 1 | -63/+55
  Don't keep the ofbit. This is just a minor simplification, just adjust the buffer size so that there will always be an overflow if buffers aren't valid to fetch from.
  Also, get rid of control flow from the instanced path too. Not worried about performance, but it's simpler and keeps the code more similar to ordinary fetch.
  Reviewed-by: Jose Fonseca <[email protected]>
* draw: unify linear and elts draw jit functions | Roland Scheidegger | 2016-11-21 | 3 | -89/+70
  The code for elts and linear paths was nearly 100% identical by now - with the elts path simply having some additional gather for the elements in the main loop (with some additional small differences before the main loop).
  Hence nuke the separate functions and decide this at jit shader execution time (simply based on the presence of the elts pointer).
  Some analysis shows that the generated vs jit functions seem to be just very minimally more complex than the former elts functions, and almost none of the additional complexity is in the main loop (basically just the branch logic for the branch fetching the actual indices). Compared to linear, the codesize of the function is of course a bit larger, however the actual executed code in the main loop appears to be near 100% identical (the additional code looking up indices is skipped as expected).
  So, I would not expect a (meaningful) performance difference with the generated code, neither with elts nor linear. This does however roughly halve the compilation time (the compiled shaders should also use only half the memory of course).
  Reviewed-by: Jose Fonseca <[email protected]>
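The run-time decision mentioned above (choosing the path based on the presence of the elts pointer) can be illustrated in plain C. This is a minimal sketch, not the generated JIT code; `fetch_index` and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pick the vertex index for loop iteration i.
 * With an index buffer (elts != NULL) the index is gathered from it;
 * otherwise the linear path simply uses start + i. */
static uint32_t fetch_index(const uint32_t *elts, uint32_t start, uint32_t i)
{
    if (elts)
        return elts[i];
    return start + i; /* unsigned arithmetic wraps on overflow */
}
```

Only the index lookup differs between the two cases; everything after it can share one code path, which is the point of unifying the functions.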
* draw: use same argument order for jit draw linear / elts functions | Roland Scheidegger | 2016-11-21 | 3 | -34/+30
  This is a bit simpler. Mostly to make it easier to unify the paths later...
  Reviewed-by: Jose Fonseca <[email protected]>
* draw: drop unnecessary index overflow handling from vsplit code | Roland Scheidegger | 2016-11-21 | 2 | -56/+28
  This was kind of strange, since it replaced indices which were only overflowing due to bias with MAX_UINT. This would cause an overflow later in the shader, except if stride was 0, however the vertex id would be essentially random then (-1 + eltBias). No test cared about it, though.
  So, drop this and just use ordinary int arithmetic wraparound as usual. This is much simpler to understand and the results are "more correct" or at least more consistent (vertex id as well as actual fetch results just correspond to wrapped around arithmetic).
  There's only one catch, it is now possible to hit the cache initialization value also with ushort and ubyte elts path (this wouldn't be an issue if we'd simply handle the eltBias itself later in the shader). Hence, we need to make sure the cache logic doesn't think this element has already been emitted when it has not (I believe some seriously bad things could happen otherwise). So, borrow the logic which handled this from the uint case, but not before fixing it up...
  Reviewed-by: Jose Fonseca <[email protected]>
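As a rough illustration of "ordinary int arithmetic wraparound", the sketch below applies a signed bias to an index with plain 32-bit modular arithmetic. `biased_index` is a hypothetical helper, and the assumption that the problematic value is the all-ones pattern is mine, not stated in the commit.

```c
#include <stdint.h>

/* Hypothetical helper: apply a signed bias to an index using ordinary
 * 32-bit wraparound arithmetic, with no special overflow value.
 * Converting a negative int32_t to uint32_t is well defined in C
 * (modulo 2^32), so the addition simply wraps. */
static uint32_t biased_index(uint32_t idx, int32_t bias)
{
    return idx + (uint32_t)bias;
}
```

With this scheme a small index such as 0 combined with a bias of -1 wraps to a very large value, which is why even ushort and ubyte index paths can now produce values that were previously out of their range.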
* draw: simplify vsplit elts code a bit | Roland Scheidegger | 2016-11-21 | 3 | -40/+18
  vsplit_get_base_idx explicitly returned idx 0 and set the ofbit in case of overflow. We'd then check the ofbit and use idx 0 instead of looking it up. This was necessary because DRAW_GET_IDX used to return DRAW_MAX_FETCH_IDX and not 0 in case of overflows.
  However, this is all unnecessary, we can just let DRAW_GET_IDX return 0 in case of overflow. In fact before bbd1e60198548a12be3405fc32dd39a87e8968ab the code already did that, not sure why this particular bit was changed (might have been one half of an attempt to get these indices to actual draw shader execution - in fact I think this would make things less awkward, it would require moving the eltBias handling to the shader as well).
  Note there are other callers of DRAW_GET_IDX - those code paths however explicitly do not handle index buffer overflows, therefore the overflow value doesn't matter for them.
  Also do some trivial simplification - for (unsigned) a + b, checking res < a is sufficient for overflow detection, we don't need to check for res < b too (similar for signed).
  And an index buffer overflow check looked bogus - eltMax is the number of elements in the index buffer, not the maximum element which can be fetched. (Drop the start check against the idx buffer though, this is already covered by end check and end < start).
  Reviewed-by: Jose Fonseca <[email protected]>
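The unsigned overflow check mentioned above can be written as a small helper. This is a sketch of the general technique with a hypothetical function name, not the code from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a + b wraps around in 32-bit unsigned arithmetic.
 * If the sum wraps, it equals a + b - 2^32, which is smaller than a
 * (because b < 2^32), so comparing against one operand is enough. */
static bool add_overflows_u32(uint32_t a, uint32_t b)
{
    uint32_t res = a + b;
    return res < a;
}
```

When no wrap occurs, `res` is at least as large as both operands, so a second comparison against `b` would never change the result.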
* gallium: Add support for SWR compilation | George Kyriazis | 2016-11-21 | 3 | -0/+12
  Include swr library and include -DHAVE_SWR in the compile line.
  v3: split to a separate commit
  Reviewed-by: Emil Velikov <[email protected]>
* gallium: swr: Added swr build for windows | George Kyriazis | 2016-11-21 | 3 | -0/+218
  v4: Add windows-specific gen_knobs.{cpp|h} changes
  v5: remove aggressive squashing of gen_knobs.py to this commit; added SConscript to EXTRA_DIST in Makefile.am
  Reviewed-by: Emil Velikov <[email protected]>
* swr: Modify gen_knobs.{cpp|h} creation script | George Kyriazis | 2016-11-21 | 2 | -26/+39
  Modify gen_knobs.py so that each invocation creates a single generated file. This is more similar to how the other generators behave.
  v5: remove SConscript edits from this commit; moved to commit that first adds SConscript
  Acked-by: Emil Velikov <[email protected]>
* scons: Add swr compile option | George Kyriazis | 2016-11-21 | 1 | -0/+1
  To build the SWR driver (currently optional, not compiled by default)
  v3: add option as opposed to target
  Reviewed-by: Emil Velikov <[email protected]>
* swr: Windows-related changes | George Kyriazis | 2016-11-21 | 2 | -7/+29
  - Handle dynamic library loading for windows
  - Implement swap for gdi
  - fix prototypes
  - update include paths on configure-based build for swr_loader.cpp
  v2: split to multiple patches
  v3: split and reshuffle some more; renamed title
  v4: move Makefile.am changes to other commit. Modify header files
  Reviewed-by: Emil Velikov <[email protected]>
* swr: renamed duplicate swr_create_screen() | George Kyriazis | 2016-11-21 | 3 | -2/+6
  There are 2 swr_create_screen() functions. One in swr_loader.cpp, which is used during driver init, and the other is hiding in swr_screen.cpp, which ends up in the arch-specific .dll/.so.
  Rename the second one to swr_create_screen_internal(), to avoid confusion in header files.
  Reviewed-by: Emil Velikov <[email protected]>
* swr: Handle windows.h and NOMINMAX | George Kyriazis | 2016-11-21 | 3 | -26/+17
  Reorder header files so that we have a chance to define NOMINMAX before mesa include files include windows.h
  v3: split from bigger patch
  Reviewed-by: Emil Velikov <[email protected]>
* gallium: Added SWR support for gdi | George Kyriazis | 2016-11-21 | 1 | -5/+23
  Added hooks for screen creation and swap. Still keep llvmpipe the default software renderer.
  v2: split from bigger patch
  v3: reword commit message
  Reviewed-by: Emil Velikov <[email protected]>
* scons: add llvm 3.9 support. | George Kyriazis | 2016-11-21 | 1 | -2/+19
  v2: reworded commit message
  Reviewed-by: Emil Velikov <[email protected]>
* scons: ignore .hpp files in parse_source_list() | George Kyriazis | 2016-11-21 | 1 | -1/+1
  Drivers that contain C++ .hpp files need to ignore them too, along with .h files, when building source file lists.
  Reviewed-by: Emil Velikov <[email protected]>
* mesa: removed redundant #else | George Kyriazis | 2016-11-21 | 1 | -1/+0
  Reviewed-by: Emil Velikov <[email protected]>
* i965/hsw: Set integer mode in sampling state for stencil texturing | Jordan Justen | 2016-11-21 | 2 | -18/+9
  Fixes:
  ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot
  ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot
  ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_pot
  ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_npot
  ES31-CTS.functional.texture.border_clamp.unused_channels.depth24_stencil8_sample_stencil
  ES31-CTS.functional.texture.border_clamp.unused_channels.depth32f_stencil8_sample_stencil
  Cc: "13.0" <[email protected]>
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* reviewers: add Rob H for the Android EGL+build parts | Emil Velikov | 2016-11-21 | 1 | -0/+2
  Signed-off-by: Emil Velikov <[email protected]>
* docs: recommend using --enable-mangling over the manual -DUSE... | Emil Velikov | 2016-11-21 | 2 | -7/+6
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs: rework/update install.html | Emil Velikov | 2016-11-21 | 1 | -40/+71
  Still far from perfect, but a few small steps in the right direction.
  - Split build systems, compilers, third party tools
  - Mention building mesa for Android (part of AOSP)
  - Drop explicit "other" dependencies. Refer to distro methods to get them.
  - HTML 4.01 Transitional compliance fixes - mixed ul and br tags.
  - nuke dead links README.{CYGWIN,VMS}
  v2: Squash typos, add note about buggy flex 2.6.2 (Eric), add SUSE zypper command (Tobias).
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs: sourcetree.html misc updates | Emil Velikov | 2016-11-21 | 1 | -9/+18
  A mixed bag of updates/fixes - mostly aiming at removing no longer applicable directories. Add a few more state-trackers, drivers, etc. alongside "XXX more" where applicable.
  Attribute for the GLSL/NIR movement and nukage of src/egl/docs.
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs: flesh out releasing.html | Emil Velikov | 2016-11-21 | 2 | -239/+499
  Properly document the whole process:
  - Brief on what, when, where
  - Picking, testing, branchpoints, pre-release announcement
  - Releasing, announcement, website and bugzilla updates
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs/submittingpatches: fix tags mis/abuse | Emil Velikov | 2016-11-21 | 1 | -1/+5
  Fix the odd tag so that we're HTML 4.01 Transitional compliant
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs/submittingpatches: flesh out "how to nominate" methods | Emil Velikov | 2016-11-21 | 1 | -10/+20
  Currently they are buried within the text, making them hard to find. Move them to the top and be clear what is _not_ a good idea.
  v2: Minor commit polish, use only "resending" as suggested by Matt.
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs/autoconf: update glx driver / enable-debug text | Emil Velikov | 2016-11-21 | 1 | -18/+13
  With an earlier commit we folded all the xlib handling in --enable-glx, but we forgot to update the documentation.
  Elaborate on --enable-debug and drop mentions about dependencies.
  v2: Grammar - s|haven't|hasn't| (Eric)
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs/repository: refer to Submitting patches | Emil Velikov | 2016-11-21 | 1 | -1/+2
  v2: Improve grammar - add missing "to" (Eric).
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs: split Submitting Patches into separate document | Emil Velikov | 2016-11-21 | 3 | -285/+310
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs: split Coding style into separate document | Emil Velikov | 2016-11-21 | 3 | -126/+145
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs: mention/suggest testing your patch against dEQP | Emil Velikov | 2016-11-21 | 1 | -2/+3
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* docs: mention that coding style can differ between drivers | Emil Velikov | 2016-11-21 | 1 | -0/+6
  ... and point people to use/honour the EditorConfig/Emacs files, where applicable.
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* reviewers: add Tomasz for the Android/EGL implementation | Emil Velikov | 2016-11-21 | 1 | -0/+4
  As mentioned/requested on the mailing list.
  Signed-off-by: Emil Velikov <[email protected]>
* mesa: fold always true conditional | Emil Velikov | 2016-11-21 | 1 | -4/+2
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* mesa: drop unneeded assert | Emil Velikov | 2016-11-21 | 1 | -1/+0
  As seen a couple of lines above - there's no way for the assert to trigger.
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* egl/wayland: remove non-applicable destroyDrawable from error path | Emil Velikov | 2016-11-21 | 1 | -3/+1
  If we fail to create the drawable there's not much point in attempting to destroy it.
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* loader: automake: whitespace cleanup | Emil Velikov | 2016-11-21 | 1 | -1/+1
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
* gbm: automake: remove unused defines | Emil Velikov | 2016-11-21 | 1 | -2/+0
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
* intel: aubinator: Fix resource leak in gen_spec_load_from_path | Gwan-gyeong Mun | 2016-11-21 | 1 | -0/+1
  This fixes a resource leak in the XML_ParserCreate failure path of gen_spec_load_from_path.
  CID 1373564
  Signed-off-by: Mun Gwan-gyeong <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
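The general shape of such a fix is to release what was already allocated before returning from the failure path. The sketch below is a generic illustration under that assumption; `spec_load`, `struct spec` and the `live_allocs` counter are hypothetical and do not mirror the aubinator code.

```c
#include <stdlib.h>

struct spec {
    void *parser;
};

/* Hypothetical loader: create_parser_fails simulates the second
 * allocation failing. live_allocs only tracks outstanding allocations
 * so the behaviour can be observed. */
static struct spec *spec_load(int create_parser_fails, int *live_allocs)
{
    struct spec *spec = calloc(1, sizeof(*spec));
    if (!spec)
        return NULL;
    (*live_allocs)++;

    spec->parser = create_parser_fails ? NULL : malloc(16);
    if (!spec->parser) {
        free(spec); /* without this, spec would leak on the failure path */
        (*live_allocs)--;
        return NULL;
    }
    (*live_allocs)++;
    return spec;
}
```

On the failure path the counter returns to zero, which is the property a leak checker such as the one behind the CID report looks for.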
* egl/android: Use gralloc::lock_ycbcr for resolving YUV formats (v2) | Tomasz Figa | 2016-11-21 | 1 | -27/+137
  There is an interface that can be used to query YUV buffers for their internal format. Specifically, if gralloc::lock_ycbcr() is given no SW usage flags, it's supposed to return plane offsets instead of pointers. Let's use this interface to implement support for YUV formats in Android EGL backend.
  v2: Fixes from Emil's review:
  a) Added comments for parts that might be not clear,
  b) Changed get_fourcc_yuv() to return -1 on failure,
  c) Changed is_yuv() to use bool.
  Signed-off-by: Tomasz Figa <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* egl/android: Get gralloc module in dri2_initialize_android() (v2) | Tomasz Figa | 2016-11-21 | 2 | -12/+19
  Currently droid_open_device() gets a reference to the gralloc module only for its own use and does not store it anywhere. To make it possible to call gralloc methods from code added in further patches, let's refactor current code to get gralloc module in dri2_initialize_android() and store it in dri2_dpy.
  v2: fixes from Emil's review:
  a) remove duplicate initialization of 'err'.
  Signed-off-by: Tomasz Figa <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* egl/android: Remove handling of RGB_888 pixel format | Tomasz Figa | 2016-11-21 | 1 | -6/+0
  It is currently completely broken, as it ends up using RGBX_8888 on hardware side, due to no way of distinguishing between these two in the DRI API, while HAL_PIXEL_FORMAT_RGB_888 is clearly defined to be the 3-byte per pixel RGB format.
  Signed-off-by: Tomasz Figa <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* radeonsi: Fix resource leak in gs_copy_shader allocation failure path | Gwan-gyeong Mun | 2016-11-22 | 1 | -1/+7
  CID 1394028
  Signed-off-by: Mun Gwan-gyeong <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl/lower_output_reads: remove unused mem_ctx | Nicolai Hähnle | 2016-11-21 | 1 | -4/+0
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* glsl/lower_output_reads: bail early in tessellation control shaders | Nicolai Hähnle | 2016-11-21 | 1 | -2/+6
  This whole pass is a no-op.
  Acked-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* glsl/lower_output_reads: fix geometry shader output handling with conditional emit | Nicolai Hähnle | 2016-11-21 | 1 | -1/+0
  Consider a geometry shader that contains code like this:
      some_out = expr;
      if (cond) {
          ...
          EmitVertex();
      } else {
          ...
          EmitVertex();
      }
  Both branches should see the correct value of some_out.
  Since this is a rather subtle and rare case, I'm submitting a piglit test for this as well.
  GLSL says that the values of output variables are undefined after EmitVertex(). With this change, the values will now be defined and unmodified. This may reduce optimization opportunities in the probably quite rare case where subsequent compiler passes cannot prove that the value of the output variable is overwritten.
  Cc: 13.0 <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: store group_size_variable in struct si_compute | Nicolai Hähnle | 2016-11-21 | 1 | -5/+8
  For compute shaders, we free the selector after the shader has been compiled, so we need to save this bit somewhere else. Also, make sure that this type of bug cannot re-appear, by NULL-ing the selector pointer after we're done with it.
  This bug has been there since the feature was added, but was only exposed in piglit arb_compute_variable_group_size-local-size by commit 9bfee7047b70cb0aa026ca9536465762f96cb2b1 (which is totally unrelated).
  Cc: 13.0 <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
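The pattern described above (copy the needed bit out of the selector, free it, then NULL the pointer) might look roughly like this. The types are simplified, hypothetical stand-ins; only the field name `group_size_variable` comes from the commit subject.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-ins for the real driver structures. */
struct selector {
    bool group_size_variable;
};

struct compute {
    struct selector *sel;     /* only valid while compiling */
    bool group_size_variable; /* copy that outlives the selector */
};

/* Save the bit we still need, free the selector, and clear the pointer
 * so that any later use fails loudly instead of reading freed memory. */
static void finish_compile(struct compute *c)
{
    c->group_size_variable = c->sel->group_size_variable;
    free(c->sel);
    c->sel = NULL;
}
```

After `finish_compile()` returns, code should read `c->group_size_variable` rather than going through `c->sel`.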
* glsl: don't flatten if-blocks with dynamic array indices | Nicolai Hähnle | 2016-11-21 | 1 | -2/+17
  This fixes the regression of radeonsi in glsl-1.10/execution/variable-indexing/vs-output-array-vec3-index-wr caused by commit 74e39de9324d2d2333cda6adca50ae2a3fc36de2.
  Acked-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* anv/state: enable coordinate address rounding for Min/Mag filters | Iago Toral Quiroga | 2016-11-21 | 1 | -6/+11
  This patch improves pass rate of dEQP-VK.texture.explicit_lod.2d.sizes.* from 68.0% (98/144) to 83.3% (120/144) by enabling sampler address rounding mode when the selected filter is not nearest, which is the same thing we do for OpenGL.
  These tests check texture filtering for various texture sizes and mipmap levels. The failures (without this patch) affect cases where the target texture has odd dimensions (like 57x35) and either the Min or the Mag filter is not nearest.
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Implement a depth stall restriction on gen7 | Jason Ekstrand | 2016-11-20 | 3 | -0/+35
  Fixes around 60 Vulkan CTS tests on Haswell
  Reviewed-by: Jordan Justen <[email protected]>
  Cc: "13.0" <[email protected]>