path: root/src
Commit message | Author | Age | Files | Lines
...
* gallivm: implement accurate corner behavior for textureGather with cube maps | Roland Scheidegger | 2017-12-14 | 1 | -103/+201

    The spec says the missing texel (when we wrap around both x and y axis) should be synthesized as the average of the 3 other texels. For bilinear filtering however we instead adjusted the filter weights (because, while the complexity looks similar, there would be 4 times as many color values to fix up than weights). Obviously this could not work for gather (hence accurate corner filtering was disabled with gather).

    Implement this by just doing it as the spec implies - calculate the 4th texel as the average of the other 3. With gather of course there's only one color to worry about, so it's not all that many instructions either (albeit surely the whole cube map filtering is hilariously complex).

    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
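The corner rule from the textureGather commit above can be illustrated in scalar C. This is only a sketch of the arithmetic, not the gallivm code, which emits vectorized LLVM IR; the function name `synthesize_corner_texel` and the array layout are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a Mesa function): given the four gathered values
 * of a single channel and the index of the texel that has no source because
 * the footprint wrapped around both x and y, replace that texel with the
 * average of the other three, as the commit message describes.
 */
static void synthesize_corner_texel(float texels[4], size_t missing)
{
    float sum = 0.0f;

    for (size_t i = 0; i < 4; i++) {
        if (i != missing)
            sum += texels[i];
    }
    texels[missing] = sum / 3.0f;
}
```

Because gather returns one channel per texel, a single fix-up like this per corner is enough, which matches the commit's remark that only one color needs attention.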
* gallivm: fix an issue with NaNs with seamless cube filtering | Roland Scheidegger | 2017-12-14 | 1 | -0/+11

    Cube texture wrapping is a bit special since the values (post face projection) always are within [0,1], so we took advantage of that and omitted some clamps. However, we can still get NaNs (either because the coords already had NaNs, or the face projection generated them), and in fact we didn't handle them quite safely. I've seen -INT_MAX + 1 being propagated through as the final int coord value, albeit I didn't observe a crash. (Not quite a coincidence, since any stride mul with -INT_MAX or -INT_MAX+1 will turn up as a small positive number - nevertheless, I'd rather not try my luck, I'm not entirely sure it can't really turn up negative either due to seamless coord swapping, plus ifloor of a NaN is not guaranteed to return -INT_MAX by any standard. And we kill off NaNs similarly with ordinary texture wrapping too.)

    So kill off the NaNs by using the common max against zero method.

    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* intel/tools: Convert aubinator over to the common framework | Jason Ekstrand | 2017-12-14 | 3 | -690/+33

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch-decoder: Decode registers | Jason Ekstrand | 2017-12-14 | 1 | -0/+13

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch-decoder: Decode dynamic state | Jason Ekstrand | 2017-12-14 | 1 | -0/+81

    Unfortunately, in aubinator and aubinator_error_decode we don't always know how many of a given state we have, so we must guess. One day, we'll come up with a way to annotate the batch to solve this problem.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch-decoder: Decode constants, binding tables, and samplers | Jason Ekstrand | 2017-12-14 | 1 | -0/+73

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools: Switch aubinator_error_decode over to the gen_print_batch | Jason Ekstrand | 2017-12-14 | 3 | -205/+37

    The shared framework can now do everything that aubinator_error_decode ever did and more. It's time to make the switch.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch-decoder: Decode graphics shaders | Jason Ekstrand | 2017-12-14 | 1 | -0/+95

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch-decoder: Decode vertex and index buffers | Jason Ekstrand | 2017-12-14 | 2 | -0/+161

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch-decoder: Decode MEDIA_INTERFACE_DESCRIPTOR_LOAD | Jason Ekstrand | 2017-12-14 | 1 | -0/+145

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools: Add the start of a generic batch decoder | Jason Ekstrand | 2017-12-14 | 2 | -0/+306

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Expose the raw field value in the iterator | Jason Ekstrand | 2017-12-14 | 2 | -1/+3

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/disasm: Take a devinfo in gen_disasm_create | Jason Ekstrand | 2017-12-14 | 4 | -8/+7

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Take a bit offset in gen_print_group | Jason Ekstrand | 2017-12-14 | 5 | -23/+27

    Previously, if a group was nested in another group such that it didn't start on a dword boundary, we would decode it as if it started at the start of its first dword. This changes things to work even more in terms of bits so that we can properly decode these structs. This affects MOCS, attribute swizzles, and several other things.

    Reviewed-by: Lionel Landwerlin <[email protected]>
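To make the dword-versus-bit distinction above concrete, here is a minimal sketch of reading a field at an absolute bit offset from an array of dwords. The function `read_bits` is hypothetical and is not the decoder's API; it assumes fields of 1 to 32 bits and that the array has a following dword whenever a field straddles a boundary.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read `width` bits (1..32) starting at absolute bit
 * offset `start_bit`. A nested group that begins mid-dword is handled by
 * adding the group's bit offset to each field's offset, instead of rounding
 * the group start down to its first dword.
 */
static uint32_t read_bits(const uint32_t *dwords, uint32_t start_bit,
                          uint32_t width)
{
    uint32_t index = start_bit / 32;
    uint32_t shift = start_bit % 32;
    uint64_t window = dwords[index];

    /* Pull in the next dword only when the field crosses the boundary. */
    if (shift + width > 32)
        window |= (uint64_t)dwords[index + 1] << 32;

    uint64_t mask = (UINT64_C(1) << width) - 1;
    return (uint32_t)((window >> shift) & mask);
}
```

For example, a group placed at bit 16 whose first field is 8 bits wide would be read with `read_bits(dwords, 16 + 0, 8)`; decoding it as if the group began at bit 0 would return the wrong byte.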
* intel/decoder: Stop rounding down to the nearest dword | Jason Ekstrand | 2017-12-14 | 1 | -11/+12

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Convert the iterator to work entirely in bits | Jason Ekstrand | 2017-12-14 | 2 | -12/+9

    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Drop gen_field_decode helper | Jason Ekstrand | 2017-12-14 | 2 | -11/+0

    It's unused.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* amd/common: add ac_build_waitcnt() | Samuel Pitoiset | 2017-12-14 | 6 | -27/+17

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: more use of i32_1 | Samuel Pitoiset | 2017-12-14 | 1 | -4/+4

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: more use of i32_0 | Samuel Pitoiset | 2017-12-14 | 1 | -9/+9

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: make use of ac_build_fdiv() | Samuel Pitoiset | 2017-12-14 | 2 | -7/+2

    And move the comment to amd/common.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: export SampleMask from pixel shaders at full rate | Samuel Pitoiset | 2017-12-14 | 2 | -16/+41

    Use 16_ABGR instead of 32_ABGR if Z isn't written.

    Ported from RadeonSI.

    No CTS regressions on Polaris.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: make use of ac_get_spi_shader_z_format() | Samuel Pitoiset | 2017-12-14 | 3 | -23/+4

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: add ac_get_spi_shader_z_format() | Samuel Pitoiset | 2017-12-14 | 4 | -1/+84

    ac_shader_util.c will contain shader helpers for RadeonSI and RADV.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not load the local invocation index when it's unused | Samuel Pitoiset | 2017-12-14 | 4 | -2/+7

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components | Samuel Pitoiset | 2017-12-14 | 1 | -3/+8

    We should also not load the input SGPRs and VGPRs, but let's start with this for now.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: scan which components of gl_LocalInvocationID are used | Samuel Pitoiset | 2017-12-14 | 2 | -1/+7

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: scan which components of gl_WorkGroupID are used | Samuel Pitoiset | 2017-12-14 | 2 | -0/+9

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set FORCE_SIMD_DIST(1) for compute when profitable | Samuel Pitoiset | 2017-12-14 | 1 | -0/+14

    Ported from RadeonSI.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: calculate best compute resource limits | Samuel Pitoiset | 2017-12-14 | 1 | -1/+14

    Ported from RadeonSI.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store the dispatch initiator into the device | Samuel Pitoiset | 2017-12-14 | 3 | -11/+12

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: replace grid_components_used by uses_grid_size | Samuel Pitoiset | 2017-12-14 | 3 | -5/+6

    Use a boolean instead because the number of needed SGPRs is always 3.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always emit all compute block components | Samuel Pitoiset | 2017-12-14 | 2 | -13/+11

    The number of grid components is always 3 when gl_NumWorkGroups is declared, because it relies on the number of components of nir_intrinsic_load_num_work_groups.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* egl/android: Provide an option for the backend to expose KHR_image | Harish Krupo | 2017-12-14 | 3 | -1/+4

    From Android CTS 8.0_r4, a new test case checks if all the required EGL extensions are exposed. In the current implementation we expose KHR_image if KHR_image_base and KHR_image_pixmap are supported, but the KHR_image spec does not mandate the existence of both the extensions. This patch preserves the current check and also provides the backend with an option to expose the KHR_image extension.

    Test: run cts -m CtsOpenGLTestCases -t \
        android.opengl.cts.OpenGlEsVersionTest#testRequiredEglExtensions

    Signed-off-by: Harish Krupo <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
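The resulting decision for advertising KHR_image, as the commit message describes it, might be sketched as follows. The struct and field names here are hypothetical stand-ins for illustration, not the actual EGL data structures.

```c
#include <stdbool.h>

/* Hypothetical flags; the real driver tracks extensions differently. */
struct egl_extensions_sketch {
    bool khr_image_base;
    bool khr_image_pixmap;
    bool khr_image;        /* assumed backend opt-in added by the patch */
};

/*
 * Advertise KHR_image when the backend asks for it explicitly, or when the
 * pre-existing condition (both base and pixmap supported) still holds.
 */
static bool should_advertise_khr_image(const struct egl_extensions_sketch *ext)
{
    return ext->khr_image ||
           (ext->khr_image_base && ext->khr_image_pixmap);
}
```

A backend without pixmap support, such as the Android case the commit targets, can then set the opt-in flag and still pass the required-extensions check.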
* radv: Don't advertise VK_EXT_debug_report. | Bas Nieuwenhuizen | 2017-12-14 | 1 | -1/+0

    We never supported it. Missed during copy and pasting.

    Fixes: 17201a2eb0b "radv: port to using updated anv entrypoint/extension generator."
    Reviewed-by: Samuel Pitoiset <[email protected]>
* i965: Don't allocate an MCS for 16x MSAA and width > 8192. | Kenneth Graunke | 2017-12-14 | 1 | -0/+4

    The hardware doesn't support this, and isl_surf_get_mcs_surf will fail. I feel a bit bad replicating this logic, but we want to decide up front.

    This fixes the following test when run with --deqp-surface-width=16384:
    - GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_framebuffers_different_sample_count

    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
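The up-front restriction described in the i965 commit above amounts to a small predicate. The sketch below uses a hypothetical function name and covers only the single case named in the commit subject, not every condition the driver checks before allocating an MCS.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: returns false for the one combination the commit
 * says the hardware does not support (16x MSAA with a width above 8192).
 */
static bool mcs_size_supported(uint32_t num_samples, uint32_t width_px)
{
    if (num_samples == 16 && width_px > 8192)
        return false;
    return true;
}
```

Deciding this before calling into the surface layout code lets the driver choose a fallback early rather than react to a failed allocation.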
* Android: fix missing generation of vtn_gather_types.c | Rob Herring | 2017-12-13 | 1 | -0/+5

    Commit bb1e6ff161c9 ("spirv: Add a prepass to set types on vtn_values") added generation of vtn_gather_types.c, but forgot to add it to the Android build files.

    Fixes: bb1e6ff161c9 ("spirv: Add a prepass to set types on vtn_values")
    Reviewed-by: Jason Ekstrand <[email protected]>
    Signed-off-by: Rob Herring <[email protected]>
* mesa: Add glSpecializeShaderARB to common_desktop_functions | Dylan Baker | 2017-12-13 | 1 | -0/+3

    CC: Nicolai Hähnle <[email protected]>
    CC: Mark Janes <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104231
    Fixes: 46b21b8f906 ("mesa: add GL_ARB_gl_spirv boilerplate")
    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* egl/android: Partially handle HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED | Tomasz Figa | 2017-12-13 | 1 | -2/+39

    There is no API available to properly query the IMPLEMENTATION_DEFINED format. As a workaround we rely here on gralloc allocating either an arbitrary YCbCr 4:2:0 or RGBX_8888, with the latter being recognized by lock_ycbcr failing.

    Reviewed-on: https://chromium-review.googlesource.com/566793
    Signed-off-by: Tomasz Figa <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
    Signed-off-by: Robert Foss <[email protected]>
    Signed-off-by: Rob Herring <[email protected]>
* swr: Correct texture allocation and limit max size to 2GB | Bruce Cherniak | 2017-12-13 | 2 | -4/+10

    This patch fixes piglit tex3d-maxsize by correcting 4 things:

    The total_size calculation was using 32-bit math, therefore a >4GB allocation request overflowed and was not returning false (unsupported).

    Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle >4GB allocations.

    Added error checking on texture allocations to fail gracefully.

    Finally, temporarily decreased supported max texture size from 4GB to 2GB. The gallivm texture-sampler needs some additional work to correctly handle larger than 2GB textures (offsets to LLVMBuildGEP are signed).

    I'm working on a follow-on patch to allow up to 4GB textures, as this is useful in HPC visualization applications.

    Fixes piglit tex3d-maxsize.

    v2: Updated patch description to clarify ">4GB".

    Reviewed-By: George Kyriazis <[email protected]>
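The first item in the swr commit above, a size computed in 32-bit math, is easy to reproduce in isolation. The sketch below is not the swr code: the function names, the parameter types, and the inclusive 2GB limit are assumptions chosen so that the 64-bit product cannot itself overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit for this sketch: 2GB, taken from the commit subject. */
#define SKETCH_MAX_TEXTURE_BYTES (UINT64_C(2) << 30)

/* 32-bit math: a request above 4GB silently wraps around. */
static uint32_t total_size_32(uint16_t width, uint16_t height, uint16_t depth,
                              uint8_t bytes_per_texel)
{
    return (uint32_t)width * height * depth * bytes_per_texel;
}

/*
 * 64-bit math: with 16-bit dimensions and an 8-bit texel size the product
 * is below 2^56, so it always fits and can be compared against the limit.
 */
static bool texture_size_ok(uint16_t width, uint16_t height, uint16_t depth,
                            uint8_t bytes_per_texel, uint64_t *total_size)
{
    uint64_t total = (uint64_t)width * height * depth * bytes_per_texel;

    *total_size = total;
    return total <= SKETCH_MAX_TEXTURE_BYTES;
}
```

A 2048x2048x2048 texture with 4 bytes per texel needs 2^35 bytes; in 32-bit math that wraps to 0 and looks like a trivially small request, while the 64-bit version reports it as unsupported.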
* swr: Fix KNOB_MAX_WORKER_THREADS thread creation override. | Bruce Cherniak | 2017-12-13 | 1 | -2/+1

    Environment variable KNOB_MAX_WORKER_THREADS allows the user to override default thread creation and thread binding.

    A previous commit that adjusted Linux CPU topology caused setting this KNOB to bind all threads to a single core.

    This patch restores correct functionality of the override.

    Cc: <[email protected]>
    Reviewed-by: Tim Rowley <[email protected]>
* meson: fix glx-test race | Dylan Baker | 2017-12-13 | 1 | -1/+1

    This test should rely on dispatch.h being generated, but it doesn't.

    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* gallium/docs: document behavior of set_sample_mask() | Brian Paul | 2017-12-13 | 1 | -1/+4

    The sample mask is used even if MSAA is not explicitly enabled when we have a framebuffer with multisampled surfaces. That's DX behavior and what the Radeon drivers do. Not sure about other drivers at this point.

    Reviewed-by: Roland Scheidegger <[email protected]>
* glsl: trivial whitespace fixes in link_varyings.cpp | Brian Paul | 2017-12-13 | 1 | -2/+2
* program: Don't reset SamplersValidated when restoring from shader cache | Jordan Justen | 2017-12-13 | 1 | -7/+9

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* mesa: remove second include of errors.h in src/mesa/main/glspirv.c | Kai Wasserbäch | 2017-12-12 | 1 | -4/+0

    Fixes: 5bc03d2508 ("mesa: implement SPIR-V loading in glShaderBinary")
    Signed-off-by: Kai Wasserbäch <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* radeonsi: create get_tcs_tes_buffer_address helper | Timothy Arceri | 2017-12-13 | 1 | -12/+32

    This will be shared between the NIR and TGSI backends.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: fix nir_op_f2f64 | Timothy Arceri | 2017-12-13 | 1 | -0/+1

    Without this we get the error "FPExt only operates on FP" when converting the following:

       vec1 32 ssa_5 = b2f ssa_4
       vec1 64 ssa_6 = f2f64 ssa_5

    Which results in:

       %44 = and i32 %43, 1065353216
       %45 = fpext i32 %44 to double

    With this patch we now get:

       %44 = and i32 %43, 1065353216
       %45 = bitcast i32 %44 to float
       %46 = fpext float %45 to double

    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
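The fixed instruction sequence in the ac commit above has a direct C analogue, assuming 32-bit IEEE-754 floats. The function name is hypothetical; `memcpy` plays the role of the `bitcast`, and the cast to `double` plays the role of `fpext`.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical scalar equivalent of b2f followed by f2f64. The boolean is
 * assumed to be an all-ones or all-zeros 32-bit mask.
 */
static double b2f_then_f2f64(uint32_t bool_mask)
{
    /* and i32 %x, 1065353216: select the bit pattern of 1.0f or 0.0f. */
    uint32_t bits = bool_mask & 0x3f800000u;
    float as_float;

    /* bitcast i32 to float: reinterpret the bits, do not convert the value. */
    memcpy(&as_float, &bits, sizeof(as_float));

    /* fpext float to double: widen the floating-point value. */
    return (double)as_float;
}
```

Skipping the reinterpretation step is what the original IR did: it tried to extend an integer-typed value, which LLVM rejects because `fpext` requires a floating-point operand.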
* nir: fix shift for uint64_t | Timothy Arceri | 2017-12-13 | 1 | -2/+2

    Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_nir: skip forced array splitting for tcs | Timothy Arceri | 2017-12-13 | 1 | -1/+2

    nir_lower_io_to_temporaries() does not support tcs so we cannot assume there are no indirects here. Also the radeonsi backend (the only backend to support tess) has support for tcs indirects so there is no need to lower them anyway.

    Reviewed-by: Nicolai Hähnle <[email protected]>