summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* meson: fix tizonia compilationDylan Baker2018-03-071-0/+1
| | | | | | | | | | It needs to have src/egl in it's includes as well. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Jon Turney <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Julien Isorce <[email protected]> Tested-by: Karol Herbst <[email protected]>
* meson: combine state trackers and target if blocksDylan Baker2018-03-071-21/+7
| | | | | | | | | | This is needed later since tizonia requires dri Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Jon Turney <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Julien Isorce <[email protected]> Tested-by: Karol Herbst <[email protected]>
* draw: fix line stippling with aa linesRoland Scheidegger2018-03-071-4/+13
| | | | | | | | | | | | | | In contrast to non-aa, where stippling is based on either dx or dy (depending on if it's a x or y major line), stippling is based on actual distance with smooth lines, so adjust for this. (It looks like there's some minor artifacts with mesa demos line-sample and stippling, it looks like the line endpoints aren't quite right with aa + stippling - maybe due to the integer math in the stipple stage, but I can't quite pinpoint it.) Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: simplify (and correct) aaline fallback (v2)Roland Scheidegger2018-03-071-409/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The motivation actually was to get rid of the additional tex instruction, since that requires the draw fallback code to intercept all sampler / view calls (even if the fallback is never hit). Basically, the idea is to use coverage of the pixel to calculate the alpha value, and coverage is simply based on the distance to the center of the line (in both line direction, which is useful for wide lines, as well as perpendicular to the line). This is much closer to what hw supporting this natively actually does. It also fixes an issue with line width not quite being correct, as well as endpoints getting stretched too far (in line direction) with wide lines, which is apparent with mesa demo line-sample. (For llvmpipe, it would probably make sense to do something like this directly when drawing lines, since rendering two tris is twice as expensive as a line, but it would need some changes with state management.) Since we're no longer relying on mipmapping to get the alpha value, we also don't need to draw 3 rects (6 tris), one is sufficient. There's still issues (as before): - quite sure it's not correct without half_pixel_center, but can't test this with GL. - aaline + line stipple is incorrect (evident with line-sample demo). Looking at the spec the stipple pattern should actually be based on distance (not just dx or dy for x/y major lines as without aa). - outputs (other than pos + the one used for line aa) should be reinterpolated since we actually increase line length by half a pixel (but there's no tests which would care). v2: simplify the math (should be equivalent), don't need immediate v3: use float versions of atan2,cos,sin, minor cleanups Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* radeonsi: remove si_llvm_add_attributeMarek Olšák2018-03-073-25/+16
|
* radeonsi: fix passing address32_hi to LLVM for high valuesMarek Olšák2018-03-071-2/+5
| | | | The old function treats high values as negative, which LLVM interprets as 0.
* radeonsi: assume has_virtual_memory == trueMarek Olšák2018-03-072-34/+18
|
* radeonsi: add/update assertions for 32-bit address spaceMarek Olšák2018-03-072-3/+18
|
* radeonsi: prevent a negative buffer offset in si_upload_descriptorsMarek Olšák2018-03-071-4/+3
|
* radeonsi: properly extract a buffer address from a descriptorMarek Olšák2018-03-071-1/+7
|
* radeonsi: fix vertex buffer address computation with full 64-bit addressesMarek Olšák2018-03-071-3/+3
|
* radeonsi: mask out high VM address bits in registers where neededMarek Olšák2018-03-073-22/+24
|
* st/omx/tizonia/h264d: Add EGLImage supportGurkirpal Singh2018-03-067-6/+227
| | | | | | | | Example Gstreamer pipeline : MESA_ENABLE_OMX_EGLIMAGE=1 GST_GL_API=gles2 GST_GL_PLATFORM=egl gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! glimagesink Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/tizonia: Add H.264 encoderGurkirpal Singh2018-03-0618-403/+2106
| | | | | | | | | | v2: Refactor out screen functions to st/omx Example Gstreamer pipeline : gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videoconvert ! omxh264enc ! h264parse ! avdec_h264 ! videoconvert ! ximagesink Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/tizonia: Add H.264 decoderGurkirpal Singh2018-03-0624-1286/+2793
| | | | | | | | | | v2: Refactor out screen functions to st/omx Example Gstreamer pipeline : gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! videoconvert ! ximagesink Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/tizonia: Add entrypointGurkirpal Singh2018-03-064-2/+78
| | | | | | | Adds base files for adding components Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/tizonia: Add --enable-omx-tizonia flag and build filesGurkirpal Singh2018-03-067-3/+75
| | | | | | | | | | | | Allow only bellagio or tizonia to be used at the same time. Detect tizonia package config file Generate libomx_mesa.so and install it to libtizcore.pc::pluginsdir Only compile empty source (target.c) for now. GSoC Project link: https://summerofcode.withgoogle.com/projects/#4737166321123328 Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/bellagio: Rename st and target directoriesGurkirpal Singh2018-03-0620-35/+83
| | | | | | | | | | | | | | | | | | | v2: Refactor out screen functions to st/omx Allows to keep all the code under st/omx (st/omx/tizonia and st/omx/bellagio). Reverts targets/omx_bellagio to omx as additions to existing files is enough to compile for both bellagio and tizonia. * autotools changes: --enable-omx -> --enable-omx-bellagio * meson changes: -Dgallium-omx=false -> -Dgallium-omx=disabled -Dgallium-omx=true -> -Dgallium-omx=bellagio Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* ac: add ac_count_scratch_private_memory()Samuel Pitoiset2018-03-061-28/+4
| | | | | | | Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128Roland Scheidegger2018-03-061-1/+1
| | | | | | | Some state trackers require 128. (There are no plans to increase PIPE_MAX_SAMPLERS too, since with gl state tracker it's unlikely more than 32 will be needed, if you need more use bindless.)
* tgsi/scan: use wrap-around shift behavior explicitly for file_maskRoland Scheidegger2018-03-063-4/+12
| | | | | | | | | | | | | | The comment said it will only represent the lowest 32 regs. This was not entirely true in practice, since at least on x86 you'll get masked shifts (unless the compiler could recognize it already and toss it out). It turns out this actually works out alright (presumably noone uses it for temp regs) when increasing max sampler views, so make that behavior explicit. Albeit it feels a bit hacky (but in any case, explicit behavior there is better than undefined behavior). Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* clover: Allow overriding platform/device version numbersAaron Watry2018-03-052-5/+14
| | | | | | | | | | | | | | | Useful for testing API, builtin library, and device completeness of not-yet-supported versions. Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> (v3) Reviewed-by: Emil Velikov <[email protected]> Cc: Jan Vesely <[email protected]> v4: Remove redundant std::string wrapper around debug_get_option calls v3: mark CL version overrides as static and const v2: Make version_string in platform const in case
* clover/llvm: Pass device down to compileAaron Watry2018-03-051-4/+3
| | | | | | | | | | | | We'll need to be able to detect device version to define the appropriate __OPENCL_VERSION__ header. v2: Rebase after removing the previous patch (Pierre) - Removed "clover: Add device_clc_version to llvm::create_compiler_instance" Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* clover: Pass device to llvm::create_compiler_instanceAaron Watry2018-03-051-4/+5
| | | | | | | | | | | | | | | | | We'll be using dev.device_clc_version to select the default language version soon along with the existing ir_target field. Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Jan Vesely <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> v4: Pass the device down instead of device_clc_version as a separate field v3: Revise to acknowledge that we now have the device in compile/link_program instead of the string values. v2: (Pierre) Move changes to create_compiler_instance invocation to correct patch to prevent temporary build breakage. (Jan) Use device_clc_version instead of device_version for compile/link
* clover/llvm: Use device in llvm compilation instead of copying fieldsAaron Watry2018-03-053-17/+15
| | | | | | | | | | | | | | | Copying the individual fields from the device when compiling/linking will lead to an unnecessarily large number of fields getting passed around. v3: Rebase on current master v2: Use device in function args before making additional changes in following patches Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Jan Vesely <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* radeonsi/nir: fix handling of doubles for gs inputsTimothy Arceri2018-03-061-2/+6
| | | | | | | Fixes piglit test: tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move si_nir_load_input_gs() to si_shader.cTimothy Arceri2018-03-063-29/+20
| | | | | | | | All the tess shader and tgsi equivalents are here and it allows use to use llvm_type_is_64bit() in the following patch without exposing it externally. Reviewed-by: Dave Airlie <[email protected]>
* broadcom/vc4: Add support for HW perfmonBoris Brezillon2018-03-055-12/+249
| | | | | | | | | The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension. Signed-off-by: Boris Brezillon <[email protected]> Signed-off-by: Eric Anholt <[email protected]>
* r600: fix color export maskRoland Scheidegger2018-03-051-0/+1
| | | | | | | | | | | The r600 code (not the eg one) forgot to copy the ps_color_export_mask in commit 5b14e06d8b42e2b08ebc52b6c314ef8647d87a1f when updating the pixel state, leading to misrenderings (probably with MRT). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262 Tested-by: LoneVVolf <[email protected]> Tested-by: Pavel Vinogradov <[email protected]>
* gallium/aux/hud: Avoid possible buffer overflowGert Wollny2018-03-051-2/+6
| | | | | | | | | | Limit the length of acceptable cpu names for use in hud_get_num_cpufreq in order to avoid a buffer overflow later in add_object when this name is copied into cpufreq_info::name. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274 Signed-off-by: Gert Wollny <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* freedreno/ir3: start dealing with half-precisionRob Clark2018-03-053-30/+81
| | | | | | | | | | | | | | | | | Some instructions, assume src and/or dst is half-precision based on a type field (ie. f32/s32/u32 are full precision but others are half precision). So add some code to sanity check the src/dst registers to catch mixups. Also propagate half-precision flag for SSA sources. The instruction consuming a SSA value needs to be of the same type as the one producing it. This is probably not complete half-precision support, but a useful first step. We do still need to add support for nir alu instructions for converting between half/full precision. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix fixing-up register footprintRob Clark2018-03-052-18/+27
| | | | | | | | | | | | | It isn't just vertex shaders that need to fixup reg footprint for inputs populated before shader starts. This problem showed up with compute shaders. If you have (for example) a localregid sysval, but only the .x component is used, the hw still writes the .yz components, which could overflow into other threads causing corruption. Showed up in cl cts 'basic/test_basic intmath_int'. But in theory the same problem could crop up elsewhere. Signed-off-by: Rob Clark <[email protected]>
* freedreno: surfaces can be PIPE_BUFFERRob Clark2018-03-051-4/+10
| | | | | | At least for clover. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: handle compute resourcesRob Clark2018-03-051-2/+4
| | | | | | Not *entirely* sure why this is a different BIND bit, but it is. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: ignore return jumpRob Clark2018-03-051-0/+1
| | | | | | | I think this should also always only occur at the end of a BB (by definition), and the BB successor should be the end block. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add some more compute capsRob Clark2018-03-052-4/+21
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: don't expose 64b pointers yetRob Clark2018-03-051-2/+5
| | | | | | | Temporary hack, but since we can't do 64b math yet in ir3, pretend that we don't support 64b pointers. Signed-off-by: Rob Clark <[email protected]>
* freedreno: steal handy macro for compute caps from nouveauRob Clark2018-03-051-42/+17
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add global_bindings stateRob Clark2018-03-054-4/+85
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: small cleanupRob Clark2018-03-051-3/+3
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add pctx->memory_barrier()Rob Clark2018-03-051-0/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cmdline compiler updates for spv shadersRob Clark2018-03-051-0/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* ac: add ac_build_fsign()Samuel Pitoiset2018-03-051-11/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac: add ac_build_isign()Samuel Pitoiset2018-03-051-8/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac: add ac_build_fract()Samuel Pitoiset2018-03-051-8/+5
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* virgl: add offset alignment values to to v2 caps struct[email protected]2018-03-053-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | glBindBufferRange(..) in vrend_draw_bind_ubo is failing with more than one uniform block. This is due to improper alignment of the start of the second block. Let's query the proper alignment from the driver and pass it back to Mesa. Let's query for the texture alignment too, even though the Virgl renderer doesn't call glTexBufferRange yet. The default values are the widest workable range possible (for example, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256). Fixes: dEQP-GLES3.functional.ubo.* on Nvidia Example test: dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex Note: This is based on "virgl: reduce some default capset limits.", which hasn't landed in Mesa yet but should relatively soon. Signed-off-by: Dave Airlie <[email protected]>
* virgl: reduce some default capset limits.Dave Airlie2018-03-051-8/+8
| | | | | | | | | Since v2 might take a while to rollout, we should reduce these inside some gathered minimums and then v2 can increase them using host values. Reviewed-by: Stéphane Marchesin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* virgl: handle getting new capsets.Dave Airlie2018-03-056-31/+52
| | | | | | | | This checks the kernel api is new enough and asks for the larger caps size since the kernel won't mess it up now. Reviewed-by: Stéphane Marchesin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/nir: call ac_lower_indirect_derefs()Timothy Arceri2018-03-054-4/+6
| | | | | | | | Fixes piglit tests: tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add chip class to compiler_ctx_stateTimothy Arceri2018-03-053-0/+4
| | | | | | This will be used in the following patch. Reviewed-by: Bas Nieuwenhuizen <[email protected]>