Commit message (Author, Age, Files, Lines)
* ddebug: implement create_batch_query (Marek Olšák, 2016-07-26, 1 file, -0/+27)
| Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: don't use abort() (Marek Olšák, 2016-07-26, 1 file, -1/+1)
| We don't want a core dump.
| Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: make dd_get_file_stream accept the screen only (Marek Olšák, 2016-07-26, 1 file, -7/+8)
| Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: clean up ddebug_screen_create (Marek Olšák, 2016-07-26, 1 file, -16/+23)
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: rework flags for pipe_context::dump_debug_state (Marek Olšák, 2016-07-26, 4 files, -17/+34)
| The pipelined hang detection mode will not want to dump everything (doing so is also time-consuming). It will only dump shaders after a draw call, and then dump the status registers separately if a hang is detected.
| Reviewed-by: Nicolai Hähnle <[email protected]>
* vc4: add hash table look-up for exported dmabufs (Rob Herring, 2016-07-26, 4 files, -3/+56)
| It is necessary to reuse existing BOs when dmabufs are imported. Two cases need to be handled: a dmabuf can be created/exported and imported by the same process, and a dmabuf can be imported multiple times. Following other drivers, add a hash table to track exported BOs so the BOs get reused.
| v2: Whitespace fixup (by anholt)
| Signed-off-by: Rob Herring <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
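The BO-reuse idea in the vc4 commit above can be sketched as a small handle-to-BO table. This is an illustration under stated assumptions, not the driver's code: `demo_bo`, `bo_table`, `bo_table_track` and `bo_table_lookup` are hypothetical names, the table has a fixed capacity, and entries are never removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BO_TABLE_SIZE 64u /* hypothetical fixed capacity, must be a power of two */

struct demo_bo {
    uint32_t handle; /* stand-in for a kernel buffer handle */
    int refcount;
};

struct bo_table {
    struct demo_bo *slots[BO_TABLE_SIZE];
};

/* Linear probing: hash the handle, then step one slot per probe. */
static unsigned slot_for(uint32_t handle, unsigned probe)
{
    return (handle * 2654435761u + probe) & (BO_TABLE_SIZE - 1);
}

/* Remember an exported BO. Returns 0 on success, -1 if the table is full. */
static int bo_table_track(struct bo_table *t, struct demo_bo *bo)
{
    assert(bo != NULL);
    for (unsigned i = 0; i < BO_TABLE_SIZE; i++) {
        unsigned s = slot_for(bo->handle, i);
        if (t->slots[s] == NULL || t->slots[s] == bo) {
            t->slots[s] = bo;
            return 0;
        }
    }
    return -1;
}

/* On import: return the already-known BO with an extra reference, or NULL
 * if the handle has not been seen and a new BO would have to be created. */
static struct demo_bo *bo_table_lookup(struct bo_table *t, uint32_t handle)
{
    for (unsigned i = 0; i < BO_TABLE_SIZE; i++) {
        struct demo_bo *bo = t->slots[slot_for(handle, i)];
        if (bo == NULL)
            return NULL;
        if (bo->handle == handle) {
            bo->refcount++;
            return bo;
        }
    }
    return NULL;
}
```

An import path would call `bo_table_lookup()` first and only allocate a new BO when it returns NULL, which covers both the same-process export/import case and repeated imports.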
* vc4: Disable early Z with computed depth. (Eric Anholt, 2016-07-26, 3 files, -2/+11)
| We don't tell the hardware whether we're computing depth, so we need to manage early Z state manually.
| Fixes piglit early-z.
* ttn: Update shader->info as we generate code. (Eric Anholt, 2016-07-26, 1 file, -0/+13)
| We could use the nir_shader_gather_info() pass to update it after the fact, but this is what glsl_to_nir and prog_to_nir do.
| Reviewed-by: Rob Clark <[email protected]>
* mesa: standardize naming Mesa3D, MESA -> Mesa (Vedran Miletić, 2016-07-26, 4 files, -5/+5)
| Signed-off-by: Vedran Miletić <[email protected]>
| Reviewed-by: Edward O'Callaghan <[email protected]>
* mesa: Make MESA_SHADER_CAPTURE_PATH skip shaders with Name == -1. (Kenneth Graunke, 2016-07-26, 1 file, -1/+1)
| Shaders with shProg->Name == ~0 (aka 4294967295) are internal meta shaders that we don't really want to capture.
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
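The title says -1 while the message says ~0 and 4294967295; for a 32-bit unsigned name these are the same value. A minimal sketch, assuming the name is a 32-bit unsigned integer (`is_internal_shader_name` is a hypothetical helper, not a Mesa function):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true for the reserved all-bits-set name. */
static int is_internal_shader_name(uint32_t name)
{
    /* Converting -1 or ~0 to a 32-bit unsigned type sets every bit. */
    assert((uint32_t)-1 == (uint32_t)~0);
    return name == UINT32_MAX; /* 4294967295 */
}
```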
* mesa: Use AC_HEADER_MAJOR to include correct header for major(). (Matt Turner, 2016-07-26, 4 files, -4/+18)
| Gentoo has been smoke testing an upcoming change to glibc.
| Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=580392
* glsl: Remove references to tail_pred. (Matt Turner, 2016-07-26, 1 file, -9/+9)
* glx: Avoid aliasing violations. (Matt Turner, 2016-07-26, 2 files, -24/+25)
| Compilers are perfectly capable of generating efficient code for calls like these to memcpy().
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
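The pattern behind the aliasing fixes in this series can be illustrated as follows. This is a generic sketch, not the code changed by the commit; `float_bits` and `bits_to_float` are hypothetical names. Reading a float through a `uint32_t *` cast violates the strict aliasing rules, while copying the bytes with memcpy() is well defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the object representation of a float as a 32-bit integer. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u); /* instead of *(uint32_t *)&f */
    return u;
}

/* Inverse of float_bits(). */
static float bits_to_float(uint32_t u)
{
    float f;
    assert(sizeof f == sizeof u);
    memcpy(&f, &u, sizeof f);
    return f;
}
```

With optimization enabled, compilers typically turn such a fixed-size memcpy() into a plain register move, which is the point the commit message makes.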
* mesa: Avoid aliasing violation in uniform_query.cpp. (Matt Turner, 2016-07-26, 1 file, -14/+31)
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* mesa: Avoid aliasing violation in FXT1. (Matt Turner, 2016-07-26, 1 file, -2/+2)
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* swrast: Avoid aliasing violation. (Matt Turner, 2016-07-26, 1 file, -2/+2)
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: Avoid aliasing violations. (Matt Turner, 2016-07-26, 2 files, -10/+4)
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: Separate overlapping sentinel nodes in exec_list. (Matt Turner, 2016-07-26, 25 files, -137/+165)
| I do appreciate the cleverness, but unfortunately it prevents a lot more cleverness in the form of additional compiler optimizations brought on by -fstrict-aliasing.
| No difference in OglBatch7 (n=20).
| Co-authored-by: Davin McCall <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
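A doubly linked list with two separate sentinel nodes, as opposed to sentinels whose storage overlaps, might look like the sketch below. The names (`demo_node`, `demo_list` and the helpers) are hypothetical and the layout is an assumption for illustration, not a copy of exec_list.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
    struct demo_node *next;
    struct demo_node *prev;
};

struct demo_list {
    struct demo_node head_sentinel; /* its prev is always NULL */
    struct demo_node tail_sentinel; /* its next is always NULL */
};

static void demo_list_init(struct demo_list *l)
{
    l->head_sentinel.next = &l->tail_sentinel;
    l->head_sentinel.prev = NULL;
    l->tail_sentinel.next = NULL;
    l->tail_sentinel.prev = &l->head_sentinel;
}

static int demo_list_is_empty(const struct demo_list *l)
{
    return l->head_sentinel.next == &l->tail_sentinel;
}

/* Link n in just before the tail sentinel. */
static void demo_list_push_tail(struct demo_list *l, struct demo_node *n)
{
    n->next = &l->tail_sentinel;
    n->prev = l->tail_sentinel.prev;
    n->prev->next = n;
    l->tail_sentinel.prev = n;
}

/* Count real nodes; the walk stops at the tail sentinel (next == NULL). */
static size_t demo_list_length(const struct demo_list *l)
{
    size_t count = 0;
    for (const struct demo_node *n = l->head_sentinel.next; n->next != NULL; n = n->next)
        count++;
    return count;
}
```

Each sentinel is a complete node of its own, so no pointer field is ever accessed through two differently laid-out views, at the cost of one extra pointer per list.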
* i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations (Jason Ekstrand, 2016-07-26, 1 file, -17/+2)
| intel_mipmap_tree::logical_depth0 is now in number of 2D slices so we no longer need to be multiplying by 6.
| Signed-off-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
| Cc: "12.0" <[email protected]>
* i965/miptree/isl: Stop multiplying depth by 6 for cubes (Jason Ekstrand, 2016-07-26, 1 file, -5/+0)
| Now that the logical_depth0 field is in number of 2D slices, we don't need to be multiplying by 6 when creating the surface. It wasn't hurting anything, primarily because we get the actual length from the view, which was already handling it correctly.
| Signed-off-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* i965/blorp/gen8: Stop multiplying depth by 6 for cubes (Jason Ekstrand, 2016-07-26, 1 file, -4/+1)
| intel_mipmap_tree::logical_depth0 is now in 2-D slices so there is no need for us to multiply by 6 when we go to fill out a blorp surface state.
| Signed-off-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* nvc0: use nvc0_m2mf_push_linear() to reduce code duplication (Samuel Pitoiset, 2016-07-26, 1 file, -13/+3)
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: use nve4_p2mf_push_linear() to reduce code duplication (Samuel Pitoiset, 2016-07-26, 1 file, -36/+9)
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* build: Remove unused AX_CHECK_COMPILE_FLAG macro (Andreas Boll, 2016-07-25, 1 file, -72/+0)
| Unused since 1a6ae840413d7fb6d2e83f6a83081d5246c7ac9e
| Signed-off-by: Andreas Boll <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
* main: memcpy larger chunks in _mesa_propagate_uniforms_to_driver_storage (Nils Wallménius, 2016-07-25, 1 file, -6/+23)
| When possible, do the memcpy on larger blocks. This reduces cycles spent in _mesa_propagate_uniforms_to_driver_storage from 1.51% to 0.62% according to perf during the Unigine Heaven benchmark. It did not affect the framerate of the benchmark. The system used for testing was an i5 6600K with a Radeon R9 380. Piglit hangs randomly on this system both with and without the patch, so I could not make a comparison.
| v2: fixed whitespace
| Signed-off-by: Nils Wallménius <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
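One way to read "do the memcpy on larger blocks" is sketched below: when source and destination are both tightly packed, a whole run of elements can be copied in one call instead of one call per element. The function `copy_elements` and its parameters are hypothetical; the condition used in the actual patch is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `count` elements of `elem_size` bytes between two strided arrays.
 * Strides are in bytes and must be at least elem_size. */
static void copy_elements(void *dst, size_t dst_stride,
                          const void *src, size_t src_stride,
                          size_t elem_size, size_t count)
{
    assert(dst_stride >= elem_size && src_stride >= elem_size);

    /* Fast path: both sides are contiguous, so one large copy suffices. */
    if (dst_stride == elem_size && src_stride == elem_size) {
        memcpy(dst, src, elem_size * count);
        return;
    }

    /* Slow path: padding between elements forces per-element copies. */
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < count; i++) {
        memcpy(d, s, elem_size);
        d += dst_stride;
        s += src_stride;
    }
}
```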
* st/va: enable h264 VAAPI encode (Boyuan Zhang, 2016-07-25, 1 file, -5/+1)
| Enable H.264 VAAPI encoding through config. Currently only H.264 baseline is supported. Encode entrypoint is not accepted by driver.
| Signed-off-by: Boyuan Zhang <[email protected]>
* st/va: add function to handle misc param type frame rate (Boyuan Zhang, 2016-07-25, 1 file, -5/+19)
| The frame rate can be passed to the driver through either VAEncSequenceParameterBufferType or VAEncMiscParameterTypeFrameRate. The previous code only implemented the former, which is used by Gstreamer-Vaapi. This adds an implementation for VAEncMiscParameterTypeFrameRate, and a default frame rate of 30 in case the application never provides frame rate information to the driver.
| Signed-off-by: Boyuan Zhang <[email protected]>
* st/va: add environmental variable to disable interlace (Boyuan Zhang, 2016-07-25, 1 file, -0/+4)
| Add an environmental variable to disable interlace mode. At the VAAPI decoding stage, the driver cannot distinguish between the pure decoding case and the transcoding case. Since interlaced encoding is not supported, we have to disable interlace for the transcoding case. The temporary solution is to use an environmental variable to disable interlace mode.
| Signed-off-by: Boyuan Zhang <[email protected]>
* st/va: add preset values for VAAPI encode (Boyuan Zhang, 2016-07-25, 1 file, -0/+27)
| Add some hardcoded values that the hardware needs, mainly for rate control. With the values previously hardcoded for OMX, the rate control result is not correct. This change fixes the rate control result by setting correct values for VAAPI.
| Signed-off-by: Boyuan Zhang <[email protected]>
* st/va: add functions for VAAPI encode (Boyuan Zhang, 2016-07-25, 3 files, -2/+178)
| Add the necessary functions/changes for VAAPI encoding to buffer and picture. These changes allow the driver to handle all VAAPI encode related operations. This patch doesn't change the VAAPI decode behaviour.
| Signed-off-by: Boyuan Zhang <[email protected]>
* st/va: get rate control method from configattrib v2 (Boyuan Zhang, 2016-07-25, 3 files, -0/+15)
| The rate control method is passed from the app to the driver through the config attrib list, so we need to store it in the config. Later on, we will pass this value to context->desc.h264enc.rate_ctrl.rate_ctrl_method.
| v2 (chk): fix broken build and commit message
| Signed-off-by: Boyuan Zhang <[email protected]>
| Signed-off-by: Christian König <[email protected]>
* st/va: add conversion for yv12 to nv12 in putimage v2 (Boyuan Zhang, 2016-07-25, 1 file, -7/+27)
| For a putimage call, if the image format is yv12 (or IYUV with the U and V fields swapped) and the surface format is nv12, we need to convert yv12 to nv12 and then copy the converted data from the image to the surface. We can't use the existing logic, where the surface is destroyed and re-created with yv12 format.
| v2 (chk): fix some compiler warnings and commit message
| Signed-off-by: Boyuan Zhang <[email protected]>
| Signed-off-by: Christian König <[email protected]>
* vl/util: add copy func for yv12image to nv12surface v2 (Boyuan Zhang, 2016-07-25, 1 file, -0/+37)
| Add a function to copy from a yv12 image to an nv12 surface for the VAAPI putimage call. We need this function in the VaPutImage call when copying from a yv12 image to an nv12 surface for encoding. The existing function can't be used because it only works for copying from a yv12 surface to an nv12 image in VAAPI.
| v2: cleanup variable types and commit message
| Signed-off-by: Boyuan Zhang <[email protected]>
| Signed-off-by: Christian König <[email protected]>
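The yv12 to nv12 conversion described above amounts to copying the luma plane and interleaving the two chroma planes. A minimal sketch, assuming tightly packed buffers with even width and height (no row pitch handling, unlike a real driver); `yv12_to_nv12` is a hypothetical name, not the function added by the commit:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* YV12 layout: Y plane, then V plane, then U plane (chroma subsampled 2x2).
 * NV12 layout: Y plane, then one plane of interleaved U,V byte pairs. */
static void yv12_to_nv12(const uint8_t *src, uint8_t *dst,
                         size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);

    size_t luma_size = width * height;
    size_t chroma_size = luma_size / 4;
    const uint8_t *v_plane = src + luma_size;
    const uint8_t *u_plane = v_plane + chroma_size;
    uint8_t *uv_plane = dst + luma_size;

    /* The luma plane is identical in both formats. */
    memcpy(dst, src, luma_size);

    /* Interleave chroma, U first. */
    for (size_t i = 0; i < chroma_size; i++) {
        uv_plane[2 * i] = u_plane[i];
        uv_plane[2 * i + 1] = v_plane[i];
    }
}
```

Both buffers hold width * height * 3 / 2 bytes; for IYUV input the U and V source planes would simply be swapped.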
* st/va: add encode entrypoint v2 (Boyuan Zhang, 2016-07-25, 4 files, -39/+150)
| VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as the entry point for the encoding case. We save this encode entry point in the config. config_id was previously used as the profile. Now the config has both profile and entrypoint fields, and config_id is used to get the config object. Later on, we pass this entrypoint to context->templat.entrypoint instead of always hardcoding PIPE_VIDEO_ENTRYPOINT_BITSTREAM, as was done for the decoding case. The encode entrypoint is not accepted by the driver until we enable VAAPI encode in a later patch.
| v2 (chk): fix commit message to match 80 chars, use switch instead of ifs, fix memory leaks in the error path, implement vlVaQueryConfigEntrypoints as well, drop VAEntrypointEncPicture (only used for JPEG).
| Signed-off-by: Boyuan Zhang <[email protected]>
| Signed-off-by: Christian König <[email protected]>
* nvc0: upload sample locations on GM20x (Samuel Pitoiset, 2016-07-24, 3 files, -5/+31)
| This fixes a bunch of multisample piglit tests on GM206, like
|   bin/arb_texture_multisample-texelfetch 2 -auto -fbo
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: time-elapsed query should be active for clears (Rob Clark, 2016-07-24, 1 file, -1/+1)
| Signed-off-by: Rob Clark <[email protected]>
* nvc0/ir: fix up an assertion in emitUADD() (Samuel Pitoiset, 2016-07-24, 1 file, -4/+3)
| It's illegal to have neg modifiers on both sources for OP_ADD, and it's illegal to have OP_SUB with just src0 neg.
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix wrong indentation in nvc0_validate_fb() (Samuel Pitoiset, 2016-07-23, 1 file, -141/+141)
| Trivial.
| Signed-off-by: Samuel Pitoiset <[email protected]>
* glsl: reuse main extension table to appropriately restrict extensions (Ilia Mirkin, 2016-07-23, 14 files, -354/+268)
| Previously we were only restricting based on ES/non-ES-ness and whether the overall enable bit had been flipped on. However we have been adding more fine-grained restrictions, such as based on compat profiles, as well as specific ES versions. Most of the time this doesn't matter, but it can create awkward situations and duplication of logic.
|
| Here we separate the main extension table into a separate object file, linked to the glsl compiler, which makes use of it with a custom function which takes the ES-ness of the shader into account (thus allowing desktop shaders to properly use ES extensions that would otherwise have been disallowed.)
|
| We can also now use this logic to generate #define's for all supported extensions automatically, removing the duplicate (and often inaccurate) list in glcpp.
|
| The effect of this change should be nil in most cases. However in some situations, extensions like GL_ARB_gpu_shader5 which were formerly available in compat contexts on the GLSL side of things will now become inaccessible.
|
| This regresses two ES CTS tests:
|   ES3-CTS.shaders.shader_integer_mix.define
|   ES31-CTS.shader_integer_mix.define
| however that is due to them using #version 100 instead of 300 es. As the extension is only defined for ES3, I believe this is the correct behavior.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]> (v2)
| v2 -> v3: integrate glcpp defines into the same mechanism
* freedreno/a4xx: timestamp queries (Rob Clark, 2016-07-23, 3 files, -1/+34)
| Signed-off-by: Rob Clark <[email protected]>
* freedreno: hw timestamp support (Rob Clark, 2016-07-23, 3 files, -3/+16)
| If the kernel supports it, use hw counter for timestamps.
| Signed-off-by: Rob Clark <[email protected]>
* freedreno: prep work for timestamp queries (Rob Clark, 2016-07-23, 3 files, -6/+10)
| We need the "NULL" state to be a valid bit in the bitmask, because timestamp queries are not restricted to draw/etc stages (i.e. the only commands to submit may just be to read the timestamp). The absence of draws is not a reason to skip the flush and return zero.
| Signed-off-by: Rob Clark <[email protected]>
* radeonsi: ensure sample locations are set for line and polygon smoothing (Nicolai Hähnle, 2016-07-23, 1 file, -2/+1)
| Since commit d938b8c, the sample locations are no longer set unconditionally, so we need to set the atom to dirty on all chips, not just Polaris.
| Cc: 12.0 <[email protected]>
* radeonsi: fix Polaris MSAA regression (Nicolai Hähnle, 2016-07-23, 2 files, -15/+20)
| The regression was introduced by commit d938b8c.
| The problem here is that in order to use the small primitive filter, we need to explicitly set the sample locations to 0. But the DB doesn't properly process the change of sample locations without a flush, and so we can end up with incorrect Z values.
| Instead of doing a flush, just disable the small primitive filter when MSAA is force-disabled.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908
| Cc: 12.0 <[email protected]>
* freedreno/ir3: Add missing braces in initializer ([email protected], 2016-07-23, 1 file, -1/+1)
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a2xx: silence missing case 'SHADER_COMPUTE' warning (v2) ([email protected], 2016-07-23, 1 file, -0/+2)
| v2: no need for break after an unreachable (Matt Turner)
| Signed-off-by: Francesco Ansanelli <[email protected]>
| Signed-off-by: Rob Clark <[email protected]>
* radeonsi: implement buffer_subdata without indirect calls (Marek Olšák, 2016-07-23, 5 files, -5/+41)
| There is less noise in CPU profile data now.
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/util: don't modify usage in pipe_buffer_write (Marek Olšák, 2016-07-23, 2 files, -9/+7)
| All drivers were already doing it except virgl.
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: split transfer_inline_write into buffer and texture callbacks (Marek Olšák, 2016-07-23, 57 files, -389/+383)
| to reduce the call indirections with u_resource_vtbl.
|
| The worst call tree you could get was:
|   - u_transfer_inline_write_vtbl
|   - u_default_transfer_inline_write
|   - u_transfer_map_vtbl
|   - driver_transfer_map
|   - u_transfer_unmap_vtbl
|   - driver_transfer_unmap
|
| That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites.
|
| The new interface is:
|   pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
|   pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride)
|
| v2: fix whitespace, correct ilo's behavior
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Acked-by: Roland Scheidegger <[email protected]>
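The "1 indirect call" goal can be illustrated with a toy context that exposes a buffer upload hook directly. Everything here is hypothetical (`demo_context`, `demo_resource`, `demo_upload` and the parameter types are assumptions); only the argument order of buffer_subdata follows the interface quoted in the commit message.

```c
#include <assert.h>
#include <string.h>

struct demo_resource {
    unsigned char storage[64]; /* stand-in for buffer memory */
};

struct demo_context {
    /* The single indirect call a caller has to make. */
    void (*buffer_subdata)(struct demo_context *ctx, struct demo_resource *res,
                           unsigned usage, unsigned offset, unsigned size,
                           const void *data);
};

/* A driver-side implementation that writes straight into the buffer,
 * with no further map/unmap callbacks. */
static void demo_buffer_subdata(struct demo_context *ctx, struct demo_resource *res,
                                unsigned usage, unsigned offset, unsigned size,
                                const void *data)
{
    (void)ctx;
    (void)usage;
    assert(offset <= sizeof res->storage);
    assert(size <= sizeof res->storage - offset);
    memcpy(res->storage + offset, data, size);
}

/* Call site: it already knows the resource is a buffer, so it picks the
 * buffer hook directly rather than dispatching on the resource type. */
static void demo_upload(struct demo_context *ctx, struct demo_resource *res,
                        unsigned offset, unsigned size, const void *data)
{
    ctx->buffer_subdata(ctx, res, 0, offset, size, data);
}
```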
* nir: Lower interp_var_at_* like a normal load_var for flat inputs. (Kenneth Graunke, 2016-07-22, 1 file, -0/+4)
| "flat centroid" and "flat sample" both just mean "flat", so we should ignore interpolateAtCentroid/Sample and just return the flat value.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97032
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>