path: root/src
Commit message | Author | Age | Files | Lines
* radeonsi/gfx9: don't set gs_table_depth | Marek Olšák | 2017-11-07 | 1 | -2/+4
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: limit the scissor bug workaround to Vega10 and Raven only | Marek Olšák | 2017-11-07 | 1 | -4/+4
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove unused field in the PCI ID table | Marek Olšák | 2017-11-07 | 3 | -3/+3
    Reviewed-by: Alex Deucher <[email protected]>
* mesa: fix deleting the dummy ATI_fs | Miklós Máté | 2017-11-07 | 1 | -1/+4
    The DummyShader is used by GenFragmentShadersATI() as a placeholder to
    mark IDs as allocated. Context cleanup wants to delete everything in
    ctx->Shared->ATIShaders, and crashes on these placeholders with this
    backtrace:

    ==15060== Invalid free() / delete / delete[] / realloc()
    ==15060==    at 0x482F478: free (vg_replace_malloc.c:530)
    ==15060==    by 0x57694F4: _mesa_delete_ati_fragment_shader (atifragshader.c:68)
    ==15060==    by 0x58B33AB: delete_fragshader_cb (shared.c:208)
    ==15060==    by 0x5838836: _mesa_HashDeleteAll (hash.c:295)
    ==15060==    by 0x58B365F: free_shared_state (shared.c:377)
    ==15060==    by 0x58B3BC2: _mesa_reference_shared_state (shared.c:469)
    ==15060==    by 0x578687F: _mesa_free_context_data (context.c:1366)
    ==15060==    by 0x595E9EC: st_destroy_context (st_context.c:642)
    ==15060==    by 0x5987057: st_context_destroy (st_manager.c:772)
    ==15060==    by 0x5B018B6: dri_destroy_context (dri_context.c:217)
    ==15060==    by 0x5B006D3: driDestroyContext (dri_util.c:511)
    ==15060==    by 0x4A1CBE6: dri3_destroy_context (dri3_glx.c:170)
    ==15060==  Address 0x7b5dae0 is 0 bytes inside data symbol "DummyShader"

    Also, DeleteFragmentShadersATI() should not assert on DummyShader, just
    remove the hash entry. Normally one would define a shader after
    GenFragmentShadersATI(), and BindFragmentShaderATI() replaces the
    placeholder with a real object. However, the specification doesn't say
    that one has to define a shader for each allocated ID.

    Signed-off-by: Miklós Máté <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
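    The crash pattern described above can be illustrated with a minimal
    sketch. It is not Mesa's code: the table, dummy_shader, gen_shader(),
    bind_shader() and delete_all() are hypothetical names standing in for
    the hash table, DummyShader and the ATI entry points. The point is only
    that cleanup must skip the static placeholder instead of passing it to
    free().

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_IDS 8

struct shader { int id; };

/* Static placeholder that marks an ID as allocated but not yet defined.
 * Hypothetical stand-in for DummyShader; it must never reach free(). */
static struct shader dummy_shader;

static struct shader *table[MAX_IDS];

/* Reserve an ID without creating a real object. */
static void gen_shader(int id)
{
    assert(id >= 0 && id < MAX_IDS);
    table[id] = &dummy_shader;
}

/* Replace the placeholder (or an empty slot) with a heap object. */
static void bind_shader(int id)
{
    assert(id >= 0 && id < MAX_IDS);
    if (table[id] == NULL || table[id] == &dummy_shader) {
        struct shader *s = malloc(sizeof *s);
        if (s == NULL)
            return;
        s->id = id;
        table[id] = s;
    }
}

/* Clear every entry, freeing only real heap objects.
 * Returns the number of objects passed to free(). */
static int delete_all(void)
{
    int freed = 0;
    for (int i = 0; i < MAX_IDS; i++) {
        if (table[i] != NULL && table[i] != &dummy_shader) {
            free(table[i]);
            freed++;
        }
        table[i] = NULL;
    }
    return freed;
}
```

    With IDs 1 and 2 generated and only ID 2 bound, delete_all() frees one
    object and simply drops the placeholder entry for ID 1.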
* gallium: Guard assertions by NDEBUG instead of DEBUG | Michel Dänzer | 2017-11-07 | 1 | -1/+1
    This matches the standard assert.h header.

    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
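    As a reminder of the convention being matched: the standard assert()
    macro expands to nothing when NDEBUG is defined. The sketch below is
    illustrative only (assertions_enabled(), checked_div() and
    debug_checks_run are hypothetical names); it guards debug-only
    bookkeeping with the same macro so that it disappears together with the
    assertions.

```c
#include <assert.h>

/* Reports whether assert() is active in this translation unit. */
static int assertions_enabled(void)
{
#ifdef NDEBUG
    return 0;
#else
    return 1;
#endif
}

/* Counts how often the debug-only validation path ran. */
static int debug_checks_run;

static int checked_div(int a, int b)
{
#ifndef NDEBUG
    /* Debug-only bookkeeping, compiled out together with assert(). */
    debug_checks_run++;
#endif
    assert(b != 0);
    return a / b;
}
```

    Guarding by a different macro (such as DEBUG) can leave such code and
    the assertions it supports enabled in different build configurations.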
* meson: drop GLESv1 .so version back to 1.0.0 | Eric Engestrom | 2017-11-07 | 1 | -1/+1
    autotools generates libGLESv1_CM.so.1.0.0, so let's make sure meson
    does the same.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* meson: standardize .so version to major.minor.patch | Eric Engestrom | 2017-11-07 | 8 | -7/+8
    This `version` field defines the filename for the .so. The plain .so as
    well as .so.$major are always symlinks to this.

    Unless I'm mistaken, only the major is ever used, so this shouldn't
    matter, but for consistency with autotools (and in case it does
    matter), let's always have all 3 major.minor.patch components.

    (The soname isn't affected, and is always .so.$major)

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* ac/nir: for ubo load use correct num_components | Dave Airlie | 2017-11-07 | 1 | -1/+1
    I was hacking something stupid in doom, and hit an assert for the
    bitcast following this, it definitely looks like this should be the
    number of 32-bit components, not the instr level ones.

    Reviewed-by: Nicolai Hähnle <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* nir: fix a typo | Gwan-gyeong Mun | 2017-11-06 | 1 | -1/+1
    Signed-off-by: Mun Gwan-gyeong <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* glsl: Allow precision mismatch on dead data with GLSL ES 1.00 | Tomasz Figa | 2017-11-06 | 1 | -4/+10
    Commit 259fc505454ea6a67aeacf6cdebf1398d9947759 added a linker error
    for mismatching uniform precision, as required by the GLES 3.0
    specification and conformance test-suite.

    Several Android applications, including Forge of Empires, have shaders
    which violate this rule, on a dead varying that will be eliminated.
    The problem affects a large number of applications using the Cocos2D
    engine, and other GLES implementations accept this, so it poses a
    serious application compatibility issue.

    Starting from GLSL ES 3.0, declarations with conflicting precision
    qualifiers are explicitly prohibited. However GLSL ES 1.00 does not
    clearly specify the behavior, except that

      "Uniforms are defined to behave as if they are using the same storage
      in the vertex and fragment processors and may be implemented this
      way. If uniforms are used in both the vertex and fragment shaders,
      developers should be warned if the precisions are different.
      Conversion of precision should never be implicit."

    The word "used" is not clear in this context and might refer to
     1) declared (same as GLES 3.x)
     2) referred after post-processing, or
     3) linked after all optimizations are done.

    Looking at existing applications, 2) or 3) seems to be widely adopted.
    To avoid compatibility issues, turn the error into a warning if the
    GLSL ES version is lower than 3.0 and the data is dead in at least one
    of the shaders.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532
    Signed-off-by: Tomasz Figa <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
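    The rule in the last paragraph can be written down as a small decision
    function. This is a sketch of the stated policy only, not the linker
    code: check_uniform_precision(), enum link_result and the convention of
    passing the version as a #version number (100, 300) are assumptions
    made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum link_result { LINK_OK, LINK_WARNING, LINK_ERROR };

/* glsl_es_version is the #version number, e.g. 100 or 300 (assumption).
 * used_in_vs / used_in_fs say whether the data is still live in each
 * stage after dead-code elimination. */
static enum link_result
check_uniform_precision(int glsl_es_version,
                        int vs_precision, int fs_precision,
                        bool used_in_vs, bool used_in_fs)
{
    assert(glsl_es_version >= 100);

    if (vs_precision == fs_precision)
        return LINK_OK;

    /* Before GLSL ES 3.00, a mismatch on data that is dead in at least
     * one shader is downgraded from an error to a warning. */
    if (glsl_es_version < 300 && (!used_in_vs || !used_in_fs))
        return LINK_WARNING;

    return LINK_ERROR;
}
```

    A mismatch on data that is live in both stages still fails to link,
    regardless of the version.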
* i965: disable NIR linking on HSW and below | Timothy Arceri | 2017-11-07 | 1 | -1/+4
    Fixes: 379b24a40d3d "i965: make use of nir linking"
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103537
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* radv: move is_local up to the winsys level. | Dave Airlie | 2017-11-06 | 4 | -3/+6
    We can avoid adding the buffer in the non-local case, this will avoid
    all the overhead of the indirect call.

    Reviewed-by: Samuel Pitoiset <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: wrap cs_add_buffer in an inline. (v2) | Dave Airlie | 2017-11-06 | 6 | -41/+49
    The next patch will try and avoid calling the indirect function.

    v2: add a missing conversion.

    Reviewed-by: Samuel Pitoiset <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: when loading regs no need to add buffer | Dave Airlie | 2017-11-06 | 1 | -2/+0
    The function that calls us has just added the buffer to the list
    already, no need to try and add it again.

    Reviewed-by: Samuel Pitoiset <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: pre-calculate user_data_0 registers and store in pipeline | Dave Airlie | 2017-11-06 | 5 | -52/+55
    There's no point recalculating these the whole time on descriptor
    emission, just store them at pipeline creation.

    Reviewed-by: Samuel Pitoiset <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* i965: Enable flush control | Neil Roberts | 2017-11-06 | 2 | -1/+21
    Reviewed-by: Adam Jackson <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Signed-off-by: Neil Roberts <[email protected]>
* drisw: Enable flush control for llvmpipe and softpipe | Adam Jackson | 2017-11-06 | 1 | -0/+1
    Hilariously this is a fairly big win. Neil's multi-context-test
    improves from ~24 to ~36 fps with llvmpipe on a Core i5-3317U.
    softpipe also improves, from about 2.25 to 3.09 fps (when it's that
    slow, you're allowed to be that precise).

    I'd have added it to swrast classic, but the testcase wants GL 3.0 and
    shaders, and that's not a thing classic has, so I figured making it
    work on softpipe was crime enough.

    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Signed-off-by: Adam Jackson <[email protected]>
* gallium: Wire up flush control | Adam Jackson | 2017-11-06 | 3 | -1/+9
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Signed-off-by: Adam Jackson <[email protected]>
* egl: Implement EGL_KHR_context_flush_control | Adam Jackson | 2017-11-06 | 6 | -1/+24
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Signed-off-by: Adam Jackson <[email protected]>
* glx: Implement GLX_ARB_context_flush_control | Neil Roberts | 2017-11-06 | 7 | -9/+62
    Reviewed-by: Adam Jackson <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Signed-off-by: Neil Roberts <[email protected]>
* dri: Add a flush control extension | Neil Roberts | 2017-11-06 | 2 | -2/+21
    This advertises that the driver can accept a new context attribute
    __DRI_CTX_ATTRIB_RELEASE_BEHAVIOR.

    Reviewed-by: Adam Jackson <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Signed-off-by: Neil Roberts <[email protected]>
* dri: Change __DriverApiRec::CreateContext to take a struct for attribs | Neil Roberts | 2017-11-06 | 14 | -131/+152
    Previously the CreateContext method of __DriverApiRec took a set of
    arguments to describe the attribute values from the window system
    API's CreateContextAttribs function. As more attributes get added this
    could quickly get unworkable and every new attribute needs a
    modification for every driver.

    To fix that, pass the attribute values in a struct instead. The struct
    has a bitmask to specify which members are used. The first three
    members (two for the GL version and one for the flags) are always set.
    If the bit is not set in the attribute mask then it can be assumed the
    attribute has the default value. Drivers will error if unknown bits in
    the mask are set.

    Reviewed-by: Adam Jackson <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Signed-off-by: Neil Roberts <[email protected]>
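    The struct-plus-bitmask scheme described above could look roughly like
    the sketch below. All names here (struct ctx_config, ATTRIB_* bits,
    create_context(), the RELEASE_BEHAVIOR_* values) are hypothetical and
    chosen for illustration; they are not the identifiers used in the
    patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute bits; one bit per optional member. */
#define ATTRIB_RELEASE_BEHAVIOR (UINT32_C(1) << 0)
#define KNOWN_ATTRIBS           (ATTRIB_RELEASE_BEHAVIOR)

enum { RELEASE_BEHAVIOR_NONE = 0, RELEASE_BEHAVIOR_FLUSH = 1 };

struct ctx_config {
    /* Always set. */
    unsigned major_version;
    unsigned minor_version;
    uint32_t flags;

    /* Says which of the optional members below carry a value. */
    uint32_t attribute_mask;

    /* Only meaningful if ATTRIB_RELEASE_BEHAVIOR is set in the mask. */
    int release_behavior;
};

/* Returns 0 on success, -1 if the caller set an attribute bit this
 * driver does not know about. */
static int create_context(const struct ctx_config *cfg, int *release_behavior)
{
    assert(cfg != NULL && release_behavior != NULL);

    if (cfg->attribute_mask & ~KNOWN_ATTRIBS)
        return -1;

    /* An unset bit means "use the default value". */
    *release_behavior = (cfg->attribute_mask & ATTRIB_RELEASE_BEHAVIOR)
                            ? cfg->release_behavior
                            : RELEASE_BEHAVIOR_FLUSH;
    return 0;
}
```

    Adding a new attribute then means one new bit and one new member,
    while drivers that do not recognise the bit reject it instead of
    silently ignoring it.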
* intel: Don't flush the old context in intelMakeCurrent | Neil Roberts | 2017-11-06 | 2 | -18/+0
    It shouldn't be necessary to flush the context within the driver
    implementation because the old context is explicitly flushed in
    _mesa_make_current which is called a little further on. It is useful
    to only have a single place that flushes when switching contexts to
    make it easier to later implement the GL_KHR_context_flush_control
    extension.

    The flush in intelMakeCurrent was added in commit 5505865 to implement
    the GLX semantics that the context should be flushed when it is
    released. When the commit was made there was no flush in
    _mesa_make_current because it was only added later in 93102b4c. I
    think that later commit effectively makes the first commit redundant.

    Reviewed-by: Adam Jackson <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Signed-off-by: Neil Roberts <[email protected]>
* egl/dri2: Factor out context attribute initialization | Adam Jackson | 2017-11-06 | 1 | -24/+7
    Signed-off-by: Adam Jackson <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* etnaviv: Don't over-pad compressed textures | Wladimir J. van der Laan | 2017-11-06 | 1 | -9/+15
    HALIGN_FOUR/SIXTEEN has no meaning for compressed textures, and we
    can't render to them anyway. So use the tightest possible packing.

    This avoids bugs with non-power-of-two block sizes.

    Signed-off-by: Wladimir J. van der Laan <[email protected]>
    Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: ASTC texture support | Wladimir J. van der Laan | 2017-11-06 | 7 | -2/+57
    Add ASTC texture support for hardware that supports this (currently
    only GC3000 on i.MX6qp is known to have this).

    Signed-off-by: Wladimir J. van der Laan <[email protected]>
    Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: Update from rnndb | Wladimir J. van der Laan | 2017-11-06 | 13 | -320/+1015
    Updated as of etna_viv commit 3b4a8ec.

    Signed-off-by: Wladimir J. van der Laan <[email protected]>
    Reviewed-by: Christian Gmeiner <[email protected]>
* radv: add initial copy descriptor support. (v2) | Dave Airlie | 2017-11-06 | 1 | -2/+53
    It appears the latest dota2 vulkan uses this, and we get a hang in VR
    mode without it.

    v2: remove finishme I left in after finishing.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Andres Rodriguez <[email protected]>
    Cc: "17.2 17.3" <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* gallium/u_vbuf: use signed vertex buffers offsets for optimal uploads | Marek Olšák | 2017-11-06 | 1 | -2/+10
    Uploaded data must start at (stride * start), because we can't modify
    start in all cases. If it's the first allocation, it's also the amount
    of memory wasted.

    If the starting offset is larger than the size of the upload buffer,
    the buffer is re-created, used for 1 upload, and then thrown away. If
    the upload is small, most of the buffer space is unused and wasted.
    Keep doing that and the OOM killer comes. It's actually pretty quick.

    With signed VB offsets, we can set min_out_offset = 0
    in u_upload_alloc/u_upload_data.

    This fixes OOM situations with SPECviewperf.
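    The arithmetic behind this can be sketched as follows. This is one
    reading of the commit message, not the u_vbuf code: the helper names
    are hypothetical, and the model assumes a vertex is fetched from
    offset + stride * index within the upload buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest buffer position at which the uploaded vertices may be placed.
 * Without signed offsets the vertex buffer offset must stay >= 0, so the
 * data has to start at stride * start or later. */
static uint32_t min_out_offset(uint32_t stride, uint32_t start,
                               int has_signed_offsets)
{
    return has_signed_offsets ? 0 : stride * start;
}

/* Offset to program so that index 'start' maps to out_offset. */
static int64_t vertex_buffer_offset(uint32_t out_offset, uint32_t stride,
                                    uint32_t start)
{
    return (int64_t)out_offset - (int64_t)stride * start;
}

/* Position inside the upload buffer that a given index is fetched from. */
static int64_t fetch_position(int64_t vb_offset, uint32_t stride,
                              uint32_t index)
{
    int64_t pos = vb_offset + (int64_t)stride * index;
    assert(pos >= 0); /* fetches must land inside the buffer */
    return pos;
}
```

    For stride 16 and start 1000, the unsigned scheme must skip 16000
    bytes before the data, while the signed scheme places the data at
    position 0 and programs an offset of -16000.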
* radeonsi: enable signed vertex buffer offsets | Marek Olšák | 2017-11-06 | 2 | -15/+12
* gallium: add PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET | Marek Olšák | 2017-11-06 | 18 | -0/+21
* automake: include git_sha1.h.in in release tarball | Juan A. Suarez Romero | 2017-11-06 | 1 | -1/+1
    Fixes:
      make[2]: Leaving directory '/home/local/mesa/mesa-17.4.0-devel/_build/sub/src'
      make[2]: *** No rule to make target '../../../src/git_sha1.h.in', needed by 'git_sha1.h'. Stop.
      Makefile:660: recipe for target 'all-recursive' failed

    Fixes: 16be271c6ee618e79c7d "git_sha1_gen: use git_sha1.h.in on all build systems"
    Reviewed-by: Eric Engestrom <[email protected]>
    Signed-off-by: Juan A. Suarez Romero <[email protected]>
* radeonsi: don't map big VRAM buffers for the first upload directly | Marek Olšák | 2017-11-06 | 2 | -0/+21
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_threaded: don't map big VRAM buffers for the first upload directly | Marek Olšák | 2017-11-06 | 3 | -2/+28
    This improves Paraview "many spheres" performance 4x along with the
    radeonsi commit.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_threaded: clean up tc_improve_map_buffer_flags and prevent reentry | Marek Olšák | 2017-11-06 | 1 | -7/+12
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radv: move descriptor sets out of cmd_state. | Dave Airlie | 2017-11-06 | 3 | -17/+20
    Instead of storing all the pointers and zeroing them all out, just
    store a valid bitmask in the state.

    This also moves the CmdBindPipeline path down the cpu usage path for
    the multithreading demo as it no longer has to traverse MAX_SETS to
    find the active descriptor sets.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
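    The valid-bitmask idea can be sketched like this. It is an
    illustration, not the radv structures: struct descriptor_state,
    bind_set(), reset_sets() and collect_bound() are hypothetical names,
    and MAX_SETS is given an arbitrary value of 32 here.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SETS 32 /* arbitrary for this sketch */

struct descriptor_state {
    const void *sets[MAX_SETS]; /* stale entries are fine if the bit is clear */
    uint32_t valid;             /* bit i set means sets[i] is bound */
};

static void bind_set(struct descriptor_state *s, unsigned idx, const void *set)
{
    assert(idx < MAX_SETS);
    s->sets[idx] = set;
    s->valid |= UINT32_C(1) << idx;
}

/* Resetting clears one word instead of zeroing MAX_SETS pointers. */
static void reset_sets(struct descriptor_state *s)
{
    s->valid = 0;
}

/* Writes the indices of bound sets to out[] in ascending order and
 * returns how many there are; only set bits are visited. */
static unsigned collect_bound(const struct descriptor_state *s, unsigned *out)
{
    unsigned n = 0;
    for (uint32_t mask = s->valid; mask != 0; mask &= mask - 1) {
        unsigned idx = 0;
        while (((mask >> idx) & 1u) == 0)
            idx++;
        out[n++] = idx;
    }
    return n;
}
```

    The outer loop runs once per bound set (mask &= mask - 1 clears the
    lowest set bit), so a state with few bound sets is cheap to walk.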
* radv: add helper for setting a descriptor. | Dave Airlie | 2017-11-06 | 3 | -10/+17
    This is just a simple refactor.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: move vertex binding out of cmd state. | Dave Airlie | 2017-11-06 | 2 | -4/+4
    This isn't required to be cleared, since buffers are only linked by
    vertex elements, so if elements are clear then no buffers should be
    referenced.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: reorder cmd_state to remove a hole. | Dave Airlie | 2017-11-06 | 1 | -1/+1
    This just removes a hole in the cmd_state and packs some bools
    together.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: free attachments on end command buffer. | Dave Airlie | 2017-11-06 | 1 | -0/+2
    If we allocate attachments in the begin command buffer due to the
    render pass continue bit, we were leaking them.

    Since renderpasses inside a cmd buffer malloc/free these properly, and
    set to NULL, we just need to call free at end.

    Fixes a memory leak with multithreading demo.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Cc: "17.2 17.3" <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: Optimize calling radv_save_descriptors. | Bas Nieuwenhuizen | 2017-11-04 | 1 | -4/+2
    uint32_t data[MAX_SETS * 2] = {}; was getting executed before the exit
    and took significant amounts of time. By having the check outside the
    function, we skip the execution of the clear.

    Reviewed-by: Dave Airlie <[email protected]>
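    A sketch of the pattern, under stated assumptions: the function and
    variable names below are hypothetical, and the "tracing enabled"
    condition merely stands in for whatever check the real function
    performed before returning early. What matters is that the array
    initializer runs on function entry, so the check is hoisted to the
    caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SETS 32 /* arbitrary for this sketch */

static unsigned saves_performed;
static uint32_t trace_sink;

/* The initializer clears MAX_SETS * 2 words on every call, even if the
 * function were to bail out immediately afterwards. */
static void save_descriptors(const uint32_t *sets, unsigned count)
{
    uint32_t data[MAX_SETS * 2] = {0};

    assert(count <= MAX_SETS);
    saves_performed++;
    for (unsigned i = 0; i < count; i++)
        data[i * 2] = sets[i];
    trace_sink = data[0];
}

/* The check lives in the caller, so the common (disabled) path never
 * enters the function and never pays for the clear. */
static void flush_descriptors(bool trace_enabled,
                              const uint32_t *sets, unsigned count)
{
    if (trace_enabled)
        save_descriptors(sets, count);
}
```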
* radv: Use an array to store descriptor sets. | Bas Nieuwenhuizen | 2017-11-04 | 2 | -26/+50
    The vram_list linked list resulted in lots of pointer chasing.
    Replacing this with an array instead improves descriptor set
    allocation CPU usage by 3x at least (when also considering the free),
    because it had to iterate through 300-400 sets on average.

    Not a huge improvement as the pre-improvement CPU usage was only about
    2.3% in the busiest thread.

    Reviewed-by: Dave Airlie <[email protected]>
* nv50,nvc0: Display shared memory usage in pipe_debug_message | Pierre Moreau | 2017-11-04 | 2 | -6/+8
    Signed-off-by: Pierre Moreau <[email protected]>
* nv50,nvc0: Copy shared memory per block to the program info structure and back | Pierre Moreau | 2017-11-04 | 2 | -0/+4
    In OpenCL/CUDA kernels, shared memory usage can be defined within the
    kernel code. That usage will only be picked up while parsing the
    SPIR-V, during the translation phase of the program.

    Signed-off-by: Pierre Moreau <[email protected]>
* nv50/ir: Store shared memory per block in nv50_ir_prog_info | Pierre Moreau | 2017-11-04 | 1 | -0/+1
    Signed-off-by: Pierre Moreau <[email protected]>
* i965/gen10: Implement Wa3DStateMode | Anuj Phogat | 2017-11-03 | 2 | -0/+16
    This workaround doesn't fix any of the piglit hangs we've seen on CNL.
    But it might be fixing something we haven't tested yet.

    V2: Remove the bits enabling Float blend optimization. It is enabled
        through CACHE_MODE_SS register.
        Update the comment.
        Move gen10 if block on top of gen9 if block.

    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Rafael Antognolli <[email protected]>
    Reviewed-by: Nanley Chery <[email protected]>
* i965/gen10: Enable float blend optimization | Anuj Phogat | 2017-11-03 | 2 | -0/+9
    This optimization is enabled for previous generations too. See Mesa
    commit c17e214a6b. On CNL this bit has been moved to the CACHE_MODE_SS
    register.

    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Nanley Chery <[email protected]>
* i965/gen10: Implement WaForceRCPFEHangWorkaround | Anuj Phogat | 2017-11-03 | 1 | -0/+23
    This workaround doesn't fix any of the piglit hangs we've seen on CNL.
    But it might be fixing something we haven't tested yet.

    V2: Add the check for Post Sync Operation.
        Update the workaround comment.
        Use braces around if-else.

    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Rafael Antognolli <[email protected]>
    Reviewed-by: Nanley Chery <[email protected]>
* i965/gen10: Implement WaSampleOffsetIZ workaround | Anuj Phogat | 2017-11-03 | 2 | -0/+50
    There are a few other (duplicate) workarounds which have similar
    recommendations:
      WaFlushHangWhenNonPipelineStateAndMarkerStalled
      WaCSStallBefore3DSamplePattern
      WaPipeControlBefore3DStateSamplePattern

    WaPipeControlBefore3DStateSamplePattern has some extra recommendations
    if the driver is using mid batch context restore. Ignoring it for now
    because we're not doing mid-batch context restore in Mesa.

    This workaround doesn't fix any of the piglit hangs we've seen on CNL.
    But it might be fixing something we haven't tested yet.

    V2: Use brw_load_register_imm32() to program CACHE_MODE_0.
        Get rid of brw_flush_gpu_caches().
    V3: Make the workaround helper functions static.

    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Rafael Antognolli <[email protected]>
    Reviewed-by: Nanley Chery <[email protected]>
* i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1 | Anuj Phogat | 2017-11-03 | 1 | -0/+10
    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>