summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* freedreno: replace glsl130 debug flag with glsl120Rob Clark2015-03-082-12/+10
| | | | | | | | | | | | | | Now that relative-dst works, we should never fall back to the old compiler. (Which is almost true, other than a couple edge case sched fails in piglit). So replace glsl130 flag to force GLSL 130 and integers on a3xx/a4xx with a glsl120 flag to force GLSL 120 and !integers. If this commit breaks any game/app/etc use FD_MESA_DEBUG=glsl120 as a workaround and please let me know. Signed-off-by: Rob Clark <[email protected]>
* gallium/docs: add some freedreno compiler docsRob Clark2015-03-086-1/+457
| | | | | | | | | | | | | | | Enable the 'sphinx.ext.graphviz' extension, and add in a section for driver specific docs, with freedreno compiler docs beneath. The goal is for more complete compiler docs, and hopefully some docs about other parts of the driver (such as how tiling works, etc). Note that there is also a Distribution -> Drivers section. Although that appears to be simply just a list of drivers. Not sure if that should move under the 'Drivers' section or left alone. I did add a one-line section for freedreno in the existing Distribution -> Drivers section. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: relative dstRob Clark2015-03-085-42/+244
| | | | | | | | | To simplify RA, assign arrays that are written to first. Since enough dependency information is in the graph to preserve order of reads and writes of array, so all SSA names for the array collapse into one, just assign the entire thing by array-id. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split out array_fanin() helperRob Clark2015-03-081-17/+30
| | | | | | We'll need this too for relative dst.. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop deref nodesRob Clark2015-03-087-80/+53
| | | | | | | | | | | | | | | | | | | | | The meta-deref instruction doesn't really do what we need for relative destination. Instead, since each instruction can reference at most a single address value, track the dependency on the address register via instr->address. This lets us express the dependency regardless of whether it is used for dst and/or src. The foreach_ssa_src{_n} iterator macros now also iterates the address register so, at least in SSA form, the address register behaves as an additional virtual src to the instruction. Which is pretty much what we want, as far as scheduling/etc. TODO: For now, the foreach_src{_n} iterators are unchanged. We could wrap the address in an ir3_register and make the foreach_src_{_n} iterators behave the same way. But that seems unnecessary at this point, since we mainly care about the address dependency when in SSA form. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: helpful iterator macrosRob Clark2015-03-089-111/+109
| | | | | | | | | I remembered that we are using c99.. which makes some sugary iterator macros easier. So introduce iterator macros to iterate all src registers and all SSA src instructions. The _n variants also return the src #, since there are a handful of places that need this. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix register usage calculationsRob Clark2015-03-081-7/+14
| | | | | | | | | For cat1 instructions, use reg() as well for relative src, to ensure proper accounting of register usage. Also, for relative instructions, use reg->size rather than reg->wrmask to determine the number of components read/written. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: couple tweaks for cmdline compilerRob Clark2015-03-081-1/+4
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split up ssa_dstRob Clark2015-03-081-17/+25
| | | | | | And a couple other trivial renames, to prepare for relative dst. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix failed assert in groupingRob Clark2015-03-081-27/+44
| | | | | | | | | | | | | | | | | | | | | Turns out there are scenarios where we need to insert mov's in "front" of an input. Triggered by shaders like: VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[9] DCL SAMP[0] DCL TEMP[0], LOCAL 0: MOV TEMP[0].xy, IN[1].xyyy 1: MOV TEMP[0].w, IN[1].wwww 2: TXF TEMP[0], TEMP[0], SAMP[0], 1D_ARRAY 3: MOV OUT[1], TEMP[0] 4: MOV OUT[0], IN[0] 5: END Signed-off-by: Rob Clark <[email protected]>
* i915: Fix GCC unused-variable warning in release build.Vinson Lee2015-03-061-2/+1
| | | | | | | | | | i915_debug_fp.c: In function ‘i915_disassemble_program’: i915_debug_fp.c:302:11: warning: unused variable ‘size’ [-Wunused-variable] GLuint size = program[0] & 0x1ff; ^ Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* r300g: Fix build, invalid extern "C" around header inclusion.Mark Janes2015-03-061-0/+8
| | | | | | | | | | | | A previous patch to fix header inclusion within extern "C" neglected to fix the occurences of this pattern in r300 files. When the helper to detect this issue was pushed to master, it broke the build for the r300 driver. This patch fixes the r300 build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477 Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* nouveau: Fix build, invalid extern "C" around header inclusion.Mark Janes2015-03-062-2/+7
| | | | | | | | | | | | A previous patch to fix header inclusion within extern "C" neglected to fix the occurences of this pattern in nouveau files. When the helper to detect this issue was pushed to master, it broke the build for the nouveau driver. This patch fixes the nouveau build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477 Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* nv50,nvc0: remove bogus 64_FLOAT formatsIlia Mirkin2015-03-061-5/+0
| | | | | | | There is no HW support for these and the VBO pusher doesn't know about them. No need to, either, since the st will be lowering them to 2x32. Signed-off-by: Ilia Mirkin <[email protected]>
* ilo: clarify valid and preferred tilingsChia-I Wu2015-03-072-15/+29
| | | | | We did it right until the switch to gen_surface_tiling, which has GEN8_TILING_W. Generally, GEN8_TILING_W may be valid but not preferred.
* ilo: clean up Gen6 WAsChia-I Wu2015-03-071-34/+62
| | | | | | Add a help function for each WA and make PIPE_CONTROL flags match the WA descriptions. Call gen6_wa_pre_pipe_contro() only before PIPE_CONTROLs. Fix missing gen6_wa_pre_3dstate_vs_toggle() in the rectlist path.
* ilo: add generic ilo_render_3dprimitive()Chia-I Wu2015-03-074-53/+29
| | | | It replaces gen[6-8]_3dprimitive().
* ilo: add generic ilo_render_pipe_control()Chia-I Wu2015-03-075-101/+56
| | | | | It replaces gen[6-8]_pipe_control() and a direct gen6_PIPE_CONTROL() call in ilo_render_emit_flush().
* ilo: fix padding of linear sampler viewsChia-I Wu2015-03-071-4/+2
| | | | Should use the temporary variable in the loop instead of layout->bo_height.
* ilo: do not check for interleaved_samplesChia-I Wu2015-03-071-2/+1
| | | | | interleaved_samples is only zero-initialized when layout_want_mcs() is called. We should not check for it. There is also no need to.
* Revert "egl/main: use c11/threads' mutex directly"Emil Velikov2015-03-0611-47/+111
| | | | | | This reverts commit 6cee785c69a5c8d2d32b6807f9c502117f5a74b0. Not meant to go in yet. Lacking review.
* Revert "egl/main: convert thread management to use c11 threads"Emil Velikov2015-03-061-6/+42
| | | | | | This reverts commit 33eff853363d7eba5e61b00431b95f7aa0d7b0a5. Not meant to go in yet. Lacking review.
* Revert "glx: remove final reference to THREADS"Emil Velikov2015-03-061-0/+4
| | | | | | This reverts commit 8b15a883e0ba72c9156d7192a798bb272e0bc528. Not meant to go in yet. Lacking review.
* Revert "glx: remove support for non-multithreaded platforms"Emil Velikov2015-03-063-2/+29
| | | | | | This reverts commit 38591295cd4b68f89f257b20f476f98de3772a47. Not meant to go in yet. Lacking review.
* glx: remove unneeded ifdef _WIN32 guardEmil Velikov2015-03-061-2/+0
| | | | | | | The C99 header exists on other platforms as well. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* util: rework _MSC_VER >= 1200 checksEmil Velikov2015-03-061-5/+3
| | | | | | | | | Replace the _MSC_VER >= 1200 with defined (_MSC_VER) and compact if/else statements. We require MSVC 2008 or later with commit 46110c5d564. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* glx: remove support for non-multithreaded platformsEmil Velikov2015-03-063-29/+2
| | | | | | | | Implicitly required for a while, although commit 9385c592c68 (mapi: remove u_thread.h) was the one that put the final nail on the coffin. Signed-off-by: Emil Velikov <[email protected]>
* glx: remove final reference to THREADSEmil Velikov2015-03-061-4/+0
| | | | | | | Left over from commit 18db13f5865(mapi: THREADS was always defined, remove it) Signed-off-by: Emil Velikov <[email protected]>
* egl/main: convert thread management to use c11 threadsEmil Velikov2015-03-061-42/+6
| | | | | | | | Convert the code to use the C11 threads implementation, and nuke the Windows non-pthreads code-path. The c11/threads_win32.h abstraction should be better than the current code. Signed-off-by: Emil Velikov <[email protected]>
* egl/main: use c11/threads' mutex directlyEmil Velikov2015-03-0611-111/+47
| | | | | | Remove the inline wrappers/abstraction layer. Signed-off-by: Emil Velikov <[email protected]>
* include: Add helper header to help trap includes inside extern C.José Fonseca2015-03-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is just to help repro and fixing these issues with any C++ compiler -- Commiting this will of course wait until all issues are addressed. $ scons src/glsl/ scons: Reading SConscript files ... Checking for GCC ... yes Checking for Clang ... no Checking for X11 (x11 xext xdamage xfixes glproto >= 1.4.13)... yes Checking for XCB (x11-xcb xcb-glx >= 1.8.1 xcb-dri2 >= 1.8)... yes Checking for XF86VIDMODE (xxf86vm)... yes Checking for DRM (libdrm >= 2.4.38)... yes Checking for UDEV (libudev >= 151)... yes warning: LLVM disabled: not building llvmpipe scons: done reading SConscript files. scons: Building targets ... scons: building associated VariantDir targets: build/linux-x86_64-debug/glsl Compiling src/glsl/ast_array_index.cpp ... Compiling src/glsl/ast_expr.cpp ... Compiling src/glsl/ast_function.cpp ... Compiling src/glsl/ast_to_hir.cpp ... Compiling src/glsl/ast_type.cpp ... Compiling src/glsl/builtin_functions.cpp ... In file included from include/c99_compat.h:28:0, from src/mapi/u_compiler.h:4, from src/mapi/u_thread.h:47, from src/mapi/glapi/glapi.h:47, from src/mesa/main/mtypes.h:42, from src/mesa/main/errors.h:47, from src/mesa/main/imports.h:41, from src/mesa/main/core.h:44, from src/glsl/builtin_functions.cpp:58: include/no_extern_c.h:48:1: error: template with C linkage template<class T> class _IncludeInsideExternCNotPortable; ^ In file included from include/c99_compat.h:28:0, from include/c11/threads.h:38, from src/mapi/u_thread.h:49, from src/mapi/glapi/glapi.h:47, from src/mesa/main/mtypes.h:42, from src/mesa/main/errors.h:47, from src/mesa/main/imports.h:41, from src/mesa/main/core.h:44, from src/glsl/builtin_functions.cpp:58: include/no_extern_c.h:48:1: error: template with C linkage template<class T> class _IncludeInsideExternCNotPortable; ^ Compiling src/glsl/builtin_types.cpp ... Compiling src/glsl/builtin_variables.cpp ... scons: *** [build/linux-x86_64-debug/glsl/builtin_functions.os] Error 1 scons: building terminated because of errors. Reviewed-by: Mark Janes <[email protected]>
* i965: free scratch buffers when destroying the contextIago Toral Quiroga2015-03-061-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | If scratch space is needed for a shader stage we try to reuse the last scratch buffer bound to that stage. If we can't, we free the old scratch buffer and allocate a new one. This means we always keep the last scratch buffer for a particular shader stage around for the entire life span of the context. These buffers are being reported by Valgrind as definitely lost after destroying the OpenGL context. For example, for the geometry shader stage: ==18350== 248 bytes in 1 blocks are definitely lost in loss record 85 of 150 ==18350== at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==18350== by 0xA1B35D6: drm_intel_gem_bo_alloc_internal (intel_bufmgr_gem.c:724) ==18350== by 0xA1B383F: drm_intel_gem_bo_alloc (intel_bufmgr_gem.c:794) ==18350== by 0xA1AEFA3: drm_intel_bo_alloc (intel_bufmgr.c:52) ==18350== by 0x9D08E31: brw_get_scratch_bo (brw_program.c:226) ==18350== by 0x9D2A0F2: do_gs_prog (brw_vec4_gs.c:280) ==18350== by 0x9D2A635: brw_gs_precompile (brw_vec4_gs.c:401) ==18350== by 0x9D14F68: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_shader.cpp:76) ==18350== by 0x9D157B8: brw_link_shader (brw_shader.cpp:269) ==18350== by 0x9B0941E: _mesa_glsl_link_shader (ir_to_mesa.cpp:3038) ==18350== by 0x99AE4ED: link_program (shaderapi.c:917) ==18350== by 0x99AF365: _mesa_LinkProgram (shaderapi.c:1385) So make sure that by the time we destroy the context we check if we have live scratch buffers for the various stages and release them if that is the case. Reviewed-by: Jordan Justen <[email protected]>
* i965: Fix URB size for CHVVille Syrjälä2015-03-061-1/+1
| | | | | | | | Increase the device info .urb.size for CHV to match the default URB size (192kB). Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]>
* i965/vec4: Don't lose the saturate modifier in copy propagation.Andrey Sudnik2015-03-051-1/+1
| | | | | | | Cc: 10.4, 10.5 <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224 Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4: Handle saturate in dump_instruction().Matt Turner2015-03-051-0/+2
| | | | Reviewed-by: Ian Romanick <[email protected]>
* ilo: enable L3 cache in MOCSChia-I Wu2015-03-065-17/+85
| | | | This enables L3 cache in MOCS almost everywhere.
* ilo: track if a ilo_view_surface is a scanoutChia-I Wu2015-03-062-16/+16
| | | | Scanouts require a different cache type.
* ilo: clean up SURFACE_STATE and BINDING_TABLE_STATEChia-I Wu2015-03-062-21/+35
| | | | | Add ilo_builder_surface_pointer() to replace ilo_builder_surface_write(). Make Gen8+ take a different path in gen6_SURFACE_STATE().
* mapi: actually remove unused u_thread.hBrian Paul2015-03-051-126/+0
| | | | | | I thought this was in the previous commit in the series. Reviewed-by: Emil Velikov <[email protected]>
* freedreno/ir3: fix silly typo for binning pass shadersRob Clark2015-03-051-1/+1
| | | | | | | | | Was resulting in gl_PointSize write being optimized out, causing particle system type shaders to hang if hw binning enabled. Fixes neverball, OGLES2ParticleSystem, etc. Signed-off-by: Rob Clark <[email protected]>
* glsl: let interface linking code validate its arraysTimothy Arceri2015-03-061-1/+2
| | | | | | Currently intrastage arrays are validated twice for interface blocks. Reviewed-by: Mark Janes <[email protected]>
* glsl: use common intrastage array validationTimothy Arceri2015-03-061-37/+37
| | | | | | | | | | | Use common intrastage array validation for interface blocks. This change also allows us to support interface blocks that are arrays of arrays. V2: Reinsert unsized array asserts in interstage_match() Reviewed-by: Mark Janes <[email protected]>
* glsl: move array validation into its own functionTimothy Arceri2015-03-062-39/+55
| | | | | | V2: return true when var->type is unsized but max access is within valid range Reviewed-by: Mark Janes <[email protected]>
* i965: Split Gen4-5 BlitFramebuffer code; prefer BLT over Meta.Kenneth Graunke2015-03-051-1/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | A while back I switched intel_blit_framebuffer to prefer Meta over the BLT. This meant that Gen8 platforms would start using the 3D engine for blits, just like we do on Gen6-7.5. However, I hadn't considered Gen4-5 when making that change. The BLT engine appears to be substantially faster on 965GM than using Meta to drive the 3D engine. This isn't too surprising: original Gen4 doesn't support tile offsets (that came on G45), and the level/layer fields don't work for cubemap rendering, so for inconvenient miplevel alignments, we end up blitting or copying data to/from temporaries in order to render to it. We may as well just use the blitter. I chose to use the BLT on Gen4-5 because they use the same ring for both 3D and BLT; Gen6+ splits it out. Fixes regressions on 965GM due to botched tile offset code (we should fix those properly as well, but they're longstanding bugs - for now, put things back to the status quo). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89430 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Cc: "10.5" <[email protected]>
* ilo: add more convenient intel_bo_{ref,unref}()Chia-I Wu2015-03-068-47/+35
| | | | | They both check for NULL and intel_bo_ref() returns the referenced bo. They replace intel_bo_{reference,unreference}().
* ilo: add intel_bo_set_tiling()Chia-I Wu2015-03-066-80/+71
| | | | | Make intel_winsys_alloc_bo() always allocate a linear bo, and add intel_bo_set_tiling() to set the tiling. Document the purpose of tiling.
* ilo: replace intel_tiling_mode by gen_surface_tilingChia-I Wu2015-03-0610-136/+167
| | | | | | The former is used by the kernel driver to set up fence registers and to pass tiling info across processes. It lacks INTEL_TILING_W, which made our code less expressive.
* ilo: update genhw headersChia-I Wu2015-03-067-782/+986
| | | | The main change is non-inline <enum>s are now generated as C enums.
* Fix invalid extern "C" around header inclusion.Mark Janes2015-03-0526-38/+96
| | | | | | | | | | | System headers may contain C++ declarations, which cannot be given C linkage. For this reason, include statements should never occur inside extern "C". This patch moves the C linkage statements to enclose only the declarations within a single header. Reviewed-by: Jose Fonseca <[email protected]>
* i965: Tell intel_get_memcpy() which direction the memcpy() is going.Matt Turner2015-03-055-42/+106
| | | | | | | | | | | | | | | | | | | | | | The SSSE3 swizzling code was written for fast uploads to the GPU and assumed the destination was always 16-byte aligned. When we began using this code for fast downloads as well we didn't do anything to account for the fact that the destination pointer given by glReadPixels() or glGetTexImage() is not guaranteed to be suitably aligned. With SSSE3 enabled (at compile-time), some applications would crash when an SSE aligned-store instruction tried to store to an unaligned destination (or an assertion that the destination is aligned would trigger). To remedy this, tell intel_get_memcpy() whether we're uploading or downloading so that it can select whether to assume the destination or source is aligned, respectively. Cc: 10.5 <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89416 Tested-by: Uriy Zhuravlev <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>