path: root/src
Commit message | Author | Age | Files | Lines
* i965/fs: Don't compute_to_mrf() on Gen >= 7. | Matt Turner | 2014-11-03 | 1 | -0/+4
  No differences in shader-db on Haswell (Gen 7.5).

  Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Remove now useless dot optimization on basis vect | Matt Turner | 2014-11-03 | 3 | -92/+3
  The optimization in commit d056863b covers these cases, which were the first optimizations I added to the GLSL compiler.

  Reviewed-by: Ian Romanick <[email protected]>
* glsl: Emit mul instead of dot if only one component left. | Matt Turner | 2014-11-03 | 1 | -1/+4
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85683
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85691
  Reviewed-by: Ian Romanick <[email protected]>
* clover: Fix clBuildProgram piglit regression | Tom Stellard | 2014-11-03 | 1 | -4/+4
  Should trigger CL_INVALID_VALUE if device_list is NULL and num_devices is greater than zero.

  Introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563

  Reported by: EdB
  Reviewed-by: Francisco Jerez <[email protected]>
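The check described in this commit can be sketched as below. This is a minimal illustration, not clover's actual code: `validate_device_list` and the `SKETCH_`-prefixed constants are hypothetical names, and the error values are copied from the OpenCL headers only so the example compiles without an OpenCL SDK. Only the condition named in the commit message is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror CL_SUCCESS and CL_INVALID_VALUE from the OpenCL headers;
 * redefined here under hypothetical names to keep the sketch standalone. */
enum { SKETCH_CL_SUCCESS = 0, SKETCH_CL_INVALID_VALUE = -30 };

/* Hypothetical helper: rejects a NULL device list paired with a
 * nonzero device count, the case named in the commit message. */
int
validate_device_list(unsigned num_devices, const void *const *device_list)
{
   if (device_list == NULL && num_devices > 0)
      return SKETCH_CL_INVALID_VALUE;
   return SKETCH_CL_SUCCESS;
}
```

A NULL list with a count of zero passes this check, as does any non-NULL list with a nonzero count.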
* gallivm: Disable frame-pointer-omission on x86 to ensure right stack alignment. | José Fonseca | 2014-11-03 | 1 | -1/+3
  Between releases 3.2 and 3.3, LLVM stopped aligning the stack properly under certain conditions (no allocas, but a large number of vectors causing spills to the stack, and frame pointer omission enabled).

  We were already disabling frame-pointer-omission on several build types, but we now disable it on all build types.

  It's not clear whether this affects 32-bit x86 processes only, or if it can also affect 64-bit x86_64 processes when AVX registers are available and used. So disable frame-pointer-omission on both x86/x86_64 to be on the safe side.

  See also:
  - http://llvm.org/PR21435

  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: When disassemble a function, start by printing out its name. | José Fonseca | 2014-11-03 | 1 | -0/+1
  To help recognize what it's supposed to do.

  Reviewed-by: Roland Scheidegger <[email protected]>
* i965/chv: Increase VS and GS thread counts | Ben Widawsky | 2014-11-02 | 1 | -2/+2
  AFAICT the number of threads is 80, not 70. I am not sure if Ken knows something I do not.

  Signed-off-by: Ben Widawsky <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* gallium/docs: fix NRM, NRM4 docs | Brian Paul | 2014-11-01 | 1 | -12/+24
  Need to do a sqrt().

  FWIW, the html that Sphinx 1.1.3 generates for the math expressions looks completely broken.

  Reviewed-by: José Fonseca <[email protected]>
* softpipe: use the tgsi_free_tokens() function | Brian Paul | 2014-10-31 | 1 | -6/+6
  Reviewed-by: Charmaine Lee <[email protected]>
* tgsi: add a tgsi_free_tokens() function | Brian Paul | 2014-10-31 | 2 | -0/+13
  To match tgsi_alloc_tokens().

  Reviewed-by: Charmaine Lee <[email protected]>
* util: simplify u_pstipple.c code | Brian Paul | 2014-10-31 | 1 | -123/+62
  Use the new helper functions in the tgsi_transform.h file to emit declarations and instructions.

  Reviewed-by: Charmaine Lee <[email protected]>
* util: simplify temp register selection in u_pstipple.c | Brian Paul | 2014-10-31 | 1 | -27/+18
  Reviewed-by: Charmaine Lee <[email protected]>
* util: simplify util_pstipple_create_fragment_shader() params | Brian Paul | 2014-10-31 | 3 | -38/+28
  Pass and return tgsi_token buffers instead of pipe_shader_state. And update softpipe driver (the only user of this function).

  Reviewed-by: Charmaine Lee <[email protected]>
* softpipe: remove unused softpipe_create_fs_variant_exec() parameter | Brian Paul | 2014-10-31 | 3 | -5/+3
  Reviewed-by: Charmaine Lee <[email protected]>
* softpipe: check for SP_NEW_STIPPLE when building quad pipeline | Brian Paul | 2014-10-31 | 1 | -0/+1
  Fixes polygon stipple if both DO_PSTIPPLE_IN_DRAW_MODULE and DO_PSTIPPLE_IN_HELPER_MODULE are zero/off.

  Reviewed-by: Charmaine Lee <[email protected]>
* r600g: Fix build with opencl and radeonsi disabled | Tom Stellard | 2014-10-31 | 1 | -6/+6
* clover: Fix bug when binary programs are passed to clBuildProgram() v2 | Tom Stellard | 2014-10-31 | 2 | -6/+14
  This was a regression introduced by 611d66fe4513e53bde052dd2bab95d448c909a2a

  Passing a binary program to clBuildProgram() is legal, but passing one to clCompileProgram() is not.

  v2:
  - Code cleanups.

  Reviewed-by: Francisco Jerez <[email protected]>
* clover: Factor input validation of clCompileProgram into a new function v2 | Tom Stellard | 2014-10-31 | 1 | -10/+23
  This factors out the validation that is common with clBuildProgram().

  v2:
  - Code cleanups.

  Reviewed-by: Francisco Jerez <[email protected]>
* radeonsi/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 | Tom Stellard | 2014-10-31 | 4 | -59/+127
  v2:
  - Drop dependency on LLVM >= 3.5.1
  - Rename si_create_shader() to si_shader_binary_read()
* r600g/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 | Tom Stellard | 2014-10-31 | 8 | -97/+180
  v2:
  - Drop dependency on LLVM >= 3.5.1
* gallium/radeon: Add query for symbol specific config information | Tom Stellard | 2014-10-31 | 3 | -0/+86
  This adds a query which allows drivers to access the config information of a specific function within the LLVM generated ELF binary. This makes it possible for the driver to handle ELF binaries with multiple kernels / global functions.
* r300g: remove enabled/disabled hyperz and AA compression messages | Marek Olšák | 2014-10-30 | 1 | -2/+0
  It's annoying with octave. Reported by Michael Burian.

  Cc: 10.2 10.3 <[email protected]>
* r600g: Delete unused variable 'max_global_size' in 'r600_get_compute_param' | Dieter Nützel | 2014-10-30 | 1 | -1/+0
  Signed-off-by: Dieter Nützel <[email protected]>
* mesa: protect the debug state with a mutex | Chia-I Wu | 2014-10-30 | 2 | -47/+126
  We are about to change mesa to spawn threads for deferred glCompileShader and glLinkProgram, and we need to make sure those threads can send compiler warnings/errors to the debug output safely.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: protect glsl_type with a mutex | Chia-I Wu | 2014-10-30 | 2 | -10/+62
  glsl_type has several static hash tables and a static ralloc context. They need to be protected by a mutex as they are not thread-safe.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69200
  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: protect anonymous struct id with a mutex | Chia-I Wu | 2014-10-30 | 1 | -2/+8
  There may be two contexts compiling shaders at the same time, and we want the anonymous struct id to be globally unique.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
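The idea of a globally unique id handed out under a lock can be sketched as follows. This is only an illustration under stated assumptions: it uses POSIX threads, whereas the commit may use a different mutex API, and `next_anonymous_struct_id`, `anon_id_mutex` and `anon_id_counter` are hypothetical names that do not come from the Mesa source.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical process-wide state shared by all compiling contexts. */
static pthread_mutex_t anon_id_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned anon_id_counter = 0;

/* Returns an id that no other caller receives, even when several
 * threads call this function concurrently. */
unsigned
next_anonymous_struct_id(void)
{
   unsigned id;
   int rc;

   rc = pthread_mutex_lock(&anon_id_mutex);
   assert(rc == 0);
   id = anon_id_counter++;          /* read and increment under the lock */
   rc = pthread_mutex_unlock(&anon_id_mutex);
   assert(rc == 0);
   (void)rc;                        /* silence unused warning with NDEBUG */
   return id;
}
```

Holding the lock across both the read and the increment is what prevents two contexts from observing the same counter value.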
* util: initialize locale_t with a static object | Chia-I Wu | 2014-10-30 | 1 | -10/+8
  _mesa_strtod and _mesa_strtof may be called from multiple threads. They need to be thread-safe.

  v2: platform checks are now done in configure.ac

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* configure: check for xlocale.h and strtof | Chia-I Wu | 2014-10-30 | 1 | -8/+4
  With the assumptions that xlocale.h implies newlocale and strtof_l. SCons is updated to define HAVE_XLOCALE_H on linux and darwin.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* util: add _mesa_strtod and _mesa_strtof | Chia-I Wu | 2014-10-30 | 10 | -40/+21
  Both core mesa and glsl have their own wrappers for strtof_l. Merge and move them to util/.

  They are compiled with a C++ compiler so that we can make them thread-safe in a following commit.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/gallium: Signal _NEW_TRANSFORM from glClipControl. | Mathias Fröhlich | 2014-10-30 | 2 | -13/+6
  This removes the need for the gallium rasterizer state to listen to viewport changes. Thanks to Marek Olšák <[email protected]>.

  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Mathias Froehlich <[email protected]>
* Revert "i965/compaction: Disable compaction on SNB temporarily." | Matt Turner | 2014-10-29 | 1 | -6/+0
  This reverts commit cabc93c5adc9ea62be901621eff5ce4cb9574791.

  Mark thinks the failures on the SNB GT2 in the lab are actually because of faulty hardware, not instruction compaction. The GT1 didn't see any problems after changes to the compaction code.
* i965/vec4: Perform CSE on MAD instructions with final arguments switched. | Matt Turner | 2014-10-29 | 1 | -1/+5
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Perform CSE on MAD instructions with final arguments switched. | Matt Turner | 2014-10-29 | 1 | -1/+5
  Multiplication is commutative.

  instructions in affected programs: 48314 -> 47954 (-0.75%)

  Reviewed-by: Kenneth Graunke <[email protected]>
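The operand comparison behind this kind of CSE can be sketched as below. This is a simplified illustration, not the i965 code: `struct mad_inst` and `mad_operands_match` are hypothetical, operands are reduced to plain register numbers, and the sketch assumes the last two operands are the multiplicands, as the "Multiplication is commutative" remark implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified multiply-add instruction.
 * Assumed semantics: dst = src[0] + src[1] * src[2]. */
struct mad_inst {
   int src[3];
};

/* Two MADs compute the same value if the addend matches and the
 * multiplicands match in either order. */
bool
mad_operands_match(const struct mad_inst *a, const struct mad_inst *b)
{
   if (a->src[0] != b->src[0])
      return false;

   return (a->src[1] == b->src[1] && a->src[2] == b->src[2]) ||
          (a->src[1] == b->src[2] && a->src[2] == b->src[1]);
}
```

Only the last two operands may be swapped; exchanging the addend with a multiplicand changes the result, so that case is rejected.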
* glsl: Drop constant 0.0 components from dot products. | Matt Turner | 2014-10-29 | 1 | -0/+27
  Helps a small number of vertex shaders in the games Dungeon Defenders and Shank, as well as an internal benchmark.

  instructions in affected programs: 2801 -> 2719 (-2.93%)

  Reviewed-by: Kenneth Graunke <[email protected]>
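Why dropping zero components is safe can be illustrated with a small sketch. The compiler pass works on constant operands at compile time; the hypothetical `dot_dropping_zeros` below stands in for that with a runtime check, and it assumes finite inputs (a zero multiplied by infinity or NaN would not be zero).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dot product of the first n components of a and b
 * that skips every component where b[i] is 0.0. Optionally reports how
 * many terms survived through terms_kept. */
float
dot_dropping_zeros(const float a[4], const float b[4], int n, int *terms_kept)
{
   float sum = 0.0f;
   int kept = 0;

   assert(n >= 1 && n <= 4);
   for (int i = 0; i < n; i++) {
      if (b[i] == 0.0f)
         continue;            /* a[i] * 0.0 adds nothing for finite a[i] */
      sum += a[i] * b[i];
      kept++;
   }
   if (terms_kept != NULL)
      *terms_kept = kept;
   return sum;
}
```

When a single term survives, the sum degenerates to one multiplication, which is the case the "Emit mul instead of dot if only one component left" commit in this log addresses.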
* glx/dri3: Implement LIBGL_SHOW_FPS=1 for DRI3/Present. | Kenneth Graunke | 2014-10-29 | 2 | -2/+36
  v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event rather than gettimeofday(), which gives us the presentation time instead of the time when SwapBuffers was called. Suggested by Keith Packard.

  This relies on the fact that the X DRI3/Present implementations use microseconds for UST.

  v3: Properly ignore PresentCompleteKindMSCNotify; multiply in 64 bits (caught by Keith Packard).

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Keith Packard <[email protected]> [v3]
  Reviewed-by: Marek Olšák <[email protected]> [v1]
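The "multiply in 64 bits" note can be illustrated with a frame-rate calculation over microsecond UST timestamps. The commit does not show its arithmetic here, so this is only a sketch of why widening matters: `fps_from_ust` is a hypothetical helper, not the function used in the GLX code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: frames per second over the interval between two
 * UST timestamps, assumed to be in microseconds. */
double
fps_from_ust(uint32_t frames, uint64_t ust_start, uint64_t ust_end)
{
   assert(ust_end > ust_start);

   /* Widen before multiplying: frames * 1000000 would wrap in 32 bits
    * once frames exceeds 4294. */
   uint64_t scaled = (uint64_t)frames * 1000000u;

   return (double)scaled / (double)(ust_end - ust_start);
}
```

With a 32-bit product, 5000 frames would already overflow, which is why the cast is applied to the operand rather than to the result.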
* i965: Rename brw_vec4_gs.[ch] to brw_gs.[ch]. | Kenneth Graunke | 2014-10-29 | 5 | -5/+4
  These source files support actual geometry shaders, so using "gs" for the name makes a lot of sense. We're going to be adding SIMD8 geometry shader support as well, at which point "vec4_gs" will be a misnomer.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Acked-by: Matt Turner <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Iago Toral Quiroga <[email protected]>
* i965: Rename brw_gs{,_emit}.[ch] to brw_ff_gs{,_emit}.[ch]. | Kenneth Graunke | 2014-10-29 | 5 | -5/+5
  The brw_gs.[ch] and brw_gs_emit.c source files contain code for emulating fixed-function unit functionality (VF primitive decomposition or SOL) using the GS unit. They do not contain code to support proper geometry shaders.

  We've taken to calling that code "ff_gs" (see brw_ff_gs_prog_key, brw_ff_gs_prog_data, brw_context::ff_gs, brw_ff_gs_compile, brw_ff_gs_prog). So it makes sense to make the filenames match.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Acked-by: Matt Turner <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Iago Toral Quiroga <[email protected]>
* i965: Rename intel_bufferobj_* functions to match GL and DD hooks. | Kenneth Graunke | 2014-10-29 | 1 | -65/+64
  The GL functions and driver hooks use corresponding names, for example glMapBufferRange and Driver.MapBufferRange. But our implementation was called "intel_bufferobj_map_range," which has the words "map" and "buffer" swapped, as well as randomly adding "obj."

  FlushMappedBufferRange was even trickier: it ordered the words 3, "obj", 1, 2, 4: intel_bufferobj_flush_mapped_range.

  Even though the old names were consistent, I always had trouble rearranging the jumble of words when searching for a function, and it took a few tries to eventually land there.

  The new names match the word order of GL and the driver hooks; FlushMappedBufferRange is simply brw_flush_mapped_buffer_range.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
* radeon/llvm: Dynamically allocate branch/loop stack arrays | Michel Dänzer | 2014-10-29 | 2 | -6/+37
  This prevents us from silently overflowing the stack arrays, and allows arbitrary stack depths.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454
  Cc: [email protected]
  Reported-and-Tested-by: Nick Sarnie <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
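A stack that grows on demand instead of using a fixed-size array can be sketched as follows. This is a generic illustration, not the radeon/llvm code: `struct int_stack`, `int_stack_push` and `int_stack_pop` are hypothetical, and the doubling growth policy and initial capacity of 16 are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable stack; zero-initialize before first use. */
struct int_stack {
   int *data;
   unsigned depth;      /* number of entries in use */
   unsigned capacity;   /* number of entries allocated */
};

/* Pushes value, growing the array when it is full.
 * Returns 0 on success, -1 if the allocation fails. */
int
int_stack_push(struct int_stack *s, int value)
{
   if (s->depth == s->capacity) {
      unsigned new_capacity = s->capacity ? s->capacity * 2 : 16;
      int *grown = realloc(s->data, new_capacity * sizeof(*grown));

      if (grown == NULL)
         return -1;     /* old array is still valid and unchanged */
      s->data = grown;
      s->capacity = new_capacity;
   }
   s->data[s->depth++] = value;
   return 0;
}

/* Pops the most recently pushed value; the stack must not be empty. */
int
int_stack_pop(struct int_stack *s)
{
   assert(s->depth > 0);
   return s->data[--s->depth];
}
```

Because every push checks the capacity first, a deeply nested program fails with an explicit allocation error at worst instead of writing past the end of a fixed array. The caller releases the storage with `free(s.data)`.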
* mesa: Fix order of errors for glDrawTransformFeedbackStream | Chris Forbes | 2014-10-29 | 1 | -5/+5
  The OpenGL 4.0 core profile specification, section 2.17.3 Transform Feedback Draw Operations says:

    "The error INVALID_VALUE is generated if <stream> is greater than or equal to the value of MAX_VERTEX_STREAMS.
    ...
    The error INVALID_OPERATION is generated if EndTransformFeedback has never been called while the object named by id was bound."

  Fixes the piglit test: ARB_transform_feedback3/arb_transform_feedback3-draw_using_invalid_stream_index (with the test itself fixed to eliminate an unrelated failure)

  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
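The effect of check ordering can be sketched as below. This assumes the fix makes the stream range check run first, as the order of the quoted specification text and the name of the piglit test suggest; `validate_draw_tfb_stream` and the `SKETCH_` constants are hypothetical, with values copied from the GL headers so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror GL_NO_ERROR, GL_INVALID_VALUE and GL_INVALID_OPERATION. */
enum {
   SKETCH_GL_NO_ERROR = 0,
   SKETCH_GL_INVALID_VALUE = 0x0501,
   SKETCH_GL_INVALID_OPERATION = 0x0502
};

/* Hypothetical validator: when both conditions fail, the first check
 * decides which error the application sees. */
unsigned
validate_draw_tfb_stream(unsigned stream, unsigned max_vertex_streams,
                         bool end_transform_feedback_called)
{
   if (stream >= max_vertex_streams)
      return SKETCH_GL_INVALID_VALUE;

   if (!end_transform_feedback_called)
      return SKETCH_GL_INVALID_OPERATION;

   return SKETCH_GL_NO_ERROR;
}
```

An out-of-range stream on an object that never ended transform feedback reports INVALID_VALUE here; with the checks in the opposite order the same call would report INVALID_OPERATION.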
* vc4: Add support for ARL and indirect register access on TGSI_FILE_CONSTANT. | Eric Anholt | 2014-10-28 | 10 | -34/+407
  Fixes 14 ARB_vp tests (which had no lowering done), and should improve performance of indirect uniform array access in GLSL.
* vc4: Fix mixup of return type in reloc_tex(). | Eric Anholt | 2014-10-28 | 1 | -2/+2
* vc4: Drop redundant check for is_tmu_write(). | Eric Anholt | 2014-10-28 | 1 | -3/+0
  This function is only called when it would return true.
* vc4: Don't forget to validate code that's got PROG_END on it. | Eric Anholt | 2014-10-28 | 1 | -5/+6
  This signal doesn't terminate the program now, it terminates the program soon. So you have to actually validate the code in the instruction.
* vc4: Add .dir-locals.el for kernel style in the kernel code. | Eric Anholt | 2014-10-28 | 1 | -0/+12
* vc4: Fix a couple missing '\n's in error output. | Eric Anholt | 2014-10-28 | 2 | -2/+2
* st/mesa: use PIPE_BIND_DISPLAY_TARGET when checking for sRGB capability | Brian Paul | 2014-10-28 | 1 | -1/+2
  When we're checking if the framebuffer is sRGB capable, call is_format_supported() with the PIPE_BIND_DISPLAY_TARGET flag.

  Reviewed-by: Charmaine Lee <[email protected]>
* Revert "st/mesa: set MaxUnrollIterations = 255" | Marek Olšák | 2014-10-28 | 1 | -2/+1
  This reverts commit 20836c81851e0df29a8ee9c86e5e5388738c840b.

  255 is a huge number. If you have a loop with 255 iterations, unrolling it will exceed the SM3 instruction limit. Let's use the default again.

  The comment about a SM3 limit doesn't make sense. For SM3, we generally want 32 (default) or a lower number due to the SM3 instruction limit, which is 512 instructions. For SM4, we can try higher numbers if needed, but some shaders can end up being pretty huge and shader compilation can take more time.

  This fixes a shader compile failure on R500/SM3. Reported on IRC.

  Cc: 10.2 10.3 <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
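The arithmetic behind the revert can be sketched with a deliberately simple model in which unrolling replicates the loop body once per iteration. The 512-instruction limit and the iteration counts 255 and 32 come from the commit message; the helper names and the three-instruction loop body used in the note below are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* SM3 instruction limit quoted in the commit message. */
enum { SM3_INSTRUCTION_LIMIT = 512 };

/* Simplified model: a fully unrolled loop costs the body size times the
 * iteration count, ignoring any instructions outside the loop. */
unsigned
unrolled_instruction_count(unsigned body_instructions, unsigned iterations)
{
   return body_instructions * iterations;
}

/* True if the unrolled loop alone stays within the SM3 limit. */
bool
fits_sm3(unsigned body_instructions, unsigned iterations)
{
   return unrolled_instruction_count(body_instructions, iterations)
          <= SM3_INSTRUCTION_LIMIT;
}
```

Under this model a three-instruction body unrolled 255 times needs 765 instructions and does not fit, while the same body unrolled 32 times needs 96.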
* r300g/vdpau: enable again | David Heidelberger | 2014-10-28 | 1 | -0/+1
  Signed-off-by: David Heidelberger <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* r300g: only set clip_halfz for chips with HW TCL | Marek Olšák | 2014-10-28 | 1 | -1/+1
  I forgot that we cannot emit vertex shader state on a chip without VS. In such a case, clip_halfz is handled by the Draw module.