path: root/src
Commit message | Author | Age | Files | Lines
* mesa: Make validate_shader_target() non-static. | Paul Berry | 2014-01-21 | 2 | -4/+7
    Reviewed-by: Chris Forbes <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program. | Paul Berry | 2014-01-21 | 4 | -21/+4
    In my recent zeal to refactor Mesa's handling of the gl_shader_stage
    enum, I accidentally wound up with two functions that do the same
    thing: _mesa_program_index_to_target(), and
    _mesa_shader_stage_to_program().

    This patch keeps _mesa_shader_stage_to_program(), since its name is
    more consistent with other related functions. However, it changes
    the signature so that it accepts an unsigned integer instead of a
    gl_shader_stage--this avoids awkward casts when the function is
    called from C++ code.

    Reviewed-by: Chris Forbes <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: dump geometry shaders when using LP_DEBUG=tgsi | Dave Airlie | 2014-01-22 | 1 | -1/+4
    for consistency with vs and fs dumpers.

    Reviewed-by: Brian Paul <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* mesa: Generate GL_INVALID_OPERATION for unsupported DSA TexStorage functions | Ian Romanick | 2014-01-21 | 1 | -3/+15
    We have to make the functions available to work around a GLEW bug
    (see comments already in the code), but if an application calls one
    of these functions we should still generate GL_INVALID_OPERATION.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* mesa: Silence many unused parameter warnings | Ian Romanick | 2014-01-21 | 1 | -0/+22
    main/texstorage.c: In function '_mesa_alloc_texture_storage':
    main/texstorage.c:240:53: warning: unused parameter 'width' [-Wunused-parameter]
    main/texstorage.c:241:37: warning: unused parameter 'height' [-Wunused-parameter]
    main/texstorage.c:241:53: warning: unused parameter 'depth' [-Wunused-parameter]
    main/texstorage.c: In function '_mesa_TextureStorage1DEXT':
    main/texstorage.c:464:34: warning: unused parameter 'texture' [-Wunused-parameter]
    main/texstorage.c:464:50: warning: unused parameter 'target' [-Wunused-parameter]
    main/texstorage.c:464:66: warning: unused parameter 'levels' [-Wunused-parameter]
    main/texstorage.c:465:34: warning: unused parameter 'internalformat' [-Wunused-parameter]
    main/texstorage.c:466:35: warning: unused parameter 'width' [-Wunused-parameter]
    main/texstorage.c: In function '_mesa_TextureStorage2DEXT':
    main/texstorage.c:473:34: warning: unused parameter 'texture' [-Wunused-parameter]
    main/texstorage.c:473:50: warning: unused parameter 'target' [-Wunused-parameter]
    main/texstorage.c:473:66: warning: unused parameter 'levels' [-Wunused-parameter]
    main/texstorage.c:474:34: warning: unused parameter 'internalformat' [-Wunused-parameter]
    main/texstorage.c:475:35: warning: unused parameter 'width' [-Wunused-parameter]
    main/texstorage.c:475:50: warning: unused parameter 'height' [-Wunused-parameter]
    main/texstorage.c: In function '_mesa_TextureStorage3DEXT':
    main/texstorage.c:483:34: warning: unused parameter 'texture' [-Wunused-parameter]
    main/texstorage.c:483:50: warning: unused parameter 'target' [-Wunused-parameter]
    main/texstorage.c:483:66: warning: unused parameter 'levels' [-Wunused-parameter]
    main/texstorage.c:484:34: warning: unused parameter 'internalformat' [-Wunused-parameter]
    main/texstorage.c:485:35: warning: unused parameter 'width' [-Wunused-parameter]
    main/texstorage.c:485:50: warning: unused parameter 'height' [-Wunused-parameter]
    main/texstorage.c:485:66: warning: unused parameter 'depth' [-Wunused-parameter]

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* i965: Ignore 'centroid' interpolation qualifier in case of persample shading | Anuj Phogat | 2014-01-21 | 1 | -1/+1
    This patch handles the use of the 'centroid' qualifier with 'in'
    variables in a fragment shader when per-sample shading is enabled.
    Per-sample shading for the whole fragment shader can be enabled by
    glEnable(GL_SAMPLE_SHADING) or by using the {gl_SamplePosition,
    gl_SampleID} builtin variables in the fragment shader. In more
    detail:

    /* Enable sample shading using OpenGL API */
    glEnable(GL_SAMPLE_SHADING);
    glMinSampleShading(1.0);

    Example fragment shader:
    in vec4 a;
    centroid in vec4 b;
    main()
    {
      ...
    }

    Variable 'a' will be interpolated at the sample location. But what
    interpolation should we use for variable 'b'?

    ARB_sample_shading recommends interpolation at the sample position
    for all the variables. The GLSL 400 (and earlier) spec says:

    "When an interpolation qualifier is used, it overrides settings
    established through the OpenGL API."

    But this text was deleted in later versions of GLSL.

    NVIDIA's and AMD's proprietary Linux drivers (at OpenGL 4.3)
    interpolate at the sample position. This convinces me to use a
    similar approach on Intel hardware.

    Cc: [email protected]
    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965: Use sample barycentric coordinates with per sample shading | Anuj Phogat | 2014-01-21 | 4 | -6/+30
    The current implementation of ARB_sample_shading doesn't set
    'Barycentric Interpolation Mode' correctly. We use pixel barycentric
    coordinates for per-sample shading. Instead we should select
    perspective sample or non-perspective sample barycentric
    coordinates.

    It also enables using sample barycentric coordinates in case of a
    fragment shader variable declared with the 'sample' qualifier,
    e.g. sample in vec4 pos;

    A piglit test to verify the implementation has been posted on the
    piglit mailing list for review.

    V2: Do not interpolate all the 'in' variables at sample position if
    the fragment shader uses the 'sample' qualifier with one of them.
    For example, given the fragment shader:

    #version 330
    #extension ARB_gpu_shader5: require
    sample in vec4 a;
    in vec4 b;
    main()
    {
      ...
    }

    Only 'a' should be sampled at the sample location, not 'b'.

    Cc: [email protected]
    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965: Add an option to ignore sample qualifier | Anuj Phogat | 2014-01-21 | 5 | -7/+9
    This will be useful in my next patch, which depends on
    _mesa_get_min_invocations_per_fragment() being able to ignore the
    sample qualifier (prog->IsSample) based on a flag passed to it.

    Cc: [email protected]
    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* mesa/x86: Remove dead read_rgba_span_x86.h. | Matt Turner | 2014-01-21 | 1 | -56/+0
    Dead since 304f7a13.
* i965/fs: Optimize LRP with x == y into a MOV. | Matt Turner | 2014-01-21 | 1 | -0/+10
    total instructions in shared programs: 1487331 -> 1485988 (-0.09%)
    instructions in affected programs:     45638 -> 44295 (-2.94%)
    GAINED:                                7
    LOST:                                  0

    Reviewed-by: Jordan Justen <[email protected]>
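Why the LRP-to-MOV rewrite is safe can be sketched in a few lines of C. This is only an illustration, assuming LRP follows the GLSL mix() definition x*(1-a) + y*a; the helper names lrp and lrp_equal_endpoints are hypothetical and are not taken from the i965 backend.

```c
#include <assert.h>

/* Reference linear interpolation, assuming the GLSL mix() definition:
 * x * (1 - a) + y * a. */
static float lrp(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* Hypothetical stand-in for the optimized form: when both endpoints are
 * the same value, the blend factor drops out algebraically
 * (x*(1-a) + x*a == x), so the result is a plain copy of x. */
static float lrp_equal_endpoints(float x, float y, float a)
{
    assert(x == y);  /* the rewrite only applies when the endpoints match */
    (void)a;         /* the blend factor is irrelevant in this case */
    return x;        /* behaves like "MOV dst, x" */
}
```

The same identity is what lets an open-coded x*(1-a) + y*a expression be recognized as a single lrp in the first place.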
* glsl: Optimize open-coded lrp into lrp. | Jordan Justen | 2014-01-21 | 1 | -0/+52
    total instructions in shared programs: 1498191 -> 1487051 (-0.74%)
    instructions in affected programs:     669388 -> 658248 (-1.66%)
    GAINED:                                1
    LOST:                                  0

    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Jordan Justen <[email protected]>
* i965: Enable AOS optimizations for the geometry shader. | Matt Turner | 2014-01-21 | 1 | -0/+1
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Vectorize multiple scalar assignments | Matt Turner | 2014-01-21 | 4 | -0/+325
    Reduces vertex shader instruction counts in DOTA2 by 6.42%, L4D2 by
    4.61%, and CS:GO by 5.71%.

    total instructions in shared programs: 1500153 -> 1498191 (-0.13%)
    instructions in affected programs:     59919 -> 57957 (-3.27%)

    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add parameter to .equals() to ignore an IR type. | Matt Turner | 2014-01-21 | 2 | -36/+38
    Only implemented for ir_swizzles currently, but perhaps will be
    useful for other IR types in the future.

    Reviewed-by: Ian Romanick <[email protected]>
* mesa: rename PreferDP4 to OptimizeForAOS. | Matt Turner | 2014-01-21 | 8 | -10/+13
    This flag was really just a proxy for determining whether the
    backend was vector (AOS) or scalar (SOA). It will be used to apply a
    future optimization only for vector backends.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Print the maximum register pressure. | Matt Turner | 2014-01-21 | 1 | -1/+3
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Show register pressure in dump_instructions() output. | Kenneth Graunke | 2014-01-21 | 3 | -1/+16
    Dumping the number of live registers at each IP allows us to see
    register pressure and identify any local maxima. This should aid in
    debugging passes designed to reduce register pressure, as well as
    optimizations that suddenly trigger spilling.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
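The idea of counting live registers per instruction pointer (IP) can be sketched as follows. This is a simplified illustration, not the i965 code: it assumes each register has one contiguous live range [start, end] and occupies a single unit, and the names count_live_per_ip and max_pressure are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* For every IP, count how many registers are live there.
 * start[r]..end[r] is the (inclusive) live range of register r. */
static void count_live_per_ip(const int *start, const int *end,
                              size_t num_regs,
                              int *live_at_ip, size_t num_ips)
{
    for (size_t ip = 0; ip < num_ips; ip++)
        live_at_ip[ip] = 0;

    for (size_t r = 0; r < num_regs; r++) {
        assert(start[r] >= 0 && start[r] <= end[r] &&
               (size_t)end[r] < num_ips);
        for (int ip = start[r]; ip <= end[r]; ip++)
            live_at_ip[ip]++;
    }
}

/* The maximum over all IPs is the peak register pressure. */
static int max_pressure(const int *live_at_ip, size_t num_ips)
{
    int max = 0;
    for (size_t ip = 0; ip < num_ips; ip++)
        if (live_at_ip[ip] > max)
            max = live_at_ip[ip];
    return max;
}
```

Printing live_at_ip[ip] next to each dumped instruction gives the kind of per-IP view the commit describes; a real implementation would presumably weight each register by its size.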
* i965: Compute the number of live registers at each IP. | Kenneth Graunke | 2014-01-21 | 3 | -0/+22
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
* i965/fs: Call opt_peephole_sel later in the optimization loop. | Matt Turner | 2014-01-21 | 1 | -1/+1
    Calling it after value numbering (added in the next commit) prevents
    some instruction count regressions.

    total instructions in shared programs: 1524387 -> 1523905 (-0.03%)
    instructions in affected programs:     13112 -> 12630 (-3.68%)
    GAINED:                                0
    LOST:                                  3

    Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Calculate interference better in register_coalesce. | Matt Turner | 2014-01-21 | 1 | -7/+72
    Previously we simply considered two registers whose live ranges
    overlapped to interfere. Cases such as

       set A        ------
       ...                |
       mov B, A  --       |
       ...         | B    | A
       use B     --       |
       ...                |
       use A        ------

    would be considered to interfere, even though B is an unmodified
    copy of A whose live range fit wholly inside that of A.

    If no writes to A or B occur between the mov B, A and the use of B
    then we can safely coalesce them.

    Instead of removing MOV instructions, we make them NOPs and remove
    them at once after the main pass is finished in order to avoid
    recomputing live intervals (which are needed to perform the previous
    step).

    total instructions in shared programs: 1543768 -> 1513077 (-1.99%)
    instructions in affected programs:     951563 -> 920872 (-3.23%)
    GAINED:                                46
    LOST:                                  22

    Reviewed-by: Jordan Justen <[email protected]>
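The coalescing condition described above (no writes to A or B between "mov B, A" and the last use of B) can be sketched on a toy instruction list. This is a minimal illustration under strong assumptions: straight-line code, one destination and at most two sources per instruction, and a conservative check. The struct inst layout and the function names are hypothetical and do not mirror the i965 data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction: one destination register and up
 * to two source registers. A register number of -1 means "unused". */
struct inst {
    int dst;
    int src[2];
};

/* Index of the last instruction that reads `reg`, or -1 if none does. */
static int last_use(const struct inst *prog, size_t n, int reg)
{
    int last = -1;
    for (size_t i = 0; i < n; i++)
        if (prog[i].src[0] == reg || prog[i].src[1] == reg)
            last = (int)i;
    return last;
}

/* prog[mov_ip] is "mov B, A". B can be replaced by A if neither register
 * is written after the mov up to and including the last use of B.
 * Treating a write at the last use as a conflict is conservative. */
static bool can_coalesce(const struct inst *prog, size_t n, size_t mov_ip)
{
    assert(mov_ip < n);
    int b = prog[mov_ip].dst;
    int a = prog[mov_ip].src[0];
    int end = last_use(prog, n, b);

    for (int ip = (int)mov_ip + 1; ip <= end; ip++)
        if (prog[ip].dst == a || prog[ip].dst == b)
            return false;
    return true;
}
```

With the sequence from the diagram (set A; mov B, A; use B; use A) the check succeeds, while inserting a second write to A between the mov and the use of B makes it fail.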
* i965/fs: Support coalescing registers of size > 1. | Matt Turner | 2014-01-21 | 1 | -23/+59
    total instructions in shared programs: 1550048 -> 1549880 (-0.01%)
    instructions in affected programs:     1896 -> 1728 (-8.86%)

    Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Assert that var < num_vars. | Matt Turner | 2014-01-21 | 1 | -0/+2
    Helped to track down a problem in a version of the next commit.

    Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Add a comment explaining how register coalescing works. | Matt Turner | 2014-01-21 | 1 | -0/+12
    Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Add and use MAX_SAMPLER_MESSAGE_SIZE definition. | Matt Turner | 2014-01-21 | 3 | -5/+10
    Reviewed-by: Jordan Justen <[email protected]>
* mesa: Add STRINGIFY macro. | Matt Turner | 2014-01-21 | 1 | -0/+2
    Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Fix the example about overwriting uniforms in SIMD16. | Matt Turner | 2014-01-21 | 1 | -5/+5
    mov takes only a single source argument. Example instruction
    inexplicably changed from add to mov in commit f10f5e49.

    Reviewed-by: Jordan Justen <[email protected]>
* i965: Print reg_offset for vgrf of size > 1 in dump_instruction(). | Matt Turner | 2014-01-21 | 2 | -4/+4
    Previously we wouldn't print the +0 for the first part of a VGRF of
    size greater than 1.

    Reviewed-by: Jordan Justen <[email protected]>
* glsl: Match unnamed record types across stages. | Grigori Goronzy | 2014-01-21 | 1 | -0/+4
    Unnamed record types are assigned to separate types per stage, e.g.
    if

    uniform struct { ... } a;

    is defined in both vertex and fragment shader, two separate types
    will result with different names. When linking the shader, this
    results in a type conflict. However, there is no reason why this
    should not be allowed according to GLSL specifications. Compare and
    match record types when linking shader stages to avoid this
    conflict.

    Reviewed-by: Matt Turner <[email protected]>
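Matching unnamed records structurally rather than by type name can be sketched like this. The example is only an illustration under simplifying assumptions: field types are represented as strings and only the field count, field types, and field names are compared. The struct field, struct record, and records_match names are hypothetical and are not Mesa's glsl_type API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified description of one struct member. */
struct field {
    const char *type;   /* e.g. "vec4" */
    const char *name;   /* e.g. "color" */
};

/* Hypothetical, simplified record type: just an ordered field list. */
struct record {
    const struct field *fields;
    size_t length;
};

/* Two records match if they have the same number of fields and every
 * field agrees in both type and name, regardless of the (generated)
 * name of the record type itself. */
static bool records_match(const struct record *a, const struct record *b)
{
    assert(a != NULL && b != NULL);
    if (a->length != b->length)
        return false;

    for (size_t i = 0; i < a->length; i++) {
        if (strcmp(a->fields[i].type, b->fields[i].type) != 0 ||
            strcmp(a->fields[i].name, b->fields[i].name) != 0)
            return false;
    }
    return true;
}
```

Under this scheme, the anonymous struct declared in the vertex shader and the one declared in the fragment shader compare equal as long as their members line up, which is the behaviour the linker needs.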
* glsl: Extract function for record comparisons. | Grigori Goronzy | 2014-01-21 | 2 | -30/+44
    Reviewed-by: Matt Turner <[email protected]>
* svga: implement TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS | Brian Paul | 2014-01-21 | 4 | -8/+47
    Fixes several colorbuffer tests, including piglit
    "fbo-drawbuffers-none" for "gl_FragColor" and "glDrawPixels" cases.

    v2: rework patch to only avoid creating extra shader variants when
    TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS is not specified. Per Jose.
    Use a write_color0_to_n_cbufs key field to replicate color0 to N
    color buffers only when N > 0 and WRITES_ALL_CBUFS is set.

    Reviewed-by: José Fonseca <[email protected]>
* svga: rename color output variables | Brian Paul | 2014-01-21 | 3 | -9/+10
    Just to be a bit more readable.

    Reviewed-by: José Fonseca <[email protected]>
* svga: fix clearing for null color buffers | Brian Paul | 2014-01-21 | 1 | -3/+3
    Fixes piglit "fbo-drawbuffers-none glClear" test.

    Reviewed-by: José Fonseca <[email protected]>
* mesa: add missing TYPE_DOUBLEN_2 cases in get.c | Brian Paul | 2014-01-21 | 1 | -0/+12
    The new TYPE_DOUBLEN_2 type was added in 0e60d850 but the code to
    return values of that type wasn't completed.

    Fixes conform's default state test. glGetFloatv(GL_DEPTH_RANGE)
    wasn't returning anything.

    v2: remove stray 'break' statements.

    Reviewed-by: Jose Fonseca <[email protected]>
* i965: Modify some error messages to refer to "vec4" instead of "vs". | Paul Berry | 2014-01-21 | 2 | -5/+5
    These messages are in code that is shared between the VS and GS
    back-ends, so use the terminology "vec4" to avoid confusion.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add GS support to INTEL_DEBUG=shader_time. | Paul Berry | 2014-01-21 | 8 | -10/+37
    Previously, time spent in geometry shaders would be counted as part
    of the vertex shader time.

    Reviewed-by: Kenneth Graunke <[email protected]>
* draw: fix points with negative w coords for d3d style point clipping | Roland Scheidegger | 2014-01-21 | 1 | -2/+6
    Even with depth clipping disabled, vertices which have negative w
    coords must be discarded. And since we don't have a proper guardband
    implementation yet (relying on the driver to handle all values
    except infs/nans in rasterization for such points) we need to kill
    them off manually (as they can end up with coordinates inside the
    viewport otherwise).

    v2: use 0.0f instead of 0 (spotted by Brian).

    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
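Discarding points with a negative clip-space w can be sketched as a simple filter. This is an illustrative sketch only: the struct vec4 type and the discard_negative_w helper are hypothetical, and the real draw module works on its own vertex structures inside the clipping stage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical clip-space position. */
struct vec4 {
    float x, y, z, w;
};

/* Compact the array in place, dropping every point whose w is negative,
 * and return the number of points that survive. The comparison uses
 * 0.0f, matching the v2 note in the commit message. */
static size_t discard_negative_w(struct vec4 *pts, size_t n)
{
    assert(pts != NULL || n == 0);
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (pts[i].w < 0.0f)
            continue;            /* behind the eye: must not be drawn */
        pts[kept++] = pts[i];
    }
    return kept;
}
```

Without such a check, the perspective divide by a negative w can land a point inside the viewport even though it lies behind the viewer.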
* i965: Reserve space for "Vertex Count" in GS outputs. | Kenneth Graunke | 2014-01-21 | 2 | -0/+13
    v2: Also increment ir->offset in the GS visitor, rather than at the
    final assembly generation stage (requested by Paul).

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* i965: Update blitter code for 48-bit addresses. | Kenneth Graunke | 2014-01-20 | 1 | -16/+48
    v2: Rebase on Eric's SET_FIELD changes.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]> [v1]
* i965: Update PIPE_CONTROL packet lengths for Broadwell. | Kenneth Graunke | 2014-01-20 | 1 | -2/+20
    On Broadwell, PIPE_CONTROL needs an extra DWord to accommodate the
    48-bit addressing.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Re-combine the Gen4-5 and Gen6+ write_depth_count functions. | Kenneth Graunke | 2014-01-20 | 3 | -23/+10
    Now that we have a helper function that handles the PIPE_CONTROL
    variations between the various platforms, these are basically the
    same.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Create a helper function for emitting PIPE_CONTROL writes. | Kenneth Graunke | 2014-01-20 | 4 | -93/+69
    There are a lot of places that use PIPE_CONTROL to write a value to
    a buffer (either an immediate write, TIMESTAMP, or PS_DEPTH_COUNT).
    Creating a single function to do this seems convenient.

    As part of this refactor, we now set the PPGTT/GTT selection bit
    correctly on Gen7+. Previously, we set bit 2 of DW2 on all
    platforms. This is correct for Sandybridge, but actually part of the
    address on Ivybridge and later!

    Broadwell will also increase the length of these packets by 1; with
    the refactoring, we should have to adjust that in substantially
    fewer places, giving us confidence that we've hit them all.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Use full-length PIPE_CONTROL packets for workaround writes. | Kenneth Graunke | 2014-01-20 | 1 | -6/+9
    I believe that PIPE_CONTROL uses the length field to decide whether
    to do 32-bit or 64-bit writes. A length of 4 would do a 32-bit
    write, while a length of 5 would do a 64-bit write. (I haven't
    verified this, though.)

    For workaround writes, we don't care what value gets written, or how
    much data. We're only writing something because hardware bugs
    mandate that we do so. So using a 64-bit write should be fine.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Emit full-length PIPE_CONTROLs for (non-write) flushes. | Kenneth Graunke | 2014-01-20 | 1 | -2/+3
    The PIPE_CONTROL packet actually has 5 DWords on Gen6+:
    1. Header
    2. Flags
    3. Address
    4. Immediate Data: Lower DWord
    5. Immediate Data: Upper DWord

    We just never emitted the last one. While it appears to work, it's
    probably safer to emit the entire thing.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Create a helper function for emitting PIPE_CONTROL flushes. | Kenneth Graunke | 2014-01-20 | 4 | -86/+66
    These days, we need to emit PIPE_CONTROL flushes all over the place.
    Being able to do that via a single function call seems convenient.

    Broadwell will also increase the length of these packets by 1; with
    the refactoring, we should have to do this in substantially fewer
    places.

    v2: Add back forgotten intel_emit_post_sync_nonzero_flush (caught by
    Eric Anholt). Drop unlikely() from BLT_RING check.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix MI_STORE_REGISTER_MEM for Broadwell. | Kenneth Graunke | 2014-01-20 | 1 | -10/+23
    It now takes a 48-bit address.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Introduce an OUT_RELOC64 macro. | Kenneth Graunke | 2014-01-20 | 2 | -0/+34
    Broadwell uses 48-bit addresses. The first DWord is the low 32 bits,
    and the second DWord is the high 16 bits.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
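The DWord layout described above (low 32 bits first, then the high 16 bits) can be sketched as a small helper. This is not the OUT_RELOC64 macro itself, which also has to record a relocation; split_address48 is a hypothetical name used purely to show the bit arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 48-bit address into the two DWords that would be written to
 * the batch: the low 32 bits first, then the high 16 bits. */
static void split_address48(uint64_t addr, uint32_t *lo, uint32_t *hi)
{
    assert((addr >> 48) == 0);            /* must fit in 48 bits */
    *lo = (uint32_t)(addr & 0xffffffffu); /* first DWord: bits 31:0 */
    *hi = (uint32_t)(addr >> 32);         /* second DWord: bits 47:32 */
}
```

For example, the address 0x123456789abc splits into a first DWord of 0x56789abc and a second DWord of 0x1234.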
* i965: Use the new drm_intel_bo offset64 field. | Kenneth Graunke | 2014-01-20 | 12 | -30/+30
    libdrm 2.4.52 introduces a new 'uint64_t offset64' field, intended
    to replace the old 'unsigned long offset' field. To preserve ABI,
    libdrm continues to store the presumed offset in both locations.

    On Broadwell, a 64-bit kernel may place BOs at "high" (> 4G)
    addresses. However, with a 32-bit userspace, the 'unsigned long
    offset' field will only be 32-bit, which is not large enough to hold
    this value. We need to use a proper uint64_t (like the kernel does).

    Technically, a lot of this code doesn't affect Broadwell, so we
    could leave it using the old field. But it makes sense to just
    switch to the new, properly typed field.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
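The truncation problem with a 32-bit 'unsigned long' can be demonstrated in isolation. In this sketch uint32_t stands in for 'unsigned long' on a 32-bit userspace (an assumption made so the example behaves the same on any host), and the ulong32 typedef and helper names are hypothetical rather than part of libdrm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for 'unsigned long' as seen by a 32-bit userspace. */
typedef uint32_t ulong32;

/* What the old field would hold: the offset with its upper bits cut off. */
static ulong32 legacy_offset(uint64_t offset64)
{
    return (ulong32)offset64;
}

/* True only when the legacy field still represents the real offset. */
static bool legacy_offset_is_exact(uint64_t offset64)
{
    return (uint64_t)legacy_offset(offset64) == offset64;
}

/* Defensive variant: refuse to hand out a silently truncated offset. */
static ulong32 checked_legacy_offset(uint64_t offset64)
{
    assert(legacy_offset_is_exact(offset64));
    return legacy_offset(offset64);
}
```

A buffer placed just above 4G, at 0x100001000, would read back as 0x1000 through the narrow field, which is why the driver switches to the 64-bit field everywhere.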
* i965: Delete intel_batchbuffer_emit_reloc_fenced. | Kenneth Graunke | 2014-01-20 | 2 | -30/+0
    Nothing in i965 uses it.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i915: Silence warning: unused parameter warning in intel_bufferobj_buffer | Ian Romanick | 2014-01-20 | 3 | -13/+5
    intel_buffer_objects.c: In function 'old_intel_bufferobj_buffer':
    intel_buffer_objects.c:471:17: warning: unused parameter 'flag' [-Wunused-parameter]

    The parameter hasn't been used since the i915 and i965 drivers had
    their breakup. i965 got the flags, and i915 got to cry itself to
    sleep.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i915: Ensure that intel_bufferobj_map_range meets alignment guarantees | Ian Romanick | 2014-01-20 | 1 | -7/+21
    Not actually tested, but the changes are identical to the i965
    changes that are tested.

    v2: Remove MAX2(64, ...). Suggested by Ken (in the i965 version of
    this patch).

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Cc: Siavash Eliasi <[email protected]>