Commit message | Author | Age | Files | Lines
* Bump version to 10.3 (final) | Emil Velikov | 2014-09-19 | 1 | -1/+1
    Signed-off-by: Emil Velikov <[email protected]>
* r300g: set register classes before interferences | Connor Abbott | 2014-09-16 | 1 | -2/+4
    In commit 567e2769b81863b6dffdac3826a6b729ce6ea37c ("ra: make the p, q test more efficient") I unknowingly introduced a new requirement to the register allocator API: the user must set the register class of all nodes before setting up their interferences, because ra_add_conflict_list() now uses the classes of the two interfering nodes. i965 already did this, but r300g was setting up register classes interleaved with setting up the interference graph. This led to us calculating the wrong q total, and in certain cases e78a01d5e6f77e075fe667a0f0ccb10d89c0dd58 ("ra: optimistically color only one node at a time") made it so that this bug caused a segfault.

    In particular, the error occurred if the q total was decremented to 1 below 0 for the last node to be pushed onto the stack. Since q_total is an unsigned integer, it wrapped around to 0xffffffff, which is what lowest_q_total happens to be initialized to. This means that we would fail the "new_q_total < lowest_q_total" check on line 476 of register_allocate.c, and so the node would never be pushed onto the stack, which led to segfaults in ra_select() when we failed to ever give it a register.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82828
    Cc: "10.3" <[email protected]>
    Signed-off-by: Connor Abbott <[email protected]>
    Tested-by: Pavel Ondračka <[email protected]>
    Reviewed-by: Tom Stellard <[email protected]>
    (cherry picked from commit afd82dcad127b64381ca6d80d0e499368074f474)
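The wraparound described in this commit message can be reproduced in isolation. The following is a minimal sketch, not the code from register_allocate.c: the helper names `decrement_q_total` and `passes_lowest_check` are hypothetical, and `UINT_MAX` stands in for the 0xffffffff initial value on a platform where `unsigned int` is 32 bits wide.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: subtracts from an unsigned counter.
 * Unsigned arithmetic in C is modular, so 0 - 1 yields UINT_MAX
 * instead of a negative number. */
static unsigned int decrement_q_total(unsigned int q_total, unsigned int amount)
{
    return q_total - amount;
}

/* Hypothetical stand-in for the "new_q_total < lowest_q_total" test.
 * A candidate is only accepted when it is strictly lower. */
static bool passes_lowest_check(unsigned int new_q_total,
                                unsigned int lowest_q_total)
{
    return new_q_total < lowest_q_total;
}
```

With `lowest_q_total` starting at `UINT_MAX`, a wrapped-around `new_q_total` is equal to it rather than below it, so the strict comparison rejects the node, which matches the failure the message describes.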
* i965: add support for RGBA dma_buf imports. | Gwenole Beauchesne | 2014-09-16 | 1 | -0/+6
    This allows for importing foreign buffers in RGB32 native endian byte order, i.e. DRM_FORMAT_XBGR8888, and DRM_FORMAT_ABGR8888.

    Signed-off-by: Gwenole Beauchesne <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
    Cc: "10.3" <[email protected]>
    (cherry picked from commit e1c50abf8a0ca1d541c4e2dbd5ed1805ed958ba7)
* i965: Mark delta_x/y as BAD_FILE if remapped away completely. | Kenneth Graunke | 2014-09-16 | 2 | -5/+15
    Commit afe3d1556f6b77031f7025309511a0eea2a3e8df (i965: Stop doing remapping of "special" regs.) stopped remapping delta_x/delta_y, and additionally stopped considering them always-live.

    We later realized delta_x was used in register allocation, so we actually needed to remap it, which was fixed in commit 23d782067ae834ad53522b46638ea21c62e94ca3 (i965/fs: Keep track of the register that hold delta_x/delta_y.). However, that commit didn't restore the "always consider it live" part. If all the code using delta_x was eliminated, fs_visitor::delta_x would be left pointing at its old register number. Later code in register allocation would handle that register number specially...even though it wasn't actually delta_x.

    To combat this, set delta_x/y to BAD_FILE if they're eliminated, and check for that.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83127
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Cc: "10.3" <[email protected]>
    (cherry picked from commit 78bd12619474e98503965541c61c5d7e9c408110)
* gallivm: Fix uses of 2^24 | Richard Sandiford | 2014-09-16 | 1 | -4/+4
    Fallback cases in lp_bld_arit.c used 2^24 to mean "2 to the power 24", but in C it's "2 xor 24", i.e. 26. Fixed by using 1<< instead.

    Signed-off-by: Richard Sandiford <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
    (cherry picked from commit 1a65629ccc590fe04a97b6df63d73e349b793619)
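The difference between the two spellings is easy to check directly. This is a small illustrative sketch, not the code from lp_bld_arit.c; the function names `caret` and `pow2` are hypothetical.

```c
/* What the `^` operator actually computes in C: bitwise XOR. */
static int caret(int a, int b)
{
    return a ^ b;
}

/* 2 raised to `exponent`, expressed as a left shift.
 * Only valid while the result fits in an int. */
static int pow2(unsigned int exponent)
{
    return 1 << exponent;
}
```

`caret(2, 24)` evaluates to 26 (binary 00010 xor 11000 = 11010), while `pow2(24)` evaluates to 16777216, which is the value the fallback paths intended.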
* nouveau: change internal variables to avoid conflicts with macro args | Ilia Mirkin | 2014-09-16 | 1 | -10/+10
    Reported by Coverity

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    (cherry picked from commit b13a4ca3f7f622cbf688eec14d3f4156533af44e)
* mesa: fix _mesa_free_pipeline_data() use-after-free bug | Brian Paul | 2014-09-16 | 1 | -2/+2
    Unreference the ctx->_Shader object before we delete all the pipeline objects in the hash table. Before, ctx->_Shader could point to freed memory when _mesa_reference_pipeline_object(ctx, &ctx->_Shader, NULL) was called.

    Fixes crash when exiting the piglit rendezvous_by_location test on Windows.

    Cc: [email protected]
    Reviewed-by: Ian Romanick <[email protected]>
    (cherry picked from commit 0d73ac6b02cac46d4a8f3cd1ffa591e071577fa7)
* gallium/util: add missing u_debug include | Andreas Boll | 2014-09-16 | 1 | -0/+1
    Needed for assert.

    Fixes build on BE archs with -Werror=implicit-function-declaration.

    In file included from ../../../../../src/gallium/auxiliary/draw/draw_fs.c:30:0:
    ../../../../../src/gallium/auxiliary/util/u_math.h: In function 'util_memcpy_cpu_to_le32':
    ../../../../../src/gallium/auxiliary/util/u_math.h:810:4: error: implicit declaration of function 'assert' [-Werror=implicit-function-declaration]
        assert(n % 4 == 0);
        ^

    Cc: "10.3" <[email protected]>
    Signed-off-by: Andreas Boll <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    (cherry picked from commit 2a13ff954d3d8cea73bbcf728edffa867828cb78)
* nouveau: only enable stencil func if the visual has stencil bits | Ilia Mirkin | 2014-09-16 | 2 | -2/+2
    The _Enabled property already has the relevant information.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    (cherry picked from commit 3c81de58512f0615df1d90aa79a22c9a44c7189e)
* nouveau: only enable the depth test if there actually is a depth buffer | Ilia Mirkin | 2014-09-16 | 5 | -4/+9
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    (cherry picked from commit 79959e5de518c59b327a9df4a6fa80a68213b873)
* nouveau: remove unneeded assert | Maarten Lankhorst | 2014-09-16 | 1 | -1/+0
    No idea why it was added, but the code runs fine even on videos where it triggers.

    Signed-off-by: Maarten Lankhorst <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    (cherry picked from commit 8ab85bfcd5ddd44c50e5b384222731cb2a1a1496)
* nouveau: rework reference frame handling | Maarten Lankhorst | 2014-09-16 | 3 | -4/+37
    Fixes a regression from "nouveau/vdec: small fixes to h264 handling"

    New picking order for frames:
    1. Vidbuf pointer matches.
    2. Take the first kicked ref.
    3. If that fails, take a ref that has a different last_used.

    Signed-off-by: Maarten Lankhorst <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    (cherry picked from commit a41aad843108cec1901c88a76d5ceb4ede2e062b)
* nouveau: fix MPEG4 hw decoding | Maarten Lankhorst | 2014-09-16 | 1 | -3/+3
    Reorder some fields to make I-frame decoding work correctly.

    Signed-off-by: Maarten Lankhorst <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    (cherry picked from commit 121ceb38f45daacc938349d9d5aa82776b78dbab)
* nouveau: re-allocate bo's on overflow | Maarten Lankhorst | 2014-09-16 | 4 | -11/+87
    The BSP bo might be too small to contain all of the bsp data; bump its size on overflow. Also bump inter_bo when this happens, as it might be too small otherwise.

    Signed-off-by: Maarten Lankhorst <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    (cherry picked from commit f6afed7076a6ef446dbec7cb10c8f8c60efafccd)
* i965/vec4: Only examine virtual_grf_end for GRF sources | Ian Romanick | 2014-09-16 | 1 | -8/+12
    If the source is not a GRF, it could have a register >= virtual_grf_count. Accessing virtual_grf_end with such a register would lead to out-of-bounds access. Make sure the source is a GRF before accessing virtual_grf_end.

    Fixes Valgrind complaints while compiling some shaders.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: [email protected]
    (cherry picked from commit 7aeb853c90c2e84fdd4b6b0af97566562c912861)
* i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams. | Iago Toral Quiroga | 2014-09-16 | 1 | -4/+9
    So far we have been using CL_INVOCATION_COUNT to resolve this query but this is no good with streams, as only stream 0 reaches the clipping stage.

    From ARB_transform_feedback3:

    "When a generated primitive query for a vertex stream is active, the primitives-generated count is incremented every time a primitive emitted to that stream reaches the Discarding Rasterization stage (see Section 3.x) right before rasterization. This counter is incremented whether or not transform feedback is active."

    Unfortunately, we don't have any registers that provide the number of primitives written to a specific stream other than the ones that track the number of primitives written to transform feedback in the SOL stage, so we can't implement this exactly as specified.

    In the past we implemented this feature by activating the SOL unit even if transform feedback was disabled, but making it so that all buffers were disabled and it only recorded statistics, which gave us the right semantics (see 3178d2474ae5bdd1102fb3d76a60d1d63c961ff5). Unfortunately, this came with a significant performance impact and had to be reverted.

    This new take does not intend to implement the exact semantics required by the spec, but improves what we have now, since now we return the primitive count for stream 0 in all cases. With this patch we use GEN7_SO_PRIM_STORAGE_NEEDED to resolve GL_PRIMITIVES_GENERATED queries for non-zero streams. This would return the number of primitives written to transform feedback for each stream instead. Since non-zero streams are only useful in combination with transform feedback this should not be too bad, and the only case that I think we would not be supporting would be the one in which we want to use both GL_PRIMITIVES_GENERATED and GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN on the same non-zero stream to detect buffer overflow.

    This patch also fixes the following piglit test: arb_gpu_shader5-xfb-streams-without-invocations

    This test uses both GL_PRIMITIVES_GENERATED and GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries on non-zero streams, but it never hits the overflow case, so both queries are always expected to return the same value.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: "10.3" <[email protected]>
    (cherry picked from commit f976b4c1bf2271cf986be8204147ae986380cc91)
    Nominated-by: Kenneth Graunke <[email protected]>
* glsl: Speed up constant folding for swizzles. | Kenneth Graunke | 2014-09-12 | 1 | -0/+5
    ir_rvalue::constant_expression_value() recursively walks down an IR tree, attempting to reduce it to a single constant value. This is useful when you want to know whether a variable has a constant expression value at all, and if so, what it is.

    The constant folding optimization pass attempts to replace rvalues with their constant expression value from the bottom up. That way, we can optimize subexpressions, and ideally stop as soon as we find a non-constant subexpression.

    In order to obtain the actual value of an expression, the optimization pass calls constant_expression_value(). But it should only do so if it knows the value can be combined into a constant. Otherwise, at each step of walking back up the tree, it will walk down the tree again, only to discover what it already knew: it isn't constant.

    We properly avoided this call for ir_expression nodes, but not for ir_swizzle nodes. This patch fixes that, drastically reducing compile times on certain shaders where tree grafting has given us huge expression trees. It also fixes SuperTuxKart.

    Thanks to Iago and Mike for help in tracking this down.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78468
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Cc: [email protected]
    (cherry picked from commit 84a40ce86b1010873b194eb9bf0b8744234b829c)
* i965/vec4: Make type_size() return 0 for samplers. | Kenneth Graunke | 2014-09-12 | 1 | -3/+3
    The FS backend has always used 0, and the VS backend has always used 1. I think 1 is just working around other problems, and is incorrect. Samplers are baked in; nothing uses the UNIFORM register we would create, and we shouldn't upload any constant values for them.

    Fixes ES3-CTS.shaders.struct.uniform.sampler_array_vertex.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Cc: [email protected]
    Reviewed-by: Ian Romanick <[email protected]>
    Tested-by: Ian Romanick <[email protected]>
    (cherry picked from commit 7865026c04f6cc36dc81f993bc32ddda2806ecb5)
* i965: Skip allocating UNIFORM file storage for uniforms of size 0. | Kenneth Graunke | 2014-09-12 | 2 | -6/+6
    Samplers take up zero slots and therefore don't exist in the params array, nor are they included in stage_prog_data->nr_params. There's no need to store their size in param_size, as it's only used for dealing with arrays of "real" uniforms (ones uploaded as shader constants).

    We run into all kinds of problems trying to refer to the uniform storage for variables that don't have uniform storage. For one, we may use some other variable's index, or access out of bounds in arrays. In the FS backend, our extra 2 * MaxSamplerImageUnits params for texture rectangle rescaling paper over a lot of problems. In the VS backend, we claim samplers take up a slot, which also papers over problems.

    Instead, just skip allocating storage for variables that don't have any.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Cc: [email protected]
    Reviewed-by: Ian Romanick <[email protected]>
    Tested-by: Ian Romanick <[email protected]>
    (cherry picked from commit 2408f166db1d81f2e9cc86b3f413ddba5ba537fa)
* i965: Disable guardband clipping in the smaller-than-viewport case. | Kenneth Graunke | 2014-09-12 | 1 | -0/+31
    Apparently guardband clipping doesn't work like we thought: objects entirely outside the guardband are trivially rejected, regardless of their relation to the viewport. Normally, the guardband is larger than the viewport, so this is not a problem. However, when the viewport is larger than the guardband, this means that we would discard primitives which were wholly outside of the guardband, but still visible.

    We always program the guardband to 8K x 8K to enforce the restriction that the screenspace bounding box of a single triangle must be no more than 8K x 8K. So, if the viewport is larger than that, we need to disable guardband clipping.

    Fixes ES3 conformance tests:
    - framebuffer_blit_functionality_negative_height_blit
    - framebuffer_blit_functionality_negative_width_blit
    - framebuffer_blit_functionality_negative_dimensions_blit
    - framebuffer_blit_functionality_magnifying_blit
    - framebuffer_blit_functionality_multisampled_to_singlesampled_blit

    v2: Mention the acronym expansion for TA/TR/MC in the comments.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Topi Pohjolainen <[email protected]>
    (cherry picked from commit 0bac2551e40410e2251daf4fd9faf69310ab34ce)
* i965: Separate gl_InstanceID and gl_VertexID uploading. | Kenneth Graunke | 2014-09-12 | 5 | -16/+42
    We always uploaded them together, mostly out of laziness - both required an additional vertex element. However, gl_VertexID now also requires an additional vertex buffer for storing gl_BaseVertex; for non-indirect draws this also means uploading (a small amount of) data. This is extra overhead we don't need if the shader only uses gl_InstanceID.

    In particular, our clear shaders currently use gl_InstanceID for doing layered clears, but don't need gl_VertexID.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Cc: "10.3" <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Tested-by: Ian Romanick <[email protected]>
    (cherry picked from commit 6b6145204dd4a1112f6e1fe10162636141495b79)
* i965: Fix reference counting in new basevertex upload code. | Kenneth Graunke | 2014-09-12 | 1 | -0/+3
    In the non-indirect draw case, we call intel_upload_data to upload gl_BaseVertex. It makes brw->draw.draw_params_bo point to the upload buffer, and increments the upload BO reference count.

    So, we need to unreference it when making brw->draw.draw_params_bo point at something else, or else we'll retain a reference to stale upload buffers and hold on to them forever.

    This also means that the indirect case should increment the reference count on the indirect draw buffer when making brw->draw.draw_params_bo point at it. That way, both paths increment the reference count, so we can safely unreference it every time.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Cc: "10.3" <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Tested-by: Ian Romanick <[email protected]>
    (cherry picked from commit e980fe607155c79ccba56ef78854093b7730bef6)
* i965: Request lowering gl_VertexID | Ian Romanick | 2014-09-12 | 1 | -0/+1
    Fixes the (new) piglit tests gles-3.0-drawarrays-vertexid, gl-3.0-multidrawarrays-vertexid, and gl-3.2-basevertex-vertexid.

    Fixes gles3conform failure in: ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80247
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    (cherry picked from commit 927f5db46135b3eb63f401833b1e40a3be9ca4e0)
* i965: Expose gl_BaseVertex via a vertex attribute. | Kenneth Graunke | 2014-09-12 | 3 | -20/+65
    Now that we have the data available, we need to expose it to the shaders. We can reuse the same vertex element that we use for gl_VertexID, but we need to back it by an actual vertex buffer.

    A hardware restriction requires that vertex attributes coming from a buffer (STORE_SRC) must come before any other types (i.e. STORE_0). So, we have to make gl_BaseVertex be the .x component of the vertex attribute. This means moving gl_VertexID to a different component.

    I chose to move gl_VertexID and gl_InstanceID to the .z and .w components, respectively, to make room for gl_BaseInstance in the .y component (which would also come from a buffer, and therefore be STORE_SRC).

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    (cherry picked from commit fbb353bc13a8924f9c6cd8c2572d299e3c4b9162)
* i965: Refactor Gen4-7 VERTEX_BUFFER_STATE emission into a helper. | Kenneth Graunke | 2014-09-12 | 1 | -30/+47
    We'll need to emit another VERTEX_BUFFER_STATE for gl_BaseVertex; pulling this into a helper function will save us from having to deal with cross-generation differences in that code.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    (cherry picked from commit 87b10c4a7161905e3a9dc2b2ddc77fbf6908ebd5)
* i965: Make gl_BaseVertex available in a buffer object. | Kenneth Graunke | 2014-09-12 | 3 | -0/+31
    This will be used for GL_ARB_shader_draw_parameters, as well as fixing gl_VertexID, which is supposed to include gl_BaseVertex's value.

    For indirect draws, we simply point at the indirect buffer; for normal draws, we upload the value via the upload buffer.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    (cherry picked from commit fdbabf22e183d478cd076215052fa877b125629b)
* i965: Calculate start/base_vertex_location after preparing vertices. | Kenneth Graunke | 2014-09-12 | 6 | -12/+34
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    (cherry picked from commit c89306983c07e5a88c0d636267e5ccf263cb4213)
* i965: Handle SYSTEM_VALUE_VERTEX_ID_ZERO_BASE | Ian Romanick | 2014-09-12 | 1 | -0/+1
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    (cherry picked from commit 9975792abd8be891bbe7245cb1d5c347dff65465)
* mesa: Fix glGetActiveAttribute for gl_VertexID when lowered. | Kenneth Graunke | 2014-09-12 | 1 | -1/+13
    The lower_vertex_id pass converts uses of the gl_VertexID system value to the gl_BaseVertex and gl_VertexIDMESA system values. Since gl_VertexID is no longer accessed, it would not be considered active.

    Of course, it should be, since the shader uses gl_VertexID.

    v2: Move the var->name dereference past the var != NULL check.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    (cherry picked from commit 26e949b26effe741f9792e3efcba31c7209f3465)
* mesa: Replace string comparisons with SYSTEM_VALUE enum checks. | Kenneth Graunke | 2014-09-12 | 1 | -2/+2
    This is more efficient.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    (cherry picked from commit 26c9514155bfcd1cf2cfeb2cfdb5d4b78a8072b0)
* glsl: Add a lowering pass for gl_VertexID | Ian Romanick | 2014-09-12 | 6 | -0/+165
    Converts gl_VertexID to (gl_VertexIDMESA + gl_BaseVertex). gl_VertexIDMESA is backed by SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, and gl_BaseVertex is backed by SYSTEM_VALUE_BASE_VERTEX.

    v2: Put the enum in struct gl_constants and properly resolve the scope in C++ code. Fix suggested by Marek.

    v3: Rebase on Matt's foreach_in_list changes (was using foreach_list).

    v4 (Ken): Use a systemvalue instead of a uniform because STATE_BASE_VERTEX has been removed.

    v5: Use a boolean to select lowering, and only allow one lowering method. Suggested by Ken.

    v6 (Ken): Replace strcmp against literal "gl_BaseVertex"/"gl_VertexID" with SYSTEM_VALUE enum checks, for efficiency.

    v7: Rebase on context constant initialization work.

    Signed-off-by: Ian Romanick <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
    (cherry picked from commit ec08b5e768271aa100be87c1ca6dd2b0109049d9)
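The arithmetic behind the lowering is a single addition. The sketch below models it on the CPU side purely for illustration; the struct `draw_params` and the function `lowered_vertex_id` are hypothetical names and do not appear in the pass itself, which operates on GLSL IR.

```c
/* Hypothetical model of the two system values the lowering reads. */
struct draw_params {
    int base_vertex;          /* value backing gl_BaseVertex */
    int vertex_id_zero_base;  /* value backing gl_VertexIDMESA */
};

/* gl_VertexID as the commit message defines it after lowering:
 * (gl_VertexIDMESA + gl_BaseVertex). */
static int lowered_vertex_id(const struct draw_params *p)
{
    return p->vertex_id_zero_base + p->base_vertex;
}
```

For example, a zero-based ID of 3 in a draw with a basevertex of 100 yields a gl_VertexID of 103.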
* glsl/linker: Make get_main_function_signature public | Ian Romanick | 2014-09-12 | 2 | -4/+8
    The next patch will use this function in a different file.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    (cherry picked from commit 04d3323d4b7b7cae4954d80946f0ca202770dd14)
* mesa: Add SYSTEM_VALUE_BASE_VERTEX | Ian Romanick | 2014-09-12 | 2 | -1/+15
    This system value represents the basevertex value passed to glDrawElementsBaseVertex and related functions.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    (cherry picked from commit 1e87fbd78f15f262b3dd2cbc16099e9f484c42a0)
* mesa: Add SYSTEM_VALUE_VERTEX_ID_ZERO_BASE | Ian Romanick | 2014-09-12 | 2 | -0/+13
    There exists hardware, such as i965, that does not implement the OpenGL semantic for gl_VertexID. Instead, that hardware does not include the value of basevertex in the gl_VertexID value. SYSTEM_VALUE_VERTEX_ID_ZERO_BASE is the system value that represents this semantic.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    (cherry picked from commit 5964a4f344fa4fd631bcccf67c065b9e66b94108)
* mesa: Document SYSTEM_VALUE_VERTEX_ID and SYSTEM_VALUE_INSTANCE_ID | Ian Romanick | 2014-09-12 | 1 | -0/+57
    v2: Additions to the documentation for SYSTEM_VALUE_VERTEX_ID. Quote the GL_ARB_shader_draw_parameters spec and mention DirectX SV_VertexID.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    (cherry picked from commit 9afb5ae8cae4460b87a5c03da3955f2e83434430)
* i965/vec4: Reswizzle sources when necessary. | Matt Turner | 2014-09-10 | 2 | -11/+25
    Despite the comment above the function claiming otherwise, the function did not reswizzle sources, which would lead to bad code generation since commit 04895f5c, which began claiming we could do such swizzling when we could not.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932
    Reviewed-by: Kenneth Graunke <[email protected]>
    (cherry picked from commit 1ee1d8ab468cafd25cfcc513319f3f046492947f)
* configure.ac: strip _GNU_SOURCE from llvm-config output | Jonathan Gray | 2014-09-09 | 1 | -0/+1
    Mesa already defines _GNU_SOURCE for glibc based systems and defining _GNU_SOURCE will break the Mesa build on other systems such as OpenBSD.

    _GNU_SOURCE only seems to be included in llvm-config output when LLVM is built via autoconf and not when it is built by cmake.

    Cc: "10.2 10.3" <[email protected]>
    Signed-off-by: Jonathan Gray <[email protected]>
    (cherry picked from commit c68073e65f15b0df43bec2df1d7470ed4cddd761)
* configure: enable the gallium loader only when needed | Emil Velikov | 2014-09-09 | 1 | -10/+16
    With the gallium megadrivers we've converted most ST to optionally use either statically linked in or shared pipe-drivers. The hardcoded switch forgot to conditionally enable the build of the shared pipe-drivers, which resulted in them always being built.

    Cc: "10.3" <[email protected]>
    Cc: James Ausmus <[email protected]>
    Reported-by: James Ausmus <[email protected]>
    Tested-by: James Ausmus <[email protected]>
    Bugzilla: https://code.google.com/p/chromium/issues/detail?id=412089
    Signed-off-by: Emil Velikov <[email protected]>
    (cherry picked from commit 44ec468e8033553c26a112cebba41c343db00eb1)
* configure: bail out if building svga without libdrm | Emil Velikov | 2014-09-09 | 1 | -0/+3
    With a recent commit we removed the NEED_NONNULL_WINSYS checks when selecting the hardware (inc svga) winsys. svga has only one winsys that explicitly requires libdrm (via its bundled version of vmwgfx_drm.h) but configure.ac never really checks for it.

    Add the check early to prevent people from shooting themselves when they select the driver but lack libdrm.

    $ ./autogen.sh --disable-dri --disable-egl --disable-gallium-llvm --with-dri-drivers=swrast --with-gallium-drivers=svga,swrast

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82539
    Cc: "10.2 10.3" <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
    (cherry picked from commit 40bb6f93139971a459dadf88d6dfc05791071e37)
* nv50/ir: avoid array overrun when checking for supported mods | Ilia Mirkin | 2014-09-09 | 2 | -2/+2
    Reported by Coverity

    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "10.2 10.3" <[email protected]>
    (cherry picked from commit 874a9396c5adfdcff63139bf6ababb55c1253402)
* i965: Handle ir_binop_ubo_load in boolean expression code. | Kenneth Graunke | 2014-09-09 | 2 | -4/+4
    UBO loads can be boolean-valued expressions, too, so we need to handle them in emit_bool_to_cond_code() and emit_if_gen6().

    However, unlike most expressions, it doesn't make sense to evaluate their operands, then do something with the results. We just want to evaluate the UBO load as a whole---which performs the read from memory---then load the boolean result into the flag register.

    Instead of adding code to handle it, we can simply bypass the ir_expression handling, and fall through to the default code, which will do exactly that.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83468
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Cc: [email protected]
    (cherry picked from commit a20cc2796f5d55e49956ac0bc5d61ca027eec7f9)
* i965: Handle ir_triop_csel in emit_if_gen6(). | Kenneth Graunke | 2014-09-09 | 2 | -4/+33
    ir_triop_csel can return a boolean expression, so we need to handle it here; we simply forgot when we added ir_triop_csel, and forgot again when adding it to emit_bool_to_cond_code.

    Fixes Piglit's EXT_shader_integer_mix/{vs,fs}-mix-if-bool on Sandybridge.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Cc: [email protected]
    (cherry picked from commit 6272e60ca394a8da178d3352831a48f4c429a3bc)
* gallivm: Fix Altivec pack intrinsics for little-endian | Ulrich Weigand | 2014-09-08 | 1 | -5/+21
    This patch fixes use of Altivec pack intrinsics on little-endian PowerPC systems. Since little-endian operation only affects the load and store instructions, the semantics of pack (and other) instructions that take two input vectors implicitly change: the pack instructions still fill a register placing values from the first operand into the "high" parts of the register, and values from the second operand into the "low" parts of the register, but since vector loads and stores perform an endian swap, the high parts end up at high memory addresses.

    To still achieve the desired effect, we have to swap the two inputs to the pack instruction on little-endian systems. This is done automatically by the back-end for instructions generated by LLVM, but needs to be done manually when emitting intrinsics (which still result in that instruction being emitted directly).

    Signed-off-by: Ulrich Weigand <[email protected]>
    Signed-off-by: Maarten Lankhorst <[email protected]>
    (cherry picked from commit 0feb977bbfb0d6bb2c8d3178246acb035a739f37)
    Nominated-by: Maarten Lankhorst <[email protected]>
* mesa/st: don't advertise NV_vdpau_interop if it doesn't work. | Christian König | 2014-09-08 | 1 | -1/+7
    As long as we don't have a workaround for frame based decoding in VDPAU we should not advertise NV_vdpau_interop.

    v2: fix commit message, check if get_video_param is present

    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: [email protected]
    (cherry picked from commit 12fb74fe895fe9954df127ca0ec6e4422fffb156)
* i965: Adjust fast-clear resolve rect for BDW | Kristian Høgsberg | 2014-09-08 | 1 | -4/+10
    The scale factors for the resolve rectangle change for BDW and we have to look at brw->gen now to figure out how big it should be.

    Fixes: https://bugs.freedesktop.org/attachment.cgi?id=105777
    Cc: "10.3" <[email protected]>
    Signed-off-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    (cherry picked from commit 2d6d3461d307636b61d0f483677aaad11d1fd42a)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83046
* nvc0/ir: clarify recursion fix to finding first tex uses | Christoph Bumiller | 2014-09-08 | 1 | -9/+7
    This is a simple shader for reproducing the case mentioned:

    FRAG
    DCL IN[0], GENERIC[0], PERSPECTIVE
    DCL OUT[0], COLOR
    DCL SAMP[0]
    DCL CONST[0]
    DCL TEMP[0..1], LOCAL
    IMM[0] FLT32 { 0.0000, -1.0000, 1.0000, 0.0000}
      0: MOV TEMP[0].x, CONST[0].wwww
      1: MOV TEMP[1].x, CONST[0].wwww
      2: BGNLOOP
      3: IF TEMP[0].xxxx
      4: BRK
      5: ENDIF
      6: ADD TEMP[0].x, TEMP[0], IMM[0].zzzz
      7: IF CONST[0].xxxx
      8: TEX TEMP[1].x, CONST[0], SAMP[0], 2D
      9: ENDIF
     10: IF CONST[0].zzzz
     11: MOV TEMP[1].x, CONST[0].zzzz
     12: ENDIF
     13: ENDLOOP
     14: MOV OUT[0], TEMP[1].xxxx
     15: END

    Cc: "10.2 10.3" <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    (cherry picked from commit ca9ab05d45ebf407485af2daa3742b897ff99162)
* nv50/ir/util: fix BitSet issues | Christoph Bumiller | 2014-09-08 | 3 | -3/+10
    BitSet::allocate() is being used with the expectation that it would leave the bitfield untouched if its size hasn't changed, however, the function always zeroed the last word, which led to obscure bugs with live set computation.

    This also fixes BitSet::resize(), which was broken, but luckily not being used.

    Cc: "10.2 10.3" <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    (cherry picked from commit b9f9e3ce03dbd8d044a72a00e1e8856a500b5f72)
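The expectation callers had of the allocate function can be illustrated with a small stand-alone sketch. This is not the nv50 BitSet class (which is C++); the struct `bitset`, the helpers `words_for` and `bitset_allocate`, and the 32-bit word layout are all assumptions made for the example. The point it shows is only the contract: when the requested size equals the current size and zeroing was not requested, no word, including the last one, is modified.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bit set stored as an array of 32-bit words. */
struct bitset {
    uint32_t *words;
    unsigned int nbits;
};

/* Number of 32-bit words needed to hold nbits bits. */
static size_t words_for(unsigned int nbits)
{
    return ((size_t)nbits + 31) / 32;
}

/* Returns false only if memory allocation fails. */
static bool bitset_allocate(struct bitset *bs, unsigned int nbits, bool zero)
{
    size_t nwords = words_for(nbits);

    /* Same size as before: keep the storage, and clear it only on request. */
    if (bs->words != NULL && bs->nbits == nbits) {
        if (zero)
            memset(bs->words, 0, nwords * sizeof(uint32_t));
        return true;
    }

    /* Size changed: start from fresh, zero-filled storage. */
    uint32_t *fresh = calloc(nwords ? nwords : 1, sizeof(uint32_t));
    if (fresh == NULL)
        return false;

    free(bs->words);
    bs->words = fresh;
    bs->nbits = nbits;
    return true;
}
```

A caller that sets bits, then calls `bitset_allocate` again with the same size and `zero == false`, sees its bits intact; the bug described in the commit message corresponds to the last word being cleared in that situation.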
* i965/blorp: Pass image formats seperately from the miptree | Jason Ekstrand | 2014-09-08 | 4 | -19/+43
    When a texture is wrapped in a texture view, we can't trust the format in the miptree itself. This patch allows us to pass the format separately through blorp so we can properly handle wrapped textures.

    It's worth noting here that we can use the miptree format directly for depth/stencil formats because they cannot be reinterpreted by a texture view.

    Signed-off-by: Jason Ekstrand <[email protected]>
    CC: "10.3" <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
    (cherry picked from commit 7599886b26853163ef354476be70aa7fd9ae35c5)
* Increment version to 10.3.0-rc3 (tag: mesa-10.3-rc3) | Emil Velikov | 2014-09-05 | 1 | -1/+1
    Signed-off-by: Emil Velikov <[email protected]>
* st/mesa: use 1.0f as boolean true on drivers without integer support | Marek Olšák | 2014-09-05 | 1 | -2/+3
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82882
    Cc: 10.2 10.3 [email protected]
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    (cherry picked from commit 1a00f247512f22e58548053a99a706615a178672)