summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* mesa/teximage: fix GL_FLOAT in commentIlia Mirkin2016-05-231-1/+1
| | | | | | Noticed by Brian. Trivial. Signed-off-by: Ilia Mirkin <[email protected]>
* st/mesa: reenable cullingDave Airlie2016-05-241-1/+1
| | | | | | Now the lowering pass is fixed, reenable ARB_cull_distance. Signed-off-by: Dave Airlie <[email protected]>
* i965: reenable ARB_cull_distance.Dave Airlie2016-05-241-1/+1
| | | | | | Now the lowering pass is fixed we can reenable culling. Signed-off-by: Dave Airlie <[email protected]>
* glsl: make max array trackers ints and use -1 as base. (v2)Dave Airlie2016-05-241-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a bug that breaks cull distances. The problem is the max array accessors can't tell the difference between an never accessed unsized array and an accessed at location 0 unsized array. This leads to converting an undeclared unused gl_ClipDistance inside or outside gl_PerVertex to a size 1 array. However we need to the number of active clip distances to work out the starting point for the cull distances, and this offset by one when it's not being used isn't possible to distinguish from the case were only the first element is accessed. I tried to use ->used for this, but that doesn't work when gl_ClipDistance is part of an interface block. So this changes things so that max_array_access is an int and initialised to -1. This also allows unsized arrays to proceed further than that could before, but we really shouldn't mind as they will get eliminated if nothing uses them later. For initialised uniforms we no longer change their array size at runtime, if these are unused they will get eliminated eventually. v2: use ralloc_array (Ilia) Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* i965: Unset alpha blend for R10G10B10_SNORM_A2_UNORMNanley Chery2016-05-231-1/+1
| | | | | | | This format does not support alpha blending, according to the SNB PRM. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: deindent blorp code.Dave Airlie2016-05-241-9/+9
| | | | | | | gcc6 warns about this. Acked-by: Matt Turner <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: allow GL_FRAMEBUFFER_DEFAULT_LAYERS to be queried with ES geometryIlia Mirkin2016-05-231-2/+2
| | | | | | | When we have the geometry extensions, enable querying of the new param. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* mesa: allow xfb to be active in GLES when geometry shader is enabled.Ilia Mirkin2016-05-231-2/+4
| | | | | | | | OES_geometry_shader has wording to allow xfb when using Draw*Indirect and DrawElements. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* main: check driver float texture support before upgrading to 16F/32FIlia Mirkin2016-05-231-28/+33
| | | | | | | | | | When passing in GL_RGBA or other base formats, we will try to upgrade the format to whatever the passed in type was. However not all drivers (notably nv30) support 32F textures, and so this would lead to crashes down the line. Only upgrade when the relevant extensions are available. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: update inst->info along with inst->opIlia Mirkin2016-05-231-0/+1
| | | | | | | | | | | | Otherwise we still have TGSI_OPCODE_CMP's info, which causes a number of later logic to go wrong. This fixes dEQP-GLES2.functional.shaders.functions.control_flow.return_in_if_vertex on nv30. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Implement glGet*(GL_PRIMITIVE_RESTART_FOR_PATCHES_SUPPORTED).Kenneth Graunke2016-05-235-0/+6
| | | | | | | | | | | | | | | | | | | | | Technically, this was introduced with GL 4.4. However, I believe it was intended to be retroactive. As far as I know, AMD has never supported primitive restart with patches, while NVidia and Intel do. This necessitated the need for a query which would allow applications to figure out whether this was usable or not. I decided to expose it everywhere ARB_tessellation_shader is exposed. (It's also in both OES and EXT_tessellation_shader.) Enable this for i965 and Gallium drivers which expose the capability. v2: Fix a bug in the state_tracker code (caught by Ilia Mirkin). Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10364 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Mark UBO uniform pull constant loads as force_writemask_all.Francisco Jerez2016-05-232-1/+4
| | | | | | | | | | This lets the rest of the backend know that the uniform pull constant load opcodes don't respect channel enables -- Without this the register allocator has no way to know that the return payload of a pull constant load is not per-channel and spills of the destination will be broken under non-uniform control flow. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Allow spilling of non-contiguous registers.Francisco Jerez2016-05-231-19/+2
| | | | | | | This should be working fine now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94997 Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Calculate the (un)spill block size correctly.Francisco Jerez2016-05-231-3/+17
| | | | | | | | | | | | | | | | | | | | | Currently the spilling code attempts to guess the scratch message block size from the dispatch width of the shader, which is plain wrong for SIMD-lowered instructions (frequently but not exclusively encountered in SIMD32 shaders) or for instructions with register region data types of size other than 32 bit. Instead try to use the SIMD component size of the instruction which in some cases will allow the dataport to apply the correct channel mask to the scratch data read or written. In the spill case the block size needs to be clamped to the number of MRF registers reserved for spilling. In the unspill case I didn't even bother because we currently have no 100% accurate way to determine whether a source region is per-channel or whether it contains things like headers that don't respect channel boundaries -- That's fine, because the unspill is marked force_writemask_all we can just use the largest allowable scratch message size. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Set exec_all on spills not matching the channel layout of the ↵Francisco Jerez2016-05-231-3/+16
| | | | | | | | | | | instruction. This prevents the application of an incorrect channel mask by the scratch write instruction for spilled variables that don't have an exact one-to-one correspondence between channels of the variable and 32-bit components of the scratch write instruction. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Set exec_all on unspills.Francisco Jerez2016-05-231-2/+10
| | | | | | | | | | This makes sure that unspills restore the exact contents of the variable in scratch space into the GRF without applying channel masking, which is incorrect under control flow for things like message headers or vectors of heterogeneous types that don't properly respect channel boundaries. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Move scratch block size calculation into the caller of emit_(un)spill.Francisco Jerez2016-05-231-21/+23
| | | | | | | | | | | | | This makes emit_(un)spill even more stupid by removing the logic that decides what execution size each scratch read or write send message should have and instead relying on the caller to specify an appropriate execution size via the builder argument. This makes sense because the caller will need to act differently based on the scratch message width (e.g. emit an additional unspill before the instruction if the execution width and channel layout of the spill doesn't match the instruction's). Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Make emit_spill/unspill static functions taking builder as argument.Francisco Jerez2016-05-232-24/+21
| | | | | | | | | This seems cleaner than exposing an implementation detail of brw_fs_reg_allocate.cpp to the world, and will give the caller control over the instruction execution flags (e.g. force_writemask_all) that are applied to the scratch read and write instructions. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Apply execution controls from the instruction to scratch messages.Francisco Jerez2016-05-231-6/+5
| | | | | | | | | | | | | Until now the execution controls (e.g. channel group, force_writemask_all, exec_size) of the instruction had been completely ignored by spilling, even though that can lead to a mismatch between the channel mask applied to the contents of the (un)spilled memory and the GRF source or destination of the instruction. In some cases we'll actually want the (un)spill messages to be marked force_writemask_all regardless of whether the instruction has it set, but that will have to be handled specially by the caller. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Fix signedness of local variables and arguments of emit_(un)spill.Francisco Jerez2016-05-232-8/+8
| | | | | | | To avoid some some spurious warnings about comparison signedness in the following commits. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Factor out calculation of the block of MRFs reserved for spilling.Francisco Jerez2016-05-231-9/+45
| | | | | | | And as we're at it fix the calculation to allocate a larger block of registers for 32-wide dispatch. Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: dri: Add shared glapi to LIBADD on AndroidNicolas Boichat2016-05-231-0/+8
| | | | | | | | | | | | | | | | | | | | /system/vendor/lib/dri/*_dri.so actually depend on libglapi: without this, loading the so file fails with: cannot locate symbol "__emutls_v._glapi_tls_Context" On non-Android (non-bionic) platform, EGL uses the following workflow, which works fine: dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL); dlopen("dri/<driver>_dri.so", RTLD_NOW | RTLD_GLOBAL); However, bionic does not respect the RTLD_GLOBAL flag, and the dri library cannot find symbols in libglapi.so, so we need to link to libglapi.so explicitly. Android.mk already does this. Signed-off-by: Nicolas Boichat <[email protected]> Reviewed-by: Emil Velikov <[email protected]> [Emil Velikov: s/explicitely/explicitly/] Signed-off-by: Emil Velikov <[email protected]>
* i965/fs: do not depend on std140 alignment rules for UBO loadsFrancisco Jerez2016-05-231-46/+13
| | | | | | | | | | | | The previous implementation relied on the std140 alignment rules to avoid handling misalignment in the case where we are loading more than 2 double components from a vector, which requires to emit a second load message. This alternative implementation deals with misalignment and is more flexible going forward. Reviewed-by: Iago Toral Quiroga <[email protected]>
* subroutines: handle explicit indexes properlyDave Airlie2016-05-232-4/+13
| | | | | | | | | | | | | | The code didn't deal with explicit function indexes properly. It also handed out the indexes at link time, when we really need them in the lowering pass to create the correct if ladder. So this patch moves assigning the non-explicit indexes earlier, fixes the lowering pass and the lookups to get the correct values. This fixes a few of: GL45-CTS.explicit_uniform_location.subroutine-index-* Signed-off-by: Dave Airlie <[email protected]>
* mesa/subroutines: fix reset on bindpipelineDave Airlie2016-05-232-1/+7
| | | | | | | | Fixes: GL45-CTS.shader_subroutine.subroutine_uniform_reset Reviewed-by: Chris Forbes <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/subroutines: count number subroutines properly.Dave Airlie2016-05-232-2/+3
| | | | | | | | | | | | | The code was implementing the ACTIVE_SUBROUTINE_UNIFORMS incorrectly, using the number of types not the number of uniforms. This is different than the locations as the locations may be sparsly allocated. This fixes: GL43-CTS.shader_subroutine.four_subroutines_with_two_uniforms Reviewed-by: Chris Forbes <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/subroutines: don't generate error in GetSubroutineIndex.Dave Airlie2016-05-231-1/+0
| | | | | | | | | | GLSL spec says this doesn't generate an error. Fixes: GL45-CTS.explicit_uniform_location.subroutine-loc Reviewed-by: Chris Forbes <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* arb_shader_subroutine: check active subroutine limitDave Airlie2016-05-231-0/+5
| | | | | | | | | | _mesa_GetActiveSubroutineUniformiv needs to check against the number of types here. Noticed while playing with ogl conform. Reviewed-by: Chris Forbes <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/queryobject: return INVALID_VALUE if offset < 0 (v2)Dave Airlie2016-05-231-0/+5
| | | | | | | | | | | | This fixes: GL45-CTS.direct_state_access.queries_errors The ARB_direct_state_access spec agrees. v2: move check down further (Ilia) Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: Unlock mutex on error path.Matt Turner2016-05-221-0/+1
| | | | Caught by Coverity (CID 1362021). Caused by commit 015f2207c.
* i965: remove redundant NULL checkTimothy Arceri2016-05-221-1/+1
| | | | | | We would have segfaulted in the above code if prog could be NULL. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Just read the existing tally on EndTransformFeedback if paused.Kenneth Graunke2016-05-201-20/+22
| | | | | | | | | | | | | | If the transform feedback object is paused when ending, then there are no new snapshots to add to the tally. In fact, we haven't written a starting snapshot, so we'd best not try and compute (end - start). Just load the existing tally so we can convert it to the number of vertices written and store it to the final result location. This is the Haswell+ equivalent of the previous commit. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Don't write a counter snapshot on EndTransformFeedback if paused.Kenneth Graunke2016-05-201-1/+2
| | | | | | | | | | | | | | If the transform feedback object is paused, then we've already written an ending counter snapshot. We don't want to write another one. This fixes assertions in GL33-CTS.transform_feedback.api_errors_test, which calls EndTransformfeedback after PauseTransformFeedback. On the next BeginTransformFeedback, we tried to tally up the results, and saw an odd number of snapshots (due to the double-end), and tripped an assertion. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* mesa: Call TransformFeedback driver hooks before setting flags.Kenneth Graunke2016-05-201-5/+5
| | | | | | | | | | | This way, the driver's EndTransformFeedback() hook can tell whether the transform feedback operation was paused. It's also convenient to have Paused remain false until the driver's PauseTransformFeedback hook finishes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* nir: remove dead glsl variables before lowering io.Dave Airlie2016-05-211-0/+1
| | | | | | | | | | | | | For cull distance GLSL will let unsized unused arrays get into the backend, we should nuke those straight away, to save caring about them later. This fixes: arb_separate_shader_objects/linker/large-number-of-unused-varyings as a side effect (even without culling changes). Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* i965: Delete dead dFdy flipping code.Kenneth Graunke2016-05-201-19/+5
| | | | | | | | | | | Rob's nir_lower_wpos_ytransform() pass flips dFdy in the opposite case of what I expected, so we always take the negate_value case. It doesn't really matter. v2: Write src0 before src1 in ADD instructions (requested by Matt). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Delete brw_wm_prog_key::render_to_fbo and drawable_height.Kenneth Graunke2016-05-202-46/+0
| | | | | | | | | | | | Now that we handle flipping and other gl_FragCoord transformations via a uniform, these key fields have no users. This patch actually eliminates the associated recompiles. The Tomb Raider benchmark's minimum FPS increases from ~1 FPS to a reasonable number. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965, anv: Use NIR FragCoord re-center and y-transform passes.Kenneth Graunke2016-05-207-57/+34
| | | | | | | | | | | | | | This handles gl_FragCoord transformations and other window system vs. user FBO coordinate system flipping by multiplying/adding uniform values, rather than recompiles. This is much better because we have no decent way to guess whether the application is going to use a shader with the window system FBO or a user FBO, much less the drawable height. This led to a lot of recompiles in many applications. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Fix brw_regs_equal() for NaN and positive/negative zero.Kenneth Graunke2016-05-201-1/+2
| | | | | | | | We'd like the comparisons to mean "the exact same bits". Comparing doubles won't do that for NaN values or positive vs. negative zero. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: Replace uses of Shared->Mutex with hash-table mutexesMatt Turner2016-05-209-50/+78
| | | | | | | | | | | | | | | We were locking the Shared->Mutex and then using calling functions like _mesa_HashInsert that do additional per-hash-table locking internally. Instead just lock each hash-table's mutex and use functions like _mesa_HashInsertLocked and the new _mesa_HashRemoveLocked. In order to do this, we need to remove the locking from _mesa_HashFindFreeKeyBlock since it will always be called with the per-hash-table lock taken. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* hash: Add _mesa_HashRemoveLocked() function.Matt Turner2016-05-202-4/+17
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: Pass nir_src/nir_dest by reference.Matt Turner2016-05-204-18/+18
| | | | | | | | | | Cuts 6K of .text. text data bss dec hex filename 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before 5766074 264648 29320 6060042 5c780a lib/i965_dri.so after Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Fix strerror error code signMark Janes2016-05-201-1/+1
| | | | | | | This trivial fix to error-handling corrects the sign of drm error codes before passing them to strerror. Identified by Coverity: CID1358581
* i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz.Matt Turner2016-05-191-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ken suggested instead of a big and complicated optimization pass, to just recognize the operations here. It's certainly less code and a lot prettier, but it seems to actually perform worse for currently unknown reasons. total instructions in shared programs: 8923452 -> 8904108 (-0.22%) instructions in affected programs: 814563 -> 795219 (-2.37%) helped: 3336 HURT: 10 total cycles in shared programs: 66970734 -> 66651476 (-0.48%) cycles in affected programs: 10582686 -> 10263428 (-3.02%) helped: 2438 HURT: 691 total spills in shared programs: 1811 -> 1789 (-1.21%) spills in affected programs: 85 -> 63 (-25.88%) helped: 4 total fills in shared programs: 3143 -> 3109 (-1.08%) fills in affected programs: 167 -> 133 (-20.36%) helped: 4 LOST: 2 GAINED: 36 Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add infrastucture for sample lod-zero operations.Matt Turner2016-05-196-0/+33
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add and use get_nir_src_imm().Matt Turner2016-05-192-4/+19
| | | | | | | | The next patch wants to inspect the LOD argument and do something different if it's 0.0f. But at that point we've emitted a MOV for it and we just have a register to look at. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Silence warnings related to use of uninitialized valuesEduardo Lima Mitev2016-05-191-2/+2
| | | | | | | | | | | | | | | | | | | brw_fs.cpp: In function ‘const unsigned int* brw_compile_fs(const [...] brw_fs.cpp:6093:64: warning: ‘simd16_grf_start’ may be used uninitialized [...] prog_data->base.dispatch_grf_start_reg = simd16_grf_start; brw_fs.cpp:5996:29: note: ‘simd16_grf_start’ was declared here uint8_t simd8_grf_start, simd16_grf_start; brw_fs.cpp:6094:52: warning: ‘simd16_grf_used’ may be used uninitialized [...] prog_data->reg_blocks_0 = brw_register_blocks(simd16_grf_used); brw_fs.cpp:5997:29: note: ‘simd16_grf_used’ was declared here unsigned simd8_grf_used, simd16_grf_used; (and more) Reviewed-by: Anuj Phogat <[email protected]>
* Revert "i965/urb: fixes division by zero"Matt Turner2016-05-181-5/+19
| | | | This reverts commit 2a8aa1e3deb99a1ae16d942318da648c1327ece5.
* i965/urb: fixes division by zeroArdinartsev Nikita2016-05-181-19/+5
| | | | | | | Fixes regression introduced by af5ca43f2676bff7499f93277f908b681cb821d0 Reviewed-by: Matt Turner <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419
* mesa: fclose() filename on error.Matt Turner2016-05-181-1/+5
| | | | | Pretty useless, as it's in debugging code. Found by Coverity (CID 1257016).