path: root/src
Commit message | Author | Age | Files | Lines
* nir: Add a simple nir_lower_wpos_center() pass for Vulkan drivers. | Kenneth Graunke | 2016-05-20 | 3 | -0/+109
    nir_lower_wpos_ytransform() is great for OpenGL, which allows applications to choose whether their coordinate system's origin is upper left/lower left, and whether the pixel center should be on integer/half-integer boundaries. Vulkan, however, has much simpler requirements: the pixel center is always half-integer, and the origin is always upper left. No coordinate transform is needed - we just need to add <0.5, 0.5>. This means that we can avoid using (and setting up) a uniform. I thought about adding more options to nir_lower_wpos_ytransform(), but making a new pass that never even touched uniforms seemed simpler.
    v2: Use normal iterator rather than _safe variant (noticed by Matt).
    Signed-off-by: Kenneth Graunke <[email protected]>
    Acked-by: Rob Clark <[email protected]>
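The arithmetic this commit message describes can be sketched in plain C. This is only an illustration of adding <0.5, 0.5> to a window position, not the NIR pass itself; `struct wpos` and `wpos_center()` are hypothetical names that do not appear in Mesa.

```c
/* Hypothetical stand-in for a fragment's window position; not a NIR type. */
struct wpos {
   float x, y, z, w;
};

/*
 * Move an integer-aligned window position to the half-integer pixel
 * center by adding <0.5, 0.5> to xy. The offset is a constant, so no
 * uniform has to be set up; z and w pass through unchanged.
 */
static struct wpos
wpos_center(struct wpos pos)
{
   pos.x += 0.5f;
   pos.y += 0.5f;
   return pos;
}
```

Because the offset is a compile-time constant, nothing has to be uploaded per draw, which is the simplification the message points to.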
* nir: Don't use ffma in nir_lower_wpos_ytransform(). | Kenneth Graunke | 2016-05-20 | 1 | -12/+8
    ffma is an explicitly fused multiply add with higher precision. The optimizer will take care of promoting mul/add to fma when it's beneficial to do so. This fixes failures on Gen4-5 when using this pass, as those platforms don't actually implement fma().
    Signed-off-by: Kenneth Graunke <[email protected]>
* nir: Handle fddy_fine and fddy_coarse in nir_lower_wpos_ytransform. | Kenneth Graunke | 2016-05-20 | 1 | -1/+3
    These also need flipping!
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* nir: Make lower_wpos_ytransform_block a void function. | Kenneth Graunke | 2016-05-20 | 1 | -3/+1
    The return value was used for the old nir_foreach_block callback system, but at this point it no longer means anything.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* nir: Make nir_lower_wpos_ytransform() match FragCoord by location. | Kenneth Graunke | 2016-05-20 | 1 | -1/+2
    gl_FragCoord is a shader input with location == VARYING_SLOT_POS. ARB_fragment_programs have an equivalent input at VARYING_SLOT_POS, but it isn't called gl_FragCoord. We do want to transform it. Matching by location guarantees we catch both. Fixes several fp tests on a branch which uses this pass on i965.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* nir: Add interp_var_at_offset flipping. | Kenneth Graunke | 2016-05-20 | 1 | -0/+21
    The Y-offset needs flipping as well, similar to ddy.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* nir: Fix fddy swizzles in nir_lower_wpos_ytransform(). | Kenneth Graunke | 2016-05-20 | 1 | -0/+3
    The original value might have been swizzled. That's taken care of in the fmul source - we don't want to reswizzle it again. Fixes validation failures in glsl-derivs-varyings on a branch of mine which uses this pass in i965.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* nir: Fix wpos_ytransform lowering state_slot swizzle. | Kenneth Graunke | 2016-05-20 | 1 | -0/+2
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* i965: Fix brw_regs_equal() for NaN and positive/negative zero. | Kenneth Graunke | 2016-05-20 | 1 | -1/+2
    We'd like the comparisons to mean "the exact same bits". Comparing doubles won't do that for NaN values or positive vs. negative zero.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
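The difference between value equality and "the exact same bits" can be shown with a small C sketch. It assumes IEEE 754 doubles; the two helper names are hypothetical and are not taken from brw_regs_equal().

```c
#include <stdbool.h>
#include <string.h>

/* Value comparison: NaN never equals NaN, and +0.0 equals -0.0. */
static bool
doubles_equal_by_value(double a, double b)
{
   return a == b;
}

/* Bit comparison: true only when both objects hold identical bytes. */
static bool
doubles_equal_by_bits(double a, double b)
{
   return memcmp(&a, &b, sizeof(a)) == 0;
}
```

With these helpers, a NaN compared against a copy of itself is unequal by value but equal by bits, while 0.0 and -0.0 are equal by value but differ in their sign bit.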
* virgl: handle cull distance cap. | Dave Airlie | 2016-05-21 | 1 | -0/+1
    Signed-off-by: Dave Airlie <[email protected]>
* virgl: Add missing texture transfer_inline_write | Rob Herring | 2016-05-21 | 1 | -1/+1
    transfer_inline_write cannot be NULL and the virgl renderer doesn't support inline writes for textures, so add the default version. This fixes a crash in st_TexSubImage since commit fb9fe352ea41 ("st/mesa: use transfer_inline_write for memcpy TexSubImage path").
    Cc: Marek Olšák <[email protected]>
    Cc: Dave Airlie <[email protected]>
    Signed-off-by: Rob Herring <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* anv: Merge in my TODO list items | Kristian Høgsberg Kristensen | 2016-05-20 | 1 | -0/+11
    Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* mesa: Replace uses of Shared->Mutex with hash-table mutexes | Matt Turner | 2016-05-20 | 9 | -50/+78
    We were locking the Shared->Mutex and then calling functions like _mesa_HashInsert that do additional per-hash-table locking internally. Instead just lock each hash-table's mutex and use functions like _mesa_HashInsertLocked and the new _mesa_HashRemoveLocked. In order to do this, we need to remove the locking from _mesa_HashFindFreeKeyBlock since it will always be called with the per-hash-table lock taken.
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
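The locking pattern described here, where the caller takes the per-table mutex once and then uses `*Locked` variants that do no locking of their own, might look roughly like the sketch below. The `struct table` type and every `table_*` function are hypothetical and much simpler than Mesa's hash table; only the pattern is meant to carry over.

```c
#include <pthread.h>
#include <stddef.h>

#define TABLE_SIZE 16

/* Hypothetical miniature table; not Mesa's hash table implementation. */
struct table {
   pthread_mutex_t mutex;
   void *slots[TABLE_SIZE];
};

static void
table_init(struct table *t)
{
   pthread_mutex_init(&t->mutex, NULL);
   for (size_t i = 0; i < TABLE_SIZE; i++)
      t->slots[i] = NULL;
}

/* The *_locked variants assume the caller already holds t->mutex. */
static void *
table_lookup_locked(struct table *t, unsigned key)
{
   return t->slots[key % TABLE_SIZE];
}

static void
table_insert_locked(struct table *t, unsigned key, void *data)
{
   t->slots[key % TABLE_SIZE] = data;
}

static void
table_remove_locked(struct table *t, unsigned key)
{
   t->slots[key % TABLE_SIZE] = NULL;
}

/* Convenience wrapper that takes the per-table lock itself. */
static void
table_insert(struct table *t, unsigned key, void *data)
{
   pthread_mutex_lock(&t->mutex);
   table_insert_locked(t, key, data);
   pthread_mutex_unlock(&t->mutex);
}

/* Several operations under a single acquisition of the per-table lock. */
static void
table_move(struct table *t, unsigned old_key, unsigned new_key)
{
   pthread_mutex_lock(&t->mutex);
   void *data = table_lookup_locked(t, old_key);
   table_remove_locked(t, old_key);
   table_insert_locked(t, new_key, data);
   pthread_mutex_unlock(&t->mutex);
}
```

A compound operation such as `table_move()` locks once instead of once per call, and no second, shared mutex is needed around it.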
* hash: Add _mesa_HashRemoveLocked() function. | Matt Turner | 2016-05-20 | 2 | -4/+17
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* i965: Pass nir_src/nir_dest by reference. | Matt Turner | 2016-05-20 | 4 | -18/+18
    Cuts 6K of .text.
       text    data    bss     dec     hex filename
    5772372  264648  29320 6066340  5c90a4 lib/i965_dri.so before
    5766074  264648  29320 6060042  5c780a lib/i965_dri.so after
    Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Guard against NULL dereference | Mark Janes | 2016-05-20 | 1 | -1/+1
    This trivially corrects mesa 3ca1c221, which introduced a check that crashes when a match is not found.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
    Fixes: piglit.spec.glsl-1_50.compiler.interface-blocks-name-reused-globally-4.vert
    Reviewed-by: Alejandro Piñeiro <[email protected]>
* anv: Enable textureCompressionASTC_LDR on Gen9+ | Nanley Chery | 2016-05-20 | 2 | -29/+29
    Signed-off-by: Nanley Chery <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv/format: Reorder ASTC mappings to match ISL enum ordering | Nanley Chery | 2016-05-20 | 1 | -14/+14
    Keep the lists consistent for ease of use.
    Signed-off-by: Nanley Chery <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* genxml: Expand SKL's SurfaceFormat field width for ASTC | Nanley Chery | 2016-05-20 | 1 | -2/+1
    In the expanded field, only ASTC format enums have the MSB set to 1. Expanding the field width makes the process of handling these formats identical to the way other formats are handled.
    Signed-off-by: Nanley Chery <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* isl: Handle npot ASTC block dimensions on Gen9+ | Nanley Chery | 2016-05-20 | 1 | -4/+4
    Signed-off-by: Nanley Chery <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* isl: Add 2D ASTC format layouts and enums | Nanley Chery | 2016-05-20 | 3 | -1/+59
    Also, make changes needed for successful compilation and registration as a texture compression mode.
    Signed-off-by: Nanley Chery <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/validate: assume() that hashtable entry exists | Rob Clark | 2016-05-20 | 1 | -0/+3
    At this point, it would require a logic error in nir_validate to not have already populated this hashtable entry, but coverity doesn't realize that:
    CID 1265547 (#1 of 1): Dereference null return value (NULL_RETURNS)
    3. dereference: Dereferencing a null pointer entry.
    CID 1271039 (#1 of 1): Dereference null return value (NULL_RETURNS)
    3. dereference: Dereferencing a null pointer entry.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: coverity unitialized pointer read | Rob Clark | 2016-05-20 | 1 | -0/+2
    Not sure how coverity arrives at the conclusion that we can read comp[j] uninitialized (around line 204), other than not being aware that ncomp is greater than 1 so it won't underflow in the 'if (tex->is_array)' case.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: coverity sign-extension fix | Rob Clark | 2016-05-20 | 1 | -1/+1
    Not 100% sure, but I think being an unsigned literal will help:
    CID 1358505 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
    sign_extension: Suspicious implicit sign extension: load1->def.num_components with type unsigned char (8 bits, unsigned) is promoted in load1->def.num_components * (load1->def.bit_size / 8) to type int (32 bits, signed), then sign-extended to type unsigned long (64 bits, unsigned). If load1->def.num_components * (load1->def.bit_size / 8) is greater than 0x7FFFFFFF, the upper bits of the result will all be 1.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
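The promotion chain in the Coverity report can be reproduced in a few lines of C. This sketch assumes a 32-bit `int`; the function names are hypothetical, and the `8u` variant is only one plausible reading of "an unsigned literal", not necessarily the exact change that was committed.

```c
#include <stdint.h>

/* Both operands promote to int, so the product is a signed int that is
 * then converted to the 64-bit unsigned return type. */
static uint64_t
size_with_int_literal(uint8_t num_components, uint8_t bit_size)
{
   return num_components * (bit_size / 8);
}

/* An unsigned literal makes the whole expression unsigned int, so the
 * widening conversion zero-extends instead of sign-extending. */
static uint64_t
size_with_unsigned_literal(uint8_t num_components, uint8_t bit_size)
{
   return num_components * (bit_size / 8u);
}

/* The hazard in isolation: widening a negative int sets all upper bits. */
static uint64_t
widen_signed(int v)
{
   return (uint64_t)v;
}

/* Widening an unsigned int leaves the upper bits clear. */
static uint64_t
widen_unsigned(unsigned v)
{
   return (uint64_t)v;
}
```

With 8-bit operands the product is at most 255 * 31, so it cannot actually exceed 0x7FFFFFFF. Both size functions therefore return the same values, and the change mainly makes the intent explicit for the static analyzer.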
* nir/glsl_to_nir: quell some uninit_member coverity errors | Rob Clark | 2016-05-20 | 1 | -0/+6
    Signed-off-by: Rob Clark <[email protected]>
    Acked-by: Matt Turner <[email protected]>
* freedreno/ir3: need to lower fmod too | Rob Clark | 2016-05-20 | 1 | -0/+2
    Signed-off-by: Rob Clark <[email protected]>
* i965: Fix strerror error code sign | Mark Janes | 2016-05-20 | 1 | -1/+1
    This trivial fix to error-handling corrects the sign of drm error codes before passing them to strerror.
    Identified by Coverity: CID1358581
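A minimal sketch of the sign correction, assuming the error code arrives as a negative errno value (a common kernel-style convention; the commit message does not spell it out). `describe_drm_error()` is a hypothetical helper, not a Mesa or libdrm function.

```c
#include <errno.h>
#include <string.h>

/*
 * ret is assumed to be a negative errno-style code such as -ENOENT.
 * strerror() expects the positive errno value, so negate before the call.
 */
static const char *
describe_drm_error(int ret)
{
   return strerror(-ret);
}
```

Passing the negative value straight to `strerror()` would typically produce an "Unknown error" string rather than the intended description.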
* nir/spirv: Handle the NonReadable decoration on struct members | Jason Ekstrand | 2016-05-19 | 1 | -0/+1
* anv/pipeline: Bounds-check resource indices when robuts_buffer_access is enabled | Jason Ekstrand | 2016-05-19 | 1 | -17/+35
* anv/pipeline: Only do buffer bounds checks if robustBufferAccess is enabled | Jason Ekstrand | 2016-05-19 | 1 | -1/+7
* anv/apply_dynamic_offsets: Use rewrite_src instead of a regular assignment | Jason Ekstrand | 2016-05-19 | 1 | -4/+5
    Originally we removed the instruction, changed the source, and then re-inserted it. This works, but nir_instr_rewrite_src is a bit more obviously correct.
* anv/device: Add a boolean for robust buffer access | Jason Ekstrand | 2016-05-19 | 2 | -0/+4
* anv: Add a TODO file | Jason Ekstrand | 2016-05-19 | 1 | -0/+23
* glsl: handle same struct redeclaration (v2) | Dave Airlie | 2016-05-20 | 3 | -4/+11
    This works around a bug in older versions of UE4, where a shader defines the same structure twice. Although we aren't sure this is correct GLSL (it most likely isn't), there are enough UE4-based things out there that we should deal with this. This drops the error to a warning if the struct names and contents match.
    v1.1: do better C++ on record_compare declaration (Rob)
    v2: restrict this to desktop GL only (Ian)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
    Reviewed-by: Ian Romanick <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz. | Matt Turner | 2016-05-19 | 1 | -2/+10
    Ken suggested instead of a big and complicated optimization pass, to just recognize the operations here. It's certainly less code and a lot prettier, but it seems to actually perform worse for currently unknown reasons.
    total instructions in shared programs: 8923452 -> 8904108 (-0.22%)
    instructions in affected programs: 814563 -> 795219 (-2.37%)
    helped: 3336
    HURT: 10
    total cycles in shared programs: 66970734 -> 66651476 (-0.48%)
    cycles in affected programs: 10582686 -> 10263428 (-3.02%)
    helped: 2438
    HURT: 691
    total spills in shared programs: 1811 -> 1789 (-1.21%)
    spills in affected programs: 85 -> 63 (-25.88%)
    helped: 4
    total fills in shared programs: 3143 -> 3109 (-1.08%)
    fills in affected programs: 167 -> 133 (-20.36%)
    helped: 4
    LOST: 2
    GAINED: 36
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add infrastucture for sample lod-zero operations. | Matt Turner | 2016-05-19 | 6 | -0/+33
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add and use get_nir_src_imm(). | Matt Turner | 2016-05-19 | 2 | -4/+19
    The next patch wants to inspect the LOD argument and do something different if it's 0.0f. But at that point we've emitted a MOV for it and we just have a register to look at.
    Reviewed-by: Kenneth Graunke <[email protected]>
* nvc0: account for shader-allocated local memory needs | Ilia Mirkin | 2016-05-19 | 2 | -2/+2
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: treat addresses as local | Ilia Mirkin | 2016-05-19 | 1 | -1/+1
    Address registers are always loaded right before use. Don't treat them as "global", which will cause them to be put into the function's linkage, and will make the register allocator hold onto that register until the end of the function.
    Signed-off-by: Ilia Mirkin <[email protected]>
* swr: [rasterizer] utility functions for shared libs | Tim Rowley | 2016-05-19 | 2 | -2/+64
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer jitter] fix assert in AVX implementation of MASKLOADD | Tim Rowley | 2016-05-19 | 1 | -2/+7
    llvm changed the mask type to vector of ints with 3.8.
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] apply KNOB_TOSS_DRAW to more functions | Tim Rowley | 2016-05-19 | 1 | -0/+20
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer jitter] add instancing to non-gather fetch path | Tim Rowley | 2016-05-19 | 1 | -5/+37
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] move MultisampleTrait static from header to cpp | Tim Rowley | 2016-05-19 | 3 | -4/+7
    Move a MultisampleTrait static from header to cpp as clang seemed to get confused with some specializations in the header vs some in cpp.
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] clang override for _mm_undefined* | Tim Rowley | 2016-05-19 | 1 | -1/+1
    Not supported in older xcode versions.
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer common] add OSX to unix portability sections | Tim Rowley | 2016-05-19 | 2 | -2/+9
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] rename _aligned_malloc to AlignedMalloc | Tim Rowley | 2016-05-19 | 8 | -25/+40
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer jitter] rename MEMCPY function to MEMCOPY | Tim Rowley | 2016-05-19 | 1 | -1/+1
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer common] guard definition of __cdecl/__stdcall | Tim Rowley | 2016-05-19 | 1 | -0/+4
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer common] include cstddef for offsetof | Tim Rowley | 2016-05-19 | 1 | -0/+1
    Reviewed-by: Bruce Cherniak <[email protected]>