summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* intel/genxml: Use blend function and factor enums where applicableKristian H. Kristensen2016-11-295-130/+124
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum 3D_Vertex_Component_Control where applicableKristian H. Kristensen2016-11-295-20/+20
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum 3D_Stencil_Operation where applicableKristian H. Kristensen2016-11-295-84/+63
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum SURFACE_FORMAT where applicableKristian H. Kristensen2016-11-295-10/+10
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum 3D_Prim_Topo_Type where applicableKristian H. Kristensen2016-11-295-15/+15
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use 3D_Compare_Function for gen8+ test functionsKristian H. Kristensen2016-11-292-8/+8
| | | | | | | | | When the state fields where shuffled around for gen8, the compare function enums were downgraded to just uints. Change them to enum 3D_Compare_Function. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Emit genxml enums as C enumsKristian H. Kristensen2016-11-291-4/+4
| | | | | | | | | The previous commits got rid of any clashes between #defines and enum values and we can now emit the genxml enums as debugger friendly C enums. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Remove duplicate COMPAREFUNCTION valuesKristian H. Kristensen2016-11-293-120/+12
| | | | | | | | These values were defined both as an enum and as inline values. Remove the inline values and reference the 3D_Compare_Function enum instead. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Allow referencing enums in type attributesKristian H. Kristensen2016-11-291-0/+7
| | | | | | | This lets us reference enums in the type attribute of a field. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Emit cherryview SF state without including gen9_pack.hKristian H. Kristensen2016-11-291-13/+23
| | | | | | | | | Cleaner this way and we avoid including gen9_pack.h when we compile with gen8_pack.h. We also avoid the if (cherryview) condition for non-gen8 gens that don't need it. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Don't include two different pack headersKristian H. Kristensen2016-11-291-3/+5
| | | | | | | | | | The batch chain logic only needs the pre-gen8 size of MI_BATCH_BUFFER_START, which seems like something we can make a special case for. The other two gen7 references, MI_BATCH_BUFFER_END and MI_NOOP, are the same on all gens. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Move enums above structsKristian H. Kristensen2016-11-295-1726/+1726
| | | | | | | | | We'll need to define them before we can reference them in structs and instructions. Enums have no dependencies, so move them first in the file. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* genxml: Add values for Barycentric Interpolation ModeKristian H. Kristensen2016-11-295-5/+40
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: remove per-sample shading from TODOIlia Mirkin2016-11-301-1/+0
| | | | | | | This was done some time ago. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: clean up VkPhysicalDeviceFeatures listIlia Mirkin2016-11-301-3/+3
| | | | | | | | | | Remove duplicate .alphaToOne, add missing .shaderResourceMinLod, and reorder a few entries to match their vulkan.h order. All the sparse features are still left out entirely. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* vulkan/wsi/x11: Destroy Present event context when destroying swapchainMichel Dänzer2016-11-301-0/+6
| | | | | | | | | | | | | | Without this, the X server may accumulate stale Present event contexts if a client creates and destroys multiple swapchains using the same window. v2: Based on Chris Wilson's review: * Use xcb_present_select_input_checked so that protocol errors generated by old X servers can be handled gracefully * Use xcb_discard_reply() instead of free(xcb_request_check()) v3: Rebased on top of this code having been refactored out of anv Reviewed-by: Dave Airlie <[email protected]>
* glsl: use linked_shaders bitmask to iterate stages for subroutine fieldsTimothy Arceri2016-11-302-31/+26
| | | | | | | | | This should be faster than looping over every stage and null checking, but will also make the code a bit cleaner when we switch to getting more fields from gl_program rather than from gl_linked_shader as we can just copy the pointer and not need to worry about null checking then copying. Reviewed-by: Ian Romanick <[email protected]>
* mesa: optimise interleaved sso validationTimothy Arceri2016-11-301-11/+14
| | | | | | | | | | Now that we have a linked_stages bitfield we can use this to check if the program is used at a later stage. This change is also required to be able to use gl_program rather than gl_shader_program in the CurrentProgram array. Reviewed-by: Ian Romanick <[email protected]>
* mesa/glsl: add bitmask to track stages a program was linked againstTimothy Arceri2016-11-302-0/+4
| | | | | | | | | | | | | | | | This will be used to enable us to store the current gl_program rather than gl_shader_program in the gl_pipline_object allowing us to simplify handing of validation. Also we should not be depending on _LinkedShader for this information as it may contain shaders from a failed linking attempt rather than the current program still in use. We could also use this mask to iterate over the stages during linking with _mesa_bit_scan() rather then the current method of NULL checking each stage. Reviewed-by: Ian Romanick <[email protected]>
* swr: [rasterizer jit] use signed integer representation for logic opIlia Mirkin2016-11-291-5/+12
| | | | | | | | | | Instead of (incorrectly) biasing the snorm value to make it look like a unorm, just use signed integer math. This fixes arb_color_buffer_float-render GL_RGBA8_SNORM Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: add missing rgbx8_srgb variantIlia Mirkin2016-11-291-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: reorder renderable formats, add grouping commentsIlia Mirkin2016-11-291-65/+87
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: use util_copy_framebuffer_state helperIlia Mirkin2016-11-291-12/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: enable cubemap arraysIlia Mirkin2016-11-291-1/+1
| | | | | | | Everything is in place for these. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: rearrange caps into limits/supported/unsupported groupsIlia Mirkin2016-11-291-129/+84
| | | | | | | | | | I find this a lot more readable and compact - much easier to scan through the list and see what's on and what's off. No functional change intended. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: only store up to the LOD sizeIlia Mirkin2016-11-291-1/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer common] add SwrTrace() and macrosTim Rowley2016-11-292-15/+95
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* radeonsi: don't fetch 8 dwords for samplerBuffer and imageBufferMarek Olšák2016-11-291-51/+43
| | | | | | | | | | | | | | The compiler doesn't shrink s_load_dwordx8, so we always wasted 4 SGPRs. Also, the extraction of the descriptor created some really ugly asm code with lots of VALU bitwise ops and v_readfirstlane. Totals from *affected* shaders: SGPRS: 13880 -> 13253 (-4.52 %) VGPRS: 15200 -> 15088 (-0.74 %) Code Size: 499864 -> 459816 (-8.01 %) bytes Max Waves: 1554 -> 1564 (0.64 %) Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable XNACK to free 2 SGPRs on APUsMarek Olšák2016-11-291-1/+1
| | | | | | My LLVM commit disables it for dGPUs, but not APUs. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: count and report temp arrays in scratch separatelyMarek Olšák2016-11-292-4/+40
| | | | | | v2: only do this if debug output of shader dumping is enabled Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
* radeonsi: don't try to eliminate trivial VS outputs for PS and CSMarek Olšák2016-11-291-1/+4
| | | | | | PS and CS don't have any param exports, so it's a no-op. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable RB+ blend optimizations for dual source blendingMarek Olšák2016-11-291-0/+11
| | | | | | | | This fixes dual source blending on Stoney. The fix was copied from Vulkan. The problem was discovered during internal testing. Cc: 13.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blendingMarek Olšák2016-11-291-0/+4
| | | | | | | copied from Vulkan Cc: 13.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: always set all blend registersMarek Olšák2016-11-291-5/+5
| | | | | | | better safe than sorry Cc: 13.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set the smallest possible CB_TARGET_MASKMarek Olšák2016-11-291-5/+5
| | | | | | better safe than sorry; set_framebuffer_state always makes this dirty Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't print bodies of header-only packetsMarek Olšák2016-11-291-0/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print unknown registers with correct formattingMarek Olšák2016-11-291-1/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: fix hang detection with deferred flushesMarek Olšák2016-11-291-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radv: set spi_baryc_cntl.pos_float_location to 0Dave Airlie2016-11-291-1/+1
| | | | | | | | | | | This fixes: dEQP-VK.pipeline.multisample_interpolation.offset_interpolate_at_sample_position.* This should probably be 2 when sample shading is enabled, but I'm not sure. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: force persample shading when required.Dave Airlie2016-11-294-6/+23
| | | | | | | | | | | | | | | | | | | We need to force persample shading when a) shader uses sample_id b) shader uses sample_position c) shader uses sample qualifier. Also since ps_iter_samples can now change independently of the rasterizer samples we need to move setting the regs more often. This fixes: dEQP-VK.pipeline.multisample_interpolation.centroid_interpolate_at_consistency.* dEQP-VK.pipeline.multisample_interpolation.centroid_qualifier_inside_primitive.137_191_1.* dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.* dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.* Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir: print var binding in dumps.Dave Airlie2016-11-291-1/+1
| | | | | | | | This only useful for spir-v shaders, but I keep finding myself having to add it. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* docs: fix small typoEric Engestrom2016-11-291-1/+1
| | | | | | Fixes: ba28f2136febca32fe56 ("docs: add note about r-b/other tags when resending") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965/sched: Schedule trivial blocks.Matt Turner2016-11-291-3/+0
| | | | | | | | | | In commit 45cd76e342d1e8e schedule_instructions(bblock_t *) began setting bblock_t::cycle_count, but that function was not called on trivial blocks. Remove the code to skip trivial blocks so that cycle_count is set. Reviewed-by: Francisco Jerez <[email protected]>
* i965/sched: Make 'time' a local variable.Matt Turner2016-11-291-3/+1
| | | | Reviewed-by: Francisco Jerez <[email protected]>
* i965/cfg: Initialize bblock_t::cycle_count.Matt Turner2016-11-291-1/+1
| | | | | | | | | | | schedule_instructions(bblock_t *) isn't called on blocks with a single instruction, and since it is the only thing that set cycle_count, cycle_count would be uninitialized. A non-empty block with bblock_t::cycle_count == 0 is arguably a bug. That'll be fixed in the next commit. Reviewed-by: Francisco Jerez <[email protected]>
* i965/cfg: Initialize cfg_t::cycle_count.Matt Turner2016-11-292-1/+2
| | | | | | This reverts commit b4001af1744a02f472bd1204458662088307981b. Reviewed-by: Francisco Jerez <[email protected]>
* ac/nir: Fix accessing an unitialized value.Bas Nieuwenhuizen2016-11-291-1/+2
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Initialize the shader_stats_dump flag.Bas Nieuwenhuizen2016-11-291-0/+1
| | | | | | | | Meta was using it before it was set. I suspect we typically don't want to dump meta shaders, so just set it to false in the beginning. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* vc4: Add a note for the future about texture latency calculation.Eric Anholt2016-11-291-0/+20
| | | | | | | Debugging a shader-db reported cycle count regression from the tex coalescing, I eventually figured out that the texture latencies were totally bogus. Really fixing it will probably involve mirroring vc4_qir_schedule.c's texture fifo management here.
* vc4: Add support for coalescing ALU ops into tex_[srtb] MOVs.Eric Anholt2016-11-294-29/+37
| | | | | | | | | | | This isn't as complete as I would like (can't merge interpolation because of the implicit r5 dependency, doesn't work with control flow), but this was cheap and easy. Improves 3DMMES Taiji performance by 1.15353% +/- 0.299896% (n=29, 16) total instructions in shared programs: 99810 -> 99059 (-0.75%) instructions in affected programs: 10705 -> 9954 (-7.02%)