Commit message | Author | Age | Files | Lines
* radeonsi: don't declare LDS in TES | Marek Olšák | 2017-01-23 | 1 | -2/+1
| | | | | | not used since we started using the offchip tess ring Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: preload PS inputs only if KILL is used | Marek Olšák | 2017-01-23 | 1 | -2/+6
| | | | | | | | | | | so that most shaders can get lower VGPR usage thanks to lazy input loading. I think this is a more accurate constraint that prevents the black transitions in Witcher 2. Affected shaders (7758): Max Waves: 57437 -> 58231 (1.38 %) Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: adjust the rule for using the LINEAR_ALIGNED layout | Marek Olšák | 2017-01-23 | 1 | -1/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: drop all IBs if at least one was rejected within the context | Marek Olšák | 2017-01-23 | 1 | -1/+7
| | | | | | The corruption is inevitable and hangs are possible too. Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: report a rejected IB as a lost context | Marek Olšák | 2017-01-23 | 3 | -0/+14
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* vulkan: import latest registry for 1.0.39 extensions. | Dave Airlie | 2017-01-24 | 1 | -42/+408
| | | | | Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* vulkan: bump vulkan.h to 1.0.39 version | Dave Airlie | 2017-01-24 | 1 | -2/+365
| | | | | | | This introduces a bunch of new extension defines. Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: don't resubmit the same cs over and over while tracing | Grazvydas Ignotas | 2017-01-23 | 1 | -2/+1
| | | | | | | Fixes: 97dfff54 ("radv: Dump command buffer on hang.") Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> CC: <[email protected]>
* gallium/radeon: add HUD queries for monitoring some hw blocks | Samuel Pitoiset | 2017-01-23 | 4 | -1/+110
| | | | | | | | | | | | It's also possible to monitor them via performance counters but the hardware can only use two counters simultaneously. It seems easier to re-use the existing code which reads from MMIO instead of writing a multi-pass approach. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: refactor the GRBM counters path | Samuel Pitoiset | 2017-01-23 | 3 | -43/+47
| | | | | | | | | | This will allow exposing more queries in order to know which blocks are busy/idle. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* swr: Align query results allocation | George Kyriazis | 2017-01-23 | 2 | -4/+5
| | | | | | | | | | | Some query results struct contents are declared as cache line aligned. Use aligned malloc, and align the whole struct, to be safe. Fixes crash when compiling with clang. CC: <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
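The alignment requirement described above can be sketched in C. This is a minimal illustration, not the swr code: `struct query_result`, `query_result_alloc` and the 64-byte `CACHE_LINE` value are hypothetical, and C11 `aligned_alloc` stands in for whatever aligned allocator the driver actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64 /* assumed cache line size, for illustration only */

/* Hypothetical query result struct. One member is declared cache-line
 * aligned, so the whole struct inherits a 64-byte alignment requirement. */
struct query_result {
    _Alignas(CACHE_LINE) uint64_t counters[8];
    uint32_t flags;
};

/* Allocate the struct with an aligned allocator. Plain malloc() only
 * guarantees alignment suitable for fundamental types, which can be less
 * than what the struct requires. */
static struct query_result *query_result_alloc(void)
{
    /* aligned_alloc requires the size to be a multiple of the alignment. */
    static_assert(sizeof(struct query_result) % CACHE_LINE == 0,
                  "struct size must be a multiple of the cache line size");
    return aligned_alloc(CACHE_LINE, sizeof(struct query_result));
}
```

Memory returned by `query_result_alloc()` is released with `free()`. A compiler may emit aligned vector loads and stores for such a struct, which is one way an under-aligned allocation can crash with one compiler and not another.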
* swr: Prune empty nodes in CalculateProcessorTopology. | Bruce Cherniak | 2017-01-23 | 1 | -0/+9
| | | | | | | | | | | | CalculateProcessorTopology tries to figure out system topology by parsing /proc/cpuinfo to determine the number of threads, cores, and NUMA nodes. There are some architectures where the "physical id" begins with 1 rather than 0, which was creating an empty "0" node and causing a crash in CreateThreadPool. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102 Reviewed-By: George Kyriazis <[email protected]> CC: <[email protected]>
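The pruning step can be sketched as an in-place compaction of the node array. This is an illustration in C under assumed names (`struct node`, `prune_empty_nodes`); the real function and its data structures are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical topology node: only the core count matters here. */
struct node {
    int num_cores;
};

/* Remove nodes that have no cores, keeping the order of the rest.
 * Returns the number of nodes that remain at the front of the array. */
static size_t prune_empty_nodes(struct node *nodes, size_t count)
{
    assert(nodes != NULL || count == 0);

    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (nodes[i].num_cores > 0)
            nodes[kept++] = nodes[i];
    }
    return kept;
}
```

With "physical id" values starting at 1, index 0 would be an empty node; after pruning, later code that walks the nodes no longer sees it.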
* i965: Use UNUSED to silence unused variable (used in assert). | Matt Turner | 2017-01-23 | 1 | -1/+1
|
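For context, a variable that is only read inside `assert()` becomes unused when assertions are compiled out with `NDEBUG`, which triggers a compiler warning. A minimal sketch of the pattern follows; the `UNUSED` definition here is an assumption based on the GCC/Clang attribute, and `clamp_index` is a hypothetical function.

```c
#include <assert.h>

/* Assumed definition for illustration; relies on a GCC/Clang attribute. */
#define UNUSED __attribute__((unused))

/* 'last' is only read by the assert. In release builds (NDEBUG) the
 * assert expands to nothing, so without UNUSED the compiler would warn
 * about an unused variable. */
static int clamp_index(int index, int count)
{
    UNUSED const int last = count - 1;
    assert(index <= last);
    return index < 0 ? 0 : index;
}
```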
* dri: allow 16bit R/GR images to be exported via drm buffers | Rainer Hochecker | 2017-01-23 | 4 | -0/+24
| | | | | | | This allows eglCreateImageKHR to access P010 surfaces created by vaapi Signed-off-by: Rainer Hochecker <[email protected]> Acked-by: Ben Widawky <[email protected]>
* st/va: make sure that we call begin_frame() only once v2 | Christian König | 2017-01-23 | 2 | -3/+9
| | | | | | | | | | This fixes "st/va: delay calling begin_frame until we have all parameters". v2: call begin frame after decoder (re)creation as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Nayan Deshmukh <[email protected]> Tested-by: Andy Furniss <[email protected]>
* drirc: remove spurious tabs | Eric Engestrom | 2017-01-23 | 1 | -8/+8
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_tgsi: use DDIV instead of DRCP + DMUL | Nicolai Hähnle | 2017-01-23 | 1 | -6/+3
| | | | | | | | | | | | Fixes GL45-CTS.gpu_shader_fp64.built_in_functions. v2: use DDIV unconditionally (Roland) Reviewed-by: Roland Scheidegger <[email protected]> (v1) Reviewed-by: Marek Olšák <[email protected]> (v1) Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
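Why a single division can behave differently from reciprocal-then-multiply can be shown on the CPU with IEEE 754 doubles. The sketch below is only an analogy for DDIV versus DRCP + DMUL, with hypothetical function names; it says nothing about how any GPU implements these opcodes.

```c
#include <assert.h>

/* Direct division: a single correctly rounded operation. */
static double div_direct(double a, double b)
{
    assert(b != 0.0);
    return a / b;
}

/* Reciprocal followed by multiply: two roundings, so the result can
 * differ from the direct quotient in the last bit. */
static double div_via_rcp(double a, double b)
{
    assert(b != 0.0);
    double rcp = 1.0 / b;
    return a * rcp;
}
```

A classic input is 49.0 / 49.0: the direct form gives exactly 1.0, while 49.0 * (1.0 / 49.0) does not, even though both steps are correctly rounded. A hardware reciprocal that is only approximate would widen the gap further.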
* glsl: split DIV_TO_MUL_RCP into single- and double-precision flags | Nicolai Hähnle | 2017-01-23 | 2 | -9/+14
| | | | | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* r600: implement DDIV | Nicolai Hähnle | 2017-01-23 | 1 | -0/+59
| | | | | | Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* r600: factor out cayman_emit_unary_double_raw | Nicolai Hähnle | 2017-01-23 | 1 | -20/+42
| | | | | | | | We will use it for DDIV. Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* r600: double multiply can handle only one multiply at a time | Nicolai Hähnle | 2017-01-23 | 1 | -17/+19
| | | | | | | | | | It seems clear that trying to multiply two pairs of doubles would result in the temporary register getting overwritten by the second pair. So make the code more explicit. Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* glsl: fix tes linking regression | Timothy Arceri | 2017-01-23 | 1 | -2/+2
| | | | | Fixes regression caused by cbeba6bd48da2c. I accidentally pushed the wrong version of the patch.
* mesa: remove unused gl_shader_info field from gl_linked_shader | Timothy Arceri | 2017-01-23 | 1 | -2/+0
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set and get cs layouts to and from shader_info | Timothy Arceri | 2017-01-23 | 4 | -36/+17
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set and get gs layouts directly to and from shader_info | Timothy Arceri | 2017-01-23 | 2 | -41/+41
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl/i965: set and get tes layouts directly to and from shader_info | Timothy Arceri | 2017-01-23 | 3 | -45/+40
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use last_vert_prog to get last {clip,cull}_distance_array_size | Timothy Arceri | 2017-01-23 | 3 | -23/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set {clip,cull}_distance_array_size directly in gl_program | Timothy Arceri | 2017-01-23 | 6 | -73/+25
| | | | | | | There are some line wrapping violations here but those lines will get deleted in the following patch. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa/glsl: change xfb_program field to last_vert_prog | Timothy Arceri | 2017-01-23 | 7 | -32/+44
| | | | | | | | | | Now that the i965 backend doesn't depend on this field we can make it more generic and short circuit a bunch of code paths. The new field will be used in a following patch for another clean-up. Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: use gl_program for CurrentProgram rather than gl_shader_program | Timothy Arceri | 2017-01-23 | 27 | -396/+248
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes much more sense and should be more performant in some critical paths such as SSO validation, which is called at draw time. Previously the CurrentProgram array could have contained multiple pointers to the same struct, which was confusing, and we would often need to fish out the information we were really after from the gl_program anyway. Also it was error prone to depend on the _LinkedShader array for programs in current use because a failed linking attempt will lose the information about the current program in use, which is still valid. V2: fix validate_io() to compare linked_stages rather than the consumer and producer to decide if we are looking at inward facing shader interfaces which don't need validation. Acked-by: Edward O'Callaghan <[email protected]> To avoid build regressions the following 2 patches were squashed into this commit: mesa/meta: rewrite _mesa_shader_program_use() and _mesa_program_use() These are rewritten to do what the function name suggests, that is _mesa_shader_program_use() sets the use of all stages and _mesa_program_use() sets the use of a single stage. Reviewed-by: Lionel Landwerlin <[email protected]> Acked-by: Edward O'Callaghan <[email protected]> mesa: update active relinked program This likely fixes a subroutine bug where _mesa_shader_program_init_subroutine_defaults() would never have been called for the relinked program as we previously just set _NEW_PROGRAM as dirty and never called the _mesa_use* functions when linking. Acked-by: Edward O'Callaghan <[email protected]>
* freedreno/a5xx: set frag shader threadsize | Rob Clark | 2017-01-22 | 1 | -2/+7
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: set fragcoordxy properly | Rob Clark | 2017-01-22 | 1 | -1/+1
| | | | | | | | | | | What a3xx docs call IJPERSPCENTERREGID, i.e. the xy coord passed into bary.f. We were incorrectly setting both this and gl_FragCoord.xy to the same register, resulting in all sorts of hilarity. Fixes stk, vdrift, 0ad, and probably a bunch of others. Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/ir3: setup var locations in standalone compiler | Rob Clark | 2017-01-22 | 1 | -1/+69
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix psize | Rob Clark | 2017-01-22 | 2 | -8/+5
| | | | | | | | Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on a5xx. Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: srgb fix | Rob Clark | 2017-01-22 | 1 | -1/+2
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix int vbos | Rob Clark | 2017-01-22 | 1 | -1/+3
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix clear for uint/sint formats | Rob Clark | 2017-01-22 | 1 | -19/+28
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix cull state | Rob Clark | 2017-01-22 | 1 | -5/+5
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno: update generated headers | Rob Clark | 2017-01-22 | 6 | -13/+36
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* anv: descriptors: don't update immutables samplers with anything but their immutable value | Lionel Landwerlin | 2017-01-21 | 1 | -12/+19
| | | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/search: Use the correct bit size for integer comparisons | Jason Ekstrand | 2017-01-21 | 1 | -32/+16
| | | | | | | | | | | | | | | | | | The previous code always compared integers as 64-bit. Due to variations in sign-extension in the code generated by nir_opt_algebraic.py, this meant that nir_search doesn't always do what you want. Instead, 32-bit values should be matched as 32-bit and 64-bit values should be matched as 64-bit. While we're here we unify the unsigned and signed paths. Now that we're using the right bit size, they should be the same since the only difference we had before was sign extension. This gets the UE4 bitfield_extract optimization working again. It had stopped working due to the constant 0xff00ff00 getting sign-extended when it shouldn't have. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "17.0 13.0" <[email protected]>
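The sign-extension problem described above can be reproduced in a few lines of C. This is a sketch of the idea only, with a hypothetical helper `const_matches`; it is not the nir_search implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compare two constants using only the low 'bit_size' bits, so that a
 * 32-bit value matches regardless of how it was widened to 64 bits. */
static bool const_matches(uint64_t value, uint64_t pattern, unsigned bit_size)
{
    assert(bit_size >= 1 && bit_size <= 64);

    uint64_t mask = (bit_size == 64) ? UINT64_MAX
                                     : (UINT64_C(1) << bit_size) - 1;
    return (value & mask) == (pattern & mask);
}
```

For example, 0xff00ff00 widened with sign extension is 0xffffffffff00ff00, while the zero-extended form is 0x00000000ff00ff00. A 64-bit comparison treats them as different constants; a 32-bit comparison treats them as the same, and no separate signed and unsigned paths are needed.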
* intel/blorp/copy: Properly handle clear colors for CCS_E images | Jason Ekstrand | 2017-01-21 | 1 | -0/+82
| | | | | | | | | | | | | | In order to handle CCS_E, we stomp the image format to a UINT format and then do some bitcasting logic in the shader. This works fine since SKL render compression only considers the channel layout of the format and not the format itself. In order for this to work on images that have been fast-cleared, we need to also convert the clear color so that, when interpreted as UINT, it provides the same bit value as it would have in the original format. This fixes a bunch of OpenGL ES CTS tests for copy_image when we start using CCS more aggressively. Reviewed-by: Topi Pohjolainen <[email protected]> Cc: "17.0" <[email protected]>
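The "same bit value when interpreted as UINT" requirement can be illustrated for the simplest case, a single 32-bit float channel. This sketch uses a hypothetical helper `float_bits_as_uint` and is not the blorp code; other formats (for example normalized or packed ones) would need their own per-format packing, which is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit float clear value as an unsigned
 * integer, without any numeric conversion. memcpy is the portable way
 * to bitcast in C. */
static uint32_t float_bits_as_uint(float f)
{
    uint32_t u;
    static_assert(sizeof u == sizeof f, "float must be 32 bits wide");
    memcpy(&u, &f, sizeof u);
    return u;
}
```

A float clear color of 1.0f would therefore be written as 0x3f800000 when the surface is viewed through a UINT format, rather than as the integer 1.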
* glsl: Rename [u]int64_t tokens. | Kenneth Graunke | 2017-01-20 | 2 | -5/+5
| | | | | | | | | | basetsd.h on Windows defines INT64 and UINT64 typedefs which conflict with these. Append "_TOK" to avoid conflicts. Should fix the Windows build. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* Revert "i965: Really don't emit Q or UQ moves on Gen < 8" | Matt Turner | 2017-01-20 | 1 | -8/+0
| | | | | | This reverts commit c95380c4044237d73fb537511667c3c8f658fcee. Acked-by: Kenneth Graunke <[email protected]>
* i965: Select DF type for 64-bit integers on Gen < 8. | Matt Turner | 2017-01-20 | 4 | -10/+12
| | | | | | | | | | | | Gen8 adds Q/UQ types. We attempted to change the types back to DF in the generator (commit c95380c40), but an assertion added in the FP64 series (commit e481dcc3) triggers before that code has a chance to execute. In fact, using Q/UQ in the IR and then changing to DF in the generator would not work in the presence of source modifiers, etc. Fixes: d6fcede6 ("i965: Return Q and UQ types for int64 and uint64") Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Enable ARB_gpu_shader_int64 on Gen8+ | Ian Romanick | 2017-01-20 | 2 | -0/+6
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Split SIMD16 CMP of Q and UQ instructions | Ian Romanick | 2017-01-20 | 1 | -14/+29
| | | | | | | This is basically the same as happens for doubles. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Enable 64-bit integer support for almost all unary and binary operations | Ian Romanick | 2017-01-20 | 1 | -10/+0
| | | | | | | | Integer comparison functions (e.g., nir_op_ilt) are handled in the next commit. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Enable uploading 64-bit integer uniforms | Ian Romanick | 2017-01-20 | 1 | -1/+3
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add 64-bit integer support for conversions and bitcasts | Ian Romanick | 2017-01-20 | 2 | -5/+35
| | | | | | | | | | v2 (idr): Make the "from" type in a cast unsized. This reduces the number of required cast operations at the expense of slightly more complex code. However, this will be a dramatic improvement when other sized integer types are added. Suggested by Connor. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>