summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* gallium/radeon: add HUD queries for monitoring some hw blocksSamuel Pitoiset2017-01-234-1/+110
| | | | | | | | | | | | It's also possible to monitor them via performance counters but the hardware can only use two counters simultaneously. It seems easier to re-use the existing code which reads from MMIO instead of writing a multi-pass approach. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: refactor the GRBM counters pathSamuel Pitoiset2017-01-233-43/+47
| | | | | | | | | | This will allow to expose more queries in order to know which blocks are busy/idle. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* swr: Align query results allocationGeorge Kyriazis2017-01-232-4/+5
| | | | | | | | | | | Some query results struct contents are declared as cache line aligned. Use aligned malloc, and align the whole struct, to be safe. Fixes crash when compiling with clang. CC: <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: Prune empty nodes in CalculateProcessorTopology.Bruce Cherniak2017-01-231-0/+9
| | | | | | | | | | | | CalculateProcessorTopology tries to figure out system topology by parsing /proc/cpuinfo to determine the number of threads, cores, and NUMA nodes. There are some architectures where the "physical id" begins with 1 rather than 0, which was creating and empty "0" node and causing a crash in CreateThreadPool. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102 Reviewed-By: George Kyriazis <[email protected]> CC: <[email protected]>
* i965: Use UNUSED to silence unused variable (used in assert).Matt Turner2017-01-231-1/+1
|
* dri: allow 16bit R/GR images to be exported via drm buffersRainer Hochecker2017-01-233-0/+20
| | | | | | | This allows eglCreateImageKHR to access P010 surfaces created by vaapi Signed-off-by: Rainer Hochecker <[email protected]> Acked-by: Ben Widawky <[email protected]>
* st/va: make sure that we call begin_frame() only once v2Christian König2017-01-232-3/+9
| | | | | | | | | | This fixes "st/va: delay calling begin_frame until we have all parameters". v2: call begin frame after decoder (re)creation as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Nayan Deshmukh <[email protected]> Tested-by: Andy Furniss <[email protected]>
* drirc: remove spurious tabsEric Engestrom2017-01-231-8/+8
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/glsl_to_tgsi: use DDIV instead of DRCP + DMULNicolai Hähnle2017-01-231-6/+3
| | | | | | | | | | | | Fixes GL45-CTS.gpu_shader_fp64.built_in_functions. v2: use DDIV unconditionally (Roland) Reviewed-by: Roland Scheidegger <[email protected]> (v1) Reviewed-by: Marek Olšák <[email protected]> (v1) Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* glsl: split DIV_TO_MUL_RCP into single- and double-precision flagsNicolai Hähnle2017-01-232-9/+14
| | | | | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* r600: implement DDIVNicolai Hähnle2017-01-231-0/+59
| | | | | | Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* r600: factor out cayman_emit_unary_double_rawNicolai Hähnle2017-01-231-20/+42
| | | | | | | | We will use it for DDIV. Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* r600: double multiply can handle only one multiply at a timeNicolai Hähnle2017-01-231-17/+19
| | | | | | | | | | It seems clear that trying to multiply two pairs of doubles would result in the temporary register getting overwritten by the second pair. So make the code more explicit. Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* glsl: fix tes linking regressionTimothy Arceri2017-01-231-2/+2
| | | | | Fixes regression caused by cbeba6bd48da2c. I accidentally pushed the wrong version of the patch.
* mesa: remove unused gl_shader_info field from gl_linked_shaderTimothy Arceri2017-01-231-2/+0
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set and get cs layouts to and from shader_infoTimothy Arceri2017-01-234-36/+17
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set and get gs layouts directly to and from shader_infoTimothy Arceri2017-01-232-41/+41
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl/i965: set and get tes layouts directly to and from shader_infoTimothy Arceri2017-01-233-45/+40
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use last_vert_prog to get last {clip,cull}_distance_array_sizeTimothy Arceri2017-01-233-23/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set {clip,cull}_distance_array_size directly in gl_programTimothy Arceri2017-01-236-73/+25
| | | | | | | There are some line wrapping violations here but those lines will get deleted in the following patch. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa/glsl: change xfb_program field to last_vert_progTimothy Arceri2017-01-237-32/+44
| | | | | | | | | | Now that the i965 backend doesn't depend on this field we can make it more generic and short circuit a bunch of code paths. The new field will be used in a following patch for another clean-up. Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: use gl_program for CurrentProgram rather than gl_shader_programTimothy Arceri2017-01-2327-396/+248
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes much more sense and should be more performant in some critical paths such as SSO validation which is called at draw time. Previously the CurrentProgram array could have contained multiple pointers to the same struct which was confusing and we would often need to fish out the information we were really after from the gl_program anyway. Also it was error prone to depend on the _LinkedShader array for programs in current use because a failed linking attempt will lose the infomation about the current program in use which is still valid. V2: fix validate_io() to compare linked_stages rather than the consumer and producer to decide if we are looking at inward facing shader interfaces which don't need validation. Acked-by: Edward O'Callaghan <[email protected]> To avoid build regressions the following 2 patches were squashed in to this commit: mesa/meta: rewrite _mesa_shader_program_use() and _mesa_program_use() These are rewritten to do what the function name suggests, that is _mesa_shader_program_use() sets the use of all stage and _mesa_program_use() sets the use of a single stage. Reviewed-by: Lionel Landwerlin <[email protected]> Acked-by: Edward O'Callaghan <[email protected]> mesa: update active relinked program This likely fixes a subroutine bug were _mesa_shader_program_init_subroutine_defaults() would never have been called for the relinked program as we previously just set _NEW_PROGRAM as dirty and never called the _mesa_use* functions when linking. Acked-by: Edward O'Callaghan <[email protected]>
* freedreno/a5xx: set frag shader threadsizeRob Clark2017-01-221-2/+7
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: set fragcoordxy properlyRob Clark2017-01-221-1/+1
| | | | | | | | | | | What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into bary.f. We were incorrectly setting both this and gl_FragCoord.xy to the same register resulting in all sorts of hilarity. Fixes stk, vdrift, 0ad, probably a bunch others. Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/ir3: setup var locations in standalone compilerRob Clark2017-01-221-1/+69
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix psizeRob Clark2017-01-222-8/+5
| | | | | | | | Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on a5xx. Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: srgb fixRob Clark2017-01-221-1/+2
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix int vbosRob Clark2017-01-221-1/+3
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix clear for uint/sint formatsRob Clark2017-01-221-19/+28
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno/a5xx: fix cull stateRob Clark2017-01-221-5/+5
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* freedreno: update generated headersRob Clark2017-01-226-13/+36
| | | | | Signed-off-by: Rob Clark <[email protected]> Cc: "17.0" <[email protected]>
* anv: descriptors: don't update immutables samplers with anything but their ↵Lionel Landwerlin2017-01-211-12/+19
| | | | | | | immutable value Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/search: Use the correct bit size for integer comparisonsJason Ekstrand2017-01-211-32/+16
| | | | | | | | | | | | | | | | | | The previous code always compared integers as 64-bit. Due to variations in sign-extension in the code generated by nir_opt_algebraic.py, this meant that nir_search doesn't always do what you want. Instead, 32-bit values should be matched as 32-bit and 64-bit values should be matched as 64-bit. While we're here we unify the unsigned and signed paths. Now that we're using the right bit size, they should be the same since the only difference we had before was sign extension. This gets the UE4 bitfield_extract optimization working again. It had stopped working due to the constant 0xff00ff00 getting sign-extended when it shouldn't have. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "17.0 13.0" <[email protected]>
* intel/blorp/copy: Properly handle clear colors for CCS_E imagesJason Ekstrand2017-01-211-0/+82
| | | | | | | | | | | | | | In order to handle CCS_E, we stomp the image format to a UINT format and then do some bitcasting logic in the shader. This works fine since SKL render compression only considers the channel layout of the format and not the format itself. In order for this to work on images that have been fast-cleared, we need to also convert the clear color so that, when interpreted as UINT, it provides the same bit value as it would have in the original format. This fixes a bunch of OpenGL ES CTS tests for copy_image when we start using CCS more aggressively. Reviewed-by: Topi Pohjolainen <[email protected]> Cc: "17.0" <[email protected]>
* glsl: Rename [u]int64_t tokens.Kenneth Graunke2017-01-202-5/+5
| | | | | | | | | | basetsd.h on Windows defines INT64 and UINT64 typedefs which conflict with these. Append "_TOK" to avoid conflicts. Should fix the Windows build. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* Revert "i965: Really don't emit Q or UQ moves on Gen < 8"Matt Turner2017-01-201-8/+0
| | | | | | This reverts commit c95380c4044237d73fb537511667c3c8f658fcee. Acked-by: Kenneth Graunke <[email protected]>
* i965: Select DF type for 64-bit integers on Gen < 8.Matt Turner2017-01-204-10/+12
| | | | | | | | | | | | Gen8 adds Q/UQ types. We attempted to change the types back to DF in the generator (commit c95380c40), but an assertion added in the FP64 series (commit e481dcc3) triggers before that code has a chance to execute. In fact, using Q/UQ in the IR and then changing to DF in the generator would not work in the presence of source modifiers, etc. Fixes: d6fcede6 ("i965: Return Q and UQ types for int64 and uint64") Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Enable ARB_gpu_shader_int64 on Gen8+Ian Romanick2017-01-202-0/+6
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Split SIMD16 CMP of Q and UQ instructionsIan Romanick2017-01-201-14/+29
| | | | | | | This is basically the same as happens for doubles. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Enable 64-bit integer support for almost all unary and binary operationsIan Romanick2017-01-201-10/+0
| | | | | | | | Integer comparison functions (e.g., nir_op_ilt) are handled in the next commit. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Enable uploading 64-bit integer uniformsIan Romanick2017-01-201-1/+3
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add 64-bit integer support for conversions and bitcastsIan Romanick2017-01-202-5/+35
| | | | | | | | | | v2 (idr): Make the "from" type in a cast unsized. This reduces the number of required cast operations at the expensive slightly more complex code. However, this will be a dramatic improvement when other sized integer types are added. Suggested by Connor. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Enable emitting Q and UQ instructions in the fs backendIan Romanick2017-01-202-1/+12
| | | | | | | | v2: Fixup assertion in brw_reg_type_to_hw_type to allow BRW_REGISTER_TYPE_{UQ,Q} on Gen8+. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add support for constant evaluation on Q and UQ typesIan Romanick2017-01-202-7/+20
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Return Q and UQ types for int64 and uint64Ian Romanick2017-01-201-4/+2
| | | | | | | | | It seems like maybe this should return a different type based on Gen. Q and UQ only exist on Gen8+, but, based on the old comment, I believe previous Gens can generate 64-bit moves. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Really don't emit Q or UQ moves on Gen < 8Ian Romanick2017-01-201-0/+8
| | | | | | | | | | | | | | | | | | | | | | It's much easier to do this in the generator rather than while coming out of NIR. brw_type_for_nir_type doesn't know the Gen, so we'd have to add a bunch of plumbing. The alternate fix is to not emit int64 moves for doubles in the first place... but that seems even more difficult. This change won't catch non-MOV instructions that try to use 64-bit integer types on Gen < 8. This may convert certain kinds of bugs in to different kinds of bugs that are more difficult to detect (since the assertions in the function won't catch them). NOTE: I don't think anything can emit mixed-type 64-bit moves until the same platform supports both ARB_gpu_shader_fp64 and ARB_gpu_shader_int64. When we enable int64 on Gen < 8, we can solve this problem other ways. This prevents regressions on HSW in the next patch. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Add support for 64-bit integer types to split_var_copies_blockIan Romanick2017-01-201-0/+2
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: Enable 64-bit integer support for almost all unary and binary operationsIan Romanick2017-01-201-9/+15
| | | | | | | | | | v2: Don't up-convert the shift count parameter if shift instructions. Suggested by Connor. Add type_is_singed() function. This will make adding 8- and 16-bit types easier. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: Jason Ekstrand <[email protected]>
* nir: Shift count for shift opcodes is always 32-bitsIan Romanick2017-01-202-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously both sources were unsized. This caused problems when the thing being shifted was 64-bit but the shift count was 32-bit. The expectation in NIR is that all unsized sources (and destination) will ultimately have the same size. The changes in nir_opt_algebraic.py are to prevent errors like: Failed to parse transformation: 03:12:25 (('extract_i8', 'a', 'b'), ('ishr', ('ishl', 'a', ('imul', ('isub', 3, 'b'), 8)), 24), 'options->lower_extract_byte') 03:12:25 Traceback (most recent call last): 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 610, in __init__ 03:12:25 xform = SearchAndReplace(xform) 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 495, in __init__ 03:12:25 BitSizeValidator(varset).validate(self.search, self.replace) 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 311, in validate 03:12:25 validate_dst_class = self._validate_bit_class_up(replace) 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 414, in _validate_bit_class_up 03:12:25 src_class = self._validate_bit_class_up(val.sources[i]) 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 420, in _validate_bit_class_up 03:12:25 assert src_class == src_type_bits 03:12:25 AssertionError Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Connor Abbott <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: Jason Ekstrand <[email protected]>
* nir: Lower packing and unpacking of 64-bit integer typesIan Romanick2017-01-201-5/+37
| | | | | | | | | This change makes me wonder whether double packing should be reimplemented as int64BitsToDouble(packInt2x32(v)). I'm a little on the fence since not all platforms that support fp64 natively support int64. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]>