Commit message | Author | Age | Files | Lines
* glsl/tests: fix segfault in uniform initializer test | Timothy Arceri | 2016-08-11 | 1 | -0/+5
| | | | | | | Caused by 549222f5 Tested-by: Aaron Watry <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97286
* glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders | Ian Romanick | 2016-08-10 | 1 | -4/+28
|
| Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:
|
|     It is an error to undefine or to redefine a built-in
|     (pre-defined) macro name.
|
| The GLSL ES 1.00 spec does not contain this text.
|
| Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:
|
|     #define and #undef functionality are defined as is standard for
|     C++ preprocessors for macro definitions both with and without
|     macro parameters.
|
| At least as far as I can tell, GCC allows '#undef __FILE__'.
| Furthermore, there are desktop OpenGL conformance tests that expect
| '#undef __VERSION__' and '#undef GL_core_profile' to work.
|
| Fixes:
| GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex
| GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment
| GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex
| GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment
|
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
| Cc: [email protected]
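The rule described in the commit above can be sketched as a single predicate. This is only an illustration under stated assumptions: the function name and parameters are hypothetical and are not taken from the glcpp sources; the only facts used are the ones in the commit message (the restriction exists in GLSL ES 3.00 and later, and not in GLSL ES 1.00 or desktop GLSL).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not the actual glcpp code: decide whether
 * "#undef" of a pre-defined macro should be reported as an error. */
bool
undef_of_builtin_is_error(bool is_gles, int version)
{
   assert(version >= 0);

   /* Only GLSL ES 3.00 and later carry the spec text that forbids
    * undefining a built-in macro; desktop GLSL and GLSL ES 1.00 do not. */
   return is_gles && version >= 300;
}
```

A predicate like this needs the real version number, not just a flag saying that a version has been seen.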
* glcpp: Track the actual version instead of just the version_resolved flag | Ian Romanick | 2016-08-10 | 2 | -6/+6
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Cc: [email protected]
* glsl: remove remaining tabs in link_uniform_initializers.cpp | Timothy Arceri | 2016-08-11 | 1 | -39/+39
| | | | Reviewed-by: Eric Anholt <[email protected]>
* glsl: use UniformHash to find storage location | Timothy Arceri | 2016-08-11 | 1 | -18/+11
| | | | | | There is no need to be looping over all the uniforms. Reviewed-by: Eric Anholt <[email protected]>
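To illustrate why a hash lookup removes the need to loop over all the uniforms, here is a minimal sketch of a name-to-index table. It assumes that UniformHash behaves like a map from uniform name to storage index; the table type, the function names, and the fixed capacity below are all hypothetical and are not Mesa's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64 /* hypothetical capacity, must be a power of two */

/* Open-addressing table; must be zero-initialized before use. */
struct name_table {
   const char *keys[TABLE_SIZE];
   unsigned values[TABLE_SIZE];
};

/* 32-bit FNV-1a string hash. */
static uint32_t
fnv1a(const char *s)
{
   uint32_t h = 2166136261u;
   for (; *s; s++) {
      h ^= (uint8_t)*s;
      h *= 16777619u;
   }
   return h;
}

void
name_table_insert(struct name_table *t, const char *name, unsigned index)
{
   uint32_t slot = fnv1a(name) & (TABLE_SIZE - 1);
   unsigned probes = 0;

   /* Linear probing: walk forward until a free slot or the same key. */
   while (t->keys[slot] != NULL && strcmp(t->keys[slot], name) != 0) {
      slot = (slot + 1) & (TABLE_SIZE - 1);
      probes++;
      assert(probes < TABLE_SIZE); /* the table must not be full */
   }
   t->keys[slot] = name;
   t->values[slot] = index;
}

/* Returns 1 and writes *index on a hit, 0 on a miss. */
int
name_table_find(const struct name_table *t, const char *name, unsigned *index)
{
   uint32_t slot = fnv1a(name) & (TABLE_SIZE - 1);

   for (unsigned probes = 0; probes < TABLE_SIZE; probes++) {
      if (t->keys[slot] == NULL)
         return 0;
      if (strcmp(t->keys[slot], name) == 0) {
         *index = t->values[slot];
         return 1;
      }
      slot = (slot + 1) & (TABLE_SIZE - 1);
   }
   return 0;
}
```

With a table like this, finding the storage for one name costs a hash plus a few probes instead of a string comparison against every uniform.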
* glsl: remove dead builtins before assigning varying locations | Timothy Arceri | 2016-08-11 | 1 | -9/+9
| | | | | | | | Builtins already have locations assigned so this shouldn't change anything. We want to call it earlier so we can transform GLSL IR to NIR earlier. Reviewed-by: Eric Anholt <[email protected]>
* glsl: split out varying and uniform linking code | Timothy Arceri | 2016-08-11 | 1 | -207/+222
| | | | | | | | | | | | | Here a new function link_varyings_and_uniforms() is created; this should help make it easier to follow the code in link_shader(), which was getting very large. Note the end of the new function contains a for loop with some lowering calls that currently don't seem related to varyings or uniforms, but they are a dependency for converting to NIR earlier, so we move things here now to keep things easy to follow. Reviewed-by: Eric Anholt <[email protected]>
* i965/vec4: Make opt_vector_float reset at the top of each block | Jason Ekstrand | 2016-08-10 | 1 | -80/+82
| | | | | | | | | | | The pass isn't really control-flow aware and you can get into a case where it tries to combine instructions from different blocks. This can actually lead to an assertion failure when removing unneeded instructions if part of the vector is set in one block and part in another. This prevents regressions in the next commit. Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
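The idea of resetting the pass at each block boundary can be sketched on a toy instruction list. This is not the i965 code: the struct, its fields, and the function are hypothetical, and the "combine" step is reduced to OR-ing write masks of adjacent writes to the same register. The point is only that a candidate from a previous block is never merged with an instruction from the current one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-component immediate write. */
struct imm_write {
   int block;          /* basic block the instruction lives in */
   int dst_reg;        /* destination register number */
   unsigned writemask; /* 4-bit xyzw mask */
};

/* Merge adjacent writes to the same register in place and return the
 * new instruction count. A block change acts as a reset: the previous
 * candidate is kept as-is and a new one is started. */
size_t
combine_imm_writes(struct imm_write *insts, size_t count)
{
   size_t out = 0;

   for (size_t i = 0; i < count; i++) {
      assert((insts[i].writemask & ~0xfu) == 0);

      int can_merge = out > 0 &&
                      insts[out - 1].block == insts[i].block &&
                      insts[out - 1].dst_reg == insts[i].dst_reg;
      if (can_merge)
         insts[out - 1].writemask |= insts[i].writemask;
      else
         insts[out++] = insts[i];
   }
   return out;
}
```

In this sketch, four single-component writes to the same register split across two blocks collapse into two instructions, one per block, rather than one instruction spanning both.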
* mesa: Use a temporary set to track whether we've added a resource yet. | Eric Anholt | 2016-08-10 | 1 | -26/+50
| | | | | | | Saves another .1s on servo.trace. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* prog_hash_table: Convert to using util/hash_table.h. | Eric Anholt | 2016-08-10 | 2 | -205/+54
| | | | | | | | | | | | | | | | | | Improves glretrace -b servo.trace (a trace of Mozilla's servo rendering engine booting, rendering a page, and exiting) from 1.8s to 1.1s. It uses a large uniform array of structs, making a huge number of separate program resources, and the fixed-size hash table was killing it. Given how many times we've improved performance by swapping the hash table to util/hash_table.h, just do it once and for all. This just rebases the old hash table API on top of util/, for minimal diff. Cleaning things up is left for later, particularly because I want to fix up the new hash table API a little bit. v2: Add UNUSED to the now-unused parameter. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* prog_hash_table: Convert compare funcs to match util/hash_table.h. | Eric Anholt | 2016-08-10 | 2 | -7/+11
| | | | | | | | I'm going to replace this hash table with util/hash_table.h, and the first step is to compare things the same way. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: Drop an unused program/hash_table.h include. | Eric Anholt | 2016-08-10 | 1 | -1/+0
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* swr: [rasterizer core] unused variable warning fixes | Tim Rowley | 2016-08-10 | 3 | -12/+0
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] add core string to JitManager | Tim Rowley | 2016-08-10 | 4 | -6/+10
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] fix OOB check of viewport indices | Tim Rowley | 2016-08-10 | 1 | -2/+2
| | | | | | Use correct comparison intrinsic for OOB check of viewport indices. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer common] add linux definition for InterlockedAdd64 | Tim Rowley | 2016-08-10 | 1 | -0/+2
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] add VMASKSTOREPS intrinsic | Tim Rowley | 2016-08-10 | 1 | -0/+1
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] add mask support for odd format fetch | Tim Rowley | 2016-08-10 | 1 | -15/+26
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] routing of viewport indexes through frontend | Tim Rowley | 2016-08-10 | 6 | -27/+91
| | | | | | Viewport transform performed based on per-prim viewport index if available. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] split FE and BE stats | Tim Rowley | 2016-08-10 | 11 | -59/+95
| | | | | | | | | | | Separated FE stats out into its own structure. There are 17 FE vs 3 BE stat fields. Since there is only one FE thread per DC then we don't have to loop over all threads and sum up FE stats over all the worker threads. This also reduces size of DC since we only need to store one copy of the FE stats and not one per worker. Finally, we can use the new FE callback mechanism to update these. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] remove all old stats code | Tim Rowley | 2016-08-10 | 7 | -92/+0
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] viewport array support | Tim Rowley | 2016-08-10 | 8 | -34/+49
| | | | | | Change viewport matrix storage from AOS to SOA to support viewport arrays. Signed-off-by: Tim Rowley <[email protected]>
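A structure-of-arrays layout for viewport terms might look like the sketch below. The struct, field names, and the MAX_VIEWPORTS limit are assumptions made for illustration and are not the SWR definitions; the sketch only shows the general shape of keeping one array per matrix term and selecting an entry with a per-primitive viewport index.

```c
#include <assert.h>

#define MAX_VIEWPORTS 16 /* hypothetical limit */

/* SOA layout: one array per matrix term, indexed by viewport. */
struct viewport_matrices {
   float scale_x[MAX_VIEWPORTS];
   float scale_y[MAX_VIEWPORTS];
   float offset_x[MAX_VIEWPORTS];
   float offset_y[MAX_VIEWPORTS];
};

/* Map normalized device coordinates to window coordinates using the
 * viewport selected by `index`. */
void
viewport_transform(const struct viewport_matrices *vp, unsigned index,
                   float ndc_x, float ndc_y, float *win_x, float *win_y)
{
   assert(index < MAX_VIEWPORTS);
   *win_x = ndc_x * vp->scale_x[index] + vp->offset_x[index];
   *win_y = ndc_y * vp->scale_y[index] + vp->offset_y[index];
}
```

One plausible motivation for this layout is that a SIMD implementation can gather the same term for several primitives, each with its own viewport index, from a single contiguous array.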
* swr: [rasterizer jitter] fetch support for offsetting VertexID | Tim Rowley | 2016-08-10 | 2 | -4/+16
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] fundamentally change how stats work | Tim Rowley | 2016-08-10 | 7 | -19/+94
| | | | | | Add a per draw stats callback to update driver stats. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] add rasterizerSampleCount to PS context | Tim Rowley | 2016-08-10 | 2 | -0/+6
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] remove cygwin threads.cpp stubs | Tim Rowley | 2016-08-10 | 1 | -14/+0
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] allow override of KNOB thread settings | Tim Rowley | 2016-08-10 | 6 | -70/+53
|
| - Remove HYPERTHREADED_FE support
| - Add threading info as optional data passed to SwrCreateContext.
|   If supplied this data will override any KNOB thread settings.
|
| Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] add SwrWaitForIdleFE | Tim Rowley | 2016-08-10 | 4 | -14/+51
| | | | | | | This is a blocking call that waits until all FE work is complete. This is useful for waiting for FE work to complete such as for streamout. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] change threadsDone to be a 32-bit value. | Tim Rowley | 2016-08-10 | 3 | -5/+5
| | | | Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] update trivial accept test conditions | Tim Rowley | 2016-08-10 | 1 | -3/+6
| | | | | | | enable/disable raster tile trivial accept test based on scissor enable trait. Can be optimized further. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] improve implementation for SoWriteOffset | Tim Rowley | 2016-08-10 | 8 | -31/+63
|
| 1. SoWriteOffset is no longer treated as a stat
| 2. Added callback from core to update streamout write offset
|
| Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer common] make disabled asserts always print (but not break) | Tim Rowley | 2016-08-10 | 1 | -5/+3
| | | | Signed-off-by: Tim Rowley <[email protected]>
* vl/rbsp: add a check for emulation prevention three byte | Leo Liu | 2016-08-10 | 1 | -2/+12
|
| This is the case when the "00 00 03" is very close to the beginning of
| nal unit header
|
| v2: move the check to rbsp init
|
| Signed-off-by: Leo Liu <[email protected]>
| Reviewed-by: Christian König <[email protected]>
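For readers unfamiliar with the "00 00 03" pattern: in H.264/HEVC byte streams, an emulation prevention byte 0x03 is inserted after two zero bytes so that the payload cannot imitate a start code, and a parser has to skip it. The sketch below shows the general technique on a plain byte buffer. It is not the vl/rbsp code (which, per the commit, performs a check at rbsp init); the function name and the copy-to-output design are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `len` bytes from `in` to `out`, dropping every 0x03 that
 * directly follows two zero bytes. `out` must hold at least `len`
 * bytes. Returns the number of bytes written. */
size_t
strip_emulation_prevention(const uint8_t *in, size_t len, uint8_t *out)
{
   size_t n = 0;
   unsigned zeros = 0; /* consecutive zero bytes seen so far */

   assert(in != NULL && out != NULL);

   for (size_t i = 0; i < len; i++) {
      if (zeros >= 2 && in[i] == 0x03) {
         zeros = 0; /* drop the emulation prevention byte */
         continue;
      }
      zeros = (in[i] == 0x00) ? zeros + 1 : 0;
      out[n++] = in[i];
   }
   return n;
}
```

Because the zero counter starts at the first byte of the buffer, this sketch also handles a "00 00 03" sequence that appears right at the beginning of the data.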
* Re-apply "glsl: don't try to lower non-gl builtins as if they were gl_FragData" | Ilia Mirkin | 2016-08-10 | 1 | -1/+2
| | | | | | | | | | | | | If a shader has an output array, it will get treated as though it were gl_FragData and rewritten into gl_out_FragData instances. We only want this to happen on the actual gl_FragData and not everything else. This is a small part of the problem pointed out by the below bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radeonsi: set CB_COLORn_INFO.ROUND_MODE | Marek Olšák | 2016-08-10 | 1 | -0/+5
| | | | | | | just do what the register spec says Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: set CB_COLORn_INFO.SIMPLE_FLOAT | Marek Olšák | 2016-08-10 | 1 | -0/+1
| | | | | | | | This can help enable some blend optimizations (see the register spec). Vulkan always sets this. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: disallow MIN/MAX blend equations for dual source blending | Marek Olšák | 2016-08-10 | 1 | -0/+10
| | | | | Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: only set dual source blending for MRT0 | Marek Olšák | 2016-08-10 | 1 | -0/+4
| | | | | | | | | | | | | | This is the proper fix for Overlord and Witcher 2 hangs. The hang condition is that 1 app must write to MRT0 and MRT1 from a pixel shader while MRT1 is disabled in CB_TARGET_MASK (does this generate unflushable pixel quads? I don't know), and another app (e.g. Glamor) must enable dual source blending in both MRT0 and MRT1. The hw gets confused, which leads to corruption and hangs. Cc: 12.0 11.2 <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* st/mesa: in ATI fs don't assume TEMP0=REG0 | Miklós Máté | 2016-08-10 | 1 | -2/+3
| | | | | | | The temporaries are allocated dynamically. Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* st/nine: Fix invalid attempt to use indirect draws. | Trevor Davenport | 2016-08-10 | 1 | -0/+1
| | | | | | | | Since commit 6d7177f01b231e9fe79a558c28d2b562a218d7ea, radeonsi would take a different path if info->indirect_params was not initialized properly. Nine was not initializing this field. Signed-off-by: Marek Olšák <[email protected]>
* util: Use win32 intrinsics for util_last_bit if present. | Mathias Fröhlich | 2016-08-10 | 1 | -0/+12
|
| v2: Split into two patches.
| v3: Fix off by one problem.
|
| Signed-off-by: Mathias Fröhlich <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Tested-by: Brian Paul <[email protected]>
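A "last bit" helper with an optional MSVC intrinsic path could look roughly like this. It is a sketch, not the Mesa function: the name last_bit is hypothetical, and it assumes the convention that the result is the 1-based position of the highest set bit, with 0 meaning no bits are set. Under that convention the zero-based index returned by _BitScanReverse needs a + 1, which is the kind of place where an off-by-one can creep in.

```c
#include <assert.h>
#include <stdint.h>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Returns the 1-based position of the highest set bit, or 0 if u == 0. */
unsigned
last_bit(uint32_t u)
{
#if defined(_MSC_VER)
   unsigned long index;
   /* _BitScanReverse writes a zero-based index and returns 0 when the
    * input is zero, hence the + 1 on the hit path. */
   return _BitScanReverse(&index, u) ? (unsigned)index + 1 : 0;
#else
   /* Portable fallback: count how many shifts it takes to clear u. */
   unsigned r = 0;
   while (u) {
      r++;
      u >>= 1;
   }
   assert(r <= 32);
   return r;
#endif
}
```

For example, under the stated convention an input of 0x10 yields 5 and an input with only the top bit set yields 32.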
* gallium/radeon: use unflushed fences for deferred flushes (v2) | Marek Olšák | 2016-08-10 | 1 | -1/+43
|
| +23% Bioshock Infinite performance.
|
| v2: - use the new fence_finish interface
|     - allow deferred fences with multiple contexts
|     - clear the ctx pointer after a deferred flush
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: set the ctx parameter of fence_finish | Marek Olšák | 2016-08-10 | 1 | -7/+18
| | | | | | for deferred flushes Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add a pipe_context parameter to fence_finish | Marek Olšák | 2016-08-10 | 39 | -46/+73
| | | | | | | | required by glClientWaitSync (GL 4.5 Core spec) that can optionally flush the context Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: use PIPE_USAGE_STREAM for GL_CLIENT_STORAGE_BIT without READ_BIT (v2) | Marek Olšák | 2016-08-10 | 1 | -3/+7
| | | | | | v2: keep STAGING for GL_MAP_READ_BIT Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: add HUD queries for mapped VRAM/GTT | Marek Olšák | 2016-08-10 | 2 | -0/+12
| | | | | | mainly for monitoring visible VRAM congestion Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/radeon: track the amount of mapped memory | Marek Olšák | 2016-08-10 | 3 | -1/+18
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: track the amount of mapped memory | Marek Olšák | 2016-08-10 | 5 | -1/+26
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: don't try to unmap userptr buffers | Marek Olšák | 2016-08-10 | 1 | -0/+3
| | | | | | no app calls this AFAIK Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: increase the size of the renderer string | Marek Olšák | 2016-08-10 | 1 | -1/+1
| | | | | | Mine is longer than 64 bytes. Reviewed-by: Nicolai Hähnle <[email protected]>