summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* i965/gen9: Use custom MOCS entries set up by the kernel.Francisco Jerez2015-07-162-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of relying on hardware defaults the i915 kernel driver is going program custom MOCS tables system-wide on Gen9 hardware. The "WT" entry previously used for renderbuffers had a number of problems: It disabled caching on eLLC, it used a reserved L3 cacheability setting, and it used to override the PTE controls making renderbuffers always WT on LLC regardless of the kernel's setting. Instead use an entry from the new MOCS tables with parameters: TC=LLC/eLLC, LeCC=PTE, L3CC=WB. The "WB" entry previously used for anything other than renderbuffers has moved to a different index in the new MOCS tables but it should have the same caching semantics as the old entry. Even though the corresponding kernel change ("drm/i915: Added Programming of the MOCS") is in a way an ABI break it doesn't seem necessary to check that the kernel is recent enough because the change should only affect Gen9 which is still unreleased hardware. v2: Update MOCS values for the new Android-incompatible tables introduced in v7 of the kernel patch. Cc: 10.6 <[email protected]> Reference: http://lists.freedesktop.org/archives/intel-gfx/2015-July/071080.html Reviewed-by: Ben Widawsky <[email protected]>
* clover: little OpenCL status code logging cleanEdB2015-07-165-25/+32
| | | | | | | | s/build_error/compile_error in order to match the stored OpenCL status code. Make program::build catch and log every OpenCL error. Make tgsi error triggering uniform with the llvm one. Reviewed-by: Francisco Jerez <[email protected]>
* glsl: avoid compiler's segfault when processing operators with void argumentsRenaud Gaubert2015-07-162-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is done by returning an rvalue of type void in the ast_function_expression::hir function instead of a void expression. This produces (in the case of the ternary) an hir with a call to the void returning function and an assignment of a void variable which will be optimized out (the assignment) during the optimization pass. This fix results in having a valid subexpression in the many different cases where the subexpressions are functions whose return values are void. Thus preventing to dereference NULL in the following cases: * binary operator * unary operators * ternary operator * comparison operators (except equal and nequal operator) Equal and nequal had to be handled as a special case because instead of segfaulting on a forbidden syntax it was now accepting expressions with a void return value on either (or both) side of the expression. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85252 Signed-off-by: Renaud Gaubert <[email protected]> Reviewed-by: Gabriel Laskar <[email protected]> Reviewed-by: Samuel Iglesias Gonsalvez <[email protected]>
* r200: fix some potential big endian issuesRoland Scheidegger2015-07-165-129/+140
| | | | | | | | | | | | The formats chosen (both by texture format choser, fbo storage allocation) are different for big endian not just for rgba8 but also lower bit width formats (why I don't actually know). Even the function to test for renderable formats used different formats, however the actual colorbuffer setup did not. And the blitter did not take that into account neither. Untested (what could possibly go wrong...). Same as for r100. Acked-by: Marek Olšák <[email protected]>
* radeon: fix some potential big endian issuesRoland Scheidegger2015-07-164-90/+76
| | | | | | | | | | | The formats chosen (both by texture format choser, fbo storage allocation) are different for big endian not just for rgba8 but also lower bit width formats (why I don't actually know). Even the function to test for renderable formats used different formats, however the actual colorbuffer setup did not. And the blitter did not take that into account neither. Untested (what could possibly go wrong...). Acked-by: Marek Olšák <[email protected]>
* radeon/r200: mark state atoms as dirty after blitsRoland Scheidegger2015-07-162-0/+24
| | | | | | | | | | | Blit submits lots of packets which are usually handled by state atoms, so these must be dirtied. Not sure if this fixes anything, but it was a concern raised by bug 51658 (with this all issues there seen as actual bugs should be fixed, with the exception of the patch to upload non-used texenv state atoms which I just don't understand). Acked-by: Marek Olšák <[email protected]>
* r200: fix fbo rendering by disabling optimized texture format chooserRoland Scheidegger2015-07-161-1/+13
| | | | | | | | | | | | It is rather unfortunate that we don't know if a texture is going to be used as a rt later, and we lack the means to do something about a format chosen which we can't render to directly, so disable this and always chose renderable format for rgba8 textures. This addresses an issue raised on (old) bug, https://bugs.freedesktop.org/show_bug.cgi?id=51658 with gnome-shell, don't know if that's still applicable but it might fix other things as well. Acked-by: Marek Olšák <[email protected]>
* i965: Fix 32 bit build warnings in intel_get_yf_ys_bo_size()Anuj Phogat2015-07-151-3/+3
| | | | | | | | | | | | | | | Along with fixing the type of pitch parameter, patch also changes the types of few local variables and function return type. Warnings fixed are: intel_mipmap_tree.c:671:7: warning: passing argument 3 of 'intel_get_yf_ys_bo_size' from incompatible pointer type intel_mipmap_tree.c:563:1: note: expected 'uint64_t *' but argument is of type 'long unsigned int *' Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Optimize batchbuffer macros.Matt Turner2015-07-156-42/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously OUT_BATCH was just a macro around an inline function which does brw->batch.map[brw->batch.used++] = dword; When making consecutive calls to intel_batchbuffer_emit_dword() the compiler isn't able to recognize that we're writing consecutive memory locations or that it doesn't need to write batch.used back to memory each time. We can avoid both of these problems by making a local pointer to the next location in the batch in BEGIN_BATCH(). Cuts 18k from the .text size. text data bss dec hex filename 4946956 195152 26192 5168300 4edcac i965_dri.so before 4928956 195152 26192 5150300 4e965c i965_dri.so after This series (including commit c0433948) improves performance of Synmark OglBatch7 by 8.01389% +/- 0.63922% (n=83) on Ivybridge. Reviewed-by: Chris Wilson <[email protected]>
* i965: Add and use USED_BATCH macro.Matt Turner2015-07-156-22/+25
| | | | | | | The next patch will replace the .used field with an on-demand calculation of batchbuffer usage. Reviewed-by: Chris Wilson <[email protected]>
* i965: Split batch emission from relocation functions.Matt Turner2015-07-152-34/+30
| | | | | | | So that everything writing to the batch between BEGIN_BATCH() and ADVANCE_BATCH() goes through OUT_BATCH. Reviewed-by: Chris Wilson <[email protected]>
* i965: Move BEGIN_BATCH() into same control flow as ADVANCE_BATCH().Matt Turner2015-07-151-2/+2
| | | | | | | | | | BEGIN_BATCH() and ADVANCE_BATCH() will contain "do {" and "} while (0)" respectively to allow declaring local variables used by intervening OUT_BATCH macros. As such, BEGIN_BATCH() and ADVANCE_BATCH() need to be in the same control flow. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* osmesa: fix OSMesaPixelsStore typoBrian Paul2015-07-152-2/+2
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91337 Cc: 10.6 <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* vc4: Cache the texture p1 for the sampler.Eric Anholt2015-07-143-49/+69
| | | | | Cuts another 12% of vc4_uniforms.o, in exchange for computing it at CSO creation time.
* vc4: Cache texture p0/p1 setup for the sampler view.Eric Anholt2015-07-143-28/+43
| | | | | In exchange for a bit of space and computation in CSO setup, we cut vc4_uniform.c (draw time) code size by 4.8%.
* vc4: Move uniforms handling to a separate file.Eric Anholt2015-07-143-314/+341
| | | | | The rest of vc4_program.c is about compiling, while this is about uniform emit at draw time.
* vc4: Fix some -Wdouble-promotion warnings.Eric Anholt2015-07-143-6/+6
| | | | | No code generation changes from this, but it'll be useful to have this next time I go checking -Wdouble-promotion.
* i965/cs: Initialize GPGPU Thread CountJordan Justen2015-07-142-0/+25
| | | | | | | | | | | | | | | | This field should always be set for gen8. In the bdw PRM, Volume 2d: Command Reference: Structures under INTERFACE_DESCRIPTOR_DATA, DWORD 6, Bits 9:0, Number of Threads in GPGPU Thread Group: "This field should not be set to 0 even if the barrier is disabled, since an accurate value is needed for proper pre-emption." In the HSW PRM, the it doesn't mention that it must always be set, but it should not hurt. Reported-by: Kristian Høgsberg <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* vc4: Fix compiler warnings on release builds.Eric Anholt2015-07-144-7/+14
|
* vc4: Add better debug for register allocation failure.Eric Anholt2015-07-141-1/+5
|
* vc4: Drop reloc_count tracking for debug asserts on non-debug builds.Eric Anholt2015-07-141-0/+10
| | | | Cuts another 88 bytes of compiled code.
* vc4: Rework cl handling to be friendlier to the compiler.Eric Anholt2015-07-146-152/+203
| | | | | Drops 680 bytes of code, from avoiding a bunch of extra updates to the next pointer in the struct.
* vc4: Make a helper function for getting the current offset in the CL.Eric Anholt2015-07-144-20/+21
| | | | | | I needed to rewrite this a bit for safety checking in the next commit. Despite being a static inline of the same thing that was being done, we lose 36 bytes of code for some reason.
* vc4: Drop separate cl*_reloc_hindex().Eric Anholt2015-07-141-18/+6
| | | | | | Now that RCL generation is in the kernel, we don't have any other callers. Oddly, the compiler generates another 8 bytes of code for this, but the simplification is worth it.
* vc4: Store reloc pointers as pointers, not offsets.Eric Anholt2015-07-141-5/+5
| | | | | | | Now that we don't resize the CL as we build (it's set up at the top by vc4_start_draw()), we can store the pointers instead of offsets from the base. Saves a bit of math in emitting relocs (about 60 bytes of code).
* vc4: Add perf debug for when we wait on BOs.Eric Anholt2015-07-144-43/+72
|
* i965: Mark constant static data as const.Matt Turner2015-07-142-23/+23
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* glsl: Lower shader storage buffer object loads to GLSL IR instrinsicsSamuel Iglesias Gonsalvez2015-07-141-8/+65
| | | | | | | | Extend the existing lower_ubo_reference pass to also detect SSBO loads and lower them to __intrinsic_load_ssbo intrinsics. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* glsl: Lower shader storage buffer object writes to GLSL IR instrinsicsSamuel Iglesias Gonsalvez2015-07-141-130/+311
| | | | | | | | Extend the existing lower_ubo_reference pass to also detect SSBO writes and lower them to __intrinsic_store_ssbo intrinsics. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* glsl: Don't do copy propagation on buffer variablesIago Toral Quiroga2015-07-141-1/+1
| | | | | | | | | | | Since the backing storage for these is shared we cannot ensure that the value won't change by writes from other threads. Normally SSBO accesses are not guaranteed to be syncronized with other threads, except when memoryBarrier is used. So, we might be able to optimize some SSBO accesses, but for now we always take the safe path and emit the SSBO access. Reviewed-by: Jordan Justen <[email protected]>
* glsl: Don't do constant variable on buffer variablesIago Toral Quiroga2015-07-141-0/+7
| | | | | | | | | | | Since the backing storage for these is shared we cannot ensure that the value won't change by writes from other threads. Normally SSBO accesses are not guaranteed to be syncronized with other threads, except when memoryBarrier is used. So, we might be able to optimize some SSBO accesses, but for now we always take the safe path and emit the SSBO access. Reviewed-by: Jordan Justen <[email protected]>
* glsl: Don't do constant propagation on buffer variablesIago Toral Quiroga2015-07-141-0/+8
| | | | | | | | | | | Since the backing storage for these is shared we cannot ensure that the value won't change by writes from other threads. Normally SSBO accesses are not guaranteed to be syncronized with other threads, except when memoryBarrier is used. So, we might be able to optimize some SSBO accesses, but for now we always take the safe path and emit the SSBO access. Reviewed-by: Jordan Justen <[email protected]>
* glsl: Do not kill dead assignments to buffer variables or SSBO declarations.Iago Toral Quiroga2015-07-141-3/+6
| | | | | | | | | | | | | | | If we kill dead assignments we lose the buffer writes. Also, we never kill UBO declarations even if they are never referenced by the shader, they are always considered active. Although the spec does not seem say this specifically for SSBOs, it is probably implied since SSBOs are pretty much the same as UBOs, only that you can write to them. v2: - Fix the comment (Jordan) Reviewed-by: Jordan Justen <[email protected]>
* glsl: Don't do tree grafting on buffer variablesIago Toral Quiroga2015-07-141-4/+5
| | | | | | Otherwise we can lose writes into the buffers backing the variables. Reviewed-by: Jordan Justen <[email protected]>
* mesa: Implement _mesa_BindBufferRange for target GL_SHADER_STORAGE_BUFFERIago Toral Quiroga2015-07-141-0/+37
| | | | Reviewed-by: Jordan Justen <[email protected]>
* mesa: Implement _mesa_BindBufferBase for target GL_SHADER_STORAGE_BUFFERIago Toral Quiroga2015-07-141-0/+56
| | | | Reviewed-by: Jordan Justen <[email protected]>
* mesa: Implement _mesa_BindBuffersRange for target GL_SHADER_STORAGE_BUFFERIago Toral Quiroga2015-07-141-0/+110
| | | | | | | v2: - Fix error message (Jordan) Reviewed-by: Jordan Justen <[email protected]>
* mesa: Implement _mesa_BindBuffersBase for target GL_SHADER_STORAGE_BUFFERIago Toral Quiroga2015-07-142-0/+149
| | | | | | | v2: - Add space before const (Jordan) Reviewed-by: Jordan Justen <[email protected]>
* mesa: Implement _mesa_DeleteBuffers for target GL_SHADER_STORAGE_BUFFERIago Toral Quiroga2015-07-141-0/+11
| | | | | | | v2: - Remove the extra spaces (Jordan) Reviewed-by: Jordan Justen <[email protected]>
* mesa: Initialize and free shader storage buffersIago Toral Quiroga2015-07-141-0/+19
| | | | | | | v2: - Fix indention, used tabs instead of whitespaces. (Jordan) Reviewed-by: Jordan Justen <[email protected]>
* glsl: fix error messages in invalid declarations of shader storage blocksSamuel Iglesias Gonsalvez2015-07-141-7/+8
| | | | | | | | | | Due to GL_ARB_shader_storage_buffer_object extension, shader storage blocks have the same limitations as uniform blocks. This patch fixes the corresponding error messages. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* glsl: buffer variables cannot be defined outside interface blocksSamuel Iglesias Gonsalvez2015-07-141-0/+12
| | | | | | | | | | | | Section 4.3.7 "Buffer Variables", GLSL 4.30 spec: "Buffer variables may only be declared inside interface blocks (section 4.3.9 “Interface Blocks”), which are then referred to as shader storage blocks. It is a compile-time error to declare buffer variables at global scope (outside a block)." Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* glsl: shader buffer variables cannot have initializersSamuel Iglesias Gonsalvez2015-07-141-0/+9
| | | | | | | | | | | | Section 4.3.7 "Buffer Variables" of the GLSL 4.30 spec: "Buffer variables cannot have initializers." v2: - Rewrite error message (Jordan) Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* glsl: enable binding layout qualifier usage for shader storage buffer objectsSamuel Iglesias Gonsalvez2015-07-142-6/+26
| | | | | | | | | | | | See GLSL 4.30 spec, section 4.4.5 "Uniform and Shader Storage Block Layout Qualifiers". v2: - Add whitespace in an error message. Delete period '.' at the end of that error message (Jordan). Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* mesa: add MaxShaderStorageBlocks to struct gl_program_constantsSamuel Iglesias Gonsalvez2015-07-142-0/+5
| | | | | | | | v2: - Set MaxShaderStorageBlocks to 8. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* mesa: Add shader storage buffer support to struct gl_contextIago Toral Quiroga2015-07-144-0/+51
| | | | | | | | | | | This includes the array of bindings, the current buffer bound to the GL_SHADER_STORAGE_BUFFER target and a set of general limits and default values for shader storage buffers. v2: - Use spec values for the new defined constants (Jordan) Reviewed-by: Jordan Justen <[email protected]>
* glsl: Identify active uniform blocks that are buffer blocks as such.Iago Toral Quiroga2015-07-144-0/+11
| | | | Reviewed-by: Jordan Justen <[email protected]>
* glsl: link buffer variables and shader storage buffer interface blocksKristian Høgsberg2015-07-144-9/+21
| | | | Reviewed-by: Jordan Justen <[email protected]>
* glsl: Implement parser support for 'buffer' qualifierKristian Høgsberg2015-07-146-9/+42
| | | | | | | This is used to identify shader storage buffer interface blocks where buffer variables are declared. Reviewed-by: Jordan Justen <[email protected]>
* nir: add nir_var_shader_storageIago Toral Quiroga2015-07-146-8/+20
| | | | Reviewed-by: Jordan Justen <[email protected]>