Commit message | Author | Age | Files | Lines
* glsl: Guard against error_type in the tree rebalancer. | Kenneth Graunke | 2014-07-16 | 1 | -1/+3
    This helped me track down the bug fixed in the previous commit.

    Signed-off-by: Kenneth Graunke <[email protected]>
* glsl: Make the tree rebalancer bail on matrix operands. | Kenneth Graunke | 2014-07-16 | 1 | -1/+3
    It doesn't handle things like (vector * matrix) correctly, and
    apparently Matt's intention was to bail.

    Fixes shader compilation in Natural Selection 2.

    Signed-off-by: Kenneth Graunke <[email protected]>
* Revert "i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams." | Kenneth Graunke | 2014-07-16 | 2 | -26/+7
    This reverts commit 3178d2474ae5bdd1102fb3d76a60d1d63c961ff5.

    This caused GPU hangs on Ivybridge for some users and huge (80%)
    performance regressions across the board on multiple platforms.

    We need to find a better solution. I've made several attempts, but
    none of them have worked yet. In the meantime, we should revert this.

    Reverting it breaks GL_PRIMITIVES_GENERATED for non-zero streams, but
    that's okay, since we don't expose GL_ARB_gpu_shader5 yet.

    Fixes Piglit's EXT_transform_feedback/generatemipmap prims_generated
    test case on Haswell.
* ilo: add some missing formats | Chia-I Wu | 2014-07-16 | 1 | -21/+22
    Map more pipe formats to hardware formats. Enable more VB formats on
    Haswell.
* ilo: update and tailor the surface format table | Chia-I Wu | 2014-07-16 | 1 | -286/+258
    Recreate the table from scratch with the help of a pdf-table-to-csv
    converter. Switch to a form that is more suitable for ilo.
* i965: Don't copy propagate abs into Broadwell logic instructions. | Kenneth Graunke | 2014-07-15 | 2 | -12/+6
    It's not clear what abs on logical instructions means on Broadwell,
    and it doesn't appear to do anything sensible.

    Fixes 270 Piglit tests (the bitand/bitor/bitxor tests with abs).

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81157
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Cc: "10.2" <[email protected]>
* i965/fs: Use WE_all for gl_SampleID header register munging. | Kenneth Graunke | 2014-07-15 | 1 | -5/+9
    This code should execute without regard to the currently executing
    channels. Asking for gl_SampleID inside control flow might break in
    strange ways. It appears to break even at the top of the program in
    SIMD16 mode occasionally as well.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Cc: [email protected]
* i965/fs: Set force_uncompressed and force_sechalf on samplepos setup. | Kenneth Graunke | 2014-07-15 | 1 | -6/+8
    gen8_fs_generator uses these to decide whether to set the execution
    size to 8 or 16, so we incorrectly made both of these MOVs the full
    width in SIMD16 shaders. (It happened to work out on Gen4-7.)

    Setting them should also help inform optimization passes what's
    really going on, which could help avoid bugs.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Cc: [email protected]
* i965: Set execution size to 8 for instructions with force_sechalf set. | Kenneth Graunke | 2014-07-15 | 1 | -1/+1
    Both inst->force_uncompressed and inst->force_sechalf mean that the
    generated instruction should be uncompressed and have an execution
    size of 8. We don't require the visitor to set both flags - setting
    inst->force_sechalf by itself is supposed to be enough.

    On Gen4-7, guess_execution_size() demoted instructions to 8-wide
    based on the default compression state. On Gen8+, we instead set a
    default execution size, which worked great...except that we forgot
    to check inst->force_sechalf when deciding whether to use 8 or 16.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Cc: [email protected]
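The rule described in this message can be sketched in a few lines of C. This is only an illustration, not the generator's actual code: `sketch_inst` and `sketch_exec_size` are hypothetical names, and the dispatch width is assumed to be passed in as 8 or 16.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a backend instruction. */
struct sketch_inst {
    bool force_uncompressed; /* instruction must not be compressed */
    bool force_sechalf;      /* instruction covers the second half of a SIMD16 thread */
};

/* Either flag alone is enough to select an execution size of 8.
 * Otherwise the instruction runs at the full dispatch width. */
static unsigned
sketch_exec_size(const struct sketch_inst *inst, unsigned dispatch_width)
{
    assert(dispatch_width == 8 || dispatch_width == 16);

    if (inst->force_uncompressed || inst->force_sechalf)
        return 8;

    return dispatch_width;
}
```

Checking only `force_uncompressed` in the condition would reproduce the bug described above: an instruction with just `force_sechalf` set would come out 16 wide in a SIMD16 shader.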
* nvc0: fix translate path for PRIM_RESTART_WITH_DRAW_ARRAYS | Christoph Bumiller | 2014-07-15 | 1 | -13/+28
    Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add support for indirect drawing | Christoph Bumiller | 2014-07-15 | 10 | -32/+223
    Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: check if a fence has already been signalled | Ilia Mirkin | 2014-07-15 | 1 | -0/+3
    nouveau_fence_update does real work unconditionally. Avoid doing that
    if the fence we're checking on has already been signalled.

    Signed-off-by: Ilia Mirkin <[email protected]>
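A minimal sketch of the early-out this message describes, under stated assumptions: every name below (`sketch_fence`, `sketch_fence_update`, the counters) is hypothetical and stands in for the driver's real fence structures, which are not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_fence_state { SKETCH_FENCE_EMITTED, SKETCH_FENCE_SIGNALLED };

/* Hypothetical fence: signalled once the hardware counter reaches `sequence`. */
struct sketch_fence {
    enum sketch_fence_state state;
    unsigned sequence;
};

static unsigned sketch_hw_sequence;  /* pretend hardware progress counter */
static unsigned sketch_update_calls; /* counts how often the costly path runs */

/* Stand-in for the update routine that does real work on every call. */
static void
sketch_fence_update(struct sketch_fence *fence)
{
    sketch_update_calls++;
    if (sketch_hw_sequence >= fence->sequence)
        fence->state = SKETCH_FENCE_SIGNALLED;
}

/* Return early when the fence is already known to be signalled. */
static bool
sketch_fence_signalled(struct sketch_fence *fence)
{
    assert(fence != NULL);

    if (fence->state == SKETCH_FENCE_SIGNALLED)
        return true;

    sketch_fence_update(fence);
    return fence->state == SKETCH_FENCE_SIGNALLED;
}
```

Once a fence has been observed as signalled, later queries skip the update call entirely.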
* glsl: Don't declare variables in for-loop declaration. | Matt Turner | 2014-07-15 | 1 | -2/+2
    Reported-by: Brian Paul <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* exec_list: Make various places use the new length() method. | Connor Abbott | 2014-07-15 | 6 | -23/+9
    Instead of hand-rolling it.

    v2 [mattst88]: Rename get_size to length.
                   Expand comment in ir_reader.

    Reviewed-by: Ian Romanick <[email protected]> [v1]
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Connor Abbott <[email protected]>
* exec_list: Add a function to give the length of a list. | Connor Abbott | 2014-07-15 | 1 | -0/+20
    v2 [mattst88]: Remove trailing whitespace.
                   Rename get_size to length.
                   Mark as const.

    Reviewed-by: Ian Romanick <[email protected]> [v1]
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Connor Abbott <[email protected]>
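For readers unfamiliar with the idea, a length helper for an intrusive list is just a counting walk. The sketch below is written in C on a simplified, NULL-terminated singly linked list; `sketch_node`, `sketch_list` and `sketch_list_length` are hypothetical names, and the real exec_list layout may well differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified list: singly linked and NULL-terminated. */
struct sketch_node {
    struct sketch_node *next;
};

struct sketch_list {
    struct sketch_node *head;
};

/* Count the nodes by walking the list once. The list is taken by
 * const pointer because counting does not modify it. */
static unsigned
sketch_list_length(const struct sketch_list *list)
{
    const struct sketch_node *node;
    unsigned count = 0;

    assert(list != NULL);

    for (node = list->head; node != NULL; node = node->next)
        count++;

    return count;
}
```

The walk is linear in the list size, so callers that need the length repeatedly may prefer to cache it.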
* exec_list: Add a prepend function. | Connor Abbott | 2014-07-15 | 1 | -1/+19
    This complements the existing append function. It's implemented in a
    rather simple way right now; it could be changed if performance is a
    concern.

    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Connor Abbott <[email protected]>
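One simple way to build prepend on top of an existing append is to append the destination onto the source and then move the combined chain back. The C sketch below shows that approach on a hypothetical head/tail list (`sketch_item`, `sketch_chain`). It is an assumption about what "a rather simple way" could look like, not a copy of the committed code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified list with head and tail pointers. */
struct sketch_item {
    struct sketch_item *next;
    int value;
};

struct sketch_chain {
    struct sketch_item *head;
    struct sketch_item *tail;
};

/* Move every item of src to the end of dst; src is left empty. */
static void
sketch_chain_append(struct sketch_chain *dst, struct sketch_chain *src)
{
    assert(dst != NULL && src != NULL && dst != src);

    if (src->head == NULL)
        return;

    if (dst->head == NULL)
        dst->head = src->head;
    else
        dst->tail->next = src->head;

    dst->tail = src->tail;
    src->head = NULL;
    src->tail = NULL;
}

/* Put src's items in front of dst's items by reusing append twice. */
static void
sketch_chain_prepend(struct sketch_chain *dst, struct sketch_chain *src)
{
    sketch_chain_append(src, dst); /* src now holds src + dst, dst is empty */
    sketch_chain_append(dst, src); /* move the combined chain back to dst */
}
```

This costs two pointer splices regardless of list size. A direct splice at the head would save one step, which matches the message's remark that the implementation could change if performance matters.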
* mesa: Don't allow GL_TEXTURE_{LUMINANCE,INTENSITY}_* queries outside compat profile | Ian Romanick | 2014-07-15 | 1 | -2/+7
    There are no queries for GL_TEXTURE_LUMINANCE_SIZE,
    GL_TEXTURE_INTENSITY_SIZE, GL_TEXTURE_LUMINANCE_TYPE, or
    GL_TEXTURE_INTENSITY_TYPE in any version of OpenGL ES or desktop
    OpenGL core profile.

    NOTE: Without changes to piglit, this regresses
    required-sized-texture-formats.

    v2: Rebase on different initial change.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Cc: "10.2" <[email protected]>
* mesa: Don't allow GL_TEXTURE_BORDER queries outside compat profile | Ian Romanick | 2014-07-15 | 1 | -0/+2
    There are no texture borders in any version of OpenGL ES or desktop
    OpenGL core profile.

    Fixes piglit's gl-3.2-texture-border-deprecated.

    v2: Rebase on different initial change.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Cc: "10.2" <[email protected]>
* mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image | Ian Romanick | 2014-07-15 | 1 | -5/+6
    Instead of catching the special case early, handle it by constructing
    a fake gl_texture_image that will cause the values required by the
    OpenGL 4.0 spec to be returned.

    Previously, calling

        glGenTextures(1, &t);
        glBindTexture(GL_TEXTURE_2D, t);
        glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, 0xDEADBEEF, &value);

    would not generate an error.

    Anuj: Can you verify this does not regress
    proxy_textures_invalid_size?

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Suggested-by: Brian Paul <[email protected]>
    Cc: "10.2" <[email protected]>
    Cc: Anuj Phogat <[email protected]>
* i965/fs: Relax interference check in register coalescing. | Matt Turner | 2014-07-15 | 1 | -11/+12
    A similar attempt was made in commit 5ff1e446 and was reverted in
    commit a39428cf after causing a regression in an ES 3 conformance
    test. The test still passes after this commit.

    total instructions in shared programs: 1994827 -> 1992858 (-0.10%)
    instructions in affected programs:     128247 -> 126278 (-1.54%)
    GAINED:                                0
    LOST:                                  1

    Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Perform CSE on sends-from-GRF rather than textures. | Matt Turner | 2014-07-15 | 1 | -1/+1
    Should potentially allow a few more cases, while avoiding doing CSE
    on texture operations on Gen <= 6 with the MRF.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80211
    Reviewed-by: Kenneth Graunke <[email protected]>
    Tested-by: lu hua <[email protected]>
* glsl: Update expression types after rebalancing the tree. | Matt Turner | 2014-07-15 | 1 | -0/+17
    If we saw a tree that looked like

                  vec3
                /      \
             vec3      float
            /    \
         vec3    float
        /    \
     vec3    float

    We would see that all of the expression types were vec3, and then
    rebalance to

               vec3
             /      \
          vec3      vec3    <-- should be float
         /    \    /    \
      vec3  float float  float

    This patch adds code to visit the rebalanced tree and update the
    expression types from the bottom up.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80880
    Reviewed-by: Kenneth Graunke <[email protected]>
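The bottom-up type update can be illustrated with a post-order walk over a toy expression tree. The C sketch below is an assumption-laden simplification: `sketch_expr` is a hypothetical node type, a type is reduced to a component count (1 for float, 3 for vec3), and operands are assumed to be scalars or vectors only, so the result takes the wider operand's component count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression node. `components` is 1 for float, 3 for vec3.
 * Leaves have both children set to NULL. */
struct sketch_expr {
    struct sketch_expr *left;
    struct sketch_expr *right;
    unsigned components;
};

/* Post-order walk: recompute the children first, then derive this
 * node's type from them. Returns the node's updated component count. */
static unsigned
sketch_update_types(struct sketch_expr *expr)
{
    unsigned l, r;

    assert(expr != NULL);
    assert((expr->left == NULL) == (expr->right == NULL));

    if (expr->left == NULL)
        return expr->components; /* leaf: type is already correct */

    l = sketch_update_types(expr->left);
    r = sketch_update_types(expr->right);

    /* Scalar-with-vector yields the vector width; scalar-with-scalar stays scalar. */
    expr->components = l > r ? l : r;
    return expr->components;
}
```

Applied to the rebalanced tree above, the right-hand subtree with two float leaves is corrected from vec3 to float while the root stays vec3.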
* glsl: Add callback_leave to ir_hierarchical_visitor. | Matt Turner | 2014-07-15 | 3 | -73/+126
* i965: Initialize new chunks of realloc'd memory. | Matt Turner | 2014-07-15 | 1 | -0/+4
    Otherwise we'd compare uninitialized pointers with NULL and
    dereference, leading to crashes.

    Reviewed-by: Kenneth Graunke <[email protected]>
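The hazard here is general to `realloc`: the bytes added past the old size have indeterminate contents. A minimal C sketch of the usual remedy follows, with a hypothetical helper name (`sketch_grow_ptr_array`) and an array of plain `void *` slots standing in for whatever the driver actually stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Grow an array of pointers from old_count to new_count slots and set
 * every newly added slot to NULL. Returns the new array, or NULL on
 * allocation failure, in which case the old array is left untouched. */
static void **
sketch_grow_ptr_array(void **array, size_t old_count, size_t new_count)
{
    void **grown;
    size_t i;

    assert(new_count > 0 && new_count >= old_count);

    grown = realloc(array, new_count * sizeof(*grown));
    if (grown == NULL)
        return NULL;

    /* realloc preserves the old slots but leaves the new tail uninitialized. */
    for (i = old_count; i < new_count; i++)
        grown[i] = NULL;

    return grown;
}
```

With the tail cleared, a later `if (array[i] != NULL)` check is reliable for slots that were never filled in.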
* radeon/llvm: Fix LLVM diagnostic error reporting | Tom Stellard | 2014-07-15 | 1 | -7/+4
    We were trying to print the error message after disposing the message
    object.

    Tested-by and Reviewed-by: Aaron Watry <[email protected]>
* util/tgsi: Fix ureg_EMIT/ENDPRIM prototype. | José Fonseca | 2014-07-15 | 1 | -2/+2
    0cbefc1bea703378381afff946e30c27a21f191d added a source argument to
    EMIT/ENDPRIM, but it did not update tgsi_ureg accordingly, causing
    all users of ureg_EMIT/ENDPRIM to fail at runtime with an assertion
    failure.

    Trivial.
* glapi: Use GetProcAddress instead of dlsym on Windows. | Vinson Lee | 2014-07-14 | 1 | -0/+4
    This patch fixes this MinGW build error.

    glapi_gentable.c: In function '_glapi_create_table_from_handle':
    glapi_gentable.c:123:9: error: implicit declaration of function 'dlsym' [-Werror=implicit-function-declaration]
             *procp = dlsym(handle, symboln);
             ^

    Signed-off-by: Vinson Lee <[email protected]>
    Acked-by: Brian Paul <[email protected]>
* ilo: raise texture size limits | Chia-I Wu | 2014-07-15 | 2 | -17/+9
    Report the hardware limits now that max-texture-size piglit test has
    been fixed.
* ilo: move away from drm_intel_bo_alloc_tiled | Chia-I Wu | 2014-07-15 | 5 | -304/+359
    We want to know the exact sizes of the BOs, and the driver has the
    knowledge to do so. Refactoring of the resource allocation code is
    needed though.
* radeonsi: partially revert "switch descriptors to i32 vectors" | Marek Olšák | 2014-07-14 | 1 | -0/+12
    It indeed breaks LLVM 3.4.2.
* i965/vec4: Invalidate live intervals in opt_cse, not _local. | Matt Turner | 2014-07-14 | 1 | -3/+3
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Move aeb list into opt_cse_local. | Matt Turner | 2014-07-14 | 2 | -7/+7
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Invalidate live intervals in opt_cse, not _local. | Matt Turner | 2014-07-14 | 1 | -3/+3
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Move aeb list into opt_cse_local. | Matt Turner | 2014-07-14 | 2 | -7/+7
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Fix aggregates with dynamic initializers. | Cody Northrop | 2014-07-14 | 1 | -3/+14
    Vectors are falling in to the ir_dereference_array() path.

    Without this change, the following glsl aborts the debug driver, or
    gets the wrong answer in release:

        mat2x2 a = mat2( vec2( 1.0, vertex.x ), vec2( 0.0, 1.0 ) );

    Also submitting piglit tests, will reference in bug.

    v2: Rebase on Mesa master.

    v3: Remove unneeded check for arrays, which are covered by
        process_array_constructor(), recommended by Timothy Arceri.

    Signed-off-by: Cody Northrop <[email protected]>
    Reviewed-by: Courtney Goeltzenleuchter <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79373
* Avoid mesa_dri_drivers import lib being installed | Jon TURNEY | 2014-07-13 | 1 | -2/+1
    On Cygwin and MinGW, linking a shared library also generates an
    import library

    Use a wildcard which also matches the name of the megadriver import
    lib, mesa_dri_drivers.dll.a, so that is also removed after megadriver
    symlinks are created

    (This then matches src/gallium/targets/dri/Makefile.am, which already
    does things this way)

    Signed-off-by: Jon TURNEY <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* i965/vec4: Silence warnings about unhandled interpolation ops | Chris Forbes | 2014-07-13 | 1 | -0/+3
    Signed-off-by: Chris Forbes <[email protected]>
* docs: Mark off ARB_gpu_shader5 interpolation functions for i965 | Chris Forbes | 2014-07-13 | 1 | -1/+1
    Signed-off-by: Chris Forbes <[email protected]>
* i965/fs: add support for ir_*_interpolate_at_* expressions | Chris Forbes | 2014-07-13 | 2 | -2/+150
    SIMD8-only for now.

    V5: - Fix style complaints
        - Move prototype to be with other oddball emit functions
        - Use unreachable() instead of assert() where possible

    V6: - Describe what is happening with the clamping
        - Add reg_width to make some expressions clearer

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Skip channel expressions splitting for interpolation | Chris Forbes | 2014-07-13 | 1 | -0/+25
    The backend will have to do a message send, so we want to keep these
    in one piece, just like texture ops.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: add generator support for pixel interpolator query | Chris Forbes | 2014-07-13 | 4 | -0/+59
    V5: - Split into separate opcodes
        - Pass message data in src1 immediate
        - Put noperspective bit in fs_inst rather than adding any junk
          to backend_instruction

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: add low-level support for send to pixel interpolator | Chris Forbes | 2014-07-13 | 2 | -0/+38
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/disasm: add support for pixel interpolator messages | Chris Forbes | 2014-07-13 | 1 | -0/+17
    V3: Rework for brw_inst changes

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add message descriptor bit definitions for pixel interpolator | Chris Forbes | 2014-07-13 | 2 | -0/+16
    These got lost in the big brw_inst shakeup.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/disasm: Disassemble indirect sends more properly | Chris Forbes | 2014-07-12 | 1 | -162/+174
    - Don't try to disassemble send's src1 as a descriptor if it's not
      an immediate.
    - In the same case, show src1 as an operand (makes it easier to see
      bogus register regions, etc -- the hardware is very fussy)

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Avoid crashing while dumping vec4 insn operands | Chris Forbes | 2014-07-12 | 1 | -1/+4
    We'd otherwise go looking into virtual_grf_sizes for things that
    aren't in there at all.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Fix two broken asserts in brw_eu_emit | Chris Forbes | 2014-07-12 | 1 | -2/+2
    These were looking in the wrong field.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: add new interpolateAt* builtin functions | Chris Forbes | 2014-07-12 | 1 | -0/+67
    V2: - Don't assume everyone wants interpolateAtSample() lowered to
          interpolateAtOffset. It turns out this isn't what we want most
          of the time for i965. Lowering can be added later in an ir pass
          which drivers opt into, rather than bolting it straight into
          the builtin definition.
        - Only expose the interpolateAt* builtins in the fragment
          language.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: add new expression types for interpolateAt* | Chris Forbes | 2014-07-12 | 8 | -2/+79
    Will be used to implement interpolateAt*() from ARB_gpu_shader5

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* allow builtin functions to require parameters to be shader inputs | Chris Forbes | 2014-07-12 | 2 | -0/+24
    The new interpolateAt* builtins have strange restrictions on the
    <interpolant> parameter.

    - It must be a shader input, or an element of a shader input array.
    - It must not include a swizzle.

    V2: Don't abuse ir_var_mode_shader_in for this; make a new flag.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>