summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* mesa: fix MSVC build break in extensions.hBrian Paul2015-11-121-1/+3
| | | | Reviewed-by: Emil Velikov <[email protected]>
* gallium: add support for gl_HelperInvocation semanticIlia Mirkin2015-11-121-1/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Glenn Kennard <[email protected]>
* glsl: add gl_HelperInvocation system valueIlia Mirkin2015-11-121-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: In helpers, only check driver capability for metaNanley Chery2015-11-125-1/+20
| | | | | | | | | | | Make API context and version checks done by the helper functions pass unconditionally while meta is in progress. This transparently makes extension checks solely dependent on struct gl_extensions while in meta. v2: Use an 8-bit data type instead of a GLuint Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa/extensions: Prefix global struct and extension typeNanley Chery2015-11-122-23/+23
| | | | | | | | | | | | | Rename the following types and variables: * struct extension -> struct mesa_extension, like the mesa_format type. * extension_table -> _mesa_extension_table, like the _mesa_extension_override_{enables,disables} structs. Suggested-by: Marek Olšák <[email protected]> Suggested-by: Chad Versace <[email protected]> Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa: Generate a helper function for each extensionNanley Chery2015-11-123-22/+44
| | | | | | | | | | | | | | | | | Generate functions which determine if an extension is supported in the current context. Initially, enums were going to be explicitly used with _mesa_extension_supported(). The idea to embed the function and enums into generated helper functions was suggested by Kristian Høgsberg. For performance, the function body no longer uses _mesa_extension_supported() and, as suggested by Chad Versace, the functions are also declared static inline. v2: Place function qualifiers on separate line (Chad) v3: Move function curly brace to new line (Chad) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa/extensions: Replace extension::api_set with ::versionNanley Chery2015-11-122-336/+331
| | | | | | | | | | | | | | | The api_set field has no users outside of _mesa_extension_supported(). Remove it and allow the version field to take its place. The brunt of the transformation was performed with the following vim commands: s/\(GL [^,]\+\),\s*\d*,\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1, GLL, GLC\2\3/g s/\(GLL [^,]\+\)\,\s*\d*/\1, GLL/g s/\(GLC [^,]\+\)\(,\s*\d*\),\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1\2, GLC\3\4/g s/\( ES1[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4, ES1/g s/\( ES2[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4\6, ES2/g Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa/extensions: Use _mesa_extension_supported()Nanley Chery2015-11-122-46/+14
| | | | | | | | Replace open-coded checks for extension support with _mesa_extension_supported(). Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa/extensions: Create _mesa_extension_supported()Nanley Chery2015-11-121-0/+18
| | | | | | | | | | | | Create a function which determines if an extension is supported in the current context. v2: Use common variable names (Emil) Insert new line between variables and return statement (Chad) Rename api_set variable to api_bit (Chad) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa/extensions: Add extension::versionNanley Chery2015-11-122-320/+334
| | | | | | | | | | | | | | | | Enable limiting advertised extension support by context version with finer granularity. This new field is currently unused and is set to 0 everywhere. When it is used, a value of 0 will indicate that the extension is supported for any version of a context. v2: Use uint*t type for version and note the expected values (Emil) Use an 8-bit data type Reformat macro for better readability (Chad) v3: Note preparatory nature of commit (Chad) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa/extensions: Move entries entries to separate fileNanley Chery2015-11-123-325/+327
| | | | | | | | | | With this infrastructure set in place, we can now reuse the entries to generate useful code. v2: Add the new file into Makefile.sources (Emil) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa/extensions: Wrap array entries in macrosNanley Chery2015-11-121-327/+328
| | | | | | | | | | | | | Now that we're using macros, remove the redundant text from each entry. Remove comments between the entries to make editing easier and separate the sections with blank lines. Structure the EXT macros in a way that helps reviewers verify that no meaning has been altered. v2: Indent the entries (Chad) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa/extensions: Remove array sentinelNanley Chery2015-11-121-18/+25
| | | | | | | | Simplify future updates to the extension struct array by removing the sentinel. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Check instructions appear only on supported hardware.Matt Turner2015-11-121-0/+254
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add initial assembly validation pass.Matt Turner2015-11-125-0/+174
| | | | | | | Initially just checks that sources are non-NULL, which would have alerted us to the problem fixed by commit 6c846dc5. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add annotation_insert_error() and support for printing errors.Matt Turner2015-11-122-7/+71
| | | | | | | | Will allow annotations to contain error messages (indicating an instruction violates a rule for instance) that are printed after the disassembly of the block. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Combine assembly annotations if possible.Matt Turner2015-11-121-5/+18
| | | | | | | | Often annotations are identical between sets of consecutive instructions. We can perhaps avoid some memory allocations by reusing the previous annotation. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Set annotation_info's mem_ctx.Matt Turner2015-11-123-2/+5
| | | | | | It was being memset to 0 previously. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Don't consider control flow instructions to have sources.Matt Turner2015-11-121-8/+8
| | | | | | | | | | | | And why did IFF have a destination? I suspect that once upon a time the disassembler used this information to know which fields to find the jump targets in. The jump targets have moved, so the disassembler has to know how to handle these per-generation anyway. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fill out instruction list.Matt Turner2015-11-123-14/+42
| | | | | | | | | | | Add some instructions: illegal, movi, sends, sendsc. Remove some instructions with reused opcodes: msave, mrestore, push, pop, goto. I did have some gross code for disassembling opcodes per-generation, but there's very little meaningful overlap so it's probably not needed. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Consolidate is_3src() functions.Matt Turner2015-11-123-8/+7
| | | | Otherwise I'll have to add another later in this series.
* mesa: validate precision of varyings during ValidateProgramPipelineTapani Pälli2015-11-123-0/+80
| | | | | | | | | | | Fixes following failing ES3.1 CTS tests: ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingFloat ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingInt ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingUInt Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZESamuel Iglesias Gonsálvez2015-11-121-2/+14
| | | | | | | | | | | FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo's sampler message. This patch adjusts the number of registers written by the opcode following what the PRM spec says about the number of registers written by the SIMD8 and SIMD16's writeback messages for sampler messages. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/skl/gt4: Fix URB programming restriction.Ben Widawsky2015-11-111-0/+9
| | | | | | | | | | | | | | The comment in the code details the restriction. Thanks to Ken for having a very helpful conversation with me, and spotting the blurb in the link I sent him :P. There are still stability problems for me on GT4, but this definitely helps with some of the failures. v2: Comment fixes Cc: [email protected] Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: implement ARB_clear_textureIlia Mirkin2015-11-112-0/+30
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: add ARB_enhanced_layoutsTimothy Arceri2015-11-122-0/+2
| | | | Reviewed-by: Emil Velikov <[email protected]>
* i965: Split nir_emit_intrinsic by stage with a general fallback.Kenneth Graunke2015-11-112-277/+381
| | | | | | | | | | | | | | | | | | | | | | Many intrinsics only apply to a particular stage (such as discard). In other cases, we may want to interpret them differently based on the stage (such as load_primitive_id or load_input). The current method isn't that pretty - we handle all intrinsics in one giant function. Sometimes we assert on stage, sometimes we forget. Different behaviors are handled via if-ladders based on stage. This commit introduces new nir_emit_<stage>_intrinsic() functions, and makes nir_emit_instr() call those. In turn, those fall back to the generic nir_emit_intrinsic() function for cases they don't want to handle specially. This makes it clear which intrinsics only exist in one stage, and makes it easy to handle inputs/outputs differently for various stages. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* mesa/copyimage: allow width/height to not be multiples of blockIlia Mirkin2015-11-111-3/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | For compressed textures, the image size is not necessarily a multiple of the block size (e.g. the last mip levels). Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core Profile spec says: An INVALID_VALUE error is generated if the dimensions of either subregion exceeds the boundaries of the corresponding image object, or if the image format is compressed and the dimensions of the subregion fail to meet the alignment constraints of the format. and Section 8.7 (Compressed Texture Images) says: An INVALID_OPERATION error is generated if any of the following conditions occurs: * width is not a multiple of four, and width + xoffset is not equal to the value of TEXTURE_WIDTH. * height is not a multiple of four, and height + yoffset is not equal to the value of TEXTURE_HEIGHT. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860 Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: [email protected]
* i965/brw_reg: Add a brw_VxH_indirect helperJason Ekstrand2015-11-111-0/+11
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: remove old comments in arrayobj.cBrian Paul2015-11-111-5/+0
|
* i965: Print force_writemask_all in dump_instructions().Kenneth Graunke2015-11-112-0/+6
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Combine BRW_NEW_*_BINDING_TABLE dirty bits.Kenneth Graunke2015-11-115-26/+14
| | | | | | | | | | | A while back, we moved to directly emitting the Gen7+ state when constructing the binding tables. These flags are only used on Gen4-6, which emit all the binding table pointers at once. We gain nothing by having separate flags, so combine them. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Map GL_PATCHES to 3DPRIM_PATCHLIST_n.Kenneth Graunke2015-11-112-1/+10
| | | | | | | | | | | | | | | | Inspired by a patch by Fabian Bieler. Fabian defined a _3DPRIM_PATCHLIST_0 macro (which isn't actually a valid topology type); I instead chose to make a macro that takes an argument. He also took the number of patch vertices from _mesa_prim (which was set to ctx->TessCtrlProgram.patch_vertices) - I chose to use it directly to avoid the need for the VBO patch. v2: Change macro to 0x20 + (n - 1) instead of 0x1F + n to better match the documentation (suggested by Ian). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is ↵Eduardo Lima Mitev2015-11-101-0/+31
| | | | | | | | | | | | | | | | | | | | | | | a const When both fadd and fmul instructions have at least one operand that is a constant and it is only used once, the total number of instructions can be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because the constants will be progagated as immediate operands of fmul and fadd. This patch detects these situations and prevents fusing fmul+fadd into ffma. Shader-db results on i965 Haswell: total instructions in shared programs: 6235835 -> 6225895 (-0.16%) instructions in affected programs: 1124094 -> 1114154 (-0.88%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 7612 HURT: 843 GAINED: 4 LOST: 0 Reviewed-by: Jason Ekstrand <[email protected]>
* nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driverEduardo Lima Mitev2015-11-104-1/+272
| | | | | | | | | Because the next patch will add an optimization that is specific to i965, we want to move this loweing pass to that driver altogether. This is safe because i965 is the only consumer. Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Lower UBO and SSBO access in glsl linkerKristian Høgsberg Kristensen2015-11-105-3/+5
| | | | | | | | | | | All GLSL IR consumers run this lowering pass so we can move it to the linker. This moves the pass up quite a bit, but that's the point: it needs to run before we throw away information about per-component vector access. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* glsl: Drop exec_list argument to lower_ubo_referenceKristian Høgsberg Kristensen2015-11-102-2/+2
| | | | | | | | | | | We always pass in shader->ir and we already pass in the shader, so just drop the exec_list. Most passes either take just a exec_list or a shader, so this seems more consistent. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* st/mesa: Destroy buffer object's mutex.Jose Fonseca2015-11-101-0/+1
| | | | | | | | Ideally we should have a _mesa_cleanup_buffer_object function in src/mesa/bufferobj.c so that the destruction logic resided in a single place. Reviewed-by: Brian Paul <[email protected]>
* i965/fs: Use regs_read/written for post-RA scheduling in calculate_depsJason Ekstrand2015-11-071-11/+4
| | | | | | | | | | | | Previously, we were assuming that everything read/wrote exactly 1 logical GRF (1 in SIMD8 and 2 in SIMD16). This isn't actually true. In particular, the PLN instruction reads 2 logical registers in one of the components. This commit changes post-RA scheduling to use regs_read and regs_written instead so that we add enough dependencies. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92770 Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* i965/nir/fs: Add comment for no-op memory barrier functionsFrancisco Jerez2015-11-061-0/+19
| | | | Reviewed-by: Jordan Justen <[email protected]>
* i965/nir/fs: Implement new barrier functions for compute shadersJordan Justen2015-11-061-0/+7
| | | | | | | | | | | | | | | | | | | | | | For these nir intrinsics, we emit the same code as nir_intrinsic_memory_barrier: * nir_intrinsic_memory_barrier_atomic_counter * nir_intrinsic_memory_barrier_buffer * nir_intrinsic_memory_barrier_image We treat these nir intrinsics as no-ops: * nir_intrinsic_group_memory_barrier * nir_intrinsic_memory_barrier_shared v3: * Add comment for no-op cases (curro) v4: * Moving comment to a separate patch authored by curro Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* mesa: report enum name in glClientActiveTexture() error stringBrian Paul2015-11-051-1/+2
| | | | As we do for glActiveTexture(). Trivial.
* i965: Fix scalar VS float[] and vec2[] output arrays.Kenneth Graunke2015-11-054-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The scalar VS backend has never handled float[] and vec2[] outputs correctly (my original code was broken). Outputs need to be padded out to vec4 slots. In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4. However, this is wrong: type_size_scalar() for a float[2] would return 2, or for vec2[2] it would return 4. This looked like a single slot, even though in reality each array element would be stored in separate vec4 slots. Because of this bug, outputs[] and output_components[] would not get initialized for the second element's VARYING_SLOT, which meant emit_urb_writes() would skip writing them. Nothing used those values, and dead code elimination threw a party. To fix this, we introduce a new type_size_vec4_times_4() function which pads array elements correctly, but still counts in scalar components, generating correct indices in store_output intrinsics. Normally, varying packing avoids this problem by turning varyings into vec4s. So this doesn't actually fix any Piglit or dEQP tests today. However, if varying packing is disabled, things would be broken. Tessellation shaders can't use varying packing, so this fixes various tcs-input Piglit tests on a branch of mine. v2: Shorten the implementation of type_size_4x to a single line (caught by Connor Abbott), and rename it to type_size_vec4_times_4() (renaming suggested by Jason Ekstrand). Use type_size_vec4 rather than using type_size_vec4_times_4 and then dividing by 4. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/mesa: set debug callback for debug contextsIlia Mirkin2015-11-051-0/+57
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* st/mesa: account for texture views when doing CopyImageSubDataIlia Mirkin2015-11-051-0/+8
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965/fs: Do not mark used surfaces in FS_OPCODE_GET_BUFFER_SIZEIago Toral Quiroga2015-11-052-4/+4
| | | | | | | | Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <[email protected]>
* i965/vec4: Do not mark used surfaces in VS_OPCODE_GET_BUFFER_SIZEIago Toral Quiroga2015-11-052-5/+5
| | | | | | | | Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <[email protected]>
* i965/vec4: Do not mark used direct surfaces in VS_OPCODE_PULL_CONSTANT_LOADIago Toral Quiroga2015-11-053-13/+8
| | | | | | | | | | Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const, do not add unnecessary temporary (Curro) Reviewed-by: Francisco Jerez <[email protected]>
* i965/fs: Do not mark used direct surfaces in UNIFORM_PULL_CONSTANT_LOADIago Toral Quiroga2015-11-052-11/+1
| | | | | | | | Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. Reviewed-by: Francisco Jerez <[email protected]>
* i965/fs: Do not mark direct used surfaces in VARYING_PULL_CONSTANT_LOADIago Toral Quiroga2015-11-053-13/+8
| | | | | | | | | | Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const and remove useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <[email protected]>