path: root/src/mesa
Commit message (author, date, files changed, lines removed/added)
* i965/cs: Add generator support for CS_OPCODE_CS_TERMINATE
  (Jordan Justen, 2015-05-02, 2 files, -0/+36)
  v2:
   * Don't rely on brw_eu* to generate the send instruction. We now generate the send here, and drop the "i965/cs: Add support for the SEND message that terminates a CS thread" brw_eu* patch.
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Mark g0 as used by CS_OPCODE_CS_TERMINATE
  (Jordan Justen, 2015-05-02, 1 file, -0/+4)
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add emit_cs_terminate to emit CS_OPCODE_CS_TERMINATE
  (Jordan Justen, 2015-05-02, 2 files, -0/+23)
  v2:
   * Do more work at the visitor level. g0 is loaded and sent to the generator now.
  v3:
   * Use Ken's comment explaining g0 usage
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add CS_OPCODE_CS_TERMINATE
  (Jordan Justen, 2015-05-02, 2 files, -0/+7)
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add BRW_NEW_CS_PROG_DATA and BRW_CACHE_CS_PROG
  (Jordan Justen, 2015-05-02, 3 files, -0/+6)
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add an INTEL_DEBUG=cs option.
  (Paul Berry, 2015-05-02, 2 files, -2/+4)
  At the moment it's not wired up to anything. Later patches will hook it up to the compute shader back-end.
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/cs: Add compute support to update_program().
  (Paul Berry, 2015-05-02, 1 file, -0/+21)
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/cs: Update program.c for compute shaders.
  (Paul Berry, 2015-05-02, 1 file, -0/+3)
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/cs: Add inline functions for dealing with compute shaders.
  (Paul Berry, 2015-05-02, 1 file, -0/+22)
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add BRW_NEW_COMPUTE_PROGRAM state flag.
  (Paul Berry, 2015-05-02, 2 files, -0/+9)
  Also add code to brw_upload_state to set it when the compute program changes.
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Strip trailing constant zeroes in sample messages
  (Neil Roberts, 2015-05-01, 2 files, -0/+50)
  If a send message is emitted with a message length that is less than required for the message then the remaining parameters default to zero. We can take advantage of this to save a register when a shader passes constant zeroes as the final coordinates to the sample function.
  I think this might be useful for GLES applications that are using 2D textures to simulate 1D textures. On Skylake it will be useful for shaders that do texelFetch(tex,something,0) which I think is fairly common. This helps more on Skylake because in that case the order of the instruction operands are u,v,lod,r which is good for 2D textures whereas before they were u,lod,v,r which is only good for 1D textures.
  On Haswell:
    total instructions in shared programs: 8535730 -> 8533261 (-0.03%)
    instructions in affected programs: 236968 -> 234499 (-1.04%)
    helped: 1174
  On Skylake:
    total instructions in shared programs: 10345646 -> 10341237 (-0.04%)
    instructions in affected programs: 293011 -> 288602 (-1.50%)
    helped: 1218
  Reviewed-by: Matt Turner <[email protected]>
  v2: Applied suggestions by Kenneth Graunke:
    - Only apply on Gen5+
    - Apply to all texture opcodes, not just TEX and TXF.
  Moved the optimisation into the loop as suggested by Matt Turner. Fix the array index when there is a header.
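The trimming idea in the commit above can be sketched in isolation. This is only an illustration under stated assumptions: `struct sample_param`, `strip_trailing_zeroes` and the `min_len` floor are hypothetical names invented here, not the types or functions the i965 backend uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one operand of a sampler message. */
struct sample_param {
    bool is_constant;  /* true if the value is known at compile time */
    float value;       /* only meaningful when is_constant is true */
};

/*
 * Return the shortened message length after dropping trailing operands
 * that are constant zero. Operands omitted from the end of the message
 * default to zero, so dropping them does not change the result.
 * min_len is an assumed lower bound on the length (for example, a header
 * plus the first coordinate) that is never trimmed.
 */
size_t strip_trailing_zeroes(const struct sample_param *params,
                             size_t len, size_t min_len)
{
    while (len > min_len &&
           params[len - 1].is_constant &&
           params[len - 1].value == 0.0f)
        len--;
    return len;
}
```

Only trailing zeroes can be dropped: a constant zero followed by a non-zero or non-constant operand must stay, because the later operand still has to occupy its slot in the message.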
* i965/skl: Force the exec size to 8 when initing header for SIMD4x2
  (Neil Roberts, 2015-05-01, 2 files, -0/+2)
  On Gen9+ there needs to be a header when sampling using SIMD4x2. The header is set up by copying from the g0 register. Commit 07c571a39f tried to fix this mov instruction to always use an exec size of 8 because previously it was incorrectly using 4. It did this by casting the type of the destination register to vec8. This was done because there is code in brw_set_dest to guess the exec size based on the width of the dest register. However I misunderstood how this works because it is actually only used when the width is less than 8. That means the patch actually changed it to use the default exec size which on SIMD16 would be 16 and the MOV would clobber over the first register in the send message.
  This patch makes it additionally set the default exec size to 8. This is similar to how the message is set up in fs_generator::generate_tex.
  I think this wasn't picked up by any Piglit tests because we don't have any fragment shaders that hit this code path so nothing was using SIMD16. However the patch caused failures in deqp tests.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90153
  Reviewed-by: Matt Turner <[email protected]>
  Tested-by: Tapani Pälli <[email protected]>
* i965: Unhardcode a few more stage names and abbreviations.
  (Kenneth Graunke, 2015-04-30, 2 files, -11/+5)
  The stage_abbrev and stage_name fields in backend_visitor provide what we need without any additional effort. It also means we'll get the right names for compute shaders, SIMD8 geometry shaders, and both kinds of tessellation shaders.
  This does unfortunately change the capitalization of the stage abbreviation in the INTEL_DEBUG=optimizer output filenames. It doesn't seem worth adding code to handle, though.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* mesa: add GL_OES_EGL_sync
  (Marek Olšák, 2015-04-30, 1 file, -0/+1)
  This is an empty extension whose presence means that EGL sync objects can be used with ES contexts.
* i965/blorp: Prepare drawing rectangle for flipped coordinates
  (Topi Pohjolainen, 2015-04-30, 1 file, -2/+2)
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Add support for layered rendering
  (Topi Pohjolainen, 2015-04-30, 4 files, -5/+9)
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Allow blend state to be set for multiple render targets
  (Topi Pohjolainen, 2015-04-30, 3 files, -19/+18)
  Original blorp writes only one buffer per shader invocation. Once the launch mechanism is shared with glsl-based programs there will be need for supporting multiple render targets.
  Also drop the always constant color write disable settings.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Prepare for attributes other than render position
  (Topi Pohjolainen, 2015-04-30, 4 files, -7/+12)
  Note that the magic number of one in gen7 logic is replaced by BRW_SF_URB_ENTRY_READ_OFFSET ( == 1 also) for clarity.
  On gen6 the change from zero to one (BRW_SF_URB_ENTRY_READ_OFFSET) has no effect for native blorp as blorp doesn't use any additional attributes. In fact, regular pipeline setup always uses BRW_SF_URB_ENTRY_READ_OFFSET even when there are no additional attributes. Hence the change makes the two (blorp and regular) consistent.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Remove unused arguments
  (Topi Pohjolainen, 2015-04-30, 3 files, -21/+12)
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/gen7/blorp: Remove unused arguments
  (Topi Pohjolainen, 2015-04-30, 1 file, -47/+28)
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Allow caller to provide sampler settings
  (Topi Pohjolainen, 2015-04-30, 3 files, -8/+14)
  v2 (Ken): s/use_unorm_coords/non_normalized_coords/
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Refactor vertex buffer state setup
  (Topi Pohjolainen, 2015-04-30, 1 file, -26/+34)
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Remove constant parameter
  (Topi Pohjolainen, 2015-04-30, 3 files, -20/+0)
  This was still needed when we had support for blorp clears but now this is fixed to nop.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/gen8: Expose state base address setup
  (Topi Pohjolainen, 2015-04-30, 2 files, -2/+5)
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/ps/gen8: Refactor state uploading
  (Topi Pohjolainen, 2015-04-30, 2 files, -26/+58)
  v2: Use SET_FIELD() for sampler count, and for that reason added GEN7_PS_SAMPLER_COUNT_MASK.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/ps/gen7: Refactor state uploading
  (Topi Pohjolainen, 2015-04-30, 2 files, -20/+45)
  Now the uploading depends only on the input parameters instead of consulting the current gl-state.
  v2: Rebased on top of sampler count clamping
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Refactor sampler state setup
  (Topi Pohjolainen, 2015-04-30, 2 files, -22/+47)
  v2 (Matt): Moved * to the name.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Remove dependency to tex object in default color setup
  (Topi Pohjolainen, 2015-04-30, 1 file, -11/+11)
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Refactor and expose brw_upload_binding_table()
  (Topi Pohjolainen, 2015-04-30, 2 files, -7/+21)
  Read and write parts of the state stage are also split into explicit arguments allowing future patches to use constant program data.
  v2 (Ken): s/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Expose and refactor brw_update_renderbuffer_surfaces()
  (Topi Pohjolainen, 2015-04-30, 2 files, -21/+35)
  Note that brw_update_renderbuffer_surfaces() already had a helper variable which was used in parallel to direct access of the current draw buffer of the context.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Refactor rb surface setup to allow caller to store offsets
  (Topi Pohjolainen, 2015-04-30, 5 files, -58/+59)
  Notice that in gen7_wm_surface_state.c there is also indentation change in the surrounding code removing tabs.
  v2 (Matt): Fixed whitespace: tabs -> spaces
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/gen8: Use constant pointers for reading miptree details
  (Topi Pohjolainen, 2015-04-30, 1 file, -2/+2)
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/ps: Use SET_FIELD() for sampler count
  (Topi Pohjolainen, 2015-04-30, 3 files, -4/+7)
  The value is actually clamped to 0-16 as sample state pointer can be used to support more than 16 samplers.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Don't try to apply the opt_sampler_eot extension for vs
  (Neil Roberts, 2015-04-29, 1 file, -0/+3)
  The opt_sampler_eot optimisation of fs_visitor effectively assumes that it is running on a fragment shader because it casts the program key to a brw_wm_prog_key. However on Skylake fs_visitor can also be used for vertex shaders. It looks like this usually works anyway because the optimisation is skipped if key->nr_color_regions != 1. However for a vertex shader the key is actually a brw_vs_prog_key so the space for nr_color_regions is probably taken up by key->base.program_string_id. This can end up making nr_color_regions be 1 in which case the function will later assert when the last instruction is not FS_OPCODE_FB_WRITE.
  This was making the DEQP test suite assert. Presumably this only happens there because that compiles a lot of shaders so it would end up with a high value for program_string_id.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
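The hazard described above is that a stage-agnostic pointer gets read as a fragment-shader key no matter which stage owns it. A minimal sketch of guarding the cast with a stage check follows. All types and names are hypothetical simplifications made up for this example (`struct vs_key`, `struct fs_key`, `struct visitor`, `can_opt_sampler_eot`); they are not the real brw_vs_prog_key, brw_wm_prog_key or fs_visitor, and the sketch does not claim to reproduce the three lines the commit added.

```c
#include <stdbool.h>

enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT };

/* Hypothetical keys: the first field of each occupies the same storage,
 * so reading one through a pointer to the other yields unrelated data. */
struct vs_key { unsigned program_string_id; };
struct fs_key { unsigned nr_color_regions; };

/* Hypothetical visitor that can compile either stage. */
struct visitor {
    enum shader_stage stage;
    const void *key;  /* points at a vs_key or an fs_key, depending on stage */
};

/* Only interpret the key as a fragment key when the stage says it is one. */
bool can_opt_sampler_eot(const struct visitor *v)
{
    if (v->stage != STAGE_FRAGMENT)
        return false;

    const struct fs_key *key = v->key;
    return key->nr_color_regions == 1;
}
```

Without the stage check, a vertex key whose program_string_id happens to equal 1 would pass the nr_color_regions test, which matches the failure mode the commit message describes.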
* util/macros: Move DIV_ROUND_UP to util/macros.h
  (Axel Davy, 2015-04-29, 1 file, -4/+1)
  Move DIV_ROUND_UP to a shared location accessible everywhere
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Axel Davy <[email protected]>
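For readers unfamiliar with the macro being moved, here is one common way to write a round-up integer division. This is a sketch only: the exact definition in util/macros.h may be written differently, and `dwords_for_bytes` is a hypothetical helper added purely to show a use.

```c
/* Integer division of a by b, rounded up.
 * Assumes a >= 0 and b > 0, and that a + b - 1 does not overflow. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical example use: number of 4-byte dwords needed for n bytes. */
unsigned dwords_for_bytes(unsigned n)
{
    return DIV_ROUND_UP(n, 4u);
}
```

Both arguments are evaluated more than once, so expressions with side effects should not be passed to a macro written this way.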
* mesa: Fix glGetProgramiv(GL_ACTIVE_ATTRIBUTES).
  (Jose Fonseca, 2015-04-29, 1 file, -2/+4)
  It's returning random values, because RESOURCE_VAR() is casting different objects into ir_variable pointers.
  This updates _mesa_count_active_attribs to filter the resources with the same logic used in _mesa_longest_attribute_name_length.
  https://bugs.freedesktop.org/show_bug.cgi?id=90207
  Reviewed-by: Tapani Pälli <[email protected]>
* meta: remove unneeded #include colortab.h
  (Brian Paul, 2015-04-28, 1 file, -1/+0)
  Reviewed-by: Anuj Phogat <[email protected]>
* mesa: remove unneeded #include colortab.h
  (Brian Paul, 2015-04-28, 1 file, -1/+0)
  Reviewed-by: Anuj Phogat <[email protected]>
* mesa: remove unused options var in compile_shader()
  (Brian Paul, 2015-04-28, 1 file, -3/+0)
  Reviewed-by: Anuj Phogat <[email protected]>
* st/mesa: allow glsl version up to 410, enable ARB_shader_precision
  (Ilia Mirkin, 2015-04-28, 1 file, -2/+4)
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* i965/vs: Remove unnecessary NULL check on generate_code() result.
  (Kenneth Graunke, 2015-04-27, 1 file, -2/+1)
  Code generation is not allowed to fail for any reason - in fact, fs_generator has no mechanism for failing. The visitor is responsible for that.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Enable ARB_gpu_shader5 on Gen8+.
  (Matt Turner, 2015-04-27, 1 file, -6/+2)
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix code emission for imul_high in NIR.
  (Matt Turner, 2015-04-27, 1 file, -1/+23)
  Copy over from brw_fs_visitor.cpp.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix stride for multiply in macro.
  (Matt Turner, 2015-04-27, 1 file, -0/+2)
  We have to use W/UW type for src1 of the multiply in the MUL/MACH macro, but in order to read the low 16-bits of each 32-bit integer, we need to set the appropriate stride.
  Reviewed-by: Kenneth Graunke <[email protected]>
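Why the stride matters can be shown with plain arrays. The sketch below models a register as a sequence of 16-bit words holding 32-bit values low word first; that little-endian layout is an assumption of this example, and `pack_dwords_as_words` and `read_words` are hypothetical helpers, not driver functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Lay out 32-bit values as 16-bit words, low word first
 * (little-endian order is assumed for the modelled register). */
void pack_dwords_as_words(const uint32_t *dwords, size_t count,
                          uint16_t *words)
{
    for (size_t i = 0; i < count; i++) {
        words[2 * i]     = (uint16_t)(dwords[i] & 0xffffu);
        words[2 * i + 1] = (uint16_t)(dwords[i] >> 16);
    }
}

/* Read `count` words, advancing `stride` words between reads. */
void read_words(const uint16_t *words, size_t stride, size_t count,
                uint16_t *out)
{
    for (size_t i = 0; i < count; i++)
        out[i] = words[i * stride];
}
```

With a stride of 2 each read lands on the low half of the next 32-bit value. With a stride of 1 the second read returns the high half of the first value instead, which is the kind of mismatch the commit fixes.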
* Revert "i965/fs: Allow SIMD16 borrow/carry/64-bit multiply on Gen > 7."
  (Matt Turner, 2015-04-27, 1 file, -3/+3)
  This reverts commit 9f5e5bd34d8ba48c851b442fb88f742b1ba6a571.
  I have no idea what made me believe these didn't apply to Gen > 7. They do, and without them we generate bad code that causes failures on Gen 8.
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: fix up GLSL version when computing GL version
  (Ilia Mirkin, 2015-04-27, 1 file, -0/+17)
  In some situations it is convenient for a driver to expose a higher GLSL version while some extensions are still incomplete. However in that situation, it would report a GLSL version that was higher than the GL version. Avoid that situation by limiting the GLSL version to the GL version.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
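The limiting step can be sketched as a lookup followed by a clamp. The mapping of desktop GL versions to GLSL versions (2.0 to 1.10, 2.1 to 1.20, 3.0 to 1.30, 3.1 to 1.40, 3.2 to 1.50, and matching numbers from 3.3 onward) is standard. The function names and the integer encoding (GL 3.3 as 33, GLSL 3.30 as 330) are assumptions of this example, which does not claim to reproduce the code the commit added.

```c
/* Highest GLSL version that corresponds to a desktop GL version.
 * Versions are integers: GL 3.3 is 33, GLSL 3.30 is 330.
 * Returns 0 for GL versions below 2.0, which have no core GLSL. */
int glsl_version_for_gl(int gl_version)
{
    switch (gl_version) {
    case 20: return 110;
    case 21: return 120;
    case 30: return 130;
    case 31: return 140;
    case 32: return 150;
    default: return gl_version >= 33 ? gl_version * 10 : 0;
    }
}

/* Limit the GLSL version a driver advertises to what the GL version allows. */
int clamp_glsl_version(int glsl_version, int gl_version)
{
    int max_glsl = glsl_version_for_gl(gl_version);
    return glsl_version > max_glsl ? max_glsl : glsl_version;
}
```

For example, a driver advertising GLSL 4.10 while only GL 3.3 is complete would report GLSL 3.30 after the clamp.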
* mesa: the function name appears to have a gl prefix already
  (Ilia Mirkin, 2015-04-27, 1 file, -2/+2)
  Currently we're producing errors like
    User error: GL_INVALID_OPERATION in glglDeleteProgramsARB(invalid call)
  And noop_warn appears to be called with the full function name. Don't prepend a gl prefix.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* Fix a few typos
  (Zoë Blade, 2015-04-27, 28 files, -38/+38)
  Reviewed-by: Francisco Jerez <[email protected]>
* i965/gen8: Factor out texture surface state set-up from gen8_update_texture_surface().
  (Francisco Jerez, 2015-04-27, 1 file, -60/+77)
  This moves most of the surface state set-up logic that can be shared between textures and shader images to a separate function.
* i965/gen7: Factor out texture surface state set-up from gen7_update_texture_surface().
  (Francisco Jerez, 2015-04-27, 2 files, -54/+84)
  This moves most of the surface state set-up logic that can be shared between textures and shader images to a separate function.