summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* i965/cs: Support CS program precompileJordan Justen2015-05-024-0/+41
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add brw_setup_tex_for_precompile. Use in VS, GS & FS.Jordan Justen2015-05-023-24/+24
| | | | | | Suggested-by: Kristian Høgsberg <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Emit compute shader code and upload programsJordan Justen2015-05-023-0/+212
| | | | | | | | | | | | v2: * Don't bother checking for 'gen > 5' (krh) * Populate sampler data in key (krh) v3: * Drop no8 support, and simplify code in several places (Ken) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Set invocation counts based on max_cs_threadsJordan Justen2015-05-021-0/+24
| | | | | | | | | | | | | | For ES, we set the max counts based on SIMD8, which is currently accurate. For desktop GL, we set the max counts based on SIMD16, which can fail in some cases where a SIMD16 program is not currently supported. Therefore, this value is not currently accurate, but will work fine in many cases, and lets us run more test cases. Eventually we want to always be able to generate a SIMD16 program. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add max_cs_threadsJordan Justen2015-05-024-1/+14
| | | | | | | | Add values for gen7 & gen8. These are the number threads in a subslice. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove comment about chv device numbers being preliminaryJordan Justen2015-05-021-3/+0
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Support compute programs in fs_visitorJordan Justen2015-05-024-3/+93
| | | | | | | | | | v2: * Clean out some unneeded code copied from run_fs (krh) * Always use NIR * Split shader time out into a separate commit Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cache: Add support for CS in program state cacheJordan Justen2015-05-024-0/+54
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add brw_cs_prog_data, brw_cs_prog_key and brw_context::cs.Paul Berry2015-05-022-0/+62
| | | | | | | | | | | | [email protected]: * Added brw_cs_prog_key structure * Added brw_cs_prog_data::dispatch_grf_start_reg_16 * Added brw_cs_prog_data::local_size * Added brw_cs_prog_data::simd_size Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add generator support for CS_OPCODE_CS_TERMINATEJordan Justen2015-05-022-0/+36
| | | | | | | | | | v2: * Don't rely on brw_eu* to generate the send instruction. We now generate the send here, and drop the "i965/cs: Add support for the SEND message that terminates a CS thread" brw_eu* patch. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Mark g0 as used by CS_OPCODE_CS_TERMINATEJordan Justen2015-05-021-0/+4
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add emit_cs_terminate to emit CS_OPCODE_CS_TERMINATEJordan Justen2015-05-022-0/+23
| | | | | | | | | | | | v2: * Do more work at the visitor level. g0 is loaded and sent to the generator now. v3: * Use Ken's comment explaining g0 usage Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add CS_OPCODE_CS_TERMINATEJordan Justen2015-05-022-0/+7
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add BRW_NEW_CS_PROG_DATA and BRW_CACHE_CS_PROGJordan Justen2015-05-023-0/+6
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add an INTEL_DEBUG=cs option.Paul Berry2015-05-022-2/+4
| | | | | | | | | At the moment it's not wired up to anything. Later patches will hook it up to the compute shader back-end. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/cs: Add compute support to update_program().Paul Berry2015-05-021-0/+21
| | | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/cs: Update program.c for compute shaders.Paul Berry2015-05-021-0/+3
| | | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/cs: Add inline functions for dealing with compute shaders.Paul Berry2015-05-021-0/+22
| | | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Add BRW_NEW_COMPUTE_PROGRAM state flag.Paul Berry2015-05-022-0/+9
| | | | | | | | | Also add code to brw_upload_state to set it when the compute program changes. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Strip trailing constant zeroes in sample messagesNeil Roberts2015-05-012-0/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a send message is emitted with a message length that is less than required for the message then the remaining parameters default to zero. We can take advantage of this to save a register when a shader passes constant zeroes as the final coordinates to the sample function. I think this might be useful for GLES applications that are using 2D textures to simulate 1D textures. On Skylake it will be useful for shaders that do texelFetch(tex,something,0) which I think is fairly common. This helps more on Skylake because in that case the order of the instruction operands are u,v,lod,r which is good for 2D textures whereas before they were u,lod,v,r which is only good for 1D textures. On Haswell: total instructions in shared programs: 8535730 -> 8533261 (-0.03%) instructions in affected programs: 236968 -> 234499 (-1.04%) helped: 1174 On Skylake: total instructions in shared programs: 10345646 -> 10341237 (-0.04%) instructions in affected programs: 293011 -> 288602 (-1.50%) helped: 1218 Reviewed-by: Matt Turner <[email protected]> v2: Applied suggestions by Kenneth Graunke: - Only apply on Gen5+ - Apply to all texture opcodes, not just TEX and TXF. Moved the optimisation into the loop as suggested by Matt Turner. Fix the array index when there is a header.
* i965/skl: Force the exec size to 8 when initing header for SIMD4x2Neil Roberts2015-05-012-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | On Gen9+ there needs to be a header when sampling using SIMD4x2. The header is set up by copying from the g0 register. Commit 07c571a39f tried to fix this mov instruction to always use an exec size of 8 because previously it was incorrectly using 4. It did this by casting the type of the destination register to vec8. This was done because there is code in brw_set_dest to guess the exec size based on the width of the dest register. However I misunderstood how this works because it is actually only used when the width is less than 8. That means the patch actually changed it to use the default exec size which on SIMD16 would be 16 and the MOV would clobber over the first register in the send message. This patch makes it additionally set the default exec size to 8. This is similar to how the message is set up in fs_generator::generate_tex. I think this wasn't picked up by any Piglit tests because we don't have any fragment shaders that hit this code path so nothing was using SIMD16. However the patch caused failures in deqp tests. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90153 Reviewed-by: Matt Turner <[email protected]> Tested-by: Tapani Pälli <[email protected]>
* i965: Unhardcode a few more stage names and abbreviations.Kenneth Graunke2015-04-302-11/+5
| | | | | | | | | | | | | | | The stage_abbrev and stage_name fields in backend_visitor provide what we need without any additional effort. It also means we'll get the right names for compute shaders, SIMD8 geometry shaders, and both kinds of tessellation shaders. This does unfortunately change the capitalization of the stage abbreviation in the INTEL_DEBUG=optimizer output filenames. It doesn't seem worth adding code to handle, though. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* mesa: add GL_OES_EGL_syncMarek Olšák2015-04-301-0/+1
| | | | | This is an empty extension whose presence means that EGL sync objects can be used with ES contexts.
* i965/blorp: Prepare drawing rectangle for flipped coordinatesTopi Pohjolainen2015-04-301-2/+2
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Add support for layered renderingTopi Pohjolainen2015-04-304-5/+9
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Allow blend state to be set for multiple render targetsTopi Pohjolainen2015-04-303-19/+18
| | | | | | | | | | | Original blorp writes only one buffer per shader invocation. Once the launch mechanism is shared with glsl-based programs there will be need for supporting multiple render targets. Also drop the always constant color write disable settings. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Prepare for attributes other than render positionTopi Pohjolainen2015-04-304-7/+12
| | | | | | | | | | | | | | | Note that the magic number of one in gen7 logic is replaced by BRW_SF_URB_ENTRY_READ_OFFSET ( == 1 also) for clarity. On gen6 the change from zero to one (BRW_SF_URB_ENTRY_READ_OFFSET) has no effect for native blorp as blorp doesn't use any additional attributes. In fact, regular pipeline setup always uses BRW_SF_URB_ENTRY_READ_OFFSET even when there are no additional attributes. Hence the change makes the two (blorp and regular) consistent. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Remove unused argumentsTopi Pohjolainen2015-04-303-21/+12
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/gen7/blorp: Remove unused argumentsTopi Pohjolainen2015-04-301-47/+28
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Allow caller to provide sampler settingsTopi Pohjolainen2015-04-303-8/+14
| | | | | | | v2 (Ken): s/use_unorm_coords/non_normalized_coords/ Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Refactor vertex buffer state setupTopi Pohjolainen2015-04-301-26/+34
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Remove constant parameterTopi Pohjolainen2015-04-303-20/+0
| | | | | | | | This was still needed when we had support for blorp clears but now this is fixed to nop. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/gen8: Expose state base address setupTopi Pohjolainen2015-04-302-2/+5
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/ps/gen8: Refactor state uploadingTopi Pohjolainen2015-04-302-26/+58
| | | | | | | | | v2: Use SET_FIELD() for sampler count, and for that reason added GEN7_PS_SAMPLER_COUNT_MASK. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/ps/gen7: Refactor state uploadingTopi Pohjolainen2015-04-302-20/+45
| | | | | | | | | | | Now the uploading depends only on the input parameters instead of consulting the current gl-state. v2: Rebased on top of sampler count clamping Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Refactor sampler state setupTopi Pohjolainen2015-04-302-22/+47
| | | | | | | | v2 (Matt): Moved * to the name. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Remove dependency to tex object in default color setupTopi Pohjolainen2015-04-301-11/+11
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Refactor and expose brw_upload_binding_table()Topi Pohjolainen2015-04-302-7/+21
| | | | | | | | | | | | Read and write parts of the state stage are also split into explicit arguments allowing future patches to use constant program data. v2 (Ken): s/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/ Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Expose and refactor brw_update_renderbuffer_surfaces()Topi Pohjolainen2015-04-302-21/+35
| | | | | | | | | | Note that brw_update_renderbuffer_surfaces() already had a helper variable which was used in parallel to direct access of the current draw buffer of the context. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Refactor rb surface setup to allow caller to store offsetsTopi Pohjolainen2015-04-305-58/+59
| | | | | | | | | | | Notice that in gen7_wm_surface_state.c there is also indentation change in the surrounding code removing tabs. v2 (Matt): Fixed whitespace: tabs -> spaces Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/gen8: Use constant pointers for reading miptree detailsTopi Pohjolainen2015-04-301-2/+2
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965/ps: Use SET_FIELD() for sampler countTopi Pohjolainen2015-04-303-4/+7
| | | | | | | | The value is actually clamped to 0-16 as sample state pointer can be used to support more than 16 samplers. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Don't try to apply the opt_sampler_eot extension for vsNeil Roberts2015-04-291-0/+3
| | | | | | | | | | | | | | | | | | | The opt_sampler_eot optimisation of fs_visitor effectively assumes that it is running on a fragment shader because it casts the program key to a brw_wm_prog_key. However on Skylake fs_visitor can also be used for vertex shaders. It looks like this usually works anyway because the optimisation is skipped if key->nr_color_regions != 1. However for a vertex shader the key is actually a brw_vs_prog_key so the space for nr_color_regions is probably taken up by key->base.program_string_id. This can end up making nr_color_regions be 1 in which case the function will later assert when the last instruction is not FS_OPCODE_FB_WRITE. This was making the DEQP test suite assert. Presumably this only happens there because that compiles a lot of shaders so it would end up with a high value for program_string_id. Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* util/macros: Move DIV_ROUND_UP to util/macros.hAxel Davy2015-04-291-4/+1
| | | | | | | Move DIV_ROUND_UP to a shared location accessible everywhere Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* mesa: Fix glGetProgramiv(GL_ACTIVE_ATTRIBUTES).Jose Fonseca2015-04-291-2/+4
| | | | | | | | | | | | It's returning random values, because RESOURCE_VAR() is casting different objects into ir_variable pointers. This updates _mesa_count_active_attribs to filter the resources with the same logic used in _mesa_longest_attribute_name_length. https://bugs.freedesktop.org/show_bug.cgi?id=90207 Reviewed-by: Tapani Pälli <[email protected]>
* meta: remove unneeded #include colortab.hBrian Paul2015-04-281-1/+0
| | | | Reviewed-by: Anuj Phogat <[email protected]>
* mesa: remove unneeded #include colortab.hBrian Paul2015-04-281-1/+0
| | | | Reviewed-by: Anuj Phogat <[email protected]>
* mesa: remove unused options var in compile_shader()Brian Paul2015-04-281-3/+0
| | | | Reviewed-by: Anuj Phogat <[email protected]>
* st/mesa: allow glsl version up to 410, enable ARB_shader_precisionIlia Mirkin2015-04-281-2/+4
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965/vs: Remove unnecessary NULL check on generate_code() result.Kenneth Graunke2015-04-271-2/+1
| | | | | | | | | Code generation is not allowed to fail for any reason - in fact, fs_generator has no mechanism for failing. The visitor is responsible for that. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>