aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_program.h
Commit message (Collapse)AuthorAgeFilesLines
* i965: keep gl_program shader info in sync after gather infoTimothy Arceri2016-12-201-1/+1
| | | | | | | | | | It's possible that nir_shader was cloned and it no longer contains a pointer to the shader_info in gl_program. So we need to copy shader_info back to gl_program if that is the case. Fixes a regression with NIR_TEST_CLONE=true Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98840
* i965: only try print GLSL IR once when using INTEL_DEBUG to dump irTimothy Arceri2016-11-171-2/+1
| | | | | | | | Since we started releasing GLSL IR after linking the only time we can print GLSL IR is during linking. When regenerating variants only NIR will be available. Reviewed-by: Emil Velikov <[email protected]>
* i965: create populate key functions for tcs and tesTimothy Arceri2016-09-271-4/+6
| | | | | | These will be used by the on disk shader cache. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: make more effective use of SamplersUsedTimothy Arceri2016-07-051-1/+0
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* glsl/mesa: split gl_shader in twoTimothy Arceri2016-06-301-1/+1
| | | | | | | | | | | | | | | | | There are two distinctly different uses of this struct. The first is to store GL shader objects. The second is to store information about a shader stage thats been linked. The two uses actually share few fields and there is clearly confusion about their use. For example the linked shaders map one to one with a program so can simply be destroyed along with the program. However previously we were calling reference counting on the linked shaders. We were also creating linked shaders with a name even though it is always 0 and called the driver version of the _mesa_new_shader() function unnecessarily for GL shader objects. Acked-by: Iago Toral Quiroga <[email protected]>
* i965: Move brw_create_nir to brw_program.cJason Ekstrand2016-05-261-0/+6
| | | | | | | | This way it's no longer part of libi965_compiler.la since it depends on GLSL and ARB program stuff. Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965: Handle mix-and-match TCS/TES with separate shader objects.Kenneth Graunke2015-12-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GL_ARB_separate_shader_objects allows the application to mix-and-match TCS and TES programs separately. This means that the interface between the two stages isn't known until the final SSO pipeline is in place. This isn't a great match for our hardware: the TCS and TES have to agree on the Patch URB entry layout. Since we store data as per-patch slots followed by per-vertex slots, changing the number of per-patch slots can significantly alter the layout. This can easily happen with SSO. To handle this, we store the [Patch]OutputsWritten and [Patch]InputsRead bitfields in the TCS/TES program keys, introducing program recompiles. brw_upload_programs() decides the layout for both TCS and TES, and passes it to brw_upload_tcs/tes(), which store it in the key. When creating the NIR for a shader specialization, we override nir->info.inputs_read (and friends) to the program key's values. Since everything uses those, no further compiler changes are needed. This also replaces the hack in brw_create_nir(). To avoid recompiles, brw_precompile_tes() looks to see if there's a TCS in the linked shader. If so, it accounts for the TCS outputs, just as brw_upload_programs() would. This eliminates all recompiles in the non-SSO case. In the SSO case, there should only be recompiles when using a TCS and TES that have different input/output interfaces. Fixes Piglit's mix-and-match-tcs-tes test. v2: Pull the brw_upload_programs code into a brw_upload_tess_programs() helper function (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Add tessellation control shaders.Kenneth Graunke2015-12-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Add tessellation evaluation shadersKenneth Graunke2015-12-221-0/+2
| | | | | | | | | | | | | | | | | | | The TES is essentially a post-tessellator VS, which has access to the entire TCS output patch, and a special gl_TessCoord input. Otherwise, they're very straightforward. This patch implements SIMD8 tessellation evaluation shaders for Gen8+. The tessellator can generate a lot of geometry, so operating in SIMD8 mode (8 vertices per thread) is more efficient than SIMD4x2 mode (only 2 vertices per thread). I have another patch which implements SIMD4x2 mode for older hardware (or via an environment variable override). We currently handle all inputs via the pull model. v2: Improve comments (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Push down inclusion of brw_program.h.Matt Turner2015-11-241-0/+2
| | | | | | | We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <[email protected]>
* i965: Move the entire compiler API into a single fileJason Ekstrand2015-10-191-123/+1
| | | | | | | | | | | At this point, the compiler API has been substantially simplified. In the spirit of Kristian's making a compiler library, this commit makes a single header file that contains, more-or-less, the entire compiler API. There's still a bit of cleanup to do particularly in the area of geometry shaders. However, this gets us much closer to having a separate compiler. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Get rid of prog_data compare functionsJason Ekstrand2015-09-301-4/+0
| | | | | | They are no longer used. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gs: Remove the dependency on the VS VUE map.Kenneth Graunke2015-09-261-2/+0
| | | | | | | | | | | | | | | | Because we only support geometry shaders in core profile, we can safely ignore any driver-extending of VS outputs. Those are: - Legacy userclipping (doesn't exist in core profile) - Edgeflag copying (Gen4-5 only, no GS support) - Point coord replacement (Gen4-5 only, no GS support) - front/back color hacks (Gen4-5 only, no GS support) v2: Rebase; leave a comment about why SSO works. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Remove the brw_vue_prog_key base class.Kenneth Graunke2015-09-031-17/+15
| | | | | | | | | The legacy userclip fields are only used for the vertex shader, and at that point there's only program_string_id and the tex struct, which are common to all keys. So there's no need for a "VUE" key base class. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Delete the brw_vue_program_key::userclip_active flag.Kenneth Graunke2015-09-031-6/+3
| | | | | | | | | | | | | | | | | There are two uses of this flag. The primary use is checking whether we need to emit code to convert legacy gl_ClipVertex/gl_Position clipping to clip distances. In this case, we also have to upload the clip planes as uniforms, which means setting nr_userclip_plane_consts to a positive value. Checking if it's > 0 works for detecting this case. Gen4-5 also wants to know whether we're doing clipping at all, so it can emit user clip flags. Checking if output_reg[VARYING_SLOT_CLIP_DIST0] is set to a real register suffices for this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Move brw_setup_tex_for_precompile to brw_program.[ch].Kenneth Graunke2015-09-031-0/+4
| | | | | | | | | | | | This living in brw_fs.{h,cpp} is a historical artifact of us supporting texturing for fragment shaders before any other stages. It's kind of awkward given that we use it for all stages. This avoids having to include brw_fs.h in geometry shader code in order to access this function. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Rename brw_vec4_prog_data/key to brw_bue_prog_data/keyKristian Høgsberg2014-12-101-3/+3
| | | | | | | | | | | These structs aren't vec4 specific, they are shared by shader stages operating on Vertex URB Entries (VUEs). VUEs are the data structures in the URB that hold vertex data between the pipeline geometry stages. Using vue in the name instead of vec4 makes a lot more sense, especially when we add scalar vertex shader support. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make vertex color clamp handling code VS specific.Kenneth Graunke2014-12-021-2/+2
| | | | | | | | | | | | | Vertex color clamping only applies to gl_[Secondary]{Front,Back}Color, which are compatibility-only built-in varyings. We only support GS in core profile, so they can't exist in geometry shaders. We can drop several dirty bits from the GS program key - they're unnecessary for a core profile implementation. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Use the enum type for gen6_gather_wa sampler key field.Kenneth Graunke2014-12-021-7/+7
| | | | | | | Requested by Matt Turner. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Drop use of GL types in program keys.Kenneth Graunke2014-12-021-23/+23
| | | | | | | | This is really far removed from the API; we should just use C types. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Move program key structures to brw_program.h.Kenneth Graunke2014-12-021-5/+103
| | | | | | | | | | | | | | With fs_visitor/fs_generator being reused for SIMD8 VS/GS programs, we're running into weird #include patterns, where scalar code #includes brw_vec4.h and such. Program keys aren't really related to SIMD4X2/SIMD8 execution - they mostly capture NOS for a particular shader stage. Consolidating them all in one place that's vec4/scalar neutral should help avoid problems. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Silence unused parameter warning in brw_dump_irIan Romanick2014-10-241-2/+1
| | | | | | | | | | | | Just remove the parameter. Silences: brw_program.c: In function 'brw_dump_ir': brw_program.c:566:33: warning: unused parameter 'brw' [-Wunused-parameter] Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove unused sampler key fieldsTopi Pohjolainen2014-04-081-6/+0
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Widen sampler key bitfields for 32 samplersChris Forbes2014-03-021-3/+3
| | | | | | | | | | | | | Previously the `high` 16 samplers on Haswell+ would not get sampler workarounds applied. Don't bother widening YUV fields, since they're ignored and going away soon anyway. Signed-off-by: Chris Forbes <[email protected]> Cc: "10.1" <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Refactor debug dumping of GLSL IR.Eric Anholt2014-02-221-0/+5
| | | | | | | | | This was only going to get worse when tesselation shows up, and was causing too much extra duplication in my stderr changes coming up. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Unify fs_generator:: and vec4_generator::mark_surface_used as a free ↵Francisco Jerez2014-02-191-0/+4
| | | | | | | | function. This way it can be used anywhere. I need it from the visitor. Reviewed-by: Paul Berry <[email protected]>
* i965: Move up duplicated fields from stage-specific prog_data to ↵Francisco Jerez2014-02-191-0/+7
| | | | | | | | | | | | | brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and the simplification of the stage-specific brw_*_prog_data_compare. Reviewed-by: Paul Berry <[email protected]>
* i965: Add Gen6 gather wa to sampler keyChris Forbes2014-02-081-0/+11
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/Gen7: Include bitfield in the sampler key for CMS layoutChris Forbes2013-12-071-0/+5
| | | | | | | | | | | | | | | | | | | | We need to emit extra shader code in this case to sample the MCS surface first; we can't just blindly do this all the time since IVB will sometimes try to access the MCS surface even if disabled. V3: Use actual MSAA layout from the texture's mt, rather then computing what would have been used based on the format. This is simpler and less fragile - there's at least one case where we might want to have a texture's MSAA layout change based on what the app does (CMS SINT falling back to UMS if the app ever attempts to render to it with a channel disabled.) This also obsoletes V2's 1/10 -- compute_msaa_layout can now remain an implementation detail of the miptree code. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: w/a for gather4 green RG32FChris Forbes2013-10-031-0/+5
| | | | | | | | | V4: Only flag quirks if there are any uses of gather in the shader, to avoid spurious recompiles just because someone happened to use RG32F. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make brw_{program,vs}.h safe to include from C++.Paul Berry2013-08-231-0/+8
| | | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Shorten sampler loops in key setup.Kenneth Graunke2013-08-191-0/+1
| | | | | | | | | Now that we have the number of samplers available, we don't need to iterate over all 16. This should be particularly helpful for vertex shaders. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Pass brw_context to functions rather than intel_context.Kenneth Graunke2013-07-091-1/+1
| | | | | | | | | | | | | | This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context. Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw." Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* i965: Make perf_debug() output to GL_ARB_debug_output in a debug context.Eric Anholt2013-03-051-1/+2
| | | | | | | | I tried to ensure that performance in the non-debug case doesn't change (we still just check one condition up front), and I think the impact is small enough in the debug context case to warrant including all of it. Reviewed-by: Jordan Justen <[email protected]>
* i965: Add texrect scale parameters before pointers to ParameterValues.Eric Anholt2012-12-281-0/+1
| | | | | | | | | | | | | If adding scale parameters during program compile caused a realloc of ParameterValues, then the driver uniform storage set up by _mesa_associate_uniform_storage() would point to potentially freed memory. Note that this uses TexturesUsed, which may change at runtime for GLSL when sampler uniforms change. This is a flaw in our handling of texrect in general, and not one I'm fixing currently. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Index sampler program key data by linker-assigned index.Kenneth Graunke2012-08-271-1/+1
| | | | | | | | | Now that most things are based on the linker-assigned index, it makes sense to convert the arrays in the VS/WM program key as well. It seems silly to leave them indexed by texture unit. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Add performance debug for shader recompiles.Eric Anholt2012-08-121-0/+2
| | | | | Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move loop over texture units into brw_populate_sampler_prog_key.Kenneth Graunke2012-07-121-1/+2
| | | | | | | | | | | The whole reason I avoided this was because it might operate on a brw_vertex_program or a brw_fragment_program. However, that isn't a problem: all we need is the gl_program base type. This avoids awkwardly passing the loop counter 'i' as a parameter, simplifies both callers, and also plumbs prog in place for future use. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Delete previous workaround for textureGrad with shadow samplers.Kenneth Graunke2012-07-121-12/+0
| | | | | | | | | | | | | | | It had many problems: - The shadow comparison was done post-filtering. - It required state-dependent recompiles whenever the comparison function changed. - It didn't even work: many cases hit assertion failures. - I never implemented it for the VS. The new lowering pass which converts textureGrad to textureLod by computing the LOD value works much better. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Factor out texturing related data from brw_wm_prog_key.Kenneth Graunke2011-12-191-0/+60
The idea is to reuse this for the VS and (in the future) GS as well. v2: Include yuvtex data since we're not dropping GL_MESA_ycbycr. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1] Reviewed-by: Ian Romanick <[email protected]>