aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_vs.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: consolidate generation checkTimothy Arceri2016-07-071-6/+6
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: don't copy VS attribute work arounds for HSW+Timothy Arceri2016-07-071-2/+4
| | | | | | | These workarounds are not required for HSW and above so stop copying them at VS key generation which is called at draw time. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: make more effective use of SamplersUsedTimothy Arceri2016-07-051-2/+1
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: move vs outputs written into a helperTimothy Arceri2016-06-221-31/+43
| | | | | | | We will reuse this for fs key generation for the on disk shader cache. Reviewed-by: Iago Toral Quiroga <[email protected]>
* mesa: Rename CoordReplaceBits back to CoordReplace.Mathias Fröhlich2016-06-161-1/+1
| | | | | | | | It used to be called like that and fits better with 80 columns. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* i965: Convert i965 to use CoordsReplaceBits.Mathias Fröhlich2016-06-161-5/+1
| | | | | | | | Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* i965: Keep track of the per-thread scratch allocation in brw_stage_state.Francisco Jerez2016-06-131-5/+3
| | | | | | | | | | | | | | | | This will be used to find out what per-thread slot size a previously allocated scratch BO was used with in order to fix a hardware race condition without introducing additional stalls or memory allocations. Instead of calling brw_get_scratch_bo() manually from the various codegen functions, call a new helper function that keeps track of the per-thread scratch size and conditionally allocates a larger scratch BO. v2: Handle BO allocation manually instead of relying on brw_get_scratch_bo (Ken). Cc: <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Shrink stage_prog_data param array lengthJordan Justen2016-05-291-3/+1
| | | | | | | | | | | | | | It appears we were over-allocating these arrays. Previously we would use nir->num_uniforms directly for scalar programs, and multiply it by 4 for vec4 programs. Instead we should have been dividing by 4 in both cases to convert from bytes to a gl_constant_value count. The size of gl_constant_value is 4 bytes. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Mark brw const in brw_state_dirty and callers.Kenneth Graunke2016-05-161-1/+1
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add support for GL_ARB_cull_distanceKristian Høgsberg Kristensen2016-05-131-0/+4
| | | | | Signed-off-by: Kristian Høgsberg Kristensen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Switch TCS gl_program/gl_shader_program checks over to TES.Kenneth Graunke2015-12-221-1/+1
| | | | | | | | | | | | | | | Tessellation control shaders are optional, but evaluation shaders will always be present when using tessellation. However, we'll always enable the TCS (HS) hardware stage when using tessellation - we'll just create a program on the fly. That program, however, won't have a gl_program or gl_shader_program. So we shouldn't check brw->tess_ctrl_program or shader_prog->_LinkedShaders[MESA_SHADER_TESS_CTRL] - if we want to know whether tessellation is enabled, we should look for a TES. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Force VS -> TCS varyings to use the SSO VUE map layout.Kenneth Graunke2015-12-141-1/+3
| | | | | | | | | | | | The compact VUE map only works when varying packing is in use. Unfortunately, varying packing is disabled for TCS inputs. This is needed to fix Piglit's tcs-input-read-array-interface test. v2: Make lines fit in 80 columns (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: De-duplicate key_debug() function.Kenneth Graunke2015-12-021-10/+0
| | | | | | | | | | | | This appeared in brw_vs.c and brw_wm.c, should have appeared in brw_gs.c, and was soon going to have to be in brw_tcs.c and brw_tes.c as well. So, instead, move it to a central location (which has to know about both struct brw_context and perf_debug()). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Push down inclusion of brw_program.h.Matt Turner2015-11-241-0/+1
| | | | | | | We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <[email protected]>
* i965: Convert scalar_* flags to a scalar_stage array.Kenneth Graunke2015-11-161-3/+4
| | | | | | | | | I was going to add scalar_tcs and scalar_tes flags, and then thought better of it and decided to convert this to an array. Simpler. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Print input/output VUE maps on INTEL_DEBUG=vs, gs.Kenneth Graunke2015-11-131-1/+5
| | | | | | | | | | | | I've been carrying around a patch to do this for the last few months, and it's been exceedingly useful for debugging GS and tessellation problems. I've caught lots of bugs by inspecting the interface expectations of two adjacent stages. It's not that much spam, so I figure we may as well just print it. Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Matt Turner <[email protected]>
* i965: Do legacy userclipping in OpenGL ES 1.x contexts.Ian Romanick2015-10-301-1/+2
| | | | | | | | | | | | | | | Commit fba4823a disabled user clipping for everything except compatibility profile. Core profile and OpenGL ES 2.0+ have all removed the classic, OpenGL 1.0 user clip planes. ES 1.x, however, still has them. Fixes OpenGL ES 1.1 conformance mustpass.c and userclip.c Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Olivier Berthier <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92639 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92641
* mesa: replace UsesClipDistance with ClipDistanceArraySizeMarek Olšák2015-10-201-1/+1
| | | | | | | This is more practical and needed by gallium. Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965/vs: Move URB entry_size and read_length calculations to compile_vsJason Ekstrand2015-10-191-34/+0
| | | | Reviewed-By: Eduardo Lima Mitev <[email protected]>
* i965: Rename brw_foo_emit to brw_compile_fooJason Ekstrand2015-10-191-5/+5
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/vs: Rework vs_emit to take a nir_shader and a brw_compilerJason Ekstrand2015-10-191-2/+14
| | | | | | | | | | | This commit removes all dependence on GL state by getting rid of the brw_context parameter and the GL data structures. v2 (Jason Ekstrand): - Patch use_legacy_snorm_formula through as a function argument rather than trying to go through the shader key. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/vs: Drop hack that created NIR for fixed function vertex programs.Kenneth Graunke2015-10-171-12/+0
| | | | | | | | Marek made core Mesa call ProgramStringNotify(), which solves this properly. The hack is no longer needed. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Fix unsafe pointer when dumping VS/FS IRIago Toral Quiroga2015-10-121-1/+1
| | | | | | | | | | | | | | | | | | | | | For the VS and FS stages that use ARB_vertex_program or ARB_fragment_program we don't have a shader program, however, when debuging is enabled, we call brw_dump_ir like this: brw_dump_ir("vertex", prog, &vs->base, &vp->program.Base); where vs will be NULL (since prog is NULL). As pointed out by Chris, this &vs->base is not really a dereference, it simply computes a new address that just happens to be 0x0 because the offset of base in brw_shader is 0. Then brw_dump_ir will see a NULL pointer and not do anything. This is why this does not crash at the moment. However, this does not look very safe (it would crash for any location of base that is not the first in brw_shader), so patch it to prevent a potential (even if unlikely) problem in the future. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/vs: Fix a subtlety in the nr_attributes == 0 workaround.Kenneth Graunke2015-10-101-5/+7
| | | | | | | | | | | | | | | | | | | nr_attributes is used to compute first_non_payload_grf, which is the first register we're allowed to use for ordinary register allocation. The hardware requires us to read at least one pair of values, but we're completely free to overwrite that garbage register with whatever we like. Instead of altering nr_attributes, we should alter urb_read_length, which only affects the amount we ask the VF to read. This should save us a register in trivial cases (which admittedly isn't very useful). While we're at it, improve the explanation in the comments. v2: Actually do what I said (caught by Ilia). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/vs: Unify URB entry size/read length calculations between backends.Kenneth Graunke2015-10-101-0/+32
| | | | | | | | | | | | | | | | | Both the vec4 and scalar VS backends had virtually identical URB entry size and read length calculations. We can move those up a level to backend-agnostic code and reuse it for both. Unfortunately, the backends need to know nr_attributes to compute first_non_payload_grf, so I had to store that in prog_data. We could use urb_read_length, but that's nr_attributes rounded up to a multiple of two, so doing so would waste a register in some cases. There's more code to be removed in the vec4 backend, but that will come in a follow-on patch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Move brw_get_shader_time_index() call out of emit functionsKristian Høgsberg Kristensen2015-10-081-1/+5
| | | | | | | | | | brw_get_shader_time_index() is all tangled up in brw_context state and we can't call it from the compiler. Thanks the Jasons recent refactoring, we can just get the index and pass to the emit functions instead. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move brw_select_clip_planes() to brw_shader.cppKristian Høgsberg Kristensen2015-10-081-25/+0
| | | | | | | We call this from the compiler so move it to brw_shader.cpp. Reviewed-by: Topi Pohjolainen <[email protected]> Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move brw_dump_ir() out of brw_*_emit() functionsKristian Høgsberg Kristensen2015-10-081-0/+3
| | | | | | | We move these calls one level up into the codegen functions. Reviewed-by: Topi Pohjolainen <[email protected]> Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move prog_data uniform setup to the codegen levelJason Ekstrand2015-10-021-0/+9
| | | | | | | | | | As of now, uniform setup is more-or-less unified between vec4 and fs and no longer requires the fs_visitor. This makes uniform setup more of a language/API thing than a backend compiler thing. This commit moves setting up the stage_prog_data.params arrays to the same place as we set up the rest of stage_prog_data. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move binding table setup to codegen time.Jason Ekstrand2015-10-021-0/+5
| | | | | | | | | Setting up binding tables really has little to do with the actual process of turning shaders into instructions; it's more part of setting up prog_data. This commit moves it out of the visitors and with the rest of the prog_data setup stuff. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Pull stage_prog_data.nr_params out of the NIR shaderJason Ekstrand2015-10-021-12/+7
| | | | | | | | | | | | | Previously, we had a bunch of code in each stage to figure out how many slots we needed in stage_prog_data.param. This code was mostly identical across the stages and had been copied and pasted around. Unfortunately, this meant that any time you did something special, you had to add code for it to each of these places. In particular, none of the stages took subroutines into account; they were working entirely by accident. By taking this data from the NIR shader, we know the exact number of entries we need and everything goes a bit smoother. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/vs: Move lazy NIR creation to codegen_vs_progJason Ekstrand2015-10-021-0/+13
| | | | | | | | The next commit will add code to codegen_vs_prog that requires the NIR shader to be there in all cases. It doesn't hurt anything to just move it from brw_vs_emit to its only caller. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Get rid of prog_data compare functionsJason Ekstrand2015-09-301-21/+0
| | | | | | They are no longer used. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Simplify handling of VUE map changes.Kenneth Graunke2015-09-261-15/+0
| | | | | | | | | | | | | | | | | | | | The old code was disasterously complex - spread across multiple atoms which may not even run, inspecting the dirty bits to try and decide whether it was necessary to do checks...storing VS information in brw_context...extra flagging... This code tripped me and Carl up very badly when working on the shader cache code. It's very fragile and hard to maintain. Now that geometry shaders only depend on their inputs and don't have to worry about the VS VUE map, we can dramatically simplify this: just compute the VUE map coming out of the geometry shader stage in brw_upload_programs. If it changes, flag it. Done. v2: Also check vue_map.separable. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Don't re-layout varyings for separate shader programs.Kenneth Graunke2015-09-261-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, our VUE map code always assigned slots to varyings sequentially, in one contiguous block. This was a bad fit for separate shaders - the GS input layout depended or the VS output layout, so if we swapped out vertex shaders, we might have to recompile the GS on the fly - which rather defeats the point of using separate shader objects. (Tessellation would suffer from this as well - we could have to recompile the HS, DS, and GS.) Instead, this patch makes the VUE map for separate shaders use a fixed layout, based on the input/output variable's location field. (This is either specified by layout(location = ...) or assigned by the linker.) Corresponding inputs/outputs will match up by location; if there's a mismatch, we're allowed to have undefined behavior. This may be less efficient - depending what locations were chosen, we may have empty padding slots in the VUE. But applications presumably use small consecutive integers for locations, so it hopefully won't be much worse in practice. 3% of Dota 2 Reborn shaders are hurt, but only by 2 instructions. This seems like a small price to pay for avoiding recompiles. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Move perf_debug code to brw_codegen_*_prog()Kristian Høgsberg Kristensen2015-09-141-5/+24
| | | | | | | | | | We're trying to avoid a libdrm dependency in the core compiler, so let's move the perf_debug code one level up from the brw_*_emit() helpers to the brw_codegen_*_prog() helpers. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Optimize VUE map comparisons.Kenneth Graunke2015-09-031-2/+2
| | | | | | | | | | | | The entire VUE map is computed based on the slots_valid bitfield; calling brw_compute_vue_map on the same bitfield will return the same result. So we can simply compare those. struct brw_vue_map is 136 bytes; doing a single 8-byte comparison is much cheaper and should work just as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Don't do legacy userclipping in non-compatibility contexts.Kenneth Graunke2015-09-031-0/+1
| | | | | | | | | | | | | | | | | According to the GLSL 1.50 specification, page 76: "The shader must also set all values in gl_ClipDistance that have been enabled via the OpenGL API, or results are undefined." With this patch, we only enable clip distance writes when the shader actually writes them. We no longer force a value to be written when clip planes are enabled in the API. This could mean the first varying slot would be used as clip distances - I believe it should be the safe kind of undefined behavior. Empirically, it doesn't seem to cause a problem. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Remove the brw_vue_prog_key base class.Kenneth Graunke2015-09-031-11/+11
| | | | | | | | | The legacy userclip fields are only used for the vertex shader, and at that point there's only program_string_id and the tex struct, which are common to all keys. So there's no need for a "VUE" key base class. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Delete the brw_vue_program_key::userclip_active flag.Kenneth Graunke2015-09-031-11/+6
| | | | | | | | | | | | | | | | | There are two uses of this flag. The primary use is checking whether we need to emit code to convert legacy gl_ClipVertex/gl_Position clipping to clip distances. In this case, we also have to upload the clip planes as uniforms, which means setting nr_userclip_plane_consts to a positive value. Checking if it's > 0 works for detecting this case. Gen4-5 also wants to know whether we're doing clipping at all, so it can emit user clip flags. Checking if output_reg[VARYING_SLOT_CLIP_DIST0] is set to a real register suffices for this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Remove legacy clip plane handling from geometry shaders.Kenneth Graunke2015-09-031-17/+8
| | | | | | | | | | | | | | | | | | We only support geometry shaders in core profiles, where gl_ClipVertex doesn't exist. Presumably the even older behavior of clipping to gl_Position isn't supported either. In fact, GLSL 1.50 page 76 claims: "The shader must also set all values in gl_ClipDistance that have been enabled via the OpenGL API, or results are undefined." So we don't need to handle legacy clipping in geometry shaders. I think Paul added this back when we were considering supporting the old GL_ARB_geometry_shader4 extension. This removes a non-orthagonal state dependency on GS compilation. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Reserve enough parameter entries for all image uniforms used in the ↵Francisco Jerez2015-08-111-1/+2
| | | | | | | | | program. v2: Add CS support. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Define and initialize image parameter structure.Francisco Jerez2015-08-111-1/+4
| | | | | | | | | | | | | | | | This will be used to pass image meta-data to the shader when we cannot use typed surface reads and writes. All entries except surface_idx and size are otherwise unused and will get eliminated by the uniform packing pass. size will be used for bounds checking with some image formats and will be useful for ARB_shader_image_size too. surface_idx is always used. v2: Add CS support. Move the image_params array back to brw_stage_prog_data. v3: Improve documentation. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vs: Get rid of brw_vs_compile completely.Kenneth Graunke2015-07-091-12/+8
| | | | | | | | | After tearing it out another level or two, and just passing the key and vp directly, we can finally remove this struct. It also eliminates a pointless memcpy() of the key. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/vec4: Move total_scratch calculation into the visitor.Kenneth Graunke2015-07-091-4/+1
| | | | | | | | This is more consistent with how we do it in the FS backend, and reduces a tiny bit of duplication. It'll also allow for a bit more tidying. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/vec4: Move perf_debug about register spilling into the visitor.Kenneth Graunke2015-07-091-4/+0
| | | | | | | | | | | | This patch makes us only issue the performance warning about register spilling if we actually spilled registers. We also use scratch space for indirect addressing and the like. This is basically commit c51163b0cf7aff0375b1a5ea4cb3da9d9e164044 for the vec4 backend. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Split VUE map handling out of brw_vs.c into brw_vue_map.c.Kenneth Graunke2015-06-221-102/+0
| | | | | | | | | | | This was originally only used by the vertex shader, but it's now used by the geometry shader as well, and will also eventually be used for tessellation control and evaluation shaders. I suspect it will be easier to find in a file named after the concept. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rename brw_compile to brw_codegenJason Ekstrand2015-04-221-3/+3
| | | | | | | | | | | | This name better matches what it's actually used for. The patch was generated with the following command: for file in *; do sed -i -e s/brw_compile/brw_codegen/g $file done Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use device_info instead of the context for computing vue mapsJason Ekstrand2015-04-221-3/+5
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Rename do_<stage>_prog to brw_compile_<stage>_prog (and export)Carl Worth2015-04-021-9/+9
| | | | | | | | | | | | This is in preparation for these functions to be called from other files. This commit is intended to have no functional change. It exists in preparation for some upcoming code movement in preparation for the shader cache. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>