aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* i965/nir: Remove the prog parameter from brw_nir_lower_inputsJason Ekstrand2015-10-021-4/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/shader: Get rid of the shader, prog, and shader_prog fieldsJason Ekstrand2015-10-0219-99/+68
| | | | | | | | | | Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transfom feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs,vec4: Get rid of the sanity_param_countJason Ekstrand2015-10-025-31/+0
| | | | | | | | It doesn't exist for anything other than an assert that, as far as I can tell, isn't possible to trip. Soon, we will remove prog from the visitor entirely and this will become even more impossible to hit. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Use nir info instead of pulling things out of [shader_]progJason Ekstrand2015-10-023-10/+9
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use the nir info instead of pulling things out of [shader_]progJason Ekstrand2015-10-023-20/+19
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Move sampler unit lookup into rescale_texcoordJason Ekstrand2015-10-023-14/+13
| | | | | | | | | The texunit variable we create and assign in nir_emit_texture gets passed through two more layers of function calls before it gets to its sole use in rescale_texcoord. The best part is that we already pass the sampler into rescale_texcoord so we can just look it up there. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cs: Remove the prog argument from local_id_payload_dwordsJason Ekstrand2015-10-023-7/+5
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/backend_shader: Add a field to store the NIR shaderJason Ekstrand2015-10-028-41/+39
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir/glsl: Take a gl_shader_program and a stage rather than a gl_shaderJason Ekstrand2015-10-021-2/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move prog_data uniform setup to the codegen levelJason Ekstrand2015-10-026-17/+26
| | | | | | | | | | As of now, uniform setup is more-or-less unified between vec4 and fs and no longer requires the fs_visitor. This makes uniform setup more of a language/API thing than a backend compiler thing. This commit moves setting up the stage_prog_data.params arrays to the same place as we set up the rest of stage_prog_data. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move binding table setup to codegen time.Jason Ekstrand2015-10-0210-66/+67
| | | | | | | | | Setting up binding tables really has little to do with the actual process of turning shaders into instructions; it's more part of setting up prog_data. This commit moves it out of the visitors and with the rest of the prog_data setup stuff. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/shader: Pull assign_common_binding_table_offsets out of backend_shaderJason Ekstrand2015-10-025-11/+36
| | | | | | | This really has nothing to do with the backend compiler and we'd like to eventually be able to set this up earlier in the compile process. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/nir: Simplify uniform setupJason Ekstrand2015-10-022-24/+16
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/nir: Pull GLSL uniform handling into a common functionJason Ekstrand2015-10-026-151/+133
| | | | | | | | The way we deal with GLSL uniforms and builtins is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/nir: Pull common ARB program uniform handling into a common functionJason Ekstrand2015-10-025-34/+70
| | | | | | | | The way we deal with ARB program uniforms is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/vec4: Use the uniform count from nir_assign_var_locationsJason Ekstrand2015-10-021-21/+11
| | | | | | | | | | Previously, we were counting up uniforms as we set them up. However, this count should be exactly identical to shader->num_uniforms provided by nir_assign_var_locations. (If it's not, we're in trouble anyway because that means that locations don't match up.) This matches what the fs backend is already doing. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/shader: Get rid of the setup_vec4_uniform_value helperJason Ekstrand2015-10-025-41/+0
| | | | | | It's not used by anything anymore Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/shader: Pull setup_image_uniform_values out of backend_shaderJason Ekstrand2015-10-023-20/+42
| | | | | | | | | | I tried to do this once before but Curro pointed out that having it in backend_shader meant it could use the setup_vec4_uniform_values helper which did different things in vec4 and fs. Now the setup_uniform_values function differs only by an assert in the two backends so there's no real good reason to be using it anymore. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/vec4: Get rid of the uniform_vector_size arrayJason Ekstrand2015-10-025-18/+5
| | | | | | | The uniform_vector_size array was only ever used by pack_uniform_registers which no longer needs it. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/vec4: Use the actual channels used in pack_uniform_registersJason Ekstrand2015-10-021-14/+37
| | | | | | | | | | Previously, pack_uniform_registers worked based on the size of the uniform as given to us when we initially set up the uniforms. However, we have to walk through the uniforms and figure out liveness anyway, so we migh as well record the number of channels used as we go. This may also allow us to pack things tighter in a few cases. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Pull stage_prog_data.nr_params out of the NIR shaderJason Ekstrand2015-10-024-24/+14
| | | | | | | | | | | | | Previously, we had a bunch of code in each stage to figure out how many slots we needed in stage_prog_data.param. This code was mostly identical across the stages and had been copied and pasted around. Unfortunately, this meant that any time you did something special, you had to add code for it to each of these places. In particular, none of the stages took subroutines into account; they were working entirely by accident. By taking this data from the NIR shader, we know the exact number of entries we need and everything goes a bit smoother. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/vs: Move lazy NIR creation to codegen_vs_progJason Ekstrand2015-10-022-12/+13
| | | | | | | | The next commit will add code to codegen_vs_prog that requires the NIR shader to be there in all cases. It doesn't hurt anything to just move it from brw_vs_emit to its only caller. Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/vec4: Delete the old vec4_vp codeJason Ekstrand2015-10-028-672/+0
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Delete the old ir_visitor codeJason Ekstrand2015-10-026-2025/+2
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Always use NIRJason Ekstrand2015-10-023-49/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GLSL IR vs. NIR shader-db results for vec4 programs on i965: total instructions in shared programs: 1499328 -> 1388354 (-7.40%) instructions in affected programs: 1245199 -> 1134225 (-8.91%) helped: 7469 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on G4x: total instructions in shared programs: 1436799 -> 1325825 (-7.72%) instructions in affected programs: 1205599 -> 1094625 (-9.20%) helped: 7469 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on Iron Lake: total instructions in shared programs: 1436654 -> 1325682 (-7.72%) instructions in affected programs: 1205503 -> 1094531 (-9.21%) helped: 7468 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on Sandy Bridge: total instructions in shared programs: 2016249 -> 1787033 (-11.37%) instructions in affected programs: 1850547 -> 1621331 (-12.39%) helped: 14856 HURT: 1481 GLSL IR vs. NIR shader-db results for vec4 programs on Ivy Bridge: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 GLSL IR vs. NIR shader-db results for vec4 programs on Bay Trail: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 GLSL IR vs. NIR shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 I also ran our full suite of benchmarks on a Haswell and had the following statistically significant (according to ministat) changes: Test master-glsl master-nir diff bench_OglGeomPoint 461.556 463.006 1.450 bench_OglTerrainFlyInst 184.484 187.574 3.090 bench_OglTerrainPanInst 132.412 136.307 3.895 bench_OglTexFilterAniso 19.653 19.645 -0.008 bench_OglTexFilterTri 58.333 58.009 -0.324 bench_OglVSInstancing 65.049 65.327 0.278 bench_trexoff 69.474 69.694 0.220 bench_valley 40.708 41.125 0.417 v2 (Jason Ekstrand): - Remove more uses of NirOptions as a switch - New shader-db numbers - Added benchmark numbers Reviewed-by: Matt Turner <[email protected]>
* i965: don't forget to free image_param on prog_data freeIlia Mirkin2015-10-021-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs: Print reg and reg_offset separately for ATTR files.Kenneth Graunke2015-10-011-1/+1
| | | | | | | | Reading this output was really confusing. reg represents attribute slots; reg_offset is the x/y/z/w component (0..3) within a vec4 slot. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/nir: Refactor input/output lowering setup into helpers.Kenneth Graunke2015-10-011-20/+26
| | | | | | | | | | | | | | The code for input lowering is going to get significantly more complicated shortly, so I wanted to pull it out. Vertex shader inputs are handled nearly identically regardless of vec4/scalar mode, so I opted to not split that. I thought about having each function actually do the lowering, but one pass through nir_lower_io that handles all types (which weren't handled earlier) is probably more efficient. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Allow nir_lower_io() to only lower one type of variable.Kenneth Graunke2015-10-011-2/+2
| | | | | | | | | We may want to use different type_size functions for (e.g.) inputs vs. uniforms. Passing in -1 for mode ignores this, handling all modes as before. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* meta: Handle array textures in scaled MSAA blitsIan Romanick2015-09-301-15/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old code had some significant problems with respect to sampler2DArray textures. The biggest problem was that some of the code would use vec3 for the texture coordinate type, and other parts of the code would use vec2. The resulting shader would not even compile. Since there were not tests for this path, nobody noticed. The input to the fragment shader is always treated as a vec3. If the source data is only vec2, the vertex puller will supply 0 for the .z component. The texture coordinate passed to the fragment shader is always a vec2 that comes from the .xy part of the vertex shader input. The layer, taken from the .z of the vertex shader input is passed separately as a flat integer. If the generated fragment shader does not use the layer integer, the GLSL linker will eliminate all the dead code in the vertex shader. Fixes the new piglit tests "blit-scaled samples=2 with gl_texture_2d_multisample_array", etc. on i965. Note for stable maintainer: This patch may depend on 46037237, and that patch should be safe for stable. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Cc: Topi Pohjolainen <[email protected]> Cc: Jordan Justen <[email protected]> Cc: "10.6 11.0" <[email protected]>
* i965/miptree: Add PRM references for most struct members (v2)Chad Versace2015-09-301-25/+154
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add comments that link the driver's miptree structures to the hardware structures documented in the PRM. This provides sorely needed orientation to developers new to the miptree code. And for miptree veterans, this clarifies some of the more obscure miptree data. For each driver struct field that closely corresponds to a hardware struct field, add a PRM reference to that hardware field's name. For example, struct intel_mipmap_tree { ... /** * @brief One of GL_TEXTURE_2D, GL_TEXTURE_2D_ARRAY, etc. * * @see RENDER_SURFACE_STATE.SurfaceType * @see RENDER_SURFACE_STATE.SurfaceArray * @see 3DSTATE_DEPTH_BUFFER.SurfaceType */ GLenum target; ... }; Also annotate the INTEL_MSAA_LAYOUT_* enums with the name of the PRM sections that documents the layout. v2: Replace "2D subimage" with "slice", and define what a "slice" is. For Ben. Reviewed-by: Anuj Phogat <[email protected]> (v1) Reviewed-by: Ben Widawsky <[email protected]> (v1)
* i965/miptree: Rename align_w,align_h -> halign,valignChad Versace2015-09-309-52/+62
| | | | | | | | | | | | | | | | | | | The values of intel_mipmap_tree::align_w and ::align_h correspond to the hardware enums HALIGN_* and VALIGN_*. See the confusion? align_h != HALIGN align_h == VALIGN Reduce the confusion by renaming the variables to match the hardware enum names: git ls-files | xargs sed -i -e 's/align_w/halign/g' \ -e 's/align_h/valign/g' Suggested-by: Kenneth Graunke <[email protected]> Acked-by: Ben Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965/miptree: Rename intel_miptree_map::mt -> ::linear_mt (v2)Chad Versace2015-09-302-15/+17
| | | | | | | | | | | Because that's what it is. It's an untiled, *linear* miptree. v2: - Add space after /*. - Use one comment per function argument. Reviewed-by: Anuj Phogat <[email protected]> Acked-by: Ben Widawsky <[email protected]>
* i965/miptree: Fix comments for map modeChad Versace2015-09-301-1/+1
| | | | | | | | | The comment for intel_miptree_map::mode claimed that it was a bitmask of GL_MAP_{READ,WRITE,INVALIDATE}_BIT. In reality, the bitmask may include any of {GL,BRW}_MAP_*_BIT. Reviewed-by: Anuj Phogat <[email protected]> Acked-by: Ben Widawsky <[email protected]>
* i965/miptree: More comments for BRW_MAP_DIRECT_BIT (v2)Chad Versace2015-09-301-1/+3
| | | | | | | | Clarify that this bit extends the set of GL_MAP_*_BIT enums. Also fix typo of "temporary". Reviewed-by: Anuj Phogat <[email protected]> Acked-by: Ben Widawsky <[email protected]>
* i965: Remove duplicate copy of is_scalar_shader_stage().Kenneth Graunke2015-09-301-31/+20
| | | | | | | | | | | Jason open coded this in 60befc63 when cleaning up some ugly code; using our existing helper tidies it up a bit more. v2: Drop inline (suggested by Matt). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i915: Remember to call intel_prepare_render() before blittingVille Syrjälä2015-09-301-0/+5
| | | | | | | | | | | | | | | | | Bring over the following fix from i965: commit fb3d62fe3d4fc40ba4ad9804d8b6f451316c9ae2 Author: Kenneth Graunke <[email protected]> Date: Tue Aug 6 14:36:09 2013 -0700 i965: Remember to call intel_prepare_render() before blitting. Fixes a crash in the following piglit tests: bin/fbo-sys-blit -auto bin/fbo-sys-sub-blit -auto Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: "11.0" <[email protected]>
* i915: Fix texcoord vs. varying collision in fragment programsVille Syrjälä2015-09-302-26/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | i915 fragment programs utilize the texture coordinate registers for both texture coordinates and varyings. Unfortunately the code doesn't check if the same index might be in use for both. It just naively uses the index to pick a texture unit, which could lead to collisions. Add an extra mapping step to allocate non conflicting texture units for both uses. The issue can be reproduced with a pair of simple shaders like these: attribute vec4 in_mod; varying vec4 mod; void main() { mod = in_mod; gl_TexCoord[0] = gl_MultiTexCoord0; gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex; } varying vec4 mod; uniform sampler2D tex; void main() { gl_FragColor = texture2D(tex, vec2(gl_TexCoord[0])) * mod; } Fixes many piglit tests on i915: glsl-link-varyings-2 glsl-orangebook-ch06-bump interpolation-none-gl_frontcolor-smooth-fixed interpolation-none-gl_frontcolor-smooth-none interpolation-none-gl_frontcolor-smooth-vertex interpolation-none-gl_frontsecondarycolor-smooth-fixed interpolation-none-gl_frontsecondarycolor-smooth-vertex interpolation-none-gl_frontsecondarycolor-smooth-none interpolation-none-other-flat-fixed interpolation-none-other-flat-none interpolation-none-other-flat-vertex interpolation-none-other-smooth-fixed interpolation-none-other-smooth-none interpolation-none-other-smooth-vertex v2 [idr]: Minor formatting tweaks. Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: "11.0" <[email protected]>
* i830: Fix collision between I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0)Ville Syrjälä2015-09-301-4/+4
| | | | | | | | | | | | | | I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0) are trying to occupy the same bit. Move the texture bits upwards a bit to make room for I830_UPLOAD_RASTER_RULES. Now the driver will actually upload the raster rules which is rather important to get the provoking vertex right. Fixes the appearance of glxgears teeth on gen2. Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: "10.6 11.0" <[email protected]>
* i965/cs: Upload UBO/SSBO surfacesJordan Justen2015-09-304-1/+30
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Get rid of prog_data compare functionsJason Ekstrand2015-09-3011-135/+1
| | | | | | They are no longer used. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/state_cache: Remove the aux_compare fieldsJason Ekstrand2015-09-302-11/+0
| | | | | | | They haven't been used since 1bba29ed403e735ba0bf04ed8aa2e571884fcaaf so there's no good reason to keep them around. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/copy_image: Fix a copy+past errorJason Ekstrand2015-09-301-1/+1
| | | | | Reported-by: Ilia Mirkin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove early release of DRI2 miptreeChris Wilson2015-09-301-1/+0
| | | | | | | | | | | | | | | intel_update_winsys_renderbuffer_miptree() will release the existing miptree when wrapping a new DRI2 buffer, so we can remove the early release and so prevent a NULL mt dereference should importing the new DRI2 name fail for any reason. (Reusing the old DRI2 name will result in the rendering going astray, to a stale buffer, and not shown on the screen, but it allows us to issue a warning and not crash much later in innocent code.) Signed-off-by: Chris Wilson <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86281 Reviewed-by: Martin Peres <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/vec4/nir: add nir_intrinsic_memory_barrier supportSamuel Iglesias Gonsalvez2015-09-301-0/+9
| | | | | | | | | | | | | Fix OpenGL ES 3.1 conformance tests: advanced-readWrite-case1-vsfs and advanced-matrix-vsfs. v2: - Fix SHADER_OPCODE_MEMORY_FENCE emission and the allocation of 'tmp' (Francisco). Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Tested-by: Tapani Pälli <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* nir: Use a system value for gl_PrimitiveIDIn.Kenneth Graunke2015-09-291-0/+10
| | | | | | | | | | | | | | | | At least on Intel hardware, gl_PrimitiveIDIn comes in as a special part of the payload rather than a normal input. This is typically what we use system values for. Dave and Ilia also agree that a system value would be nicer. At some point, we should change it at the GLSL IR level as well. But that requires changing most of the drivers. For now, let's at least make NIR do the right thing, which is easy. v2: Add a comment about not creating a temporary (suggested by Iago). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/cs: Generate code to load gl_NumWorkGroupsJordan Justen2015-09-291-0/+28
| | | | | | | | | This code also sets cs_prog_data->uses_num_work_groups which is later used by state setup to indicate that the gl_NumWorkGroups surface needs to be setup. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Setup surface binding for gl_NumWorkGroupsJordan Justen2015-09-295-1/+53
| | | | | | | | | | | | | | | | | | | | | | | | | This will only be setup when the prog_data uses_num_work_groups boolean is set. At this point nothing will set uses_num_work_groups, but soon code will set it when emitting code for the intrinsic that loads gl_NumWorkGroups. We can't emit this surface information earlier at the start of the DispatchCompute* call because we may not have generated the program yet. Until we generate the program, we don't know if the gl_NumWorkGroups variable is accessed. We also can't emit the surface as part of the brw_cs_state atom, because we might not need the surface if gl_NumWorkGroups is not used by the program. Lastly, we cannot emit the surface later (after state upload) in the DispatchCompute* call, because it needs to be run before the brw_cs_state atom is emitted, since it changes the surface state. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Add a binding table entry for gl_NumWorkGroupsJordan Justen2015-09-293-5/+29
| | | | | | | | | | | | | If glDispatchComputeIndirect is used, then the value for this variable must be read from the indirect BO. To allow the same generated code to support indirect and glDispatchCompute, we will also setup a BO for the number of work groups using the intel_upload_data mechanism. This will only be required if the gl_NumWorkGroups variable is accessed. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Store compute invocation information in brw contextJordan Justen2015-09-292-24/+37
| | | | | | | | We will need this in an atom to setup a surface to read the gl_NumWorkGroups values from. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>