path: root/src
Commit message | Author | Age | Files | Lines
* nir: add lowering for ffract | Rob Clark | 2015-09-16 | 2 | -0/+4
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs: The barrier send uses only 1 payload register | Jordan Justen | 2015-09-15 | 2 | -2/+5
    When preparing the barrier payload, the instructions should operate in simd8 mode since we only use 1 payload register.

    fs_inst::regs_read is also updated to indicate that it only reads one register for SHADER_OPCODE_BARRIER.

    These issues were flagged by:

        commit cadd7dd384b33a779d46bd664f456bed4a21a5b7
        Author: Jason Ekstrand <[email protected]>
        Date:   Thu Jul 2 15:41:02 2015 -0700

            i965/fs: Add a very basic validation pass

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/builder: Use a normal temporary array in nir_channel | Jason Ekstrand | 2015-09-15 | 1 | -1/+2
    C++ gets cranky if we take references of temporaries. This isn't a problem yet in master because nir_builder is never used from C++. However, it will be in the future so we should fix it now.

    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a4xx: more texture formats | Rob Clark | 2015-09-15 | 1 | -7/+8
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: border-color support | Rob Clark | 2015-09-15 | 4 | -2/+31
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: wire up texture clamp lowering | Rob Clark | 2015-09-15 | 2 | -20/+80
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: helper for a3xx/a4xx border-colors | Rob Clark | 2015-09-15 | 4 | -67/+99
    Both use the same layout for the buffer containing border-color values, so rather than duplicating the logic in a4xx, split it out into a helper.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2015-09-15 | 5 | -17/+37
    Signed-off-by: Rob Clark <[email protected]>
* nir/lower_vec_to_movs: Coalesce into destinations of fdot instructions | Jason Ekstrand | 2015-09-15 | 1 | -13/+36
    Now that we have a replicating fdot instruction, we can actually coalesce into the destinations of vec4 instructions. We couldn't really do this before because, if the destination had to end up in .z, we couldn't reswizzle the instruction. With a replicated destination, the result ends up in all channels so we can just set the writemask and we're done.

    Shader-db results for vec4 programs on Haswell:

        total instructions in shared programs: 1747753 -> 1746280 (-0.08%)
        instructions in affected programs:     143274 -> 141801 (-1.03%)
        helped:                                667
        HURT:                                  0

    It turns out that dot-products matter...

    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965/vec4: Use the replicated fdot instruction in NIR | Jason Ekstrand | 2015-09-15 | 2 | -3/+11
    Reviewed-by: Connor Abbott <[email protected]>
    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir: Add a fdot instruction that replicates the result to a vec4 | Jason Ekstrand | 2015-09-15 | 3 | -0/+12
    Fortunately, nir_constant_expr already auto-splats if "dst" never shows up in the constant expression field so we don't need to do anything there.

    Reviewed-by: Connor Abbott <[email protected]>
    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible | Jason Ekstrand | 2015-09-15 | 1 | -0/+85
    The old pass blindly inserted a bunch of moves into the shader with no concern for whether or not they were really needed. This adds code to try and coalesce into the destination of the instruction providing the value.

    Shader-db results for vec4 shaders on Haswell:

        total instructions in shared programs: 1754420 -> 1747753 (-0.38%)
        instructions in affected programs:     231230 -> 224563 (-2.88%)
        helped:                                1017
        HURT:                                  2

    This approach is heavily based on a different patch by Eduardo Lima Mitev <[email protected]>. Eduardo's patch did this in a separate pass as opposed to integrating it into nir_lower_vec_to_movs.

    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/lower_vec_to_movs: Get rid of start_idx and swizzle compacting | Jason Ekstrand | 2015-09-15 | 1 | -20/+13
    Previously, we did this thing with keeping track of a separate start_idx which was different from the iteration variable. I think this was a relic of the way that GLSL IR implements writemasks. In NIR, if a given bit in the writemask is unset then that channel is just "unused", not missing. In particular, a vec4 operation with a writemask of 0xd will use sources 0, 2, and 3 and leave source 1 alone. We can simplify things a good deal (and make them correct) by removing this "compacting" step.

    Reviewed-by: Eduardo Lima Mitev <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
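The writemask rule in the message above can be sketched in C. This is only an illustration, not code from nir_lower_vec_to_movs; the helper name collect_used_sources is hypothetical. It shows that channel i reads source i whenever bit i of the writemask is set, and that unset bits do not shift later sources down.

```c
#include <assert.h>

/* Hypothetical helper, not part of NIR: for a vecN operation with the
 * given writemask, record which source indices are read. Channel i always
 * reads source i; an unset bit leaves that source unused instead of
 * compacting the remaining sources. Returns the number of sources used. */
static unsigned
collect_used_sources(unsigned writemask, unsigned num_components,
                     unsigned used[4])
{
   assert(num_components <= 4);

   unsigned count = 0;
   for (unsigned i = 0; i < num_components; i++) {
      if (writemask & (1u << i))
         used[count++] = i;
   }
   return count;
}
```

With a writemask of 0xd (binary 1101) and four components, the helper reports sources 0, 2, and 3, matching the example in the commit message.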
* i965/vec4_nir: Use partial SSA form rather than full non-SSA | Jason Ekstrand | 2015-09-15 | 3 | -4/+20
    We made this switch in the FS backend some time ago and it seems to make a number of things a bit easier. In particular, supporting SSA values takes very little work in the backend and allows us to take advantage of the majority of the SSA information even after we've gotten rid of Phi nodes.

    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/lower_vec_to_movs: Handle partially SSA shaders | Jason Ekstrand | 2015-09-15 | 1 | -6/+15
    v2 (Jason Ekstrand):
     - Use nir_instr_rewrite_dest
     - Pass the impl directly into lower_vec_to_movs_block

    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/lower_vec_to_movs: Pass the shader around directly | Jason Ekstrand | 2015-09-15 | 1 | -6/+8
    Previously, we were already passing the shader around; we were just calling it "mem_ctx". However, the nir_shader is (and must be for the purposes of mark-and-sweep) the mem_ctx so we might as well pass it around explicitly.

    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965/fs: Add a very basic validation pass | Jason Ekstrand | 2015-09-15 | 4 | -0/+69
    Currently the validation pass only validates that regs_read and regs_written are consistent with the sizes of VGRFs. We can add more checks as we find them to be useful.

    Reviewed-by: Matt Turner <[email protected]>
* i965/fs_surface_builder: Only apply predicate to components that exist | Jason Ekstrand | 2015-09-15 | 1 | -1/+1
    In certain conditions, we have to do bounds-checking in the shader for image_load_store. The way this works for image loads is that we do a predicated load and then emit a series of selects, one per component, that gives us 0 or the loaded value depending on whether or not you're in bounds. However, we were hard-coding 4 components which may not be correct. Instead, we should be using size which is the number of components read.

    Reviewed-by: Francisco Jerez <[email protected]>
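A CPU-side model of the per-component select described above might look like the following. It is a sketch under assumptions: the function apply_bounds_check and its parameters are hypothetical and are not the fs_surface_builder code. The point is only that the loop bound is the number of components actually read, not a hard-coded 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the selects emitted after a predicated image load:
 * each existing component becomes the loaded value when the access is in
 * bounds and 0 otherwise. Only `size` components exist, so components at
 * index `size` and above are never written. */
static void
apply_bounds_check(uint32_t *dst, const uint32_t *loaded, unsigned size,
                   bool in_bounds)
{
   assert(size >= 1 && size <= 4);

   for (unsigned c = 0; c < size; c++)
      dst[c] = in_bounds ? loaded[c] : 0;
}
```

Iterating to 4 regardless of size would touch components the load never produced, which is the situation the commit message calls out.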
* i965/fs: Only read output_components many components when writing an output | Jason Ekstrand | 2015-09-15 | 1 | -1/+3
    Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Set output_components for lowered clip distance outputs | Jason Ekstrand | 2015-09-15 | 1 | -0/+2
    Reviewed-by: Kristian Høgsberg <[email protected]>
* mesa/teximage: restrict GL_ETC1_RGB8_OES support to GLES | Nanley Chery | 2015-09-15 | 1 | -1/+2
    According to the extensions table and our glext headers, OES_compressed_ETC1_RGB8_texture is only supported in GLES1 and GLES2. Since we may give users a GLES3 context when a GLES2 context is requested, we allow this extension for GLES3 as well.

    Reviewed-by: Anuj Phogat <[email protected]>
    Signed-off-by: Nanley Chery <[email protected]>
* mesa/extensions: restrict GL_OES_EGL_image to GLES | Nanley Chery | 2015-09-15 | 1 | -2/+1
    Driver vendors do this as well. The extension specification lists GLES 1.1 or 2.0 as requirements.

    Reviewed-by: Chad Versace <[email protected]>
    Signed-off-by: Nanley Chery <[email protected]>
* mesa/extensions: restrict luminance alpha formats to API_OPENGL_COMPAT | Nanley Chery | 2015-09-15 | 2 | -4/+6
    According to the GL 3.1 spec, luminance alpha formats are deprecated.

    Reviewed-by: Anuj Phogat <[email protected]>
    Signed-off-by: Nanley Chery <[email protected]>
* gallium/svga: Enable PIPE_FORMAT_L8_UNORM for vgpu10 | Thomas Hellstrom | 2015-09-15 | 1 | -1/+1
    It's extensively used by XA for a8- and planar yuv component surfaces. This fixes broken XA yuv blits using vgpu10 contexts.

    Signed-off-by: Thomas Hellstrom <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* egl/dri2: don't leak the fd on dri2_terminate | Emil Velikov | 2015-09-15 | 3 | -1/+3
    The existing check was incorrect as it did not consider the (unlikely) case of fd == 0. In order to fix this we should first correctly initialize it to -1, as the swrast implementations leave it set to zero (props to calloc()).

    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Boyan Ding <[email protected]>
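The pitfall described above is general to any calloc()-allocated struct that holds a file descriptor. A minimal sketch follows; the struct and function names are hypothetical stand-ins, not the egl/dri2 code, and the exact form of the old check is not shown in the log, so the comment only describes the general failure mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a display struct that owns a device fd. */
struct example_display {
   int fd;
};

/* calloc() zero-fills the struct, and 0 is a valid file descriptor, so the
 * "no fd" state has to be set explicitly to -1. */
static struct example_display *
example_display_create(void)
{
   struct example_display *dpy = calloc(1, sizeof(*dpy));
   if (dpy)
      dpy->fd = -1;
   return dpy;
}

/* Treat every non-negative value as an open descriptor. A test that only
 * fires for non-zero values would skip closing fd 0. */
static bool
example_display_has_fd(const struct example_display *dpy)
{
   return dpy->fd >= 0;
}
```

A teardown path built on this would call close() only when example_display_has_fd() returns true.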
* egl/dri2/drm: compact existing device mgmt | Emil Velikov | 2015-09-15 | 1 | -6/+4
    Move the fcntl(dupfd_cloexec) to the else branch where it belongs. Otherwise it's not immediately obvious that the code is hit only when an existing device is used.

    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Boyan Ding <[email protected]>
* egl/dri2: Close file descriptor on error. | Matt Turner | 2015-09-15 | 1 | -13/+14
    v2: [Emil Velikov] Rework the error path to a common goto, close only if we own the fd.
    v3: [Emil Velikov] Always close the fd (we either opened the device or dup'd) (Boyan, Ian)

    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Boyan Ding <[email protected]>
* gbm: convert gbm bo format to fourcc format on dma-buf import | Ray Strode | 2015-09-15 | 1 | -1/+17
    At the moment if a gbm buffer is imported and the gbm buffer has an old-style GBM_BO_FORMAT format, the import will crash, since it's passed directly to DRI functions that expect a fourcc format (as provided by the newer GBM_FORMAT definitions).

    This commit addresses the problem in two ways:

    1) it prevents invalid formats from leading to a crash by returning EINVAL if the image couldn't be created
    2) it translates GBM_BO_FORMAT formats into the comparable GBM_FORMAT formats.

    Reference: https://bugzilla.gnome.org/show_bug.cgi?id=753531
    CC: "10.6 11.0" <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
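The translation step in point 2 can be sketched as a small mapping function. Everything named EX_* below is hypothetical and defined locally so the example is self-contained. The values are assumed to mirror gbm.h (legacy enum values 0 and 1, fourcc codes 'XR24' and 'AR24') and should be checked against the real header; the function is not the code from the commit.

```c
#include <stdint.h>

/* Pack four characters into a little-endian fourcc code. */
#define EX_FOURCC(a, b, c, d) \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Assumed to mirror the legacy GBM_BO_FORMAT enum values. */
enum {
   EX_BO_FORMAT_XRGB8888 = 0,
   EX_BO_FORMAT_ARGB8888 = 1,
};

/* Assumed to mirror the fourcc-based GBM_FORMAT definitions. */
#define EX_FORMAT_XRGB8888 EX_FOURCC('X', 'R', '2', '4')
#define EX_FORMAT_ARGB8888 EX_FOURCC('A', 'R', '2', '4')

/* Map an old-style format value to its fourcc equivalent. Any other value
 * is assumed to already be a fourcc code and is passed through unchanged. */
static uint32_t
ex_format_to_fourcc(uint32_t format)
{
   switch (format) {
   case EX_BO_FORMAT_XRGB8888:
      return EX_FORMAT_XRGB8888;
   case EX_BO_FORMAT_ARGB8888:
      return EX_FORMAT_ARGB8888;
   default:
      return format;
   }
}
```

Normalizing at the import boundary means the code past that point only ever sees fourcc values, which is what the DRI functions mentioned in the message expect.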
* i965: Move perf_debug code to brw_codegen_*_prog() | Kristian Høgsberg Kristensen | 2015-09-14 | 5 | -76/+75
    We're trying to avoid a libdrm dependency in the core compiler, so let's move the perf_debug code one level up from the brw_*_emit() helpers to the brw_codegen_*_prog() helpers.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move brw_fs_precompile() to brw_wm.c | Kristian Høgsberg Kristensen | 2015-09-14 | 2 | -58/+59
    All other precompile functions live in the brw_<stage>.c files; make fs follow the convention.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* i965: Move compute shader code around | Kristian Høgsberg Kristensen | 2015-09-14 | 5 | -333/+362
    This moves the compute shader code around in order to make the way the code is split up more consistent. There should be no functional changes.

    Typically we have a few files per stage:

      brw_vs.c, brw_wm.c, brw_gs.c: code to drive code generation and implement precompiling and cache search.

      genX_<stage>_state.c: gen-specific implementation of the state emission for the shader stage.

    The brw_*_emit() functions are all in the same files as the visitor classes they use (with the exception of VS, which may use either vec4 or fs).

    To make compute follow this convention, we move the brw_cs_emit() function into brw_fs.cpp. We can then rename brw_cs.cpp to brw_cs.c and do this in C like the other similar files. Finally, move state setup and atoms to gen7_cs_state.c.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
* meta: Abort meta pbo path if TexSubImage need signed unsigned conversion | Anuj Phogat | 2015-09-14 | 1 | -18/+25
    See similar fix for Readpixels in mesa commit 0d20790. Jason suggested we need that for TexSubImage as well.

    Cc: <[email protected]>
    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nvc0/ir: start offset at texBindBase for txq, like regular texturing | Ilia Mirkin | 2015-09-14 | 1 | -1/+4
    Curiously this has no actual effect. I think it's because the first 8 textures are bound in multiple slots for some reason. However, it seems prudent to use these the same way as regular texturing, especially in the case where there are more than 8 textures bound.

    Signed-off-by: Ilia Mirkin <[email protected]>
* vc4: Fix build from recent NIR cleanups. | Eric Anholt | 2015-09-14 | 1 | -2/+1
* i965/vec4_nir: Load constants as integers | Antia Puentes | 2015-09-14 | 1 | -2/+2
    Loads constants using integer as their register type, as is done in the FS backend.

    No shader-db changes in HSW.

    Cc: "10.6 11.0" <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91716
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Fix saturation errors when coalescing registers | Antia Puentes | 2015-09-14 | 1 | -0/+21
    If the register types do not match and the instruction that contains the final destination is saturated, register coalescing generated non-equivalent code.

    This did not happen when using IR because types usually matched, but it is visible in nir-vec4.

    For example,

        mov     vgrf7:D vgrf2:D
        mov.sat m4:F    vgrf7:F

    is coalesced to:

        mov.sat m4:D vgrf2:D

    The patch prevents coalescing in such a scenario, unless the instruction we want to coalesce into is a MOV (without type conversion implied). In that case, the patch sets the register types to the type of the final destination.

    Shader-db results in HSW (only vec4 instructions shown):

        total instructions in shared programs: 1754415 -> 1754416 (0.00%)
        instructions in affected programs:     74 -> 75 (1.35%)
        helped:                                0
        HURT:                                  1
        GAINED:                                0
        LOST:                                  0

    Only one extra instruction in one of the shaders, which comes from eliminating a saturation error by preventing register coalescing.

    Cc: "10.6 11.0" <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
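The non-equivalence in the example above can be modeled on the CPU. This is a sketch under a stated assumption: saturating a D-typed (signed 32-bit integer) move is taken to clamp to the integer range, which leaves a 32-bit source unchanged. The function names are hypothetical and this is not driver code.

```c
#include <stdint.h>
#include <string.h>

/* Float saturation: clamp to [0.0, 1.0]. */
static float
saturate_f(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Models the original pair: an integer-typed raw copy followed by a
 * float-typed saturating move of the same bits. */
static float
original_sequence(float in)
{
   int32_t vgrf2, vgrf7;
   float as_float;

   memcpy(&vgrf2, &in, sizeof(vgrf2));
   vgrf7 = vgrf2;                                 /* mov     vgrf7:D vgrf2:D */
   memcpy(&as_float, &vgrf7, sizeof(as_float));
   return saturate_f(as_float);                   /* mov.sat m4:F    vgrf7:F */
}

/* Models the incorrectly coalesced instruction: one integer-typed
 * saturating move. Under the assumption above the bits pass through
 * unchanged, so the float clamp is lost. */
static float
coalesced_sequence(float in)
{
   int32_t vgrf2, m4;
   float out;

   memcpy(&vgrf2, &in, sizeof(vgrf2));
   m4 = vgrf2;                                    /* mov.sat m4:D vgrf2:D */
   memcpy(&out, &m4, sizeof(out));
   return out;
}
```

For an input of 2.0f the first function yields 1.0f while the second yields 2.0f, which is the kind of mismatch the patch avoids by refusing to coalesce when the types differ and the final instruction saturates.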
* st/mesa: emit TXQS, support ARB_shader_texture_image_samples | Ilia Mirkin | 2015-09-13 | 2 | -1/+6
    The image component of the ext is a no-op since there is no image support in gallium (yet).

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* r600g: add support for TXQS tgsi opcode | Ilia Mirkin | 2015-09-13 | 2 | -5/+13
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Glenn Kennard <[email protected]>
* nv50/ir: add support for TXQS tgsi opcode | Ilia Mirkin | 2015-09-13 | 5 | -9/+41
    Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_CAP_TGSI_TXQS to let st know if TXQS is supported | Ilia Mirkin | 2015-09-13 | 15 | -0/+15
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Glenn Kennard <[email protected]>
* tgsi: add a TXQS opcode to retrieve the number of texture samples | Ilia Mirkin | 2015-09-13 | 3 | -2/+14
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* glsl/cs: Initialize gl_LocalInvocationIndex in main() | Jordan Justen | 2015-09-13 | 1 | -0/+22
    We initialize gl_LocalInvocationIndex based on the extension spec formula:

        gl_LocalInvocationIndex =
            gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y +
            gl_LocalInvocationID.y * gl_WorkGroupSize.x +
            gl_LocalInvocationID.x;

    https://www.opengl.org/registry/specs/ARB/compute_shader.txt

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
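The formula quoted above flattens a 3D local ID into a linear index in x-major order. A direct C transcription, with a hypothetical struct and function name (neither is from the GLSL compiler), might look like this:

```c
#include <assert.h>

/* Hypothetical three-component unsigned vector. */
struct ex_uvec3 {
   unsigned x, y, z;
};

/* Flatten a local invocation ID into a linear index, following the
 * ARB_compute_shader formula quoted in the commit message. */
static unsigned
ex_local_invocation_index(struct ex_uvec3 id, struct ex_uvec3 size)
{
   /* Each ID component must lie inside the work group. */
   assert(id.x < size.x && id.y < size.y && id.z < size.z);

   return id.z * size.x * size.y +
          id.y * size.x +
          id.x;
}
```

For a 4x5x6 work group, the local ID (1, 2, 3) maps to 3*4*5 + 2*4 + 1 = 69, and the last invocation (3, 4, 5) maps to 119, one less than the group's 120 invocations.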
* glsl/cs: Exclude gl_LocalInvocationIndex from builtin variable stripping | Jordan Justen | 2015-09-13 | 1 | -0/+8
    We lower gl_LocalInvocationIndex based on the extension spec formula:

        gl_LocalInvocationIndex =
            gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y +
            gl_LocalInvocationID.y * gl_WorkGroupSize.x +
            gl_LocalInvocationID.x;

    https://www.opengl.org/registry/specs/ARB/compute_shader.txt

    We need to set this variable in main(), even if gl_LocalInvocationIndex is not referenced by the shader. (It may be used by a linked shader.) Therefore, we can't eliminate it as a dead variable.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* glsl/cs: Initialize gl_GlobalInvocationID in main() | Jordan Justen | 2015-09-13 | 3 | -0/+72
    We initialize gl_GlobalInvocationID based on the extension spec formula:

        gl_GlobalInvocationID =
            gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID

    https://www.opengl.org/registry/specs/ARB/compute_shader.txt

    Signed-off-by: Jordan Justen <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
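The formula above is a component-wise multiply-add on three-component vectors. A small C sketch, again with a hypothetical struct and function name that are not taken from the GLSL compiler:

```c
/* Hypothetical three-component unsigned vector. */
struct ex_uvec3 {
   unsigned x, y, z;
};

/* Component-wise group * size + local, following the ARB_compute_shader
 * formula quoted in the commit message. */
static struct ex_uvec3
ex_global_invocation_id(struct ex_uvec3 group, struct ex_uvec3 size,
                        struct ex_uvec3 local)
{
   struct ex_uvec3 global = {
      group.x * size.x + local.x,
      group.y * size.y + local.y,
      group.z * size.z + local.z,
   };
   return global;
}
```

For work group (2, 0, 1) with group size (8, 4, 1) and local ID (3, 2, 0), the global ID is (19, 2, 1).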
* glsl: Move link_get_main_function_signature to a common location | Jordan Justen | 2015-09-13 | 5 | -33/+34
    Also rename to _mesa_get_main_function_signature.

    We will call it near the end of compilation to insert some code into main for initializing some compute shader global variables.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* glsl/cs: Don't strip gl_GlobalInvocationID and dependencies | Jordan Justen | 2015-09-13 | 1 | -0/+14
    We lower gl_GlobalInvocationID based on the extension spec formula:

        gl_GlobalInvocationID =
            gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID

    https://www.opengl.org/registry/specs/ARB/compute_shader.txt

    We need to set this variable in main(), even if gl_GlobalInvocationID is not referenced by the shader. (It may be used by a linked shader.) Therefore, we can't eliminate these as dead variables.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* i965/nir: Support gl_WorkGroupID variable | Jordan Justen | 2015-09-13 | 1 | -1/+9
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Initialize gl_WorkGroupID variable from payload | Jordan Justen | 2015-09-13 | 2 | -0/+20
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
* nir: Add gl_WorkGroupID system variable | Jordan Justen | 2015-09-13 | 3 | -0/+6
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
* glsl/cs: Add gl_WorkGroupID variable | Jordan Justen | 2015-09-13 | 2 | -0/+2
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>