path: root/src/mesa
Commit message | Author | Age | Files | Lines
* Android: fix build break from nir/glsl move to compiler/ | Rob Herring | 2016-02-29 | 8 | -10/+10
    Commits a39a8fbbaa12 ("nir: move to compiler/") and eb63640c1d38 ("glsl: move to compiler/") broke Android builds. Fix them.

    There is also a missing dependency between generated NIR headers and several libraries. This isn't a new issue, but seems to have been exposed by the NIR move.

    Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled.

    Cc: "11.2" <[email protected]>
    Cc: Mauro Rossi <[email protected]>
    Signed-off-by: Rob Herring <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    (cherry picked from commit 574a92b048ae2b482982c3f156182970d551ca94)
* i965/fs: Don't CSE negated multiplies with saturation. | Matt Turner | 2016-02-29 | 1 | -0/+2
    It's not correct to CSE these multiplies

        mul.sat dst1, -a, b
        mul.sat dst2, a, b

    by emitting a negated MOV from dst1 to dst2:

        mul.sat dst1, -a, b
        mov dst2, -dst1

    Take 2.0*2.0 for example. The first multiply would produce 0.0 and the second would produce 1.0.

    Fixes bad generated code in 18 to 22 shaders:

    instructions in affected programs: 432 -> 464 (7.41%)
    helped: 4
    HURT: 18

    Cc: [email protected]
    Reviewed-by: Ian Romanick <[email protected]>
    (cherry picked from commit 1567da1e2820d4c1a6c14f4598ad3addba6bc788)
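The 2.0*2.0 case can be checked on the CPU. The following is a minimal sketch in plain C, not i965 code: `saturate`, `mul_sat` and `negated_cse_is_valid` are hypothetical names, and `.sat` is assumed to clamp the result to [0, 1].

```c
#include <stdbool.h>

/* Clamp to [0, 1]; assumed here to model the .sat modifier. */
static float saturate(float x)
{
    if (x < 0.0f)
        return 0.0f;
    if (x > 1.0f)
        return 1.0f;
    return x;
}

/* Models "mul.sat dst, a, b". */
static float mul_sat(float a, float b)
{
    return saturate(a * b);
}

/* Returns true when rewriting "mul.sat dst2, a, b" as "mov dst2, -dst1",
 * with dst1 = mul.sat(-a, b), would produce the same value. */
static bool negated_cse_is_valid(float a, float b)
{
    float dst1 = mul_sat(-a, b);      /* first multiply */
    float expected = mul_sat(a, b);   /* what dst2 should hold */
    float rewritten = -dst1;          /* what the negated MOV yields */
    return expected == rewritten;
}
```

With a = b = 2.0 the rewrite yields -0.0 where 1.0 is expected, because the clamp is applied before the negation rather than after it.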
* st/mesa: fix frontbuffer glReadPixels regressions | Brian Paul | 2016-02-29 | 1 | -2/+11
    The change "mesa/readpix: Don't clip in _mesa_readpixels()" caused a few piglit regressions. The failing tests use glReadPixels to read from the front color buffer. The problem is we were trying to read from a nonexistent front color buffer. The front color buffer is created on demand in st/mesa. Since the missing buffer bounds were effectively 0 x 0, the glReadPixels was totally clipped and returned early.

    The fix involves creating the real front color buffer when we're about to try reading from it.

    Tested with llvmpipe and VMware driver on Linux, Windows.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94253
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94254
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94257
    Cc: [email protected]
    Reviewed-by: Roland Scheidegger <[email protected]>
    (cherry picked from commit 83b589301f4a150f4b1b13fd3ffd9f6d98ee6546)
* st/mesa: force depth mode to GL_RED for sized depth/stencil formats | Ilia Mirkin | 2016-02-19 | 1 | -9/+25
    See commit 9db2098d for the i965 version of this. This fixes depth in a bunch of dEQP EXT_texture_border_clamp tests. And probably other ones as well.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: [email protected]
* meta/copy_image: use precomputed dst_internal_format to avoid segfault | Ilia Mirkin | 2016-02-19 | 1 | -1/+1
    If the destination is a renderbuffer, dst_tex_image will be NULL. This fixes the *to_renderbuffer dEQP copy image tests.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Cc: [email protected]
* mesa: add GL_OES_texture_stencil8 support | Ilia Mirkin | 2016-02-19 | 3 | -0/+11
    It's basically the same thing as GL_ARB_texture_stencil8 except that glCopyTexImage isn't supported, so add STENCIL_INDEX to the list of invalid GLES formats for glCopyTexImage.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* st/mesa: fix pbo uploads | Ilia Mirkin | 2016-02-19 | 1 | -10/+18
    - LOD must be provided in .w for TXF (even for buffer textures)
    - User buffer must be valid at draw time
    - Must have a sampler associated with the sampler view

    This makes PBO uploads work again on nouveau.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: check fbo completeness based on internal format, not driver format | Ilia Mirkin | 2016-02-19 | 1 | -3/+2
    The base format is a function of the user-requested format, while the driver format is not. So we should use the base format instead. The driver format can be anything.

    Specifically in the stencil-only case, it might be a depth/stencil format. However we still want to refuse such an attachment when bound to GL_DEPTH.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* mesa: small optimization of _mesa_expand_bitmap() | Brian Paul | 2016-02-19 | 1 | -7/+4
    Avoid a per-pixel multiply.

    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: add special case ubyte[4] / BGRA conversion function | Brian Paul | 2016-02-19 | 1 | -5/+69
    This reduces a glTexImage(GL_RGBA, GL_UNSIGNED_BYTE) hot spot when storing the texture as BGRA.

    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: implement a simple cache for glDrawPixels | Brian Paul | 2016-02-19 | 3 | -0/+97
    Instead of discarding the texture we created, keep it around in case the next glDrawPixels draws the same image again.

    This is intended to help applications that draw the same image several times in a row, either within a frame or in subsequent frames.

    Reviewed-by: Charmaine Lee <[email protected]>
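The idea of keeping the last image around can be illustrated with a one-entry cache. This is a rough sketch only: `struct drawpix_cache`, `cache_lookup` and `cache_store` are hypothetical names, the integer `texture` field stands in for a real texture handle, and the actual st/mesa code may key on different parameters.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-entry cache holding a copy of the last drawn image. */
struct drawpix_cache {
    int width, height;
    void *pixels;   /* private copy of the last image, or NULL if empty */
    size_t size;    /* size of the copy in bytes */
    int texture;    /* stand-in for the texture created from the image */
};

/* True when the cached entry matches the incoming image exactly. */
static bool cache_lookup(const struct drawpix_cache *c, int w, int h,
                         const void *pixels, size_t size)
{
    return c->pixels != NULL &&
           c->width == w && c->height == h && c->size == size &&
           memcmp(c->pixels, pixels, size) == 0;
}

/* Replace the cached entry; assumes size > 0. Returns false on OOM. */
static bool cache_store(struct drawpix_cache *c, int w, int h,
                        const void *pixels, size_t size, int texture)
{
    void *copy = malloc(size);
    if (copy == NULL)
        return false;
    memcpy(copy, pixels, size);
    free(c->pixels);
    c->pixels = copy;
    c->size = size;
    c->width = w;
    c->height = h;
    c->texture = texture;
    return true;
}
```

A caller would try `cache_lookup()` first and reuse `texture` on a hit; on a miss it would create the texture as before and then call `cache_store()`.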
* st/mesa: disable depth/stencil/alpha tests in PBO upload | Nicolai Hähnle | 2016-02-18 | 1 | -0/+8
    Noticed by Brian Paul.

    Reviewed-by: Marek Olšák <[email protected]>
* mesa: fix new gcc6 warnings | Rob Clark | 2016-02-18 | 1 | -3/+0
    src/mesa/main/texstore.c:92:22: warning: ‘map_1032’ defined but not used [-Wunused-const-variable]
     static const GLubyte map_1032[6] = { 1, 0, 3, 2, ZERO, ONE };
                          ^~~~~~~~
    src/mesa/main/texstore.c:91:22: warning: ‘map_3210’ defined but not used [-Wunused-const-variable]
     static const GLubyte map_3210[6] = { 3, 2, 1, 0, ZERO, ONE };
                          ^~~~~~~~
    src/mesa/main/texstore.c:90:22: warning: ‘map_identity’ defined but not used [-Wunused-const-variable]
     static const GLubyte map_identity[6] = { 0, 1, 2, 3, ZERO, ONE };
                          ^~~~~~~~~~~~

    These appear to be unused since:

      commit 8ec6534b266549cdc2798e2523bf6753924f6cde
      Author: Iago Toral Quiroga <[email protected]>
      AuthorDate: Wed Oct 15 13:42:11 2014 +0200

          mesa: Use _mesa_format_convert to implement texstore_rgba.

    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965: fix new gcc6 warnings | Rob Clark | 2016-02-18 | 1 | -1/+1
    src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp:244:1: warning: ‘void {anonymous}::fs_copy_prop_dataflow::dump_block_data() const’ defined but not used [-Wunused-function]
     fs_copy_prop_dataflow::dump_block_data() const
     ^~~~~~~~~~~~~~~~~~~~~

    From looking at git history, it looks like this is intended to be unused (i.e. just for adding on-demand debug prints).

    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* Android: fix build break in libmesa_program | Rob Herring | 2016-02-18 | 1 | -1/+1
    Commit 5fd848f6c9ee ("program: Use _mesa_geometric_samples to calculate gl_NumSamples") broke Android builds. Add the missing include path "main" to framebuffer.h like other includes in prog_statevars.c.

    Cc: Neil Roberts <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
    Signed-off-by: Rob Herring <[email protected]>
    Reviewed-by: Neil Roberts <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* mesa: gl_NumSamples should always be at least one | Ilia Mirkin | 2016-02-18 | 1 | -1/+1
    From ARB_sample_shading:

      "gl_NumSamples is the total number of samples in the framebuffer, or one if rendering to a non-multisample framebuffer"

    So make sure to always pass in at least 1.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Edward O`Callaghan <[email protected]>
    Reviewed-by: Neil Roberts <[email protected]>
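The clamp described above amounts to taking the larger of the framebuffer sample count and one. A minimal sketch follows; `num_samples_for_shader` is a hypothetical helper name, and a sample count of 0 is assumed to denote a non-multisample framebuffer.

```c
/* Hypothetical helper: value to expose as gl_NumSamples.
 * A non-multisample framebuffer (assumed to report 0 samples)
 * must still read back as 1. */
static unsigned num_samples_for_shader(unsigned fb_samples)
{
    return fb_samples > 1 ? fb_samples : 1;
}
```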
* compiler/glsl: Fix uniform location counting. | Plamena Manolova | 2016-02-18 | 1 | -0/+8
    This patch moves the calculation of current uniforms to link_uniforms, which makes use of UniformRemapTable which stores all the reserved uniform locations.

    Location assignment for implicit uniforms now tries to use any gaps left in the table after the location assignment for explicit uniforms. This gives us more space to store more uniforms.

    Patch is based on earlier patch with following changes/additions:

    1: Move the counting of explicit locations to check_explicit_uniform_locations and then pass the number to link_assign_uniform_locations.
    2: Count the number of empty slots in UniformRemapTable and store them in a list_head.
    3: Try to find an empty slot for implicit locations from the list, if that fails resize UniformRemapTable.

    Fixes following CTS tests:
    ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max
    ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array

    Signed-off-by: Tapani Pälli <[email protected]>
    Signed-off-by: Plamena Manolova <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93696
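The gap-filling strategy in steps 2 and 3 can be sketched as follows. This is a simplified illustration, not the linker code: `struct remap_table`, `find_gap` and `assign_implicit` are hypothetical names, and a linear scan over a boolean array replaces the list of empty slots that the commit message describes.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the remap table: used[i] is true when
 * location i is already reserved (for example by an explicit location). */
struct remap_table {
    bool *used;
    unsigned size;
};

/* Find the first run of `count` free slots (count >= 1).
 * Returns the starting location, or -1 if no gap is large enough. */
static int find_gap(const struct remap_table *t, unsigned count)
{
    unsigned run = 0;
    for (unsigned i = 0; i < t->size; i++) {
        run = t->used[i] ? 0 : run + 1;
        if (run == count)
            return (int)(i + 1 - count);
    }
    return -1;
}

/* Assign a location to an implicit uniform that needs `count` slots:
 * reuse a gap when one fits, otherwise grow the table at the end.
 * Returns the location, or -1 on allocation failure. */
static int assign_implicit(struct remap_table *t, unsigned count)
{
    int loc = find_gap(t, count);
    if (loc < 0) {
        bool *grown = realloc(t->used, (t->size + count) * sizeof(bool));
        if (grown == NULL)
            return -1;
        t->used = grown;
        loc = (int)t->size;
        t->size += count;
    }
    for (unsigned i = 0; i < count; i++)
        t->used[loc + i] = true;
    return loc;
}
```

With locations 0 and 3 reserved in a four-entry table, a two-slot implicit uniform lands at location 1, and the next one-slot uniform forces the table to grow to five entries.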
* st/mesa: new st_DrawAtlasBitmaps() function for drawing bitmap text | Brian Paul | 2016-02-17 | 2 | -3/+141
    This basically saves the current pipeline state, sets up state for rendering, constructs a set of textured quads, renders, then restores the previous pipeline state.

    It shouldn't be hard to implement a similar function for non-gallium drivers. With some code refactoring, the vertex definition code could probably be shared.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: implement a display list / glBitmap texture atlas | Brian Paul | 2016-02-17 | 5 | -0/+448
    This improves the performance of applications which use glXUseXFont() or wglUseFontBitmaps() and glCallLists() to draw bitmap text.

    Basically, we collect all the glBitmap images from the display lists and put them into a texture atlas. To render the bitmaps for a glCallLists() command, we render a set of textured quads where each quad is textured with one bitmap image.

    Actually, the rendering part has to be done by the Mesa driver or Mesa/gallium state tracker.

    Note that GLUT demos that use glutBitmapCharacter() don't benefit from this.

    v2, per Nicolai Hähnle:
    - check the max tex rect size is at least 1024.
    - add comment in dd.h that texture_rectangle is required.
    - in _mesa_DeleteLists(), try to delete the atlas before the list(s)

    Reviewed-by: Nicolai Hähnle <[email protected]>
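Collecting many small bitmaps into one texture needs some placement scheme. The commit message does not say which one Mesa uses, so the sketch below simply assumes row-by-row packing; `struct atlas_packer`, `struct atlas_pos` and `atlas_place` are hypothetical names.

```c
/* Hypothetical row packer: bitmaps are placed left to right and a new
 * row starts when the current one is full. */
struct atlas_packer {
    unsigned width;       /* atlas width in texels */
    unsigned x, y;        /* position of the next free texel */
    unsigned row_height;  /* tallest bitmap in the current row */
};

struct atlas_pos {
    unsigned x, y;
};

/* Reserve a w x h rectangle and return its top-left corner.
 * Assumes w <= p->width; atlas height is not checked in this sketch. */
static struct atlas_pos atlas_place(struct atlas_packer *p,
                                    unsigned w, unsigned h)
{
    if (p->x + w > p->width) {
        /* Row is full: move down by the tallest bitmap in it. */
        p->x = 0;
        p->y += p->row_height;
        p->row_height = 0;
    }
    struct atlas_pos pos = { p->x, p->y };
    p->x += w;
    if (h > p->row_height)
        p->row_height = h;
    return pos;
}
```

Each returned position would become the texture coordinates of one quad when the bitmaps are drawn.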
* st/mesa: apply DepthMode swizzle to stencil texturing as well | Ilia Mirkin | 2016-02-17 | 1 | -2/+0
    Gallium doesn't present these as GL_RED-style. A swizzle is necessary to present the proper data in the unused components.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: allow multisampled format info to be returned on GLES 3.1 | Ilia Mirkin | 2016-02-17 | 1 | -1/+4
    The restriction on multisampled integer texture formats only applies to GLES 3.0, so don't apply it to GLES 3.1 contexts.

    This fixes a slew of dEQP-GLES31.functional.state_query.internal_format.* tests, which now all pass.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* i965: Extract push constant state to a new file | Ben Widawsky | 2016-02-17 | 4 | -164/+191
    Every stage has a corresponding 3DSTATE_CONSTANT_XS packet, so having the code to create and emit push constant buffers in genX_vs_state.c is a little strange. Moving it to a separate file seems more logical.

    v2 [Ken]: Rebase on master, explain motivation in the commit message.

    Signed-off-by: Ben Widawsky <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make emit_minmax return an instruction*. | Matt Turner | 2016-02-17 | 3 | -10/+10
    And use it in brw_fs_nir.cpp.
* i965: Lower min/max after optimization on Gen4/5. | Matt Turner | 2016-02-17 | 8 | -44/+88
    Gen4/5's SEL instruction cannot use conditional modifiers, so min/max are implemented as CMP + SEL. Handling that after optimization lets us CSE more.

    On Ironlake:

    total instructions in shared programs: 6426035 -> 6422753 (-0.05%)
    instructions in affected programs: 326604 -> 323322 (-1.00%)
    helped: 1411

    total cycles in shared programs: 129184700 -> 129101586 (-0.06%)
    cycles in affected programs: 18950290 -> 18867176 (-0.44%)
    helped: 2419
    HURT: 328

    Reviewed-by: Francisco Jerez <[email protected]>
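The CMP + SEL lowering can be modeled in scalar C. This is an illustrative sketch only: `cmp_lt`, `sel`, `lowered_min` and `lowered_max` are hypothetical names, the choice of comparison is an assumption, and NaN handling is ignored.

```c
#include <stdbool.h>

/* Models CMP: writes a flag from a comparison. */
static bool cmp_lt(float a, float b)
{
    return a < b;
}

/* Models a predicated SEL: picks a when the flag is set, else b. */
static float sel(bool flag, float a, float b)
{
    return flag ? a : b;
}

/* min(a, b) as a compare followed by a select. */
static float lowered_min(float a, float b)
{
    return sel(cmp_lt(a, b), a, b);
}

/* max(a, b) using the same pair with the operands of CMP swapped. */
static float lowered_max(float a, float b)
{
    return sel(cmp_lt(b, a), a, b);
}
```

Until the lowering runs, a min or max is a single operation, which is presumably what gives CSE more to match on than a CMP + SEL pair would.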
* i965/vec4: Initialize force_writemask_all in vec4_builder(). | Matt Turner | 2016-02-17 | 1 | -1/+2
    Reviewed-by: Francisco Jerez <[email protected]>
* st/mesa: fix up result_src.type when doing i2u/u2i conversions | Ilia Mirkin | 2016-02-17 | 1 | -0/+1
    Even though it's a no-op, it's important to keep track of the type so that we can pick the properly-signed op later on.

    This fixes dEQP-GLES3.functional.shaders.precision.uint.highp_div_fragment, which ended up using IDIV instead of UDIV.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: [email protected]
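Why the signedness matters even though the conversion leaves the bits unchanged can be shown with ordinary C integer division. The helpers `div_as_unsigned` and `div_as_signed` are hypothetical names standing in for UDIV and IDIV; the signed variant relies on the usual two's-complement conversion.

```c
#include <stdint.h>

/* Models UDIV: both operands are treated as unsigned. */
static uint32_t div_as_unsigned(uint32_t a, uint32_t b)
{
    return a / b;
}

/* Models IDIV applied to the same bits: operands are reinterpreted as
 * signed (two's complement assumed) before dividing.
 * Callers must avoid b == 0 and the INT32_MIN / -1 case. */
static uint32_t div_as_signed(uint32_t a, uint32_t b)
{
    return (uint32_t)((int32_t)a / (int32_t)b);
}
```

For a = 0xFFFFFFFE and b = 2 the unsigned result is 0x7FFFFFFF, while the signed path computes -2 / 2 = -1, i.e. 0xFFFFFFFF.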
* st/mesa: use cso_set_viewport_dims() in try_pbo_upload_common() | Brian Paul | 2016-02-17 | 1 | -12/+1
    Note that this results in a different transformation for the viewport's Z axis (depth range), but that doesn't matter for this case.

    Reviewed-by: Roland Scheidegger <[email protected]>
* i965/gen7: Use predicated rendering for indirect compute | Jordan Justen | 2016-02-17 | 2 | -14/+83
    On gen7 (Ivy Bridge, Haswell), we will get a GPU hang if an indirect dispatch is used, but one of the dimensions is 0.

    Therefore we use predicated rendering on the GPGPU_WALKER command to handle this case.

    Fixes piglit test: spec/arb_compute_shader/zero-dispatch-size

    From the ARB_compute_shader spec, under DispatchCompute:

      "If the work group count in any dimension is zero, no work groups are dispatched."

    And then for DispatchComputeIndirect:

      ...
      "is equivalent (assuming no errors are generated) to calling DispatchCompute with <num_groups_x>, <num_groups_y> and <num_groups_z>"
      ...

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94100
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Tested-by: Ilia Mirkin <[email protected]>
* st/mesa: add missing ETC2 entries to format_map | Rob Clark | 2016-02-16 | 1 | -0/+42
    Noticed by Ilia when I was trying to figure out why some app was failing to use ETC2.

    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: do not init limits when compute shaders are not supported | Samuel Pitoiset | 2016-02-16 | 1 | -0/+8
    When the number of uniform blocks is less than 12, ARB_uniform_buffer_object can't be enabled and the maximum GL version is not even 3.1...

    This fixes a regression introduced in 7c79c1e (st/mesa: add compute shader state) if the maximum number of uniform blocks allowed for compute shaders is less than 12.

    This happens on Kepler but this might also affect other Gallium drivers.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reported-by: Tobias Klausmann <[email protected]>
    Tested-by: Tobias Klausmann <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Tobias Klausmann <[email protected]>
* mesa: Don't call driver when there is no compute work | Jordan Justen | 2016-02-16 | 1 | -0/+3
    The ARB_compute_shader spec says:

      "If the work group count in any dimension is zero, no work groups are dispatched."

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
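The early-out implied by the spec text is a check on the three work group counts. A minimal sketch, with `dispatch_has_work` as a hypothetical helper name rather than the function Mesa actually adds:

```c
#include <stdbool.h>

/* Returns false when any dimension is zero, in which case no work
 * groups are dispatched and the driver need not be called at all. */
static bool dispatch_has_work(const unsigned num_groups[3])
{
    return num_groups[0] != 0 &&
           num_groups[1] != 0 &&
           num_groups[2] != 0;
}
```

A caller would return immediately when this yields false, before handing the dispatch to the driver.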
* i965: Set compute shader shared memory max to 64k | Jordan Justen | 2016-02-16 | 1 | -1/+1
    See Ivy Bridge PRM, Volume 2, Part 2, 1.8.4 INTERFACE_DESCRIPTOR_DATA:

      DWORD 5, bits 20:16:

      "This field indicates how much shared local memory the thread group requires. The amount is specified in 4k blocks, but only powers of 2 are allowed: 0, 4k, 8k, 16k, 32k and 64k per half-slice."

    For Haswell, see Volume 2d, INTERFACE_DESCRIPTOR_DATA:

      DWORD 5, bits 20:16: With text identical to the Ivy Bridge PRM.

    For Broadwell, see Volume 2d, INTERFACE_DESCRIPTOR_DATA:

      DWORD 6, bits 20:16: With text identical to the Ivy Bridge PRM.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
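The allowed sizes quoted from the PRM (0, 4k, 8k, 16k, 32k, 64k) suggest rounding a requested size up to the next permitted amount. The sketch below does only that; `slm_blocks_for_size` is a hypothetical helper, and how the resulting block count is encoded into the descriptor bits is deliberately left out because the quoted text does not specify it.

```c
/* Round a shared-memory request up to the next allowed amount and
 * return it as a count of 4 KiB blocks: 0, 1, 2, 4, 8 or 16.
 * Returns -1 when the request exceeds the 64 KiB maximum. */
static int slm_blocks_for_size(unsigned bytes)
{
    if (bytes == 0)
        return 0;
    if (bytes > 64 * 1024)
        return -1;

    unsigned blocks = 1;
    while (blocks * 4096 < bytes)
        blocks *= 2;   /* only powers of two are allowed */
    return (int)blocks;
}
```

For instance, a 20000-byte request rounds up to eight blocks (32k), since four blocks (16k) would be too small.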
* st/mesa: use new CSO_BITS_ALL_SHADERS | Brian Paul | 2016-02-16 | 4 | -24/+9
    Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: simplify st->ctx, ctx->st usage in a various places | Brian Paul | 2016-02-16 | 6 | -18/+17
* st/mesa: use _mesa_geometric_width/height() in glDrawPixels code | Brian Paul | 2016-02-16 | 1 | -10/+9
    Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: rename attr variable in st_DrawTex() | Brian Paul | 2016-02-16 | 1 | -10/+10
    Rename to 'tex_attr' to be a bit more clear.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: use 'cso' instead of 'st->cso_context' in st_DrawTex() | Brian Paul | 2016-02-16 | 1 | -1/+1
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: fix whitespace and add comment in st_DrawTex() | Brian Paul | 2016-02-16 | 1 | -3/+3
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: used _mesa_num_tex_faces() in st_finalize_texture() | Brian Paul | 2016-02-16 | 1 | -1/+1
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: use cso_save/restore_state() in st_cb_texture.c | Brian Paul | 2016-02-16 | 1 | -33/+22
    This simplifies the error handling code too.

    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: use new cso_save/restore_state() functions | Brian Paul | 2016-02-16 | 4 | -101/+55
    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: use new cso_set_viewport_dims() helper | Brian Paul | 2016-02-16 | 3 | -36/+7
    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: use 'cso' local var instead of st->cso_context | Brian Paul | 2016-02-16 | 3 | -90/+89
    Just a little cleaner.

    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: consolidate quad drawing code | Brian Paul | 2016-02-16 | 5 | -238/+136
    The glClear, glBitmap and glDrawPixels code now use a new st_draw_quad() helper function.

    Reviewed-by: Jose Fonseca <[email protected]>
* st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap | Brian Paul | 2016-02-16 | 5 | -161/+193
    Define a new st_util_vertex structure which is a bit smaller (9 floats versus the previous 12 floats per vertex). Clean up the glClear, glDrawPixels and glBitmap code that sets up the vertex data and does the drawing so it's all very similar. This can lead to more consolidation.

    v2: add assertion that vertex buffer slot == 0 to catch possible future change in cso_get_aux_vertex_buffer_slot() behavior.

    Reviewed-by: Jose Fonseca <[email protected]>
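One way to arrive at 9 floats per vertex is a three-component position, a four-component color and a two-component texture coordinate. The layout below is an assumption made for illustration; the commit message gives only the float counts, and `struct util_vertex_sketch` is a hypothetical name, not the real st_util_vertex definition.

```c
/* Assumed 9-float layout: 3 (position) + 4 (color) + 2 (texcoord). */
struct util_vertex_sketch {
    float x, y, z;      /* position */
    float r, g, b, a;   /* color */
    float s, t;         /* texture coordinate */
};

/* Bytes needed for one quad (four vertices) with this layout. */
static unsigned long quad_bytes(void)
{
    return 4ul * sizeof(struct util_vertex_sketch);
}
```

With 4-byte floats a quad takes 144 bytes in this layout, versus 192 bytes at 12 floats per vertex.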
* st/mesa: include u_draw.h, not u_draw_quad.h in st_draw.c | Brian Paul | 2016-02-16 | 1 | -1/+1
    Reviewed-by: Jose Fonseca <[email protected]>
* i965: Expose logic telling if non-msrt mcs is supported | Topi Pohjolainen | 2016-02-16 | 2 | -4/+13
    Also use the opportunity to mark inputs constant. (Context has to be given as read-write to intel_miptree_supports_non_msrt_fast_clear() to support debug output.)

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
* i965/gen9: Refactor msrt mcs initialization | Topi Pohjolainen | 2016-02-16 | 1 | -14/+22
    This will be re-used to initialize auxiliary buffers in lossless compression case.

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
* i965: Add a few assertions on lossless compression | Topi Pohjolainen | 2016-02-16 | 2 | -0/+9
    v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()).

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
* i965: Add a flag telling color resolve pass to ignore CCS_E | Topi Pohjolainen | 2016-02-16 | 3 | -2/+27
    v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()).

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>