path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: Write a scalar TCS backend that runs in SINGLE_PATCH mode. | Kenneth Graunke | 2016-05-03 | 7 | -15/+510
| Unlike most shader stages, the Hull Shader hardware makes us explicitly
| tell it how many threads to dispatch and manually configure the channel
| mask. One perk of this is that we have a lot of flexibility - we can run
| it in either SIMD4x2 or SIMD8 mode.
|
| Treating it as SIMD8 means that shaders with 8 or fewer output vertices
| (which is overwhelmingly the common case) can be handled by a single
| thread. This has several intriguing properties:
|
| - Accessing input arrays with gl_InvocationID as the index is a simple
|   SIMD8 URB read with g1 as the header. No indirect addressing required.
| - Barriers are no-ops.
| - We could potentially do output shadowing to combine writes, as the
|   concurrency concerns are gone. (We don't do this yet, though.)
|
| v2: Drop first_non_payload_grf change, as it was always adding 0
|     (caught by Jordan Justen).
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
* i965: Rework the TCS passthrough shader to use NIR. | Kenneth Graunke | 2016-05-03 | 3 | -56/+85
| I'm about to implement a scalar TCS backend, and I'd rather not
| duplicate all of this code there.
|
| One change is that we now write the tessellation levels from all TCS
| threads, rather than just the first. This is pretty harmless, and was
| easier. The IF/ENDIF needed for that are gone; otherwise the generated
| code is basically identical.
|
| I chose to emit load/store intrinsics directly because it was easier.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
* mesa/objectlabel: handle NULL src string | Mark Janes | 2016-05-03 | 1 | -3/+4
| This prevents a crash when a NULL src is passed with a non-NULL length.
|
| fixes: dEQP-GLES31.functional.debug.object_labels.query_length_only
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95252
| Signed-off-by: Mark Janes <[email protected]>
| Reviewed-by: Dave Airlie <[email protected]>
* swrast: Add texfetch_funcs entries for astc 3d formats | Anuj Phogat | 2016-05-03 | 1 | -1/+22
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Enable translation between astc 3d gl formats and mesa formats | Anuj Phogat | 2016-05-03 | 1 | -0/+80
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Handle astc 3d formats in _mesa_get_compressed_formats() | Anuj Phogat | 2016-05-03 | 1 | -0/+29
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Handle astc 3d formats in _mesa_base_tex_format() | Anuj Phogat | 2016-05-03 | 1 | -2/+4
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Account for astc 3d formats in _mesa_is_astc_format() | Anuj Phogat | 2016-05-03 | 1 | -3/+13
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Add a helper function is_astc_3d_format() | Anuj Phogat | 2016-05-03 | 1 | -0/+32
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Add the missing defines for GL_OES_texture_compression_astc | Anuj Phogat | 2016-05-03 | 1 | -0/+23
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Align the values of #define's in glheader.h | Anuj Phogat | 2016-05-03 | 1 | -29/+29
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Add OES_texture_compression_astc to extension table and gl_extensions | Anuj Phogat | 2016-05-03 | 2 | -0/+2
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Add entries for astc 3d formats initializing struct gl_format_info | Anuj Phogat | 2016-05-03 | 1 | -0/+21
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Add mesa formats for astc 3d formats | Anuj Phogat | 2016-05-03 | 1 | -0/+21
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Account for block depth in _mesa_format_image_size() | Anuj Phogat | 2016-05-03 | 1 | -21/+23
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
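As a rough illustration of what accounting for block depth means when sizing a compressed image, here is a minimal sketch. It is not the code from `_mesa_format_image_size()`; the names `div_round_up` and `compressed_image_size` and all parameters are hypothetical. It assumes every dimension is rounded up to a whole number of blocks and that each block occupies a fixed number of bytes (16 for ASTC).

```c
#include <stdint.h>

/* Integer division that rounds up; d must be non-zero. */
static uint32_t
div_round_up(uint32_t n, uint32_t d)
{
   return (n + d - 1) / d;
}

/* Hypothetical sketch, not Mesa's implementation: bytes needed for a
 * block-compressed image whose blocks cover bw x bh x bd texels.
 * A 2D block format is simply the bd == 1 case. */
static uint64_t
compressed_image_size(uint32_t width, uint32_t height, uint32_t depth,
                      uint32_t bw, uint32_t bh, uint32_t bd,
                      uint32_t bytes_per_block)
{
   const uint64_t wblocks = div_round_up(width, bw);
   const uint64_t hblocks = div_round_up(height, bh);
   const uint64_t dblocks = div_round_up(depth, bd);

   return wblocks * hblocks * dblocks * bytes_per_block;
}
```

With 4x4x4 blocks of 16 bytes, an 8x8x8 image needs 2 * 2 * 2 blocks; ignoring the block depth would count 8 layers of blocks instead of 2.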
* mesa: Handle 3d block sizes in _mesa_compute_compressed_pixelstore | Anuj Phogat | 2016-05-03 | 1 | -3/+3
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Handle 3d block sizes in teximage error checks | Anuj Phogat | 2016-05-03 | 1 | -6/+13
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Handle 3d block sizes in getteximage error checks | Anuj Phogat | 2016-05-03 | 1 | -4/+17
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Add an assert for BlockDepth in _mesa_get_format_block_size() | Anuj Phogat | 2016-05-03 | 1 | -0/+3
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Add a helper function to query 3D block sizes | Anuj Phogat | 2016-05-03 | 2 | -0/+25
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa: Add block depth field in struct gl_format_info | Anuj Phogat | 2016-05-03 | 4 | -274/+279
| This will be later required for 3D ASTC formats.
|
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* mesa/copyimage: make sure number of samples match. | Dave Airlie | 2016-05-03 | 1 | -0/+14
| This fixes GL43-CTS.copy_image.samples_missmatch
| which otherwise asserts in the radeonsi driver.
|
| Reviewed-by: Anuj Phogat <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* mesa/objectlabel: don't do memcpy if bufSize is 0 (v2) | Dave Airlie | 2016-05-03 | 1 | -0/+5
| This prevents GL43-CTS.khr_debug.labels_non_debug from
| memcpying all over the stack and crashing.
|
| v2: actually fix the test.
|
| Reviewed-by: Alejandro Piñeiro <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
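The two objectlabel fixes in this log (a NULL source label queried with a non-NULL length pointer, and a bufSize of 0) both come down to guarding a copy. The following is a minimal sketch of such guards, not Mesa's actual code: `query_label` and its parameters are hypothetical, and the choice to report the copied length when a buffer is supplied and the full length otherwise is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a label query helper; not Mesa's code.
 *   src      stored label, NULL when no label was ever set
 *   dst      caller buffer, NULL when only the length is wanted
 *   length   optional out-parameter receiving a label length
 *   buf_size size of dst in bytes, including the terminating NUL */
static void
query_label(const char *src, char *dst, int *length, int buf_size)
{
   int label_len = (src != NULL) ? (int) strlen(src) : 0;

   /* Touch dst only when there is room for at least the NUL byte. */
   if (dst != NULL && buf_size > 0) {
      int n = label_len;
      if (n > buf_size - 1)
         n = buf_size - 1;
      if (n > 0)
         memcpy(dst, src, (size_t) n);   /* n > 0 implies src != NULL */
      dst[n] = '\0';
      label_len = n;
   }

   if (length != NULL)
      *length = label_len;
}
```

Without the `buf_size > 0` guard, a size of 0 would make `buf_size - 1` negative, and converting that to `size_t` yields a huge copy length, which matches the "memcpying all over the stack" symptom described above.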
* mesa/textureview: move error checks up higher | Dave Airlie | 2016-05-03 | 1 | -24/+26
| GL43-CTS.texture_view.errors checks for GL_INVALID_VALUE here
| but we catch these problems in the dimensionsOK check and return
| the wrong error value.
|
| This fixes:
| GL43-CTS.texture_view.errors.
|
| Reviewed-by: Anuj Phogat <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: fix blit-based GetTexImage for non-finalized textures | Marek Olšák | 2016-05-02 | 1 | -1/+2
| This fixes getteximage-depth piglit failures on radeonsi.
|
| Cc: 11.1 11.2 <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
* vbo: avoid leaking prim on vbo bind failure | Ilia Mirkin | 2016-05-01 | 1 | -1/+3
| Spotted by Coverity
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Vinson Lee <[email protected]>
* mesa: add LOCATION_COMPONENT support to GetProgramResourceiv | Timothy Arceri | 2016-05-01 | 2 | -0/+15
| From Section 7.3.1.1 (Naming Active Resources) of the OpenGL 4.5 spec:
|
|    "For the property LOCATION_COMPONENT, a single integer indicating the
|    first component of the location assigned to an active input or output
|    variable is written to params. For input and output variables with a
|    component specified by a layout qualifier, the specified component is
|    written. For all other input and output variables, the value zero is
|    written."
|
| Reviewed-by: Anuj Phogat <[email protected]>
* glShaderSource must not change compile status. | Jamey Sharp | 2016-05-01 | 1 | -1/+0
| OpenGL 4.5 Core Profile section 7.1, in the documentation for
| CompileShader, says: "Changing the source code of a shader object with
| ShaderSource does not change its compile status or the compiled shader
| code."
|
| According to Karol Herbst, the game "Divinity: Original Sin - Enhanced
| Edition" depends on this odd quirk of the spec.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551
| Signed-off-by: Jamey Sharp <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* i965: don't forget to ship brw_nir_trig_workarounds.py | Emil Velikov | 2016-05-01 | 1 | -0/+3
| Otherwise we won't be able to regenerate the source file(s).
|
| Signed-off-by: Emil Velikov <[email protected]>
* glx: Refactor the configure options for glx implementation choice (v3) | Chuck Atkins | 2016-05-01 | 1 | -1/+1
| Instead of cascading support for various different implementations of
| GLX, all three options are now specified through the --enable-glx option:
|
|   --enable-glx=dri          : Enable the DRI-based GLX
|   --enable-glx=xlib         : Enable the classic Xlib-based GLX
|   --enable-glx=gallium-xlib : Enable the gallium Xlib-based GLX
|   --enable-glx[=yes]        : Defaults to dri if DRI is enabled, else
|                               gallium-xlib if gallium is enabled, else xlib
|
| This removes the --enable-xlib-glx option and fixes a bug in which both
| the classic xlib-glx and gallium xlib-glx implementations were getting
| built causing different versioned and conflicting libGL libraries to be
| installed.
|
| v2: Changes from various review feedback from Emil:
|     a) Fixed typos
|     b) Corrected help docs for new option
|     c) Added appropriate a-b and r-b tags in commit msg
|     d) Fixed various GLX related dependency checks.
|
| v3: Rebased to current master and added changelog in commit msg
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94086
| Acked-by: Brian Paul <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
* st/glsl_to_tgsi: fix potential crash when allocating temporaries | Samuel Pitoiset | 2016-04-30 | 1 | -1/+1
| When index - t->temps_size is greater than 4096, allocating space on
| demand for temporaries will crash. This can happen when a game uses a
| lot of temporaries, like the recently released Tomb Raider.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Cc: "11.1 11.2" <[email protected]>
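The failure mode described above is typical of arrays that grow on demand by a fixed increment: if the requested index lies more than one increment past the current size, a single growth step is not enough. The sketch below shows one way to size the growth from the index. It is an assumption about the general shape of such a fix, not the one-line change this commit made; `struct temp_pool` and `ensure_temp` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pool of temporaries, grown on demand. */
struct temp_pool {
   int *temps;
   size_t size;   /* number of allocated entries */
};

/* Make sure p->temps[index] is valid. Grows in multiples of 4096
 * entries, but always by enough to cover index. Returns 1 on
 * success and 0 if the allocation fails. */
static int
ensure_temp(struct temp_pool *p, size_t index)
{
   const size_t chunk = 4096;

   if (index < p->size)
      return 1;

   /* Round the shortfall up to a whole number of chunks. */
   const size_t needed = index - p->size + 1;
   const size_t inc = ((needed + chunk - 1) / chunk) * chunk;

   int *grown = realloc(p->temps, (p->size + inc) * sizeof(*grown));
   if (grown == NULL)
      return 0;

   memset(grown + p->size, 0, inc * sizeof(*grown));
   p->temps = grown;
   p->size += inc;
   return 1;
}
```

Growing by a fixed 4096 entries regardless of `index` would leave `index` out of bounds whenever the shortfall exceeds one chunk, which is the situation the commit message describes.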
* mesa: simplify _mesa_Lightfv | Thomas Faller | 2016-04-29 | 1 | -10/+0
| Signed-off-by: Thomas Faller <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor | Nicolai Hähnle | 2016-04-29 | 1 | -4/+16
| In optimized builds, visit(ir_expression *) experiences inlining with gcc
| that leads the function to have a roughly 32KB stack frame. This is a
| problem given that the function is called recursively. In non-optimized
| builds, the stack frame is much smaller, hence one gets crashes that
| happen only in optimized builds.
|
| Arguably there is a compiler bug or at least severe misfeature here. In
| any case, the easy thing to do for now seems to be moving the bulk of the
| non-recursive code into a separate function. This is sufficient to
| convince my version of gcc not to blow up the stack frame of the
| recursive part. Just to be sure, add the gcc-specific noinline attribute
| to prevent this bug from recurring if inliner heuristics change.
|
| v2: put ATTRIBUTE_NOINLINE into macros.h
|
| Cc: "11.1 11.2" <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95133
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95026
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92850
| Reviewed-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
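The technique described above, keeping the recursive function small by moving the bulky non-recursive work into a separate function that the compiler may not inline, can be sketched on a toy expression tree. Everything here is hypothetical: `SKETCH_NOINLINE` only plays the role that the commit assigns to `ATTRIBUTE_NOINLINE`, and `struct expr`, `combine` and `eval` are made up for the illustration.

```c
#include <stddef.h>

/* gcc-style noinline attribute, with an empty fallback elsewhere. */
#if defined(__GNUC__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

/* Toy expression node: a leaf when left == NULL, else a binary op. */
struct expr {
   int op;                    /* '+' or '*' for inner nodes */
   int value;                 /* used by leaves only */
   const struct expr *left;
   const struct expr *right;
};

/* Non-recursive bulk of the work. The large scratch array stands in
 * for the many locals that inflated the real stack frame. Because
 * this function is never inlined, those bytes are only on the stack
 * while it runs, not in every frame of the recursion. */
static SKETCH_NOINLINE int
combine(int op, int lhs, int rhs)
{
   volatile char scratch[4096];

   scratch[0] = (char) op;
   switch (scratch[0]) {
   case '+': return lhs + rhs;
   case '*': return lhs * rhs;
   default:  return 0;
   }
}

/* Recursive part: its frame holds little more than two integers. */
static int
eval(const struct expr *e)
{
   if (e->left == NULL)
      return e->value;

   int lhs = eval(e->left);
   int rhs = eval(e->right);
   return combine(e->op, lhs, rhs);
}
```

The per-level stack cost of `eval` stays small no matter how deep the tree is, because `combine` has already returned before the recursion unwinds to the next level.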
* mesa: Fix indirect draw buffer size check on 32-bit systems. | Kenneth Graunke | 2016-04-28 | 1 | -1/+1
| Fixes dEQP-GLES31.functional subtests:
| draw_indirect.negative.command_offset_not_in_buffer_signed32_wrap
| draw_indirect.negative.command_offset_not_in_buffer_unsigned32_wrap
|
| These tests use really large values that overflow GLsizeiptr, at
| which point the buffer size isn't less than "end".
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95138
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Alejandro Piñeiro <[email protected]>
| Reviewed-by: Mark Janes <[email protected]>
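To illustrate the wrap-around described above, here is a minimal sketch of a bounds check whose end offset is computed in 64 bits, so it cannot wrap when pointers are 32 bits wide. This is not Mesa's validation code; `command_fits` and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does a command of `size` bytes starting at
 * `offset` lie entirely inside a buffer of `buffer_size` bytes?
 * The sum is formed in 64 bits, so a 32-bit offset near the top of
 * the address range cannot wrap around to a small value. */
static bool
command_fits(uint64_t buffer_size, uint32_t offset, uint32_t size)
{
   const uint64_t end = (uint64_t) offset + size;

   return end <= buffer_size;
}
```

With a 32-bit sum, an offset of 0xFFFFFFF0 plus a size of 32 wraps to 0x10, which compares as inside a 1024-byte buffer; the 64-bit sum gives 0x100000010 and the check correctly fails.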
* nir: Switch the arguments to nir_foreach_use and friends | Jason Ekstrand | 2016-04-28 | 1 | -1/+1
| This matches the "foreach x in container" pattern found in many other
| programming languages. Generated by the following regular expression:
|
| s/nir_foreach_use(\([^,]*\),\s*\([^,]*\))/nir_foreach_use(\2, \1)/
|
| and similar expressions for nir_foreach_use_safe, etc.
|
| Reviewed-by: Ian Romanick <[email protected]>
* nir: Switch the arguments to nir_foreach_function | Jason Ekstrand | 2016-04-28 | 6 | -13/+13
| This matches the "foreach x in container" pattern found in many other
| programming languages. Generated by the following regular expression:
|
| s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/
|
| Reviewed-by: Ian Romanick <[email protected]>
* nir: Switch the arguments to nir_foreach_instr | Jason Ekstrand | 2016-04-28 | 6 | -11/+11
| This matches the "foreach x in container" pattern found in many other
| programming languages. Generated by the following regular expression:
|
| s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/
|
| and similar expressions for nir_foreach_instr_safe etc.
|
| Reviewed-by: Ian Romanick <[email protected]>
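The "foreach x in container" argument order that these three commits adopt can be shown with a toy iteration macro. This is only an illustration of the ordering, assuming a made-up singly linked list; `sketch_foreach_node`, `struct node` and `sum_list` are hypothetical and unrelated to the real NIR macros.

```c
#include <stddef.h>

/* Toy singly linked list node. */
struct node {
   int value;
   struct node *next;
};

/* Iteration variable first, container second, mirroring
 * "for item in list" in other languages. */
#define sketch_foreach_node(item, list) \
   for (struct node *item = (list); item != NULL; item = item->next)

/* Sum every value in the list using the macro. */
static int
sum_list(struct node *head)
{
   int total = 0;

   sketch_foreach_node(n, head)
      total += n->value;

   return total;
}
```

Read aloud, `sketch_foreach_node(n, head)` is "for each node n in head", which is the readability argument the commit messages make.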
* i965/nir: fixup for new foreach_block() | Connor Abbott | 2016-04-28 | 6 | -80/+69
| Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: improve comment on _mesa_check_disallowed_mapping(), return bool | Brian Paul | 2016-04-28 | 1 | -2/+8
| The old comment was a bit terse. Also, change the function return
| type to bool.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* nir: rename lower_flrp to lower_flrp32 | Samuel Iglesias Gonsálvez | 2016-04-28 | 1 | -1/+1
| A later patch will add lower_flrp64 option to NIR.
|
| Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* vbo: Return INVALID_OPERATION during draw with a mapped buffer | Jordan Justen | 2016-04-27 | 2 | -47/+42
| Fixes the OpenGLES 3.1 CTS:
| * ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos
|
| Because this is triggering the error message after the normal API
| validation phase, we don't have the API function name available, and
| therefore we generate an error message without the draw call name:
|
| Mesa: User error: GL_INVALID_OPERATION in draw call (vertex buffers are mapped)
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95142
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* i965/blorp/gen8: Fix blitting of interleaved msaa surfaces | Topi Pohjolainen | 2016-04-27 | 1 | -2/+16
| Fixes ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample.
|
| The current logic divides the given layer (one) by the number of samples
| (four), truncating the layer to zero. Layer adjustment is only to be
| used with non-interleaved msaa surfaces, where the samples for a
| particular layer are in multiple slices.
|
| I copy-pasted a bit of documentation from
| brw_blorp.c::brw_blorp_compute_tile_offsets().
|
| Also took the opportunity to fix the comment regarding sampling as 2D;
| cube textures are the only exception.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
| Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Enable ARB_texture_stencil8 and OES_texture_stencil8 on Gen8+. | Kenneth Graunke | 2016-04-26 | 3 | -7/+2
| Stencil texturing is required by ES 3.1. Apparently we never actually
| turned it on. Do that now. Also turn on the desktop extension.
|
| Fixes nine dEQP-GLES31.functional tests:
| stencil_texturing.format.stencil_index8_2d
| texture.border_clamp.formats.stencil_index8.nearest_size_pot
| texture.border_clamp.formats.stencil_index8.nearest_size_npot
| texture.border_clamp.formats.stencil_index8.gather_size_pot
| texture.border_clamp.formats.stencil_index8.gather_size_npot
| texture.border_clamp.unused_channels.stencil_index8
| state_query.internal_format.renderbuffer.stencil_index8_samples
| state_query.internal_format.texture_2d_multisample.stencil_index8_samples
| state_query.internal_format.texture_2d_multisample_array.stencil_index8_samples
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Chris Forbes <[email protected]>
* mesa: Try to fix CopyTex[Sub]Image of stencil textures. | Kenneth Graunke | 2016-04-26 | 1 | -2/+3
| ES prohibits this, but GL appears to allow it. We at least need this
| much, or else we'll crash as there's no source to read from.
|
| This fixed crashes in the ES tests before I realized I needed to
| prohibit stencil instead.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Chris Forbes <[email protected]>
* mesa: Disallow CopyTexSubImage on stencil formats in ES. | Kenneth Graunke | 2016-04-26 | 1 | -0/+9
| Fixes
| - ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8
| - ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Chris Forbes <[email protected]>
* i965: Fix MapTextureImage for multi-slice/level stencil buffers. | Kenneth Graunke | 2016-04-26 | 1 | -2/+2
| We called intel_miptree_get_image_offset() to get the image offsets for
| the current level/slice, but then proceeded to ignore the results and
| clobber level/slice 0 every time.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94713
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Reviewed-by: Chris Forbes <[email protected]>
* i965: Move TCS output indirect_offset.file check out a level. | Kenneth Graunke | 2016-04-26 | 1 | -42/+46
| I want to add another condition. Moving the indirect_offset.file check
| out a level should make this a little easier.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/fs: Reduce the response length of sampler messages on Skylake. | Kenneth Graunke | 2016-04-26 | 4 | -5/+28
| Often, we don't need a full 4 channels worth of data from the sampler.
| For example, depth comparisons and red textures only return one value.
| To handle this, the sampler message header contains a mask which can
| be used to disable channels, and reduce the message length (in SIMD16
| mode on all hardware, and SIMD8 mode on Broadwell and later).
|
| We've never used it before, since it required setting up a message
| header. This meant trading a smaller response length for a larger
| message length and additional MOVs to set it up.
|
| However, Skylake introduces a terrific new feature: for headerless
| messages, you can simply reduce the response length, and it makes
| the implicit header contain an appropriate mask. So to read only
| RG, you would simply set the response length to 2 or 4 (SIMD8/16).
|
| This means we can finally take advantage of this at no cost.
|
| total instructions in shared programs: 9091831 -> 9073067 (-0.21%)
| instructions in affected programs: 191370 -> 172606 (-9.81%)
| helped: 2609
| HURT: 0
|
| total cycles in shared programs: 70868114 -> 68454752 (-3.41%)
| cycles in affected programs: 35841154 -> 33427792 (-6.73%)
| helped: 16357
| HURT: 8188
|
| total spills in shared programs: 3492 -> 1707 (-51.12%)
| spills in affected programs: 2749 -> 964 (-64.93%)
| helped: 74
| HURT: 0
|
| total fills in shared programs: 4266 -> 2647 (-37.95%)
| fills in affected programs: 3029 -> 1410 (-53.45%)
| helped: 74
| HURT: 0
|
| LOST: 1
| GAINED: 143
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
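The "2 or 4 (SIMD8/16)" figure for reading only RG follows from simple arithmetic, sketched below under the assumption that each returned channel occupies one register per 8 SIMD lanes. The helper name `sampler_response_length` is hypothetical and is not a function from the i965 driver.

```c
/* Hypothetical helper: registers written back by a sampler message
 * that returns `channels` components at the given SIMD width.
 * Assumes one register per channel per 8 lanes. */
static unsigned
sampler_response_length(unsigned channels, unsigned simd_width)
{
   return channels * (simd_width / 8);
}
```

Under that assumption a full RGBA result takes 4 registers in SIMD8 and 8 in SIMD16, so a shader that only needs RG halves the response, and one that needs a single value (a depth comparison or a red texture) quarters it.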
* i965/fs: Use inst->regs_written for rlen for texture instructions | Jason Ekstrand | 2016-04-26 | 2 | -9/+3
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/fs: Properly report regs_written from SAMPLEINFO | Jason Ekstrand | 2016-04-26 | 2 | -2/+9
| The previous behavior would only allocate one register and then write
| four thus potentially stomping three innocent bystanders.
|
| Cc: [email protected]
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Topi Pohjolainen <[email protected]>