path: root/src
Commit message / Author / Age / Files / Lines
* i965/screen: Allow OpenGLES 3.1 for gen8+Jordan Justen2015-12-161-0/+5
| | | | | | | | OpenGLES 3.1 cannot be enabled for gen 7 (Ivy Bridge, Haswell) since they are still missing ARB_stencil_texturing. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Marta Lofstedt <[email protected]>
* i965: Enable compute shaders in more cases for OpenGLES 3.1Jordan Justen2015-12-161-1/+4
| | | | | | | | Previously we were checking the desktop OpenGL ARB_compute_shader requirements, but for OpenGLES 3.1, the requirements are lower. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Marta Lofstedt <[email protected]>
* main/version: Don't require ARB_compute_shader for OpenGLES 3.1Jordan Justen2015-12-161-3/+6
| | | | | | | | | The OpenGL ARB_compute_shader extension specification requires at least 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1 only requires 128. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
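The difference between the two limits can be sketched as a simple check. This is a minimal illustration, not Mesa's actual version code: the enum, macro, and function names are hypothetical, and only the two limits (1024 and 128) come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical API selector for this sketch; not Mesa's gl_api enum. */
enum sketch_api { SKETCH_API_OPENGL, SKETCH_API_OPENGLES };

/* Minimum GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, as quoted in the commit. */
#define SKETCH_MIN_INVOCATIONS_DESKTOP 1024u /* ARB_compute_shader */
#define SKETCH_MIN_INVOCATIONS_GLES31   128u /* OpenGLES 3.1 */

/* Returns true if a driver reporting max_invocations meets the minimum
 * for the requested API. */
static bool
sketch_compute_allowed(enum sketch_api api, unsigned max_invocations)
{
   unsigned required = (api == SKETCH_API_OPENGLES)
      ? SKETCH_MIN_INVOCATIONS_GLES31
      : SKETCH_MIN_INVOCATIONS_DESKTOP;

   return max_invocations >= required;
}
```

Under these assumptions, a driver that reports 128 invocations would pass the OpenGLES 3.1 check while still failing the desktop one.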
* main: Allow compute shaders to be compiled with OpenGLES 3.1Jordan Justen2015-12-161-1/+1
| | | | | | | | | Previous OpenGLES 3.1 testing had been done when ARB_compute_shader was overridden to enabled. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Marta Lofstedt <[email protected]>
* main: Add MESA_VERBOSE=api for LinkProgram & UseProgramJordan Justen2015-12-161-0/+5
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* ir_to_mesa: Skip useless comparison instructions.Matt Turner2015-12-161-1/+7
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Remove inverse() from GLSL 1.20 and 1.30.Kenneth Graunke2015-12-161-3/+9
| | | | | | | | | | I apparently regressed this when rewriting the built-ins using ir_builder, in 76d2f73643f. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93387 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nv50: free memory allocated by the prog which reads MP perf countersSamuel Pitoiset2015-12-161-0/+5
| | | | | | | | | This fixes a memory leak introduced in 6a9c151 ("nv50: add compute-related MP perf counters on G84+") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.1" <[email protected]>
* st/osmesa: add OSMesaCreateContextAttribs() functionBrian Paul2015-12-161-3/+93
| | | | | | As with the previous commit, except for gallium. Reviewed-by: Jose Fonseca <[email protected]>
* osmesa: add new OSMesaCreateContextAttribs functionBrian Paul2015-12-161-1/+99
| | | | | | | This allows specifying a GL profile and version so one can get a core-profile context. Reviewed-by: Jose Fonseca <[email protected]>
* svga: don't use debug code in update_state() in release buildsBrian Paul2015-12-161-0/+4
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* nv50,nvc0: free memory allocated by performance metricsSamuel Pitoiset2015-12-166-4/+22
| | | | | | | | | The destroy_query() helper was actually never called. This fixes a memory leak while monitoring performance metrics. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Cc: "11.1" <[email protected]>
* nvc0: free memory allocated by the prog which reads MP perf countersSamuel Pitoiset2015-12-161-0/+1
| | | | | | | | | This fixes a long-standing memory leak (it predates all my query-related changes). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* nvc0: fix metric-achieved_occupancy calculation on KeplerSamuel Pitoiset2015-12-161-1/+4
| | | | | | | The maximum number of resident warps per multiprocessor is 64 on Kepler instead of 48 on Fermi. Signed-off-by: Samuel Pitoiset <[email protected]>
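To see why the divisor matters, here is a hedged sketch of an occupancy calculation. It assumes the metric is the average number of resident warps per active cycle divided by the per-multiprocessor maximum, expressed as a percentage. The function and parameter names are hypothetical; only the limits 64 (Kepler) and 48 (Fermi) come from the commit message.

```c
#include <stdint.h>

/* Maximum resident warps per multiprocessor, from the commit message. */
#define SKETCH_MAX_WARPS_FERMI  48u
#define SKETCH_MAX_WARPS_KEPLER 64u

/* Achieved occupancy in percent, assuming active_warps is a counter
 * accumulated over active_cycles cycles. Returns 0 for empty input. */
static double
sketch_achieved_occupancy(uint64_t active_warps, uint64_t active_cycles,
                          unsigned max_warps)
{
   if (active_cycles == 0 || max_warps == 0)
      return 0.0;

   double avg_warps = (double)active_warps / (double)active_cycles;
   return avg_warps / (double)max_warps * 100.0;
}
```

With an average of 32 resident warps, dividing by the Fermi limit would report about 67% on hardware that is really at 50% of its Kepler capacity, which is the kind of overestimate the fix avoids.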
* st/va: remove fence handling v3Christian König2015-12-165-22/+7
| | | | | | | | | | | It's nonsense to drain the pipeline like this. v2: keep the drain for DMA-buf exports. v3: flush before the export and after compositing and add TODO comment. Signed-off-by: Christian König <[email protected]> Reviewed-by: Julien Isorce <[email protected]> Tested-by: Julien Isorce <[email protected]>
* Revert "i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals"Neil Roberts2015-12-161-6/+7
| | | | | | | | | | | | | | | | | | | | This reverts commit 839793680f99b8387bee9489733d5071c10f3ace. The patch was breaking DRI3 because driGLFormatToImageFormat does not handle MESA_FORMAT_B8G8R8X8_SRGB which ended up making it fail to create the renderbuffer and it would later crash. It's not trivial to add this format because there is no __DRI_IMAGE_FORMAT nor __DRI_IMAGE_FOURCC define for the format either. I'm not sure how difficult adding this would be and whether adding a new format would require some sort of new version for DRI. Seeing as this might take a while to fix I think it makes sense to just revert the patch in the meantime in order to avoid regressing master. It is also not handled in intel_gles3_srgb_workaround and there may be other cases where it breaks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93388 Acked-by: Jason Ekstrand <[email protected]>
* i965: Fix crash when calling glViewport with no surface boundNeil Roberts2015-12-161-2/+4
| | | | | | | | | | | | | If EGL_KHR_surfaceless_context is used then glViewport can be called with NULL for the draw and read surfaces. This was previously causing a crash because the i965 driver tries to use this point to invalidate the surfaces and it was dereferencing the NULL pointer. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93257 Cc: Nanley Chery <[email protected]> Cc: "11.1" <[email protected]> Tested-by: Nanley Chery <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* mesa/blit: Don't require the same format for multisample blitsNeil Roberts2015-12-161-2/+11
| | | | | | | | | | | | | | | | | | | | | | | Previously the GL spec required that whenever glBlitFramebuffer is used with either buffer being multisampled, the internal formats must match. However the GL 4.4 spec was later changed to remove this restriction. In the section entitled “Changes in the released Specification of July 22, 2013” it says: “Relax BlitFramebuffer in section 18.3.1 so that format conversion can take place during multisample blits, since drivers already allow this and some apps depend on it.” If most drivers already allowed this in earlier versions I think it's safe to assume that this is a spec bug and it should also be allowed in all versions. This patch just removes the restriction on desktop GL. For GLES there are conformance tests that assert the previous behaviour so it is probably safer to leave it in. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92706 Reviewed-by: Ian Romanick <[email protected]>
* st/va: retrieve size from the temporary img variableJulien Isorce2015-12-161-1/+1
| | | | | | | | "image" is not ready yet since it will be set at the end of the function by: *image = *img; Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Christian König <[email protected]>
* draw: handle edge flags in llvm pathRoland Scheidegger2015-12-162-26/+61
| | | | | | | | | | | We just ignored them altogether. While this feature is rather old-fashioned, supporting it is actually rather trivial. This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag and all (7) of point-vertex-id). v2: comment fixes, and make the use of the edgeflag in clipmask consistent with when it's actually there (should be impossible to hit a case where the difference would actually matter but still...) Reviewed-by: Brian Paul <[email protected]>
* draw: don't set start_instance and instance id for pt emitRoland Scheidegger2015-12-161-31/+31
| | | | | | | | | | | | This just adds confusion, these parameters are used when fetching vertices by translate, but certainly not when emitting hw vertices for drivers, they make no sense there (setting them has no consequences otherwise since there won't be any elements with instance_divisor set). So just set them to 0 (the draw_pipe_vbuf code for emitting vertices when the draw pipeline is run already does exactly that). Also while here do some whitespace cleanup. Reviewed-by: Brian Paul <[email protected]>
* nir/lower_system_values: Refactor and use the builder.Jason Ekstrand2015-12-151-29/+31
| | | | | | | | | Now that we have a helper in the builder for system values and a helper in core NIR to get the intrinsic opcode, there's really no point in having things split out into a helper function. This commit "modernizes" this pass to use helpers better and look more like newer passes. Reviewed-by: Eric Anholt <[email protected]>
* nir/builder: Add a load_system_value helperJason Ekstrand2015-12-152-10/+15
| | | | | | While we're at it, go ahead and make nir_lower_clip use it. Reviewed-by: Eric Anholt <[email protected]>
* nir/lower_system_values: Stop supporting non-SSAJason Ekstrand2015-12-151-8/+6
| | | | | | The one user of this (i965) only ever calls it while in SSA form. Reviewed-by: Eric Anholt <[email protected]>
* nvc0: remove old comment related to metric calculationsSamuel Pitoiset2015-12-151-11/+0
| | | | | | I forgot to remove it when I refactored all performance metrics. Signed-off-by: Samuel Pitoiset <[email protected]>
* vc4: Add support for dumping executed commands to a file.Eric Anholt2015-12-153-0/+94
| | | | | | | | | | The VC4_DEBUG=cl,qpu is nice and all, but I want to be able to get more detailed dumps, and to replay the same exact commands in simulation. For that I need a dump with all of the VBOs, shaders, shader recs, etc. This dump can be parsed by vc4-gpu-tools. For now this is only doable from simulator mode, because otherwise we don't have access to the RCL contents generated by the kernel.
* vc4: Import updated vc4_drm.h with hang state.Eric Anholt2015-12-151-0/+45
* vc4: Only update vc4->msaa when the framebuffer changes.Eric Anholt2015-12-151-7/+0
| | | | | | Any update here should have been the same as in vc4_set_framebuffer_state(), except for the point where vc4_blit.c temporarily sets different state for its different buffers.
* vc4: Don't consider nr_samples==1 surfaces to be MSAA.Eric Anholt2015-12-156-21/+25
| | | | | | This is apparently a weirdness of gallium -- nr_samples==1 is occasionally used and means the same thing as nr_samples==0. Fixes a bunch of ARB_framebuffer_srgb blit cases in piglit.
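The convention described here can be sketched as a one-line predicate. This is only an illustration under the assumption stated in the commit message (a sample count of 1 means the same as 0); the function name is hypothetical and is not taken from the vc4 driver.

```c
#include <stdbool.h>

/* Treat nr_samples == 0 and nr_samples == 1 identically: both describe a
 * single-sampled surface. Only counts above 1 are considered MSAA. */
static bool
sketch_surface_is_msaa(unsigned nr_samples)
{
   return nr_samples > 1;
}
```

A test such as `nr_samples != 0` would misclassify the single-sample case, which is the pitfall the commit describes.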
* vc4: Fix min() wrapper definition for the simulator's kernel code.Eric Anholt2015-12-151-1/+1
* vc4: Warn instead of abort()ing on exec ioctl failures.Eric Anholt2015-12-151-3/+5
| | | | | | | | It's really harsh to abort() the X Server because of a momentary failure (particularly -ENOMEM). I don't see a way to pass an -ENOMEM up the stack from here, but we can at least log to stderr before proceeding on. Cc: "11.1" <[email protected]>
* radeonsi: fix perfcounter selection for SI_PC_MULTI_BLOCK layoutsNicolai Hähnle2015-12-151-1/+1
| | | | | | The incorrectly computed register count caused lockups. Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: remove unnecessary test in r600_pc_query_add_resultNicolai Hähnle2015-12-151-3/+0
| | | | | | | This test is a left-over of the initial development. It is unneeded and misleading, so let's get rid of it. Reviewed-by: Edward O'Callaghan <[email protected]>
* mesa/main: use BITSET_FOREACH_SET in perf_monitor_result_sizeNicolai Hähnle2015-12-151-4/+3
| | | | | | | This should make the code both faster and slightly clearer. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* freedreno/a4xx: fix fragcoord.z + fragdepthRob Clark2015-12-152-5/+5
| | | | | | | | | | | | | It seems like disabling earlyz on a4xx also, by default, disables fragcoord.z to the FS. For frag shaders that both read fragcoord(.z) and write fragdepth, we need to set some extra bits to prevent a lockup. This lets us get rid of the hack of disabling fragcoord.z (which prevented 0ad from lockups, but resulted in rendering corruption). Also fixes fbo-depth-sample-compare. Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2015-12-156-92/+231
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/cmdline: don't dump nir by defaultRob Clark2015-12-151-3/+1
| | | | | | By default we only want the disasm dumped, which we get anyways. Signed-off-by: Rob Clark <[email protected]>
* st/va: remove nonsense HEVC picture id handlingChristian König2015-12-151-5/+0
| | | | | | | The picture id in this case is a VA-API surface handle, checking for a certain value can't be correct. Signed-off-by: Christian König <[email protected]>
* i965: Allocate URB space for HS and DS stages when required.Chris Forbes2015-12-153-36/+170
| | | | | | | | | | | | | | | | | | | | v2: (by Ken, incorporating feedback from Matt Turner): - Rewrite the push constant allocation code to be clearer. - Only apply the minimum VS entries workaround on Gen 8. v3: (by Ken) - Fix a bug in v2 where we failed to allocate the full push constant space when the number of enabled stages didn't divide the available push constant space evenly. (Any left over space is now allocated to the PS, as it was in v1.) - Fix an off-by-one error in v2's number of enabled stages calculation. - Use DIV_ROUND_UP for nicer formatting. - Line wrapping fixes. Signed-off-by: Chris Forbes <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
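The v3 note about leftover space can be illustrated with a small sketch: split the space evenly among enabled stages and give the remainder to the PS. This is not the i965 code. The struct, the function, the abstract "chunk" unit, and the assumption that VS and PS are always enabled are all hypothetical choices made for the example.

```c
#include <stdbool.h>

/* Per-stage push constant sizes in abstract "chunks" (hypothetical unit). */
struct sketch_push_sizes {
   unsigned vs, hs, ds, gs, ps;
};

/* Divide total_chunks evenly among the enabled stages; whatever the integer
 * division leaves over goes to the PS so the full space is always used.
 * Assumption for this sketch: VS and PS are always enabled. */
static struct sketch_push_sizes
sketch_split_push_constants(unsigned total_chunks,
                            bool hs_enabled, bool ds_enabled, bool gs_enabled)
{
   unsigned stages = 2 + (hs_enabled ? 1 : 0) + (ds_enabled ? 1 : 0) +
                     (gs_enabled ? 1 : 0);
   unsigned per_stage = total_chunks / stages;
   struct sketch_push_sizes sizes;

   sizes.vs = per_stage;
   sizes.hs = hs_enabled ? per_stage : 0;
   sizes.ds = ds_enabled ? per_stage : 0;
   sizes.gs = gs_enabled ? per_stage : 0;
   /* PS takes its share plus any remainder. */
   sizes.ps = total_chunks - per_stage * (stages - 1);

   return sizes;
}
```

For example, 32 chunks across three stages gives 10 each to the first two and 12 to the PS, so the sizes still sum to 32.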
* glsl: add support for explicit locations inside interface blocksTimothy Arceri2015-12-154-9/+105
| | | | | | | This change also adds explicit location support for structs and interfaces which is currently missing in Mesa but is allowed with SSO and GLSL 1.50+. Reviewed-by: Edward O'Callaghan <[email protected]>
* glsl: simplify interface matchingTimothy Arceri2015-12-151-108/+46
| | | | | | | | | | | | This makes the code easier to follow, should be more efficient and will make it easier to add matching via explicit locations in the following patch. This patch also replaces the hash table with the newer resizable hash table; this should be more suitable as the table is likely to only contain a small number of entries. Reviewed-by: Edward O'Callaghan <[email protected]>
* draw: remove clip_vertex from vertex headerRoland Scheidegger2015-12-155-40/+54
| | | | | | | | | | | | | | | | | | vertex header had both clip_pos and clip_vertex. We only really need one (clip_pos) because the draw llvm shader would overwrite the position output from the vs with the viewport transformed. However, we don't really need the second one, which was only really used for gl_ClipVertex - if the shader didn't have that the values were just duplicated to both clip_pos and clip_vertex. So, just use this from the vs output instead when we actually need it. Also change clip debug to output both the data from clip_pos and the clipVertex output (if available). Makes some things more complex, some things less complex, but it seems easier to understand what clipping actually does (and what values it uses to do its magic). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: use clip_pos, not clip_vertex for the fake guardband xy point clippingRoland Scheidegger2015-12-151-3/+3
| | | | | | | | Seems obvious now this should use the data from position and not clip_vertex (albeit might not really make a difference). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: rename vertex header membersRoland Scheidegger2015-12-156-42/+46
| | | | | | | | | clip -> clip_vertex and pre_clip_pos -> clip_pos. Looks more obvious to me what these values actually represent (so use something resembling the vs output names). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: don't pretend have_clipdist is per-vertexRoland Scheidegger2015-12-155-18/+20
| | | | | | | | | | This is just for code cleanup, conceptually the have_clipdist really isn't per-vertex state, so don't put it there (just dependent on the shader). Even though there wasn't really any overhead associated with this, we shouldn't store random shader information in the vertex header. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: use position not clipVertex output for xyz view volume clippingRoland Scheidegger2015-12-151-1/+10
| | | | | | | | | | | I'm pretty sure this should use position (i.e. pre_clip_pos) and not the output from clipVertex. Albeit piglit doesn't care. It is what we use in the clip test, and it is what every other driver does (as they don't even have clipVertex output and lower the additional planes to clip distances). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* i965: Use DIV_ROUND_UP() in gen7_urb.c code.Kenneth Graunke2015-12-141-9/+8
| | | | | | | This is a newer convention, which we prefer over ALIGN(x, n) / n. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
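The two idioms mentioned above can be compared side by side. The macros below are local stand-ins written for this sketch (prefixed SKETCH_ to make clear they are not Mesa's own definitions), and the align macro assumes n is a power of two.

```c
/* Round-up integer division: how many units of size b cover a. */
#define SKETCH_DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Round x up to a multiple of n; only valid when n is a power of two. */
#define SKETCH_ALIGN(x, n) (((x) + (n) - 1) & ~((n) - 1))

/* Older idiom: align first, then divide. */
static unsigned
sketch_units_align_then_divide(unsigned size, unsigned n)
{
   return SKETCH_ALIGN(size, n) / n;
}

/* Newer idiom: a single round-up division. */
static unsigned
sketch_units_div_round_up(unsigned size, unsigned n)
{
   return SKETCH_DIV_ROUND_UP(size, n);
}
```

For power-of-two n both forms give the same count; the round-up division also works for other divisors, where the mask-based align would not.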
* i965: Make TES inputs match TCS outputs.Kenneth Graunke2015-12-141-0/+11
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Force VS -> TCS varyings to use the SSO VUE map layout.Kenneth Graunke2015-12-142-2/+5
| | | | | | | | | | | | The compact VUE map only works when varying packing is in use. Unfortunately, varying packing is disabled for TCS inputs. This is needed to fix Piglit's tcs-input-read-array-interface test. v2: Make lines fit in 80 columns (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Handle TCS outputs and TES inputs.Kenneth Graunke2015-12-141-2/+112
| | | | | | | | | | | | | | | | | | TCS outputs and TES inputs both refer to a common "patch URB entry" shared across all invocations. First, there are some number of per-patch entries. Then, there are per-vertex entries accessed via an offset for the variable and a stride times the vertex index. Because these calculations need to be done in both the vec4 and scalar backends, it's simpler to just compute the offset calculations in NIR. It doesn't necessarily make much sense to use per-vertex intrinsics afterwards, but that at least means we don't lose the per-patch vs. per-vertex information. v2: Use is_input/is_output helpers (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
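The layout described above (per-patch entries first, then per-vertex entries addressed by a variable offset plus stride times vertex index) can be sketched as follows. This is a hedged illustration, not the NIR lowering itself: the function and parameter names are hypothetical, the unit is an abstract "slot", and the sketch assumes the variable's offset is relative to the start of one vertex's block.

```c
/* Slot of a per-patch variable: these come first in the patch URB entry. */
static unsigned
sketch_per_patch_slot(unsigned var_offset)
{
   return var_offset;
}

/* Slot of a per-vertex variable: skip the per-patch entries, then step to
 * the requested vertex using the stride, then add the variable's offset
 * within that vertex's block. */
static unsigned
sketch_per_vertex_slot(unsigned per_patch_slots, unsigned per_vertex_stride,
                       unsigned vertex_index, unsigned var_offset)
{
   return per_patch_slots + vertex_index * per_vertex_stride + var_offset;
}
```

With 2 per-patch slots and a stride of 4, the variable at offset 1 of vertex 3 would land in slot 2 + 3 * 4 + 1 = 15 under these assumptions.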