path: root/src
Commit message | Author | Age | Files | Lines
* i965/nir: Use nir_system_value_from_intrinsic to reduce duplication.Kenneth Graunke2015-09-082-60/+17
| | | | | | | | This code is all pretty much identical. We just needed the translation from one enum value to the other. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add a nir_system_value_from_intrinsic() function.Kenneth Graunke2015-09-082-0/+36
| | | | | | | | This converts NIR intrinsics that load system values into Mesa's SYSTEM_VALUE_* enumerations. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Mark topologies with adjacency information as G45+.Kenneth Graunke2015-09-081-4/+4
| | | | | | | These didn't exist on the original 965. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Fix value of _3DPRIM_TRIFAN_NOSTIPPLE.Kenneth Graunke2015-09-081-1/+1
| | | | | | | | | | | | TRIFAN_NOSTIPPLE has always been 0x16 - 0x15 is marked "Reserved" on all platforms. See the 965 PRM, Volume 2, Table 3-1, "3D Primitive Topology Type Encoding" for a list. We don't currently use this, and I don't expect we will, but we may as well not leave the bogus value around. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Add 64-bit dirty flag handling to brw_upload_pull_constantsChris Forbes2015-09-082-2/+2
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add defines for all new Gen7/8 URB opcodesChris Forbes2015-09-082-10/+16
| | | | | | | Tessellation needs to emit URB reads and atomics. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen8+: Skip depth stalls on state changeBen Widawsky2015-09-081-0/+8
| | | | | | | | | | | | | Docs suggest this is no longer required starting with Gen8. Perf (no regressions in n=20): OglMultithread 0.67%, OglTerrainPanInst 0.12%, trex 0.45%, warsow 0.64%. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* r600: don't use shader key without verifying shader type (v2)Dave Airlie2015-09-091-7/+12
| | | | | | | | | | | | | | | | | Since 7a32652231f96eac14c4bfce02afe77b4132fb77 ("r600: Turn 'r600_shader_key' struct into union") we were accessing key fields that might be aliased in the union with other fields, so we should check what shader type we are compiling for before using key values from it. v1.1: make it compile. v2: have caffeine, make it work - we don't set type until later, so don't reference it until we've set it. Reviewed-by: Edward O'Callaghan <[email protected]> Cc: "11.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
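The aliasing problem described above can be sketched in a few lines of C. This is only an illustration: the enum, the union layout, and the helper `key_wants_es` are hypothetical stand-ins, not the actual r600 definitions.

```c
#include <stdbool.h>

/* Hypothetical shader stages; the names are illustrative only. */
enum shader_type { SHADER_VERTEX, SHADER_FRAGMENT };

/* Once the key is a union, the per-stage members share storage. */
union shader_key {
    struct { unsigned as_es; } vs;      /* meaningful for vertex shaders only */
    struct { unsigned nr_cbufs; } ps;   /* meaningful for fragment shaders only */
};

/* Read a vertex-only field only after confirming the stage. */
static bool key_wants_es(enum shader_type type, const union shader_key *key)
{
    if (type != SHADER_VERTEX)
        return false;   /* vs.as_es would alias ps.nr_cbufs here */
    return key->vs.as_es != 0;
}
```

Without the type check, a fragment shader key with a non-zero `nr_cbufs` would read back as a vertex key with `as_es` set.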
* i965/skl: Use more compact hiz dimensionsBen Widawsky2015-09-081-32/+32
| | | | | | | | | | | | | | | | | I meant to do this here, but it was in the wrong place: commit c1151b18f2dce7c6f238f057e9c4fa8d912ce6b5 Author: Ben Widawsky <[email protected]> Date: Wed Jun 24 20:07:54 2015 -0700 i965/skl: Use more compact hiz dimensions NOTE: Jordan did go back and look at the original mailing list post. I mailed the right thing, and pushed the wrong one. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* st/mesa: increase viewport bounds limits for GL4 hwIlia Mirkin2015-09-081-2/+7
| | | | | | | | | According to the ARB_viewport_array spec, GL4 limit is higher than the GL3 limit. Also take this opportunity to fix the GL3 limit. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: always emit a full shader colormaskIlia Mirkin2015-09-081-1/+1
| | | | | | | | | | | | | | Indications are that if the colormask indicates a single bit set on fermi, that value will always be read from $r0 instead of a potentially higher register (if e.g. green is set). Not to upset the counting logic, always set the header up with a full color mask for each RT. Such a situation can basically only ever happen with generated blit shaders. Fixes the following piglit on Fermi (Kepler is unaffected): fbo-stencil blit GL_DEPTH32F_STENCIL8 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]>
* nir: UBO loads no longer use const_index[1]Iago Toral Quiroga2015-09-081-1/+0
| | | | | | | | Commit 2126c68e5cba killed the array elements parameter on load/store intrinsics that was stored in const_index[1]. It looks like that patch failed to remove this assignment in the UBO path. Reviewed-by: Jason Ekstrand <[email protected]>
* nv30: Fix max width / height checks in nv30 sifm codeHans de Goede2015-09-071-2/+2
| | | | | | | | | | | | | The sifm object has a limit of 1024x1024 for its input size and 2048x2048 for its output. The code checking this was trying to be clever resulting in it seeing a surface of e.g 1024x256 being outside of the input size limit. This commit fixes this. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]>
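The log does not show the original expression, so the following is a guess at how a "clever" size check can go wrong. It assumes the shortcut was a bitwise OR of width and height compared against the limit. Both helper names are hypothetical.

```c
#include <stdbool.h>

/* Assumed form of the shortcut: OR-ing the dimensions can exceed the limit
 * even when each dimension on its own is within it. */
static bool fits_or_trick(unsigned w, unsigned h, unsigned max)
{
    return (w | h) <= max;
}

/* Straightforward check: compare each dimension against the limit. */
static bool fits(unsigned w, unsigned h, unsigned max)
{
    return w <= max && h <= max;
}
```

With a 1024x256 surface and a 1024 limit, `1024 | 256` is 1280, so the shortcut rejects a surface that the per-dimension check accepts.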
* i965: Disallow fast blit paths for CopyTexImage with PixelTransfer opsChris Wilson2015-09-072-0/+8
| | | | | | | | | | | | | | | | glCopyTexImage behaves similarly to glReadPixels with respect to the pixel transfer operations. Therefore if any are set we cannot use the simple blit-only fast paths. (Though it would be possible to relax the blorp path to handle pixel zoom, or we can just enhance meta.) Signed-off-by: Chris Wilson <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected]
* mesa/tests: Remove unneeded X11_CFLAGSJon TURNEY2015-09-071-1/+0
| | | | | | | | | | | | X11_CFLAGS is never defined. Path to X11 headers is not needed here, so just remove. Future work: Using AM_CFLAGS here looks wrong, as this Makefile only builds C++ files Signed-off-by: Jon TURNEY <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* glxl/tests: Use X11_INCLUDES instead of X11_CFLAGSJon TURNEY2015-09-071-1/+1
| | | | | | | | | | | X11_CFLAGS is undefined, so these tests will fail to build if x11proto is installed in a non-standard location. (See also commits 35189d76, bc93c3798, 54b028ba, d901d7e08, etc.) Signed-off-by: Jon TURNEY <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* svga: Fix surface view error handlingThomas Hellstrom2015-09-071-22/+26
| | | | | | | | | Make sure errors are correctly propagated. Also don't flush during state emission if emission fails. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* xa: add xa_surface_from_handle2 v2Rob Clark2015-09-073-11/+45
| | | | | | | | | | | | | | | | | Like xa_surface_from_handle(), but takes a handle type, rather than hard-coding 'shared' handle. This is needed to fix bugs seen with xf86-video-freedreno with xrandr rotation, for example. The root issue is that doing a GEM_OPEN ioctl on a bo that already has a GEM handle associated with the drm_file will result in two unique handles for the same bo. Which causes all sorts of follow-on fail. v2: - Add support for fd handles. - Avoid duplicating code. - Bump xa version minor. Signed-off-by: Rob Clark <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
* i965/nir/vec4: removed unneeded tex src swizzle setAlejandro Piñeiro2015-09-071-1/+0
| | | | | | At that point the swizzle should be correct. Reviewed-by: Jason Ekstrand <[email protected]>
* util: make mesa-sha1.c completely empty when there are no SHA1 implsIlia Mirkin2015-09-071-2/+2
| | | | | | | | | | | My earlier attempt to fix this missed the fact that there was a #else clause that assumes that you have OpenSSL. This moves the whole thing under #ifdef HAVE_SHA1 which should avoid this issue. Fixes: 13bfa5201 (util: always include sha1 into the build) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91898 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* util: always include sha1 into the buildIlia Mirkin2015-09-063-8/+6
| | | | | | | | | SHA1 is now used in all builds when HAVE_SHA1 is defined. Adjust src to do the same thing, rather than predicating on shader cache. Fixes: 04e201d0c02 ("mesa: change 'SHADER_SUBST' facility to work with env variables") Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: don't fall back to 16F when 32F is requestedIlia Mirkin2015-09-061-14/+8
| | | | | | | | | | | Nothing in the spec allows for the reduced precision, and this also fixes st_QuerySamplesForFormat for nv50, which does not allow MS8 on RGBA32F. Now this will be respected instead of reporting MS8 as supported with an assumption that the format used will be RGBA16F. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: properly handle u_upload_alloc failureIlia Mirkin2015-09-061-1/+1
| | | | | | | | vbuf is never null. We want to make sure that a resource was allocated for the vbuf, which is *vbuf. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
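The distinction between the out-parameter and what it points to can be sketched as follows. `alloc_resource` is a hypothetical stand-in for an allocator that signals failure by leaving the output NULL. It is not the real `u_upload_alloc` signature.

```c
#include <stdbool.h>
#include <stddef.h>

struct resource { int id; };

/* Hypothetical allocator: reports failure by storing NULL in *out. */
static void alloc_resource(bool fail, struct resource **out)
{
    static struct resource pool = { 1 };
    *out = fail ? NULL : &pool;
}

static bool upload(bool fail)
{
    struct resource *buf = NULL;
    struct resource **vbuf = &buf;   /* vbuf itself is never NULL */

    alloc_resource(fail, vbuf);

    /* Testing `vbuf` would always pass; test the resource it points to. */
    return *vbuf != NULL;
}
```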
* nouveau: don't mark full range as used on unmap with explicit flushIlia Mirkin2015-09-051-5/+7
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nv50: avoid using inline vertex data submit when gl_VertexID is usedIlia Mirkin2015-09-054-2/+14
| | | | | | | | | | | The hardware only generates vertexid when vertices come from a VBO. This fixes: vertexid-drawelements vertexid-drawarrays Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]>
* nv50: don't flush vertex arrays when index buffer changesIlia Mirkin2015-09-051-4/+0
| | | | | | | | The index buffer is fed in inline over a pushbuf. It's not related to vertices or any caching that might be done on them. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nv50: rebind bo to bufctx when invalidating idxbuf storageIlia Mirkin2015-09-051-1/+5
| | | | | | | | There is nothing to be done on a dirty idxbuf, but the bo may have changed, so we have to rebind it to the bufctx. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nv50: clear buffer status on all vertex bufs, not just the first oneIlia Mirkin2015-09-051-1/+0
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nv50: fix drawing from tfb, direct-to-pushbuf submitsIlia Mirkin2015-09-054-14/+15
| | | | | | | | | | | The stride was being set to 0, which is illegal (and also nonsensical). Also we must wait for the buffer to become available for reading as otherwise a wrong value may be prefetched. Since we must wait for the buffer anyway, and it's mapped and in GART, we may as well avoid the annoyance of the indirect pushbuf submit. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* i965: Remove base miplevel from sampler state.Ben Widawsky2015-09-043-6/+1
| | | | | | | | | | | | Gen9 changes the meaning of this to coarse LOD quality mode. Although that's a desirable thing to be setting, it doesn't match the gen8 behavior and this was unintentional. More importantly, we don't ever use this field. So instead of getting it "wrong" drop it entirely. This is a respin of a patch which only [incorrectly] tried to address gen9. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* llvmpipe: convert double to long long instead of unsigned long longOded Gabbay2015-09-041-1/+1
| | | | | | | | | | | | | | | | | round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is an undefined behavior. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle conversion from that to unsigned long long. This bug manifests itself in POWER, where all IMM values of -1 are being converted to 0 implicitly, causing a wrong LLVM IR output. Signed-off-by: Oded Gabbay <[email protected]> CC: "10.6 11.0" <[email protected]> Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
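A minimal sketch of the two-step conversion, assuming the rounded value fits in a long long. The helper name `to_imm` is hypothetical, and its `rounded` argument stands in for the result of round(val*dscale). In C, converting a negative double directly to an unsigned integer type is undefined, while converting a signed integer to an unsigned one is well-defined modular arithmetic.

```c
/* `rounded` stands in for round(val * dscale); assumed to fit in long long. */
static unsigned long long to_imm(double rounded)
{
    /* Step 1: double -> long long is defined for in-range values,
     * including negative ones. */
    long long as_signed = (long long) rounded;

    /* Step 2: signed -> unsigned wraps modulo 2^N, so -1 becomes all ones. */
    return (unsigned long long) as_signed;
}
```

With this ordering an immediate of -1 yields an all-ones bit pattern rather than whatever the platform happens to produce for the undefined direct conversion.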
* nv30: Implement color resolve for msaaHans de Goede2015-09-042-14/+8
| | | | | | | | | | | Note this is not ideal. Since the sifm can only do source sizes up to 1024x1024 we end up using the blitter on nv4x, which is not that fast. And on nv3x we end up using the cpu which is really slow. Cc: "10.6 11.0" <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv30: Fix creation of scanout buffersHans de Goede2015-09-041-0/+10
| | | | | | | | | | | | | | | | | | | | | Scanout buffers on nv30 must always be non-swizzled and have special width alignment constraints. These constraints have been taken from the xf86-video-nouveau src/nv_accel_common.c: nouveau_allocate_surface() function. nouveau_allocate_surface() applies these width constraints only when a tiled attribute is set, which it sets for all surfaces allocated via dri, and this "tiling" is not the same as swizzling, scanout surfaces must be linear / have a uniform_pitch or only complete garbage is shown. This commit fixes dri3 on nv30 showing a garbled display, with dri3 the scanout buffers are allocated by mesa, rather than by the ddx, and the wrong stride of these buffers was causing the garbled display. Cc: "10.6 11.0" <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Initialize pack field of qreg to 0 in qir_get_tempBoyan Ding2015-09-041-0/+1
| | | | | | | | | | | | | This avoids generation of undefined packing in qir and qpu instructions, fixing a lot of rendering errors. Fixes 8b36d107fdd (vc4: Pack the unorm-packing bits into a src MUL instruction when possible.) Cc: [email protected] Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
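The class of bug fixed here, a struct returned by value with one field left unset, can be sketched as below. The struct layout and the helper `get_temp` are hypothetical and do not reproduce the actual vc4 `struct qreg` or `qir_get_temp`.

```c
/* Hypothetical register descriptor, loosely modelled on the commit text. */
struct qreg {
    int file;
    unsigned index;
    int pack;
};

/* Hand out the next temporary with every field explicitly set. */
static struct qreg get_temp(unsigned *next_index)
{
    struct qreg reg;

    reg.file = 1;                 /* illustrative "temporary" register file */
    reg.index = (*next_index)++;
    reg.pack = 0;                 /* without this, pack is indeterminate */
    return reg;
}
```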
* i965: Disallow PixelTransfer operations for tiled-memcpy TexImage/ReadPixelsChris Wilson2015-09-042-0/+8
| | | | | | | | | | | | | | | The tiled memcpy fast paths perform a simple blit (with only a couple of trivial pixel conversion routines) and do not accommodate PixelTransfer operations. Therefore if any are set, fallback to the regular routines. Note that PixelTransfer only applies to TexImage and ReadPixels, not to GetTexImage. Signed-off-by: Chris Wilson <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected]
* i965/vec4: Don't unspill the same register in consecutive instructionsIago Toral Quiroga2015-09-041-8/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have spilled/unspilled a register in the current instruction, avoid emitting unspills for the same register in the same instruction or consecutive instructions following the current one as long as they keep reading the spilled register. This should allow us to avoid emitting costly unspills that come with little benefit to register allocation. v2: - Apply the same logic when evaluating spilling costs (Curro). v3: - Abstract the logic that decides if a register can be reused in a function that can be used from both spill_reg and evaluate_spill_costs (Curro). v4: - Do not disallow reusing scratch_reg in predicated reads (Curro). - Track if previous sources in the same instruction read scratch_reg (Curro). - Return prev_inst_read_scratch_reg at the end (Curro). - No need to explicitly skip scratch read/write opcodes in spill_reg (Curro). - Fix the comments explaining what happens when we hit an instruction that does not read or write scratch_reg (Curro) - Return true early when the current or previous instructions read scratch_reg with a compatible mask. v5: - Do not return true early, the loop should not be expensive anyway and this adds more complexity (Curro). Reviewed-by: Francisco Jerez <[email protected]>
* i965: Add a debug option for spilling everything in vec4 codeIago Toral Quiroga2015-09-044-5/+7
| | | | Reviewed-by: Francisco Jerez <[email protected]>
* dri/common: Tokenize driParseDebugString() argument before matching debug flags.Francisco Jerez2015-09-041-4/+13
| | | | | | | Fixes debug string parsing when one of the supported flags is a substring of another. Reviewed-by: Iago Toral Quiroga <[email protected]>
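A rough sketch of token-based matching, assuming comma or space separators. The struct, the function name `parse_debug`, and the flag names in the usage note are hypothetical and are not taken from driParseDebugString(). The point is that each whole token is compared for an exact match, so a flag that is a prefix or substring of another no longer triggers on the longer name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag table entry. */
struct debug_flag {
    const char *name;
    uint64_t bit;
};

static uint64_t parse_debug(const char *debug,
                            const struct debug_flag *flags, size_t n)
{
    uint64_t result = 0;
    const char *p = debug;

    while (*p) {
        /* Length of the current token, up to the next separator. */
        size_t len = strcspn(p, ", ");

        for (size_t i = 0; i < n; i++) {
            /* Exact match only: same length and same characters. */
            if (strlen(flags[i].name) == len &&
                strncmp(p, flags[i].name, len) == 0)
                result |= flags[i].bit;
        }

        p += len;
        p += strspn(p, ", ");   /* skip separators before the next token */
    }
    return result;
}
```

Given made-up flags "sync" and "syncobj", the string "syncobj" sets only the second flag, whereas a substring search for "sync" would have set both.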
* dri/common: Fix codestyle of driParseDebugString().Francisco Jerez2015-09-041-8/+6
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl: error out on ES 3.1 if VS or FS present but not bothTapani Pälli2015-09-041-4/+25
| | | | | Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: error on linking if no shaders are attached to programTapani Pälli2015-09-041-0/+19
| | | | | | | This applies to OpenGL Core >= 4.5 and OpenGL ES >= 3.1. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* i965: Improve disassembly of data port read messages.Kenneth Graunke2015-09-031-4/+27
| | | | | | | | We now print out the name of the message instead of its numerical value, and label the message control and surface numbers. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Optimize VUE map comparisons.Kenneth Graunke2015-09-032-4/+4
| | | | | | | | | | | | The entire VUE map is computed based on the slots_valid bitfield; calling brw_compute_vue_map on the same bitfield will return the same result. So we can simply compare those. struct brw_vue_map is 136 bytes; doing a single 8-byte comparison is much cheaper and should work just as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
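The reasoning above can be sketched with a simplified map. The struct layout and both functions are illustrative assumptions, not the real `struct brw_vue_map` or `brw_compute_vue_map`. What matters is that every other field is derived deterministically from `slots_valid`, so comparing that one 64-bit field is enough.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SLOTS 64

/* Simplified stand-in: everything below slots_valid is derived from it. */
struct vue_map {
    uint64_t slots_valid;
    signed char varying_to_slot[NUM_SLOTS];
    int num_slots;
};

/* Deterministic: the same bitfield always produces the same map. */
static void compute_vue_map(struct vue_map *map, uint64_t slots_valid)
{
    map->slots_valid = slots_valid;
    map->num_slots = 0;
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (slots_valid & (UINT64_C(1) << i))
            map->varying_to_slot[i] = (signed char) map->num_slots++;
        else
            map->varying_to_slot[i] = -1;
    }
}

/* One 8-byte comparison instead of comparing the whole struct. */
static bool vue_maps_equal(const struct vue_map *a, const struct vue_map *b)
{
    return a->slots_valid == b->slots_valid;
}
```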
* i965/gs: Don't reserve space for clip plane uniforms.Kenneth Graunke2015-09-031-2/+0
| | | | | | | | These were only for legacy userclipping, which we no longer support in geometry shaders. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Don't do legacy userclipping in non-compatibility contexts.Kenneth Graunke2015-09-031-0/+1
| | | | | | | | | | | | | | | | | According to the GLSL 1.50 specification, page 76: "The shader must also set all values in gl_ClipDistance that have been enabled via the OpenGL API, or results are undefined." With this patch, we only enable clip distance writes when the shader actually writes them. We no longer force a value to be written when clip planes are enabled in the API. This could mean the first varying slot would be used as clip distances - I believe it should be the safe kind of undefined behavior. Empirically, it doesn't seem to cause a problem. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Remove the brw_vue_prog_key base class.Kenneth Graunke2015-09-039-63/+45
| | | | | | | | | The legacy userclip fields are only used for the vertex shader, and at that point there's only program_string_id and the tex struct, which are common to all keys. So there's no need for a "VUE" key base class. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Virtualize vec4_visitor::emit_urb_slot().Kenneth Graunke2015-09-034-16/+28
| | | | | | | | | | | | This avoids a downcast of key, which won't exist in the base class soon. I'm not a huge fan of this patch, but given that we're currently using inheritance, this seems like the "right" way to do it. The alternative is to make key a void pointer in the parent class and continue downcasting. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Store a key_tex pointer in vec4_visitor.Kenneth Graunke2015-09-033-8/+10
| | | | | | | | | I'm about to remove the base class for VS/GS/HS/DS program keys, at which point we won't be able to use key->tex anymore. Instead, we'll need to store a direct pointer (like we do in the FS backend). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Move legacy clip plane handling to vec4_vs_visitor.Kenneth Graunke2015-09-036-67/+74
| | | | | | | | | | | | | This is now only used for the vertex shader, so it makes sense to get it out of any paths run by the geometry shader. Instead of passing the gl_clip_plane array into the run() method (which is shared among all subclasses), we add it as a vec4_vs_visitor constructor parameter. This eliminates the bogus NULL parameter in the GS case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Delete the brw_vue_program_key::userclip_active flag.Kenneth Graunke2015-09-035-22/+15
| | | | | | | | | | | | | | | | | There are two uses of this flag. The primary use is checking whether we need to emit code to convert legacy gl_ClipVertex/gl_Position clipping to clip distances. In this case, we also have to upload the clip planes as uniforms, which means setting nr_userclip_plane_consts to a positive value. Checking if it's > 0 works for detecting this case. Gen4-5 also wants to know whether we're doing clipping at all, so it can emit user clip flags. Checking if output_reg[VARYING_SLOT_CLIP_DIST0] is set to a real register suffices for this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>