path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: Fold vectorize_mov() back into the one caller.Kenneth Graunke2016-04-202-28/+16
  After the previous patch, this helper is only called in one place. So, just fold it back in - there are a lot of parameters here and not much code.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Rework opt_vector_float() control flow.Kenneth Graunke2016-04-201-27/+34
  This reworks opt_vector_float() so that there's only one place that flushes out any accumulated state and emits a VF.
  v2: Don't break the sequence for non-representable numbers - just skip recording their values. Only break it for non-MOVs or register changes.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* nir: rename nir_foreach_block*() to nir_foreach_block*_call()Connor Abbott2016-04-206-10/+10
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Always split uniforms in array_access_to_pull_constantsJason Ekstrand2016-04-201-1/+3
  Normally, we split uniforms at the end, but in Vulkan we bail because we don't want pull constants. However, we still need them split because pack_uniforms relies on it.
  I really don't like this patch, not because it doesn't work (it does), but because now that we're using MOV_INDIRECT, uniform numbers and sizes don't really matter anymore. In the FS backend, uniform splitting and packing is handled all at once (actual re-assignment of locations happens later) and we really should do it that way in vec4 eventually as well.
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* i965/vec4: Use the correct offset for the swizzle shift in push constantsJason Ekstrand2016-04-201-1/+1
  This was actually caught by Ken in review the first time around but somehow didn't get fixed before the patches were pushed. :-(
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* i965/vec4: Use nir_intrinsic_base in the load_uniform implementationJason Ekstrand2016-04-201-1/+1
  We shouldn't be reading the const_index directly.
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
* mesa/st: enable compute shaders if images are also supportedBas Nieuwenhuizen2016-04-191-3/+4
  v2: Also depend on atomic counters.
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: Define miptree map functions static (trivial)Ben Widawsky2016-04-181-2/+2
  They were already declared as such. It was changed here:
    commit 31f0967fb50101437d2568e9ab9640ffbcbf7ef9
    Author: Ian Romanick <[email protected]>
    Date: Wed Sep 2 14:43:18 2015 -0700
    i965: Make intel_miptree_map_raw static
  Cc: Ian Romanick <[email protected]>
  Signed-off-by: Ben Widawsky <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* meta: Don't botch color masks when changing drawbuffers.Kenneth Graunke2016-04-181-7/+75
  Color clears should respect each drawbuffer's color mask state. Previously, we tried to leave the color mask untouched. However, _mesa_meta_drawbuffers_from_bitfield() ended up rebinding all the color drawbuffers in a different order, so we ended up pairing drawbuffers with the wrong color mask state.
  The new _mesa_meta_drawbuffers_and_colormask() function does the same job as the old _mesa_meta_drawbuffers_from_bitfield(), but also rearranges the color mask state to match the new drawbuffer configuration. This code was largely ripped off from Gallium's st_Clear code.
  This fixes ES31-CTS.draw_buffers_indexed.color_masks, which binds up to 8 drawbuffers, sets color masks for each, and then calls glClearBufferfv to clear each buffer individually. ClearBuffer causes us to rebind only one drawbuffer, at which point we used ctx->Color.ColorMask[0] (draw buffer 0's state) for everything.
  We could probably delete _mesa_meta_drawbuffers_from_bitfield(), but I'd rather not think about the i965 fast clear code. Topi is rewriting a bunch of that soon anyway, so let's delete it then.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94847
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
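The color mask rearrangement described in this commit can be illustrated with a small sketch. It is not the Mesa code: the function name `remap_color_masks`, the `MAX_DRAW_BUFFERS` constant, the array layout, and the choice to disable writes for a buffer that was not previously bound are all assumptions made for illustration.

```c
#include <stdint.h>

#define MAX_DRAW_BUFFERS 8   /* assumed limit for this sketch */

/*
 * old_buffer[slot]: color buffer (bit index) bound to each slot, -1 if none
 * old_mask[slot]:   RGBA write mask that belonged to that slot
 * new_bits:         bitfield of color buffers to bind, in ascending bit order
 * new_mask[slot]:   output, write mask for each newly assigned slot
 * Returns the number of slots bound.
 */
static int
remap_color_masks(const int old_buffer[MAX_DRAW_BUFFERS],
                  const uint8_t old_mask[MAX_DRAW_BUFFERS],
                  uint32_t new_bits,
                  uint8_t new_mask[MAX_DRAW_BUFFERS])
{
   int n = 0;

   for (int bit = 0; bit < 32 && n < MAX_DRAW_BUFFERS; bit++) {
      if (!(new_bits & (1u << bit)))
         continue;

      /* Assumption: a buffer that was not bound before gets writes disabled. */
      new_mask[n] = 0;

      /* Carry over the mask of whichever old slot held this buffer. */
      for (int s = 0; s < MAX_DRAW_BUFFERS; s++) {
         if (old_buffer[s] == bit) {
            new_mask[n] = old_mask[s];
            break;
         }
      }
      n++;
   }
   return n;
}
```

The point of the sketch is that the mask follows the buffer rather than the slot number, which is the pairing the commit message says was previously lost.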
* meta: Don't smash ColorMask when using MESA_META_COLOR_MASK save bit.Kenneth Graunke2016-04-182-5/+4
  This allows meta operations to inspect the existing color mask, and then do their own smashing. BlitFramebuffer and Clear already override the color mask, so this was also redundant.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Don't allow OOB array access of imagesJason Ekstrand2016-04-151-15/+11
  We have had a guard against OOB array access of images on IVB for a long time, but it can actually cause hangs on any GPU generation. This can happen due to getting an untyped SURFACE_STATE for a typed message. We didn't use to hit this with the piglit test on anything other than IVB because the OOB in the test would cause us to go past the top of the pull constant UBO and we would get a surface index of 0, which was always a valid surface. Now that we're pushing small arrays, we can end up grabbing garbage from the GRF and going to some random index, which causes a hang. The solution is to just do the bounds check on all hardware.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94944
  Reviewed-by: Francisco Jerez <[email protected]>
  Tested-by: Mark Janes <[email protected]>
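A bounds check of the kind this commit applies to all hardware might look like the sketch below. The commit message does not say how the check is expressed, so clamping to the last valid element is an assumption, as is the helper name `clamp_image_index`.

```c
#include <stdint.h>

/*
 * Clamp an image-array index into [0, array_size - 1].
 * Assumption: out-of-bounds accesses are redirected to the last valid
 * element. The caller must guarantee array_size >= 1.
 */
static uint32_t
clamp_image_index(uint32_t index, uint32_t array_size)
{
   uint32_t last = array_size - 1;

   /* Treating the index as unsigned also catches "negative" values,
    * which show up as very large numbers. */
   return index < last ? index : last;
}
```

Whatever garbage ends up in the index, the resulting surface index stays inside the range the shader actually declared.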
* mesa/texstore: Use Driver.CompressedTexSubImage in the default CompressedTexImageNanley Chery2016-04-151-5/+5
  Enable drivers to use their own implementation of this method instead of the mesa default. Since the drivers that currently overwrite dd_function_table::CompressedTexSubImage also overwrite ::CompressedTexImage, there should be no behavioral change.
  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* i965/vec4: Support full std140 layout for push constantsJason Ekstrand2016-04-151-5/+25
  Up until now, we have been able to assume that all push constants are vec4-aligned because this is what the GL driver gives us. In Vulkan, we need to be able to support full std140 because we get the layout from the client.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Handle MOV_INDIRECT in pack_uniform_registersJason Ekstrand2016-04-151-0/+18
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECTJason Ekstrand2016-04-152-0/+68
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Use can_do_writemask in can_reswizzleJason Ekstrand2016-04-151-3/+5
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Move can_do_writemask to vec4_instructionJason Ekstrand2016-04-153-30/+30
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/surface_formats: Update some formats for more recent gensJason Ekstrand2016-04-151-12/+12
  The surface format table hasn't entirely been kept up-to-date. This commit marks a couple more compressed formats as sampleable on gen8+ and adds the A4B4G4R4 format as renderable on gen9.
  Reviewed-by: Kenneth Graunke <[email protected]>
* xlib: remove MESA_GLX_VISUAL_HACKJohn Sheu2016-04-151-23/+19
  This removes a hack introduced in 1999 in the first version of fakeglx.c, with the comment:
    /* XXX revisit this after 3.0 is finished. */
  Mesa 4.0 was released in 2001. It is now 2016, and Mesa 11.0 was released last year.
  Reviewed-by: Alejandro Piñeiro <[email protected]>
* xlib: fix leaks of returned values from XGetVisualInfoJohn Sheu2016-04-151-8/+21
  Reviewed-by: Alejandro Piñeiro <[email protected]>
* xlib: fix memory leak of and remove vishandle from XMesaVisualInfoJohn Sheu2016-04-152-39/+24
  The vishandle member of XMesaVisualInfo is used to support the comparison of XVisualInfo instances by pointer value, in find_glx_visual(). The comparison, however, will always be false: in every case where the comparison is made, the XVisualInfo instance being compared to is a new allocation passed in through a GLX API call.
  In addition, the XVisualInfo instance pointed to by vishandle is itself never freed, causing a memory leak. Since vishandle is essentially useless, we just remove it and thereby also fix the leak.
  Reviewed-by: Alejandro Piñeiro <[email protected]>
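Why a pointer comparison against a freshly allocated structure can never succeed is easy to show in isolation. The sketch below uses a hypothetical `struct visual_info` as a stand-in for XVisualInfo. The by-value helper is included only for contrast; the commit itself simply removes the member rather than replacing the comparison.

```c
#include <stdbool.h>

/* Hypothetical stand-in for XVisualInfo; only a few fields are modelled. */
struct visual_info {
   unsigned long visualid;
   int screen;
   int depth;
};

/* Identity comparison: true only if both pointers name the same object. */
static bool
same_by_pointer(const struct visual_info *a, const struct visual_info *b)
{
   return a == b;
}

/* Content comparison, shown for contrast only. */
static bool
same_by_value(const struct visual_info *a, const struct visual_info *b)
{
   return a->visualid == b->visualid &&
          a->screen == b->screen &&
          a->depth == b->depth;
}
```

Two separately allocated copies of the same visual compare unequal by pointer even though every field matches, which is the situation the commit message describes.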
* xlib: do not cache return value of glXChooseVisual/glXGetVisualFromFBConfigJohn Sheu2016-04-151-18/+8
  The returned XVisualInfo from glXChooseVisual/glXGetVisualFromFBConfig is being cached in XMesaVisual.vishandle (and unconditionally overwritten on subsequent calls). However, these entry points are specified to return XVisualInfo instances to be owned by the caller and freed with XFree(), so the return values should not be retained.
  With this change, XMesaVisual.vishandle is essentially unused and will be removed in a subsequent change.
  v2: update commit message
  Reviewed-by: Alejandro Piñeiro <[email protected]>
* i965: Expose the surface format tableJason Ekstrand2016-04-143-18/+48
  Reviewed-by: Kenneth Graunke <[email protected]>
* dri: Fix robust context creation via EGL attributeChad Versace2016-04-141-2/+23
  driCreateContextAttribs() emits an error if bit __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS is set for an ES context. But, EGL_EXT_create_context_robustness and EGL 1.5 both allow creation of robust ES contexts. One requests a robust ES context by setting the EGL_CONTEXT_OPENGL_ROBUST_ACCESS *attribute*, which Mesa's EGL layer translates into the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS *bit*.
  Reviewed-by: Marek Olšák <[email protected]>
* i965: Push everything if pull_param == NULLJason Ekstrand2016-04-142-2/+14
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Push small uniform arraysJason Ekstrand2016-04-141-23/+53
  Unfortunately, this also means that we need to use a slightly different algorithm for assign_constant_locations. The old algorithm worked based on the assumption that each read of a uniform value read exactly one float. If it encountered a MOV_INDIRECT, it would immediately bail and push the whole thing. Since we can now read ranges using MOV_INDIRECT, we need to be able to push a series of floats without breaking them up. To do this, we use an algorithm similar to the one in split_virtual_grfs.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
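The idea of pushing a series of floats without breaking them up can be sketched as chunk-wise assignment. This is an illustration under assumptions, not the driver's code: the `contiguous[]` flag array, the `max_push` budget, and the rule that a chunk which does not fit goes entirely to pull are all hypothetical choices.

```c
#include <stdbool.h>

/*
 * contiguous[i] == true means slots i and i+1 belong to one indirectly
 * addressed range and must not be separated.
 * push_loc[i] / pull_loc[i] receive the assigned location, or -1.
 */
static void
assign_locations(int num_slots, const bool contiguous[], int max_push,
                 int push_loc[], int pull_loc[])
{
   int num_push = 0, num_pull = 0;
   int start = 0;

   for (int i = 0; i < num_slots; i++) {
      /* Keep growing the chunk while slot i is glued to slot i+1. */
      if (i + 1 < num_slots && contiguous[i])
         continue;

      /* Slots [start, i] form one indivisible chunk. */
      int size = i - start + 1;
      bool push = num_push + size <= max_push;

      for (int s = start; s <= i; s++) {
         push_loc[s] = push ? num_push++ : -1;
         pull_loc[s] = push ? -1 : num_pull++;
      }
      start = i + 1;
   }
}
```

Because the decision is made per chunk, an array is either pushed whole or pulled whole, so an indirect read never straddles the two storage classes.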
* i965/fs: Rename demote_pull_constants to lower_constant_loadsJason Ekstrand2016-04-142-3/+3
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Get rid of the uniform_size arrayJason Ekstrand2016-04-146-33/+0
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constantsJason Ekstrand2016-04-144-51/+50
  This commit moves us to an instruction-based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design.
  One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Get rid of the param_size arrayJason Ekstrand2016-04-144-15/+0
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Stop relying on param_size in assign_constant_locationsJason Ekstrand2016-04-141-27/+17
  Now that we have MOV_INDIRECT opcodes, we have all of the size information we need directly in the opcode. With a little restructuring of the algorithm used in assign_constant_locations we don't need param_size anymore. The big thing to watch out for now, however, is that you can have two ranges overlap where neither contains the other. In order to deal with this, we make the first pass just flag what needs pulling and defer assigning pull constant locations until later.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
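The two-pass approach for overlapping ranges can be sketched as follows. The names `struct range` and `flag_and_assign` and the flat array representation are assumptions for illustration. The sketch only shows why flagging first and numbering later is insensitive to how the ranges overlap.

```c
#include <stdbool.h>

/* Slots [base, base + size) read by one indirect move (hypothetical type). */
struct range {
   int base;
   int size;
};

static void
flag_and_assign(int num_slots, const struct range *ranges, int num_ranges,
                bool needs_pull[], int pull_loc[])
{
   /* Pass 1: only flag. Overlapping ranges simply set the same flag twice. */
   for (int i = 0; i < num_slots; i++)
      needs_pull[i] = false;

   for (int r = 0; r < num_ranges; r++) {
      for (int s = ranges[r].base; s < ranges[r].base + ranges[r].size; s++) {
         if (s >= 0 && s < num_slots)
            needs_pull[s] = true;
      }
   }

   /* Pass 2: hand out pull locations in slot order, once per flagged slot. */
   int next = 0;
   for (int i = 0; i < num_slots; i++)
      pull_loc[i] = needs_pull[i] ? next++ : -1;
}
```

Assigning locations while walking the ranges would number the shared slots twice or split one of the ranges. Deferring the assignment avoids both problems.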
* i965/fs: Get rid of reladdrJason Ekstrand2016-04-142-10/+2
  We aren't using it anymore.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use MOV_INDIRECT for all indirect uniform loadsJason Ekstrand2016-04-142-40/+87
  Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardwareJason Ekstrand2016-04-142-13/+66
  While we're at it, we also add support for the possibility that the indirect is, in fact, a constant. This shouldn't happen in the common case (if it does, that means NIR failed to constant-fold something), but it's possible so we should handle it.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnrJason Ekstrand2016-04-141-1/+1
  The subnr field is in bytes so we don't need to multiply by type_sz.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
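The arithmetic behind this fix can be written out as a small sketch. The helper name `indirect_regs_read` and its parameters are hypothetical, and a 32-byte register size is assumed. The point is only that a byte offset is added as-is, without scaling by the element size.

```c
#define REG_SIZE 32   /* bytes per register, assumed for this sketch */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/*
 * Number of whole registers touched by a read that starts subnr_bytes
 * into the first register and may cover region_length_bytes.
 */
static unsigned
indirect_regs_read(unsigned subnr_bytes, unsigned region_length_bytes)
{
   /* subnr is already in bytes: no multiplication by the type size. */
   return DIV_ROUND_UP(subnr_bytes + region_length_bytes, REG_SIZE);
}
```

With an 8-byte offset and a 24-byte region the read fits in one register. Scaling the offset by a 4-byte type size would give 8 * 4 + 24 = 56 bytes and overestimate the read as two registers.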
* i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructionsJason Ekstrand2016-04-141-1/+0
  It should work fine without it and the visitor can set it if it wants.
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for doing MOV_INDIRECT on uniformsJason Ekstrand2016-04-141-1/+4
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make intel_get_param return an intBen Widawsky2016-04-141-10/+7
  This will fix the spurious error message "Failed to query GPU properties." that was unintentionally added in cc01b63d730.
  This patch changes the function to return an int so that the caller is able to act on the return value.
  The equivalent of this patch was in the original series that fixed up the warning, but I dropped it at the last moment. It is required for the desired behavior: no warning when querying GPU properties from the kernel unless there is something the user can do about it.
  v2: Use strerror (Jason)
      Make EINVAL check similar in all places (Ian)
  NOTE: Broadwell appears to actually have some issue where the kernel returns ENODEV when it shouldn't. I will investigate this separately.
  Reported-by: Chris Forbes <[email protected]>
  Signed-off-by: Ben Widawsky <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Alejandro Piñeiro <[email protected]>
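The pattern of returning an error code so the caller can decide whether to warn might look like the sketch below. Everything here is hypothetical: the `query_fn` callback stands in for the kernel query, the fake query functions exist only to exercise the code, and treating EINVAL as the quiet case is an assumption based on the v2 note in the commit message.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical kernel query: returns 0, or -1 with errno set. */
typedef int (*query_fn)(int param, int *value);

/* Return 0 on success or a negative errno value on failure. */
static int
get_param(query_fn query, int param, int *value)
{
   if (query(param, value) == -1)
      return -errno;
   return 0;
}

/* Warn only for failures other than "parameter not known" (assumed EINVAL). */
static int
query_with_warning(query_fn query, int param, int *value)
{
   int ret = get_param(query, param, value);

   if (ret != 0 && ret != -EINVAL)
      fprintf(stderr, "Failed to query GPU properties (%s).\n", strerror(-ret));
   return ret;
}

/* Fake queries used only to exercise the sketch. */
static int fake_ok(int param, int *value) { (void)param; *value = 42; return 0; }
static int fake_unsupported(int param, int *value) { (void)param; (void)value; errno = EINVAL; return -1; }
static int fake_no_device(int param, int *value) { (void)param; (void)value; errno = ENODEV; return -1; }
```

A boolean return would collapse the last two cases into one, which is how a spurious warning can appear for a condition the user can do nothing about.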
* st/mesa: fix sampler view leak in st_DrawAtlasBitmaps()Brian Paul2016-04-141-0/+6
  I neglected to free the sampler view which was created earlier in the function. So for each glCallLists() command that used the bitmap atlas to draw text, we'd leak a sampler view object.
  Also, check for st_create_texture_sampler_view() failure and record GL_OUT_OF_MEMORY.
  Cc: "11.1 11.2" <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* i965/vec4: Use UD rather than D for uniform indirectsJason Ekstrand2016-04-142-6/+6
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOADJason Ekstrand2016-04-142-3/+4
  Reviewed-by: Kristian Høgsberg <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/dead_variables: Configurably work with any variable modeJason Ekstrand2016-04-131-1/+1
  The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Switch to NIR for ldexp lowering.Kenneth Graunke2016-04-132-2/+1
  The old GLSL IR based lowering doesn't quite work right in all cases, and fails several dEQP-GLES31 and Vulkan CTS tests. Jason's new approach in NIR passes all the tests. There's not likely to be a ton of advantage to lowering early in GLSL IR anyway, so...switch.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Implement the new imod and irem opcodesJason Ekstrand2016-04-132-0/+72
  Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Inline get_pull_constant_offsetJason Ekstrand2016-04-132-25/+14
  It's not really doing enough anymore to justify a helper function.
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
* scons: Allow building with Address Sanitizer.Jose Fonseca2016-04-131-2/+6
  libasan is never linked to shared objects (which doesn't go well with -z,defs). It must either be linked to the main executable, or (more practically for OpenGL drivers) be pre-loaded via LD_PRELOAD. Otherwise works.
  I didn't find anything with llvmpipe. I suspect the fact that the JIT compiled code isn't instrumented means there are lots of errors it can't catch. But for non-JIT drivers, the Address/Leak Sanitizers seem like a faster alternative to Valgrind.
  Usage (Ubuntu 15.10):
    scons asan=1 libgl-xlib
    export LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib
    LD_PRELOAD=libasan.so.2 any-opengl-application
  Acked-by: Roland Scheidegger <[email protected]>
* mesa: Change an error code in glSamplerParameterI[iu]v().Kenneth Graunke2016-04-121-4/+6
  This is supposed to be INVALID_OPERATION in ES. We already did this for the fv/iv variants, but not Iiv/Iuv, which are new in ES 3.2 (or extensions).
  Fixes: ES31-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* i965/tiled_memcpy: Fix rgba8_copy_16_aligned_dst() typoKristian Høgsberg Kristensen2016-04-121-4/+4
  Copy and paste error in commit eafeb8db66dae7619ff3cb039706b990d718cba7:
    i965/tiled_memcpy: Unroll bytes==64 case.
  Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/tiled_memcpy: Unroll bytes==64 case.Matt Turner2016-04-121-0/+16
  Reviewed-by: Roland Scheidegger <[email protected]>
* i965/tiled_memcpy: Provide SSE2 for RGBA8 <-> BGRA8 swizzle.Roland Scheidegger2016-04-121-3/+40
  The existing code uses SSSE3, and because it isn't built in a separate file compiled with SSSE3 enabled, it is usually not used (that, of course, could be fixed...), whereas SSE2 is always present with 64-bit builds. This should be pretty much as fast as the pshufb version, although those code paths aren't really used on chips without llc in any case.
  v2: fix andnot argument order, add comments
  v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments
  v4: [mattst88] Rebase
  Reviewed-by: Matt Turner <[email protected]>
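An SSE2 red/blue swap built on pshuflw/pshufhw, as the v3 note describes, can be sketched like this. It is a minimal illustration and not a copy of the driver's routine: the function name `swap_rb_4px_sse2` is hypothetical, and the sketch handles exactly four pixels with unaligned loads and stores. It assumes an x86 target with SSE2.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>

/* Swap bytes 0 and 2 of each 32-bit pixel in a 16-byte block (RGBA <-> BGRA). */
static void
swap_rb_4px_sse2(uint8_t *dst, const uint8_t *src)
{
   /* Selects bytes 1 and 3 of every pixel (green and alpha). */
   const __m128i agmask = _mm_set1_epi32((int)0xFF00FF00u);

   __m128i px = _mm_loadu_si128((const __m128i *)src);

   __m128i ag = _mm_and_si128(agmask, px);
   /* andnot computes (~first) & second, so the mask goes first. */
   __m128i rb = _mm_andnot_si128(agmask, px);

   /* Swap the two 16-bit halves of every pixel: pshuflw handles the low
    * four words, pshufhw the high four. */
   __m128i br = _mm_shufflehi_epi16(
      _mm_shufflelo_epi16(rb, _MM_SHUFFLE(2, 3, 0, 1)),
      _MM_SHUFFLE(2, 3, 0, 1));

   _mm_storeu_si128((__m128i *)dst, _mm_or_si128(ag, br));
}
```

After masking, each 16-bit word holds a single color byte, so exchanging adjacent words exchanges red and blue without any byte-granular shuffle, which is what SSSE3's pshufb would otherwise provide.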