aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/ir3: legalize vs unused sam dst componentsRob Clark2015-01-072-2/+9
| | | | | | | | We probably could be more clever elsewhere and mask out components that are not used. But either way, legalize should realize that there is also a write-after-write hazard with texture sample instructions. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: hack for old compilerRob Clark2015-01-071-0/+23
| | | | | | | Old compiler doesn't have ir3_block's.. so we need a special path. This hack can be dropped when ir3_compiler_old is retired. Signed-off-by: Rob Clark <[email protected]>
* tgsi: track max array per fileRob Clark2015-01-072-0/+4
| | | | | | | | | | | | NOTE IN[] and OUT[] don't need (have?) ArrayID's.. and TEMP[] can optionally have them. So we implicitly assume that ArrayID==0 always exists for each file. This is why array_max[file] is never less than zero. You can tell from indirect_files(_read/written) if the legacy array- id zero was actually used. Signed-off-by: Rob Clark <[email protected]>
* tgsi: keep track of read vs written indirectsRob Clark2015-01-072-0/+8
| | | | | | | | | | At least temporarily, I need to fallback to old compiler still for relative dest (for freedreno), but I can do relative src temp. Only a temporary situation, but seems easy/reasonable for tgsi-scan to track this. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* Revert "radeonsi: reduce the size of si_pm4_state"Marek Olšák2015-01-082-3/+12
| | | | | | This reverts commit 9141d8855555e45a057970e78969e1518ad3617d. It broke OpenCL.
* radeonsi: Fix crash when destroying si_screenTom Stellard2015-01-071-2/+4
| | | | | | | | | We were invalidating si_screen:tm by calling r600_destroy_common_screen() which frees the si_screen object. This caused the driver to crash in LLVMDisposeTargetMachine() since we were passing it an invalid pointer. https://bugs.freedesktop.org/show_bug.cgi?id=88170
* mesa: Don't use _mesa_generic_nop on Windows.José Fonseca2015-01-071-0/+9
| | | | | | | | | | | | | | | | | | | | It doesn't work on Windows because of STDCALL calling convention -- it's the callee responsibility to pop the arguments, and the number of arguments vary with the prototype --, so the stack pointer ends up getting corrupted. This is just a non-invasive stop-gap fix. A proper fix would be more elaborate, and require either: - a variation of __glapi_noop_table which sets GL_INVALID_OPERATION error - stop using APIENTRY on all internal _mesa_* functions. Tested with piglit gl-1.0-beginend-coverage (it now fails instead of crashing). VMware PR1350505 Reviewed-by: Brian Paul <[email protected]>
* glapi: Force frame pointer elimination on Windows.José Fonseca2015-01-071-0/+22
| | | | | | | | | | | | To catch mismatches in cdecl vs stdcall calling convention. See code comment for more detailed explanation. Tested with piglit gl-1.0-beginend-coverage (it now also crashes on debug builds.) VMware PR1350505. Reviewed-by: Brian Paul <[email protected]>
* radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shadersMarek Olšák2015-01-073-4/+12
| | | | | | | v2: complete rewrite Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeonsi: emit SURFACE_SYNC lastMarek Olšák2015-01-071-23/+35
| | | | | | | This fixes a case where a transform feedback buffer is fed back as an index buffer, because SURFACE_SYNC must be after VS_PARTIAL_FLUSH. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: flush all CB/DB caches unconditionally when changing the framebufferMarek Olšák2015-01-071-11/+7
| | | | | | This is easier to read and will work better with shader image stores. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: change TC cache flushing strategy for texturesMarek Olšák2015-01-072-4/+6
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: improve and fix streamout flushingMarek Olšák2015-01-073-10/+40
| | | | | | | | | | | - we don't usually need to flush TC L2 - we should flush KCACHE (not really an issue now since we always flush KCACHE when updating descriptors, but it could be a problem if we used CE, which doesn't require flushing KCACHE) - add an explicit VS_PARTIAL_FLUSH flag Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use TC L2 for CP DMA operations with shader resources on CIKMarek Olšák2015-01-073-10/+39
| | | | | | | | | So that TC L2 doesn't need to be flushed. The only problem is with index buffers, which don't use TC. A simple solution is added that flushes TC L2 before a draw call (TC_L2_dirty). Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use TC L2 for updating descriptors on CIKMarek Olšák2015-01-072-5/+10
| | | | | | This allows not flushing TC L2 on CIK later. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't use TC L2 for updating descriptors on SIMarek Olšák2015-01-072-2/+14
| | | | | | | | | | | | It's causing problems, because we mix uncached CP DMA with cached WRITE_DATA when updating the same memory. The solution for SI is to use uncached access here, because CP DMA doesn't support cached access. CIK will be handled in the next patch. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: only flush the right set of caches for CP DMA operationsMarek Olšák2015-01-079-34/+48
| | | | | | | | That's either framebuffer caches or caches for shader resources. The motivation is that framebuffer caches need to be flushed very rarely here. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement separate ICACHE and KCACHE flush for SIMarek Olšák2015-01-071-9/+17
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add a combined flag for flushing a framebufferMarek Olšák2015-01-073-20/+10
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: rename flush flags, split the TC flag into L1 and L2Marek Olšák2015-01-077-91/+109
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: separate cache flush flagsMarek Olšák2015-01-075-26/+39
| | | | | | I will rename them for radeonsi. Reviewed-by: Michel Dänzer <[email protected]>
* r600g: move r6xx-specific streamout flush flagging into r600gMarek Olšák2015-01-072-9/+7
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: only set BC_OPTIMIZE_DISABLE when necessaryMarek Olšák2015-01-072-6/+15
| | | | | | SPI_PS_IN_CONTROL is moved into the SPI mapping state. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: do not define FACE as an ordinary PS inputMarek Olšák2015-01-071-1/+2
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove flatshade from the shader keyMarek Olšák2015-01-073-7/+7
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove special handling of TGSI_INTERPOLATE_COLOR in shader codegenMarek Olšák2015-01-071-6/+10
| | | | | | | | It doesn't do anything useful. And colors are floating-point, so we can use fs.interp, remove "flatshade" from the shader key, and rely on the FLAT_SHADE state only (in the next patch). Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement VERTEXID_NOBASE and BASEVERTEX system valuesMarek Olšák2015-01-071-0/+10
| | | | | | | | Only done for completeness. Not used by anything yet. Tested by advertising PIPE_CAP_VERTEXID_NOBASE. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix VertexID for OpenGLMarek Olšák2015-01-071-2/+5
| | | | | | | This fixes all failing piglit VertexID tests. Cc: 10.4 <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: clarify a hw bug in shader exportsMarek Olšák2015-01-071-5/+10
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use ordered compares for SSG and face selectionMarek Olšák2015-01-072-3/+3
| | | | | | | | | | | Ordered compares are what you have in C. Unordered compares are the result of negating ordered compares (they return true if either argument is NaN). That special NaN behavior is completely useless here, and unordered compares produce horrible code with all stable LLVM versions. (I think that has been fixed in LLVM git) Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove unused and not useful variablesMarek Olšák2015-01-073-6/+1
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove init config from statesMarek Olšák2015-01-076-5/+4
| | | | | | It really doesn't do anything there. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: reduce the size of si_pm4_stateMarek Olšák2015-01-072-12/+3
| | | | | | | | - the relocs array is unused, remove it - ndw is at most 115 (init), set 140 as the maximum - compute needs 4 buffers per state, graphics only needs 1; set 4 as the maximum Reviewed-by: Michel Dänzer <[email protected]>
* tgsi: add uses_centroid into tgsi_shader_infoMarek Olšák2015-01-072-0/+4
|
* st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEXMarek Olšák2015-01-071-1/+2
| | | | | Cc: 10.2 10.3 10.4 <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* vbo: ignore primitive restart if FixedIndex is enabled in DrawArraysMarek Olšák2015-01-071-1/+2
| | | | | | | | | | | | | | From GL 4.4 Core profile: If both PRIMITIVE_RESTART and PRIMITIVE_RESTART_FIXED_INDEX are enabled, the index value determined by PRIMITIVE_RESTART_FIXED_INDEX is used. If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not performed for array elements transferred by any drawing command not taking a type parameter, including all of the *Draw* commands other than *DrawEle- ments*. Cc: 10.2 10.3 10.4 <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* vc4: Fix scaling W projection of the Z coordinate when there's a Z offset.Eric Anholt2015-01-061-3/+3
| | | | | | Fixes piglit glsl-fs-fragcoord-zw-perspective, es3conform gl_FragCoord_z_frag, and the rest of the piglit glsl 1.10 interpolation tests.
* vc4: Fix deletion from the program cache.Eric Anholt2015-01-061-1/+1
| | | | | They key is, oddly enough, in the key field, not in the data field (which is the vc4_compiled_shader *). Fixes regular failures in fp-long-alu.
* vc4: Skip storing the Z/S contents when it's invalidated.Eric Anholt2015-01-061-0/+11
| | | | | | | Improves framerate of 5 seconds of es2gears by 1.57473% +/- 0.669409% (n=67). Reviewed-by: Jose Fonseca <[email protected]>
* gallium: Plumb the swap INVALIDATE_ANCILLARY flag through more layers.Eric Anholt2015-01-062-0/+17
| | | | | | | | | v2: Instead of telling the driver that the window system ancillaries have been invalidated (when the driver doesn't know which of its buffers are the window system's!), introduce a method for invalidating specific surfaces. Reviewed-by: Jose Fonseca <[email protected]>
* egl: Inform the client API when ancillary buffers may become undefined.Eric Anholt2015-01-067-15/+44
| | | | | | | This is part of the EGL spec, and is useful for a tiled renderer to avoid the memory bandwidth cost of storing the depth/stencil buffers. Reviewed-by: Jose Fonseca <[email protected]>
* ax_prog_flex.m4: Merge upstream OpenBSD fixes.Vinson Lee2015-01-061-2/+2
| | | | | | | | | | | | Merge the following upstream autoconf-archive patches. ax_prog_flex: change grep syntax to accept e.g. "flex.real" in case a wrapper or symlink is used. AX_PROG_FLEX: avoid use of grep empty string escape extension (fix for OpenBSD) AX_PROG_FLEX: Also accept gflex. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jonathan Gray <[email protected]>
* radeon/llvm: Use amdgcn triple for SI+ on LLVM >= 3.6Tom Stellard2015-01-064-16/+27
|
* radeonsi: Cache LLVMTargetMachine object in si_screenTom Stellard2015-01-066-26/+51
| | | | | | | | | | Rather than building a new one every compile. This should reduce some of the overhead of compiling shaders. One consequence of this change is that we lose the MachineInstrs dumps when dumping the shaders via R600_DEBUG. The LLVM IR and assembly is still dumped, and if you still want to see the MachineInstr dump, you can run the dumped LLVM IR through llc.
* mesa: create, use new _mesa_texture_base_format() functionBrian Paul2015-01-056-9/+23
| | | | Reviewed-by: Eric Anholt <[email protected]>
* mesa: remove unused ctx parameter for _mesa_select_tex_image()Brian Paul2015-01-0512-34/+30
| | | | Reviewed-by: Eric Anholt <[email protected]>
* swrast: use new _mesa_base_tex_image() helperBrian Paul2015-01-056-42/+47
| | | | Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: use new _mesa_base_tex_image() helperBrian Paul2015-01-055-5/+14
| | | | | | This involved adding a new st_texture_image_const() helper also. Reviewed-by: Eric Anholt <[email protected]>
* mesa: add _mesa_base_tex_image() helper functionBrian Paul2015-01-051-0/+10
| | | | Reviewed-by: Eric Anholt <[email protected]>
* mesa: simplify a conditional in detach_shader()Brian Paul2015-01-051-3/+1
| | | | Reviewed-by: Eric Anholt <[email protected]>