Commit message | Author | Age | Files | Lines
* st/mesa: use PRId64 for printing 64-bit ints | Brian Paul | 2014-08-11 | 1 | -1/+4
| | | | | | | v2: use signed types/formats Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: use PRId64 for printing 64-bit ints | Brian Paul | 2014-08-11 | 2 | -21/+25
| | | | | | | | | | | Silences MinGW warnings: warning: unknown conversion type character ‘l’ in format [-Wformat] warning: too many arguments for format [-Wformat-extra-args] v2: use signed types/formats Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
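The warnings quoted above come from passing a 64-bit value to an `l`-style conversion that the target C runtime does not understand. A minimal sketch of the portable alternative, using the standard `PRId64` macro from `<inttypes.h>`; the helper name `format_int64` is hypothetical and does not appear in the patch:

```c
#include <assert.h>
#include <inttypes.h>  /* PRId64 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a signed 64-bit value portably.
 * Returns what snprintf returns: the number of characters that the
 * full output needs, excluding the terminating NUL. */
static int format_int64(char *buf, size_t size, int64_t value)
{
   assert(buf != NULL && size > 0);
   /* PRId64 expands to the conversion that matches int64_t on this
    * platform, so no particular length modifier is hardcoded here. */
   return snprintf(buf, size, "%" PRId64, value);
}
```

A caller would pass a buffer and its size, for example `format_int64(buf, sizeof buf, value)`. The "v2: use signed types/formats" note matches the choice of `PRId64` and `int64_t` rather than the unsigned `PRIu64` and `uint64_t`.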
* mesa: define and use ALL_TYPE_BITS in varray.c code | Brian Paul | 2014-08-11 | 1 | -16/+17
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: add comment that GL_CLIP_DISTANCE0 == GL_CLIP_PLANE0 in enable.c | Brian Paul | 2014-08-11 | 1 | -2/+2
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* configure.ac: Do not require llvm on x32 | Maarten Lankhorst | 2014-08-11 | 1 | -0/+1
| | | | | Cc: "10.2" <[email protected]> Signed-off-by: Maarten Lankhorst <[email protected]>
* i965: Don't check for format differences when using the blorp blitter | Neil Roberts | 2014-08-11 | 1 | -54/+12
| | | | | | | | | | | | | | Previously the blorp blitter wasn't used if the source and destination buffers had different formats, other than swizzling between RGB and BGR or adding or removing a dummy alpha channel. However, there's no reason why the blorp code path can't be used to do almost all format conversions, so this patch just removes the checks. It does explicitly disable converting to/from MESA_FORMAT_Z24_UNORM_X8_UINT, because there is a similar check in brw_blorp_copytexsubimage. This doesn't cause any Piglit test regressions, at least on Ivybridge. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/eu: Allow math on immediates on Broadwell. | Kenneth Graunke | 2014-08-10 | 1 | -3/+6
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Update jump distance scaling for Broadwell. | Kenneth Graunke | 2014-08-10 | 1 | -0/+4
| | | | | | | | | | Broadwell measures jump distances in bytes, so we need to scale by 16. v2: Update the function in brw_eu.h, not in brw_eu_emit.c. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Refactor jump distance scaling to use a helper function. | Kenneth Graunke | 2014-08-10 | 3 | -17/+32
| | | | | | | | | | | | | | | | | | | Different generations of hardware measure jump distances in different units. Previously, every function that needed to set a jump target open coded this scaling, or made a hardcoded assumption (i.e. just used 2). Most functions start with the number of instructions to jump, and scale up to the hardware-specific value. So, I made the function match that. Others start with a byte offset, and divide by a constant (8) to obtain the jump distance. This is actually 16 / 2 (the jump scale for Gen5-7). v2: Make the helper a static inline defined in brw_eu.h, instead of an actual function in brw_eu_emit.c (as suggested by Matt). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
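The scaling described above can be sketched as a small static inline helper. This is an illustration, not the code from brw_eu.h: the function names are hypothetical, the Gen5-7 scale of 2 and the Broadwell scale of 16 come from the commit messages, and the Gen4 scale of 1 is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for the helper described in the commit message.
 * Returns how many hardware jump units one instruction occupies. */
static inline int jump_scale(int gen)
{
   assert(gen >= 4);
   if (gen >= 8)
      return 16;   /* Broadwell measures jump distances in bytes */
   if (gen >= 5)
      return 2;    /* Gen5-7, per the commit message */
   return 1;       /* Gen4: assumption, not stated in the log */
}

/* Most callers start with a number of instructions and scale up. */
static inline int jump_distance_from_insns(int gen, int num_insns)
{
   return num_insns * jump_scale(gen);
}

/* Others start with a byte offset. Instructions are 16 bytes, so the
 * divisor is 16 / scale, which is the constant 8 on Gen5-7. */
static inline int jump_distance_from_bytes(int gen, int byte_offset)
{
   return byte_offset / (16 / jump_scale(gen));
}
```

With these definitions, a jump of 4 instructions is encoded as 8 on Gen7 and as 64 on Broadwell, and a 64-byte offset maps to the same two values.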
* i965/eu: Set UIP on ELSE instructions on Broadwell. | Kenneth Graunke | 2014-08-10 | 1 | -0/+6
| | | | | | | | Broadwell adds UIP on ELSE instructions. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Make it clear that brw_patch_break_count only runs on Gen4-5. | Kenneth Graunke | 2014-08-10 | 1 | -0/+2
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Make it clear that brw_find_loop_end only runs on Gen6+. | Kenneth Graunke | 2014-08-10 | 1 | -0/+2
| | | | | | | | | It has Gen6+ knowledge baked in, and indeed is only called for Gen6+, but it wasn't immediately obvious that this was the case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Port Broadwell CMP destination type hack to brw_eu_emit.c. | Kenneth Graunke | 2014-08-10 | 1 | -0/+8
| | | | | | | | See gen8_generator::CMP(). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Explicitly disable instruction compaction on Broadwell for now. | Kenneth Graunke | 2014-08-10 | 1 | -1/+1
| | | | | | | | | | | | Until now, it's been off implicitly: we never call the compactor function. When we merge the generators, we'll start calling it, so we should make it do nothing. Matt will enable instruction compaction properly later. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Use Haswell atomic messages on Broadwell. | Kenneth Graunke | 2014-08-10 | 1 | -2/+2
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Change gen == 7 to gen >= 7 in a couple brw_eu_emit.c cases. | Kenneth Graunke | 2014-08-10 | 1 | -2/+2
| | | | | | | | | | | Broadwell is going to use the brw_eu_emit.c code soon. We want to get the fake MRF handling and URB HWord channel mask handling. We don't need the CMP thread switch workaround, though. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/clip: Removing scissor atom | Ben Widawsky | 2014-08-10 | 1 | -2/+2
| | | | | | | | | | | | | | | | | Now that we no longer use ctx->DrawBuffer->_Xmin and related fields to program the screen-space viewport extents, we don't depend on any scissoring state. So we can drop the +_NEW_SCISSOR dependency. On GEN8, a change in scissor state does not affect anything for the clipper/sf hardware state. The hardware will always do the right thing once the viewport extents are programmed. We can therefore remove the unnecessary state emission. Ken originally spotted this. v2: Reword the commit message. Remove spurious hunk. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/guardband: Enable for all viewport dimensions (GEN8+) | Ben Widawsky | 2014-08-10 | 1 | -10/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The goal of guardband clipping is to try to avoid 3d clipping because it is an expensive operation. When guardband clipping is disabled, all geometry that intersects the viewport is sent to the FF 3d clipper. Objects which are entirely enclosed within the viewport are said to be "trivially accepted" while those entirely outside of the viewport are "trivially rejected". When guardband clipping is turned on, the above behavior is changed such that if the geometry is within the guardband, and intersects the viewport, it skips the 3d clipper. Prior to GEN8, this was problematic if the viewport was smaller than the screen, as it could allow for rendering to occur outside of the viewport. That could be mitigated if the programmer specified a scissor region which was less than or equal to the viewport - but this is not required for correctness in OpenGL. In theory you could be clever with the guardband so as not to invoke this problem. We do not do this, and have no data that suggests we should bother (nor the converse data). With viewport extents in place on GEN8, it should be safe to turn on guardband clipping for all cases. While here, add a comment to the code which confused me thoroughly. v2: Update grammar in commit message. Reword comments based on Ken's suggestion. Reviewed-by: Kenneth Graunke <[email protected]>
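The accept/reject behavior described above can be illustrated on axis-aligned bounding boxes. This is a simplified sketch: the hardware performs its tests on vertices rather than boxes, and the type and function names (`struct rect`, `classify`, `GUARDBAND_ACCEPT`) are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

struct rect { int x0, y0, x1, y1; };  /* inclusive min, exclusive max */

enum clip_action {
   TRIVIAL_REJECT,    /* entirely outside the viewport */
   TRIVIAL_ACCEPT,    /* entirely inside the viewport */
   GUARDBAND_ACCEPT,  /* crosses the viewport edge but skips the 3d clipper */
   SEND_TO_CLIPPER    /* must go through the expensive 3d clipper */
};

static bool rect_inside(struct rect inner, struct rect outer)
{
   return inner.x0 >= outer.x0 && inner.y0 >= outer.y0 &&
          inner.x1 <= outer.x1 && inner.y1 <= outer.y1;
}

static bool rect_overlaps(struct rect a, struct rect b)
{
   return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

static enum clip_action
classify(struct rect bbox, struct rect viewport, struct rect guardband,
         bool guardband_enabled)
{
   /* Assumption of this sketch: the guardband encloses the viewport. */
   assert(rect_inside(viewport, guardband));

   if (!rect_overlaps(bbox, viewport))
      return TRIVIAL_REJECT;
   if (rect_inside(bbox, viewport))
      return TRIVIAL_ACCEPT;
   if (guardband_enabled && rect_inside(bbox, guardband))
      return GUARDBAND_ACCEPT;
   return SEND_TO_CLIPPER;
}
```

In the `GUARDBAND_ACCEPT` case the geometry reaches the rasterizer unclipped, which is why something after the clipper (the scissor, or on GEN8 the viewport extents) has to discard the pixels that fall outside the viewport.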
* i965: Simplify viewport extents programming on GEN8 | Ben Widawsky | 2014-08-10 | 1 | -9/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Viewport extents are a third rectangle that defines which pixels get discarded as part of the rasterization process. The actual pixels drawn to the screen are the intersection of the drawing rectangle, the viewport extents, and the scissor rectangle. Viewport extents permit the use of guardband clipping in all cases (see later patch). The scissor rectangle is not important for this discussion, as it should always help do the right thing provided the programmer uses it. switch (viewport dimensions, drawrect dimensions) { case viewport > drawing rectangle: no effects; break; case viewport == drawing rectangle: no effects; break; case viewport < drawing rectangle: Pixels (after the viewport transformation but before expensive rasterizing and shading operations) which are outside of the viewport are discarded. } I am unable to find a test case where this improves performance, but in all my testing it doesn't hurt performance, and intuitively, it should not ever hurt performance. It also permits us to use the guardband more freely (see upcoming patch). v2: Updating commit message. v3: Commit message updates requested by Ken. Reviewed-by: Kenneth Graunke <[email protected]>
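The three-rectangle intersection described above can be sketched in a few lines. The names below (`struct rect`, `rect_intersect`, `pixel_is_drawn`) are hypothetical and model only the geometry, not the hardware state that the patch programs.

```c
#include <assert.h>
#include <stdbool.h>

struct rect { int x0, y0, x1, y1; };  /* inclusive min, exclusive max */

static int min_i(int a, int b) { return a < b ? a : b; }
static int max_i(int a, int b) { return a > b ? a : b; }

/* Intersection of two rectangles. If they do not overlap, the result
 * has x0 >= x1 or y0 >= y1 and therefore contains no pixels. */
static struct rect rect_intersect(struct rect a, struct rect b)
{
   struct rect r;
   assert(a.x0 <= a.x1 && a.y0 <= a.y1);
   assert(b.x0 <= b.x1 && b.y0 <= b.y1);
   r.x0 = max_i(a.x0, b.x0);
   r.y0 = max_i(a.y0, b.y0);
   r.x1 = min_i(a.x1, b.x1);
   r.y1 = min_i(a.y1, b.y1);
   return r;
}

static bool rect_contains(struct rect r, int x, int y)
{
   return x >= r.x0 && x < r.x1 && y >= r.y0 && y < r.y1;
}

/* A pixel survives only if it lies in all three rectangles. */
static bool pixel_is_drawn(struct rect drawrect, struct rect viewport,
                           struct rect scissor, int x, int y)
{
   struct rect visible =
      rect_intersect(rect_intersect(drawrect, viewport), scissor);
   return rect_contains(visible, x, y);
}
```

When the viewport is at least as large as the drawing rectangle, intersecting with it changes nothing, which corresponds to the "no effects" cases in the commit message. Only a viewport smaller than the drawing rectangle discards additional pixels.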
* i965/guardband: Improve comments for guardband clipping | Ben Widawsky | 2014-08-10 | 1 | -4/+18
| | | | | | | | | | While working in this part of the code I had a great deal of trouble understanding what it was trying to do, and matching it with the spec (mostly due to bad wording in the PRM). To help future people, I've cleaned up the wording and provided some ascii art. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Support the allow_glsl_extension_directive_midshader option. | Kenneth Graunke | 2014-08-10 | 2 | -0/+4
| | | | | | | | | | | This adds support for Marek's new driconf parameter, which avoids totally white rendering in Unigine Valley (which attempts to enable the GL_ARB_sample_shading extension in an illegal place). Signed-off-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75664 Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: set virtual_grf_count in assign_regs() | Connor Abbott | 2014-08-10 | 1 | -0/+4
| | | | | | | | This lets us call dump_instructions() after register allocation without failing an assertion. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* i965/fs: don't read from uninitialized memory while assigning registers | Connor Abbott | 2014-08-10 | 1 | -6/+6
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* i965/fs: Fix bad whitespace. | Matt Turner | 2014-08-10 | 1 | -2/+2
|
* gallium/radeon: Set gpu_address to 0 if r600_virtual_address is false | Niels Ole Salscheider | 2014-08-10 | 1 | -0/+2
| | | | | | | | | | | Without this patch I get the following during DMA transfers: [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream ! radeon 0000:01:00.0: CP DMA dst buffer too small (21475829792 4096) This is a fixup for e878e154cdfd4dbb5474f776e0a6d86fcb983098. Signed-off-by: Niels Ole Salscheider <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: simplify constant buffer upload for big endian | Marek Olšák | 2014-08-10 | 1 | -18/+4
| | | | | | | | Point util_memcpy_cpu_to_le32 to a buffer storage directly. v2: simplify more Reviewed-by: Michel Dänzer <[email protected]>
* winsys/radeon: fix compile warnings | Marek Olšák | 2014-08-09 | 1 | -3/+4
|
* r600g/compute: fix compile warnings | Marek Olšák | 2014-08-09 | 2 | -10/+11
| | | | Trivial.
* r300g: handle new shader caps | Marek Olšák | 2014-08-09 | 1 | -0/+2
| | | | Trivial.
* radeonsi: fix CMASK and HTILE allocation on Tahiti | Marek Olšák | 2014-08-09 | 2 | -3/+56
| | | | | | | | | | | | | | | | Tahiti has 12 tile pipes, but P8 pipe config. It looks like there is no way to get the pipe config except for reading GB_TILE_MODE. The TILING_CONFIG ioctl doesn't return more than 8 pipes, so we can't use that for Hawaii. This fixes a regression caused by 9b046474c95f15338d4c748df9b62871bba6f36f on Tahiti. v2: add an assertion and print an error on failure Cc: [email protected] Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: remove r600_resource_va | Marek Olšák | 2014-08-09 | 1 | -9/+0
| | | | | Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* gallium/radeon: use gpu_address from r600_resource | Marek Olšák | 2014-08-09 | 3 | -21/+14
| | | | | Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* r600g: use gpu_address from r600_resource | Marek Olšák | 2014-08-09 | 5 | -39/+29
| | | | | Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: use gpu_address from r600_resource | Marek Olšák | 2014-08-09 | 6 | -56/+41
| | | | | Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* gallium/radeon: store VM address in r600_resource | Marek Olšák | 2014-08-09 | 3 | -2/+7
| | | | | | | This will help to get rid of the buffer_get_virtual_address calls. Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* r600g: remove useless r600_resource_va calls | Marek Olšák | 2014-08-09 | 1 | -18/+9
| | | | | | | R600-R700 don't support virtual memory. Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: always prefer SWITCH_ON_EOP(0) on CIK | Marek Olšák | 2014-08-09 | 4 | -10/+46
| | | | | | | | | | | | | | The code is rewritten to take known constraints into account, while always using 0 by default. This should improve performance for multi-SE parts in theory. A debug option is also added for easier debugging. (If there are hangs, use the option. If the hangs go away, you have found the problem.) Reviewed-by: Alex Deucher <[email protected]> v2: fix a typo, set max_se for evergreen GPUs according to the kernel driver
* radeonsi: fix a hang with instancing in Unigine Heaven/Valley on Hawaii | Marek Olšák | 2014-08-09 | 1 | -5/+2
| | | | | | | | This isn't documented anywhere, but it's the only thing that works for this case. Cc: [email protected] Reviewed-by: Alex Deucher <[email protected]>
* radeon,r200: fix buffer validation after CS flush | Marek Olšák | 2014-08-09 | 8 | -15/+8
| | | | | | | | | This validates all bound buffers (CB, ZB, textures, DMA) at the beginning of CS. This fixes "bo->space_accounted" assertion failures. Tested by: Jochen Rollwagen <[email protected]> Cc: [email protected] Reviewed-by: Alex Deucher <[email protected]>
* st/mesa: fix blit-based partial TexSubImage for 1D arrays | Marek Olšák | 2014-08-09 | 1 | -0/+2
| | | | | | | | This fixes piglit spec/EXT_texture_array/render-1darray. Cc: [email protected] Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: fix DrawPixels(GL_STENCIL_INDEX) | Marek Olšák | 2014-08-09 | 1 | -7/+4
| | | | | | | | | This bug was probably uncovered by Jason's recent commits. The problem is that _mesa_base_tex_format(GL_STENCIL_INDEX) returns -1. Tested-by: Michel Dänzer <[email protected]>
* st/mesa: dump TGSI before calling into the driver | Marek Olšák | 2014-08-09 | 1 | -12/+10
| | | | | | If the driver crashes in create_xx_shader, you want to see the shader. Reviewed-by: Ilia Mirkin <[email protected]>
* configure.ac: Use LIBS rather than LDFLAGS to add -ldl to dladdr check | Jon TURNEY | 2014-08-09 | 1 | -3/+4
| | | | | | | | | | | | | ec8ebff "Check for dladdr()" erroneously uses LDFLAGS rather than LIBS to add -ldl to the dladdr check. Replace the workaround in 39a4cc4 of explicitly checking in libdl, with a more correct approach of using LIBS. Signed-off-by: Jon TURNEY <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Tested-by: Pali Rohár <[email protected]> Cc: "10.2" <[email protected]>
* vc4: Add support for the COS instruction. | Eric Anholt | 2014-08-08 | 1 | -0/+38
|
* vc4: Add support for the SIN instruction. | Eric Anholt | 2014-08-08 | 1 | -0/+35
| | | | v2: Rebase on helpers.
* vc4: Fix register aliasing for packing of scaled coordinates. | Eric Anholt | 2014-08-08 | 1 | -11/+18
| | | | Fixes glean fragProg1's "ADD test" and likely many others.
* vc4: Add some debug code for forcing fragment shader output color. | Eric Anholt | 2014-08-08 | 1 | -0/+15
|
* u_primconvert: Copy min/max_index from the original primitive. | Eric Anholt | 2014-08-08 | 1 | -4/+2
| | | | | | | | | | | | | | | | | | These values are supposed to be the minimum/maximum index values used to read from the vertex buffers. This code either copies index values out of the old IB (so, same min/max as the original draw call), or generates a new IB (using index values between the start and the start + count of the old array draw info, which just happens to be what min/max_index are set to by st_draw.c). We were incorrectly setting the max_index in the converting-from-glDrawArrays case to the start vertex plus the number of vertices generated in the new IB, which broke QUADS primitive conversion on VC4 (where max_index really has to be correct, or the kernel might reject your draw call due to buffer overflow). Reviewed-by: Rob Clark <[email protected]> (from verbal description of the patch)
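The off-by-range problem described above can be reproduced with a small sketch of the glDrawArrays case. The functions below are hypothetical and are not the u_primconvert code; the triangle ordering ignores provoking-vertex conventions, and the claim that the original draw has max_index equal to start + count - 1 is an assumption about how st_draw.c fills it in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical converter: expand a non-indexed QUADS draw covering
 * vertices [start, start + count) into triangle-list indices.
 * 'out' must have room for (count / 4) * 6 entries.
 * Returns the number of indices written. */
static size_t quads_to_tri_indices(unsigned start, unsigned count,
                                   unsigned *out)
{
   size_t n = 0;
   assert(out != NULL);
   for (unsigned q = 0; q + 4 <= count; q += 4) {
      unsigned v = start + q;
      /* Two triangles per quad: (0,1,2) and (0,2,3). */
      out[n++] = v;     out[n++] = v + 1; out[n++] = v + 2;
      out[n++] = v;     out[n++] = v + 2; out[n++] = v + 3;
   }
   return n;
}

/* Largest vertex index referenced by the generated index buffer. */
static unsigned max_index_of(const unsigned *idx, size_t n)
{
   unsigned max = 0;
   for (size_t i = 0; i < n; i++)
      if (idx[i] > max)
         max = idx[i];
   return max;
}
```

For a draw with start 10 and count 8, the new index buffer holds 12 indices, but the largest vertex it references is still 17, the last vertex of the original range. Deriving max_index from the new index count would instead give a value past the end of the vertex data, which is the kind of mismatch the commit message says VC4's kernel validation can reject.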
* vc4: Fix using and emitting the 1/W from the vertex/coord shaders. | Eric Anholt | 2014-08-08 | 1 | -14/+20
| | | | v2: Rebase on helpers change.
* vc4: Add support for swizzles of 32 bit float vertex attributes. | Eric Anholt | 2014-08-08 | 2 | -20/+73
| | | | | | | | | | | | Some tests start working (useprogram-flushverts, for example) due to getting the right vertices now. Some that used to pass start failing with memory overflow during binning, which is weird (glsl-fs-texture2drect). And a couple stop rendering correctly (glsl-fs-bug25902). v2: Move the attribute format setup in the key from after search time to before the search. v3: Fix reading of attributes other than position (I forgot to respect attr and stored everything in inputs 0-3, i.e. position).