Commit message | Author | Age | Files | Lines
* radeonsi: early exit in si_clear if there's nothing to do | Marek Olšák | 2015-07-31 | 1 | -0/+2
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix a regression since the resource_copy_region cleanup | Marek Olšák | 2015-07-31 | 1 | -1/+1
| Broken since: 46b2b3b - radeonsi: don't change pipe_resource in resource_copy_region
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91444
| Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
* radeonsi: fix broken st/nine from merging tessellation | Marek Olšák | 2015-07-31 | 1 | -2/+7
| st/nine uses GENERIC slots greater than 60.
* radeonsi: move CP DMA functions to their own file | Marek Olšák | 2015-07-31 | 6 | -236/+274
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add a debug flag that disables printing ISA in shader dumps | Marek Olšák | 2015-07-31 | 3 | -9/+13
|
* radeonsi: add a debug flag that disables printing TGSI in shader dumps | Marek Olšák | 2015-07-31 | 3 | -1/+3
| Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: add a debug flag that disables printing the LLVM IR in shader dumps | Marek Olšák | 2015-07-31 | 6 | -29/+29
| This is for shader-db and should reduce size of shader dumps.
* radeonsi: store shader disassemblies in memory for future users | Marek Olšák | 2015-07-31 | 7 | -17/+18
| This will be used by the new ddebug pipe. I'm including it now to avoid conflicts with other patches.
* radeonsi: don't use llvm.AMDIL.fraction for FRC and DFRAC | Marek Olšák | 2015-07-31 | 1 | -4/+16
| There are 2 reasons for this:
| - LLVM optimization passes can work with floor
| - there are patterns to select v_fract from floor anyway
|
| There is no change in the generated code.
* gallium/radeon: re-enable unsafe math for graphics shaders | Marek Olšák | 2015-07-31 | 1 | -0/+4
| This reverts commit 4db985a5fa9ea985616a726b1770727309502d81.
|
| The grass no longer disappears, which was the reason the commit was reverted.
|
| This might affect tessellation. We'll see.
|
| Totals from affected shaders:
| SGPRS: 151672 -> 150232 (-0.95 %)
| VGPRS: 90620 -> 89776 (-0.93 %)
| Code Size: 3980472 -> 3920836 (-1.50 %) bytes
| LDS: 67 -> 67 (0.00 %) blocks
| Scratch: 1357824 -> 1202176 (-11.46 %) bytes per wave
|
| Reviewed-by: Tom Stellard <[email protected]>
* gallium/radeon: don't use rsq_action | Marek Olšák | 2015-07-31 | 1 | -7/+3
| Reviewed-by: Dave Airlie <[email protected]>
* gallium/radeon: move r600-specific code to r600g | Marek Olšák | 2015-07-31 | 2 | -152/+150
| Reviewed-by: Tom Stellard <[email protected]>
* gallium/radeon: remove unused variables and old comments | Marek Olšák | 2015-07-31 | 4 | -35/+0
| Reviewed-by: Dave Airlie <[email protected]>
* gallium/radeon: remove build_intrinsic and build_tgsi_intrinsic | Marek Olšák | 2015-07-31 | 4 | -108/+58
| duplicated now
|
| Reviewed-by: Dave Airlie <[email protected]>
* gallivm: add LLVMAttribute parameter to lp_build_intrinsic | Marek Olšák | 2015-07-31 | 7 | -19/+24
| This will help remove some duplicated code from radeon.
|
| Reviewed-by: Dave Airlie <[email protected]>
* gallium/util: clear up that debug_get_flags_option returns a 64-bit mask | Marek Olšák | 2015-07-31 | 2 | -7/+7
| Reviewed-by: Kai Wasserbäch <[email protected]>
* radeonsi: completely rework updating descriptors without CP DMA | Marek Olšák | 2015-07-31 | 4 | -271/+128
| The patch has a better explanation. Just a summary here:
| - The CPU always uploads a whole descriptor array to previously-unused memory.
| - CP DMA isn't used.
| - No caches need to be flushed.
| - All descriptors are always up-to-date in memory even after a hang, because CP DMA doesn't serve as a middle man to update them.
|
| This should bring:
| - better hang recovery (descriptors are always up-to-date)
| - better GPU performance (no KCACHE and TC flushes)
| - worse CPU performance for partial updates (only whole arrays are uploaded)
| - less used IB space (no CP_DMA and WRITE_DATA packets)
| - simpler code
| - hopefully, some of the corruption issues with SI cards will go away. If not, we'll know the issue is not here.
|
| Reviewed-by: Michel Dänzer <[email protected]>
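The "upload the whole array to previously-unused memory" scheme summarized above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the radeonsi code: `desc_pool`, `desc_array`, `desc_array_update` and all sizes are hypothetical names and values, and a plain array stands in for GPU-visible memory.

```c
#include <stdint.h>
#include <string.h>

#define DESC_DWORDS  8    /* hypothetical: dwords per descriptor */
#define NUM_DESCS    4    /* hypothetical: descriptors per array */
#define POOL_DWORDS  256  /* hypothetical: size of the upload pool */

struct desc_pool {
    uint32_t mem[POOL_DWORDS]; /* stands in for GPU-visible memory */
    unsigned next;             /* first dword that was never handed out */
};

struct desc_array {
    uint32_t cpu_copy[NUM_DESCS][DESC_DWORDS]; /* CPU-side shadow copy */
    unsigned gpu_offset;                       /* where the latest upload lives */
};

/* Change one slot on the CPU, then copy the WHOLE array to fresh pool
 * memory. Earlier uploads are never overwritten, so nothing that still
 * reads them needs a cache flush. Returns 0 on success, -1 on a bad slot
 * or an exhausted pool. */
static int desc_array_update(struct desc_pool *pool, struct desc_array *arr,
                             unsigned slot, const uint32_t *desc)
{
    const unsigned size = NUM_DESCS * DESC_DWORDS;

    if (slot >= NUM_DESCS || pool->next + size > POOL_DWORDS)
        return -1;

    memcpy(arr->cpu_copy[slot], desc, sizeof(arr->cpu_copy[slot]));
    memcpy(&pool->mem[pool->next], arr->cpu_copy, sizeof(arr->cpu_copy));
    arr->gpu_offset = pool->next;
    pool->next += size;
    return 0;
}
```

The sketch also shows the trade-off listed in the summary: changing a single slot still costs a copy of the full array on the CPU.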
* i965/fs: Fix regression with SIMD8 VS since b5f1a48e234d47b24df38cb562cffb8941d43795. | Francisco Jerez | 2015-07-31 | 1 | -1/+2
| With num_direct_uniforms == 0 there's no space allocated in the param_size array for the one block of direct uniforms -- On the FS stage this would be a harmless no-op because it would simply re-set one of the param_size entries allocated for the sampler units to zero, but on the VS stage it has been reported to cause memory corruption followed by a crash -- Surprising how a full piglit run on Gen8 didn't catch it.
|
| Reported-and-reviewed-by: "Lofstedt, Marta" <[email protected]>
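The failure mode described above (writing a param_size entry that was never allocated) can be illustrated with a small guard. This is a hedged sketch, not the actual one-line fix: `uniform_layout` and `set_direct_uniform_size` are hypothetical names invented for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the compiler's uniform bookkeeping. */
struct uniform_layout {
    unsigned *param_size;  /* one entry per uniform block */
    size_t    num_entries; /* number of entries allocated in param_size */
};

/* Record the size of the direct-uniform block, but only if such a block
 * exists. With num_direct_uniforms == 0 no entry was reserved for it, so
 * an unconditional write could touch memory outside the array.
 * Returns 1 if an entry was written, 0 otherwise. */
static int set_direct_uniform_size(struct uniform_layout *l, size_t index,
                                   unsigned num_direct_uniforms)
{
    if (num_direct_uniforms == 0 || index >= l->num_entries)
        return 0;

    l->param_size[index] = num_direct_uniforms;
    return 1;
}
```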
* i965/gen9: Add hs, ds, and cs thread + urb info | Ben Widawsky | 2015-07-30 | 1 | -0/+10
| For SKL: These are the production values.
| For BXT: These are low estimates to enable platforms.
|
| This patch was originally part of "i965/skl: Add production thread counts and URB size" but was split out at Jordan's request (which I found to be reasonable).
|
| Note on stable inclusion: 10.6 does not care about hs and ds. It does care about cs, but since Jordan was the one that asked me to extract it, I'll leave it up to him to deal with a backport to stable if required.
|
| Signed-off-by: Ben Widawsky <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
* i965/bxt: Use more conservative thread counts | Ben Widawsky | 2015-07-30 | 1 | -2/+4
| Since we really do not know what may occur in the future, pick a more conservative value for thread counts until we know better what values are correct. As far as I can tell, the old values will work fine, but some of the registers seem to indicate that going even lower is possible and the purpose of having early support is to enable as many configurations that can possibly exist (we can trim things down after platforms begin shipping later).
|
| Signed-off-by: Ben Widawsky <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
* i965/skl: Add production thread counts and URB size | Ben Widawsky | 2015-07-30 | 1 | -5/+5
| This patch adjusts the SKL values to the best known values we have.
|
| v2: Remove HS/DS/CS fields. It makes the most sense to add these to the GEN9_FEATURES macro; however, doing that would require updating BXT values, and Jordan requested I not do that. Conveniently, this request makes a lot of sense with respect to the stable backport, as HS and DS do not even exist there.
|
| Cc: [email protected]
| Signed-off-by: Ben Widawsky <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
* vc4: Lower uniform loads to scalar in NIR. | Eric Anholt | 2015-07-30 | 2 | -31/+81
| This also moves the vec4-to-byte-addressing math into NIR, so that algebraic has a chance at it.
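As a rough illustration of what "vec4-to-byte-addressing math" and scalarizing a load can look like, the sketch below converts a vec4 index into four per-component byte offsets. It assumes 4 components per vec4 and 4 bytes per component; the function names are hypothetical and this is not the NIR pass itself.

```c
#include <stdint.h>

/* Byte offset of one 32-bit component in a vec4-indexed uniform file.
 * Assumes 16 bytes per vec4 and 4 bytes per component. */
static uint32_t vec4_to_byte_offset(uint32_t vec4_index, uint32_t component)
{
    return vec4_index * 16u + component * 4u;
}

/* Split one vec4 load into the byte offsets of its four scalar loads. */
static void scalarize_vec4_load(uint32_t vec4_index, uint32_t out_offsets[4])
{
    for (uint32_t c = 0; c < 4; c++)
        out_offsets[c] = vec4_to_byte_offset(vec4_index, c);
}
```

Once the multiply and add are expressed as ordinary instructions like this, a generic algebraic optimizer can fold them when the index is constant.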
* vc4: Move some FS input lowering into NIR. | Eric Anholt | 2015-07-30 | 2 | -35/+50
|
* vc4: Move program keys to the header file. | Eric Anholt | 2015-07-30 | 2 | -47/+49
| I want to be able to inspect them from other files for lowering passes in NIR.
* vc4: Lower NIR inputs to scalar as well. | Eric Anholt | 2015-07-30 | 2 | -4/+44
| For now this is just scalarizing, but it also means we'll get to dump a bunch of QIR-based lowering in a moment.
* vc4: Start adding a NIR-based output lowering pass. | Eric Anholt | 2015-07-30 | 4 | -7/+137
| For now, this just splits up store_output intrinsics to be scalars, and drops unused outputs in the coordinate shader. My goal is to be able to drop a bunch of my VC4-specific optimization by letting NIR handle it.
* vc4: Mark our shaders as single-threaded. | Eric Anholt | 2015-07-30 | 2 | -0/+6
| I had my understanding of this bit flipped. We're using the full register space, so we need to say so.
* vc4: Avoid leaking indirect array access UBOs. | Eric Anholt | 2015-07-30 | 1 | -0/+2
|
* vc4: Avoid overflowing various static tables. | Eric Anholt | 2015-07-30 | 4 | -4/+4
|
* vc4: Fix return values from recent validation changes. | Eric Anholt | 2015-07-30 | 1 | -4/+4
|
* docs: trivial cleanup of GL3.txt, remove redundant radeonsi entries. | Kai Wasserbäch | 2015-07-31 | 1 | -2/+2
| Follow-up to 1b2b0e42ce47bfd1fcb5513ed2c23b9bb7a5a5b8
|
| Signed-off-by: Kai Wasserbäch <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* st/mesa: don't draw instead of asserting in transform feedback | Dave Airlie | 2015-07-31 | 3 | -4/+7
| If we get a request to take the count from feedback, but there is no buffer to take it from, just draw as if we got 0 vertices, that is, draw nothing.
|
| This fixes this assert killing the ogl conform, and a piglit test I've sent.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* mesa: remove now unused _mesa_get_uniform_location | Timothy Arceri | 2015-07-30 | 2 | -79/+0
| Cc: Tapani Pälli <[email protected]>
| Reviewed-by: Tapani Pälli <[email protected]>
* mesa: remove now unused subscript validations | Timothy Arceri | 2015-07-30 | 2 | -108/+0
| Cc: Tapani Pälli <[email protected]>
| Reviewed-by: Tapani Pälli <[email protected]>
* mesa: fix and simplify resource query for arrays | Timothy Arceri | 2015-07-30 | 5 | -92/+106
| This removes the need for multiple functions designed to validate an array subscript and replaces them with a call to a single function.
|
| The change also means that validation is now only done once and the index is retrieved at the same time. As a result, the getUniformLocation code can be simplified, saving an extra hash table lookup (and yet another validation call).
|
| This change also fixes some tests in: ES31-CTS.program_interface_query.uniform
|
| V3: rebase on subroutines, and move the resource index array == 0 check into _mesa_GetProgramResourceIndex() to simplify things further
|
| V2: Fix bounds checks for program input/output, split unrelated comment fix and _mesa_get_uniform_location() removal into their own patch.
|
| Cc: Tapani Pälli <[email protected]>
| Reviewed-by: Tapani Pälli <[email protected]>
* i965/bxt: Don't use brw_device_info_skl_early on BXT | Neil Roberts | 2015-07-30 | 1 | -1/+3
| Previously it could end up using the “SKL early” device on BXT depending on the revision number. This would probably break things because, for example, has_llc would be wrong.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: set stage flag for structs and arrays in resource list | Timothy Arceri | 2015-07-30 | 1 | -3/+13
| This fixes the remaining failing tests in: ES31-CTS.program_interface_query.uniform-types
|
| Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
| Reviewed-by: Tapani Pälli <[email protected]>
* docs: consolidate radeonsi in GL3.txt | Dave Airlie | 2015-07-30 | 1 | -16/+16
| move into DONE for GL4.0 and GL4.1
|
| Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: enable GL4.1 and update documentation (v2) | Dave Airlie | 2015-07-30 | 3 | -9/+10
| This enables GL4.1 for radeonsi, and updates the docs in the correct places.
|
| v2: enable only for llvm 3.7 which has fixes in place.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: add GS multiple streams support (v2) | Dave Airlie | 2015-07-30 | 6 | -39/+127
| This is the final piece for ARB_gpu_shader5. The code is based on the r600 code from Glenn Kennard and myself.
|
| While developing this, I wasn't 100% sure of all the calculations made in the GS registers, which is why the max_stream is worked out there and used to limit the changes in registers. Otherwise my initial attempts either regressed GS texelFetch tests or primitive-id-restart.
|
| The current code has no regressions in piglit.
|
| This commit doesn't enable ARB_gpu_shader5, since that just bumps the glsl level to 4.00, so I'll just do a separate patch for 4.10.
|
| v1.1: fix bug introduced in rebase.
| v2: Address Marek's review comments, remove my llvm stream code for simpler C, move gsvs_ring and gs_next_vertex to arrays.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* Delete unused functions in format parser | Anuj Phogat | 2015-07-29 | 1 | -7/+0
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Neil Roberts <[email protected]>
* i965: Change the type of max_{vs, hs, ...}_threads variables to unsigned | Anuj Phogat | 2015-07-29 | 2 | -7/+7
| Fixes the following compiler warning:
| brw_cs.cpp:386:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
|
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
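A short sketch of why -Wsign-compare exists may help here. In a mixed comparison the signed operand is converted to unsigned first; the cast below spells out that implicit conversion so the example compiles cleanly. The function names are hypothetical and unrelated to the i965 sources.

```c
/* Mixed comparison: a signed limit is converted to unsigned before the
 * comparison, so a negative limit turns into a huge positive one. The
 * explicit cast shows what the compiler otherwise does silently. */
static int fits_mixed(int max_threads, unsigned requested)
{
    return requested <= (unsigned)max_threads;
}

/* Same-type comparison: both operands are unsigned, nothing is converted,
 * and the compiler has nothing to warn about. */
static int fits_unsigned(unsigned max_threads, unsigned requested)
{
    return requested <= max_threads;
}
```

With a limit of -1, `fits_mixed` accepts any request, which is the kind of surprise that storing the limit as unsigned rules out.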
* Delete duplicate function is_power_of_two() and use _mesa_is_pow_two() | Anuj Phogat | 2015-07-29 | 8 | -26/+15
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
| Reviewed-by: Tapani Pälli <[email protected]>
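For readers unfamiliar with the check being deduplicated, a power-of-two test is usually a single bit trick. The sketch below uses a hypothetical name, `is_pow_two`, and rejects zero, which is an assumption of this example; the Mesa helper's signature and its treatment of zero may differ.

```c
#include <stdbool.h>

/* True if exactly one bit is set. v & (v - 1) clears the lowest set bit,
 * so the result is zero only when there was a single bit to begin with.
 * Zero has no bits set and is rejected explicitly (an assumption of this
 * sketch). */
static bool is_pow_two(unsigned v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```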
* gallium/auxiliary: Ensure c99_math.h is included. | Jose Fonseca | 2015-07-29 | 1 | -1/+2
| As it is needed for exp2.
|
| Trivial.
* c99_math: (trivial) implement exp2 for MSVC too | Roland Scheidegger | 2015-07-29 | 1 | -0/+6
| Unsurprisingly doesn't build otherwise with old msvc.
* i965/bxt: Support 3src simd16 instructions | Ben Widawsky | 2015-07-29 | 1 | -3/+1
| This is easily accomplished by moving simd16 3src to GEN9_FEATURES.
|
| v2: small cleanup to make it more similar to GEN8_FEATURES
|
| Signed-off-by: Ben Widawsky <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* targets/dri: scons: add missing link against libdrm | Emil Velikov | 2015-07-29 | 1 | -0/+2
| Otherwise the final dri module will have (additional) unresolved symbols.
|
| Cc: Brian Paul <[email protected]>
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Jose Fonseca <[email protected]>
* svga: scons: remove unused HAVE_SYS_TYPES_H define | Emil Velikov | 2015-07-29 | 2 | -2/+0
| There isn't a single instance in mesa that mentions HAVE_SYS_TYPES_H, other than this file.
|
| Cc: Jose Fonseca <[email protected]>
| Acked-by: Brian Paul <[email protected]>
| Signed-off-by: Emil Velikov <[email protected]>
* glsl: Avoid double promotion. | Matt Turner | 2015-07-29 | 1 | -2/+2
|
* mesa: Avoid double promotion. | Matt Turner | 2015-07-29 | 14 | -49/+49
| Reviewed-by: Iago Toral Quiroga <[email protected]>
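"Double promotion" refers to a float expression being silently evaluated in double precision, typically because of an unsuffixed literal. The sketch below makes the difference visible through the type of the result; the function names are hypothetical and the snippet is not taken from the patches.

```c
#include <stddef.h>

/* 0.5 is a double literal: x is promoted and the product has type double. */
static size_t size_with_double_literal(float x)
{
    return sizeof(x * 0.5);
}

/* 0.5f is a float literal: the product stays in single precision. */
static size_t size_with_float_literal(float x)
{
    return sizeof(x * 0.5f);
}

/* Float-only arithmetic: no promotion to double and no conversion back. */
static float halve(float x)
{
    return x * 0.5f;
}
```

GCC can report such cases with -Wdouble-promotion; adding the `f` suffix, or calling the `f` variant of a math function, is the usual remedy.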