Commit message | Author | Age | Files | Lines
* draw: Delete unneeded LLVM stuff earlier.Frank Henigman2014-05-141-15/+4
| | | | | | | | | | Free up unneeded LLVM stuff immediately after generating vertex shader code. Saves about 500K per shader. v2: Don't bother calling gallivm_free_function (Jose) Signed-off-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Separate freeing LLVM intermediate data from freeing final code.Frank Henigman2014-05-142-7/+22
| | | | | | | | | | | | Split free_gallivm_state() into two steps. First step is gallivm_free_ir() which cleans up the LLVM scaffolding used to generate code while preserving the code itself. Second step is gallivm_free_code() to free the memory occupied by the code. v2: s/gallivm_teardown/gallivm_free_ir/ (Jose) Signed-off-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
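The two-step teardown described above can be sketched in isolation. This is a minimal illustration, not Mesa's code: `demo_state`, `demo_compile`, `demo_free_ir` and `demo_free_code` are hypothetical names that only mirror the split into `gallivm_free_ir()` and `gallivm_free_code()`, and the requirement that step 1 runs before step 2 is an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical state: 'ir' stands in for the scaffolding used to generate
 * code, 'code' for the generated code itself. */
struct demo_state {
    char *ir;
    unsigned char *code;
    size_t code_size;
};

/* Build both parts. Returns 0 on success, -1 on allocation failure. */
int demo_compile(struct demo_state *s, const char *source)
{
    size_t len = strlen(source);

    s->ir = malloc(len + 1);
    s->code = malloc(len ? len : 1);
    if (!s->ir || !s->code) {
        free(s->ir);
        free(s->code);
        s->ir = NULL;
        s->code = NULL;
        s->code_size = 0;
        return -1;
    }
    memcpy(s->ir, source, len + 1);
    memcpy(s->code, source, len);   /* stand-in for code generation */
    s->code_size = len;
    return 0;
}

/* Step 1: drop the intermediate data, keep the code usable. */
void demo_free_ir(struct demo_state *s)
{
    free(s->ir);
    s->ir = NULL;
}

/* Step 2: release the code once nothing will run it any more. */
void demo_free_code(struct demo_state *s)
{
    assert(s->ir == NULL);  /* assumption of this sketch: step 1 ran first */
    free(s->code);
    s->code = NULL;
    s->code_size = 0;
}
```

A caller that needs the code for a long time would call `demo_free_ir()` right after `demo_compile()` and postpone `demo_free_code()` until the shader is destroyed.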
* gallivm: One code memory pool with deferred free.Frank Henigman2014-05-144-1/+283
| | | | | | | | | | | | | | | | Provide a JITMemoryManager derivative which puts all generated code into one memory pool instead of creating a new one each time code is generated. This saves significant memory per shader as the pool size is 512K and a small shader occupies just several K. This memory manager also defers freeing generated code until you tell it to do so, making it possible to destroy the LLVM engine while keeping the code, thus enabling future memory savings. v2: Fix compilation errors with LLVM 3.4 (Jose) Signed-off-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
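The idea of one shared pool with deferred free can be sketched as a small bump allocator. This is a simplified illustration under stated assumptions, not the JITMemoryManager derivative itself: all `demo_*` names are hypothetical, plain `malloc` stands in for executable memory, and the pool space is reused only once every block has been released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_POOL_SIZE (512u * 1024u)   /* pool size quoted in the commit message */

struct demo_pool {
    unsigned char *base;   /* one shared block for all generated code */
    size_t used;           /* bump pointer */
    size_t live;           /* code blocks not yet released */
};

struct demo_code {
    struct demo_pool *pool;
    void *addr;
    size_t size;
};

int demo_pool_init(struct demo_pool *p)
{
    p->base = malloc(DEMO_POOL_SIZE);
    p->used = 0;
    p->live = 0;
    return p->base ? 0 : -1;
}

/* Carve 'size' bytes out of the shared pool instead of creating a new pool. */
int demo_pool_emit(struct demo_pool *p, size_t size, struct demo_code *out)
{
    size_t rounded = (size + 15u) & ~(size_t)15u;   /* keep 16-byte alignment */

    if (rounded < size || rounded > DEMO_POOL_SIZE - p->used)
        return -1;
    out->pool = p;
    out->addr = p->base + p->used;
    out->size = rounded;
    p->used += rounded;
    p->live++;
    return 0;
}

/* Deferred free: nothing is reclaimed until the caller asks, and the pool
 * space is only reused once every block has been released. */
void demo_code_free(struct demo_code *c)
{
    assert(c->pool && c->pool->live > 0);
    if (--c->pool->live == 0)
        c->pool->used = 0;
    c->pool = NULL;
    c->addr = NULL;
}

void demo_pool_fini(struct demo_pool *p)
{
    free(p->base);
    p->base = NULL;
}
```

Because a `demo_code` handle refers only to the pool, whatever produced the code can be torn down while the handle stays valid, which is the property the commit message relies on.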
* gallivm: Run passes per module, not per function.José Fonseca2014-05-141-28/+19
| | | | | | This is how it is meant to be done nowadays. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use LLVM global context.José Fonseca2014-05-141-23/+17
| | | | | | | | | | | I saw that LLVM internally uses its global context for some things, even when we use our own. Given ours is also global, might as well use LLVM's. However, separate contexts can still be enabled with a simple source code modification, for when the need/benefit arises. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Stop using module providers.José Fonseca2014-05-142-27/+7
| | | | | | | Nowadays LLVMModuleProviderRef is just an alias for LLVMModuleRef, so its use just causes unnecessary confusion. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm,draw,llvmpipe: Remove support for versions of LLVM prior to 3.1.José Fonseca2014-05-1415-548/+20
| | | | | | | Older versions haven't been tested and probably don't work anyway. But more importantly, code supporting them is hindering further work. Reviewed-by: Roland Scheidegger <[email protected]>
* configure: Require LLVM 3.1.José Fonseca2014-05-141-0/+6
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* scons: Require LLVM 3.1José Fonseca2014-05-141-44/+13
| | | | | | Support for prior versions will be removed in the following change. Reviewed-by: Roland Scheidegger <[email protected]>
* i965: Reformat brw_set_src1 so it can be easily found with grep.Matt Turner2014-05-131-3/+4
|
* i965: fix size assert for gen7 in brw_init_compaction_tables()Samuel Iglesias Gonsalvez2014-05-131-4/+4
| | | | | | | | It should compare with its own size. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
* i965: Relax accumulator dependency scheduling on Gen < 6Iago Toral Quiroga2014-05-133-59/+36
| | | | | | | | | | | Many instructions implicitly update the accumulator on Gen < 6. The instruction scheduling code just calls add_barrier_deps() for each accumulator access on these platforms, but a large class of operations don't actually update the accumulator -- mostly move and logical instructions. Teaching the scheduling code about this would allow more flexibility to schedule instructions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740 Reviewed-by: Matt Turner <[email protected]>
* glsl: simplify the M_PI*f macros, fixes build on OpenBSDJonathan Gray2014-05-131-5/+3
| | | | | | | | | | | | | | | | | | The M_PI*f macros used a preprocessor paste to append 'f' to M_PI defines, which works if the values are only numbers but breaks on OpenBSD where M_PI definitions have casts and brackets to meet requirements of a future version of POSIX, http://austingroupbugs.net/view.php?id=801 http://austingroupbugs.net/view.php?id=828 Simplify the M_PI*f macros by using casts directly in the defines as suggested by Kenneth Graunke. Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78665 Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Jonathan Gray <[email protected]>
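The failure mode and the cast-based fix can be reproduced without any platform headers. In this sketch `DEMO_PI`, `DEMO_PI_F` and `demo_circumference` are hypothetical names; `DEMO_PI` is written with a cast and parentheses to imitate the kind of definition the commit message attributes to OpenBSD.

```c
#include <assert.h>

/* Hypothetical constant written the way the commit message describes
 * OpenBSD's M_PI: with a cast and parentheses rather than a bare number. */
#define DEMO_PI ((double)3.14159265358979323846)

/* Paste-based form, kept in a comment because it does not compile here:
 *
 *     #define DEMO_PASTE_F(x) x##f
 *     #define DEMO_MAKE_F(x)  DEMO_PASTE_F(x)
 *     float pi = DEMO_MAKE_F(DEMO_PI);
 *
 * DEMO_PI expands first, so the paste tries to join ')' and 'f', which is
 * not a valid token. With a bare 3.14159... it would have produced
 * 3.14159...f and worked.
 */

/* Cast-based form: independent of how DEMO_PI is spelled. */
#define DEMO_PI_F ((float)DEMO_PI)

static_assert(sizeof(DEMO_PI_F) == sizeof(float), "DEMO_PI_F has type float");

float demo_circumference(float radius)
{
    return 2.0f * DEMO_PI_F * radius;
}
```

The cast form works whether the underlying constant is a bare literal or a parenthesised expression, which is why it is the more portable of the two.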
* docs: Really add the 10.1.3 release notes this timeCarl Worth2014-05-131-0/+90
| | | | | Commit a96c3bccf6791359d1159ebe9475e0ed5cf790ed intended to add these, but I forgot to add the file.
* freedreno/a3xx: occlusion query supportRob Clark2014-05-135-3/+185
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add support for hw queriesRob Clark2014-05-1310-8/+734
| | | | | | | | | | | Real GPU queries need some infrastructure to track samples per tile and accumulate the results. But fortunately this can be shared across GPU generations. See: https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries Signed-off-by: Rob Clark <[email protected]>
* freedreno/query: allow multiple query implementationsRob Clark2014-05-136-107/+269
| | | | | | | | Split out fd_query into an abstract base class, to allow multiple implementations. The current sw based queries are moved into fd_sw_query. Signed-off-by: Rob Clark <[email protected]>
* mesa: Dump ARB_vp/fp source and IR when MESA_GLSL=dump.Kenneth Graunke2014-05-131-1/+26
| | | | | | | | | | As far as I can tell, Mesa hasn't had a convenient way to dump ARB_vp/fp source until now. Using MESA_GLSL=dump is convenient, since it means you can use a single environment variable to dump a program's shaders, no matter which language they're written in. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage.Kenneth Graunke2014-05-131-1/+1
| | | | | | | | | | | | | | | The point of copytexsubimage_using_blit_framebuffer is to use a hardware accelerated BlitFramebuffer path. If that fails, we shouldn't do a swrast blit---we should try our CTSI fallback code. This is especially important for i965 and GLES, where we don't even create a swrast context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Cc: "10.2" <[email protected]>
* i965/gen8: Set depth extent fieldJordan Justen2014-05-131-1/+1
| | | | | | | | | | | | The depth extent field is used to limit the allowed slice range that can be rendered to. With the previous setting, only slice 0 could be rendered. This fixes piglit amd_vertex_shader_layer-layered-depth-texture-render. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/gen8 depth: Set depth size based on LOD0 for 3D texturesJordan Justen2014-05-131-2/+2
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/gen7 depth: Set depth size based on LOD0 for 3D texturesJordan Justen2014-05-131-2/+2
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/gen8 renderbuffer: Set depth size based on LOD0 for 3D texturesJordan Justen2014-05-131-1/+1
| | | | | | | | Fixes piglit's 'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped' Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/gen7 renderbuffer: Set depth size based on LOD0 for 3D texturesJordan Justen2014-05-131-1/+1
| | | | | | | | | | | | If blorp is disabled for color clears, then piglit's 'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped' will fail. Currently, gen8 fails similarly on this test because gen8 does not use blorp. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* freedreno/a3xx: add point-sizeRob Clark2014-05-131-4/+14
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2014-05-134-54/+252
| | | | Signed-off-by: Rob Clark <[email protected]>
* glsl_to_tgsi: remove unnecessary dead code elimination passBryan Cain2014-05-131-45/+5
| | | | | | | | | With the more advanced dead code elimination pass already being run, eliminate_dead_code was making no difference in instruction count, and had an undesirable O(n^2) runtime. So remove it and rename eliminate_dead_code_advanced to eliminate_dead_code. Reviewed-by: Marek Olšák <marek.olsak at amd.com>
* ralloc: Omit detailed license information about talloc.José Fonseca2014-05-131-4/+3
| | | | | | | | | | | | That information misleads source code auditing tools to think that ralloc itself is released under LGPL v3. Instead, simply state talloc is not licensed under a permissive license. v2: Use wording suggested by Kenneth. Reviewed-by: Brian Paul <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Avoid redundant call to brw_merge_inputs() in brw_try_draw_prims()Iago Toral Quiroga2014-05-131-7/+6
| | | | | | | | We always call brw_merge_inputs() right before looping over the primitives but this can be called inside the loop for each primitive too. In the case we do it for the first primitive the call is redundant and can be skipped. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Do not call lhs->variable_referenced() multiple timesIago Toral Quiroga2014-05-131-3/+2
| | | | | | Instead take the result from the first call and use it where needed. Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Refactor state save/restore for framebuffer texture blitsTopi Pohjolainen2014-05-132-22/+52
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* wayland: Move version 2 request to end of interface specificationKristian Høgsberg2014-05-121-16/+19
| | | | | | | | | | We're moving towards requiring interface additions to be appended to the end of the interface block. No functional change, opcodes are assigned as before, but version 2 additions are now grouped together, which prevents a scanner warning. Cc: "10.2" <[email protected]> Signed-off-by: Kristian Høgsberg <[email protected]>
* glsl: the number of samplers is already calculated so use itTimothy Arceri2014-05-131-2/+1
| | | | | Signed-off-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Stop doing remapping of "special" regs.Eric Anholt2014-05-121-37/+0
| | | | | | | | Now that we aren't using pixel_[xy] in live variables, nothing is looking at these regs after the visitor stage. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Generalize the pixel_x/y workaround for all UW types.Eric Anholt2014-05-121-4/+4
| | | | | | | | | | | | | | | This is the only case where a fs_reg in brw_fs_visitor is used during optimization/code generation, and it meant that optimizations had to be careful to not move pixel_x/y's register number without updating it. Additionally, it turns out we had a couple of other UW values that weren't getting this treatment (like gl_SampleID), so this more general fix is probably a good idea (though I wasn't able to replicate problems with either pixel_[xy]'s values or gl_SampleID, even when telling the register allocator to reuse registers immediately) Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move has_hiz from the slice to the level.Eric Anholt2014-05-125-30/+25
| | | | | | | | The value depends only on the level, so no need to store the bool per slice. Shrinks intel_mipmap_slice from 24 bytes to 16, while slotting into an existing hole in intel_mipmap_level. Reviewed-by: Chad Versace <[email protected]>
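The size effect described above comes from struct padding, and can be illustrated with stand-in types. The structs below are hypothetical and are not the real `intel_mipmap_slice` or `intel_mipmap_level`; the field names and the compile-time checks assume a typical ABI where `bool` is one byte and structs are padded to the alignment of their widest member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-slice record with the flag stored in every slice (hypothetical fields). */
struct demo_slice_before {
    uint32_t x_offset;
    uint32_t y_offset;
    void *map;
    bool has_hiz;        /* forces trailing padding up to pointer alignment */
};

struct demo_slice_after {
    uint32_t x_offset;
    uint32_t y_offset;
    void *map;
};

/* Per-level record: 4 + 2 bytes of fields leave a 2-byte hole at the end. */
struct demo_level_before {
    uint32_t depth;
    uint16_t slice_count;
};

struct demo_level_after {
    uint32_t depth;
    uint16_t slice_count;
    bool has_hiz;        /* lands in the hole, so the struct does not grow */
};

static_assert(sizeof(struct demo_slice_after) < sizeof(struct demo_slice_before),
              "removing the flag shrinks the slice");
static_assert(sizeof(struct demo_level_after) == sizeof(struct demo_level_before),
              "the flag fits into existing padding in the level");

/* Bytes saved across 'slices' slice records. */
size_t demo_bytes_saved(size_t slices)
{
    return slices * (sizeof(struct demo_slice_before) -
                     sizeof(struct demo_slice_after));
}
```

On a typical 64-bit ABI the two stand-in slice structs come out at 24 and 16 bytes, the same figures the commit message gives for the real struct.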
* meta: Refactor configuration of renderbuffer samplingTopi Pohjolainen2014-05-122-13/+30
| | | | | | Cc: "10.2" <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Refactor binding of renderbuffer as texture imageTopi Pohjolainen2014-05-122-30/+47
| | | | | | Cc: "10.2" <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Merge compiling and linking of blit programTopi Pohjolainen2014-05-123-31/+39
| | | | | | Cc: "10.2" <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Expose coordinate scissoring and mirroringTopi Pohjolainen2014-05-124-118/+213
| | | | | | Cc: "10.2" <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen8: Use helper variables for surface parametersTopi Pohjolainen2014-05-121-4/+8
| | | | | | Cc: "10.2" <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nv50,nvc0: fix blit 3d path for 1d array texturesIlia Mirkin2014-05-111-0/+6
| | | | | | | | | | Need to adjust coordinates since the shader receives the array index as depth in z, but the TEX instruction expects it to be the second coordinate for a 1D array texture. This fixes fbo-generatemipmap-array. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ben Skeggs <[email protected]> Cc: "10.2" <[email protected]>
* nv50,nvc0: leave queries on during blit, turn them on for 2d engineIlia Mirkin2014-05-116-6/+35
| | | | | | | | Fixes the new logic of the conditional rendering piglit test. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ben Skeggs <[email protected]> Cc: "10.2" <[email protected]>
* mesa/st: leave current query enabled during glBlitFramebufferIlia Mirkin2014-05-113-0/+4
| | | | | | | | | Also make sure that pipe_blit_info gets zero'd out so that query isn't accidentally left enabled. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add bit to pipe_blit_info to leave current query enabledIlia Mirkin2014-05-111-0/+3
| | | | | | | | | | Previously the implication was that queries should be disabled during blits. However glBlitFramebuffer() is supposed to obey the current query, and this new bit will indicate that to the driver. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
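The opt-in flag described here, together with the zero-initialisation mentioned in the mesa/st commit, can be sketched as follows. The struct is a hypothetical, trimmed-down stand-in rather than the real `pipe_blit_info`, and `keep_query_enabled` is an invented field name used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, trimmed-down blit descriptor; not the real pipe_blit_info. */
struct demo_blit_info {
    int src_level;
    int dst_level;
    bool keep_query_enabled;   /* hypothetical name for the new bit */
};

/* Zero the whole struct first so that any flag the caller does not set
 * explicitly keeps its "off" default. */
void demo_blit_info_init(struct demo_blit_info *info)
{
    assert(info != NULL);
    memset(info, 0, sizeof(*info));
}

/* Caller that must honour the current query, as glBlitFramebuffer() does. */
void demo_setup_framebuffer_blit(struct demo_blit_info *info)
{
    demo_blit_info_init(info);
    info->keep_query_enabled = true;
}

/* Driver-internal blit that should not be affected by the query. */
void demo_setup_internal_blit(struct demo_blit_info *info)
{
    demo_blit_info_init(info);
}
```

Making the flag opt-in means existing callers that never set it keep the earlier behaviour, provided they zero the struct before filling it in.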
* nv50: fix setting of texture ms info to be per-stageIlia Mirkin2014-05-113-6/+10
| | | | | | | | | | Different textures may be bound to each slot for each stage. So we need to be able to upload ms parameters for each one without stages overwriting each other. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ben Skeggs <[email protected]> Cc: "10.1 10.2" <[email protected]>
* nv50/ir: make sure to reverse cond codes on all the OP_SET variantsIlia Mirkin2014-05-111-1/+2
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ben Skeggs <[email protected]> Cc: "10.2 10.1" <[email protected]>
* freedreno/a2xx: fix compiler warningRob Clark2014-05-111-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* radeonsi: prepare depth export registers at compile timeMarek Olšák2014-05-103-14/+14
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: simplify depth/stencil export codeMarek Olšák2014-05-101-11/+5
| | | | Reviewed-by: Michel Dänzer <[email protected]>