path: root/src
Commit message (Author, Age, Files, Lines)
* ilo: s/TRANSFER_MAP_UNSYNC/TRANSFER_MAP_GTT_UNSYNC/ (Chia-I Wu, 2014-07-28, 2 files, -6/+6)
    It maps to drm_intel_gem_bo_map_unsynchronized(), which results in
    unsynchronized GTT mapping.
* ilo: drop unused context param from transfer functions (Chia-I Wu, 2014-07-28, 1 file, -115/+100)
    Many of the transfer functions do not need an ilo_context. Drop it.
* ilo: tidy up transfer mapping/unmapping (Chia-I Wu, 2014-07-28, 1 file, -88/+89)
    Add xfer_map() to replace map_bo_for_transfer(). Add xfer_unmap() and
    xfer_alloc_staging_sys() to simplify texture and buffer
    mapping/unmapping, and enable more code sharing between them.
* ilo: tidy up choose_transfer_method() (Chia-I Wu, 2014-07-28, 1 file, -84/+164)
    Add a bunch of helper functions and a big comment for
    choose_transfer_method(). This also fixes handling of
    PIPE_TRANSFER_MAP_DIRECTLY to not ignore tiling.
* ilo: free transfers with util_slab_free() (Chia-I Wu, 2014-07-28, 1 file, -1/+1)
    We used FREE() in one of the error paths.
* clover: Add clUnloadPlatformCompiler. (EdB, 2014-07-28, 2 files, -1/+6)
    Reviewed-by: Francisco Jerez <[email protected]>
* clover: Add clCreateProgramWithBuiltInKernels. (EdB, 2014-07-28, 2 files, -1/+23)
    [ Francisco Jerez: Check for devices not associated with the specified
      context. Style fix. ]
    Reviewed-by: Francisco Jerez <[email protected]>
* glsl/cs: Add several GLSL compute shader variables (Jordan Justen, 2014-07-27, 1 file, -0/+6)
    With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit:
    built-in-constants tests/spec/arb_compute_shader/minimum-maximums.txt

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* main/cs: Add additional compute shader constant values (Jordan Justen, 2014-07-27, 2 files, -0/+18)
    With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit:
    * arb_compute_shader-minmax

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* glsl: No longer require ubo block index to be constant in ir_validate (Chris Forbes, 2014-07-26, 1 file, -1/+0)
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Accept nonconstant array references in lower_ubo_reference (Chris Forbes, 2014-07-26, 1 file, -11/+32)
    Instead of falling back to just the block name (which we won't find),
    look for the first element of the block array. We'll deal with the rest
    in the backend by arranging for the blocks to be laid out contiguously.

    V2: Squashed together patches 3, 5 of V1, plus a naming tweak.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Convert uniform_block in lower_ubo_reference to ir_rvalue. (Chris Forbes, 2014-07-26, 1 file, -7/+8)
    Previously this was a block index with special semantics for -1. With
    ARB_gpu_shader5, this need not be a compile-time constant, so allow any
    rvalue here and convert the -1 to a NULL pointer.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Mark entire UBO array active if indexed with non-constant. (Chris Forbes, 2014-07-26, 1 file, -19/+31)
    Without doing a lot more work, we have no idea which indices may be
    used at runtime, so just mark them all.

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Allow non-constant UBO array indexing with GLSL4/ARB_gpu_shader5. (Chris Forbes, 2014-07-26, 1 file, -1/+2)
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* ilo: simplify ilo_flush() (Chia-I Wu, 2014-07-26, 3 files, -20/+30)
    Move fence creation to the new ilo_fence_create().
* r600g/compute: Defrag the pool at the same time as we grow it (Bruno Jiménez, 2014-07-25, 2 files, -23/+19)
    This gives us two things: we now need fewer item copies when we have to
    defrag+grow the pool (just one copy per item) and, even in the case
    where we don't need to defrag the pool, we reduce the data copied to
    just the useful data that the items use.

    Note: The fallback path is a bit ugly now, but hopefully we won't need
    it much.

    Reviewed-by: Tom Stellard <[email protected]>
* r600g/compute: Try to use a temporary resource when growing the pool (Bruno Jiménez, 2014-07-25, 1 file, -18/+43)
    Now, before moving everything to host memory, we try to create a new
    resource to use as a pool. If we succeed we just use this resource and
    delete the previous one. If we fail we fall back to using the shadow.

    This should make growing the pool faster, and we can also save the 64KB
    of memory that was allocated for the 'shadow', even if it wasn't used.

    Reviewed-by: Tom Stellard <[email protected]>
* freedreno: fix typo in gpu version check (Rob Clark, 2014-07-25, 1 file, -1/+1)
    Oops, I should use larger fonts, I guess.

    Reported-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split out shader compiler from a3xx (Rob Clark, 2014-07-25, 25 files, -477/+580)
    Move the bits we want to share between generations from fd3_program to
    ir3_shader. So overall structure is:

        fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3
                                          |- ...
                                          \- ir3_shader_variant -> ir3

    So the ir3_shader becomes the topmost generation neutral object, which
    manages the set of variants, each of which generates, compiles, and
    assembles its own ir.

    There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/,
    etc.

    Keep the split between the gallium level stateobj and the shader helper
    object because it might be a good idea to pre-compute some generation
    specific register values (ie. anything that is independent of linking).

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: rename ir3_shader to ir3 (Rob Clark, 2014-07-25, 12 files, -55/+55)
    First step of the reorganization to split out the compiler (so it can
    be shared between a3xx and a4xx). Rename ir3_shader -> ir3 (since we'll
    want the name ir3_shader for a higher level object).

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: scheduler vs pred reg (Rob Clark, 2014-07-25, 2 files, -3/+51)
    The scheduler also needs to be aware of the predicate register (p0) in
    addition to the address register (a0).

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: little cleanups (Rob Clark, 2014-07-25, 4 files, -39/+19)
    Remove some obsolete comments, rename deref->addr.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: enable/disable wa's based on patch-level (Rob Clark, 2014-07-25, 4 files, -8/+34)
    It seems like for the most part, different behaviors, workarounds, etc,
    should be conditional on GPU patch revision (ie. a320.0 vs a320.2)
    rather than GPU id (a320 vs a330).

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: make IR heap dyanmic (Rob Clark, 2014-07-25, 2 files, -8/+43)
    The fixed size heap is a remnant of the fdre-a3xx assembler. Yet it is
    convenient for being able to free the entire data structure in one shot
    without worrying about leaking nodes. Change it to dynamically grow the
    heap size (adding chunks) as needed so we don't have an artificial
    upper limit on shader size (other than hw limits) and don't always have
    to allocate worst-case size.

    Signed-off-by: Rob Clark <[email protected]>
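The chunk-growing heap described above can be sketched roughly as follows. This is only an illustration of the idea, not the ir3 code: the names `sketch_heap`, `sketch_heap_alloc`, `sketch_heap_destroy` and the 4096-byte chunk size are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical chunk size; the real driver may use something else. */
#define SKETCH_CHUNK_SIZE 4096

struct sketch_chunk {
    struct sketch_chunk *next;   /* previously allocated chunk */
    size_t used;                 /* bytes handed out from data[] */
    unsigned char data[SKETCH_CHUNK_SIZE];
};

struct sketch_heap {
    struct sketch_chunk *chunks; /* newest chunk first */
};

/* Bump-allocate from the newest chunk, adding a chunk when it is full. */
static void *sketch_heap_alloc(struct sketch_heap *h, size_t size)
{
    struct sketch_chunk *c = h->chunks;

    size = (size + 7) & ~(size_t)7;      /* keep allocations 8-byte aligned */
    assert(size <= SKETCH_CHUNK_SIZE);   /* oversized requests not handled */

    if (!c || c->used + size > SKETCH_CHUNK_SIZE) {
        c = calloc(1, sizeof(*c));
        if (!c)
            return NULL;
        c->next = h->chunks;
        h->chunks = c;
    }

    void *ptr = &c->data[c->used];
    c->used += size;
    return ptr;
}

/* Free every node in one shot by walking the chunk list. */
static void sketch_heap_destroy(struct sketch_heap *h)
{
    while (h->chunks) {
        struct sketch_chunk *c = h->chunks;
        h->chunks = c->next;
        free(c);
    }
}
```

Individual nodes are never freed on their own, which keeps the "free everything in one shot" property while removing the fixed upper bound on total size.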
* r600g/compute: Fix singed/unsigned comparison compiler warnings. (Jan Vesely, 2014-07-25, 1 file, -7/+7)
    The iteration variables go from 0 anyway.

    Signed-off-by: Jan Vesely <[email protected]>
    Reviewed-by: Tom Stellard <[email protected]>
* clover: Query the device to see if images are supported (Tom Stellard, 2014-07-25, 3 files, -1/+8)
    Reviewed-by: Francisco Jerez <[email protected]>
* gallium: Add PIPE_CAP_COMPUTE_IMAGES_SUPPORTED (Tom Stellard, 2014-07-25, 3 files, -1/+11)
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Francisco Jerez <[email protected]>
* r600g/compute: Allow compute_memory_defrag to defragment between resources (Bruno Jiménez, 2014-07-25, 2 files, -5/+7)
    This will be used in the following patch to avoid duplicated code.

    Reviewed-by: Tom Stellard <[email protected]>
* r600g/compute: Allow compute_memory_move_item to move items between resources (Bruno Jiménez, 2014-07-25, 2 files, -16/+16)
    v2: Remove unnecessary variables

    Reviewed-by: Tom Stellard <[email protected]>
* gbm: Search LIBGL_DRIVERS_PATH if GBM_DRIVERS_PATH is not set (Dylan Baker, 2014-07-24, 1 file, -1/+11)
    The GBM_DRIVERS_PATH environment variable is not documented, and only
    used to set the location of gbm drivers, while LIBGL_DRIVERS_PATH is
    used for everything else, and is documented. Generally this split leads
    to confusion as to why gbm doesn't work.

    This patch will read LIBGL_DRIVERS_PATH as a fallback if
    GBM_DRIVERS_PATH is not set. The comments clearly indicate that using
    LIBGL_DRIVERS_PATH is preferred over GBM_DRIVERS_PATH.

    v2: - Use GBM_DRIVERS_PATH as a fallback
    v3: [[email protected]]
        - Make LIBGL_DRIVERS_PATH the fallback

    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kristian Høgsberg <[email protected]>
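The lookup order described above amounts to something like the following. This is a minimal sketch assuming only the two environment variable names from the commit message; the helper name `sketch_get_driver_search_path` is hypothetical and is not the gbm function.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the driver search path from the environment,
 * or NULL when neither variable is set (the caller would then use its
 * built-in default directory). */
static const char *sketch_get_driver_search_path(void)
{
    /* GBM_DRIVERS_PATH is still honoured first when it is set ... */
    const char *path = getenv("GBM_DRIVERS_PATH");

    /* ... and LIBGL_DRIVERS_PATH is read as the fallback. */
    if (path == NULL)
        path = getenv("LIBGL_DRIVERS_PATH");

    return path;
}
```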
* winsys/radeon: fix indentation (Jerome Glisse, 2014-07-24, 1 file, -29/+29)
    Can we please keep it clean and avoid ending up in a messy situation
    like ddx.

    Signed-off-by: Jérôme Glisse <[email protected]>
* Add an accelerated version of F_TO_I for x86_64 (Jason Ekstrand, 2014-07-24, 1 file, -1/+5)
    According to a quick micro-benchmark, this new version is 20% faster on
    my Haswell laptop.

    v2: Removed the XXX note about x86_64 from the comment
    v3: Use an intrinsic instead of an __asm__ block. This should give us
        MSVC support for free.
    v4: Enable it for all x86_64 builds, not just with USE_X86_64_ASM

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
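As a rough illustration of the intrinsic-based approach from v3, a float-to-int conversion on x86_64 could look like the sketch below. The function name `sketch_f_to_i` and the choice of `_mm_cvtss_si32` are assumptions for the example; the commit message does not say which intrinsic the patch uses, and the non-x86_64 branch is just a placeholder.

```c
#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>
#endif

/* Hypothetical stand-in for a float-to-int helper; not Mesa's F_TO_I. */
static inline int sketch_f_to_i(float f)
{
#if defined(__x86_64__) || defined(_M_X64)
    /* SSE scalar conversion; rounds according to the current SSE rounding
     * mode (round-to-nearest-even unless the mode has been changed). */
    return _mm_cvtss_si32(_mm_set_ss(f));
#else
    /* Portable placeholder: rounds halfway cases away from zero, so exact
     * ties can differ from the SSE path. */
    return (int)(f >= 0.0f ? f + 0.5f : f - 0.5f);
#endif
}
```

Because an intrinsic is an ordinary function-like call rather than inline assembly, the same source can be built by compilers that do not accept GCC-style `__asm__` blocks.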
* i965/fs: Decide predicate/predicate_inverse outside of the for loop. (Matt Turner, 2014-07-24, 1 file, -9/+14)
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Swap if/else conditions in SEL peephole. (Matt Turner, 2014-07-24, 1 file, -3/+3)
    Will make the next commit easier to read.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Improve dead control flow elimination. (Matt Turner, 2014-07-24, 1 file, -10/+15)
    ... to eliminate an ELSE instruction followed immediately by an ENDIF.

    instructions in affected programs: 704 -> 700 (-0.57%)

    Reviewed-by: Kenneth Graunke <[email protected]>
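The pattern being removed can be shown on a toy instruction stream. The sketch below is not the i965 pass: the opcode enum and `sketch_remove_empty_else` are hypothetical, and a flat array stands in for the driver's real instruction list.

```c
#include <stddef.h>

/* Hypothetical opcodes for the illustration only. */
enum sketch_op { SK_IF, SK_ELSE, SK_ENDIF, SK_OTHER };

/* Compact the array in place, dropping every ELSE that is immediately
 * followed by an ENDIF (an empty else branch). Returns the new length. */
static size_t sketch_remove_empty_else(enum sketch_op *insts, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        if (insts[i] == SK_ELSE && i + 1 < n && insts[i + 1] == SK_ENDIF)
            continue; /* the else branch is empty, so the ELSE is dead */
        insts[out++] = insts[i];
    }
    return out;
}
```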
* nvc0/ir: support 2d constbuf indexing (Ilia Mirkin, 2014-07-24, 1 file, -0/+14)
    Signed-off-by: Ilia Mirkin <[email protected]>
* gm107/ir: emit LDC subops (Ilia Mirkin, 2014-07-24, 1 file, -0/+1)
    Signed-off-by: Ilia Mirkin <[email protected]>
* gk110/ir: emit load constant subop (Ilia Mirkin, 2014-07-24, 1 file, -0/+1)
    Signed-off-by: Ilia Mirkin <[email protected]>
* mesa/st: add support for interpolate_at_* ops (Ilia Mirkin, 2014-07-24, 1 file, -3/+9)
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* nv50/ir: fix phi/union sources when their def has been merged (Ilia Mirkin, 2014-07-24, 1 file, -0/+8)
    In a situation where double-register values are used, the phi nodes can
    still end up being u32 values. They all get merged into one RA node
    though. When fixing up the merge (which comes after the phi node), the
    phi node's def would get fixed, but not its sources which would remain
    at the low register value.

    This maintains the invariant that a phi node's defs and sources are
    allocated the same register.

    Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: fix hard-coded TYPE_U32 sized register (Ilia Mirkin, 2014-07-24, 1 file, -3/+4)
    Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: mark shader header if fp64 is used (Ilia Mirkin, 2014-07-24, 1 file, -0/+2)
    Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: keep track of whether the program uses fp64 (Ilia Mirkin, 2014-07-24, 2 files, -2/+7)
    Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: make sure that the local memory allocation is aligned to 0x10 (Ilia Mirkin, 2014-07-24, 1 file, -1/+1)
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: <[email protected]>
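Rounding a size up to a 0x10 boundary is commonly done with a mask, as in the sketch below. The helper name `sketch_align_up` is hypothetical; the commit message does not show how the driver performs the alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a non-zero power of two (e.g. 0x10). */
static uint32_t sketch_align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, a requested size of 0x11 bytes becomes 0x20, while a size that is already a multiple of 0x10 is left unchanged.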
* mesa: add ARB_clear_texture.xml to file list, remove duplicate decls (Ilia Mirkin, 2014-07-24, 2 files, -12/+1)
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* ilo: check the tilings of imported handles (Chia-I Wu, 2014-07-24, 1 file, -30/+36)
    Just to be cautious.
* ilo: clean up resource bo renaming (Chia-I Wu, 2014-07-24, 4 files, -51/+63)
    s/alloc_bo/rename_bo/ as that is what the functions do. Simplify bo
    allocation and move the complexity to bo renaming.
* ilo: share some code between {tex,buf}_create_bo (Chia-I Wu, 2014-07-24, 1 file, -59/+55)
    Add resource_get_bo_name() and resource_get_bo_initial_domain() for use
    by both functions.
* ilo: use native 3-component vertex formats on GEN7.5+ (Chia-I Wu, 2014-07-24, 2 files, -1/+6)
    GEN7.5 gains support for those formats natively.
* ilo: allow for device-dependent format translation (Chia-I Wu, 2014-07-24, 5 files, -32/+39)
    Pass ilo_dev_info to all format translation functions.