summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* mesa: reduce the size of gl_viewport_attribMarek Olšák2018-02-132-2/+2
| | | | | | | | | | | All drivers convert these to float, so there is no reason to use double. The piglit test that expects double precision from glGet will be adjusted not to require it (there is a piglit patch). gl_context::ViewportArray: 512 -> 384 bytes Reviewed-by: Mathias Fröhlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: reduce the size of gl_texture_objectMarek Olšák2018-02-132-30/+30
| | | | Reviewed-by: Brian Paul <[email protected]>
* mesa: reduce the size of gl_programMarek Olšák2018-02-132-7/+6
| | | | | | gl_program: 1456 -> 976 bytes Reviewed-by: Brian Paul <[email protected]>
* mesa: reduce the size of gl_image_unit (v2)Marek Olšák2018-02-132-7/+8
| | | | | | | | gl_context::ImageUnits: 6144 -> 4608 bytes v2: use ASSERT_BITFIELD_SIZE Reviewed-by: Brian Paul <[email protected]>
* mesa: further reduce the size of ctx->TextureMarek Olšák2018-02-132-23/+26
| | | | Reviewed-by: Brian Paul <[email protected]>
* mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8Marek Olšák2018-02-135-2/+37
| | | | | | | | | | GL allows doing glTexEnv on 192 texture units, while in reality, only MaxTextureCoordUnits units are used by fixed-func shaders. There is a piglit patch that adjusts piglits/texunits to check only MaxTextureCoordUnits units. Reviewed-by: Brian Paul <[email protected]>
* mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unitMarek Olšák2018-02-1332-174/+250
| | | | Reviewed-by: Brian Paul <[email protected]>
* mesa: inline init_texture_unitMarek Olšák2018-02-131-51/+39
| | | | | | because this is going to be changed Reviewed-by: Brian Paul <[email protected]>
* mesa: use GLenum16 in a few more placesMarek Olšák2018-02-131-3/+3
| | | | Reviewed-by: Brian Paul <[email protected]>
* anv: Move setting current_pipeline to cmd_state_initJason Ekstrand2018-02-121-1/+1
| | | | | | | | | | | We were setting current_pipeline to UINT32_MAX and then calling cmd_cmd_state_reset which memsets the entire state struct to 0 which implicitly resets current_pipeline to 3D. I have no idea how this hasn't caused everything to explode. Fixes: cd3feea74582 "anv/cmd_buffer: Rework anv_cmd_state_reset" cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Don't resolve or ambiguate non-existent layersJason Ekstrand2018-02-121-2/+10
| | | | | | | | | | | The previous code was trying to avoid non-existent layers by taking a MAX with anv_image_aux_layers. Unfortunately, it wasn't taking into account that layer_count starts at base_layer which may not be zero. Instead, we need to subtract base_layer from anv_image_aux_layers with a guard against roll-over. Fixes: de3be6180169f9 "anv/cmd_buffer: Rework aux tracking" Reviewed-by: Nanley Chery <[email protected]>
* i965: Fix bugs in intel_from_planarDaniel Stone2018-02-121-27/+29
| | | | | | | | | | | | | | This commit fixes two bugs in intel_from_planar. First, if the planar format was non-NULL but only had a single plane, we were falling through to the planar case. If we had a CCS modifier and plane == 1, we would return NULL instead of the CCS plane. Second, if we did end up in the planar_format == NULL case and the modifier was DRM_FORMAT_MOD_INVALID, we would end up segfaulting in isl_drm_modifier_has_aux. Cc: [email protected] Fixes: 8f6e54c92966bb94a3f05f2cc7ea804273e125ad Signed-off-by: Daniel Stone <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radv: Fix compiler warning about uninitialized 'set'Eric Anholt2018-02-121-1/+1
| | | | | | | The compiler doesn't figure out that we only get result == VK_SUCCESS if set got initialized. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl/tests: Fix strict aliasing warning about int64/double.Eric Anholt2018-02-121-3/+19
| | | | | Fixes: 4bf986274728 ("glsl/tests: Add UINT64 and INT64 types") Reviewed-by: Rhys Kidd <[email protected]>
* ac/nir: Fix compiler warning about uninitialized dw_addr.Eric Anholt2018-02-121-1/+1
| | | | | | | | Even switching the def's condition to be the same chip revision check as the use, the compiler doesn't figure it out. Just NULL-init it. Fixes: ec53e527421d ("ac/nir: Add ES output to LDS for GFX9.") Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax.Eric Anholt2018-02-121-1/+1
| | | | | | | My gcc doesn't figure out that dims >= 1 (seems reasonable), and doesn't notice that ddmax is used from the same no_rho_opt as its initialization. Reviewed-by: Roland Scheidegger <[email protected]>
* anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf.Kenneth Graunke2018-02-121-2/+1
| | | | | | | | | | | | | The kernel used to have execbuf parameters to program the INSTPM bit for whether 3DSTATE_CONSTANT_* should be relative to dynamic state base address or an absolute address. However, they never worked in the presence of hardware contexts, so I deleted them a while back. It doesn't make sense to set this flag, as it doesn't exist anymore. It also never did anything anyway - the flag is zero, so |'ing it in did nothing. The default is relative anyway. Reviewed-by: Jason Ekstrand <[email protected]>
* r200: remove left over dead codeEric Engestrom2018-02-121-20/+0
| | | | | | | | | | 0aaa27f29187ffb739c7 removed the references to this array without removing the array itself Cc: Ian Romanick <[email protected]> Fixes: 0aaa27f29187ffb739c7 "mesa: Pass the translated color logic op dd_function_table::LogicOpcode" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* ac/nir: remove backlink to nir_to_llvm_contextSamuel Pitoiset2018-02-121-6/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: remove nir_to_llvm_context::moduleSamuel Pitoiset2018-02-121-13/+10
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: remove nir_to_llvm_context::builderSamuel Pitoiset2018-02-121-95/+92
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type()Samuel Pitoiset2018-02-121-13/+13
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: drop nir_to_llvm_context from visit_var_atomic()Samuel Pitoiset2018-02-121-7/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex()Samuel Pitoiset2018-02-121-5/+5
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: drop nir_to_llvm_context from visit_load_push_constant()Samuel Pitoiset2018-02-121-6/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: drop nir_to_llvm_context from cast_ptr()Samuel Pitoiset2018-02-121-3/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index()Samuel Pitoiset2018-02-121-4/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: drop nir_to_llvm_context from emit_f2f16()Samuel Pitoiset2018-02-121-15/+14
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: remove unused parameters in abi::load_tess_coord()Samuel Pitoiset2018-02-123-10/+5
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: remove useless bitcast in load_tess_coord()Samuel Pitoiset2018-02-121-8/+3
| | | | | | | nir_intrinsic_load_tess_coord always returns a v3i32. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add load_resource() to the ABISamuel Pitoiset2018-02-122-7/+25
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add load_sample_mask_in() to the ABISamuel Pitoiset2018-02-123-8/+17
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: move view_index to the ABISamuel Pitoiset2018-02-122-15/+18
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: move push_constants to the ABISamuel Pitoiset2018-02-122-4/+5
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: move tg_size to the ABISamuel Pitoiset2018-02-122-3/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: remove unused nir_to_llvm_context:{defs,phis}Samuel Pitoiset2018-02-121-3/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* egl/gbm: Fix compiler warning about visual matching.Eric Anholt2018-02-121-1/+1
| | | | | | | The compiler doesn't know that num_visuals > 0. Fixes: 37a8d907cc16 ("egl/gbm: Ensure EGLConfigs match GBM surface format") Reviewed-by: Daniel Stone <[email protected]>
* freedreno: small fix for flushing dependent batchesRob Clark2018-02-101-0/+13
| | | | | | | | | | | | | | | Flush a resource's previous write_batch synchronously. Because a resource's associated batches are not updated until after the flush thread submits rendering to the kernel, this was causing a bit of confusion in the following loop. This fixes a bug that appeared with recent stk. Perhaps we need to re-work things a bit to clear out dependent patches in the ctx's thread and use a fence to deal with the period between when a flush is queued and when it is submitted to the kernel. But this will do until time permits a larger refactor. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: intra-block schedulingRob Clark2018-02-101-22/+104
| | | | | | | | | | | | Because of loops, we can't schedule all of a block's predecessors first. Instead just assume that the result consumed in a block was written far enough away in all paths into a block. And do an intra-block scheduling pass to figure out if there are any cases where we need to insert extra nop's. This works out better than always assuming the worst case (ie. that a value live into a block was written in the last instruction in the predecessor block). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: "boost" the depth of if/else conditionRob Clark2018-02-101-5/+6
| | | | | | | Account for the move to predicate register, to try to avoid needing to insert extra NOPs later. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: account for arrays in delayslot calcRob Clark2018-02-101-2/+30
| | | | | | | | | | | | | Normally false-deps are not something to consider, since they mostly exist for delay-slot related reasons: * barriers * ordering writes after read * SSBO/image access ordering The exception is a false-dependency on an array store. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: more clever legalize algorithmRob Clark2018-02-101-42/+96
| | | | | | | | | | | | | | Previously we didn't handle flow control in legalize, and instead just set (ss)(sy) on the first instruction in every block. Which isn't very clever. Instead, consider output state of all predecessor blocks, so we only set a sync bit if needed for any possible path leading into a block. Because of loops, we can't require that all successor blocks are legalized before a given block, so instead run in a loop until results converge. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: track block predecessorsRob Clark2018-02-102-7/+25
| | | | | | Useful in the following patches. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: clean up dangling false-dep'sRob Clark2018-02-102-0/+46
| | | | | | | | | | | | | Maybe there is a better way for this.. where it comes useful is "array" loads, which end up as a false-dep for a later array store. If all the uses of an array load are CP'd into their consumer, it still leaves the dangling array load, leading to funny things like: mov.u32u32 r5.y, r0.y mov.u32u32 r5.y, r0.z Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: handle IMMED for mad 2nd src special caseRob Clark2018-02-101-2/+4
| | | | | | | Consider also immediates for swapping the first two srcs, because they can be lowered to constant. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove ir3 phi instructionRob Clark2018-02-108-205/+16
| | | | | | Now that we convert phi webs to ssa, we can drop all this. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove lower_if_else passRob Clark2018-02-104-328/+0
| | | | | | Now that it is unused. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add experimental GCM passRob Clark2018-02-101-0/+7
| | | | | | | | | | Generally seems to do worse on instruction count and register usage, according to shader-db. But shader-db also doesn't do a very good job of weighting loop bodies, so that might not be totally valid. So add an env variable to enable GCM pass for easier experimentation. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: change opt passesRob Clark2018-02-101-0/+14
| | | | | | | There are more useful nir passes added since initial conversion to nir. But ir3 was never updated to use them. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use peephole select passRob Clark2018-02-101-1/+1
| | | | | | | | | | | | | | | Agressively lowering all if/else to selects in some extreme cases results in much higher register pressure. Using peephole select instead with a modest threshold speeds up alu2 4x! 16 seems like a good limit, low enough to help alu2 but not too low that it penalizes everything else. With a bit better scheduling of the instruction that moves a value into a predicate register, we might be able to lower this limit a bit more in the future, but since we need 6 cycles from the move to predicate register to predicated branch, that puts some sort of lower bound on how far we can lower this threshold. Signed-off-by: Rob Clark <[email protected]>