path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* r600g,radeonsi: implement get_device_reset_status | Marek Olšák | 2015-07-03 | 5 | -4/+30
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: don't be confused by eliminated indirects | Rob Clark | 2015-07-03 | 2 | -0/+14
| | | | | | | | If an instruction using address register value gets eliminated, we need to remove it from the indirects list, otherwise it causes mayhem in sched for scheduling address register usage. Signed-off-by: Rob Clark <[email protected]>
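The cleanup described above amounts to filtering a dead instruction out of the list of address-register users. A minimal sketch follows, assuming a plain pointer array; `struct instr` and `indirects_remove()` are hypothetical names for illustration, not the driver's actual types or helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an IR instruction. */
struct instr {
    int id;
};

/* Remove every occurrence of `dead` from `list`, preserving the order of
 * the remaining entries. Returns the new element count. */
size_t
indirects_remove(struct instr **list, size_t count, const struct instr *dead)
{
    assert(list != NULL || count == 0);

    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (list[i] != dead)
            list[kept++] = list[i];
    }
    return kept;
}
```

Compacting in place keeps the surviving users in their original order, which matters if a later pass walks the list in program order.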
* freedreno/ir3: sched fixes for addr register usage | Rob Clark | 2015-07-03 | 1 | -12/+65
| A handful of fixes and cleanups:
|
| 1) If we split addr/pred, we need the newly created instruction to end up in the unscheduled_list.
| 2) Avoid scheduling a write to the address register if there is no instruction using the address register that is otherwise ready to schedule. Note that I currently don't bother with the same logic for the predicate register, since the only instructions using the predicate (br/kill) don't take any other src registers, so this situation should not arise.
| 3) A few other cosmetic cleanups.
|
| Signed-off-by: Rob Clark <[email protected]>
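The second fix above is a readiness check: an address-register write is only worth scheduling if some user of it has all of its other sources ready. A rough sketch under simplified assumptions follows; the struct fields and `addr_write_is_useful()` are hypothetical and do not mirror the scheduler's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction record. */
struct instr {
    const struct instr *address; /* address-register writer this instr reads, or NULL */
    bool other_srcs_ready;       /* true once all non-address sources are scheduled */
};

/* Return true if at least one user of `addr_write` could be scheduled
 * right after it, i.e. the write would not just occupy the register. */
bool
addr_write_is_useful(const struct instr *addr_write,
                     const struct instr *const *indirects, size_t count)
{
    assert(addr_write != NULL);

    for (size_t i = 0; i < count; i++) {
        const struct instr *user = indirects[i];
        if (user && user->address == addr_write && user->other_srcs_ready)
            return true;
    }
    return false;
}
```

Deferring the write until a user is ready avoids holding the single address register while nothing can consume it.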
* freedreno/ir3: fix indirects tracking | Rob Clark | 2015-07-03 | 5 | -10/+23
| | | | | | | | | | cp would update instr->address but not update the indirects array resulting in sched getting confused when it had to 'spill' the address register. Add an ir3_instr_set_address() helper to set instr->address and also update ir->indirects, and update all places that were writing instr->address to use helper instead. Signed-off-by: Rob Clark <[email protected]>
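The idea behind a setter like ir3_instr_set_address() is to update the field and the tracking list in one place so they cannot drift apart. The sketch below illustrates that pattern only; the struct layouts, the fixed `MAX_INDIRECTS` capacity, and the `instr_set_address()` signature are assumptions made for this example, not the helper as committed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INDIRECTS 16 /* hypothetical fixed capacity for this sketch */

/* Hypothetical, simplified instruction and shader records. */
struct instr {
    struct instr *address; /* instruction that writes the address register */
};

struct shader {
    struct instr *indirects[MAX_INDIRECTS]; /* instructions that use an address */
    size_t indirects_count;
};

/* Set the address source and record the user in a single step, so
 * instr->address and the indirects list never disagree. */
void
instr_set_address(struct shader *ir, struct instr *instr, struct instr *addr)
{
    if (instr->address != addr) {
        assert(ir->indirects_count < MAX_INDIRECTS);
        instr->address = addr;
        ir->indirects[ir->indirects_count++] = instr;
    }
}
```

Callers that previously assigned the field directly would go through the setter instead, which is what keeps later passes from seeing a stale list.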
* gallium/ttn: mark location specially in nir for color0-writes-all | Ilia Mirkin | 2015-07-03 | 2 | -0/+10
| | | | | | | | | | We need to distinguish a shader that has separate writes to each MRT from one which is supposed to write the data from MRT 0 to all the MRTs. In TGSI this is done with a property. NIR doesn't have that, so encode it as a funny location and decode on the other end. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
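Encoding a property as a "funny location" can be pictured as reserving a sentinel value on the producer side and testing for it on the consumer side. The following is only an illustration: the enum values and both function names are hypothetical, and the actual location chosen by the commit is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical location numbering for this sketch only. */
enum {
    LOC_DATA0 = 0,     /* per-render-target outputs: LOC_DATA0 + n */
    LOC_COLOR_ALL = -1 /* sentinel: color 0 is broadcast to all render targets */
};

/* Producer side: fold the "writes all" property into the location. */
int
encode_color_location(int rt_index, bool color0_writes_all)
{
    assert(rt_index >= 0);
    if (color0_writes_all && rt_index == 0)
        return LOC_COLOR_ALL;
    return LOC_DATA0 + rt_index;
}

/* Consumer side: recover the render target index and the broadcast flag. */
int
decode_color_location(int location, bool *color0_writes_all)
{
    *color0_writes_all = (location == LOC_COLOR_ALL);
    return *color0_writes_all ? 0 : location - LOC_DATA0;
}
```

The round trip is lossless as long as the sentinel never collides with a real per-target location.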
* nv50/ir: don't emit src2 in immediate form | Ilia Mirkin | 2015-07-02 | 1 | -2/+2
| | | | | | | | | In the immediate form, src2 == dst, so it does not need to be emitted. Otherwise it overlaps with the immediate value's low bits. Fixes: 09ee907266 (nv50/ir: Fold IMM into MAD) Cc: "10.6" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: tune PREFER_BLIT_BASED_TEXTURE_TRANSFER capability | Alexandre Courbot | 2015-07-01 | 1 | -1/+2
| | | | | | | | | | Prefer blit-based texture transfers only if the chip has dedicated VRAM since it would translate to a copy into the same memory on shared-memory chips. Signed-off-by: Alexandre Courbot <[email protected]> Reported-by: Ilia Mirkin <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: create screen fence objects with coherent attribute | Alexandre Courbot | 2015-07-02 | 1 | -2/+6
| | | | | | | | | | | | | | | This is required on non-coherent architectures to ensure the value of the fence is correct at all times. Failure to do this results in the display freezing for a few seconds every now and then on Tegra. The NOUVEAU_BO_COHERENT is a no-op for coherent architectures, so behavior on x86 should not be affected by this patch. Also bump the required libdrm version to 2.4.62, which introduced this flag. Signed-off-by: Alexandre Courbot <[email protected]> Reviewed-by: Martin Peres <[email protected]>
* ilo: remove ilo_image_params | Chia-I Wu | 2015-07-01 | 1 | -75/+47
| | | | It suffices to use ilo_image_layout directly.
* ilo: add image_init_gen6_transfer_layout() | Chia-I Wu | 2015-07-01 | 1 | -75/+37
| | | | It replaces img_init_for_transfer().
* ilo: add image_set_gen6_bo_size() | Chia-I Wu | 2015-07-01 | 3 | -118/+89
| | | | It replaces img_calculate_bo_size().
* ilo: add image_set_gen6_{hiz,mcs} | Chia-I Wu | 2015-07-01 | 1 | -49/+61
| | | | They replace img_calculate_{hiz,mcs}_size().
* ilo: add image_get_gen6_monolithic_size() | Chia-I Wu | 2015-07-01 | 1 | -67/+67
| | | | It replaces img_align().
* ilo: add image_get_gen6_lods() | Chia-I Wu | 2015-07-01 | 1 | -88/+148
| | | | It replaces img_init_lods() and img_init_layer_height().
* ilo: add image_get_gen{6,7}_alignment() | Chia-I Wu | 2015-07-01 | 1 | -159/+177
| | | | They replace img_init_alignments().
* ilo: add image_get_gen6_{hiz,mcs}_enable() | Chia-I Wu | 2015-07-01 | 1 | -101/+97
| | | | They replace img_init_aux().
* ilo: add image_get_gen6_tiling() | Chia-I Wu | 2015-07-01 | 1 | -132/+177
| | | | It replaces img_init_tiling().
* ilo: add image_get_gen6_layout() | Chia-I Wu | 2015-07-01 | 1 | -82/+107
| | | | It replaces only img_init_walk() right now. It will replace all img_init_*().
* nv50/ir: copy joinAt when splitting both before and after | Ilia Mirkin | 2015-07-01 | 3 | -0/+5
| The current implementation only moves the joinAt when splitting after the given instruction, not before it. So if you have a BB with
|
|   foo
|   instr
|   bar
|   joinat
|
| and thus with joinAt set, we end up first splitting before instr, at which point the instr's bb is updated to the new bb. Since that bb doesn't have a joinAt set (despite containing one), when splitting after the instr, there is nothing to copy over.
|
| Since the joinat will be in the "split" bb irrespective of whether we're splitting before or after the instruction, move it over in either case.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91124
| Signed-off-by: Ilia Mirkin <[email protected]>
| Cc: "10.5 10.6" <[email protected]>
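The scenario above can be modelled with a toy basic block in which both split directions share one helper that always hands the join marker to the new block. This is a simplified sketch in C, not the driver's C++ code; `struct bb`, `join_at` as an integer id, and the two `bb_split_*()` functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BB_MAX 8 /* hypothetical capacity for this sketch */

/* Toy basic block: instruction ids plus a join marker. */
struct bb {
    int instrs[BB_MAX];
    size_t count;
    int join_at; /* id of the join instruction at the end of the block, or -1 */
};

/* Move instructions [first_moved, count) into `tail`. The join instruction
 * sits at the end of the block, so it always lands in `tail`, and the
 * marker must follow it regardless of where the split happens. */
static void
split_common(struct bb *bb, size_t first_moved, struct bb *tail)
{
    assert(first_moved <= bb->count);

    tail->count = 0;
    for (size_t i = first_moved; i < bb->count; i++)
        tail->instrs[tail->count++] = bb->instrs[i];
    bb->count = first_moved;

    tail->join_at = bb->join_at;
    bb->join_at = -1;
}

/* New block starts at the instruction at index `idx`. */
void
bb_split_before(struct bb *bb, size_t idx, struct bb *tail)
{
    split_common(bb, idx, tail);
}

/* New block starts right after the instruction at index `idx`. */
void
bb_split_after(struct bb *bb, size_t idx, struct bb *tail)
{
    assert(idx < bb->count);
    split_common(bb, idx + 1, tail);
}
```

With the block foo, instr, bar, joinat, splitting before instr and then after it leaves the marker on the final block that actually contains the join instruction.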
* freedreno: use consistent version string format | Timothy Arceri | 2015-07-01 | 1 | -1/+1
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/from_ssa: add a flag to not convert everything from SSA | Connor Abbott | 2015-06-30 | 1 | -1/+1
| | | | | | | | | | | | | We already don't convert constants out of SSA, and in our backend we'd like to have only one way of saying something is still in SSA. The one tricky part about this is that we may now leave some undef instructions around if they aren't part of a phi-web, so we have to be more careful about deleting them. v2: rename and flip meaning of flag (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* freedreno/ir3: cache defining instruction | Rob Clark | 2015-06-30 | 3 | -69/+91
| | | | | | | | | It is silly to traverse back to find first instruction that writes part of a larger "virtual" register many times per instruction (plus per use as a src to later instructions). Cache this information so we only figure it out once. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix RA issue with fanin | Rob Clark | 2015-06-30 | 1 | -5/+15
| The fanin source could be grouped, for example with shaders like:
|
|   VERT
|   DCL IN[0]
|   DCL IN[1]
|   DCL OUT[0], POSITION
|   DCL OUT[1], GENERIC[9]
|   DCL SAMP[0]
|   DCL SVIEW[0], 2D, FLOAT
|   DCL TEMP[0], LOCAL
|     0: MOV TEMP[0].xy, IN[1].xyyy
|     1: MOV TEMP[0].w, IN[1].wwww
|     2: TXF TEMP[0], TEMP[0], SAMP[0], 2D
|     3: MOV OUT[1], TEMP[0]
|     4: MOV OUT[0], IN[0]
|     5: END
|
| The second arg to the isaml is IN[1].w, so we need to look at the fanin source to get the correct offset.
|
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add ir3_shader_disasm() | Rob Clark | 2015-06-30 | 3 | -120/+124
| | | | | | | | | Split out most of dump_info() from ir3_cmdline compiler into a function that can be used both by cmdline compiler and also for the disasm debug option. This way, for FD_MESA_DEBUG=disasm we also get to see input/output registers, etc. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix for sparse-samplers | Rob Clark | 2015-06-30 | 1 | -3/+7
| | | | | | | | | Some piglit tests, like arb_fragment_program-sparse-samplers, result in having a null samp#0 but valid samp#1. TODO: a3xx probably needs similar fix Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix crash in fail path | Rob Clark | 2015-06-30 | 3 | -3/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix crash in RA | Rob Clark | 2015-06-30 | 1 | -2/+5
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fixes for indirect writes | Rob Clark | 2015-06-30 | 3 | -4/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix constlen in case of load_uniform_indirect | Rob Clark | 2015-06-30 | 1 | -0/+5
| | | | | | | | We can't rely on what we get from the assembler if we have indirect addressing of constant file, since the assembler doesn't know the array index. This got lost in the transition to NIR. Signed-off-by: Rob Clark <[email protected]>
* nv50/ir: fix emission of address reg in 3rd source | Ilia Mirkin | 2015-06-30 | 1 | -2/+6
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91056 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
* nv30: align transfer stride to 64, required by blit, sifm transfer impls | Ilia Mirkin | 2015-06-29 | 1 | -2/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: allow vertex state creation with 0 elements | Ilia Mirkin | 2015-06-29 | 1 | -2/+3
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: reset fragprog bufctx at bind time | Ilia Mirkin | 2015-06-29 | 1 | -1/+8
| | | | | | | | A clear will do a partial validate, which will in turn reference all the buffers in the bufctx again. However the fragprog last validated might have already been deleted. So reset the bufctx when updating state. Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: modernize fp upload logic | Ilia Mirkin | 2015-06-29 | 1 | -10/+14
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: provide a minimum map buffer alignment | Ilia Mirkin | 2015-06-29 | 1 | -1/+2
| | | | | | | Otherwise we return 0, which is out of spec. Return 64 like all the other nouveau drivers. Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_COMPUTE_CAP_SUBGROUP_SIZE | Grigori Goronzy | 2015-06-29 | 4 | -0/+38
| | | | | | | We need this to implement OpenCL's CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE. Reviewed-by: Francisco Jerez <[email protected]>
* nv30: avoid leaking blit fp/vp | Ilia Mirkin | 2015-06-29 | 1 | -0/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv40: enable base vertex | Ilia Mirkin | 2015-06-29 | 3 | -4/+5
| | | | | | | Still appears to have issues with negative indices less than -1M, but that's a corner case of a corner case. Signed-off-by: Ilia Mirkin <[email protected]>
* radeonsi: add support for geometry shader invocations. | Dave Airlie | 2015-06-27 | 4 | -1/+13
| | | | | Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: add support for viewport array (v3) | Dave Airlie | 2015-06-27 | 6 | -40/+69
| This isn't pretty, and I'd suggest the pm4 interface builder could be tweaked to do this more efficiently, but I'd need guidance on how that would look.
|
| This seems to pass the few piglit tests I threw at it.
|
| v2: handle passing layer/viewport index to fragment shader. fix crash in blit changes, add support to io_get_unique_index for layer/viewport index update docs.
| v3: avoid looking up viewport index and layer in es (Marek).
|
| Reviewed-by: Marek Olšák <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* nv50/ir: propagate modifier to right arg when const-folding mad | Ilia Mirkin | 2015-06-26 | 1 | -1/+4
| | | | | | | | | | An immediate has to be the second arg of an ADD operation. However we were mistakenly propagating the modifier of the non-folded value to the folded immediate argument. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91117 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
* mesa: Enable subdir-objects globally. | Matt Turner | 2015-06-26 | 7 | -14/+0
| | | | Reviewed-by: Emil Velikov <[email protected]>
* ilo: define ILO_IMAGE_MAX_LEVEL_COUNT | Chia-I Wu | 2015-06-26 | 4 | -8/+16
| | | | | Define ILO_IMAGE_MAX_LEVEL_COUNT for ilo_image and remove unnecessary header includes.
* ilo: replace pipe_format by gen_surface_format | Chia-I Wu | 2015-06-26 | 13 | -142/+174
| | | | | Replace pipe_format by gen_surface_format in ilo_image. Change how depth format is specified in ilo_state_zs.
* ilo: always use the specified image format | Chia-I Wu | 2015-06-26 | 4 | -69/+115
| | | | | Move silent promotion of PIPE_FORMAT_ETC1_RGB8 or combined depth/stencil out of core.
* ilo: replace pipe_texture_target by gen_surface_type | Chia-I Wu | 2015-06-26 | 8 | -125/+98
| | | | | Replace pipe_texture_target by gen_surface_type in ilo_image. Change how GEN6_SURFTYPE_CUBE is specified in ilo_state_surface and ilo_state_zs.
* ilo: initialize ilo_image from ilo_image_info | Chia-I Wu | 2015-06-26 | 3 | -179/+242
| | | | Convert pipe_resource to ilo_image_info for image initialization.
* ilo: remove ilo_image_disable_aux() | Chia-I Wu | 2015-06-26 | 3 | -28/+2
| | | | Fail resource creation when aux bo allocation fails.
* ilo: improve SURFTYPE_BUFFER validations | Chia-I Wu | 2015-06-26 | 2 | -81/+139
| | | | Reorganize the validations to make them more systematic.
* ilo: remove ilo_buffer | Chia-I Wu | 2015-06-26 | 10 | -68/+109
| Since the addition of ilo_vma, it was used only to pad a bo for sampling engine surfaces. Replace it entirely with these functions:
|
|   ilo_state_surface_buffer_size()
|   ilo_state_vertex_buffer_size()
|   ilo_state_index_buffer_size()
|   ilo_state_sol_buffer_size()