Commit message | Author | Age | Files | Lines
* nir: add serialization and deserialization | Connor Abbott | 2017-10-31 | 4 | -0/+1248
    v2 (Jason Ekstrand):
     - Various whitespace cleanups
     - Add helpers for reading/writing objects
     - Rework derefs
     - [de]serialize nir_shader::num_*
     - Fix uses of blob_reserve_bytes
     - Use a bitfield struct for packing tex_instr data

    v3:
     - Zero nir_variable struct on deserialization. (Jordan)
     - Allow nir_serialize.h to be included in C++. (Jordan)
     - Handle NULL info.name. (Jason)
     - Set info.name to NULL when name is NULL. (Jordan)

    Acked-by: Timothy Arceri <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Acked-by: Jason Ekstrand <[email protected]>
* mesa/st: implement max combined output resources limiting. | Dave Airlie | 2017-11-01 | 1 | -0/+6
    If the driver sets the cap, then use the value it gives us.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* gallium: add cap for driver specified max combined shader resources. | Dave Airlie | 2017-11-01 | 18 | -1/+20
    Some hw (evergreen) has a limit on how many combined resources
    (images/buffers/MRTs) a fragment shader can access.

    Reviewed-by: Ilia Mirkin <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling | Gert Wollny | 2017-11-01 | 2 | -20/+31
    It is possible that the optimizer ends up in an infinite loop in
    post_scheduler::schedule_alu(), because
    post_scheduler::prepare_alu_group() does not find a proper scheduling.
    This can be deduced from pending.count() being larger than zero and
    not getting smaller.

    This patch works around this problem by signalling this failure so
    that the optimizer bails out and the un-optimized shader is used.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
    Cc: <[email protected]>
    Signed-off-by: Gert Wollny <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
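The bail-out described in the r600/sb commit above can be illustrated with a minimal C sketch. This is not the r600/sb code (which is C++); `schedule_all`, `schedule_step_fn`, and the two step functions are hypothetical names invented for this example. It only shows the general idea: if a scheduling step leaves the pending count unchanged, report failure instead of looping forever.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one scheduling attempt: returns how many
 * pending items it managed to place. */
typedef size_t (*schedule_step_fn)(size_t pending);

/* Runs steps until nothing is pending. Returns false as soon as a step
 * makes no progress (or reports an impossible count), so the caller can
 * fall back to the unoptimized shader instead of spinning. */
bool schedule_all(size_t pending, schedule_step_fn step)
{
    while (pending > 0) {
        size_t placed = step(pending);
        if (placed == 0 || placed > pending)
            return false; /* pending count is not shrinking: bail out */
        pending -= placed;
    }
    return true;
}

/* Example steps: one that always places a single item, one that is stuck. */
size_t step_place_one(size_t pending) { (void)pending; return 1; }
size_t step_stuck(size_t pending) { (void)pending; return 0; }
```

With these definitions, `schedule_all(5, step_place_one)` succeeds, while `schedule_all(5, step_stuck)` returns false on the first iteration.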
* radeonsi: fix culldist_writemask in nir path | Timothy Arceri | 2017-11-01 | 1 | -2/+1
    The shared si_create_shader_selector() code already offsets the mask.

    Fixes the following piglit tests:

    arb_cull_distance/clip-cull-3.shader_test
    arb_cull_distance/clip-cull-4.shader_test

    Fixes: 29d7bdd179bb (radeonsi: scan NIR shaders to obtain required info)
    Reviewed-by: Marek Olšák <[email protected]>
* nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB | Neil Roberts | 2017-10-31 | 1 | -2/+22
    Previously the values were calculated by just shifting ~0 by the
    invocation ID. This would end up including bits that are higher than
    gl_SubGroupSizeARB. The corresponding CTS test effectively requires
    that these high bits be zero so it was failing. There is a Piglit
    test as well but this appears to be checking the wrong values so it
    passes.

    For the two greater-than bitmasks, this patch adds an extra mask with
    (~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero.

    Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3
    Reviewed-by: Jason Ekstrand <[email protected]>
    Cc: [email protected]
    Signed-off-by: Neil Roberts <[email protected]>
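The masking described above can be worked through in plain C. This is a sketch of the arithmetic only, not the NIR pass itself: the function names are hypothetical, and it assumes a 64-bit mask with 1 <= size <= 64 and id < size.

```c
#include <stdint.h>

/* Bits for invocations 0..size-1 set. Assumes 1 <= size <= 64, which
 * keeps the shift amount in the range 0..63. */
uint64_t subgroup_size_mask(unsigned size)
{
    return ~UINT64_C(0) >> (64 - size);
}

/* Invocations with an index >= id, limited to the subgroup size. */
uint64_t subgroup_ge_mask(unsigned id, unsigned size)
{
    return (~UINT64_C(0) << id) & subgroup_size_mask(size);
}

/* Invocations with an index > id, limited to the subgroup size.
 * The shift is done in two steps so that id == 63 never shifts by 64,
 * which would be undefined behaviour in C. */
uint64_t subgroup_gt_mask(unsigned id, unsigned size)
{
    return ((~UINT64_C(0) << id) << 1) & subgroup_size_mask(size);
}
```

For a subgroup of 8 invocations and invocation ID 3, the unmasked shift `~0 << 3` would set bits 3 through 63; after the extra mask only bits 3 through 7 remain.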
* i965: Check CCS_E compatibility for texture view rendering | Nanley Chery | 2017-10-31 | 1 | -2/+27
    Only use CCS_E to render to a texture that is CCS_E-compatible with
    the original texture's miptree (linear) format. This prevents render
    operations from writing data that can't be decoded with the original
    miptree format.

    On Gen10, with the new CCS_E-enabled formats handled, this enables
    the driver to pass the arb_texture_view-rendering-formats piglit test.

    v2: Add a TODO for texturing. (Jason)

    Cc: <[email protected]>
    Signed-off-by: Nanley Chery <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* intel/isl: Disable some gen10 CCS_E formats for now | Nanley Chery | 2017-10-31 | 1 | -0/+24
    CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2
    formats with CCS_E. None of these formats fit within the current
    blorp_copy framework so disable them until support is added.

    Signed-off-by: Nanley Chery <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* meson: pass correct args to gles2 ABI test | Eric Engestrom | 2017-10-31 | 1 | -1/+4
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* meson: pass correct args to gles1 ABI test | Eric Engestrom | 2017-10-31 | 1 | -1/+4
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* meson: pass correct args to gbm symbol test | Eric Engestrom | 2017-10-31 | 1 | -2/+4
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* meson: pass correct args to wayland-egl symbol test | Eric Engestrom | 2017-10-31 | 1 | -1/+4
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* automake+meson: don't run egl symbol check on libglvnd lib | Eric Engestrom | 2017-10-31 | 2 | -6/+15
    We might want to add a symbol check for the glvnd variant though.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* meson: pass correct env/args to egl tests | Eric Engestrom | 2017-10-31 | 1 | -2/+8
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* gles2: fail symbol check if lib is missing | Eric Engestrom | 2017-10-31 | 1 | -1/+9
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* gles1: fail symbol check if lib is missing | Eric Engestrom | 2017-10-31 | 1 | -1/+9
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* gbm: fail symbol check if lib is missing | Eric Engestrom | 2017-10-31 | 1 | -1/+10
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* wayland-egl: fail symbol check if lib is missing | Eric Engestrom | 2017-10-31 | 1 | -1/+9
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* egl: fail symbol check if lib is missing | Eric Engestrom | 2017-10-31 | 1 | -1/+9
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* meson: set visibility flags on gbm | Dylan Baker | 2017-10-31 | 1 | -1/+1
    This is done in autotools, and is an oversight in the meson build.

    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Tested-by: Eric Engestrom <[email protected]>
* meson: Don't link gbm with threads | Dylan Baker | 2017-10-31 | 1 | -1/+1
    It's supposed to be linked with pthread-stubs (if the platform needs
    pthread-stubs). Pthread stubs support isn't (yet) implemented in the
    meson build, so add a TODO.

    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* meson: Use true and false instead of yes and no for tristate options | Dylan Baker | 2017-10-31 | 2 | -6/+6
    This allows a user to not care whether they're setting a tristate or
    a boolean option, which is a nice user-facing feature, and something
    I've personally run into.

    Suggested-by: Adam Jackson <[email protected]>
    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx. | Andrey Grodzovsky | 2017-10-31 | 7 | -1/+16
    Signed-off-by: Andrey Grodzovsky <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* meson: do not search for needless deps | Erik Faye-Lund | 2017-10-31 | 2 | -12/+22
    If we don't want to use these deps, there's no good reason to search
    for them in the first place. This should shave a bit of time for the
    initial build.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* radv: bail out when binding the same vertex buffers | Samuel Pitoiset | 2017-10-31 | 1 | -2/+16
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: bail out when binding the same index buffer | Samuel Pitoiset | 2017-10-31 | 2 | -0/+14
    DOW3 appears to hit this path.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* meson: use dep_m in libgallium | Erik Faye-Lund | 2017-10-31 | 1 | -1/+1
    u_format_other.c uses sqrtf, which on some systems requires a math
    library. So let's make sure we link with it.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* radv: use correct alloc function when loading from disk | Timothy Arceri | 2017-10-31 | 1 | -1/+14
    Fixes regression in:

    dEQP-VK.api.object_management.alloc_callback_fail.graphics_pipeline

    Fixes: 1e84e53712ae "radv: add cache items to in memory cache when reading from disk"
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* i965: Fix ARB_indirect_parameters logic. | Plamena Manolova | 2017-10-30 | 1 | -31/+16
    This patch modifies the ARB_indirect_parameters logic in
    brw_draw_prims, so that our implementation isn't affected if another
    application attempts to use predicates. Previously we were using a
    predicate with a DELTAS_EQUAL comparison operation and relying on the
    MI_PREDICATE_DATA register being 0. Our code to initialize
    MI_PREDICATE_DATA to 0 was incorrect, so we were accidentally using
    whatever value was written there. Because the kernel does not
    initialize the MI_PREDICATE_DATA register on hardware context
    creation, we might inherit the value from whatever context was last
    running on the GPU (likely another process).

    The Haswell command parser also does not currently allow us to write
    the MI_PREDICATE_DATA register. Rather than fixing this and requiring
    an updated kernel, we switch to a different approach which uses a
    SRCS_EQUAL predicate that makes no assumptions about the states of
    any of the predicate registers.

    Fixes Piglit's spec/arb_indirect_parameters/tf-count-arrays test.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103085
    Signed-off-by: Plamena Manolova <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't flag BRW_NEW_SURFACES unless some push constants are dirty. | Kenneth Graunke | 2017-10-30 | 1 | -2/+1
    Due to a gaffe on my part, we were re-emitting all binding table
    entries on every single draw call.

    The push_constant_packets atom listens to BRW_NEW_DRAW_CALL, but
    skips emitting 3DSTATE_CONSTANT_XS for each stage unless
    stage_state->push_constants_dirty is true. However, it flagged
    BRW_NEW_SURFACES unconditionally at the end, by mistake.

    Instead, it should only flag it if we actually emit
    3DSTATE_CONSTANT_XS for a stage. We can move it a few lines up,
    inside the loop - the early continues will skip over it if push
    constants aren't dirty for a stage.

    With INTEL_NO_HW=1 set, improves performance of GFXBench5 gl_driver_2
    on Apollolake at 1280x720 by 1.01122% +/- 0.470723% (n=35).

    Reviewed-by: Rafael Antognolli <[email protected]>
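The control-flow change described above (moving the flag inside the loop so the early `continue` skips it) can be sketched in isolation. This is a simplified illustration, not the i965 code: only `push_constants_dirty` comes from the commit message, while `emit_push_constants` and its return value are hypothetical stand-ins for emitting the packet and flagging BRW_NEW_SURFACES.

```c
#include <stdbool.h>

/* Simplified per-stage state; only push_constants_dirty is taken from
 * the commit message, the rest of the real structure is omitted. */
struct stage_state {
    bool push_constants_dirty;
};

/* Walks all stages and "emits" constants for the dirty ones. Returns
 * true only if at least one stage was emitted, which models flagging
 * surfaces as dirty inside the loop rather than unconditionally. */
bool emit_push_constants(struct stage_state *stages, int num_stages)
{
    bool surfaces_dirty = false;

    for (int i = 0; i < num_stages; i++) {
        if (!stages[i].push_constants_dirty)
            continue; /* the early continue also skips the flag below */

        /* ... the real code would emit the constant packet here ... */
        stages[i].push_constants_dirty = false;
        surfaces_dirty = true;
    }

    return surfaces_dirty;
}
```

Calling the function twice on the same array returns true the first time if any stage was dirty and false the second time, so a draw call with no dirty push constants no longer triggers a binding table re-emit in this model.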
* intel/genxml: Fix decoding of groups with fields smaller than a DWord. | Kenneth Graunke | 2017-10-30 | 2 | -10/+16
    Groups containing fields smaller than a DWord were not being decoded
    correctly. For example:

        <group count="32" start="32" size="4">
          <field name="Vertex Element Enables" start="0" end="3" type="uint"/>
        </group>

    gen_field_iterator_next would properly walk over each element of the
    array, incrementing group_iter, and calling iter_group_offset_bits()
    to advance to the proper DWord. However, the code to print the actual
    values only considered iter->field->start/end, which are 0 and 3 in
    the above example. So it would always fetch bits 3:0 of the current
    DWord when printing values, instead of advancing to each element of
    the array, printing bits 0-3, 4-7, 8-11, and so on.

    To fix this, we add new iter->start/end tracking, which properly
    advances for each instance of a group's field.

    Caught by Matt Turner while working on 3DSTATE_VF_COMPONENT_PACKING,
    with a patch to convert it to use an array of bitfields (the example
    above).

    This also fixes the decoding of 3DSTATE_SBE's "Attribute Active
    Component Format" fields.

    Reviewed-by: Jordan Justen <[email protected]>
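The bit arithmetic behind this fix can be shown with a small standalone sketch. It is not the genxml decoder: `field_bits` and `group_field` are hypothetical helpers, and the sketch assumes a field never straddles a DWord boundary. The point is that element i of a group starts at `group_start + i * group_size`, so the field's start and end must be offset per element rather than reused as-is.

```c
#include <stdint.h>

/* Extracts bits [start, end] (inclusive, counted from bit 0 of dw[0])
 * from an array of DWords. Assumes start <= end and that the field
 * does not cross a DWord boundary. */
uint32_t field_bits(const uint32_t *dw, unsigned start, unsigned end)
{
    uint32_t word = dw[start / 32];
    unsigned lo = start % 32;
    unsigned width = end - start + 1;
    uint32_t mask = width == 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;

    return (word >> lo) & mask;
}

/* Reads a field from element i of a group. The field's start/end are
 * relative to the element, so they are offset by the element's base
 * bit position before extraction. */
uint32_t group_field(const uint32_t *dw,
                     unsigned group_start, unsigned group_size,
                     unsigned field_start, unsigned field_end,
                     unsigned i)
{
    unsigned base = group_start + i * group_size;

    return field_bits(dw, base + field_start, base + field_end);
}
```

Using the example group from the commit message (start 32, size 4, field bits 0 to 3) on a second DWord of 0x87654321, elements 0, 1, and 7 decode to 1, 2, and 8; always reading bits 3:0 would return 1 for every element.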
* glsl: Fix bad formatting in a comment | Ian Romanick | 2017-10-30 | 1 | -1/+1
    Trivial

    Signed-off-by: Ian Romanick <[email protected]>
* broadcom/vc5: Force blending to treat alpha as 1 for formats without alpha. | Eric Anholt | 2017-10-30 | 3 | -7/+27
    Fixes fbo-blending-formats on RGB8 and 565. We will still need to
    demote blending to shader code in the MRT case to fix it in general,
    but that can be added when we start doing 32F blending (which also
    needs to be done in the shader).
* broadcom/vc5: Do BGRA vs RGBA swapping for the BLEND_CONSTANT_COLOR. | Eric Anholt | 2017-10-30 | 4 | -11/+30
    Fixes many of the fbo-blending-formats tests.
* broadcom/vc5: Pack clear colors according to the TLB internal format/type. | Eric Anholt | 2017-10-30 | 2 | -10/+49
    The previous packing I did got us all the R*16F and R*32F formats,
    where the pipe format basically matched the TLB's format, but since
    the clear color will just be memcpyed to the TLB, we should be
    looking at its format for deciding how to pack.

    Fixes RGB565, RGB5_A1 and RGBA10 fbo-clear-formats tests and improves
    4444.
* broadcom/vc5: Don't do r/b channel swapping on 565. | Eric Anholt | 2017-10-30 | 1 | -1/+7
    The HW's format actually matches the gallium format.
* broadcom/vc5: Use the proper gallium format for our RGB10_A2. | Eric Anholt | 2017-10-30 | 1 | -1/+1
    This keeps us from needing our own reswizzling of the B vs R fields.
* broadcom/vc5: Add some comments about the texture/output format ordering. | Eric Anholt | 2017-10-30 | 1 | -7/+15
    The output formats are consistent with their channels appearing from
    low to high in their name. Textures are interpreted the same way,
    but their names may have the channels swapped around. I'm retaining
    the texture names so that we are consistent with the documentation,
    but I want to leave a warning for others.
* broadcom/vc5: Drop duplicated setup of clip_window_height_in_pixels. | Eric Anholt | 2017-10-30 | 1 | -1/+0
* broadcom/vc5: Don't forget to actually turn on stencil testing. | Eric Anholt | 2017-10-30 | 1 | -0/+3
    I had the rest of stencil state set up, but forgot to actually enable
    it in the higher level configuration bits packet.
* broadcom/vc5: Stop lowering negates to subs. | Eric Anholt | 2017-10-30 | 1 | -1/+8
    In the case of fneg(0.0), we were getting back 0.0 instead of -0.0.
    We were also needing an immediate 0 value for ineg, when there's an
    opcode to do the job properly.

    Fixes fs-floatBitsToInt-neg.shader_test.
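The signed-zero issue mentioned above is ordinary IEEE 754 behaviour and can be reproduced on the CPU. The sketch below is only an illustration with hypothetical function names; it assumes the default round-to-nearest mode and no fast-math style compiler flags. Subtracting from zero yields +0.0 for an input of +0.0, while a true negate flips the sign bit, which is what a floatBitsToInt-style test observes.

```c
#include <stdint.h>
#include <string.h>

/* Negation lowered to a subtraction: 0.0 - x. */
float neg_by_sub(float x)
{
    return 0.0f - x;
}

/* Direct negation, which flips the sign bit. */
float neg_direct(float x)
{
    return -x;
}

/* Returns the raw IEEE 754 bit pattern of a float, similar in spirit
 * to GLSL's floatBitsToInt. */
uint32_t float_bits(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);
    return bits;
}
```

Here `float_bits(neg_by_sub(0.0f))` is 0x00000000 while `float_bits(neg_direct(0.0f))` is 0x80000000. For nonzero inputs the two approaches agree.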
* broadcom/vc5: Set up MSAA texture type according to the internal format. | Eric Anholt | 2017-10-30 | 2 | -2/+39
    It gets most of EXT_framebuffer_multisample-formats passing, but
    doesn't really work for texture views.
* broadcom/vc5: Use the sampler view's format, not the resource's. | Eric Anholt | 2017-10-30 | 3 | -8/+1
    This should help with texture views, though I just noticed this while
    reading the code.
* broadcom/vc5: Emit raw loads for MSAA buffers. | Eric Anholt | 2017-10-30 | 1 | -0/+58
    Similar to stores, but we also need to emit dummy stores in between
    each load, to flush out the previous queued load.
* broadcom/vc5: Use raw stores for MSAA buffers. | Eric Anholt | 2017-10-30 | 1 | -15/+97
    We were storing the resolved pixels in all cases, but nr_samples > 0
    means we should be keeping the per-sample values.

    We will probably want to change the job structure at some point, as
    we'll want to recognize full-buffer resolves and do the resolved
    store in the same job as the original rendering, meaning we'll need
    to track both the MSAA and single-sample resources in the job.
    However, this will be enough to build the rest of the MSAA support.
* broadcom/vc5: Add lowering for txf_ms to a txf on a 2x2-scaled texture. | Eric Anholt | 2017-10-30 | 6 | -4/+96
    The HW has no native sampler support for multisample textures, but
    since we only need to support txf_ms and the layout is UIF, we just
    need to scale up the texcoords and then add in the sample.

    This drops the old TEXTURE_MSAA_ADDR special uniform, since we're
    treating MSAA textures as textures, rather than basically texbos like
    VC4 had to.
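The coordinate math behind "scale up the texcoords and then add in the sample" can be sketched as follows. This is an assumption-laden illustration, not the driver's lowering pass: `lower_txf_ms` and `texel_coord` are hypothetical names, and the mapping of the sample index onto the 2x2 block (bit 0 selects the column, bit 1 the row) is assumed for the example rather than taken from the commit.

```c
/* Integer texel coordinate in the 2x2-scaled texture. */
typedef struct {
    int x;
    int y;
} texel_coord;

/* Maps a (pixel x, pixel y, sample) fetch on a 4x MSAA surface to a
 * plain texel fetch on a texture with twice the width and height.
 * ASSUMPTION: sample bit 0 picks the column and bit 1 picks the row
 * inside each pixel's 2x2 block. */
texel_coord lower_txf_ms(int x, int y, int sample)
{
    texel_coord c;

    c.x = x * 2 + (sample & 1);
    c.y = y * 2 + ((sample >> 1) & 1);
    return c;
}
```

Under that assumed layout, pixel (3, 5) sample 0 maps to texel (6, 10) and sample 3 maps to texel (7, 11), so the four samples of a pixel occupy one 2x2 block of the scaled texture.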
* broadcom/vc5: Lay out MSAA textures/renderbuffers as UIF scaled by 4. | Eric Anholt | 2017-10-30 | 2 | -14/+37
    We just need to multiply width/height by 2 each, and always set them
    up as UIF tiling, since that's how the TLB will store them in raw
    (per-sample) mode.
* broadcom/vc5: Keep output height pad out of the store TLB general address. | Eric Anholt | 2017-10-30 | 1 | -1/+1
    The equivalent load already had the pad separated out.
* broadcom/vc5: Drop padding bits from the texture shader state's address. | Eric Anholt | 2017-10-30 | 1 | -1/+1
* broadcom/vc5: Drop alignment bits from texture P1's address. | Eric Anholt | 2017-10-30 | 1 | -1/+1