path: root/src
Commit message (Author, Age, Files, Lines)
* swr: [rasterizer] Slight assert refactoring (Tim Rowley, 2017-03-20, 17 files, -256/+296)
  Make asserts more robust. Add SWR_INVALID(...) as a replacement for SWR_ASSERT(0, ...).
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer] Backend code adjustments (Tim Rowley, 2017-03-20, 5 files, -45/+70)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Fix the early and late depthstencil events (Tim Rowley, 2017-03-20, 1 file, -5/+5)
  The coverage and stencil mask arguments were reversed.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Implement double pumped SIMD16 TESS (Tim Rowley, 2017-03-20, 1 file, -79/+177)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast/core/scripts] Fix archrast multithreading issue (Tim Rowley, 2017-03-20, 6 files, -16/+52)
  Per pixel stats are cached but were not always being flushed as threads moved from one draw context to the next. Added an explicit flush to allow all archrast objects to flush any cached events.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Remove redundant data from archrast files (Tim Rowley, 2017-03-20, 2 files, -137/+103)
  If a count can be derived from other counts, then this can be done in post-processing scripts.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast/scripts] Further archrast cleanups (Tim Rowley, 2017-03-20, 3 files, -164/+104)
  Removed redundant data being written out to file.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Fix RECT_LIST primitive assembly (Tim Rowley, 2017-03-20, 1 file, -2/+2)
  The bug made the 3rd component of attributes on the second triangle of a RECT invalid.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer common] Add InterpolateComponentFlat utility (Tim Rowley, 2017-03-20, 1 file, -0/+13)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Fix performance issue with archrast stats (Tim Rowley, 2017-03-20, 1 file, -15/+15)
  Performance with archrast is now 50x faster because we're properly filtering out all of the rdtsc begin/end.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Implement SIMD16 GS and STREAMOUT (Tim Rowley, 2017-03-20, 1 file, -51/+251)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Add additional API events (Tim Rowley, 2017-03-20, 2 files, -0/+48)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core/scripts] Autogen backend initialization function(s) (Tim Rowley, 2017-03-20, 7 files, -226/+398)
  Autogen functions that instantiate different BackendPixelRate templates. Functions get split into separate files after reaching a user-defined threshold (currently 512 per file) to speed up compilation. This change will enable the addition of more template flags in the pixel back end.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] backend.h declares gBackendPixelRateTable (Tim Rowley, 2017-03-20, 2 files, -1/+8)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Finish SIMD16 PA OPT including tesselation (Tim Rowley, 2017-03-20, 1 file, -21/+247)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Finish SIMD16 PA OPT except tesselation (Tim Rowley, 2017-03-20, 2 files, -274/+1405)
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Support sparse numa id values on all OSes (Tim Rowley, 2017-03-20, 1 file, -27/+53)
  Reviewed-by: Bruce Cherniak <[email protected]>
* i965: Skip register write detection when possible. (Kenneth Graunke, 2017-03-20, 1 file, -2/+8)
  Detecting register write support by trial and error introduces a stall at screen creation time, which it would be nice to avoid.
  Certain command parser versions guarantee this will work (see the giant comment in intelInitScreen2 below, or a few commits ago):
  - Ivybridge: version >= 1 (kernel v3.16)
  - Baytrail: version >= 2 (kernel v3.19)
  - Haswell: version >= 7 (kernel v4.8)
  For simplicity, we don't bother with version 1 in this patch.
  This assumes that the user hasn't disabled aliasing PPGTT via a kernel command line parameter. Don't do that - you're only breaking things.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Set screen->cmd_parser_version to 0 if we can't write registers. (Kenneth Graunke, 2017-03-20, 1 file, -6/+11)
  If we can't write registers, then the effective command parser version is 0 - it may exist, but it's not usefully enabling anything.
  See kernel commit 1ca3712ca3429a617ed6c5f87718e4f6fe4ae0c6 (in v4.8) where the kernel starts doing this for us. This makes us do more or less the same thing on older kernels.
  This should preserve a bit of sanity by allowing us to perform a screen->cmd_parser_version >= N check to determine that we really can use the features promised by command parser version N.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Document the sad story of the kernel command parser. (Kenneth Graunke, 2017-03-20, 1 file, -0/+97)
  This should help us figure out the complexities of which kernel versions we need to get various features on various platforms.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Fall back to GL 4.2/4.3 on Haswell if the kernel isn't new enough. (Kenneth Graunke, 2017-03-20, 1 file, -2/+9)
  In commit d2590eb65ff28a9cbd592353d15d7e6cbd2c6fc6 I enabled GL 4.5 on Haswell...but failed to check if we could do indirect compute shader dispatch...and query buffer objects.
  Indirect compute shader dispatch requires command parser version 5 (kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in Linux v4.4). On earlier kernels we would have disabled ARB_compute_shader, which is a mandatory part of OpenGL 4.3+.
  Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG, which mean command parser version 7 (Linux v4.8). On earlier kernels we would have disabled ARB_query_buffer_object, which is a mandatory part of OpenGL 4.4+.
  The new version support looks like:
  - Kernel 4.1 and older => OpenGL 3.3
  - Kernel 4.2-4.3 => OpenGL 4.2
  - Kernel 4.4-4.7 => OpenGL 4.3
  - Kernel 4.8+ => OpenGL 4.5
  Cc: "17.0" <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
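The version table in the commit above can be sketched as a small lookup. This is only an illustration: `hsw_max_gl_version` is a hypothetical helper, and it keys on the Linux kernel version for readability, whereas the commit message describes the driver as gating these features on the command parser version.

```c
#include <assert.h>

/* Hypothetical helper, not the i965 code: returns the highest OpenGL
 * version (times 10) listed for Haswell in the commit message, given a
 * Linux kernel version. */
static int hsw_max_gl_version(int kernel_major, int kernel_minor)
{
    assert(kernel_major >= 0 && kernel_minor >= 0);

    /* Encode major.minor as one comparable integer, e.g. 4.8 -> 4008. */
    int kver = kernel_major * 1000 + kernel_minor;

    if (kver >= 4008)
        return 45;  /* command parser 7: MI_MATH, MI_LOAD_REGISTER_REG */
    if (kver >= 4004)
        return 43;  /* command parser 5: indirect compute dispatch */
    if (kver >= 4002)
        return 42;
    return 33;      /* kernel 4.1 and older */
}
```

Checking thresholds from newest to oldest keeps each branch a single comparison and makes the fallback order match the table.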
* r600g/sb: Fix memory leak by reworking uses list (rebased) (Constantine Kharlamov, 2017-03-20, 4 files, -61/+28)
  The author is Heiko Przybyl (CC'ing); the patch is rebased on top of Bartosz Tomczyk's one per Dieter Nützel's comment.
  Tested-by: Constantine Charlamov <[email protected]>
  v2: Resend the patch through git-email. The previous rebase was sent through Thunderbird, which screwed up tab characters, making the patch not apply.
  --------------
  When fixing the stalls on evergreen I introduced leaking of the useinfo structure(s). Sorry.
  Instead of allocating a new object to hold 3 values where only one is actually used, rework the list to just store the node pointer. Thus no allocation or deallocation is needed. Since use_info and use_kind aren't used anywhere, drop them and reduce code complexity. This might also save some small amount of cycles.
  Thanks to Bartosz Tomczyk for finding the bug.
  Reported-by: Bartosz Tomczyk <bartosz.tomczyk86 at gmail.com>
  Signed-off-by: Heiko Przybyl <lil_tux at web.de>
  Supersedes: https://patchwork.freedesktop.org/patch/135852
  Signed-off-by: Marek Olšák <[email protected]>
  Tested-by: Dieter Nützel <[email protected]>
* radeonsi: check the IR type before waiting for a compute compilation fence (Marek Olšák, 2017-03-20, 1 file, -1/+3)
  This should fix OpenCL getting stuck.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100288
  Reviewed-by: Samuel Pitoiset <[email protected]>
* aubinator: Move the guts of decode_group() to decoder.c. (Kenneth Graunke, 2017-03-20, 3 files, -31/+42)
  This lets us use it outside of the aubinator binary itself.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* aubinator: Drop spec parameter to decode_group(). (Kenneth Graunke, 2017-03-20, 1 file, -13/+12)
  No longer necessary - the iterator gets it from the group.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* aubinator: Make the iterator store a pointer to structure descriptions. (Kenneth Graunke, 2017-03-20, 3 files, -27/+11)
  When the iterator encounters a structure field, it now looks up the gen_group for that structure definition and saves a pointer to it.
  This lets us drop a lot of ridiculous code in the caller, which looked at item->value (<struct NAME dword>), strtok'd the structure name back out, and looked it up itself.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* aubinator: Track the current field's starting dword offset. (Kenneth Graunke, 2017-03-20, 3 files, -26/+18)
  The iterator code already computed this value, then we stored it in the structure name, strtok'd it back out, and also manually computed it when printing dword headers. Just put the value in the struct and use it. Way simpler.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* aubinator: Drop decode_structure() helper. (Kenneth Graunke, 2017-03-20, 1 file, -16/+9)
  It made more sense when decode_group() took a bunch of extra options, but now that there's only one...we may as well pass 0 and call it a day.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* aubinator: Drop unused print_dword_headers flag. (Kenneth Graunke, 2017-03-20, 1 file, -5/+4)
  I added this flag in 65a9d5eabb05e4925c1c9a17836cad57304210d6 but it was completely unused. Both callers appear to have printed dword headers, so we can just drop the flag and continue doing it unconditionally.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* aubinator: Store a pointer from gen_group back to gen_spec. (Kenneth Graunke, 2017-03-20, 2 files, -0/+2)
  When decoding a structure field within a group, we may want to look up that structure type. Having a gen_spec pointer makes it easy to do so.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* aubinator: Store enum textual name in iter->value. (Kenneth Graunke, 2017-03-20, 3 files, -19/+15)
  gen_field_iterator_next() produces a string representing the value of the field. For enum values, it also produced a separate "description" string containing the textual name of the enum.
  The only caller of this function combines the two, printing enums as "<numeric value> (<textual enum name>)". We may as well just store that in item->value directly, eliminating the description field, and a layer of wrapping.
  v2: Use non-overlapping source and destination strings in snprintf.
  Reviewed-by: Lionel Landwerlin <[email protected]>
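The "<numeric value> (<enum name>)" formatting described in the commit above, together with the v2 note about non-overlapping buffers, might look roughly like this. `format_enum_value` and its parameters are hypothetical names for illustration, not the aubinator decoder's API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<numeric value> (<enum name>)" into dst,
 * or just the number when no name is known. dst must not alias name,
 * because snprintf() with overlapping source and destination buffers is
 * undefined behavior. */
static void format_enum_value(char *dst, size_t dst_size,
                              unsigned value, const char *name)
{
    assert(dst != name);

    if (name != NULL)
        snprintf(dst, dst_size, "%u (%s)", value, name);
    else
        snprintf(dst, dst_size, "%u", value);
}
```

Formatting from the raw number and the name, rather than re-reading a string already stored in the destination, is one simple way to keep source and destination separate.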
* si_descriptor: move velems nullity check before dereference (Julien Isorce, 2017-03-20, 1 file, -4/+11)
  CID 1399479: Dereference before null check (REVERSE_INULL)
  check_after_deref: Null-checking velems suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
  Signed-off-by: Julien Isorce <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeon_drm_bo: explicitly check return value of drmCommandWriteRead (Julien Isorce, 2017-03-20, 1 file, -2/+7)
  CID 1313492
  Signed-off-by: Julien Isorce <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* si_pipe: remove nullity check after dereference (Julien Isorce, 2017-03-20, 1 file, -3/+0)
  sscreen cannot be NULL.
  CID 1354483
  Signed-off-by: Julien Isorce <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeon: initialize hole variable before calling container_of (Julien Isorce, 2017-03-20, 1 file, -1/+1)
  Like in a few other places in that radeon_drm_bo.c file.
  CID 715739.
  Signed-off-by: Julien Isorce <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* intel: Correct the BDW surface state size (Nanley Chery, 2017-03-20, 2 files, -4/+3)
  The PRMs state that this packet is 16 DWORDS long. Ensure that the last three DWORDS are zeroed as required by the hardware when allocating a null surface state.
  Cc: <[email protected]>
  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* r600g: Fix out of bounds access (Bartosz Tomczyk, 2017-03-20, 2 files, -20/+22)
  The fc_sp variable should indicate the number of elements in the fc_stack array, but fc_sp was increased at the beginning of the fc_pushlevel function. This led to a situation where idx=0 was never used, and the last (32nd) element was stored outside the fc_stack array.
  Signed-off-by: Marek Olšák <[email protected]>
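The off-by-one described in the commit above can be illustrated with a minimal fixed-size stack. The names below (`fc_stack_sketch`, `fc_push_sketch`, `fc_pop_sketch`) are hypothetical and only model the indexing rule; they are not the r600g code. The capacity of 32 is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FC_STACK_SIZE 32  /* the commit message mentions 32 elements */

struct fc_stack_sketch {
    int entries[FC_STACK_SIZE];
    unsigned sp;  /* number of entries currently stored */
};

/* Store at index sp first, then increment, so slot 0 is used and
 * entries[FC_STACK_SIZE] is never written. Returns false when full. */
static bool fc_push_sketch(struct fc_stack_sketch *s, int type)
{
    assert(s != NULL);
    if (s->sp >= FC_STACK_SIZE)
        return false;
    s->entries[s->sp++] = type;
    return true;
}

/* Decrement first, then read, mirroring the push. Returns false when empty. */
static bool fc_pop_sketch(struct fc_stack_sketch *s, int *type)
{
    assert(s != NULL && type != NULL);
    if (s->sp == 0)
        return false;
    *type = s->entries[--s->sp];
    return true;
}
```

Incrementing before the store instead would skip `entries[0]` and place the 32nd push at `entries[32]`, one past the end of the array.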
* r600g: update sb documentation (Constantine Kharlamov, 2017-03-20, 1 file, -3/+6)
  v2: s/r600/r600g in the title
  Signed-off-by: Constantine Kharlamov <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* r600g: make condition clearer (Constantine Kharlamov, 2017-03-20, 1 file, -6/+8)
  The second check in the old code looked pretty much unreachable, esp. because it's not obvious that "max_entries" could be zero. To find out that it was intentional I had to run some checks, and to dig into the old versions of the file.
  So, rewrite the check to make the intention clear.
  v2: s/r600/r600g in the title, and per Dieter Nützel's comment wrap lines of condition.
  Signed-off-by: Constantine Kharlamov <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
  Acked-by: Dieter Nützel <[email protected]>
  Tested-by: Dieter Nützel <[email protected]>
* anv/genX: Solve the vkCreateGraphicsPipelines crash (Xu,Randy, 2017-03-20, 1 file, -2/+2)
  The crash is due to NULL pColorBlendState, which is legal if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use any color attachments.
  Test: Sample subpasses from LunarG can run without crash
  Signed-off-by: Xu,Randy <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Cc: "17.0 13.0" <[email protected]>
* radv: fix logic for when to flush on multiple CS emission (Dave Airlie, 2017-03-20, 1 file, -8/+8)
  The current code always evaluated to true; we only want to flush on the first submit.
  Rename the variable to do_flush, and only emit on the first iteration.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* spirv: Implement IsInf using an integer comparison (Jason Ekstrand, 2017-03-20, 1 file, -1/+1)
  Since we already do fabs on the one source, we're guaranteed to get positive infinity if we get any infinity at all. Since +inf only has one IEEE 754 representation, we can use an integer comparison and avoid all of the ordered/unordered issues.
  Cc: Dave Airlie <[email protected]>
  Reviewed-by: Elie Tournier <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
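The reasoning in the commit above can be sketched in plain C, assuming `float` is an IEEE 754 binary32 value. `is_inf_sketch` is a hypothetical name, and this is scalar C rather than the NIR builder code the commit actually changes. The sign bit is cleared with an integer mask, which has the same effect on the encoding as fabs.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the idea: after clearing the sign bit (what fabs does), any
 * infinity becomes +inf, whose only binary32 encoding is 0x7f800000.
 * An integer equality test then avoids ordered/unordered float compares;
 * NaNs have a non-zero mantissa and therefore never match. */
static bool is_inf_sketch(float x)
{
    /* Assumption: float is a 32-bit IEEE 754 binary32 value. */
    assert(sizeof(float) == sizeof(uint32_t));

    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* well-defined bit cast */
    bits &= UINT32_C(0x7fffffff);    /* integer equivalent of fabs */
    return bits == UINT32_C(0x7f800000);
}
```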
* radv/meta: fix image clears for r4g4 format. (Dave Airlie, 2017-03-20, 1 file, -0/+8)
  This just uses an 8-bit clear and packs the values.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* Revert "radv: fallback to an in-memory cache when no pipline cache is provided" (Dave Airlie, 2017-03-20, 3 files, -13/+6)
  This reverts commit 2845a108a9a8bd4b0e6e9b590c976452fb99eb10.
  This breaks VK-GL-CTS randomly.
  ./deqp-vk --deqp-case=dEQP-VK.texture.filtering.3d.formats.r4g4b4a4* bounces around here from 6/6 to 3/6 or 4/6 to hanging.
  Signed-off-by: Dave Airlie <[email protected]>
* mesa: disable glthread when glNewList() is called (Timothy Arceri, 2017-03-20, 1 file, -1/+1)
  glNewList() swaps dispatch tables, and we don't have anything in place to handle that in glthread.
  Tested-by: Michel Dänzer <[email protected]>
* radv: fix primitive reset index emission (Dave Airlie, 2017-03-20, 1 file, -1/+1)
  This was meant to be checking the index type to get the correct index, not the last emitted one.
  This fixes: dEQP-VK.pipeline.input_assembly.primitive_restart.index_type_uint32.triangle_strip_with_adjacency
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "13.0 17.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* util/disk_cache: check rename result (Grazvydas Ignotas, 2017-03-20, 1 file, -2/+6)
  I haven't seen this causing problems in practice, but for correctness we should also check if rename succeeded to avoid breaking accounting and leaving a .tmp file behind.
  Signed-off-by: Grazvydas Ignotas <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* util/disk_cache: delete .tmp if target exists (Grazvydas Ignotas, 2017-03-20, 1 file, -1/+3)
  At the time of the target file check, the .tmp file is already created and the file lock is held, so we should remove the .tmp, like in other error paths.
  With this, piglit no longer leaves a large number of empty .tmp files behind, which waste directory entries and may interfere with eviction.
  Signed-off-by: Grazvydas Ignotas <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
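The cleanup rule from the two disk_cache commits above (remove the .tmp file when the target already exists, and when rename fails) can be sketched with POSIX calls. `publish_cache_file` is a hypothetical helper; the file locking and size accounting mentioned in the commit messages are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical sketch: publish tmp_path as final_path. On every path that
 * does not publish the file, the temporary file is removed so that no
 * stale .tmp is left behind. Returns true only if the rename succeeded. */
static bool publish_cache_file(const char *tmp_path, const char *final_path)
{
    assert(tmp_path != NULL && final_path != NULL);

    /* The entry already exists: drop our temporary file. */
    if (access(final_path, F_OK) == 0) {
        unlink(tmp_path);
        return false;
    }

    /* rename() returns 0 on success and -1 on failure. */
    if (rename(tmp_path, final_path) != 0) {
        unlink(tmp_path);
        return false;
    }
    return true;
}
```

Returning a success flag lets a caller update any size accounting only when the entry was really stored, which is the concern the rename commit raises.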
* util/disk_cache: fix stored_keys index (Grazvydas Ignotas, 2017-03-20, 1 file, -2/+2)
  It seems there is a bug because:
  - 20 bytes are compared, but only 1 byte stored_keys step is used
  - entries can overlap each other by 19 bytes
  - index_mmap is ~1.3M in size, but only first 64K is used
  With this fix for Deus Ex:
  - startup time (from launch to Feral logo): ~38s -> ~16s
  - disk_cache_has_key() hit rate: ~50% -> ~96%
  Signed-off-by: Grazvydas Ignotas <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
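The indexing problem listed in the commit above comes down to the step between slots. The sketch below assumes 65536 slots selected by the first two key bytes; that slot selection, and the names `slot_for_key`, `put_key` and `has_key`, are assumptions for illustration rather than the disk_cache implementation. With 20-byte keys, 65536 slots take about 1.3 MB, which matches the index size quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE 20          /* 20 bytes are compared, per the commit message */
#define INDEX_ENTRIES 65536  /* assumption: slot chosen by 16 bits of the key */

/* Hypothetical index: INDEX_ENTRIES slots of KEY_SIZE bytes each. */
static uint8_t stored_keys[INDEX_ENTRIES * KEY_SIZE];

static uint8_t *slot_for_key(const uint8_t key[KEY_SIZE])
{
    assert(key != NULL);
    /* Assumption: the first two key bytes select the slot. */
    size_t i = ((size_t)key[0] << 8) | key[1];
    /* Step by KEY_SIZE, not by 1, so neighbouring slots never overlap. */
    return &stored_keys[i * KEY_SIZE];
}

static void put_key(const uint8_t key[KEY_SIZE])
{
    memcpy(slot_for_key(key), key, KEY_SIZE);
}

static bool has_key(const uint8_t key[KEY_SIZE])
{
    return memcmp(slot_for_key(key), key, KEY_SIZE) == 0;
}
```

With a one-byte step, storing a key in slot i + 1 would overwrite 19 bytes of the key in slot i, so later lookups of the first key would miss; that is consistent with the low hit rate reported before the fix.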
* nv30: create uploader after pipe->screen is set (Ilia Mirkin, 2017-03-19, 1 file, -6/+6)
  Fixes crashes after recent upload rework.
  Signed-off-by: Ilia Mirkin <[email protected]>