summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: fix Gather4 with integer formatsMarek Olšák2016-09-051-3/+96
| | | | | | | | | | The closed compiler does the same thing. This fixes: GL45-CTS.texture_gather.*-int-* (18 tests) Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix a crash in imageSize for cubemap arraysMarek Olšák2016-09-051-3/+1
| | | | | | | | | | | Sometimes it was f32, other times it was i32. Now it's always i32. This fixes: GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shaderMarek Olšák2016-09-051-1/+6
| | | | | | | | | This fixes: GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation .gl_PatchVerticesIn Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix cubemaps viewed as 2DMarek Olšák2016-09-051-0/+4
| | | | | | | | | | | This fixes: GL43-CTS.texture_view.view_sampling v2: fix a typo, merge both if statements Cc: [email protected] Reviewed-by: Dave Airlie <[email protected]> (v1) Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1) Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: always use the same function signature for llvm.SI.exportMarek Olšák2016-09-051-4/+4
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: return correct eviction stats for NVX_gpu_memory_infoMarek Olšák2016-09-051-2/+7
| | | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: also eliminate DCC fast clear in resource_get_handleMarek Olšák2016-09-051-2/+3
| | | | | | | just do what the comment says Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use the current ctx for CMASK elimination in resource_get_handleMarek Olšák2016-09-051-6/+11
| | | | | | | For coherency with the current context. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use the current ctx for DCC decompression in resource_get_handleMarek Olšák2016-09-051-3/+3
| | | | | | | For coherency with the current context. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: derive buffer placement and flags only at initializationMarek Olšák2016-09-055-49/+67
| | | | | | | | | | Invalidated buffers don't have to go through it. Split r600_init_resource into r600_init_resource_fields and r600_alloc_resource. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set more sampler settingsMarek Olšák2016-09-052-2/+12
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* noop: implement resource_get_handleMarek Olšák2016-09-051-2/+14
| | | | X+DRI3 locks up if the returned handle is invalid.
* noop: set missing functionsMarek Olšák2016-09-052-0/+75
|
* noop: simplify some functionsMarek Olšák2016-09-051-49/+7
|
* glx/glvnd: list the strcmp arguments in correct orderEmil Velikov2016-09-051-2/+2
| | | | | | | | | | | Currently, due to the inverse order, strcmp will produce negative result when the needle is towards the start of the haystack. Thus on the next iteration(s) we'll end up further towards the end and eventually fail to locate the entry. Cc: "12.0" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir/tests: Update the CF tests to not assume fake edgesJason Ekstrand2016-09-041-4/+4
| | | | | | | | | | | In aad4f1550, we removed the concept of "fake" edges from NIR. Now, if you have a block at the end of an infinite loop it really has no predecessors. This updates the unit tests to match. Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97587 Tested-by: Aaron Watry <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* gk110/ir: fix quadop dall emissionIlia Mirkin2016-09-041-2/+2
| | | | | | | | | We recently starting to always emit the NDV (== dall) bit for quadops. However it was folded into the wrong code word. Fixes: e0a067ed48 (nv50/ir: always emit the NDV bit for OP_QUADOP) Signed-off-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
* android: intel: fix include paths in new "common" libraryMauro Rossi2016-09-031-0/+6
| | | | | | Fixes building error in libmesa_intel_common static library Reviewed-by: Jason Ekstrand <[email protected]>
* a3xx: use window scissor to simulate viewport xy clipIlia Mirkin2016-09-031-10/+26
| | | | | | | | | | | | | Unfortunately a3xx does not have a separate disable for depth clipping, so when depth clamp is enabled, we disable the whole 3d clipper logic. This in turn also gets rid of the xy clip that it would normally do. When we detect this would happen, instead we integrate the viewport into the window scissor. This may have slightly different behavior around wide points, but it's unlikely that anything depends on this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* a3xx: make use of software clipping when hw can't handle itIlia Mirkin2016-09-036-4/+36
| | | | | | | | | The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs, or a clip vertex, or clip distances are in use, then we must use the fallback discard-based clipping from the frag shader. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* a3xx: make sure to actually clamp depth as requestedIlia Mirkin2016-09-032-2/+30
| | | | | | | | | | | We were previously ... not clamping. I guess this meant that everything got clamped to 1/0, which was enough to pass the existing tests. Or perhaps the clamping would only happen to the rasterized depth value and not the frag shader's output depth value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nvc0/ir: allow min/max instructions to be dual-issued in pairsKarol Herbst2016-09-031-2/+12
| | | | | | | | | | | | | | changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=60000 /width=1024 /height=640: inst_executed: 1.03G inst_issued1: 614M -> 580M inst_issued2: 213M -> 230M score: 1021 -> 1030 Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* anv: Move cmd_buffer_config_l3 into anv_cmd_buffer.cJason Ekstrand2016-09-036-207/+162
| | | | | | | | This is the only remaining part of genX_l3.c and there's really no good reason for it to be in its own file. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* anv/cmd_buffer: Move emit_lri and emit_lrm higher upJason Ekstrand2016-09-031-19/+19
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* anv: Refactor pipeline l3 config setupJason Ekstrand2016-09-035-67/+25
| | | | | | | | | Now that we're using gen_l3_config.c, we no longer have one set of l3 config functions per gen and we can simplify a bit. Also, we know that only compute uses SLM so we don't need to look for it in all of the stages. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* anv: Leverage the shared L3$ config codeJason Ekstrand2016-09-032-345/+37
| | | | | | | | | When Jordan first implement L3$ configuration for Vulkan, he copied+pasted from the GL driver because we had no good place to share it. Now that we have src/intel/common, we should be sharing these tables. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* intel: Pull the guts of gen7_l3_state.c into a shared helperJason Ekstrand2016-09-035-334/+436
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* intel: Rename brw_get_device_name/info to gen_get_device_name/infoJason Ekstrand2016-09-0312-17/+17
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* intel: s/brw_device_info/gen_device_info/Jason Ekstrand2016-09-0375-380/+380
| | | | | | | | | | | | | Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* intel: Add a new "common" library for more code sharingJason Ekstrand2016-09-0320-9/+77
| | | | | | | The first thing to go in this new library is brw_device_info. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* intel/blorp: fix typo in android makefileMauro Rossi2016-09-031-1/+1
| | | | Acked-by: Jason Ekstrand <[email protected]>
* nir: remove unused variableTimothy Arceri2016-09-031-2/+0
| | | | | | This was let over from aad4f15506c Reviewed-by: Jason Ekstrand <[email protected]>
* nir: remove some fields from nir_shader_compiler_optionsConnor Abbott2016-09-031-3/+0
| | | | I accidentally added these with 0dc4cab. Oops!
* nir: fix bug with moves in nir_opt_remove_phis()Connor Abbott2016-09-031-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 144cbf8 ("nir: Make nir_opt_remove_phis see through moves."), Ken made nir_opt_remove_phis able to coalesce phi nodes whose sources are all moves with the same swizzle. However, he didn't add the logic necessary for handling the fact that the phi may now have multiple different sources, even though the sources point to the same thing. For example, if we had something like: if (...) a1 = b.yx; else a2 = b.yx; a = phi(a1, a2) ... = a then we would rewrite it to if (...) a1 = b.yx; else a2 = b.yx; ... = a1 by picking a random phi source, which in this case is invalid because the source doesn't dominate the phi. Instead, we need to change it to: if (...) a1 = b.yx; else a2 = b.yx; a3 = b.yx; ... = a3; Fixes 12 CTS tests: ES31-CTS.functional.tessellation.invariance.outer_edge_symmetry.quads* Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add nir_after_phis() cursor helperConnor Abbott2016-09-031-5/+14
| | | | | | | And re-implement nir_after_cf_node_and_phis() using it. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: expose max atomic counter/buffer consts for tess in ES 3.2Ilia Mirkin2016-09-031-2/+2
| | | | | | | | Curiously OES/EXT_tessellation_shader leave these out, while ES 3.2 adds them in. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mapi: don't forget to expose GetPointerv in GL ES 3.2Ilia Mirkin2016-09-031-1/+1
| | | | | | | | I left this out of my previous commit that went around enabling all of the other ES 3.2 entrypoints. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* main: add KHR_robustness to ES 3.2 extension requirementsIlia Mirkin2016-09-031-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nv50,nvc0: respect render condition enable flag when clearing rt/zsIlia Mirkin2016-09-032-12/+24
| | | | | | | | This is a newly added flag. We always pass false into it from nv50_clear_texture, but other callers may want to respect the render condition. (And the functions were originally spec'd to respect it.) Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: don't dual-issue ops that depend or interfere with each otherKarol Herbst2016-09-033-14/+23
| | | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]> [imirkin: rewrite to split up the helpers and move more logic to target] Signed-off-by: Ilia Mirkin <[email protected]>
* nir: Remove fake edges in the CF handling codeJason Ekstrand2016-09-021-57/+2
| | | | | | | | | | | | | | | | When NIR was first introduced, Connor added this fake-edge hack to work around issues related to unreachable blocks. Thanks to GLSL IR's jump lowering code, the only unreachable code you can have is a block after an infinite loop. With SPIR-V, we didn't have the jump lowering code so we could also end up with the "if (...) { break; } else { continue; }" case which generates an unreachable block after the if. Because of this, most of NIR had to be fixed up for handling unreachable blocks. The only remaining case of not handling unreachable blocks was specifically the block-after-infinite-loop case in dead_cf which was fixed by the previous commit. We can now delete the fake edge hack. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/dead_cf: Don't crash on unreachable after-loop blocksJason Ekstrand2016-09-021-1/+2
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nvc0: reduce the initial code segment size to 512KBSamuel Pitoiset2016-09-011-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: allow to resize the code segment dynamicallySamuel Pitoiset2016-09-011-1/+24
| | | | | | | | | | | | | When an application uses a ton of shaders, we need to evict them when the code segment is full but this is not really a good solution if monster shaders are used because code eviction will happen a lot. To avoid this, it seems better to dynamically resize the code segment area after each eviction. The maximum size is arbitrary fixed to 8MB which should be enough. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add a new bin for the code segmentSamuel Pitoiset2016-09-012-4/+6
| | | | | | | | | To avoid the bins list to grow up indefinitely when the code segment size will be bumped, we need to separate that bin from the SCREEN one because it contains other resources like the uniform bo. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add nvc0_screen_resize_text_area() helperSamuel Pitoiset2016-09-013-10/+40
| | | | | | | | This function will be helpful for resizing the code segment area when we need to evict all shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: re-upload currently bound shaders after code evictionSamuel Pitoiset2016-09-011-0/+27
| | | | | | | | | | | | | This fixes a very old issue which happens when the code segment size is full. A bunch of real applications like Tomb Raider, F1 2015, Elemental, hit that issue because they use a ton of shaders. In this case, all shaders are evicted (for freeing space) but all currently bound shaders also need to be re-uploaded and SP_START_ID have to be updated accordingly. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: refactor the program upload processSamuel Pitoiset2016-09-013-32/+59
| | | | | | | | | | This refactoring will help for fixing the "out of code space" eviction issue because we will need to reupload the code for all currently bound shaders but it's slightly different than uploading a new fresh code. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* i965: fix noop_scissor range issue on width/heightJordan Justen2016-09-011-7/+7
| | | | | | | | | | | | | If scissor X or Y was set to a negative value then the previous code might have indicated noop scissors when the scissor range actually was masking a portion of the framebuffer. Since fb->_Xmin, _Xmax, _Ymin and _Ymax take scissors into account, we can use these to test for a noop scissor. Cc: [email protected] Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Jordan Justen <[email protected]>
* glsl: Only force varyings to be flat when varying packing.Kenneth Graunke2016-09-011-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Varying packing would like to mark certain variables as flat. This works as long as both sides of the interfaces are changed accordingly. However, with SSO, we disable varying packing on the outermost stages. We also disable varying packing for certain tessellation stages. With SSO, we operate on the producer and consumer separately. Checks based on the consumer stage and variable are risky, and can easily lead to altering one half of the interface between stages, breaking SSO pipeline IO validation. Just stop monkeying around with interpolation modes unless required for varying packing. There's no point. This also disables it in unsafe SSO cases. Fixes CTS tests: *.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_MaxPatchVertices_Position_PointSize Also fixes Piglit's spec/oes_geometry_shader/sso_validation: - user-defined-gs-input-not-in-block.shader_test - user-defined-gs-input-in-block.shader_test Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]>