aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* i965: Store number of threads in brw_cs_prog_dataJordan Justen2016-06-019-37/+31
| | | | | | Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add nir based intrinsic lowering and thread ID uniformJordan Justen2016-06-014-0/+190
| | | | | | | | | | | | | | | | | | | | | | | | | We add a lowering pass for nir intrinsics. This pass can replace nir intrinsics with driver specific nir lower code. We lower the gl_LocalInvocationIndex intrinsic based on a uniform which is loaded with a thread specific ID. We also lower the gl_LocalInvocationID based on gl_LocalInvocationIndex. v2: * Create variable during lowering pass. (Ken) v3: * Don't create a variable, but instead just insert an intrisic call to load a uniform from the allocated location. (Jason) v4: * Don't run this pass if thread_local_id_index < 0 Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Put CS local thread ID uniform in last push registerJordan Justen2016-06-011-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | This thread ID uniform will be used to compute the gl_LocalInvocationIndex and gl_LocalInvocationID values. It is important for this uniform to be added in the last push constant register. fs_visitor::assign_constant_locations is updated to make sure this happens. The reason this is important is that the cross-thread push constant registers are loaded first, and the per-thread push constant registers are loaded after that. (Broadwell adds another push constant upload mechanism which reverses this order, but we are ignoring this for now.) v2: * Add variable in intrinsics lowering pass * Make sure the ID is pushed last in assign_constant_locations, and that we save a spot for the ID in the push constants v3: * Simplify code based with Jason's suggestions. Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add uniform for a CS thread local base IDJordan Justen2016-06-014-1/+25
| | | | | | | | | | | | v4: * Force thread_local_id_index to -1 for now, and have fs_visitor::setup_cs_payload look at thread_local_id_index. This enables us to more easily cut over from the old local ID layout to the new layout, as suggested by Jason. Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add nir channel_num system valueJordan Justen2016-06-012-0/+16
| | | | | | | | | v2: * simd16/32 fixes (curro) Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Make lowering gl_LocalInvocationIndex optionalJordan Justen2016-06-016-5/+22
| | | | | | Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Add glsl LowerCsDerivedVariables optionJordan Justen2016-06-016-13/+26
| | | | | | | | | | v2: * Move lower flag to context constants. (Ken) Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Copy the offset when lowering logical pull constant sendsJason Ekstrand2016-06-011-0/+8
| | | | | | | | This fixes 64 Vulkan CTS tests per gen Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299 Reviewed-by: Francisco Jerez <[email protected]> Cc: "12.0" <[email protected]>
* glsl/distance: make sure we use clip dist varying slot for lowered var.Dave Airlie2016-06-021-0/+1
| | | | | | | | When lowering, we always want to use the clip dist varying. Reviewed-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* winsys/amdgpu: decay max_ib_size over timeNicolai Hähnle2016-06-011-0/+2
| | | | | | So that memory use will eventually decrease again after a temporary peak. Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: implement IB chaining on the gfx ringNicolai Hähnle2016-06-012-18/+109
| | | | | | As a consequence, CE IB size never triggers a flush anymore. Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: consolidate IB size management in amdgpu_ib_finalizeNicolai Hähnle2016-06-011-9/+9
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeon/winsys: introduce radeon_winsys_cs_chunkNicolai Hähnle2016-06-0111-75/+98
| | | | | | | We will chain multiple chunks together and will keep pointers to the older chunks to support IB dumping. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/sid: add packet definitions for IB chainingNicolai Hähnle2016-06-012-0/+15
| | | | | | While we're at it, add packet printing in si_debug. Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: start with smaller IBs, growing as necessaryNicolai Hähnle2016-06-012-10/+71
| | | | | | | | | | | | | | | | | This avoids allocating giant IBs from the outset, especially for CE and DMA. Since we now limit max_dw only by the size that the buffer happens to be (which, due to the buffer cache, can be even larger than the rounded-up size we request), the new function amdgpu_ib_max_submit_dwords controls when we submit an IB. With this change, we effectively never flush prematurely due to the CE IB, after an initial warm-up phase. v2: - clean up buffer_size calculation Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: add amdgpu_ib and amdgpu_cs_from_ib helper functionsNicolai Hähnle2016-06-012-7/+37
| | | | | | | The latter function allows getting the containing amdgpu_cs from any IB (including non-main ones). Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: extract IB big buffer allocation for re-useNicolai Hähnle2016-06-011-17/+29
| | | | Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: add IB buffer in amdgpu_get_new_ibNicolai Hähnle2016-06-011-121/+113
| | | | | | | Adding the buffer when we start using it for the IB makes the logic for chaining a bit simpler. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: use cs_check_space throughoutNicolai Hähnle2016-06-015-10/+7
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeon/winsys: add cs_check_spaceNicolai Hähnle2016-06-013-0/+23
| | | | Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: simplify interface of amdgpu_get_new_ibNicolai Hähnle2016-06-012-14/+14
| | | | | | We'll want to have an amdgpu_cs pointer for future changes. Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: add amdgpu_cs_has_user_fenceNicolai Hähnle2016-06-011-4/+8
| | | | | | v2: style change Reviewed-by: Marek Olšák <[email protected]>
* i965: Fix isoline reads in scalar TES.Kenneth Graunke2016-06-011-1/+1
| | | | | | | | | | | | | Isolines aren't reversed. commit 5b2d8c2273c6f fixed this for the vec4 TES backend, but not the scalar one. Found while debugging GL45-CTS.tessellation_shader. tessellation_control_to_tessellation_evaluation.gl_tessLevel. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: [email protected]
* st/mesa: implement PBO downloads for ReadPixelsNicolai Hähnle2016-06-013-3/+160
| | | | | | | | | v2: require PIPE_CAP_SAMPLER_VIEW_TARGET; technically only needed for some of the texture targets, but all hardware that has shader images should also have this cap. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: hook up a no-op try_pbo_readpixelsNicolai Hähnle2016-06-011-20/+37
| | | | | | | | For better bisectability given that the order of some of the fallback tests in the blit path are rearranged. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: add layer_offset to PBO fragment shaderNicolai Hähnle2016-06-012-4/+16
| | | | | | | | | This will be used to select a slice of a 3D texture. v2: fix a comment (Marek) Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: create PBO download fragment shadersNicolai Hähnle2016-06-012-7/+79
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: add PBO download enable bit and fragment shadersNicolai Hähnle2016-06-012-0/+17
| | | | | | | | | | For downloads, the fragment shader must know the source texture target, hence we may cache multiple fragment shaders. v2: break long line (Marek) Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: move shareable parts of PBO upload state and draw to st_pbo.cNicolai Hähnle2016-06-013-106/+129
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: move PBO buffer address calculation to st_pbo.cNicolai Hähnle2016-06-013-120/+204
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: move PBO upload fs creation to st_pbo.cNicolai Hähnle2016-06-013-77/+80
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: rename pbo_upload to pboNicolai Hähnle2016-06-013-48/+48
| | | | | | | At the same time, rename members that are upload-specific to say so. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: move PBO vertex and geometry shader creation to st_pbo.cNicolai Hähnle2016-06-013-89/+97
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: begin moving PBO functions into their own fileNicolai Hähnle2016-06-016-61/+130
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/cso: allow saving the first fragment shader image slotNicolai Hähnle2016-06-013-4/+53
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_inlines: allow NULL src in util_copy_image_viewNicolai Hähnle2016-06-011-4/+11
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_BARRIER_ALL defineNicolai Hähnle2016-06-011-0/+1
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* glsl: Use Geom.VerticesOut == -1 to specify unsetIan Romanick2016-06-013-6/+6
| | | | | | | | | Because apparently layout(max_vertices=0) is a thing. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* i965: If control_data_header_size_bits is zero, don't do EndPrimitiveIan Romanick2016-06-012-0/+6
| | | | | | | | This can occur when max_vertices=0 is explicitly specified. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* mesa: Fix bogus strncmpIan Romanick2016-06-011-1/+1
| | | | | | | | | | | | | | The string "[0]\0" is the same as "[0]" as far as the C string datatype is concerned. That string has length 3. strncmp(s, length_3_string, 4) is the same as strcmp(s, length_3_string), so make it be strcmp. v2: Not the same as strncmp(..., 3). Noticed by Ilia. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "12.0" <[email protected]>
* radeonsi: set correct stencil tile mode for texturingMarek Olšák2016-06-011-2/+8
| | | | | | Sadly, this doesn't affect SI and VI in any way. Reviewed-by: Michel Dänzer <[email protected]>
* winsys/amdgpu: set flags correctly when allocating depth-stencil buffersMarek Olšák2016-06-011-2/+8
| | | | | | This mimics Vulkan. It also documents how to fix stencil texturing. Reviewed-by: Michel Dänzer <[email protected]>
* gallium/radeon: lower memory usage during texture transfersMarek Olšák2016-06-012-4/+29
| | | | | | | | | | | | This improves throughput by keeping TTM overhead down. Some piglit tests such as texelFetch and streaming-texture-leak will use less memory now. v2: use gart_size / 4 as the threshold Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: invalidate busy linear textures for whole-texture uploadsMarek Olšák2016-06-011-2/+28
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: degrade tiled textures mapped often to linearMarek Olšák2016-06-012-0/+103
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: clean up and better comment use_staging_textureMarek Olšák2016-06-011-19/+23
| | | | | | | Next commits will add other things around this. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: set some colorbuffer register fields at emit timeMarek Olšák2016-06-013-50/+47
| | | | | | | to allow reallocating the texture storage with different parameters Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: implement global resetting of texture descriptorsMarek Olšák2016-06-014-6/+64
| | | | | | | it will be used by texture reallocation Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: move code for setting one shader image into separate functionMarek Olšák2016-06-011-71/+82
| | | | | | | v2: fix set_shader_images(..., NULL). Found by Christoph Haag. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: set some image descriptor fields at bind timeMarek Olšák2016-06-014-71/+111
| | | | | | | | mainly the fields that can change by reallocating a texture and changing the tile mode Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>