summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* i965/fs: Skip SIMD lowering destination zipping if possible.Francisco Jerez2016-06-021-0/+55
| | | | | | | | | | | | | | | | | | Skipping the temporary allocation and copy instructions is easy (just return dst), but the conditions used to find out whether the copy can be optimized out safely without breaking the program are rather complex: The destination must be exactly one component of at most the execution width of the lowered instruction, and all source regions of the instruction must be either fully disjoint from the destination or be aligned with it group by group. v2: Don't handle partial source-destination overlap for simplicity (Jason). No instruction count regressions with respect to v1 in either shader-db or the few FP64 shader_runner test-cases with partial overlap I've checked manually. Cc: [email protected] Reviewed-by: Jason Ekstrand <[email protected]>
* blorp: Fix 16x multisample scaled blitsAnuj Phogat2016-06-021-7/+10
| | | | | | | | | Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled (with added 16x sample support) now passes with this patch. Cc: "12.0" <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* meta: Fix indentation in shader codeAnuj Phogat2016-06-021-2/+2
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Acked-by: Matt Turner <[email protected]>
* mesa/copyimage: report INVALID_VALUE for missing cube faceDave Airlie2016-06-031-1/+1
| | | | | | | | | | | | The specs says INVALID_VALUE for exceeding dimensions, which is really what is happening here. This fixes: GL45-CTS.copy_image.non_existent_mipmap Cc: "11.2 12.0" <[email protected]> Reviewed-by: Antia Puentes <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/copyimage: fix num samples check to handle renderbuffers.Dave Airlie2016-06-031-4/+7
| | | | | | | | | | | | This test was only happening for textures, but there is nothing in the spec to say this, so test it for all cases. This fixes: GL45-CTS.copy_image.invalid_target Cc: "11.2 12.0" <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* freedreno/a4xx: silence coverity warningRob Clark2016-06-021-0/+6
| | | | | | CID 1362451 Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx+a4xx: fix potential null ptr derefRob Clark2016-06-022-2/+4
| | | | | | | | Coverity spotted the a3xx case (not sure why not the a4xx). CID 1362452 Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix coverity warningRob Clark2016-06-021-1/+3
| | | | | | CID 1362453 Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use nir_shader_get_entrypoint() helperRob Clark2016-06-021-10/+1
| | | | | | Should also fix coverity warning: CID 1362454 Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix incorrect enum typeRob Clark2016-06-021-1/+1
| | | | | | | | a4xx has it's own enum, different from a2xx/a3xx. Spotted by coverity: CID 1362458, 1362459 Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix coverity negative array index warningRob Clark2016-06-021-0/+2
| | | | | | | | | | Never can happen, since query would not have been created in the first place if pidx(query_type) return negative. Lets let coverity realize this. CID 1362460 Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix dereference before null checkRob Clark2016-06-021-2/+1
| | | | | | | | | | ptr can actually never be null so just drop the check. CID 1362464 (#1 of 1): Dereference before null check (REVERSE_INULL) check_after_deref: Null-checking ptr suggests that it may be null, but it has already been dereferenced on all paths leading to the check. Signed-off-by: Rob Clark <[email protected]>
* gallium/util: remove u_stagingRob Clark2016-06-023-205/+0
| | | | | | | Unused, and fixes a couple of coverity warnings: CID 1362171, 1362170 Signed-off-by: Rob Clark <[email protected]> Acked-by: Marek Olšák <[email protected]>
* freedreno/a3xx: only update/emit bordercolor state when neededRob Clark2016-06-023-17/+27
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: only update/emit bordercolor state when neededRob Clark2016-06-023-17/+26
| | | | | | I noticed in stk that it was contributing to a lot of overhead. Signed-off-by: Rob Clark <[email protected]>
* i965: Add missing types to type_sz().Matt Turner2016-06-021-1/+5
| | | | | | | | Coverity warns in multiple places about the potential for division by zero, caused by this function's default case. Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* mesa/extensions: Fix ES1 extension reportingNanley Chery2016-06-021-2/+2
| | | | | | | | | | Commit eda15abd84af575d3bde432e2163e30d743a7c87 , unintentionally advertised these extensions in ES1 contexts. Undo this error. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: "12.0" <[email protected]>
* egl: Check if API is supported when using eglBindAPI.Plamena Manolova2016-06-024-10/+72
| | | | | | | | | According to the EGL specifications before binding an API we must check whether it's supported first. If not eglBindAPI should return EGL_FALSE and generate a EGL_BAD_PARAMETER error. Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* st/osmesa: remove double-write (overwriting)Eric Engestrom2016-06-021-1/+0
| | | | | | | | | | | | These two lines have been here since the file was created. I'm guessing the second one was just for testing during dev, so it's the one that's going away. CoverityID: 1296205 Signed-off-by: Eric Engestrom <[email protected]> Cc: [email protected] Reviewed-by: Brian Paul <[email protected]>
* st/vdpau: check for null pointer in get/put bits.Nayan Deshmukh2016-06-022-0/+12
| | | | | | | | Check for null pointer before accessing arrays in get/put bits native/YCbCr/Indexed in VdpOutputSurface and VdpVideoSurface. Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* radeon/uvd: fix the H264 level for Tonga v2Christian König2016-06-021-1/+1
| | | | | | | | | | We support 5.2 for a while now. v2: we even support 5.2 for H264, 5.1 is for HEVC. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Cc: <[email protected]>
* mesa/formatquery: add a comment to clarify INTERNALFORMAT_PREFERREDAlejandro Piñeiro2016-06-021-1/+4
| | | | | | | | | | | The comment clarifies that the driver is called only to try to get a preferred internalformat, and that it was already checked if the format is supported or not. Acked-by: Eduardo Lima <[email protected]> Acked-by: Antia Puentes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/formatquery: remove INTERNALFORMAT_PREFERRED implementationAlejandro Piñeiro2016-06-021-71/+0
| | | | | | | | | | | | | Right now the implementation only checks if the internalformat is supported or not. But that implementation is wrong, returning unsupported for some internalformats. Additionally, checking if the internalformat is supported or not is already done at mesa/main before calling the driver hook, so this new check is not needed. Acked-by: Eduardo Lima <[email protected]> Acked-by: Antia Puentes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/eu: use simd8 when exec_size != EXECUTE_16Alejandro Piñeiro2016-06-021-2/+2
| | | | | | | | | | Among other thigs, fix a gpu hang when using INTEL_DEBUG=shader_time for any shader. Signed-off-by: Jason Ekstrand <[email protected]> Signed-off-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Remove old CS local ID handlingJordan Justen2016-06-017-124/+3
| | | | | | | | | | | | | The old method pushed data for each channels uvec3 data of gl_LocalInvocationID. The new method pushes 1 dword of data that is a 'thread local ID' value. Based on that value, we can generate gl_LocalInvocationIndex and gl_LocalInvocationID with some calculations. Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Enable cross-thread constants and compact local IDs for hsw+Jordan Justen2016-06-013-14/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. One complication is that cross-thread constants are loaded into registers before per-thread constants. Previously, our local IDs were loaded before the uniform data and treated as 'payload' data, even though they were actually pushed into the registers like the other uniform data. Therefore, in this patch we simultaneously enable a newer layout where each thread now uses a single uniform slot for a unique local ID for the thread. This uniform is handled specially to make sure it is added last into the uniform push constant registers. This minimizes our usage of push constant registers, and maximizes our ability to use cross-thread constants for registers. To swap from the old to the new layout, we also need to flip some lowering pass switches to let our driver handle the lowering instead. We also no longer force thread_local_id_index to -1. v4: * Minimize size of patch that switches from the old local ID layout to the new layout (Jason) Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Support new local ID generation & cross-thread constantsJordan Justen2016-06-014-48/+42
| | | | | | | | | | | | | | | | | The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. We also support per-thread data which allows us to store a per-thread ID in one of the uniforms that can be used to calculate the gl_LocalInvocationIndex and gl_LocalInvocationID variables. v4: * Support the old local ID push constant layout as well (Jason) Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Support new local ID push constant & cross-thread constantsJordan Justen2016-06-012-45/+52
| | | | | | | | | | | | | | | | | The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. We also support per-thread data which allows us to store a per-thread ID in one of the uniforms that can be used to calculate the gl_LocalInvocationIndex and gl_LocalInvocationID variables. v4: * Support the old local ID push constant layout as well (Jason) Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add CS push constant info to brw_cs_prog_dataJordan Justen2016-06-012-0/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | We need information about push constants in a few places for the GL driver, and another couple places for the vulkan driver. When we add support for uploading both a common (cross-thread) set of push constants, combined with the previous per-thread push constant data, things are going to get even more complicated. To simplify things, we add push constant info into the cs prog_data struct. The cross-thread constant support is added as of Haswell. To support it we need to make sure all push constants with uniform values are added to earlier registers. The register that varies per thread and holds the thread invocation's unique local ID needs to be added last. For now we add the code that would calculate cross-thread constatn information for hsw+, but we force it (cross_thread_supported) off until the other parts of the driver support it. v4: * Support older local ID push constant layout as well. (Jason) Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Store number of threads in brw_cs_prog_dataJordan Justen2016-06-019-37/+31
| | | | | | Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add nir based intrinsic lowering and thread ID uniformJordan Justen2016-06-014-0/+190
| | | | | | | | | | | | | | | | | | | | | | | | | We add a lowering pass for nir intrinsics. This pass can replace nir intrinsics with driver specific nir lower code. We lower the gl_LocalInvocationIndex intrinsic based on a uniform which is loaded with a thread specific ID. We also lower the gl_LocalInvocationID based on gl_LocalInvocationIndex. v2: * Create variable during lowering pass. (Ken) v3: * Don't create a variable, but instead just insert an intrisic call to load a uniform from the allocated location. (Jason) v4: * Don't run this pass if thread_local_id_index < 0 Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Put CS local thread ID uniform in last push registerJordan Justen2016-06-011-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | This thread ID uniform will be used to compute the gl_LocalInvocationIndex and gl_LocalInvocationID values. It is important for this uniform to be added in the last push constant register. fs_visitor::assign_constant_locations is updated to make sure this happens. The reason this is important is that the cross-thread push constant registers are loaded first, and the per-thread push constant registers are loaded after that. (Broadwell adds another push constant upload mechanism which reverses this order, but we are ignoring this for now.) v2: * Add variable in intrinsics lowering pass * Make sure the ID is pushed last in assign_constant_locations, and that we save a spot for the ID in the push constants v3: * Simplify code based with Jason's suggestions. Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add uniform for a CS thread local base IDJordan Justen2016-06-014-1/+25
| | | | | | | | | | | | v4: * Force thread_local_id_index to -1 for now, and have fs_visitor::setup_cs_payload look at thread_local_id_index. This enables us to more easily cut over from the old local ID layout to the new layout, as suggested by Jason. Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add nir channel_num system valueJordan Justen2016-06-012-0/+16
| | | | | | | | | v2: * simd16/32 fixes (curro) Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Make lowering gl_LocalInvocationIndex optionalJordan Justen2016-06-016-5/+22
| | | | | | Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Add glsl LowerCsDerivedVariables optionJordan Justen2016-06-016-13/+26
| | | | | | | | | | v2: * Move lower flag to context constants. (Ken) Cc: "12.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Copy the offset when lowering logical pull constant sendsJason Ekstrand2016-06-011-0/+8
| | | | | | | | This fixes 64 Vulkan CTS tests per gen Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299 Reviewed-by: Francisco Jerez <[email protected]> Cc: "12.0" <[email protected]>
* glsl/distance: make sure we use clip dist varying slot for lowered var.Dave Airlie2016-06-021-0/+1
| | | | | | | | When lowering, we always want to use the clip dist varying. Reviewed-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* winsys/amdgpu: decay max_ib_size over timeNicolai Hähnle2016-06-011-0/+2
| | | | | | So that memory use will eventually decrease again after a temporary peak. Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: implement IB chaining on the gfx ringNicolai Hähnle2016-06-012-18/+109
| | | | | | As a consequence, CE IB size never triggers a flush anymore. Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: consolidate IB size management in amdgpu_ib_finalizeNicolai Hähnle2016-06-011-9/+9
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeon/winsys: introduce radeon_winsys_cs_chunkNicolai Hähnle2016-06-0111-75/+98
| | | | | | | We will chain multiple chunks together and will keep pointers to the older chunks to support IB dumping. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/sid: add packet definitions for IB chainingNicolai Hähnle2016-06-012-0/+15
| | | | | | While we're at it, add packet printing in si_debug. Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: start with smaller IBs, growing as necessaryNicolai Hähnle2016-06-012-10/+71
| | | | | | | | | | | | | | | | | This avoids allocating giant IBs from the outset, especially for CE and DMA. Since we now limit max_dw only by the size that the buffer happens to be (which, due to the buffer cache, can be even larger than the rounded-up size we request), the new function amdgpu_ib_max_submit_dwords controls when we submit an IB. With this change, we effectively never flush prematurely due to the CE IB, after an initial warm-up phase. v2: - clean up buffer_size calculation Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: add amdgpu_ib and amdgpu_cs_from_ib helper functionsNicolai Hähnle2016-06-012-7/+37
| | | | | | | The latter function allows getting the containing amdgpu_cs from any IB (including non-main ones). Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: extract IB big buffer allocation for re-useNicolai Hähnle2016-06-011-17/+29
| | | | Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: add IB buffer in amdgpu_get_new_ibNicolai Hähnle2016-06-011-121/+113
| | | | | | | Adding the buffer when we start using it for the IB makes the logic for chaining a bit simpler. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: use cs_check_space throughoutNicolai Hähnle2016-06-015-10/+7
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeon/winsys: add cs_check_spaceNicolai Hähnle2016-06-013-0/+23
| | | | Reviewed-by: Marek Olšák <[email protected]>
* winsys/amdgpu: simplify interface of amdgpu_get_new_ibNicolai Hähnle2016-06-012-14/+14
| | | | | | We'll want to have an amdgpu_cs pointer for future changes. Reviewed-by: Marek Olšák <[email protected]>