path: root/src
Commit message | Author | Age | Files | Lines
* genxml: Make gen6-7 blending look more like gen8Jason Ekstrand2016-07-154-15/+34
| | | | | | | | | | This renames BLEND_STATE to BLEND_STATE_ENTRY and adds a new struct BLEND_STATE which is just an array of 8 BLEND_STATE_ENTRYs. This will make it much easier to write gen-agnostic blend handling code. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
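The layout change described above can be pictured with a minimal C sketch. The struct and field names here are hypothetical stand-ins, not the real genxml-generated definitions; the only detail taken from the commit message is that the outer state is an array of 8 per-target entries.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_BLEND_ENTRIES 8  /* from the commit message: an array of 8 entries */

/* Hypothetical, heavily simplified stand-in for one BLEND_STATE_ENTRY. */
struct blend_state_entry {
    bool blend_enable;
};

/* Stand-in for the new BLEND_STATE: just an array of per-target entries. */
struct blend_state {
    struct blend_state_entry entry[NUM_BLEND_ENTRIES];
};

/* Generation-agnostic code can loop over entries without knowing the packing. */
static void
blend_state_set_enables(struct blend_state *bs, uint32_t enable_mask)
{
    for (unsigned i = 0; i < NUM_BLEND_ENTRIES; i++)
        bs->entry[i].blend_enable = ((enable_mask >> i) & 1u) != 0;
}

static unsigned
blend_state_count_enabled(const struct blend_state *bs)
{
    unsigned n = 0;
    for (unsigned i = 0; i < NUM_BLEND_ENTRIES; i++)
        n += bs->entry[i].blend_enable ? 1u : 0u;
    return n;
}
```

With this shape, the same loop can serve every hardware generation; only the per-entry packing would differ.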
* vc4: Speed up glGenerateMipmaps by avoiding shadow baselevel.Eric Anholt2016-07-155-3/+23
| | | | | | | | | | | | | To support general GL_TEXTURE_BASE_LEVEL we have to copy to a temporary miptree. However, if a single level is being selected, we can use the existing miptree and force all the sampling to be from that particular level. This avoids a ton of software fallbacks in glGenerateMipmaps(), which uses base levels in the blit implementation in gallium. Improves "glmark2 -b terrain" from 2 fps to 3 (perhaps some more precision would be useful?), and cuts its CPU usage during the benchmarking from ~30% to ~10% (total CPU time from 8.8s to 7.6s).
* vc4: Drop VC4_DIRTY_TEXSTATE in favor of the per-stage flags.Eric Anholt2016-07-154-8/+4
| | | | | | The compiler uses the per-stage flags already, so it didn't need this. vc4_uniforms was using it, so just replace it with both of the stage flags for now.
* vc4: Remove dead dirty_samplers field.Eric Anholt2016-07-152-5/+0
| | | | We use a big VC4_DIRTY_FRAGTEX/VC4_DIRTY_VERTTEX on the stage, instead.
* vc4: Turn on control flow support in the simulator environment.Eric Anholt2016-07-151-0/+4
| | | | | We can't merge the non-simulator support until we merge the kernel side and get a new libdrm release.
* mesa: handle numLevels, numSamples in _mesa_test_proxy_teximage()Brian Paul2016-07-151-3/+42
| | | | | | | | | If numLevels > 0, we can compute the size of the whole mipmapped texture. That's the case for glTexStorage(GL_PROXY_TEXTURE_x). Also, multiply the texture size by numSamples for MSAA textures. Reviewed-by: Anuj Phogat <[email protected]>
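As a rough illustration of the size computation this message describes, the sketch below sums the bytes of each mipmap level and then scales by the sample count. It is not the Mesa implementation: the function names, the fixed bytes-per-texel parameter, and the choice to minify depth (as for a 3D texture) are all assumptions made for the example.

```c
#include <stdint.h>

/* Size of one dimension at a given mip level, clamped to at least 1. */
static uint32_t
minify(uint32_t size, unsigned level)
{
    uint32_t s = level < 32 ? size >> level : 0;
    return s ? s : 1;
}

/* Hypothetical helper: total bytes for num_levels mip levels, scaled by
 * num_samples for multisampled textures. Depth is minified too, which
 * matches 3D textures but not array textures (an assumption of this sketch). */
static uint64_t
proxy_texture_bytes(uint32_t width, uint32_t height, uint32_t depth,
                    unsigned num_levels, unsigned num_samples,
                    unsigned bytes_per_texel)
{
    uint64_t total = 0;

    for (unsigned level = 0; level < num_levels; level++) {
        total += (uint64_t)minify(width, level) *
                 minify(height, level) *
                 minify(depth, level) *
                 bytes_per_texel;
    }

    if (num_samples > 1)
        total *= num_samples;

    return total;
}
```

A proxy test would then compare the returned total against whatever limit the driver advertises.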
* mesa: add proxy texture targets in _mesa_next_mipmap_level_size()Brian Paul2016-07-151-3/+6
| | | | | | So we can use it for computing size of proxy textures. Reviewed-by: Anuj Phogat <[email protected]>
* mesa: add numLevels, numSamples to Driver.TestProxyTexImage()Brian Paul2016-07-157-29/+39
| | | | | | | | | | | | | So that the function can work properly with glTexStorage(), where we know how many mipmap levels there are. And so we can compute storage for MSAA textures. Also, remove the obsolete texture border parameter. A subsequent patch will update _mesa_test_proxy_teximage() to use these new parameters. Reviewed-by: Anuj Phogat <[email protected]>
* mesa: use _mesa_clear_texture_image() in clear_texture_fields()Brian Paul2016-07-151-3/+1
| | | | | | | | | | | This avoids a failed assert(img->_BaseFormat != -1) in init_teximage_fields_ms() because the internalFormat argument is GL_NONE. This was hit when using glTexStorage() to do a proxy texture test. Fixes a failure with the updated Piglit tex3d-maxsize test. Cc: <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* svga: avoid unbinding render targets that have already been unboundCharmaine Lee2016-07-151-1/+6
| | | | | | | | Fixed the remaining redundant SetRenderTargets command emission. Tested with lightsMark2008, Heaven, mtt piglit, glretrace, conform. Reviewed-by: Brian Paul <[email protected]>
* svga: dump code for GenMips.Neha Bhende2016-07-151-0/+6
| | | | Reviewed-by: Brian Paul <[email protected]>
* Use correct names for dlopen()ed files on CygwinYaakov Selkowitz2016-07-153-0/+6
| | | | | Signed-off-by: Yaakov Selkowitz <[email protected]> Reviewed-by: Jon Turney <[email protected]>
* Revert "isl: Don't filter tiling flags if a specific tiling bit is set"Nanley Chery2016-07-151-8/+5
| | | | | | | | | | This reverts commit 091f1da902c71ac8d3d27b325a118e2f683f1ae5. Although a user may specify a specific tiling bit, ISL should still prevent incompatible tiling/surface combinations. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* anv/blit2d: Copy with stencil sources when neededNanley Chery2016-07-151-3/+14
| | | | | | | | | | In the next patch, ISL will unconditionally perform verification of a surface's tiling and usage. Since it will require that w-tiled images be stencil buffers, create a stencil surface to copy from a w-tiled/stencil surface. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: Fix initialization of the ISL tilingNanley Chery2016-07-152-4/+14
| | | | | | | | | | | | | | | If an internal user creates an image with Vulkan tiling VK_IMAGE_TILING_OPTIMAL and an ISL tiling that isn't set, ISL will fail to create the image as anv_image_create_info::isl_tiling_flags will be an invalid value. Correct this by making anv_image_create_info::isl_tiling_flags an opt-in, filtering bitmask, that allows the caller to specify which ISL tilings are acceptable, but not contradictory to the Vulkan tiling. Opt-out of filtering for vkCreateImage. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
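The "opt-in, filtering bitmask" idea can be sketched as follows. The bit names and the helper are hypothetical and only mirror the behaviour described in the message: the Vulkan tiling picks the base set, a non-zero filter narrows it, and a zero filter means the caller opted out of filtering.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tiling bits, loosely modelled on the idea of ISL tiling flags. */
enum {
    TILING_LINEAR_BIT = 1u << 0,
    TILING_X_BIT      = 1u << 1,
    TILING_Y_BIT      = 1u << 2,
    TILING_W_BIT      = 1u << 3,
    TILING_ANY_MASK   = 0xFu,
};

/* The Vulkan tiling selects the starting set of acceptable tilings. A
 * non-zero filter can only remove candidates, so it can never contradict
 * the Vulkan tiling; a filter of 0 leaves the set untouched. */
static uint32_t
choose_tiling_flags(bool vk_tiling_is_linear, uint32_t filter)
{
    uint32_t flags = vk_tiling_is_linear ? TILING_LINEAR_BIT : TILING_ANY_MASK;

    if (filter != 0)
        flags &= filter;

    return flags;
}
```

In this sketch an empty result signals a filter that is incompatible with the requested Vulkan tiling, which a caller would treat as an error.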
* isl: Fix isl_tiling_is_any_y()Nanley Chery2016-07-151-1/+1
| | | | | | | Cc: 12.0 <[email protected]> Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/device: Fix max buffer range limitsNanley Chery2016-07-151-2/+6
| | | | | | | | | | | | | | | | Set limits that are consistent with ISL's assertions in isl_genX(buffer_fill_state_s)() and Anvil's format-DescriptorType mapping in anv_isl_format_for_descriptor_type(). Fixes the following new crucible tests: * stress.limits.buffer-update.range.uniform * stress.limits.buffer-update.range.storage These tests are in this patch: https://patchwork.freedesktop.org/patch/98726/ Cc: 12.0 <[email protected]> Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* isl: Fix assert on raw buffer surface state sizeNanley Chery2016-07-151-1/+8
| | | | | | | | See inline PRM reference. Cc: 12.0 <[email protected]> Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Simplify range member assignmentNanley Chery2016-07-151-4/+2
| | | | | | | | A ternary is clearer because the range member is assigned one of two values dependent on one condition. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Remove unused variableNanley Chery2016-07-151-2/+1
| | | | | | | This became unused due to commit 612e35b2c65c99773b73e53d0e6fd112b1a7431f. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/descriptor_set: Fix binding partly undefined descriptor setsNanley Chery2016-07-151-0/+5
| | | | | | | | | | | | | | | | Section 13.2.3. of the Vulkan spec requires that implementations be able to bind sparsely-defined Descriptor Sets without any errors or exceptions. When binding a descriptor set that contains a dynamic buffer binding/descriptor, the driver attempts to dereference the descriptor's buffer_view field if it is non-NULL. It currently segfaults on undefined descriptors as this field is never zero-initialized. Zero undefined descriptors to avoid segfaulting. This solution was suggested by Jason Ekstrand. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850 Cc: 12.0 <[email protected]> Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
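The failure mode and the fix can be shown with a small sketch. The types and function names are hypothetical, not Anvil's; the point is only that a "non-NULL means valid" check is safe once the descriptor storage has been zeroed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's descriptor structures. */
struct buffer_view {
    uint32_t offset;
};

struct descriptor {
    int type;
    struct buffer_view *buffer_view;  /* garbage if never initialized */
};

/* Allocate descriptors and zero them so that undefined (never written)
 * descriptors carry a NULL buffer_view instead of an indeterminate pointer.
 * This relies on all-bits-zero being a null pointer, which holds on the
 * platforms such drivers target. */
static struct descriptor *
alloc_descriptors(size_t count)
{
    struct descriptor *descs = malloc(count * sizeof(*descs));
    if (descs == NULL)
        return NULL;

    memset(descs, 0, count * sizeof(*descs));
    return descs;
}

/* Bind-time code can now trust the NULL check. */
static uint32_t
descriptor_view_offset(const struct descriptor *desc)
{
    return desc->buffer_view != NULL ? desc->buffer_view->offset : 0;
}
```

Without the zeroing step, the NULL check in `descriptor_view_offset` would read an indeterminate pointer for any descriptor the application never wrote.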
* svga: handle mismatched number of samplers, sampler viewsBrian Paul2016-07-151-5/+10
| | | | | | | | | | | | | in svga_init_shader_key_common(). Since the CSO module only tracks sampler views for fragment shaders, the number of samplers and sampler views can be mismatched for other types of shaders. This situation triggered an assertion in Chrome with maps.google.com This patch adds defensive code to handle that situation. Fixes VMware bug 1694027 Cc: <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
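One defensive pattern for the mismatch described here is to iterate up to the larger of the two counts and treat any missing sampler or view as NULL. The sketch below uses hypothetical types and is not the code in svga_init_shader_key_common(); it only illustrates the general approach.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical, minimal stand-ins for sampler state and sampler views. */
struct sampler      { bool normalized_coords; };
struct sampler_view { unsigned width; };

struct tex_key {
    bool has_view;
    bool unnormalized;
};

/* Fill per-texture key entries without assuming the two arrays have the
 * same length: an index past either array's end simply has no object. */
static unsigned
fill_texture_keys(struct tex_key *keys, unsigned key_capacity,
                  const struct sampler *const *samplers, unsigned num_samplers,
                  const struct sampler_view *const *views, unsigned num_views)
{
    unsigned count = MAX2(num_samplers, num_views);
    if (count > key_capacity)
        count = key_capacity;

    for (unsigned i = 0; i < count; i++) {
        const struct sampler *s = i < num_samplers ? samplers[i] : NULL;
        const struct sampler_view *v = i < num_views ? views[i] : NULL;

        keys[i].has_view = v != NULL;
        keys[i].unnormalized = s != NULL && !s->normalized_coords;
    }

    return count;
}
```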
* st/omx/enc: check uninitialized list from task releaseLeo Liu2016-07-151-2/+2
| | | | | | | | | The uninitialized list should be checked and returned. Thanks to Julien for the notification and the suggested fix. Signed-off-by: Leo Liu <[email protected]> Cc: "12.0" <[email protected]>
* nv50/ir: add missing string for SV_WORK_DIMSamuel Pitoiset2016-07-141-0/+1
| | | | | | | Fixes: 2aa1197 ("nouveau: Add support for SV_WORK_DIM") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Hans de Goede <[email protected]>
* Revert "radeon/llvm: Use alloca instructions for larger arrays"Marek Olšák2016-07-142-149/+25
| | | | | | This reverts commit 513fccdfb68e6a71180e21827f071617c93fd09b. Bioshock Infinite hangs with that.
* r600,compute: Reserve vtx 3 for kernel argumentsJan Vesely2016-07-141-3/+7
| | | | | | | | | Using vtx 0 does not work for dynamic offsets. v2: add explanatory comment Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeon/uvd: fail to create a decoder if RUVD_MSG_CREATE submission failsMarek Olšák2016-07-141-6/+9
| | | | | | This is the bare minimum for reporting the error to the user. Reviewed-by: Christian König <[email protected]>
* winsys/amdgpu: return an error on IB submission failuresMarek Olšák2016-07-142-1/+9
| | | | Reviewed-by: Christian König <[email protected]>
* gallium/radeon: add a return value to cs_flushMarek Olšák2016-07-143-9/+13
| | | | | | Required by our UVD code. Reviewed-by: Christian König <[email protected]>
* glsl/types: Use _mesa_hash_data for hashing function typesJason Ekstrand2016-07-141-14/+2
| | | | | | | | | | This is way better than the stupid string approach especially since you could overflow the string. Again, I thought I had something better at one point but it obviously got lost. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "12.0" <[email protected]>
* glsl/types: Fix function type comparison functionJason Ekstrand2016-07-141-1/+1
| | | | | | | | | | It was returning true if the function types have different lengths rather than false. This was new with the SPIR-V to NIR pass and I thought I'd fixed it a while ago but it may have gotten lost in rebasing somewhere. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "12.0" <[email protected]>
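The bug class is easy to see in a reduced form. The sketch below is not the glsl_type code; the struct and the use of integer type IDs are assumptions for illustration. The relevant line is the early return: a length mismatch must make the comparison fail, not succeed.

```c
#include <stdbool.h>

/* Hypothetical reduced function type: a parameter count plus type IDs. */
struct function_type {
    unsigned length;
    const int *param_type_ids;
};

static bool
function_type_equal(const struct function_type *a,
                    const struct function_type *b)
{
    /* Different lengths can never be the same type. Returning true here
     * is the kind of mistake the commit message describes. */
    if (a->length != b->length)
        return false;

    for (unsigned i = 0; i < a->length; i++) {
        if (a->param_type_ids[i] != b->param_type_ids[i])
            return false;
    }

    return true;
}
```

When such a comparison backs a hash table of interned types, a false "equal" would let two distinct function types alias each other.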
* freedreno/a4xx: Fix sign compare warnings[email protected]2016-07-141-7/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: Fix sign compare warnings[email protected]2016-07-141-7/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a2xx: Fix sign compare warnings[email protected]2016-07-141-4/+4
| | | | Signed-off-by: Rob Clark <[email protected]>
* radeon/vce: handle newly added parametersBoyuan Zhang2016-07-141-13/+20
| | | | | | | Replace the previous hardcoded value with newly defined parameters Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]>
* st/omx: assign previous values to new structureBoyuan Zhang2016-07-141-0/+10
| | | | | | | | Assign previously hardcoded values for OMX to newly defined structure. As a result, OMX behaviour will not change at all. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]>
* vl: add parameters for VAAPI encodeBoyuan Zhang2016-07-141-0/+33
| | | | | | | | Allow specifying more parameters in the encoding interface that were previously hardcoded in the encoder. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]>
* st/mesa: fix reference counting bug in st_vdpauChristian König2016-07-141-2/+8
| | | | | | | | | Otherwise we leak the resources created for the DMA-buf descriptors. Signed-off-by: Christian König <[email protected]> Cc: 12.0 <[email protected]> Tested-and-Reviewed by: Leo Liu <[email protected]> Ack-by: Tom St Denis <[email protected]>
* vc4: Emit resets of the uniform stream at the starts of blocks.Eric Anholt2016-07-139-0/+167
| | | | | | | | If a block might be entered from multiple locations, then the uniform stream will (probably) be at different points, and we need to make sure that it's pointing where we expect it to be. The kernel also enforces that any block reading a uniform resets uniforms, to prevent reading outside of the uniform stream by using looping.
* vc4: Add support for scheduling of branch instructions.Eric Anholt2016-07-132-17/+114
| | | | For now we don't fill the delay slots, and instead just drop in NOPs.
* vc4: Move the QPU instructions to schedule into each block.Eric Anholt2016-07-134-141/+180
| | | | We'll want to schedule them individually, to handle delay slots.
* vc4: Disable vc4_opt_vpm in the presence of control flow.Eric Anholt2016-07-131-0/+5
| | | | | | It's a really valuable pass currently, but it will be a mess to rewrite for control flow. For now, just disable it if we have multiple blocks present.
* vc4: Convert vc4_opt_dead_code to work in the presence of control flow.Eric Anholt2016-07-131-18/+29
| | | | | | | | | | | | With control flow, we can't be sure that we'll see the uses of a variable before its def as we walk backwards. Given that NIR is eliminating our long chains of dead code, a simple solution for now seems fine. This slightly changes the order of some optimizations, and so an opt_vpm happens before opt_dce, causing 3 dead MOVs to be turned into dead FMAXes in Minecraft: instructions in affected programs: 52 -> 54 (3.85%)
* vc4: Update copy propagation for control flow.Eric Anholt2016-07-131-62/+137
| | | | | | | | | | | | | | Previously, we could assume that a MOV from a temp was always an available copy, because all temps were SSA in NIR, and their non-SSA state in QIR was just due to the fact that they were from a bcsel or pack_unorm_4x8, so we could use the current value of the temp after that series of QIR instructions to define it. However, this is no longer the case with control flow. Instead, we track a new array of MOVs defined within the block that haven't had their source or dest killed yet, and use that primarily. We fall back to looking through the QIR defs array to handle across-block MOVs, but now require that copies from the SSA defs have an SSA src as well.
* i965/fs: emit DIM instruction to load 64-bit immediates in HSWSamuel Iglesias Gonsálvez2016-07-141-0/+10
| | | | | | | | v2 (Matt): - Use brw_imm_df() as source argument of DIM instruction. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: set DF imm value to the source of DIMSamuel Iglesias Gonsálvez2016-07-141-1/+2
| | | | | | | | | | | | | | | According to HSW's PRM, vol02b, the DIM instruction has the following restriction: "Restriction : src0 must be immediate. src0 must specify the :f (F, Float) type encoding but is an immediate 64-bit DF (Double Float) value. dst must have type DF." This commit allows uploading the immediate 64-bit DF value to the source of a DIM instruction even when it is of float type encoding. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: enable the emission of the DIM instructionSamuel Iglesias Gonsálvez2016-07-1410-2/+23
| | | | | | | | | | v2 (Matt): - Take a DF source argument for the DIM instruction emission in the visitors. - Indentation. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* anv: Add a stub for CmdCopyQueryPoolResults on Ivy BridgeJason Ekstrand2016-07-131-0/+13
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* i965: fix compiler warnings for 32bit buildTimothy Arceri2016-07-142-26/+26
| | | | Reviewed-by: Matt Turner <[email protected]>
* Revert "gallium: Force blend color to 16-byte alignment"Tim Rowley2016-07-131-11/+1
| | | | | | | | | | | | This reverts commit d8d6091a846ac2a40a011d512d6d57f6c8442e6a. Heap allocations may be only 8-byte aligned on 32-bit systems, and so having members with 16-byte alignment (such as in the case where pipe_blend_color is embedded in radeonsi's si_context) is undefined behavior which indeed causes crashes when compiled with gcc -O3. Cc: <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835 Signed-off-by: Tim Rowley <[email protected]> Acked-by: Chuck Atkins <[email protected]>
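The alignment problem can be illustrated with C11 alignment specifiers. The struct names below are hypothetical and do not reproduce pipe_blend_color or si_context. The sketch shows how a 16-byte-aligned member raises the alignment requirement of every struct that embeds it, so an allocator that only guarantees 8-byte alignment can hand back storage the compiler is entitled to assume is 16-byte aligned.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Plain colour: aligned like its float members. */
struct blend_color_plain {
    float color[4];
};

/* Colour forced to 16-byte alignment, as the reverted change did. */
struct blend_color_aligned {
    alignas(16) float color[4];
};

/* Any struct embedding the aligned colour inherits the 16-byte requirement. */
struct context_like {
    int id;
    struct blend_color_aligned blend_color;
};

/* Check whether a pointer satisfies a given alignment. */
static int
pointer_is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

If `malloc` returns an address that is only 8-byte aligned, placing a `struct context_like` there violates its alignment requirement, and aligned vector loads emitted at higher optimization levels may fault.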