summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* i965/fs: Patch the instruction generating discards; don't use CMP.Z.Kenneth Graunke2015-02-271-2/+3
| | | | | | | | | | | | | | | | | | | CMP.Z doesn't work on Gen4-5 because the boolean isn't guaranteed to be 0 or 0xFFFFFFFF - only the low bit is defined. We can call emit_bool_to_cond_code to generate the condition in f0.0; the last instruction will generate the flag value. We can patch it to use f0.1, and negate the condition. Fixes discard tests on Gen4-5. Haswell shader-db stats: total instructions in shared programs: 5770279 -> 5769112 (-0.02%) instructions in affected programs: 64342 -> 63175 (-1.81%) helped: 1069 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Introduce brw_negate_cmod().Kenneth Graunke2015-02-272-0/+23
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* swrast: replace INLINE with inlineBrian Paul2015-02-262-6/+6
| | | | Reviewed-by: Alex Deucher <[email protected]>
* radeon: replace INLINE with inlineBrian Paul2015-02-265-8/+8
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r200: replace INLINE with inlineBrian Paul2015-02-263-4/+4
| | | | Reviewed-by: Alex Deucher <[email protected]>
* i915: replace INLINE with inlineBrian Paul2015-02-2610-22/+22
| | | | Reviewed-by: Alex Deucher <[email protected]>
* mesa: don't include math.h in compiler.hBrian Paul2015-02-262-0/+2
| | | | | | | | Not needed by anything in that header. Include math.h or c99_math.h where needed instead. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* mesa: include stdarg.h only where it's usedBrian Paul2015-02-261-0/+1
| | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* meta: In pbo_{Get,}TexSubImage don't repeatedly rebind the source texNeil Roberts2015-02-261-4/+0
| | | | | | | | A layered PBO image is now interpreted as a single tall 2D image so the z argument in _mesa_meta_bind_fbo_image is ignored. Therefore this was just redundantly rebinding the same image repeatedly. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/gen8: Use HALIGN_16 if MCS is enabled for non-MSRTAnuj Phogat2015-02-251-0/+3
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* i965: Pass pointer to miptree as function parameter in ↵Anuj Phogat2015-02-251-6/+6
| | | | | | | | | intel_horizontal_texture_alignment_unit This will be used by next patch in the series. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* i965: Allocate texture buffer in intelTexImageAnuj Phogat2015-02-251-2/+11
| | | | | | | | | before calling _mesa_meta_pbo_TexSubImage(). This will be used in later patches and will be required in Skylake to get the tile resource mode of miptree before calling _mesa_meta_pbo_TexSubImage(). Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* i965: Make a function to check the conditions to use the blitterAnuj Phogat2015-02-251-11/+29
| | | | | | | No functional changes in the patch. Just makes the code look cleaner. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* i965: Move the comment to the right placeAnuj Phogat2015-02-251-1/+1
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* i965: Fix condition to use Y tiling in blitter in intel_miptree_create()Anuj Phogat2015-02-251-3/+3
| | | | | | | Y tiling is supported in blitter on SNB+. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* meta: Pass null pointer for the pixel data to avoid unnecessary data uploadAnuj Phogat2015-02-251-1/+4
| | | | | | | to a temporary pbo created in _mesa_meta_pbo_GetTexSubImage(). Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* meta: Fix buffer object assignment to account for both pack and unpack bo'sAnuj Phogat2015-02-251-1/+1
| | | | | | | | | create_texture_for_pbo() is shared by _mesa_meta_pbo_GetTexSubImage() and _mesa_meta_pbo_TexSubImage() functions. So, we need to account for both pack and unpack buffer objects. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* meta: Use GL_STREAM_READ for pbo created with GL_PIXEL_PACK_BUFFERAnuj Phogat2015-02-251-1/+7
| | | | | | | | | | create_texture_for_pbo() is used by both _mesa_meta_pbo_GetTexSubImage() and _mesa_meta_pbo_TexSubImage() functions with different PBO targets. Use GL_STREAM_READ with GL_PIXEL_PACK_BUFFER and GL_STREAM_DRAW with GL_PIXEL_UNPACK_BUFFER. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* meta: Add assertion check for ctx->Meta->SaveStackDepthAnuj Phogat2015-02-251-0/+2
| | | | | | | before using it for derefrencing. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* meta: Do power of two samples check only for samples > 0Anuj Phogat2015-02-251-2/+2
| | | | | | | otherwise samples=0 passes the check, which is invalid. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* i965: Don't force x-tiling for 16-bpp formats on Gen>7Neil Roberts2015-02-251-3/+3
| | | | | | | | | | | | Sandybridge doesn't support y-tiling for surface formats with 16 or more bpp. There was previously an override to explicitly allow this for Gen7. However, this restriction is also removed in Gen8+ so we should use y-tiling there too. This is important to do for Skylake which doesn't support x-tiling for 3D surfaces. Reviewed-by: Ben Widawsky <[email protected]>
* dri/common: Update comment about driQueryRendererIntegerCommonAndreas Boll2015-02-251-0/+1
| | | | | | | | | Since 87d3ae0b45b6b6bb50b583dafa55eb109449a005 driQueryRendererIntegerCommon handles __DRI2_RENDERER_PREFFERED_PROFILE too. Signed-off-by: Andreas Boll <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* common: Fix PBOs for 1D_ARRAY.Laura Ekstrand2015-02-241-26/+36
| | | | | | | | | | | Corrects the way that _mesa_meta_pbo_TexSubImage and _mesa_meta_pbo_GetTexSubImage handle 1D_ARRAY textures. Fixes a failure in the Piglit arb_direct_state_access/gettextureimage-targets test. Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Laura Ekstrand <[email protected]> Cc: "10.4, 10.5" <[email protected]>
* common: Correct PBO 2D_ARRAY handling.Laura Ekstrand2015-02-241-9/+17
| | | | | | | | | | | | | | | | | | | Changes PBO uploads and downloads to use a tall (height * depth) 2D texture for blitting. This fixes the bug where 2D_ARRAY, 3D, and CUBE_MAP_ARRAY textures are not properly uploaded and downloaded. Removes the option to use a 2D ARRAY texture for the PBO during upload and download. This option didn't work because the miptree couldn't be set up reliably. v2: Review from Jason Ekstrand and Neil Roberts: -Delete the depth parameter from create_texture_for_pbo -Abandon the option to create a 2D ARRAY texture in create_texture_for_pbo Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "10.4, 10.5" <[email protected]>
* common: Correct texture init for meta pbo uploads and downloads.Laura Ekstrand2015-02-241-1/+4
| | | | | | | | | | | | | | | | | This moves the line setting immutability for the texture to after _mesa_initialize_texture_object so that the initializer function will not cancel it out. Moreover, because of the ARB_texture_view extension, immutable textures must have NumLayers > 0, or depth will equal (0-1)=0xFFFFFFFF during SURFACE_STATE setup, which triggers assertions. v2: Review from Kenneth Graunke: - Include more explanation in the commit message. - Make texture setup bug fixes into a separate patch. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "10.4, 10.5" <[email protected]>
* i965: Remove redundant discard jumps.Kenneth Graunke2015-02-242-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | With the previous optimization in place, some shaders wind up with multiple discard jumps in a row, or jumps directly to the next instruction. We can remove those. Without NIR on Haswell: total instructions in shared programs: 5777258 -> 5775872 (-0.02%) instructions in affected programs: 20312 -> 18926 (-6.82%) helped: 716 With NIR on Haswell: total instructions in shared programs: 5773163 -> 5771785 (-0.02%) instructions in affected programs: 21040 -> 19662 (-6.55%) helped: 717 v2: Use the CFG rather than the old instructions list. Presumably the placeholder halt will be in the last basic block. v3: Make sure placeholder_halt->prev isn't the head sentinel (caught twice by Eric Anholt). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Handle conditional discards.Kenneth Graunke2015-02-242-17/+26
| | | | | | | | | | | | | The discard condition tells us which channels we want killed. We want to invert that condition to get the channels that should survive (remain live) in f0.1. Emit a CMP to negate it. Nothing generates these today, but that will change shortly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* Revert "i965/fs: Remove force_writemask_all assertion for execsize < 8."Matt Turner2015-02-241-0/+1
| | | | | | | | | This reverts commit 0d8f27eab7b7e8b7a16e76aabd3f6a0ab4880497. "This doesn't seem to be necessary." <- I was wrong! Tested-by: Markus Wick <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/fs: Emit MOV(1) instructions with force_writemask_all.Matt Turner2015-02-241-0/+1
| | | | | | | Fixes rendering with Dolphin. Tested-by: Markus Wick <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/fs: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.Matt Turner2015-02-242-0/+95
| | | | | | | total instructions in shared programs: 5695356 -> 5689775 (-0.10%) instructions in affected programs: 486231 -> 480650 (-1.15%) helped: 2604 LOST: 1
* i965/fs/nir: Optimize integer multiply by a 16-bit constant.Matt Turner2015-02-241-1/+23
| | | | | | | | | | | | Gen8+ support was just broken, since MUL now consumes 32-bits from both sources. Fixes 986 piglit tests on my BDW. total instructions in shared programs: 7753873 -> 7753522 (-0.00%) instructions in affected programs: 28164 -> 27813 (-1.25%) helped: 77 GAINED: 47 Reviewed-by: Ian Romanick <[email protected]>
* i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.Matt Turner2015-02-242-0/+90
| | | | | | | | total instructions in shared programs: 7756214 -> 7753873 (-0.03%) instructions in affected programs: 455452 -> 453111 (-0.51%) helped: 2333 Reviewed-by: Eric Anholt <[email protected]>
* mesa: replace FABSF with fabsfBrian Paul2015-02-241-1/+2
| | | | Reviewed-by: Matt Turner <[email protected]>
* driconf: Update Catalan translationAlex Henrie2015-02-241-18/+34
| | | | Signed-off-by: Alex Henrie <[email protected]>
* driconf: Update Spanish translationAlex Henrie2015-02-241-5/+21
| | | | Signed-off-by: Alex Henrie <[email protected]>
* i965: Fix non-AA wide line rendering with fractional line widthsIago Toral Quiroga2015-02-244-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | "(...)Let w be the width rounded to the nearest integer (...). If the line segment has endpoints given by (x0,y0) and (x1,y1) in window coordinates, the segment with endpoints (x0,y0-(w-1)/2) and (x1,y1-(w-1/2)) is rasterized, (...)" The hardware it not rounding the line width, so we should do it. Also, we should be careful not to go beyond the hardware limits for the line width after it gets rounded. Gen6-7 define a maximum line width slightly below 8.0, so we should advertise a maximum line width lower than 7.5 to make sure that 7.0 is the maximum integer line width that we can select. Since the line width granularity in these platforms is 0.125, we choose 7.375. Other platforms advertise rounded maximum line widths, so those are fine. Fixes the following 3 dEQP tests: dEQP-GLES3.functional.rasterization.primitives.lines_wide dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.lines_wide dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.lines_wide Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: round to nearest when converting float into integerSamuel Iglesias Gonsalvez2015-02-241-4/+7
| | | | | | | | | | | | | | | | | | | | Fixes: dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_linear No piglit regressions. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Perform program state upload outside of atom handlingCarl Worth2015-02-2310-84/+89
| | | | | | | | | | | | | | | | | | | | | | | | Across the board of the various generations, the intial few atoms in all of the atom lists are basically the same, (performing uploads for the various programs). The only difference is that prior to gen6 there's an ff_gs upload in place of the later gs upload. In this commit, instead of using the atom lists for this program state upload, we add a new function brw_upload_programs that calls into the per-stage upload functions which in turn check dirty bits and return immediately if nothing needs to be done. This commit is intended to have no functional change. The motivation is that future code, (such as the shader cache), wants to have a single function within which to perform various operations before and after program upload, (with some local variables holding state across the upload). It may be worth looking at whether some of the other functionality currently handled via atoms might also be more cleanly handled in a similar fashion. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix lower_load_payload() not to use an incorrect half for ↵Francisco Jerez2015-02-231-0/+8
| | | | | | immediates and uniforms. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Fix lower_load_payload() to take into account non-zero reg_offset.Francisco Jerez2015-02-231-2/+2
| | | | | | | | Fixes metadata guess when instructions in the program specify a destination register with non-zero reg_offset and when the payload of a LOAD_PAYLOAD spans several registers. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().Francisco Jerez2015-02-231-26/+13
| | | | | | | MRFs cannot be read from anyway so they cannot possibly be a valid source of LOAD_PAYLOAD. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Less broken handling of force_writemask_all in lower_load_payload().Francisco Jerez2015-02-231-7/+13
| | | | | | | | | | It's perfectly fine to read the second half of a register written with force_writemask_all from a first half MOV instruction or vice versa, and lower_load_payload shouldn't mark the whole MOV as belonging to the second half in that case. Replicate the same metadata to both halves of the destination when writemasking is disabled. Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: Use assert() instead of ASSERT wrapper.Matt Turner2015-02-2315-37/+37
| | | | Acked-by: Eric Anholt <[email protected]>
* i965: Link test programs with gtest before pthreads.Matt Turner2015-02-231-10/+10
| | | | | Cc: "10.5" <[email protected]> Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540962
* osmesa: add gallium include dirs to Makefile.amBrian Paul2015-02-231-0/+2
| | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89260 Reviewed-by: Jose Fonseca <[email protected]>
* i965/skl: Use 1 register for uniform pull constant payloadBen Widawsky2015-02-221-1/+1
| | | | | | | | | | | | | | When under dispatch_width=16 the previous code would allocate 2 registers for the payload when only one is needed. This manifested itself through bugs on SKL which needs to mess with this instruction. Ken though this might impact shader-db, but apparently it doesn't Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999 Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Timo Aaltonen <[email protected]>
* i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate dataJordan Justen2015-02-211-2/+2
| | | | | | | | | | | The brw_imm_ud will yield a HW_REG which then will introduce a barrier for certain optimization opportunities. No piglit regressions seen with gen8 (simd8vs). Suggested-by: Matt Turner <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Set pixel/sample mask for compute shaders atomic opsJordan Justen2015-02-211-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | For fragment programs, we pull this mask from the payload header. The same mask doesn't exist for compute shaders, so we set all bits to enabled. Previously we were setting 0xff to support SIMD8 VS, but with CS we support SIMD16, and therefore we change this to 0xffff. Related commits for SIMD8 VS: commit d9cd982d556be560af3bcbcdaf62b6b93eb934a5 Author: Ben Widawsky <[email protected]> Date: Sun Feb 15 20:06:59 2015 -0800 i965/simd8vs: Fix SIMD8 atomics commit 4a95be9772a255776309f23180519a4a8560f2dd Author: Jordan Justen <[email protected]> Date: Tue Feb 17 09:57:35 2015 -0800 i965/simd8vs: Fix SIMD8 atomics (read-only) Note: this mask is ANDed with the execution mask, so some channels may not end up issuing the atomic operation. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* drivers/x11: add gallium include dirs to Makefile.amBrian Paul2015-02-201-0/+2
| | | | | | Fixes xlib driver build after e8c5cbfd921680c. Acked-by: Matt Turner <[email protected]>
* util: Move Mesa's bitset.h to util/.Eric Anholt2015-02-205-5/+5
| | | | Reviewed-by: Jose Fonseca <[email protected]>