path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* i965/fs: use determine_interpolation_mode().Paul Berry2011-10-271-4/+4
| | | | | | | | | | | | | | This patch changes how fs_visitor::emit_general_interpolation() decides what kind of interpolation to do. Previously, it used the shade model to determine how to interpolate colors, and used smooth interpolation on everything else. Now it uses ir_variable::determine_interpolation_mode(), so that it respects GLSL 1.30 interpolation qualifiers. Fixes piglit tests interpolation-flat-*-smooth-{distance,fixed,vertex} and interpolation-flat-other-flat-{distance,fixed,vertex}. Reviewed-by: Eric Anholt <[email protected]>
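The decision described above can be sketched in C as follows. This is a minimal illustration, not the Mesa implementation: the enum, the function name `pick_interpolation_mode` and its parameters are hypothetical, and the fallback rule (colors follow the shade model, everything else is smooth) is taken from the commit message's description of the previous behavior.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical qualifier names; they only mirror the commit description. */
enum interp_qualifier {
   INTERP_NONE,          /* no qualifier written in the shader */
   INTERP_SMOOTH,
   INTERP_FLAT,
   INTERP_NOPERSPECTIVE
};

/* An explicit GLSL 1.30 qualifier wins. Without one, colors follow the
 * shade model and every other input is interpolated smoothly. */
static enum interp_qualifier
pick_interpolation_mode(enum interp_qualifier declared,
                        bool is_color, bool flat_shade)
{
   assert(declared <= INTERP_NOPERSPECTIVE);

   if (declared != INTERP_NONE)
      return declared;
   if (is_color && flat_shade)
      return INTERP_FLAT;
   return INTERP_SMOOTH;
}
```

With this shape, a `flat` qualifier on a non-color input is honored, which is the case the old shade-model-only logic could not express.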
* i965/gen6+: Parameterize barycentric interpolation modes.Paul Berry2011-10-2710-38/+103
| | | | | | | | | | | | | | | | | This patch modifies the fragment shader back-end so that instead of using a single delta_x/delta_y register pair to store barycentric coordinates, it uses an array of such register pairs, one for each possible interpolation mode. When setting up the WM, we instruct it to only provide the barycentric coordinates that are actually needed by the fragment shader--that is computed by brw_compute_barycentric_interp_modes(). Currently this function returns just BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC, because this is the only interpolation mode we support. However, that will change in a later patch. Reviewed-by: Eric Anholt <[email protected]>
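As a rough sketch of the "array of register pairs, one per interpolation mode" idea, the following C fragment assigns a contiguous delta_x/delta_y pair to each mode requested in a bitmask. The mode list, the `reg_pair` struct, the function name and the one-register-per-coordinate numbering are all assumptions made for illustration; the real payload layout is not described in the commit message.

```c
#include <assert.h>

/* Hypothetical mode list; the commit only names the perspective-pixel mode. */
enum bary_mode {
   BARY_PERSPECTIVE_PIXEL,
   BARY_PERSPECTIVE_CENTROID,
   BARY_MODE_COUNT
};

/* Register numbers for one barycentric pair, or -1 when the mode is unused. */
struct reg_pair {
   int delta_x;
   int delta_y;
};

/* Give every mode set in needed_mask a contiguous pair starting at
 * first_reg, and mark the others unused. Returns the next free register. */
static int
assign_bary_regs(unsigned needed_mask, int first_reg,
                 struct reg_pair out[BARY_MODE_COUNT])
{
   int reg = first_reg;

   assert(needed_mask < (1u << BARY_MODE_COUNT));

   for (int m = 0; m < BARY_MODE_COUNT; m++) {
      if (needed_mask & (1u << m)) {
         out[m].delta_x = reg;
         out[m].delta_y = reg + 1;
         reg += 2;
      } else {
         out[m].delta_x = -1;
         out[m].delta_y = -1;
      }
   }
   return reg;
}
```

Passing a mask with only the perspective-pixel bit set reproduces the single-pair behavior the commit says is currently the only one in use.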
* i965/fs: Fix split_virtual_grfs() when delta_xy not in a virtual register.Paul Berry2011-10-271-1/+1
| | | | | | | | | | | | | | This patch modifies the special case in fs_visitor::split_virtual_grfs() that prevents splitting from being applied to the delta_x/delta_y register pair (this register pair needs to remain contiguous so that it can be used by the PLN instruction). When gen>=6, this register pair is in a fixed location, not a virtual register, so it was in no danger of being split. And split_virtual_grfs' attempt not to split it was preventing some other unrelated register from being split. Reviewed-by: Eric Anholt <[email protected]>
* intel: Drop texture border support code.Eric Anholt2011-10-265-93/+29
| | | | | | | | | | Now that texture borders are gone, we never need to allocate our textures through non-miptrees, which simplifies some irritating paths. v2: Remove the !mt support case from intel_map_texture_image() Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Brian Paul <[email protected]>
* intel: Enable stripping of texture borders.Eric Anholt2011-10-261-0/+2
| | | | | | | | | | | | | | | | | | | This replaces software rendering of textures with the deprecated 1-pixel border (which is always bad, since mipmapping is rather broken in swrast, and GLSL 1.30 is unsupported) with hardware rendering that just pretends there was never a border (so you have potential seams on apps that actually intentionally used the 1-pixel borders, but correct rendering otherwise). This doesn't regress any piglit tests on gen6 (since the texwrap border/bordercolor cases already failed due to broken border color handling), but regresses texwrap border cases on original gen4 since those end up sampling the border color instead of the border pixels. It's a small price to pay for not thinking about texture borders any more. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* glsl: Add uniform_locations_assigned parameter to do_dead_code opt passIan Romanick2011-10-251-1/+2
| | | | | | | | | | | | | | | | | Setting this flag prevents declarations of uniforms from being removed from the IR. Since the IR is directly used by several API functions that query uniforms in shaders, uniform declarations cannot be removed after the locations have been set. However, it should still be safe to reorder the declarations (this is not tested). Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41980 Tested-by: Brian Paul <[email protected]> Reviewed-by: Bryan Cain <[email protected]> Cc: Vinson Lee <[email protected]> Cc: José Fonseca <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Yuanhan Liu <[email protected]>
* i965: Add more #defines for Gen6+ 3DSTATE_GS fields.Kenneth Graunke2011-10-251-0/+8
| | | | | | | These should be useful for doing transform feedback on Sandybridge. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Add new brw_context::max_gs_threads constant.Kenneth Graunke2011-10-252-0/+8
| | | | | | | | These are correct to the best of my knowledge, gleaned from a variety of internal sources. Sadly, the Sandybridge PRM has incorrect limits. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Rename (vs|wm)_max_threads to max_(vs|wm)_threads for consistency.Kenneth Graunke2011-10-2510-24/+29
| | | | | | | | | The inconsistency between vs_max_threads and max_vs_entries was rather annoying. I could never seem to remember which one was reversed, which made it harder to find quickly. "Max __ Threads" seems more natural. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove "single threaded" INTEL_DEBUG mode.Kenneth Graunke2011-10-255-18/+4
| | | | | | | | | | | | | | | | | According to the docs for 3DSTATE_PS (Gen7+) and 3DSTATE_WM (Gen6), there is a platform-dependent value for the minimum number of pixel shader threads. It may also vary based on whether WIZ Hashing is on. For example, Ivybridge requires at least 4 threads if WIZ hashing is disabled, and 8 if it's enabled. Programming it to use fewer threads is illegal. Sandybridge appears to have similar restrictions. So on newer platforms, INTEL_DEBUG=sing will probably just hang the GPU. Rather than try to patch it up for newer platforms and extend it to support geometry shaders, just remove it as it isn't that useful anyway. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel: Kill dead code in intel_miptree_copy_teximage()Chad Versace2011-10-251-59/+28
| | | | | | | | | | | | | Kill the code paths taken when src_mt is null. It is never null, otherwise there would be a segfault on line 4 of this function: GLuint width = src_mt->level[level].width; (Some interleaved lines in the diff make the real diff non-obvious. All I did was delete some code and then left-shifted what remained to correct the indentation.) Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Apply post-sync non-zero workaround to homebrew workaround.Kenneth Graunke2011-10-241-0/+2
| | | | | | | | | | | | | | | | | | In commit 3e5d3626, Eric added a homebrew workaround to fix GPU hangs in the Mesa "engine" demo and oglc's api-texcoord test. Unfortunately, his PIPE_CONTROL contains a Depth Stall, which necessitates the post-sync non-zero workaround. Fixes GPU hangs in Civilization 4, PlaneShift, and 3DMMES. Hopefully Heroes of Newerth as well, though I haven't tested that. NOTE: This is a candidate for the 7.11 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40324 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41096 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-and-tested-by: Eric Anholt <[email protected]>
* intel: remove duplicated #include of texstore.hBrian Paul2011-10-231-1/+0
|
* radeon: remove unnecessary #includes of texstore.hBrian Paul2011-10-234-4/+0
|
* mesa: add swrast_texture_image::BufferBrian Paul2011-10-232-12/+12
| | | | | | | | | | | | | | | In the past, swrast_texture_image::Data has been overloaded. It could either point to malloc'd memory storing texture data, or it could point to a current mapping of GPU memory. Now, Buffer always points to malloc'd memory (if we're not using GPU memory) and Data always points to mapped memory. The next step would be to rename Data -> Map. This change also involves adding swrast functions for mapping textures and renderbuffers prior to rendering to set up the Data pointer, plus corresponding functions to unmap textures and renderbuffers. This is very much like similar code in the dri drivers.
* mesa: remove _mesa_alloc_texmemory(), _mesa_free_texmemory()Brian Paul2011-10-235-7/+9
| | | | Core Mesa no longer does any texture memory allocation.
* mesa: move gl_texture_image::Data, RowStride, ImageOffsets to swrastBrian Paul2011-10-2314-107/+116
| | | | | | Only swrast and the drivers that fall back to swrast need these fields now. This removes the last of the fields related to software rendering from gl_texture_image.
* i965: Set MaxIfDepth to UINT_MAX on Gen6+ and 16 on prior generations.Kenneth Graunke2011-10-211-0/+1
| | | | | | | | | | | | | Commit 488fe51cf823ccd137c667f1e92dd86f8323b723 converted the EmitNoIfs flag to MaxIfDepth, an unsigned integer saying "flatten if-statements nested beyond this depth." Unfortunately, i965 left this initialized to 0, which made ir_to_mesa attempt to flatten all if-statements. We didn't notice right away because we usually throw away ir_to_mesa's code in favor of the native VS and FS backends...but this still creates a lot of unnecessary work. Signed-off-by: Kenneth Graunke <[email protected]>
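The effect of the initial value can be illustrated with a small C sketch. The function names are hypothetical, and the comparison `nesting_depth > max_if_depth` is an assumed reading of "flatten if-statements nested beyond this depth"; the limits themselves (UINT_MAX on Gen6+, 16 before) come from the commit title.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Limit from the commit title: unlimited on Gen6+, 16 on earlier parts. */
static unsigned
max_if_depth_for_gen(int gen)
{
   assert(gen > 0);
   return gen >= 6 ? UINT_MAX : 16;
}

/* Assumed interpretation: an if-statement is flattened once its nesting
 * depth (1 for an outermost "if") exceeds the limit. */
static bool
should_flatten_if(unsigned nesting_depth, unsigned max_if_depth)
{
   return nesting_depth > max_if_depth;
}
```

Under this reading, a limit left at 0 flattens even an outermost if-statement, which matches the "flatten all if-statements" symptom described above.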
* i965: Remove copy and pasted gen7_wm_constants state atom.Kenneth Graunke2011-10-202-56/+1
| | | | | | | Now that this is identical to gen6_wm_constants, just use that instead. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Use AUB_TRACE_WM_CONSTANTS in gen7_prepare_wm_push_constants.Kenneth Graunke2011-10-201-1/+1
| | | | | | | | This makes it match gen6_prepare_wm_push_constants. For some reason, it had been using AUB_TRACE_NO_TYPE. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix incorrect dirty bit in gen6_prepare_wm_push_constants.Kenneth Graunke2011-10-201-2/+2
| | | | | | | | | We definitely want CACHE_NEW_WM_PROG, not CACHE_NEW_VS_PROG. NOTE: This is a candidate for the 7.11 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/vs: Fix comparisons with uint negation.Eric Anholt2011-10-203-0/+32
| | | | | | | | | | The condmod instruction ends up generating garbage condition codes, because apparently the comparison happens on the accumulator value (33 bits for UD), not the truncated value that would be written. Fixes vs-op-neg-* Reviewed-by: Ian Romanick <[email protected]>
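Why comparing a wider intermediate value gives a different answer than comparing the truncated result can be shown with plain integer arithmetic. The sketch below is a toy model only: it uses a 64-bit signed value to stand in for the wider accumulator and does not claim to reproduce the hardware's exact condition-code logic. Both function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What a shader author expects: negate, truncate to 32 bits as the
 * destination would, then compare the written value against zero. */
static bool
neg_is_positive_truncated(uint32_t x)
{
   uint32_t written = (uint32_t)(0u - x);
   return written > 0;
}

/* Toy model of comparing before truncation: the negated value is kept
 * in a wider signed type, so -1 stays negative instead of wrapping. */
static bool
neg_is_positive_wide(uint32_t x)
{
   int64_t wide = -(int64_t)x;
   return wide > 0;
}
```

For `x = 1` the truncated result is 0xFFFFFFFF, which is greater than zero as an unsigned value, while the wide value is -1, which is not; the two comparisons disagree for every nonzero `x`.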
* i965/fs: Fix comparisons with uint negation.Eric Anholt2011-10-204-0/+49
| | | | | | | | | | The condmod instruction ends up generating garbage condition codes, because apparently the comparison happens on the accumulator value (33 bits for UD), not the truncated value that would be written. Fixes fs-op-neg-* Reviewed-by: Ian Romanick <[email protected]>
* i965: silence signed/unsigned comparison warningBrian Paul2011-10-191-1/+2
| | | | Reviewed-by: Paul Berry <[email protected]>
* i965: setup address rounding enable bitsYuanhan Liu2011-10-193-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | This patch (based on a reading of the emulator) came out of trying to fix the oglc pbo texImage.1PBODefaults failure. That case generates a texture whose width and height equal the window's width and height, then textures it across the whole window, so there is exactly one texel per pixel. The min and mag filters are both GL_LINEAR. The test passes with swrast, as expected, but fails with the i965 driver. The difference is not visible on screen, as the error is tiny. From my digging, a small error seems to creep in while computing the texture address. This breaks the one-texel-per-pixel rule, so the linearly filtered result is used and carries that small error. This patch fixes all oglc pbo subcases that fail with the same issue on ILK, SNB and IVB. v2: comments from Ian: make the address_round field assignment consistent. (The sampler is already memset to 0 by the xxx_update_sampler_state caller, so no need to assign 0 first.) Signed-off-by: Yuanhan Liu <[email protected]>
* i915: make i830/i915_hiz_resolve_noop() staticBrian Paul2011-10-182-2/+2
|
* i965: remove unused vars in brw_set_ff_sync_message()Brian Paul2011-10-181-3/+0
|
* i965: Disassemble Ivybridge Data Port/Data Cache messages.Kenneth Graunke2011-10-181-0/+8
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Document most of the brw_instruction message structs.Kenneth Graunke2011-10-181-39/+79
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Rename pixel_scoreboard_clear to last_render_target for clarity.Kenneth Graunke2011-10-185-16/+16
| | | | | | | | | | | | | | | | | Finding this bit in the documentation proved challenging. It wasn't in the SEND instruction's message descriptor section, nor the data port message descriptor section. It turns out to be part of the Render Target Write message's control bits, and in the documentation is named "Last Render Target Select". Shaders that use Multiple Render Targets should set this bit on the last RT write, but not on any prior ones. The GPU does update the Pixel Scoreboard appropriately, but doesn't document this bit as directly causing a scoreboard clear. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove duplicate copies of mlen & rlen from instruction decode.Kenneth Graunke2011-10-181-13/+4
| | | | | | | | | | | | | | | | | After printing the details of a specific message, we always print out the message length and response length with nice "mlen" and "rlen" labels. For Gen5+ URB writes, we were dumping mlen and rlen a second time: urb 0 urb_write interleave used complete mlen 5, rlen 0 mlen 5 rlen 0 Also, for Gen6 data port messages, we were including mlen and rlen in the tuple of undecipherable integers. Both of these are completely redundant. So, remove them. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Factor out code for setting Message Descriptors.Kenneth Graunke2011-10-181-129/+77
| | | | | | | | | | | | | | | | | | Every brw_set_???_message function had duplicated code, per-generation, to set the Message Descriptor and Extended Message Descriptor bits (SFID, message length, response length, header present, end of thread). However, these fields are actually specified as part of the SEND instruction itself; individual types of messages don't even specify them (except for header present, but that's in the same bit location). Since these are exactly the same regardless of the message type, just create a function to set them, using the generic message structs. This not only shortens the code, but hides a lot of the per-generation complexity (like the SFID being in destreg__conditionalmod) in one spot. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
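The refactoring pattern described above can be sketched as one shared setter that every message-specific helper calls. Everything named here is hypothetical: the flat `msg_desc` struct, the SFID values and the helper names are placeholders, and the per-generation bit layouts that the real function hides are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder SFID values, chosen arbitrarily for the example. */
enum hyp_sfid { HYP_SFID_SAMPLER = 1, HYP_SFID_URB = 2 };

/* Hypothetical flat view of the fields common to every SEND message. */
struct msg_desc {
   uint8_t sfid;
   uint8_t msg_length;
   uint8_t response_length;
   bool header_present;
   bool end_of_thread;
};

/* The one place that writes the common fields. */
static void
set_message_descriptor(struct msg_desc *d, enum hyp_sfid sfid,
                       uint8_t mlen, uint8_t rlen,
                       bool header_present, bool end_of_thread)
{
   assert(d != NULL);
   d->sfid = (uint8_t)sfid;
   d->msg_length = mlen;
   d->response_length = rlen;
   d->header_present = header_present;
   d->end_of_thread = end_of_thread;
}

/* Message-specific helpers only add what is unique to them. */
static void
set_urb_message(struct msg_desc *d, uint8_t mlen, uint8_t rlen, bool eot)
{
   set_message_descriptor(d, HYP_SFID_URB, mlen, rlen, true, eot);
}

static void
set_sampler_message(struct msg_desc *d, uint8_t mlen, uint8_t rlen,
                    bool header_present)
{
   /* A sampler message never ends the thread in this sketch. */
   set_message_descriptor(d, HYP_SFID_SAMPLER, mlen, rlen,
                          header_present, false);
}
```

Keeping the common fields in one function means a generation-specific encoding change would touch a single site instead of every message helper.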
* i965: Remove EOT parameter from brw_SAMPLE and brw_set_sampler_message.Kenneth Graunke2011-10-184-13/+5
| | | | | | | | | | | | | The existing code asserted that eot == 0, as it doesn't make sense for a thread to sample a texture as the last thing it does. It doesn't make much sense to pass around a dead parameter either. Especially for a function which already has a long parameter list. So, remove the parameter and just set EOT to 0. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Document the brw_instruction Message Descriptor structures.Kenneth Graunke2011-10-181-2/+27
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Rename BRW_MESSAGE_TARGET_* to BRW_SFID_* and document them.Kenneth Graunke2011-10-183-60/+75
| | | | | | | | | | | | | | | When reading the data port code, it was not clear to me what these values meant, nor where I could find them in the documentation. Especially since the latest BSpec and older PRMs document them in radically different places...neither of which are near the descriptions of individual messages. Cite the documentation, and rename them to SFID to signify that these are Shared Function IDs that one can read about in the GPU overview, rather than arbitrary bitfields. While we're at it, make them an enum. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Clarify check for which cache to use on Gen6 data port reads.Kenneth Graunke2011-10-181-3/+3
| | | | | | | | | | | | | | Currently, we use the Render Cache for scratch access (read/write data) and the Sampler Cache for all read only data (pull constants). Reversing the condition here is clearer: if the caller requested the Render Cache, use that. Otherwise, they requested the Data Cache (which does not exist on Gen6) or Sampler Cache, so use the Sampler Cache. This should not change behavior in any way. Signed-off-by: Kenneth Graunke <[email protected]>
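The reversed condition reads roughly like the C sketch below. The enum and function names are invented for illustration; only the rule itself (Render Cache if requested, otherwise Sampler Cache, because Gen6 has no Data Cache) comes from the commit message.

```c
#include <assert.h>

/* What the caller asked for (hypothetical names). */
enum cache_request { REQ_DATA_CACHE, REQ_SAMPLER_CACHE, REQ_RENDER_CACHE };

/* What a Gen6 data port read can actually use in this sketch. */
enum gen6_cache { GEN6_USE_RENDER_CACHE, GEN6_USE_SAMPLER_CACHE };

/* Test for the one special case first; everything else, including the
 * Data Cache that Gen6 lacks, falls through to the Sampler Cache. */
static enum gen6_cache
gen6_pick_read_cache(enum cache_request requested)
{
   assert(requested <= REQ_RENDER_CACHE);

   if (requested == REQ_RENDER_CACHE)
      return GEN6_USE_RENDER_CACHE;
   return GEN6_USE_SAMPLER_CACHE;
}
```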
* i965: Use Ivybridge's "Legacy Data Port" for reads/writes.Kenneth Graunke2011-10-183-5/+16
| | | | | | | | | | | | | | | | Using the constant cache for reads isn't going to work for scratch reads (variably-indexed arrays or register spills), as these aren't constant at all. Also, in the new VS backend, use the proper message number for OWord Dual Block Write messages. It's now 10, instead of 9. +205 piglits. NOTE: This is a candidate for the 7.11 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel: Add 'mode' param to intel_region_mapChad Versace2011-10-187-16/+34
| | | | | | | | | | The 'mode' param is a bitset of GL_MAP_READ_BIT, GL_MAP_WRITE_BIT. A future commit will perform buffer resolves in intel_region_map(). So, even though the access mode is irrelevant to the GTT, the extra information allows us to intelligently avoid unnecessary buffer resolves. Signed-off-by: Chad Versace <[email protected]>
* intel: Add HiZ operations to intel_context::vtbl for all driversChad Versace2011-10-187-0/+125
| | | | | | | | | | | Add the following to the vtbl: hiz_resolve_depthbuffer hiz_resolve_hizbuffer For all drivers for which HiZ is not enabled, the methods are set to be no-ops. If HiZ is enabled, the methods are currently set to empty stubs. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Initialize intel_context::vtbl after calling intelInitContext()Chad Versace2011-10-181-1/+2
| | | | | | | | | | intel_context::gen field is set by intelInitContext(). So, by calling intelInitContext() before initializing the vtable, we can construct different vtables for different gens. Specifically, this allows us to set the HiZ operations to be no-ops for contexts for which HiZ is not enabled. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
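The ordering constraint can be shown with a small C sketch of a context whose vtable is filled in only after the base initialization has recorded the hardware generation. The struct layout, the `has_hiz` flag, the `resolves` counter and all function names are assumptions for the example; only the two vtbl entry names come from the commit log.

```c
#include <assert.h>
#include <stdbool.h>

struct ctx;

struct ctx_vtbl {
   void (*hiz_resolve_depthbuffer)(struct ctx *c);
   void (*hiz_resolve_hizbuffer)(struct ctx *c);
};

struct ctx {
   int gen;        /* set by the base init, 0 until then */
   bool has_hiz;   /* hypothetical capability flag */
   int resolves;   /* counts real resolve calls, for demonstration */
   struct ctx_vtbl vtbl;
};

static void hiz_noop(struct ctx *c) { (void)c; }
static void hiz_resolve(struct ctx *c) { c->resolves++; }

/* Stand-in for the base context init that records the generation. */
static void
init_base(struct ctx *c, int gen, bool has_hiz)
{
   c->gen = gen;
   c->has_hiz = has_hiz;
   c->resolves = 0;
}

/* Must run after init_base(), because the choice depends on its results. */
static void
init_vtbl(struct ctx *c)
{
   assert(c->gen != 0);

   if (c->has_hiz) {
      c->vtbl.hiz_resolve_depthbuffer = hiz_resolve;
      c->vtbl.hiz_resolve_hizbuffer = hiz_resolve;
   } else {
      c->vtbl.hiz_resolve_depthbuffer = hiz_noop;
      c->vtbl.hiz_resolve_hizbuffer = hiz_noop;
   }
}
```

Callers can then invoke the vtbl entries unconditionally, and contexts without HiZ pay only the cost of an empty call.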
* intel: Fix scatter/gather for depthstencil texturesChad Versace2011-10-181-5/+5
| | | | | | | During anholt's MapTextureImage refactoring, the call to intel_tex_image_s8z24_create_renderbuffers was misplaced. It needs to occur *after* the miptree is allocated. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965/gen6: Fix segfault in prepare_blend_state()Chad Versace2011-10-181-1/+1
| | | | | | | | | | | | Don't dereference the color buffer if one isn't attached. This fixes the following Piglit tests in my experimental HiZ branch: glean/logicOp glean/paths Note: This is a candidate for the stable branches. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* mesa: Add dd_function_table::PrepareExecBeginChad Versace2011-10-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This hook allows the driver to prepare for a glBegin/glEnd. i965 will use the hook to avoid recursive calls to FLUSH_VERTICES during a buffer resolve meta-op. Detailed Justification ---------------------- When vertices are queued during a glBegin/glEnd block, those vertices must of course be drawn before any rendering state changes. To ensure this, Mesa calls FLUSH_VERTICES as a prehook to such state changes. Therefore, FLUSH_VERTICES itself cannot change rendering state without falling into a recursive trap. This precludes meta-ops, namely i965 buffer resolves, from occurring while any vertices are queued. To avoid that situation, i965 must satisfy the following condition: that it queues no vertex if a buffer needs resolving. To satisfy this, i965 will use the PrepareExecBegin hook to resolve all buffers on entering a glBegin/glEnd block. -------- v2: Don't add dd_function_table::CleanupExecEnd. Anholt and I discovered that hook to be unnecessary. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* meta: Bump MAX_META_OPS_DEPTH from 2 to 8Chad Versace2011-10-181-1/+1
| | | | | | | | | When i965 uses (in the near future) meta-ops to perform buffer resolves, the meta-op stack exceeds depth 2. I bumped it to 8 because... 8 is bigger than 2, but not too big. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* meta: Add flag MESA_META_SELECT_FEEDBACKChad Versace2011-10-182-0/+28
| | | | | | | | | | | If this flag is set, then _mesa_meta_begin/end will save/restore the state of GL_SELECT and GL_FEEDBACK render modes. Intel's future buffer resolve meta-ops will require this, since buffer resolves may occur when the GL_RENDER_MODE is GL_SELECT. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Convert from GLboolean to 'bool' from stdbool.h.Kenneth Graunke2011-10-1889-732/+738
| | | | | | | | | | | | | | | | | I initially produced the patch using this bash command: for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings. Finally, I cleaned up some whitespace issues introduced by the change. Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chad Versace <[email protected]> Acked-by: Paul Berry <[email protected]>
* meta: Fix saving the active programNeil Roberts2011-10-181-1/+1
| | | | | | | | | | When saving the active program in _mesa_meta_begin, it was actually saving the fragment program instead. This means that if the application binds a program that only has a vertex shader then when the meta saved state is restored it will forget the bound program. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41969 Reviewed-by: Chad Versace <[email protected]>
* meta: fix redBits size test in get_temp_image_type()Brian Paul2011-10-131-1/+1
| | | | Fixes https://bugs.freedesktop.org/show_bug.cgi?id=41768
* i965 Gen6+: De-compact clip plane constants for old VS backend.Paul Berry2011-10-131-8/+7
| | | | | | | | | | | | | | | | | | | | | | In commit 018ea68d8780ab5baeef0b8122b8410e5e55ae6d, when I de-compacted clip planes on Gen6+, I updated both the old and new VS back-ends to reflect the change in how clip planes are stored, but I failed to change the code in gen6_vs_state.c that uploads clip plane constants when using the old VS back-end. As a result, if the set of enabled clip planes wasn't contiguous starting with 0, then clipping would not occur properly. This patch corrects gen6_vs_state.c to upload clip plane constants in the new de-compacted form. This only affects the old VS back-end (which is used for fixed-function and ARB vertex programs, not for GLSL vertex shaders). Fixes Piglit test fixed-clip-enables. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41603 Reviewed-by: Eric Anholt <[email protected]>
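The difference between the compacted and de-compacted layouts can be sketched in C as two upload routines. The function names, the fixed limit of six planes and the choice to zero disabled slots are assumptions for illustration; the commit message only states that the uploaded constants must match the de-compacted form the shader back-ends expect.

```c
#include <assert.h>
#include <string.h>

#define MAX_CLIP_PLANES 6   /* assumed limit for the example */

/* Compacted form: enabled planes are packed into slots 0..n-1.
 * Returns the number of planes written. */
static int
upload_compacted(unsigned enabled_mask,
                 float planes[MAX_CLIP_PLANES][4],
                 float out[MAX_CLIP_PLANES][4])
{
   int n = 0;

   assert(enabled_mask < (1u << MAX_CLIP_PLANES));
   memset(out, 0, sizeof(float) * MAX_CLIP_PLANES * 4);
   for (int i = 0; i < MAX_CLIP_PLANES; i++) {
      if (enabled_mask & (1u << i))
         memcpy(out[n++], planes[i], sizeof(float) * 4);
   }
   return n;
}

/* De-compacted form: plane i always lands in slot i. */
static void
upload_decompacted(unsigned enabled_mask,
                   float planes[MAX_CLIP_PLANES][4],
                   float out[MAX_CLIP_PLANES][4])
{
   assert(enabled_mask < (1u << MAX_CLIP_PLANES));
   memset(out, 0, sizeof(float) * MAX_CLIP_PLANES * 4);
   for (int i = 0; i < MAX_CLIP_PLANES; i++) {
      if (enabled_mask & (1u << i))
         memcpy(out[i], planes[i], sizeof(float) * 4);
   }
}
```

With planes 0 and 2 enabled, the compacted upload places plane 2 in slot 1, so a shader that reads slot 2 sees the wrong data. When the enabled set is contiguous from 0 the two layouts coincide, which is why only non-contiguous sets exposed the bug.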
* intel: Assert that no batch is emitted if a region is mappedChad Versace2011-10-113-1/+32
| | | | | | | | | | | | | | | | What I would prefer to assert is that, for each region that is currently mapped, no batch is emitted that uses that region's bo. However, it's much easier to implement this big hammer. Observe that this requires that the batch flush in intel_region_map() be moved to within the map_refcount guard. v2: Add comments (borrowed from anholt's reply) explaining why the assertion is a good idea. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Chad Versace <[email protected]>
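A minimal C sketch of the "big hammer" assertion follows. The structs, the `num_mapped_regions` counter and the function names are hypothetical stand-ins; the sketch only shows why the flush has to sit inside the map_refcount guard once the assertion exists.

```c
#include <assert.h>

struct region {
   int map_refcount;
};

struct context {
   int num_mapped_regions;   /* hypothetical global count of mapped regions */
   int batches_flushed;
};

/* The big hammer: no batch may be emitted while any region is mapped. */
static void
batch_flush(struct context *c)
{
   assert(c->num_mapped_regions == 0);
   c->batches_flushed++;
}

static void
region_map(struct context *c, struct region *r)
{
   /* Flush only on the first map. A nested map of an already mapped
    * region would otherwise trip the assertion in batch_flush(). */
   if (r->map_refcount++ == 0) {
      batch_flush(c);
      c->num_mapped_regions++;
   }
}

static void
region_unmap(struct context *c, struct region *r)
{
   assert(r->map_refcount > 0);
   if (--r->map_refcount == 0)
      c->num_mapped_regions--;
}
```

Mapping the same region twice flushes once, and flushing becomes legal again only after the last unmap.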