path: root/src/mesa/drivers
Commit message (Author, Age, Files, Lines)
* i965: Add helper function to find out the signedness of a register type. (Francisco Jerez, 2014-02-19, 1 file, -0/+28)
    Reviewed-by: Paul Berry <[email protected]>
* i965/vec4: Use swizzle() in the ARB_vertex_program code. (Francisco Jerez, 2014-02-19, 2 files, -24/+11)
    Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Use offset() in the ARB_fragment_program code. (Francisco Jerez, 2014-02-19, 1 file, -69/+62)
    Reviewed-by: Paul Berry <[email protected]>
* i965/fs: Remove fs_reg::retype. (Francisco Jerez, 2014-02-19, 3 files, -20/+12)
    There doesn't seem to be any reason for it to be a method, and it's surprising that the expression 'reg.retype(t)' doesn't retype its object but instead creates a temporary with the new type. Use 'retype(reg, t)' instead.
    Reviewed-by: Paul Berry <[email protected]>
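The surprise described in this commit message can be sketched in a few lines. The types below are hypothetical stand-ins (fs_reg_sketch and reg_type are not the driver's real definitions); only the call shape 'retype(reg, t)' is taken from the message.

```cpp
// Hypothetical stand-ins for illustration; not the i965 definitions.
enum class reg_type { F, D, UD };

struct fs_reg_sketch {
    unsigned reg;
    reg_type type;
};

// Free function: receives a copy and returns a modified copy, so the
// call site makes it plain that the argument itself is left untouched.
constexpr fs_reg_sketch retype(fs_reg_sketch reg, reg_type t)
{
    return fs_reg_sketch{reg.reg, t};
}
```

With a method spelled 'reg.retype(t)', a reader may reasonably expect 'reg' to change; the free-function spelling reads as an expression that yields a new value.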
* i965/vec4: Trivial improvements to the with_writemask() function. (Francisco Jerez, 2014-02-19, 3 files, -18/+15)
    Add assertion that the register is not in the HW_REG or IMM file, calculate the conjunction of the old and new mask instead of replacing the old [consistent with the behavior of brw_writemask(), causes no functional changes right now], make it static inline to let the compiler do a slightly better job at optimizing things, and shorten its name.
    v2: Assert that the new writemask is not zero to avoid undefined hardware behaviour.
    Reviewed-by: Paul Berry <[email protected]>
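A minimal sketch of the behaviour this message describes, assuming a simplified register type. dst_reg_sketch, reg_file_sketch and the shortened name writemask() are illustrative guesses, not the driver's actual code.

```cpp
#include <cassert>

// Hypothetical, simplified register description.
enum reg_file_sketch { GRF, MRF, HW_REG, IMM };

struct dst_reg_sketch {
    reg_file_sketch file;
    unsigned writemask;   // one bit per channel: x=1, y=2, z=4, w=8
};

// Non-mutating: works on a copy and returns it.
static inline dst_reg_sketch writemask(dst_reg_sketch reg, unsigned mask)
{
    // Fixed hardware registers and immediates are rejected up front.
    assert(reg.file != HW_REG && reg.file != IMM);
    // An empty resulting mask is treated as a programming error.
    assert((reg.writemask & mask) != 0);
    // Conjunction of old and new mask rather than replacement.
    reg.writemask &= mask;
    return reg;
}
```

Under these assumptions, masking a register that already writes only .yz (0x6) with .xy (0x3) yields .y (0x2), whereas plain replacement would have re-enabled .x.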
* i965: Make sure that backend_reg::type and brw_reg::type are consistent for fixed regs. (Francisco Jerez, 2014-02-19, 5 files, -0/+26)
    And define non-mutating helper functions to retype fixed and normal regs with a common interface. At some point we may want to get rid of ::fixed_hw_reg completely and have fixed regs use the normal register data members (e.g. backend_reg::reg to select a fixed GRF number, src_reg::swizzle to store the swizzle, etc.); I have the feeling that this is not the last headache we're going to get because of the multiple ways to represent the same thing and the different register interface depending on the file a register is stored in...
    Reviewed-by: Paul Berry <[email protected]>
* i965/vec4: Add non-mutating helper functions to modify src_reg::swizzle and ::negate. (Francisco Jerez, 2014-02-19, 1 file, -0/+24)
    Reviewed-by: Paul Berry <[email protected]>
* i965: Add non-mutating helper functions to modify the register offset. (Francisco Jerez, 2014-02-19, 2 files, -0/+24)
    Yes, we could avoid having four copies of essentially the same code by using templates here.
    Reviewed-by: Paul Berry <[email protected]>
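The template alternative mentioned in this message might look roughly like the sketch below. It is not what the commit does (the commit keeps separate copies), and the struct names and the reg_offset field are assumptions made for illustration.

```cpp
// Hypothetical register types; the real ones carry much more state.
struct fs_reg_sketch  { unsigned reg; unsigned reg_offset; };
struct src_reg_sketch { unsigned reg; unsigned reg_offset; };

// One generic non-mutating helper instead of one copy per register type.
// It works for any type that exposes an unsigned reg_offset member.
template <typename Reg>
constexpr Reg offset(Reg reg, unsigned delta)
{
    reg.reg_offset += delta;   // modifies the by-value copy only
    return reg;
}
```

The trade-off is the usual one: less duplication, at the cost of requiring every register type to expose the same member name.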
* i965/vec4: Fix off-by-one register class overallocation. (Francisco Jerez, 2014-02-19, 1 file, -1/+1)
    Reviewed-by: Paul Berry <[email protected]>
* i965: Unify fs_generator:: and vec4_generator::mark_surface_used as a free function. (Francisco Jerez, 2014-02-19, 7 files, -38/+32)
    This way it can be used anywhere. I need it from the visitor.
    Reviewed-by: Paul Berry <[email protected]>
* i965: Move up duplicated fields from stage-specific prog_data to brw_stage_prog_data. (Francisco Jerez, 2014-02-19, 24 files, -188/+162)
    There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and the simplification of the stage-specific brw_*_prog_data_compare.
    Reviewed-by: Paul Berry <[email protected]>
* i965/vec4: Add constructor of src_reg from a fixed hardware reg. (Francisco Jerez, 2014-02-19, 2 files, -0/+9)
    Reviewed-by: Paul Berry <[email protected]>
* i965: Enable fast depth clears. (Kenneth Graunke, 2014-02-19, 1 file, -1/+1)
    They work fine now, too.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Enable HiZ on Broadwell. (Kenneth Graunke, 2014-02-19, 1 file, -1/+1)
    It appears to work fine.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Implement HiZ resolves on Broadwell. (Kenneth Graunke, 2014-02-19, 3 files, -2/+113)
    Broadwell's 3DSTATE_WM_HZ_OP packet makes this much easier. Instead of programming the whole pipeline, we simply have to emit the depth/stencil packets, a state override, and a pipe control. Then arrange for the state to be put back. This is easily done from a single function.
    v2: Use minify(mt->logical_{width,height}0, level) in 3DSTATE_WM_HZ_OP instead of intel_mipmap_level's width/height fields. Those were based on the physical width/height, and thus wrong for MSAA buffers. Eric also deleted those fields.
    v3: Use 0xFFFF as the sample mask regardless of what the user set (as this operation is unrelated); set the drawing rectangle to the miplevel being operated on, rather than the whole surface; remove unnecessary MAX2(..., 1) around mt->logical_depth0 (all suggested by Eric Anholt).
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Refactor Gen8 depth packet emission. (Kenneth Graunke, 2014-02-19, 1 file, -72/+99)
    The existing code followed the vtable function signature, which is not a great fit: many of the parameters are unused, and the function still inspects global state, making it less reusable.
    This patch refactors the depth buffer packet emission code into a new function which takes exactly the parameters it needs, and which uses no global state. It then makes the existing vtable function call the new one.
    Ideally, we would remove the vtable function, and clean up that interface. But that can happen once HiZ is working.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Add #defines for the 3DSTATE_WM_HZ_OP packet's contents. (Kenneth Graunke, 2014-02-19, 1 file, -0/+25)
    We're going to need these to implement HiZ.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Bump generation check in code to disable HiZ at LODs > 0. (Kenneth Graunke, 2014-02-19, 1 file, -1/+1)
    Broadwell's "HiZ Resolve" operation still has the restriction that the rectangle primitive must be 8x4 aligned. So I believe we still need this.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Program 3DSTATE_HIER_DEPTH_BUFFER properly on Broadwell. (Kenneth Graunke, 2014-02-19, 1 file, -8/+17)
    HiZ buffers still don't exist, but when they do, we'll set them up.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Pull format conversion logic out of brw_depthbuffer_format. (Kenneth Graunke, 2014-02-19, 3 files, -32/+43)
    brw_depthbuffer_format is not very reusable at the moment, since it uses global state (ctx->DrawBuffer) to access a particular depth buffer. For HiZ on Broadwell, I need a function which simply converts the formats.
    However, at least one existing user of brw_depthbuffer_format really wants the existing interface. So, I've created a new function.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Bump MaxTexMbytes from 1GB to 1.5GB. (Kenneth Graunke, 2014-02-18, 1 file, -0/+1)
    Even with the other limits raised, TestProxyTexImage would still reject textures > 1GB in size. This is an artificial limit; nothing prevents us from having a larger texture. I stayed shy of 2GB to avoid the larger-than-aperture situation.
    For 3D textures, this raises the effective limit:
    - RGBA8: 645 -> 738
    - RGBA16: 512 -> 586
    - RGBA32F: 406 -> 465
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
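The 3D texture figures quoted in this message can be reproduced with a little arithmetic. The sketch below assumes a cubic texture with a single mip level, 1GB = 1024 MB and 1.5GB = 1536 MB, and 4, 8 and 16 bytes per texel for RGBA8, RGBA16 and RGBA32F. The helper name max_cube_side is made up here, and the driver's actual size check may account for more than this.

```cpp
#include <cstdint>

// Largest n such that an n x n x n texture with the given texel size
// fits into a budget expressed in megabytes (single mip level assumed).
constexpr std::uint64_t max_cube_side(std::uint64_t budget_mbytes,
                                      std::uint64_t bytes_per_texel)
{
    const std::uint64_t texels =
        budget_mbytes * 1024 * 1024 / bytes_per_texel;
    std::uint64_t n = 0;
    // Integer cube root by linear search; n stays below 1000 here.
    while ((n + 1) * (n + 1) * (n + 1) <= texels)
        ++n;
    return n;
}
```

For example, max_cube_side(1024, 4) gives 645 and max_cube_side(1536, 4) gives 738, matching the RGBA8 line in the message.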
* i965: Bump GL_MAX_CUBE_MAP_TEXTURE_SIZE to 8192. (Kenneth Graunke, 2014-02-18, 1 file, -1/+1)
    Gen4+ supports 8192x8192 cube maps. Ivybridge and later can actually support 16384, but that would place GL_MAX_CUBE_MAP_TEXTURE_SIZE above GL_MAX_TEXTURE_SIZE, which seems like a bad idea. (Unfortunately, we can't bump GL_MAX_TEXTURE_SIZE to 16384 without causing regressions due to awful W-tiled stencil buffer interactions.)
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965: Bump MAX_3D_TEXTURE_SIZE to 2048. (Kenneth Graunke, 2014-02-18, 1 file, -1/+1)
    It's highly unlikely that there will be enough memory in the system to allocate enough space for this, but we should still expose the hardware limit. It's what the Intel Windows driver does, and it seems most other vendors do likewise.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Use conditional sends to do FB writes on HSW+. (Eric Anholt, 2014-02-18, 4 files, -18/+46)
    This drops the MOVs for header setup, which are totally mis-scheduled.
    total instructions in shared programs: 1590047 -> 1589331 (-0.05%)
    instructions in affected programs: 43729 -> 43013 (-1.64%)
    GAINED: 0
    LOST: 0
    glb27-trex:
    x before
    + after
    [ASCII distribution plot of the before (x) and after (+) samples; layout not recoverable]
          N    Min    Max  Median       Avg      Stddev
    x    49  62.33  65.41   63.49  63.53449  0.62757822
    +    50  62.28   65.4    63.7   63.6982    0.656564
    No difference proven at 95.0% confidence
    Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Drop dead comment about the old proj_attrib_mask optimization. (Eric Anholt, 2014-02-18, 1 file, -6/+0)
    The code was removed early last year.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop mt->levels[].width/height. (Eric Anholt, 2014-02-18, 7 files, -42/+23)
    It often confused people because it was unclear whether it was the physical or the logical size, and people needed the other one as well. We can recompute it trivially using the minify() macro, clarifying which value is being used and making getting the other value obvious.
    v2: Fix a pasteo in intel_blit.c's dst flip.
    Reviewed-by: Chris Forbes <[email protected]> (v1)
    Reviewed-by: Kenneth Graunke <[email protected]>
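A rough sketch of the recomputation this message refers to, assuming the usual mipmap rule that each level halves the base size and never drops below 1. The constexpr function stands in for the minify() macro, and miptree_sketch with its two field names is a hypothetical simplification of the real miptree structure.

```cpp
// Size of a mip level derived from the level-0 size: halve per level,
// clamp at 1 so tiny levels never collapse to zero.
constexpr unsigned minify(unsigned value, unsigned levels)
{
    return (value >> levels) > 1u ? (value >> levels) : 1u;
}

// Hypothetical miptree with separate logical and physical base widths
// (the two can differ, e.g. for multisampled buffers).
struct miptree_sketch {
    unsigned logical_width0;
    unsigned physical_width0;
};

// The caller names the base it wants, so there is no ambiguity about
// which of the two sizes a per-level value was derived from.
constexpr unsigned logical_width(miptree_sketch mt, unsigned level)
{
    return minify(mt.logical_width0, level);
}

constexpr unsigned physical_width(miptree_sketch mt, unsigned level)
{
    return minify(mt.physical_width0, level);
}
```

The design point is that the call site spells out 'logical' or 'physical' instead of relying on a cached per-level field whose meaning has to be looked up.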
* i965: Move singlesample_mt to the renderbuffer. (Eric Anholt, 2014-02-18, 8 files, -276/+168)
    Since only window system renderbuffers can have a singlesample_mt, this lets us drop a bunch of sanity checking to make sure that we're just a renderbuffer-like thing.
    v2: Fix a badly-written comment (thanks Kenneth!), drop the now trivial helper function for set_needs_downsample.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop some duplicated code in DRI winsys BO updates. (Eric Anholt, 2014-02-18, 3 files, -110/+38)
    The only DRI2 vs DRI3 delta was just how to decide about frontbuffer-ness for doing the upsample.
    v2: Fix missing singlesample_mt->region->name update in the merged code, which would have broken the DRI2 don't-recreate-the-miptree optimization.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Simplify intel_miptree_updownsample. (Eric Anholt, 2014-02-18, 1 file, -24/+11)
    Pretty silly to pass in values dereferenced out of one of the arguments.
    v2: Get the destination size from the dst, even though the callers are always dealing with src size == dst size cases.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't try to use the ctx->ReadBuffer when asked to blorp miptrees. (Eric Anholt, 2014-02-18, 1 file, -3/+4)
    So far it's happened to be that we're only ever calling intel_miptree_blit() (up/downsampling) from the ReadBuffer, but I stumbled over a null ReadBuffer case when debugging later parts of the series.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the mt->target of multisample renderbuffers be 2D_MS. (Eric Anholt, 2014-02-18, 1 file, -3/+5)
    Mostly mt->target == 2D_MS just results in a few checks that we don't try to allocate multiple LODs and don't try to do slice copies with them. But with the introduction of binding renderbuffers to textures, we need more consistency.
    Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Push into desktop GL mode when doing meta operations. (Eric Anholt, 2014-02-18, 2 files, -23/+19)
    This lets us simplify our shaders, and rely on GLES-prohibited functionality (like ARB_texture_multisample) when writing these driver-internal functions.
    Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Fix blit shader compile on non-glsl-130 drivers. (Eric Anholt, 2014-02-18, 1 file, -1/+1)
    Compare this VS to the one for the post-130 case. Fixes piglit glsl-lod-bias, and presumably tons of other code (I haven't done a full piglit run on swrast).
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Don't try to enable FF texturing when we're using GLSL. (Eric Anholt, 2014-02-14, 1 file, -6/+3)
    On a core context, this would throw an error.
    Reviewed-by: Kenneth Graunke <[email protected]>
* nouveau: fix chipset checks for nv1a by using the oclass instead (Ilia Mirkin, 2014-02-13, 3 files, -7/+8)
    Commit f4ebcd133b9 ("dri/nouveau: NV17_3D class is not available for NV1a chipset") fixed this partially by using the correct 3d class. However there were a lot of checks left over comparing against the chipset.
    Reported-and-tested-by: John F. Godfrey <[email protected]>
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 9.2 10.0 10.1 <[email protected]>
    Reviewed-by: Francisco Jerez <[email protected]>
* meta: Add acceleration for depth glBlitFramebuffer(). (Eric Anholt, 2014-02-12, 1 file, -6/+23)
    Surprisingly, the GLSL shaders already wrote the sampled r value to FragDepth.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51600
    Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Use BindRenderbufferTexImage() for meta glBlitFramebuffer(). (Eric Anholt, 2014-02-12, 1 file, -9/+46)
    This avoids a CopyTexImage() on Intel i965 hardware without blorp.
    v2: Move the !readAtt check up higher.
    v3: Rebase on idr's changes, plus readAtt check is totally gone, and also fix a typo in a comment.
    Reviewed-by: Kenneth Graunke <[email protected]> (v2)
* i965: Add a driver hook for binding renderbuffers to textures. (Eric Anholt, 2014-02-12, 1 file, -0/+36)
    This will let us use meta's acceleration from renderbuffers without having to do a CopyTexImage first.
    This is like what we do for TFP, but just taking an existing renderbuffer and binding it to a texture with whatever its format was.
    The implementation won't work for stencil renderbuffers, and it only does non-texture renderbuffers (but then, if you're using a texture renderbuffer, you can just pull the texture object/level/slice out of the renderbuffer, anyway).
    v2: Don't forget to propagate NumSamples to the teximage.
    Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Do a massive unindent (and rename) of blitframebuffer_texture(). (Eric Anholt, 2014-02-12, 1 file, -142/+144)
    This function is only handling the color case. We can just unindent as long as we're willing to do the check for the bit outside of the function.
    v2: Rebase on idr's changes, drop readAtt check that's always non-null anyway (it's a pointer into the statically-allocated attachments array in the renderbuffer).
    Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* meta: Move glBlitFramebuffer() to a separate file. (Eric Anholt, 2014-02-12, 2 files, -420/+466)
    v2: Drop a bunch of unnecessary includes (by Kenneth), rebase on idr's changes.
    Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* meta: De-static some of meta's functions. (Eric Anholt, 2014-02-12, 2 files, -96/+159)
    I want to split some meta.c code off to a separate file, so these functions can't be static any more.
    v2: Rebase on idr's changes, also expose setup_blit_shader, blit_shader_table_cleanup, setup_vertex_objects, setup_ff_tnl_for_blit.
    Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* meta: Move the meta structures to the meta header. (Eric Anholt, 2014-02-12, 2 files, -283/+283)
    I'd like to split some of our code to separate files, since 4k lines and growing is pretty unreasonable for all these separate operations.
    v2: Rebase on idr's changes.
    Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* meta: Fold the texture setup into setup_copypix_texture(). (Eric Anholt, 2014-02-12, 1 file, -11/+9)
    There was this funny argument passed to setup for "did alloc decide we need to allocate new texture storage?", which goes away if we don't have the caller do alloc as a separate step.
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* meta: Drop the src == dst restriction on meta glBlitFramebuffer(). (Eric Anholt, 2014-02-12, 1 file, -20/+0)
    From the GL_ARB_fbo spec:
        If the source and destination buffers are identical, and the source and destination rectangles overlap, the result of the blit operation is undefined.
    As far as I know, that's the only thing that would have been of concern for this.
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* meta: Rename the "sampler" stuff to "blit shader". (Eric Anholt, 2014-02-12, 1 file, -41/+40)
    While these structs are generated per GLSL sampler type, they're structs of data-about-shaders (notably, the ID of a shader program), not data-about-samplers.
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* meta: Drop a now-trivial helper function. (Eric Anholt, 2014-02-12, 1 file, -12/+3)
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* meta: Fold the glUseProgram() into the blit program generator. (Eric Anholt, 2014-02-12, 1 file, -22/+8)
    Everyone was just immediately calling it and doing nothing else with the shader program id.
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* meta: Simplify the blit shader setup steps. (Eric Anholt, 2014-02-12, 1 file, -22/+11)
    The only thing that wants to track the glsl_sampler structure is the shader string generator.
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4: Fix confusion between SWIZZLE and BRW_SWIZZLE macros. (Francisco Jerez, 2014-02-12, 3 files, -4/+4)
    Most of the VEC4 back-end agrees on src_reg::swizzle being one of the BRW_SWIZZLE macros defined in brw_reg.h, except in two places where we use Mesa's SWIZZLE macros. There is even a doxygen comment saying that Mesa's macros are the right ones. They are incompatible swizzle representations (3 bits vs. 2 bits per component), and the code using Mesa's works by pure luck. Fix it.
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
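The incompatibility named in this message (3 bits vs. 2 bits per component) can be made concrete with two small packing functions. These are illustrative re-implementations with made-up names rather than the macros themselves, and the component numbering X=0, Y=1, Z=2, W=3 is assumed for both.

```cpp
// Mesa-style packing: 3 bits per component.
constexpr unsigned mesa_swizzle4(unsigned x, unsigned y, unsigned z, unsigned w)
{
    return x | (y << 3) | (z << 6) | (w << 9);
}

// BRW-style packing: 2 bits per component.
constexpr unsigned brw_swizzle4(unsigned x, unsigned y, unsigned z, unsigned w)
{
    return x | (y << 2) | (z << 4) | (w << 6);
}

// Read component i back out of a BRW-style value.
constexpr unsigned brw_get_swz(unsigned swz, unsigned i)
{
    return (swz >> (i * 2)) & 0x3u;
}
```

Under these assumptions the identity swizzle XYZW packs to 0x688 in the 3-bit layout and 0xE4 in the 2-bit layout, so a value built one way and decoded the other way selects the wrong components. The two layouts do coincide for a few inputs, such as XXXX, which packs to 0 in both.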
* i965/fs: Remove fs_reg::sechalf. (Francisco Jerez, 2014-02-12, 4 files, -12/+16)
    The same effect can be achieved using ::subreg_offset. Remove the less flexible alternative and define a convenience function to keep the fs_reg interface sane.
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>