path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* i965/fs: Don't set dependency hints on instructions with spilled destinations | Jason Ekstrand | 2014-10-27 | 1 | -0/+8
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Make scratch write instructions use the correct execution size | Jason Ekstrand | 2014-10-27 | 1 | -1/+1
| | | | Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Use correct spill offsets | Jason Ekstrand | 2014-10-27 | 1 | -6/+5
| | | | | | | | | | | | Different platforms require the offset to be in different units. However, the generator fixes all of this up for us and only requires an offset in bytes. Previously, we were getting this wrong all over the place. Some computed/used it correctly as bytes while others treated the offset as whole registers or computed it as bytes or bytes*2 in SIMD16 mode. This commit cleans all this up and makes us properly treat it as bytes everywhere. Reviewed-by: Kristian Høgsberg <[email protected]>
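A minimal sketch of the "bytes everywhere" convention described above. The helper names are hypothetical and not taken from the driver; the sketch assumes a 32-byte GRF register and a register-aligned scratch base.

```c
#include <assert.h>

/* Assumption: one GRF register holds 32 bytes (eight 32-bit channels). */
#define REG_SIZE_BYTES 32u

/* Hypothetical helper: byte offset of the reg_index-th register of a
 * spilled value whose scratch area starts at base_offset_bytes. */
static unsigned
spill_offset_bytes(unsigned base_offset_bytes, unsigned reg_index)
{
   /* Assumption: the scratch base is register-aligned. */
   assert(base_offset_bytes % REG_SIZE_BYTES == 0);
   return base_offset_bytes + reg_index * REG_SIZE_BYTES;
}

/* Hypothetical helper: scratch space in bytes for a value that
 * occupies num_regs registers (e.g. 2 for a SIMD16 float). */
static unsigned
spill_size_bytes(unsigned num_regs)
{
   return num_regs * REG_SIZE_BYTES;
}
```

Keeping one unit at every call site means a SIMD16 value simply advances the offset by two registers' worth of bytes, with no mode-specific scaling.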
* i965: Use the spill destination for the message header on GEN >= 7 | Jason Ekstrand | 2014-10-27 | 1 | -6/+13
| | | | Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Don't [un]spill multiple registers at a time in SIMD8 mode | Jason Ekstrand | 2014-10-27 | 1 | -2/+4
| | | | | | | | | I thought this would be a clever way to make spilling less expensive. However, it appears that the oword read/write messages we are using for spilling ignore the execution size and assume SIMD16 whenever working with more than one register. Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Use instruction execution sizes when generating scratch reads/writes | Jason Ekstrand | 2014-10-27 | 1 | -4/+4
| | | | Reviewed-by: Kristian Høgsberg <[email protected]>
* i965: Skip recalculating URB allocations if the entry size didn't change. | Eric Anholt | 2014-10-24 | 4 | -5/+19
| | | | | | | | | We only get here if the VS/GS compiled programs change, but we can even skip it if the VS/GS size didn't change. Affects cairo runtime on glamor by -1.26471% +/- 0.674335% (n=234) Reviewed-by: Kenneth Graunke <[email protected]>
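The early-out described above can be sketched as a small cache of the last-seen entry sizes. The struct and function names here are hypothetical and only illustrate the pattern; they are not the driver's actual state tracking.

```c
#include <stdbool.h>

/* Hypothetical cache of the entry sizes used for the last allocation. */
struct urb_cache {
   bool valid;
   unsigned vs_entry_size;
   unsigned gs_entry_size;
   unsigned recalcs;   /* counts how often the expensive path ran */
};

/* Returns true if the allocation was recomputed, false if skipped. */
static bool
urb_update(struct urb_cache *urb, unsigned vs_size, unsigned gs_size)
{
   if (urb->valid &&
       urb->vs_entry_size == vs_size &&
       urb->gs_entry_size == gs_size)
      return false;   /* programs changed, but the sizes did not */

   urb->vs_entry_size = vs_size;
   urb->gs_entry_size = gs_size;
   urb->valid = true;
   urb->recalcs++;    /* stand-in for the real URB recalculation */
   return true;
}
```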
* i965: Silence unused parameter warning in brw_dump_ir | Ian Romanick | 2014-10-24 | 5 | -7/+5
| | | | | | | | | | | | Just remove the parameter. Silences: brw_program.c: In function 'brw_dump_ir': brw_program.c:566:33: warning: unused parameter 'brw' [-Wunused-parameter] Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove brwIsProgramNative | Ian Romanick | 2014-10-24 | 1 | -9/+0
| | | | | | | | | | | | | | | Originally I just fixed some unused parameter warnings in this function. However, Ken pointed out: "You could instead remove this driver hook. If the dd pointer is NULL, arbprogram.c will return true. I think I'd prefer that." Way, way back in time, I think _mesa_GetProgramivARB had the opposite behavior. Given that it works the way it now works, I also prefer removing the driver hook. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove brw_new_shader_program | Ian Romanick | 2014-10-24 | 3 | -13/+0
| | | | | | | | | | It was identical to the default implementation in _mesa_new_shader_program. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Only use _mesa_ClipControl if the extension is supported | Ian Romanick | 2014-10-24 | 1 | -4/+7
| | | | | | | | | Fixes many piglit failures on IVB since 85edaa8. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85425 Reviewed-by: Jason Ekstrand <[email protected]> Cc: Mathias Fröhlich <[email protected]>
* i965/fs: Compute q-values for register allocation manually | Jason Ekstrand | 2014-10-24 | 1 | -2/+56
| | | | | | | | | | | | | | | | | | | | | Previously, we were allowing the register allocation code to do the computation for us in ra_set_finalize. However, the runtime for this computation is O(c^4 * g) where c is the number of classes and g is the number of GRF registers. These q-values are directly computable based on the way we lay out our register classes, so there is no need for the awful runtime algorithm. We were doing ok until commit 7210583eb where we bumped the number of register classes from 11 to 16. While startup times don't normally matter, this caused piglit to take 4 times as long to run on Bay Trail. This patch should make generating the ra_set much faster and melt the piglit run times. v2: Fixed a couple of bugs. I have now verified that the same q-values are generated both ways. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
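To illustrate why the q-values can be computed directly, here is a sketch under stated assumptions: each class consists of every contiguous, unaligned span of a fixed size over the base registers, and q(B, C) is the largest number of class-B registers that a single class-C register can conflict with. The commit message does not spell out the formula; the closed form below follows from that assumed layout, and the function names are hypothetical.

```c
#include <assert.h>

/* Closed form under the assumed layout: a span of c_size registers
 * overlaps spans of b_size registers starting anywhere from
 * (b_size - 1) before its start up to its last register. */
static int
q_value_closed_form(int b_size, int c_size)
{
   assert(b_size > 0 && c_size > 0);
   return b_size + c_size - 1;
}

/* Brute-force reference: for every class-C span, count the overlapping
 * class-B spans and keep the maximum. */
static int
q_value_brute_force(int num_regs, int b_size, int c_size)
{
   int max = 0;
   for (int c = 0; c + c_size <= num_regs; c++) {
      int conflicts = 0;
      for (int b = 0; b + b_size <= num_regs; b++) {
         if (b < c + c_size && c < b + b_size)
            conflicts++;
      }
      if (conflicts > max)
         max = conflicts;
   }
   return max;
}
```

The two agree whenever the register file is large enough for a span to have neighbours on both sides (at least 2 * b_size + c_size - 2 base registers), which mirrors the "same q-values are generated both ways" check mentioned in v2.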
* i965/fs: Don't interfere with too many base registers | Jason Ekstrand | 2014-10-24 | 1 | -2/+2
| | | | | | | | | | | | On older GENs in SIMD16 mode, we were accidentally building too much interference into our register classes. Since everything is divided by 2, the register allocator thinks we have 64 base registers instead of 128. The actual GRF mapping still needs to be doubled, but as far as the ra_set is concerned, we only have 64. We were accidentally adding way too much interference. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Properly precolor payload registers on GEN5 in SIMD16 | Jason Ekstrand | 2014-10-24 | 1 | -1/+10
| | | | | | | | | | | For GEN6 SIMD16 mode, we have to 2-align all the registers, so we only have the even-numbered ones. This means that we have to divide the register number by 2 when we precolor. This wasn't a problem before because we were setting up the interference between ra_node registers wrong. This will be fixed in the next commit. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
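The halving described in this commit and the one above it can be sketched as a pair of index conversions. The function names are hypothetical; reg_width is assumed to be 2 for the 2-aligned SIMD16 case and 1 otherwise.

```c
#include <assert.h>

/* Number of base registers the allocator sees: 128 GRFs become 64
 * when only even-numbered registers are usable. */
static unsigned
ra_base_reg_count(unsigned grf_count, unsigned reg_width)
{
   return grf_count / reg_width;
}

/* Hardware GRF number to allocator register index (used when precoloring). */
static unsigned
grf_to_ra_reg(unsigned grf, unsigned reg_width)
{
   assert(grf % reg_width == 0);   /* payload must be aligned */
   return grf / reg_width;
}

/* Allocator register index back to the hardware GRF number. */
static unsigned
ra_reg_to_grf(unsigned ra_reg, unsigned reg_width)
{
   return ra_reg * reg_width;
}
```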
* i965/fs: Add another use of MAX_VGRF_SIZE | Jason Ekstrand | 2014-10-24 | 1 | -1/+1
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: Handle clip control in meta operations. | Mathias Fröhlich | 2014-10-24 | 2 | -0/+9
| | | | | | | | | | | Restore clip control to the default state if MESA_META_VIEWPORT or MESA_META_DEPTH_TEST is requested. v3: Handle clip control state with MESA_META_TRANSFORM. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Froehlich <[email protected]>
* mesa: Refactor viewport transform computation. | Mathias Fröhlich | 2014-10-24 | 1 | -17/+9
| | | | | | | | | | This is for preparation of ARB_clip_control. v3: Add comments. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Froehlich <[email protected]>
* i965: Silence unused variable warning. | Matt Turner | 2014-10-23 | 1 | -2/+1
* i965/fs: Silence uninitialized variable warning. | Matt Turner | 2014-10-23 | 1 | -0/+1
| | | | | | | The compiler isn't privy to the knowledge that we're doing at least one framebuffer write. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Generate better code for ir_triop_csel. | Kenneth Graunke | 2014-10-21 | 1 | -3/+15
| | | | | | | | | | | | | | | | | | | | | | | | Previously, we generated an extra CMP instruction: cmp.ge.f0(8) g6<1>D g1<0,4,1>F 0F cmp.nz.f0(8) null g6<4,4,1>D 0D (+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F The first operand is always a boolean, and we want to predicate the SEL on that. Rather than producing a boolean value and comparing it against zero, we can just produce a condition code in the flag register. Now we generate: cmp.ge.f0(8) null g1<0,4,1>F 0F (+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F No difference in shader-db. v2: Remember to delete the old code (thanks Matt). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Simplify visit(ir_expression *)'s result_src/dst setup. | Kenneth Graunke | 2014-10-21 | 1 | -13/+6
| | | | | | | | | | | | | Using dst_reg(this, ir->type) automatically sets the writemask to the proper size for the type; src_reg(dst_reg) preserves that. This should be equivalent, but less code. Note that src_reg(dst_reg) either uses SWIZZLE_XXXX or SWIZZLE_XYZW, so the old code did need the manual writemask adjustment, since it constructed the registers the other way around. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Delete some dead code in visit(ir_expression *). | Kenneth Graunke | 2014-10-21 | 1 | -8/+0
| | | | | | | | | | | Nothing uses the vector_elements temporary variable. Setting this->result.file is dead because we overwrite this->result a few lines later. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Generate better code for ir_triop_csel. | Kenneth Graunke | 2014-10-21 | 1 | -5/+13
| | | | | | | | | | | | | | | | | | | | | | | | Previously, we generated an extra CMP instruction: cmp.ge.f0(8) g4<1>D g2<0,1,0>F 0F cmp.nz.f0(8) null g4<8,8,1>D 0D (+f0) sel(8) g120<1>F g2.4<0,1,0>F g3<0,1,0>F The first operand is always a boolean, and we want to predicate the SEL on that. Rather than producing a boolean value and comparing it against zero, we can just produce a condition code in the flag register. Now we generate: cmp.ge.f0(8) null g2<0,1,0>F 0F (+f0) sel(8) g124<1>F g2.4<0,1,0>F g3<0,1,0>F total instructions in shared programs: 5473459 -> 5473253 (-0.00%) instructions in affected programs: 6219 -> 6013 (-3.31%) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
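As a rough scalar model of the two instruction sequences above (not generated code), the following sketch shows that selecting on a flag set directly by the comparison gives the same result as materializing a boolean and then testing it against zero. The function names and the 0/1 boolean encoding are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old sequence: CMP writes a boolean register, a second CMP tests it
 * against zero to set the flag, then SEL is predicated on the flag. */
static float
csel_via_boolean(float x, float b, float c)
{
   int32_t boolean = (x >= 0.0f) ? 1 : 0;  /* assumed: true is nonzero */
   bool flag = (boolean != 0);             /* the extra CMP */
   return flag ? b : c;
}

/* New sequence: one CMP sets the flag directly, then the predicated SEL. */
static float
csel_via_flag(float x, float b, float c)
{
   bool flag = (x >= 0.0f);
   return flag ? b : c;
}
```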
* meta/msaa-blit: consider weird sample count case unreachable | Chris Forbes | 2014-10-18 | 1 | -0/+1
| | | | | | | | Suppresses a bunch of warning noise about sample_map possibly being used uninitialized. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965/fs: Change the type of booleans to UD and emit correct immediates | Jason Ekstrand | 2014-10-17 | 3 | -16/+16
| | | | | | | | | | | | | | | | | Before, we used a signed d-word for booleans and the immediates we emitted varied between signed and unsigned. This commit changes the type to unsigned (I think that makes more sense) and makes immediates more consistent. This allows copy propagation to work better and cleans up some instructions. total instructions in shared programs: 5473519 -> 5465864 (-0.14%) instructions in affected programs: 432849 -> 425194 (-1.77%) GAINED: 27 LOST: 0 Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Don't pass ir_variable * to emit_sampleid_setup(). | Kenneth Graunke | 2014-10-17 | 3 | -4/+4
| | | | | | | | | gl_SampleID is a built-in variable that always is of type "int". Suggested by Connor Abbott. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* mesa: Drop the "target" parameter from NewBufferObject(). | Kenneth Graunke | 2014-10-16 | 4 | -9/+8
| | | | | | | | | | | NewBufferObject took a "target" parameter, which it blindly passed to _mesa_initialize_buffer_object(), which ignored it. Not much point in passing it around. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Flag BRW_ATOMIC_COUNTER_BUFFER when a possible ABO is respecified | Chris Forbes | 2014-10-16 | 1 | -0/+2
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/disasm: Add missing message type for Gen7 DP untyped surface read | Chris Forbes | 2014-10-16 | 1 | -0/+1
| | | | | | | | This is used to implement GLSL's atomicCounter() intrinsic. Previously it *worked*, but the disassembly was bogus. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Correctly use ABO count to trigger flagging of new surfaces. | Chris Forbes | 2014-10-16 | 1 | -1/+1
| | | | | | | | This would have *almost never* actually been an issue, since other state tends to get flagged at the same time as new ABOs -- but still bogus. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: No longer reemit textures on BRW_NEW_UNIFORM_BUFFER | Chris Forbes | 2014-10-16 | 1 | -2/+1
| | | | | | | | This didn't make any sense, but papered over the missing TexBO flagging we've just fixed, in a bunch of cases. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Dirty state in BO reallocation based on usage history | Chris Forbes | 2014-10-16 | 1 | -1/+4
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
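One way to picture "dirty state based on usage history" is a bitmask translation: the buffer records which kinds of binding points it has ever been attached to, and a reallocation flags only the matching state. The enum names and values below are hypothetical stand-ins, not the driver's actual definitions.

```c
#include <stdint.h>

/* Hypothetical record of how a buffer object has been used so far. */
enum usage_bits {
   USAGE_UNIFORM_BUFFER        = 1u << 0,
   USAGE_TEXTURE_BUFFER        = 1u << 1,
   USAGE_ATOMIC_COUNTER_BUFFER = 1u << 2,
};

/* Hypothetical dirty bits, one per kind of state that must be re-emitted. */
enum dirty_bits {
   DIRTY_UNIFORM_BUFFER        = 1u << 0,
   DIRTY_TEXTURE_BUFFER        = 1u << 1,
   DIRTY_ATOMIC_COUNTER_BUFFER = 1u << 2,
};

/* On reallocation, flag only the state that could reference the buffer. */
static uint32_t
dirty_bits_for_realloc(uint32_t usage_history)
{
   uint32_t dirty = 0;
   if (usage_history & USAGE_UNIFORM_BUFFER)
      dirty |= DIRTY_UNIFORM_BUFFER;
   if (usage_history & USAGE_TEXTURE_BUFFER)
      dirty |= DIRTY_TEXTURE_BUFFER;
   if (usage_history & USAGE_ATOMIC_COUNTER_BUFFER)
      dirty |= DIRTY_ATOMIC_COUNTER_BUFFER;
   return dirty;
}
```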
* i965: Have mesa flag BRW_NEW_TEXTURE_BUFFER when a TexBO binding changes | Chris Forbes | 2014-10-16 | 1 | -0/+1
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Add new dirty flag for new TexBOs. | Chris Forbes | 2014-10-16 | 3 | -0/+4
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965/fs: don't make a fake ir_texture in the Mesa IR frontend | Connor Abbott | 2014-10-15 | 1 | -14/+5
| | | | | | | | | Now that we've made all the texture emit code mostly independent of GLSL IR, this isn't necessary any more. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Refactor the texture emission logic into a single function. | Kenneth Graunke | 2014-10-15 | 3 | -104/+144
| | | | | | | | | | | | | | | | | Before, we had 3 different emit functions for various different gens, as well as some ancillary work that was the same across all gens which was either contained in functions or duplicated across the GLSL IR and Mesa IR backends. Now, we have a single method, emit_texture(), that takes all the information needed to make a texture instruction and handles all the setup, and all we have to do to emit a texture instruction while converting from GLSL IR, Mesa IR, or any new backend is to extract the information emit_texture() needs and then call it. v2: Significant rebasing (by Ken). Signed-off-by: Connor Abbott <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Make gather_channel() not use ir_texture. | Connor Abbott | 2014-10-15 | 2 | -5/+4
| | | | | | | | Our new IR won't have ir_texture objects. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Make swizzle_result() not use ir_texture. | Connor Abbott | 2014-10-15 | 3 | -8/+9
| | | | | | | | Our new IR won't have ir_texture objects. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: fix integer textures with swizzles | Connor Abbott | 2014-10-15 | 1 | -0/+1
| | | | | | | | | This happened to work before, but it would convert the output to a float and then back to an integer which seems bad. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: don't pass in ir_texture to emit_texture_* | Connor Abbott | 2014-10-15 | 3 | -24/+23
| | | | | | | | At this point, the only thing it's used for is the opcode. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: don't use ir->type in emit_texture_gen4() | Connor Abbott | 2014-10-15 | 1 | -4/+1
| | | | | | | | We already have the type from the original destination. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Don't use ir->lod_info.grad.dPd<x,y> in emit_texture_*. | Connor Abbott | 2014-10-15 | 3 | -18/+31
| | | | | | | | | | | | | This drops a dependency on ir_texture objects. v2 (Ken): Rename lod_components to grad_components, as it only has a meaningful value for ir_txd. We could set it to 1 for TXL, but there's no real need. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Don't use ir->coordinate in emit_texture_*. | Connor Abbott | 2014-10-15 | 3 | -31/+39
| | | | | | | | This drops a dependency on ir_texture objects. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: make rescale_texcoord() not use ir_texture. | Connor Abbott | 2014-10-15 | 3 | -8/+8
| | | | | | | | | | Our new IR won't have ir_texture objects, but using glsl_type is fine. v2 (Ken): Drop redundant ir->coordinate NULL check; rebase. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Make emit_mcs_fetch() not use ir_texture. | Connor Abbott | 2014-10-15 | 2 | -4/+4
| | | | | | | Our new IR won't have ir_texture objects. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Rename "length" to "components" in emit_mcs_fetch(). | Kenneth Graunke | 2014-10-15 | 1 | -6/+6
| | | | | | | This is slightly clearer. Based on a patch by Connor Abbott. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Make brw_texture_offset() not use ir_texture. | Connor Abbott | 2014-10-15 | 4 | -12/+15
| | | | | | | | Our new IR won't have ir_texture objects. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: don't use ir->offset in emit_texture_gen5. | Connor Abbott | 2014-10-15 | 3 | -5/+8
| | | | | | | | v2 (Ken): Refactor the Gen7 code separately; rebase. Signed-off-by: Connor Abbott <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Move texel offset handling to visit(ir_texture *). | Kenneth Graunke | 2014-10-15 | 3 | -11/+29
| | | | | | | | | | | | This moves the handling of non-constant texel offset subexpression trees to the place where we visit other such subtrees. It also removes some uses of ir->offset in emit_texture_gen7, which will be useful when we write the backend for our new upcoming IR. Based on a patch by Connor Abbott. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Drop ir->op != ir_txf condition in offset checking. | Kenneth Graunke | 2014-10-15 | 2 | -4/+3
| | | | | | | | | brw_lower_unnormalized_offset sets ir->offset to NULL if it applies the texelFetchOffset workarounds, so there's no need to special case it here---there won't be an offset for ir_txf. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>