path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: Move is_zero/one/null/accumulator into backend_reg. | Matt Turner | 2014-07-05 | 6 | -93/+44
    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make a common backend_reg class. | Matt Turner | 2014-07-05 | 4 | -42/+36
    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop imm union from visitor register classes. | Matt Turner | 2014-07-05 | 2 | -14/+0
    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use immediate storage in brw_reg for visitor regs. | Matt Turner | 2014-07-05 | 6 | -41/+37
    Reviewed-by: Topi Pohjolainen <[email protected]>
* gallium: rename PIPE_CAP_TGSI_VS_LAYER to also have _VIEWPORT | Ilia Mirkin | 2014-07-03 | 2 | -2/+2
    Now that this cap is used to determine the availability of both, adjust its name to reflect the new reality.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* mesa/st: enable AMD_vertex_shader_viewport_index | Ilia Mirkin | 2014-07-03 | 2 | -0/+6
    The assumption is that any driver capable of emitting layer from the vertex shader and supporting viewports should be able to also handle emitting viewport index from the vertex shader.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Tested-by: Tobias Droste <[email protected]>
* i965: expose AMD_vertex_shader_viewport_index on gen7+ | Ilia Mirkin | 2014-07-02 | 1 | -1/+3
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* mesa: add support for AMD_vertex_shader_viewport_index | Ilia Mirkin | 2014-07-02 | 2 | -0/+2
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Tested-by: Tobias Droste <[email protected]>
* mesa/st: enable ARB_fragment_layer_viewport | Ilia Mirkin | 2014-07-02 | 1 | -0/+1
    If multiple viewports are supported, that implies the presence of a GS and layered rendering, so we can enable ARB_fragment_layer_viewport as well.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* i965/gen6: Add a spec citation about push constant packet requirements. | Eric Anholt | 2014-07-02 | 1 | -1/+8
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add a comment about null renderbuffer surfaces and why they exist. | Eric Anholt | 2014-07-02 | 3 | -2/+22
    I noticed this when trying to find comments about pull constant buffers.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Update a ton of comments about constant buffers. | Eric Anholt | 2014-07-02 | 4 | -32/+74
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Merge VS/GS and WM pull constant buffer upload paths. | Eric Anholt | 2014-07-02 | 4 | -53/+42
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6+: Merge VS/GS and WM push constant buffer upload paths. | Eric Anholt | 2014-07-02 | 4 | -66/+52
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move dispatch_grf_start_reg and first_curbe_grf into stage_prog_data. | Eric Anholt | 2014-07-02 | 14 | -35/+36
    I wanted to access this value from stage-generic code, so stop storing it under two different names.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix state flags for gen4/5 CURBE. | Eric Anholt | 2014-07-02 | 1 | -8/+8
    If we had some NOS affecting VS compilation that resulted in optimization changing the set of constants to be uploaded, we might not have reuploaded the constants.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove a dead define. | Eric Anholt | 2014-07-02 | 1 | -2/+0
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Reuse libdrm's header for AUB definitions. | Eric Anholt | 2014-07-02 | 4 | -71/+7
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix stale comments about the state cache. | Eric Anholt | 2014-07-02 | 2 | -2/+9
    This changed in the state streaming work years ago.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix stale binding table comment. | Eric Anholt | 2014-07-02 | 1 | -2/+0
    I recently moved the code from the mentioned location right into this file.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop the memcmp for finding duplicated CURBE uploads. | Eric Anholt | 2014-07-02 | 4 | -50/+2
    At this point, the extra copy of the data and memcmp are as expensive as just re-uploading.
    Note: now that we'll always upload, and brw_constant_buffer watches BRW_NEW_BATCH anyway, we don't need to explicitly unref the old curbe_bo at batch reset time.
    No significant performance difference on glamor copywinwin10 (n=55), despite that test having a 98% hit rate on the cache.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Reuse intel_upload.c for gen4/5 constant buffers. | Eric Anholt | 2014-07-02 | 3 | -31/+7
    No performance difference on glamor with copywinwin10 (n=40) on my gm45.
    Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: add support for indirect drawing | Christoph Bumiller | 2014-07-02 | 3 | -1/+14
* xmlconfig/dri: bool -> unsigned char | Dave Airlie | 2014-07-02 | 3 | -10/+8
    Drop stdbool, due to the X server being a pain and having struct members called bool, although I've sent a patch to fix that we should retain stupidity here.
    Use unsigned char which is what GLboolean is anyways.
    Signed-off-by: Dave Airlie <[email protected]>
* i965/fs: Update discard jump to preserve uniform loads via sampler. | Cody Northrop | 2014-07-01 | 1 | -6/+5
    Commit 17c7ead7 exposed a bug in how uniform loading happens in the presence of discard. It manifested itself in an application as randomly incorrect pixels on the borders of conditional areas. This is due to how discards jump to the end of the shader incorrectly for some channels. The current implementation checks each 2x2 subspan to preserve derivatives. When uniform loading via samplers was turned on, it uses a full execution mask, as stated in lower_uniform_pull_constant_loads(), and only populates four channels of the destination (see generate_uniform_pull_constant_load_gen7()). It happens incorrectly when the first subspan has been jumped over.
    The series that implemented this optimization was done before the changes to use samplers for uniform loads. Uniform sampler loads use special execution masks and only populate four channels, so we can't jump over those or corruption ensues. This fix only jumps to the end of the shader if all relevant channels are disabled, i.e. all 8 or 16, depending on dispatch. This preserves the original GLbenchmark 2.7 speedup noted in commit beafced2.
    It changes the shader assembly accordingly:
        before   : (-f0.1.any4h) halt(8) 17 2 null { align1 WE_all 1Q };
        after(8) : (-f0.1.any8h) halt(8) 17 2 null { align1 WE_all 1Q };
        after(16): (-f0.1.any16h) halt(16) 17 2 null { align1 WE_all 1H };
    v2: Cleaned up comments and conditional ordering.
    v3: Fix typo.
    Signed-off-by: Cody Northrop <[email protected]>
    Reviewed-by: Mike Stroyan <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79948
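The difference between the per-subspan jump and the whole-dispatch jump can be sketched in C. This is only a model under stated assumptions, not driver code: it assumes bit i of a hypothetical `live` mask is 1 while channel i has not been discarded, and that each 2x2 subspan occupies four consecutive bits. The function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit i of `live` is 1 while channel i is still running
 * (not discarded). Each 2x2 subspan occupies 4 consecutive bits. */

/* Old behaviour (any4h): a subspan jumps to the end of the shader as soon
 * as all 4 of its own channels are dead. */
bool subspan_halts_any4h(uint16_t live, unsigned subspan)
{
   assert(subspan < 4);
   return ((live >> (4 * subspan)) & 0xFu) == 0;
}

/* New behaviour (any8h / any16h): jump only when every channel of the
 * dispatch (width 8 or 16) is dead. */
bool thread_halts_all(uint16_t live, unsigned width)
{
   assert(width == 8 || width == 16);
   uint16_t mask = (width == 16) ? 0xFFFFu : 0x00FFu;
   return (live & mask) == 0;
}
```

With `live = 0x00F0` in an 8-wide dispatch, the old test lets subspan 0 jump while subspan 1 keeps running, which is the situation the commit message describes as skipping the first subspan. The new test keeps all channels in the shader until the whole mask is zero.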
* i965/fs: Mark case unreachable to silence warning. | Matt Turner | 2014-07-01 | 1 | -0/+2
    Reviewed-by: Ian Romanick <[email protected]>
* i965: Use unreachable() instead of unconditional assert(). | Matt Turner | 2014-07-01 | 55 | -324/+182
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: Make unreachable macro take a string argument. | Matt Turner | 2014-07-01 | 6 | -14/+16
    To aid in debugging.
    Reviewed-by: Ian Romanick <[email protected]>
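A string-taking unreachable macro could look roughly like the sketch below. The macro name `UNREACHABLE`, the enum, and the helper function are hypothetical stand-ins chosen for this example, and the sketch assumes a GCC- or Clang-compatible compiler for `__builtin_unreachable()`. It is not claimed to be the exact definition the commit introduced.

```c
#include <assert.h>

/* Hypothetical stand-in for a string-taking unreachable macro. `!str` is
 * always false for a string literal, so a debug build aborts and the assert
 * output includes the message; the GCC/Clang builtin then tells the
 * optimizer that this path is dead. */
#define UNREACHABLE(str)       \
   do {                        \
      assert(!str);            \
      __builtin_unreachable(); \
   } while (0)

enum reg_file { FILE_GRF, FILE_IMM };   /* illustrative enum only */

const char *reg_file_name(enum reg_file file)
{
   switch (file) {
   case FILE_GRF: return "grf";
   case FILE_IMM: return "imm";
   }
   UNREACHABLE("invalid register file");
}
```

Compared with an unconditional `assert(!"...")`, this form also silences missing-return warnings in release builds, because the compiler is told control never reaches the end of the function.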
* i965/vec4: Remove useless conditionals. | Matt Turner | 2014-07-01 | 1 | -6/+3
    Setting a couple of bits is the same cost or less as conditionally setting a couple of bits.
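The log does not show the code that was changed, so the following is only one plausible shape of such a conditional, with hypothetical flag names. It illustrates why a guard in front of a bitwise OR buys nothing: OR is idempotent, so the unguarded form always produces the same result without a branch.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_A 0x1u   /* hypothetical flag bits */
#define FLAG_B 0x2u

static_assert((FLAG_A & FLAG_B) == 0, "flags must be distinct bits");

/* Guarded form: test whether the bits are already set, then set them. */
uint32_t set_flags_guarded(uint32_t flags)
{
   if ((flags & (FLAG_A | FLAG_B)) != (FLAG_A | FLAG_B))
      flags |= FLAG_A | FLAG_B;
   return flags;
}

/* Unconditional form: OR is idempotent, so the result is identical and the
 * compare-and-branch disappears. */
uint32_t set_flags_always(uint32_t flags)
{
   return flags | FLAG_A | FLAG_B;
}
```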
* i965/fs: Pass cfg to calculate_live_intervals(). | Matt Turner | 2014-07-01 | 6 | -12/+15
    We've often created the CFG immediately before, so use it when available.
    Reviewed-by: Ian Romanick <[email protected]>
* i965: Mark fields in the live interval classes protected. | Matt Turner | 2014-07-01 | 2 | -16/+19
    cfg, for instance, is a pointer to a local variable in calculate_live_intervals, certainly not valid after that function has returned.
    Reviewed-by: Ian Romanick <[email protected]>
* i965: Use typed foreach_in_list_safe instead of foreach_list_safe. | Matt Turner | 2014-07-01 | 11 | -55/+20
    Acked-by: Ian Romanick <[email protected]>
* i965: Use typed foreach_in_list instead of foreach_list. | Matt Turner | 2014-07-01 | 18 | -184/+76
    Acked-by: Ian Romanick <[email protected]>
* i965: Add and use foreach_inst_in_block macros. | Matt Turner | 2014-07-01 | 7 | -25/+17
    Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Use is_head_sentinel() instead of ->prev == NULL. | Matt Turner | 2014-07-01 | 1 | -1/+1
    Makes it more clear what we're doing and requires less knowledge of exec_list.
    Reviewed-by: Ian Romanick <[email protected]>
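To show why a named predicate reads better than a raw `->prev == NULL` test, here is a deliberately simplified doubly linked list with explicit sentinel nodes. The struct layout and function bodies are assumptions made for this sketch and do not reproduce the real exec_list; only the idea that the head sentinel is the one node with no predecessor is taken from the commit subject.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical list with explicit sentinel nodes at both ends.
 * Only the head sentinel has prev == NULL; only the tail sentinel has
 * next == NULL. */
struct node {
   struct node *next;
   struct node *prev;
};

struct list {
   struct node head_sentinel;
   struct node tail_sentinel;
};

void list_init(struct list *l)
{
   assert(l != NULL);
   l->head_sentinel.prev = NULL;
   l->head_sentinel.next = &l->tail_sentinel;
   l->tail_sentinel.prev = &l->head_sentinel;
   l->tail_sentinel.next = NULL;
}

void list_push_tail(struct list *l, struct node *n)
{
   n->next = &l->tail_sentinel;
   n->prev = l->tail_sentinel.prev;
   n->prev->next = n;
   l->tail_sentinel.prev = n;
}

/* Named predicates hide the NULL-pointer convention from callers. */
bool is_head_sentinel(const struct node *n) { return n->prev == NULL; }
bool is_tail_sentinel(const struct node *n) { return n->next == NULL; }
```

A backwards walk can then stop on `is_head_sentinel(n)` without the caller needing to know how the list marks its ends.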
* mesa: Add and use foreach_list_typed_safe. | Matt Turner | 2014-07-01 | 1 | -3/+1
    Acked-by: Ian Romanick <[email protected]>
* mesa: Add and use foreach_in_list_use_after. | Matt Turner | 2014-07-01 | 2 | -9/+2
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: Use typed foreach_in_list_safe instead of foreach_list_safe. | Matt Turner | 2014-07-01 | 1 | -3/+1
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: Use typed foreach_in_list instead of foreach_list. | Matt Turner | 2014-07-01 | 3 | -85/+41
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: update comment for UniformBufferSize to indicate size is in bytes | Brian Paul | 2014-07-01 | 1 | -1/+1
    Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: fix incorrect size of UBO declarations | Brian Paul | 2014-07-01 | 1 | -1/+8
    UniformBufferSize is in bytes so we need to divide by 16 to get the number of constant buffer slots. Also, the ureg_DECL_constant2D() function takes first..last parameters so we need to subtract one for the last value.
    Reviewed-by: Roland Scheidegger <[email protected]>
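The arithmetic in this fix can be written out as a small helper. The type and function names are hypothetical, and rounding a partial slot up is an assumption of this sketch; the commit message itself only says to divide by 16 and to subtract one for the inclusive last index.

```c
#include <assert.h>

/* One constant-buffer slot holds a vec4 of 32-bit values, i.e. 16 bytes. */
#define BYTES_PER_SLOT 16u

/* Inclusive first..last slot range (hypothetical type for this sketch). */
struct const_range { unsigned first, last; };

struct const_range ubo_decl_range(unsigned size_in_bytes)
{
   assert(size_in_bytes > 0);
   /* Rounding up is an assumption of this sketch; the commit message only
    * says "divide by 16". */
   unsigned num_slots = (size_in_bytes + BYTES_PER_SLOT - 1) / BYTES_PER_SLOT;
   /* The range is inclusive, so the last index is one less than the count. */
   struct const_range r = { 0, num_slots - 1 };
   return r;
}
```

For a 64-byte buffer this yields slots 0..3. Passing the byte size directly as the last index would declare sixteen times too many slots plus one, which is the error the commit describes.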
* st/mesa: don't use address register for constant-indexed ir_binop_ubo_load | Brian Paul | 2014-07-01 | 1 | -6/+11
    Before, we were always using the address register and indirect addressing to index into a UBO constant buffer. With this change we only do that when necessary.
    Using the piglit bin/arb_uniform_buffer_object-rendering test as an example:
    Shader code:
        uniform ub_rot {float rotation; };
        ...
        m[1][1] = cos(rotation);
    Before:
        IMM[1] INT32 {0, 1, 0, 0}
        1: UARL ADDR[0].x, IMM[1].xxxx
        2: MOV TEMP[0].x, CONST[3][ADDR[0].x].xxxx
        3: COS TEMP[1].x, TEMP[0].xxxx
    After:
        0: COS TEMP[0].x, CONST[3][0].xxxx
    Reviewed-by: Roland Scheidegger <[email protected]>
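The decision this commit describes, direct addressing for a compile-time constant index and the address register otherwise, can be modelled as below. The struct and function are hypothetical and only format operand text in the style of the listing above; they are not part of any Mesa or TGSI API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical description of a UBO element index. */
struct ubo_index {
   bool is_constant;   /* known at compile time? */
   int  value;         /* valid only when is_constant is true */
};

/* Writes the source operand text into buf and returns the number of extra
 * instructions needed to set up the address register first. */
int format_ubo_src(char *buf, size_t len, int cbuf, struct ubo_index idx)
{
   assert(buf != NULL && len > 0);
   if (idx.is_constant) {
      snprintf(buf, len, "CONST[%d][%d]", cbuf, idx.value);
      return 0;                     /* direct addressing, no UARL needed */
   }
   snprintf(buf, len, "CONST[%d][ADDR[0].x]", cbuf);
   return 1;                        /* one UARL to load ADDR[0].x */
}
```

In the example from the commit message the index is the constant 0, so the direct form applies and the UARL and MOV instructions are no longer needed.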
* st/mesa: allow 2D indexing for all shader types in translate_src() | Brian Paul | 2014-07-01 | 1 | -1/+4
    Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: don't ignore const buf index in src_register() | Brian Paul | 2014-07-01 | 1 | -1/+1
    Otherwise, if we were creating a const buffer src register for a UBO the index into the UBO was always zero.
    Reviewed-by: Roland Scheidegger <[email protected]>
* mesa/st: add vertex stream support | Ilia Mirkin | 2014-07-01 | 2 | -4/+8
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add a cap for max vertex streams | Ilia Mirkin | 2014-07-01 | 1 | -0/+5
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add an index argument to create_query | Ilia Mirkin | 2014-07-01 | 1 | -3/+3
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add support for stream in so info | Ilia Mirkin | 2014-07-01 | 1 | -0/+1
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add vertex stream argument to EMIT/ENDPRIM | Ilia Mirkin | 2014-07-01 | 1 | -2/+2
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* i965/fs: Mark predicated PLN instructions with dependency hints. | Matt Turner | 2014-06-30 | 1 | -4/+9
    To implement the unlit_centroid_workaround, previously we emitted
        (+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 1Q };
        (-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 1Q };
    where the flag register contains the channel enable bits from g0.
    Since the predicates are complementary, the pair of pln instructions write to non-overlapping components of the destination, which is the case that the dependency control hints are designed for.
    Typically setting dependency control hints on predicated instructions isn't safe (if an instruction doesn't execute due to the predicate, it won't update the scoreboard, leaving it in a bad state) but since we must have at least one channel executing (i.e., +f0 is true for some channel) by virtue of the fact that the thread is running, we can put the +f0 pln instruction last and set the hints:
        (-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 NoDDClr 1Q };
        (+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 NoDDChk 1Q };
    Reviewed-by: Kristian Høgsberg <[email protected]>
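The claim that complementary predicates write non-overlapping channels can be checked with a small model. This is a sketch with hypothetical function names that treats an 8-wide predicated instruction as a masked copy. It does not model the scoreboard or the NoDDClr/NoDDChk hints themselves, only the channel coverage argument.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an 8-wide predicated write: only channels whose bit
 * is set in `pred` are written. Returns the set of channels written. */
uint8_t predicated_write(float dst[8], const float src[8], uint8_t pred)
{
   for (int ch = 0; ch < 8; ch++) {
      if (pred & (1u << ch))
         dst[ch] = src[ch];
   }
   return pred;
}

/* Issues the (-f0) write first and the (+f0) write last, mirroring the
 * order in the commit message. Returns 1 if the two writes together covered
 * every channel exactly once. */
int complementary_writes(float dst[8], const float a[8], const float b[8],
                         uint8_t f0)
{
   assert(f0 != 0);   /* a running thread has at least one enabled channel */
   uint8_t w_neg = predicated_write(dst, b, (uint8_t)~f0);
   uint8_t w_pos = predicated_write(dst, a, f0);
   return (w_neg & w_pos) == 0 && (uint8_t)(w_neg | w_pos) == 0xFF;
}
```

Because `f0` is never zero for a running thread, the second write always executes on at least one channel, which is the property the commit relies on when it places the +f0 instruction last.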