summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* st/mesa: expose ARB_indirect_parameters when the backend driver allowsIlia Mirkin2016-01-072-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: add support for ARB_indirect_parameters draw functionsIlia Mirkin2016-01-073-0/+234
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: add parameter buffer, used for ARB_indirect_parametersIlia Mirkin2016-01-074-0/+25
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* glapi: add ARB_indirect_parameters definitionsIlia Mirkin2016-01-077-1/+63
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: add support for real ARB_multi_draw_indirectIlia Mirkin2016-01-074-18/+47
| | | | | | | The draw groups are now split up into groups of 32 if there's a non-packed stride, or in groups of 400-500 if the draw data is packed. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: adjust indirect draw macros to handle multiple draws at onceIlia Mirkin2016-01-073-52/+101
| | | | | | | These are still invoked one at a time, but the underlying macro can handle multiple draws. Signed-off-by: Ilia Mirkin <[email protected]>
* st/mesa: add support for new mesa indirect draw interfaceIlia Mirkin2016-01-073-9/+84
| | | | | | | | | | This shifts all indirect draws to go through the new function. If the driver doesn't have support for multi draws, we break those up and perform N draws. Otherwise, we pass everything through for just a single draw call. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add caps to expose support for multi indirect drawsIlia Mirkin2016-01-0716-0/+35
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add sufficient draw interface to allow new indirect featuresIlia Mirkin2016-01-071-1/+10
| | | | | | | | This makes it possible to support indirect multidraws as well as having the number of such draws to come from a separate GPU resource. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vbo: create a new draw function interface for indirect drawsIlia Mirkin2016-01-074-75/+89
| | | | | | | | | | | | | | All indirect draws are passed to the new draw function. By default there's a fallback implementation which pipes it right back to draw_prims, but eventually both the fallback and draw_prim's support for indirect drawing should be removed. This should allow a backend to properly support ARB_multi_draw_indirect and ARB_indirect_parameters. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Marek Olšák <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* llvmpipe: do 64bit plane calculations in the sse pathRoland Scheidegger2016-01-083-62/+150
| | | | | | | | | | | | The sse path was pretty much disabled for practical purposes because the largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations. This is actually not that difficult, though a problem is that we can't do a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall, the code still looks reasonable, though it's not like changes there in setup really make much of a difference in the end... Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: don't store eo as 64bit intRoland Scheidegger2016-01-084-11/+16
| | | | | | | | | | | eo, just like dcdx and dcdy, cannot overflow 32bit. Store it as unsigned though just in case (it cannot be negative, but in theory twice as big as dcdx or dcdy so this gives it one more bit). This doesn't really change anything, albeit it might help minimally on 32bit archs. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: use aligned data for the assembly program in setupRoland Scheidegger2016-01-081-17/+21
| | | | | | | | | Back in the day (before 24678700edaf5bb9da9be93a1367f1a24cfaa471) the values were not actually in a struct but even then I can't see why we didn't simply align the values. Especially since it's trivial to do so. (Not that it actually matters since the code is pretty much unused for now.) Reviewed-by: Oded Gabbay <[email protected]>
* draw: initialize prim header flags when clipping linesRoland Scheidegger2016-01-081-0/+2
| | | | | | | | | Otherwise, clipped lines would have undefined stippling reset bit if line stippling is enabled. (Untested, and I just assume copying over the bits from the original line is actually the right thing to do.) Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix line stippling with unfilled primsRoland Scheidegger2016-01-081-18/+38
| | | | | | | | | | | | | The unfilled stage was not filling in the prim header, and the line stage then decided to reset the stipple counter or not based on the uninitialized data. This causes some failures in conform linestipple test (albeit quite randomly happening depending on environment). So fill in the prim header in the unfilled stage - I am not entirely sure if anybody really needs determinant after that stage, but there's at least later stages (wide line for instance) which copy over the determinant as well. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* glsl: replace null check with assertTimothy Arceri2016-01-081-3/+1
| | | | | | | | This was added in 54f583a20 since then error handling has improved. The test this was added to fix now fails earlier since 01822706ec Reviewed-by: Matt Turner <[email protected]>
* i965: use _mesa_delete_buffer_objectNicolai Hähnle2016-01-071-1/+1
| | | | | | | | | This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0 11.1" <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i915: use _mesa_delete_buffer_objectNicolai Hähnle2016-01-071-1/+1
| | | | | | | | | This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0 11.1" <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* radeon: use _mesa_delete_buffer_objectNicolai Hähnle2016-01-071-1/+1
| | | | | | | | | This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0 11.1" <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* st/mesa: use _mesa_delete_buffer_objectNicolai Hähnle2016-01-071-3/+1
| | | | | | | This is more future-proof than the current code. Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0 11.1" <[email protected]>
* mesa/bufferobj: make _mesa_delete_buffer_object externally accessibleNicolai Hähnle2016-01-072-1/+5
| | | | | | | | | gl_buffer_object has grown more complicated and requires cleanup. Using this function from drivers will be more future-proof. Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0 11.1" <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* llvmpipe: use sse2 conv code for altivecOded Gabbay2016-01-071-2/+2
| | | | | | | | | | | | | | | In lp_build_conv() and lp_build_conv_auto(), there is a special case of conversion when sse2 is present. That code path is suitable without any changes to altivec, because all the functions that are called in that code path already support altivec. This patch increase the FPS in POWER arch across the board between 10%-25% I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2. Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: adjust the parameters of si_shader_dumpMarek Olšák2016-01-073-20/+11
| | | | | | | The function will be extended to dump all binaries shaders will consist of, so si_shader* makes sense here. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_shader_dump call out of si_compile_llvmMarek Olšák2016-01-072-2/+11
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: inline si_shader_binary_readMarek Olšák2016-01-073-11/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_shader_dump call out of si_shader_binary_readMarek Olšák2016-01-073-20/+21
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: separate shader dumping code to si_shader_dump and *_dump_statsMarek Olšák2016-01-071-12/+30
| | | | | | | Eventually, I'd like to dump stats for several combined binaries, which is why you don't see a binary parameter in si_shader_dump_stats Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add si_shader_destroy_binaryMarek Olšák2016-01-072-5/+10
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't pass si_shader to si_compile_llvmMarek Olšák2016-01-073-18/+28
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_shader_binary_upload out of si_compile_llvmMarek Olšák2016-01-072-4/+9
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: always keep shader code, rodata, and relocs in memoryMarek Olšák2016-01-071-7/+3
| | | | | | | We won't compile shaders in draw calls, but we will concatenate shader binaries according to states in draw calls, so keep the binaries. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't pass si_shader to si_shader_binary_readMarek Olšák2016-01-073-14/+19
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't pass si_shader to si_shader_binary_read_configMarek Olšák2016-01-073-17/+19
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add struct si_shader_configMarek Olšák2016-01-075-64/+68
| | | | | | | There will be 1 config per variant, which will be a union of configs from {prolog, main, epilog}. For now, just add the structure. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move NULL exporting into a separate functionMarek Olšák2016-01-071-15/+22
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move MRT color exporting into a separate functionMarek Olšák2016-01-071-41/+57
| | | | | | This will be used by a fragment shader epilog. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use EXP_NULL for pixel shaders without outputsMarek Olšák2016-01-072-6/+3
| | | | | | This never happens currently. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: only use LLVMBuildLoad once when updating color outputs at the endMarek Olšák2016-01-071-47/+20
| | | | | | | | | | | without LLVMBuildStore. So: - do LLVMBuildLoad - update the values as necessary - export Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: export "undef" values for undefined PS outputsMarek Olšák2016-01-071-9/+10
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move MRTZ export into a separate functionMarek Olšák2016-01-071-51/+62
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: simplify setting the DONE bit for PS exportsMarek Olšák2016-01-072-73/+55
| | | | | | First find out what the last export is and simply set the DONE bit there. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilationMarek Olšák2016-01-073-16/+28
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: write all MRTs only if there is exactly one outputMarek Olšák2016-01-072-4/+5
| | | | | | | | This doesn't fix a known bug, but better safe than sorry. Also, simplify the expression in si_shader.c. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilationMarek Olšák2016-01-073-9/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: determine DB_SHADER_CONTROL outside of shader compilationMarek Olšák2016-01-073-28/+40
| | | | | | | because the API pixel shader binary will not emulate alpha test one day, so the KILL_ENABLE bit must be determined elsewhere. Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: set which color components are read by a fragment shaderMarek Olšák2016-01-072-8/+23
| | | | | | This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: fix tgsi_shader_info::reads_zMarek Olšák2016-01-071-2/+3
| | | | | | This has no users in Mesa. Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: set if a fragment shader writes sample maskMarek Olšák2016-01-072-0/+3
| | | | | | This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: Disallow vectorization of vector_insert/extract.Kenneth Graunke2016-01-061-0/+2
| | | | | | | | | | | | | | | | | | vector_insert takes a vector, a scalar location, and a scalar value, and produces a new vector with that component updated. As such, it can't be vectorized properly. vector_extract takes a vector and a scalar location, and returns that scalar component of the vector. Vectorization doesn't really make any sense. Treating both as horizontal operations makes sure the vectorizer won't try to touch these. Found by inspection. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* softpipe: tell draw about the vertex layout we wantRoland Scheidegger2016-01-074-58/+105
| | | | | | | | | | | | | | | This makes it more similar to llvmpipe. It also allows us to let draw emit code handle things like getting zeros for non-existing vs outputs automatically. There probably isn't really any overhead either way, there isn't really any "simply copy everything" code in the emit path it would copy each attrib individually just the same. Likewise, we still do another mapping step in softpipe as the layout may still not match exactly (same as in llvmpipe, should probably nuke the pointless mapping in both drivers). This fixes the piglit arb_fragment_layer_viewport no_gs/no_write tests. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>