summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* gallium/radeon: add a function for adding llvm function attributesMarek Olšák2016-02-092-4/+10
| | | | | | This will be used for setting the new InitialPSInputAddr attribute. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: compile geometry shaders immediatelyMarek Olšák2016-02-091-1/+2
| | | | | | they have only 1 variant Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: split out code for deleting si_shaderMarek Olšák2016-02-091-29/+36
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move code writing tess factors into a separate functionMarek Olšák2016-02-091-9/+21
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make LLVM IR dumping less messyMarek Olšák2016-02-093-9/+15
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move a few r600_can_dump_shader calls to where they're neededMarek Olšák2016-02-091-5/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove useless code that handles dx10_clamp_modeMarek Olšák2016-02-093-14/+6
| | | | | | | "enable-no-nans-fp-math" is a wrong string and there was a disagreement about fixing it. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: dump SPI_PS_INPUT values along with shader statsMarek Olšák2016-02-091-0/+7
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: read SPI_PS_INPUT_ADDR from LLVM if it returns itMarek Olšák2016-02-093-2/+7
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't force gl_SampleMaskIn to 1 for smoothingMarek Olšák2016-02-091-7/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: split PS input interpolation code into its own functionMarek Olšák2016-02-091-56/+71
| | | | | | This will be used by the fragment shader prolog. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement forcing per-sample_interpolation using the shader key onlyMarek Olšák2016-02-096-152/+55
| | | | | | | | | | | It was partly a state and partly emulated by shader code, but since we want to do this in a fragment shader prolog, we need to put it into the shader key, which will be used to generate the prolog. This also removes the spi_ps_input states and moves the registers to the PS state. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove si_shader::ps_input_interpolateMarek Olšák2016-02-092-6/+3
| | | | | | tgsi_shader_info has this too. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move BCOLOR PS input locations after all other inputsMarek Olšák2016-02-093-29/+50
| | | | | | | | | | | | | | | | BCOLOR inputs were immediately after COLOR inputs. Thus, all following inputs were offset by 1 if color_two_side was enabled, and not offset if it was not enabled, which is a variation that's problematic if we want to have 1 variant per shader and the variant doesn't care about color_two_side (that should be handled by other bytecode attached at the beginning). Instead, move BCOLOR inputs after all other inputs, so BCOLOR0 is at location "num_inputs" if it's present. BCOLOR1 is next. This also allows removing si_shader::nparam and si_shader::ps_input_param_offset, which are useless now. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move SPI_PS_INPUT_CNTL value computation to a separate functionMarek Olšák2016-02-091-34/+40
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: generate a color_two_side variant only if the shader reads colorsMarek Olšák2016-02-091-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_shader_context initialization into a separate functionMarek Olšák2016-02-091-43/+60
| | | | | | This will be re-used later. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: remove st_is_program_nativeMarek Olšák2016-02-091-13/+0
| | | | | | The default scenario sets GL_TRUE too. Reviewed-by: Edward O'Callaghan <[email protected]>
* st/mesa: unify destroy_program_variants cases for TCS, TES, GSMarek Olšák2016-02-091-50/+16
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* st/mesa: unify get_variant functions for TCS, TES, GSMarek Olšák2016-02-093-176/+31
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* st/mesa: unify variants and delete functions for TCS, TES, GSMarek Olšák2016-02-095-214/+108
| | | | | | no difference between those Reviewed-by: Edward O'Callaghan <[email protected]>
* mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFTBrian Paul2016-02-091-2/+2
| | | | | | | | | Ilia Mirkin found/fixed the mistake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93813 Cc: "11.1" <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: rewrite save_CallLists() codeBrian Paul2016-02-091-26/+35
| | | | | | | | When glCallLists() is compiled into a display list, preserve the call as a single glCallLists rather than 'n' glCallList calls. This will matter for an upcoming display list optimization project. Reviewed-by: Ian Romanick <[email protected]>
* mesa: add missing error check in _mesa_CallLists()Brian Paul2016-02-091-0/+8
| | | | | | | | Generate GL_INVALID_VALUE if n < 0. Return early if n==0 or lists==NULL. v2: fix formatting, also check for lists==NULL. Reviewed-by: Ian Romanick <[email protected]>
* mesa: whitespace clean-ups in dlist.hBrian Paul2016-02-091-16/+31
| | | | And remove 'extern' qualifiers.
* st/mesa: don't allocate bitmap drawing state until neededBrian Paul2016-02-093-72/+77
| | | | | | | | | Most apps don't use glBitmap so don't allocate the bitmap cache or gallium state objects/shaders/etc until the first call to st_Bitmap(). v2: simplify a conditional, per Gustaw Smolarczyk. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: move the setup_bitmap_vertex_data() code into draw_bitmap_quad()Brian Paul2016-02-091-90/+78
| | | | | | Now all the code to setup the vertex data and draw it is in one place. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: refactor some bitmap drawing codeBrian Paul2016-02-091-33/+57
| | | | | | | Move setup/restoration of rendering state into helper functions. This makes the draw_bitmap_quad() function much more concise. Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: remove hack to fix up GL_ANY_SAMPLES_PASSED resultsIlia Mirkin2016-02-091-5/+0
| | | | | | | | | | Both st/mesa and i965 should return a true/false result now, and the only other driver implementing queries (radeon) doesn't support ARB_occlusion_query2 which added that pname. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: make use of the occlusion predicate queryIlia Mirkin2016-02-091-2/+10
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50: add PIPE_QUERY_OCCLUSION_PREDICATE supportIlia Mirkin2016-02-091-0/+6
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv30: add PIPE_QUERY_OCCLUSION_PREDICATE supportIlia Mirkin2016-02-091-2/+5
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* ilo: add PIPE_QUERY_OCCLUSION_PREDICATE supportIlia Mirkin2016-02-093-1/+12
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* draw: use util_pstipple_* function for stipple pattern textures and samplersNicolai Hähnle2016-02-093-112/+18
| | | | | | | | This reduces code duplication. Suggested-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: use util_pstipple_create_fragment_shaderNicolai Hähnle2016-02-091-197/+12
| | | | | | | | | This reduces code duplication. It also adds support for drivers where the fragment position is a system value. Suggested-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernelMarek Olšák2016-02-091-0/+6
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94019 Tested-by: Nick Sarnie <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* nir: remove unused nir_variable fieldsTimothy Arceri2016-02-092-20/+0
| | | | | | | These are used in GLSL IR to removed unused varyings and match transform feedback variables. There is no need to use these in NIR. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: remove unrequired forward declarationTimothy Arceri2016-02-091-2/+0
| | | | | | | This was added in 2548092ad80156a4 although I don't see why as it was already in the linker.h header. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: clean up and fix bug in varying linking rulesTimothy Arceri2016-02-091-74/+63
| | | | | | | | | | | | | | | | The existing code was very hard to follow and has been the source of at least 3 bugs in the past year. The existing code also has a bug for SSO where if we have a multi-stage SSO for example a tes -> gs program, if we try to use transform feedback with gs the existing code would look for the transform feedback varyings in the tes stage and fail as it can't find them. V2: Add more code comments, always try to remove unused inputs to the first stage. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: simplify ES Vertex/Fragment shader requirementsTimothy Arceri2016-02-091-28/+28
| | | | | | | | | | | | We really just needed to skip the existing ES < 3.1 check if we have a compute shader, all other scenarios are already covered. * No shaders is a link error. * Geom or Tess without Vertex is a link error which means we always require a Vertex shader and hence a Fragment shader. * Finally a Compute shader linked with any other stage is a link error. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: simplify required stages for linking rulesTimothy Arceri2016-02-091-43/+41
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: small tidy up now that link_shaders() exits early with 0 shadersTimothy Arceri2016-02-091-6/+4
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: don't attempt to link empty programTimothy Arceri2016-02-091-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously an empty program would go through the entire link_shaders() function and we would have to be careful not to cause a segfault. In core profile also now set link_status to false by generating an error, it was previously set to true. From Section 7.3 (PROGRAM OBJECTS) of the OpenGL 4.5 spec: "Linking can fail for a variety of reasons as specified in the OpenGL Shading Language Specification, as well as any of the following reasons: - No shader objects are attached to program." V2: Only generate an error in core profile and add spec quote (Ian) V3: generate error in ES too, remove previous check which was only applying the rule to GL 4.5/ES 3.1 and above. My understand is that this spec change is clarifying previously undefined behaviour and therefore should be applied retrospectively. The ES CTS tests for this are in ES 2 I suspect it was passing because it would have generated an error for not having both a vertex and fragment shader. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Recognize open-coded bitfield_reverse.Matt Turner2016-02-081-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Helps 11 shaders in UnrealEngine4 demos. I seriously hope they would have given us bitfieldReverse() if we exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that anyway?). instructions in affected programs: 4875 -> 4633 (-4.96%) cycles in affected programs: 270516 -> 244516 (-9.61%) I suspect there's a *lot* of room to improve nir_search/opt_algebraic's handling of this. We'd actually like to match, e.g., step2 by matching step1 once and then doing a pointer comparison for the second instance of step1, but unfortunately we generate an enormous tuple for instead. The .text size increases by 6.5% and the .data by 17.5%. text data bss dec hex filename 22957 45224 0 68181 10a55 nir_libnir_la-nir_opt_algebraic.o 24461 53160 0 77621 12f35 nir_libnir_la-nir_opt_algebraic.o I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is in a GL 4.0 context once we expose GL 4.0. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Handle large unsigned values in opt_algebraic.Matt Turner2016-02-081-4/+1
| | | | | | | | | | | | | | | The next patch adds an algebraic rule that uses the constant 0xff00ff00. Without this change, the build fails with return hex(struct.unpack('I', struct.pack('i', self.value))[0]) struct.error: 'i' format requires -2147483648 <= number <= 2147483647 The hex() function handles integers of any size, and assigning a negative value to an unsigned does what we want in C. The pack/unpack is unnecessary (and as we see, buggy). Reviewed-by: Dylan Baker <[email protected]>
* nir: Do opt_algebraic in reverse order.Matt Turner2016-02-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | Walking the SSA definitions in order means that we consider the smallest algebraic optimizations before larger optimizations. So if a smaller rule is part of a larger rule, the smaller one will happen first, preventing the larger one from happening. instructions in affected programs: 32721 -> 32611 (-0.34%) helped: 106 In programs whose nir_optimize loop count changes (129 of them): before: 1164 optimization loops after: 1071 optimization loops Of the 129 affected, 16 programs' optimization loop counts increased. Prevents regressions and annoyances in the next commits. Reviewed-by: Eduardo Lima Mitev <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Recognize product of open-coded pow()s.Matt Turner2016-02-081-0/+2
| | | | | | Prevents regressions in the next commit. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add opt_algebraic rules for xor with zero.Matt Turner2016-02-081-0/+2
| | | | | | | | instructions in affected programs: 668 -> 664 (-0.60%) helped: 4 Reviewed-by: Eduardo Lima Mitev <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: validate arrays of arrays on empty type delclarationsTimothy Arceri2016-02-091-25/+38
| | | | | | | | Fixes: dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_fragment dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_vertex Reviewed-by: Dave Airlie <[email protected]>
* i965: Use nir_lower_load_const_to_scalar().Kenneth Graunke2016-02-081-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I don't know why, but we never hooked up this pass Eric wrote. Otherwise, you can end up with stupid scalarized code such as: vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0) vec4 ssa_8 = ... vec1 ssa_9 = feq ssa_8, ssa_7 vec1 ssa_10 = feq ssa_8.y, ssa_7.y vec1 ssa_11 = feq ssa_8, ssa_7.z vec1 ssa_12 = feq ssa_8.y, ssa_7.w ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions. shader-db on Skylake: total instructions in shared programs: 9121153 -> 9120749 (-0.00%) instructions in affected programs: 32421 -> 32017 (-1.25%) helped: 277 HURT: 69 total cycles in shared programs: 69003364 -> 69000912 (-0.00%) cycles in affected programs: 899186 -> 896734 (-0.27%) helped: 313 HURT: 403 This also prevents regressions when disabling channel expressions. v2: Don't call opt_cse afterwards (requested by Matt). It should happen in the optimization loop below anyway. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]> Reviewed-by: Matt Turner <[email protected]>