Commit message (Author, Age, Files, Lines)
* gallivm: Add debug option to force SSE2. (Jose Fonseca, 2016-04-03, 1 file, -11/+14)
  For simulating less capable machines.
  Reviewed-by: Roland Scheidegger
* llvmpipe: Test abs. (Jose Fonseca, 2016-04-03, 1 file, -0/+1)
  Trivial.
* llvmpipe: Build lp_test_arit on MSVC too. (Jose Fonseca, 2016-04-03, 1 file, -3/+1)
  It builds fine now, probably due to C99 support.
  Trivial.
* gallivm: Fix performance regressions due to vector selects. (Jose Fonseca, 2016-04-03, 1 file, -22/+18)
  LLVM often can't determine the mask elements are all ones/zeros, and there doesn't seem to be a good way to hint that.
  Thanks to Roland Scheidegger for spotting and analyzing the issue.
  Reviewed-by: Roland Scheidegger
* gallivm: Remove lp_build_load_volatile. (Jose Fonseca, 2016-04-03, 2 files, -12/+0)
  No longer needed.
  Reviewed-by: Brian Paul
  Reviewed-by: Roland Scheidegger
* gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards. (Jose Fonseca, 2016-04-03, 9 files, -27/+39)
  Only provide a fallback for LLVM 3.3.
  One less dependency on LLVM C++ interface.
  Reviewed-by: Brian Paul
  Reviewed-by: Roland Scheidegger
* mesa: remove unrequired else (Timothy Arceri, 2016-04-03, 1 file, -42/+39)
  The if always returns, so there is no need for an else.
  Reviewed-by: Brian Paul
* gm107/ir: add OP_SELP emission, used in DSQRT lowering (Ilia Mirkin, 2016-04-02, 1 file, -0/+30)
  The current DSQRT lowering code emits an OP_SELP, so we have to handle its emission. This will eventually go away, but no harm supporting this op.
  Signed-off-by: Ilia Mirkin
* nv50/ir: we can't load local memory directly into an output (Ilia Mirkin, 2016-04-02, 1 file, -1/+2)
  This fixes piglit tests like
  tests/spec/glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr.shader_test
  and related ones.
  Signed-off-by: Ilia Mirkin
  Cc: "11.1 11.2" <[email protected]>
* st/nine: specify WINAPI only for i386 and amd64 (Christian Schmidbauer, 2016-04-02, 1 file, -5/+11)
  Currently mesa fails to build with the x32 ABI because ms_abi is not defined in that case. The patch uses ms_abi only for amd64 targets and stdcall only for i386 targets, to be sure that those are defined.
  This patch additionally checks for __GNUC__ to guarantee that __attribute__ is available.
  CC: "11.1 11.2" <[email protected]>
  Signed-off-by: Christian Schmidbauer
  Acked-by: Axel Davy
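A minimal sketch of the kind of guard described above, not the actual patch. The macro name SKETCH_WINAPI and the function sketch_add_one are hypothetical, and the choice of predefined macros (__x86_64__, __ILP32__, __i386__) is an assumption about how the targets could be told apart:

```c
/* Hypothetical guard; SKETCH_WINAPI is an illustrative name, not Mesa's. */
#if defined(__GNUC__)
#  if defined(__x86_64__) && !defined(__ILP32__)
     /* amd64: use ms_abi. x32 also defines __ILP32__, so it is skipped here. */
#    define SKETCH_WINAPI __attribute__((ms_abi))
#  elif defined(__i386__)
     /* i386: use stdcall. */
#    define SKETCH_WINAPI __attribute__((__stdcall__))
#  else
     /* Any other target (x32, ARM, ...): no calling-convention attribute. */
#    define SKETCH_WINAPI
#  endif
#else
   /* Without __GNUC__, __attribute__ may not exist at all. */
#  define SKETCH_WINAPI
#endif

/* Example: a function declared with whichever convention was selected. */
static int SKETCH_WINAPI sketch_add_one(int x)
{
    return x + 1;
}
```

On targets that fall through to the empty definition, the function simply uses the platform's default calling convention.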
* nv50/ir: fix envyas variants when building the code lib (Samuel Pitoiset, 2016-04-02, 1 file, -2/+2)
  nvc0 and nve4 have been replaced by gf100 and gk104, respectively.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
* svga: remove unused svga_compile_key::texture_msaa field (Brian Paul, 2016-04-02, 2 files, -2/+0)
  Reviewed-by: Jose Fonseca
* svga: check TXF instruction's target to determine MSAA (Brian Paul, 2016-04-02, 1 file, -1/+1)
  Rather than the currently bound texture. This goes along with the earlier patch to get away from examining bound textures and sampler views during shader translation.
  Fixes VMware bug 1632739.
  Reviewed-by: Jose Fonseca
* tgsi: add simple tgsi_is_msaa_target() helper (Brian Paul, 2016-04-02, 1 file, -0/+8)
  Reviewed-by: Jose Fonseca
* glsl: rename var and simplify if (Timothy Arceri, 2016-04-02, 1 file, -4/+4)
  is_ubo_var is true for both UBOs and SSBOs.
  Reviewed-by: Kenneth Graunke
* glsl: store ubo or ssbo index in block index (Timothy Arceri, 2016-04-02, 2 files, -22/+29)
  Previously we stored the buffer block index, i.e. the index into a combined ubo/ssbo list.
  Fixes several dEQP-GLES31.functional tests:
  - program_interface_query.uniform.block_index.block_array
  - program_interface_query.uniform.block_index.named_block
  - program_interface_query.uniform.block_index.unnamed_block
  - program_interface_query.uniform.random.10
  - program_interface_query.uniform.random.15
  - program_interface_query.uniform.random.22
  - program_interface_query.uniform.random.24
  - program_interface_query.uniform.random.26
  - program_interface_query.uniform.random.28
  - program_interface_query.uniform.random.3
  - program_interface_query.uniform.random.31
  - program_interface_query.uniform.random.38
  - program_interface_query.uniform.random.5
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94116
  Reviewed-by: Kenneth Graunke
* glsl: store stage reference in gl_uniform_block (Timothy Arceri, 2016-04-02, 5 files, -37/+26)
  This allows us to simplify the code and drop InterfaceBlockStageIndex, which is a per-stage array of integers the size of all blocks in the program combined, including duplicates across stages. Adding a stage ref per block will use less memory.
  Reviewed-by: Kenneth Graunke
* glsl: simplify buffer block resource limit checking (Timothy Arceri, 2016-04-02, 1 file, -55/+32)
  This changes the code to use the buffer counts stored for each stage rather than counting from scratch. It also moves the checks outside of the for loop, which means we now get a single link error message if we go over the max, rather than X error messages where X is the number by which we have exceeded the max.
  Reviewed-by: Kenneth Graunke
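The difference in error counts described above can be illustrated with a small sketch. Both functions are hypothetical reconstructions from the commit message alone, not the linker code; the names and the message text are assumptions:

```c
#include <stdio.h>

/* Shape described as the old behaviour: count while looping over blocks,
 * so every block past the limit produces another error. */
static unsigned sketch_errors_counting_in_loop(unsigned num_blocks,
                                               unsigned max_blocks)
{
    unsigned seen = 0, errors = 0;
    for (unsigned i = 0; i < num_blocks; i++) {
        seen++;
        if (seen > max_blocks)
            errors++; /* one message per excess block */
    }
    return errors;
}

/* Shape described as the new behaviour: compare the count already stored
 * for the stage once, outside any loop, and report at most one error. */
static unsigned sketch_errors_using_stored_count(unsigned num_blocks,
                                                 unsigned max_blocks)
{
    if (num_blocks > max_blocks) {
        fprintf(stderr, "Too many blocks (%u/%u)\n", num_blocks, max_blocks);
        return 1;
    }
    return 0;
}
```

With 15 blocks against a limit of 12, the first shape reports three errors and the second reports one.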
* glsl: simplify SSBO resources check (Timothy Arceri, 2016-04-02, 1 file, -7/+1)
  We already have a count of active SSBOs per stage, so use it.
  Reviewed-by: Kenneth Graunke
* glsl: split buffer block arrays earlier (Timothy Arceri, 2016-04-02, 1 file, -27/+27)
  This will allow us to use them when checking resources in a following patch and clean up a bunch of code.
  Reviewed-by: Kenneth Graunke
* glsl: only set buffer block binding once during initialisation (Timothy Arceri, 2016-04-02, 1 file, -26/+6)
  Since 8683d54d2be825 there is now a single instance of the buffer block information that needs to be updated, rather than one instance for each stage.
  Reviewed-by: Kenneth Graunke
* glsl: Fix program interface query locations biasing for SSO. (Kenneth Graunke, 2016-04-01, 2 files, -13/+16)
  With SSO, the GL_PROGRAM_INPUT and GL_PROGRAM_OUTPUT interfaces refer to the first and last shader stage linked into a program. This may not be the vertex and fragment shader stages. So, subtracting VERT_ATTRIB_GENERIC0 and FRAG_RESULT_DATA0 is bogus.
  We need to subtract VERT_ATTRIB_GENERIC0 for VS inputs, FRAG_RESULT_DATA0 for FS outputs, and VARYING_SLOT_VAR0 for other cases.
  Note that built-in variables get a location of -1.
  Fixes 4 dEQP-GLES31.functional.program_interface_query tests:
  - program_input.location.separable_fragment.var_explicit_location
  - program_input.location.separable_fragment.var_array_explicit_location
  - program_output.location.separable_vertex.var_array_explicit_location
  - program_output.location.separable_vertex.var_array_explicit_location
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
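The bias selection described above can be sketched as follows. This is an illustration only: the SKETCH_ enum values are placeholders rather than Mesa's real constants, the function names are hypothetical, and built-in variables (which get -1) are not modelled:

```c
/* Placeholder bases for the sketch; the real values live in Mesa's headers. */
enum {
    SKETCH_VERT_ATTRIB_GENERIC0 = 16,
    SKETCH_FRAG_RESULT_DATA0    = 4,
    SKETCH_VARYING_SLOT_VAR0    = 32
};

enum sketch_stage { SKETCH_VERTEX, SKETCH_GEOMETRY, SKETCH_FRAGMENT };
enum sketch_iface { SKETCH_PROGRAM_INPUT, SKETCH_PROGRAM_OUTPUT };

/* Pick the base to subtract from a slot, depending on stage and interface. */
static int sketch_location_bias(enum sketch_stage stage,
                                enum sketch_iface iface)
{
    if (stage == SKETCH_VERTEX && iface == SKETCH_PROGRAM_INPUT)
        return SKETCH_VERT_ATTRIB_GENERIC0;   /* VS inputs */
    if (stage == SKETCH_FRAGMENT && iface == SKETCH_PROGRAM_OUTPUT)
        return SKETCH_FRAG_RESULT_DATA0;      /* FS outputs */
    return SKETCH_VARYING_SLOT_VAR0;          /* every other case */
}

/* Convert an internal slot to the location reported to the application. */
static int sketch_query_location(int slot, enum sketch_stage stage,
                                 enum sketch_iface iface)
{
    return slot - sketch_location_bias(stage, iface);
}
```

A separable fragment program's inputs and a separable vertex program's outputs both land in the last branch, which is the case the commit message calls out.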
* glsl: Return -1 for program interface query locations in many cases. (Kenneth Graunke, 2016-04-01, 2 files, -58/+42)
  We were recording locations for all variables, even ones without an explicit location set. Implement the rules from the spec, and record -1 in the resource list accordingly.
  Make program_resource_location stop doing math on negative values. Remove hacks that are no longer necessary now that we've stopped doing that.
  Fixes 4 dEQP-GLES31.functional.program_interface_query tests:
  - program_input.location.separable_fragment.var
  - program_input.location.separable_fragment.var_array
  - program_output.location.separable_vertex.var_array
  - program_output.location.separable_vertex.var_array
  v2: Delete more code
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
* glsl: Consolidate gl_VertexIDMESA -> gl_VertexID query hacks. (Kenneth Graunke, 2016-04-01, 2 files, -19/+10)
  A program will either have gl_VertexID or gl_VertexIDMESA (the lowered zero-based version), not both. Just spoof it in the resource list so the hacks are done in a single place.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
* glsl: Clean up some leftover cruft. (Kenneth Graunke, 2016-04-01, 1 file, -4/+1)
  stages is always 1 << stage now.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
* glsl: Add all system variables to the input resource list. (Kenneth Graunke, 2016-04-01, 2 files, -18/+1)
  System values are just built-in input variables that we've opted to special-case out of convenience. We need to consider all inputs, regardless of how we've classified them.
  Unfortunately, there's one exception: we shouldn't add gl_BaseVertex unless ARB_shader_draw_parameters is enabled, because it doesn't actually exist in the language, and shouldn't be counted in the GL_ACTIVE_RESOURCES query.
  Fixes dEQP-GLES31.functional.program_interface_query.program_input.resource_list.compute.empty, which expects gl_NumWorkGroups to appear in the resource list.
  v2: Delete more code
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
* glsl: Delete hack for VS system values. (Kenneth Graunke, 2016-04-01, 1 file, -4/+0)
  This makes no sense. If the stage being considered is the vertex shader, then we'll add inputs and system values appropriately. If we're not considering the vertex shader, then we absolutely should not do anything with it.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
* glsl: Make add_interface_variables only consider the appropriate stage. (Kenneth Graunke, 2016-04-01, 1 file, -1/+1)
  add_interface_variables() is supposed to add variables for the inputs of the first shader stage linked into a program, and the outputs of the last shader stage linked into a program.
  From the ARB_program_interface_query specification:
    "* PROGRAM_INPUT corresponds to the set of active input variables used by the first shader stage of <program>. If <program> includes multiple shader stages, input variables from any shader stage other than the first will not be enumerated.
     * PROGRAM_OUTPUT corresponds to the set of active output variables (section 2.14.11) used by the last shader stage of <program>. If <program> includes multiple shader stages, output variables from any shader stage other than the last will not be enumerated."
  Previously, we used build_stageref here, which walks over all linked shaders in the program. This meant that internal varyings would be visible. We don't actually need any of build_stageref's code: we already explicitly skip packed varyings, handle modes, and the name comparisons just do a fuzzy string comparison of name with itself.
  Fixes two tests: dEQP-GLES31.functional.program_interface_query.program_{input,output}.referenced_by.referenced_by_vertex_fragment.
  These tests have a VS and FS linked together into a single program. Both stages have an input called "shaderInput". But the FS input should not be visible because it isn't the first stage.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
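The first-stage/last-stage rule quoted above can be sketched with a bitmask of linked stages. The helper names, the mask representation, and the stage count are assumptions made for illustration; this is not the linker's code:

```c
#define SKETCH_NUM_STAGES 6 /* hypothetical number of shader stages */

/* Index of the lowest stage present in the mask, or -1 if none is linked.
 * Only this stage's inputs would be enumerated for PROGRAM_INPUT. */
static int sketch_first_linked_stage(unsigned linked_mask)
{
    for (int stage = 0; stage < SKETCH_NUM_STAGES; stage++) {
        if (linked_mask & (1u << stage))
            return stage;
    }
    return -1;
}

/* Index of the highest stage present in the mask, or -1 if none is linked.
 * Only this stage's outputs would be enumerated for PROGRAM_OUTPUT. */
static int sketch_last_linked_stage(unsigned linked_mask)
{
    for (int stage = SKETCH_NUM_STAGES - 1; stage >= 0; stage--) {
        if (linked_mask & (1u << stage))
            return stage;
    }
    return -1;
}
```

For a program linking stage 0 and stage 4 together, inputs come only from stage 0 and outputs only from stage 4, which matches the "shaderInput" case in the commit message.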
* glsl: Clarify "mask" variable in add_interface_variables(). (Kenneth Graunke, 2016-04-01, 1 file, -5/+5)
  This is a bitfield of which stages refer to a variable. It is not used to mask off bits. In fact, it's used to contribute additional bits.
  Rename it and tidy a bit of the logic.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
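As a small illustration of a bitfield that contributes bits rather than masking them off, here is a sketch under assumed names (sketch_add_stage_ref and sketch_is_referenced_by are hypothetical helpers, not functions from the linker):

```c
#include <stdbool.h>

/* Record that `stage` refers to the variable by OR-ing in its bit. */
static unsigned sketch_add_stage_ref(unsigned stage_refs, unsigned stage)
{
    return stage_refs | (1u << stage);
}

/* Test whether `stage` is among the stages referring to the variable. */
static bool sketch_is_referenced_by(unsigned stage_refs, unsigned stage)
{
    return ((stage_refs >> stage) & 1u) != 0;
}
```

Each call only ever adds a bit, so the value accumulates references instead of filtering anything out.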
* glsl: Pass stage to add_interface_variables(). (Kenneth Graunke, 2016-04-01, 1 file, -5/+5)
  add_interface_variables is supposed to add variables from either the first or last stage of a linked shader. But it has no way of knowing the stage it's being asked to process, which makes it impossible to produce correct stagerefs.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
* glsl: Make vertex ID lowering declare gl_BaseVertex as hidden. (Kenneth Graunke, 2016-04-01, 1 file, -1/+1)
  If the GL_ARB_shader_draw_parameters extension is enabled, we'll already have a gl_BaseVertex variable. It will have var->how_declared set to ir_var_declared_implicitly, and will appear in the program resource list.
  If not, we make one for internal use. We don't want it to be listed in the program resource list, as the application won't be expecting it. Marking it hidden will properly exclude it.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
  Reviewed-by: Ilia Mirkin
* glsl: Exclude ir_var_hidden variables from the program resource list. (Kenneth Graunke, 2016-04-01, 1 file, -1/+1)
  We occasionally generate variables internally that we want to exclude from the program resource list, as applications won't be expecting them to be present.
  The next patch will make use of this.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Timothy Arceri
  Reviewed-by: Ilia Mirkin
* mesa: Make _mesa_choose_tex_format() handle stencil textures. (Kenneth Graunke, 2016-04-01, 1 file, -0/+5)
  This is necessary for ARB_texture_stencil8 support on classic drivers. Presumably Gallium works because it implements its own ChooseTexFormat.
  Signed-off-by: Kenneth Graunke
  Reviewed-by: Jason Ekstrand
* glsl: Don't require matching centroid qualifiers (Jordan Justen, 2016-04-01, 1 file, -1/+10)
  Note: This patch appears to violate older OpenGL and OpenGLES specs.
  The OpenGLES GLSL 3.1 and OpenGL GLSL 4.3 specifications both remove the requirement for the output and input centroid qualifiers to match.
  The deqp dEQP-GLES3.functional.shaders.linkage.varying.rules.differing_interpolation_2 test wants the newer OpenGLES 3.1 specification behavior, even for OpenGLES 3.0. This patch simply removes the checking in all cases.
  The OpenGLES 3.0 conformance test suite doesn't appear to require the older ("must match") spec behavior.
  For reference, here are the relevant spec citations:
  The OpenGL 4.2 spec says:
    "the last active shader stage output variables and fragment shader input variables of the same name must match in type and qualification (other than out matching to in)"
  The OpenGL 4.3 spec says:
    "interpolation qualification (e.g., flat) and auxiliary qualification (e.g. centroid) may differ."
  The OpenGLES GLSL 3.00.4 specification says:
    "The output of the vertex shader and the input of the fragment shader form an interface. For this interface, vertex shader output variables and fragment shader input variables of the same name must match in type and qualification (other than precision and out matching to in)."
  The OpenGLES GLSL 3.10 Specification says:
    "interpolation qualification (e.g., flat) and auxiliary qualification (e.g. centroid) may differ"
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
  Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7819
  Signed-off-by: Jordan Justen
  Cc: Ian Romanick
  Reviewed-by: Kenneth Graunke
* gallium: distinguish between shader IR in get_compute_param (Bas Nieuwenhuizen, 2016-04-02, 12 files, -39/+54)
  For radeonsi, native and TGSI use different compilers, and this results in different limits for different IRs.
  The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few other shader-related ones that seemed like they would also typically depend on the compiler.
  Signed-off-by: Bas Nieuwenhuizen
  Reviewed-by: Dave Airlie
* gallium: add global buffer memory barrier bit (Bas Nieuwenhuizen, 2016-04-02, 2 files, -0/+3)
  Currently radeonsi synchronizes after every dispatch and Clover does nothing to synchronize. This is overzealous, especially with GL compute, so add a barrier for global buffers.
  Signed-off-by: Bas Nieuwenhuizen
  Reviewed-by: Dave Airlie
* gallium: add threads per block TGSI property (Bas Nieuwenhuizen, 2016-04-02, 4 files, -1/+31)
  The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct.
  Signed-off-by: Bas Nieuwenhuizen
  Reviewed-by: Ilia Mirkin
  Reviewed-by: Dave Airlie
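The "0 means unknown" convention can be sketched like this. The struct and helper names below are hypothetical and stand in for whatever a driver keeps after scanning a shader; they are not the tgsi_scan_shader API:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-shader info a driver might fill in while scanning. */
struct sketch_compute_info {
    unsigned block_width; /* 0 means the shader did not declare it */
};

/* Scan step: zero-initialize, then copy the property only if declared.
 * `declared_width` is NULL when the shader carries no such property. */
static struct sketch_compute_info
sketch_scan_shader(const unsigned *declared_width)
{
    struct sketch_compute_info info;
    memset(&info, 0, sizeof(info));
    if (declared_width)
        info.block_width = *declared_width;
    return info;
}

/* Because 0 is the "unknown" value, no separate presence flag is needed. */
static bool sketch_block_width_is_known(struct sketch_compute_info info)
{
    return info.block_width != 0;
}
```

A zero-initialized struct therefore already encodes "property missing" without any extra detection logic.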
* gallium: add compute shader IR type (Bas Nieuwenhuizen, 2016-04-02, 5 files, -1/+7)
  Signed-off-by: Bas Nieuwenhuizen
  Reviewed-by: Ilia Mirkin
  Reviewed-by: Dave Airlie
* glsl: remove tabs and fix some other style issues in glcpp-parse.y (Timothy Arceri, 2016-04-02, 1 file, -1424/+1352)
  Note there are still tabs left in the parser rules.
  Acked-by: Dave Airlie
* i965: Add an implementation of nir_op_fquantize2f16 (Jason Ekstrand, 2016-04-01, 2 files, -0/+53)
  Reviewed-by: Matt Turner
* nir: Add an opcode for stomping a 32-bit value to 16-bit precision (Jason Ekstrand, 2016-04-01, 1 file, -0/+1)
  This correlates directly to the SPIR-V opcode OpQuantizeToF16.
  Reviewed-by: Rob Clark
* nvc0: enable compute shaders on GK104 and GM107+ (Samuel Pitoiset, 2016-04-01, 1 file, -1/+2)
  Compute support on GK110 is still unstable for weird reasons, but this can be fixed later as the NVF0_COMPUTE envvar prevents using compute.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
* nvc0: bump the maximum number of UBOs for compute on Kepler (Samuel Pitoiset, 2016-04-01, 2 files, -3/+0)
  The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS) per compute program must be at least 12.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
* nvc0/ir: do not lower shared+atomics on GM107+ (Samuel Pitoiset, 2016-04-01, 1 file, -7/+10)
  For Maxwell, the ATOMS instruction can be used to perform atomic operations on shared memory instead of this load/store lowering pass.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
* nvc0/ir: add atomics support on shared memory for Kepler (Samuel Pitoiset, 2016-04-01, 2 files, -1/+108)
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
* nvc0/ir: fix wrong pred emission for ld lock on GK104 (Samuel Pitoiset, 2016-04-01, 1 file, -1/+4)
  This fixes 84b9b8f (nvc0/ir: add missing emission of locked load predicate).
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
* nvc0/ir: add support for compute UBOs on Kepler (Samuel Pitoiset, 2016-04-01, 2 files, -1/+57)
  Make sure to avoid out-of-bounds access in the presence of indirect array indexing by loading the size from the driver constant buffer.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
* nvc0: add indirect compute support on Kepler (Samuel Pitoiset, 2016-04-01, 1 file, -34/+77)
  The grid size is stored as three 32-bit integers in the indirect buffer, but the launch descriptor uses a single 32-bit integer for both griddim_y and griddim_z, like this: (z << 16) | y. To make it work, the 16 high bits of griddim_y are overwritten by griddim_z.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
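The (z << 16) | y packing can be written out as a short sketch. The function name is hypothetical, and masking y to its low 16 bits is an assumption that models "the 16 high bits of griddim_y are overwritten"; the driver does this on the GPU side rather than in C:

```c
#include <stdint.h>

/* Combine griddim_y and griddim_z into one 32-bit descriptor word:
 * z occupies the high 16 bits, y the low 16 bits. Any high bits of y
 * are dropped, since that half of the word is overwritten by z. */
static uint32_t sketch_pack_griddim_yz(uint32_t griddim_y, uint32_t griddim_z)
{
    return (griddim_z << 16) | (griddim_y & 0xffffu);
}
```

For example, y = 2 and z = 3 pack to 0x00030002.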
* nvc0: reduce likelihood of collision for real buffers on Kepler (Samuel Pitoiset, 2016-04-01, 1 file, -2/+2)
  Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers.
  Suggested by Ilia Mirkin.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin
* nvc0: store ubo info to the driver constbuf on Kepler (Samuel Pitoiset, 2016-04-01, 4 files, -1/+30)
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Ilia Mirkin