path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: fix struct type in commentTimothy Arceri2016-04-111-1/+1
| | | | Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965: enable OES_texture_buffer on gen7+Ilia Mirkin2016-04-101-0/+1
| | | | | | | | | It will only end up getting exposed on gen8+ since it requires GL ES 3.1, but it should be ready to go on gen7 when support for GL ES 3.1 is completed there. Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Kenneth Graunke <[email protected]>
* i965/disasm: Decode per-slot offsets.Kenneth Graunke2016-04-091-0/+5
| | | | | | | We just never bothered to decode this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* i965/disasm: Decode "channel mask present" bit correctly.Kenneth Graunke2016-04-091-4/+15
| | | | | | | | Bit 15 means "interleave" for most messages, but for SIMD8 messages it means "use channel masks". Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
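The bit 15 rule described above can be sketched as a small decode helper. This is only an illustration: the function name, the label strings, and the `is_simd8` parameter are assumptions made for the example, not the disassembler's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pick a label for bit 15 of a URB message
 * descriptor. For SIMD8 messages the bit means "use channel masks";
 * for all other messages it means "interleave". */
static const char *urb_bit15_label(uint32_t desc, bool is_simd8)
{
   bool bit15 = ((desc >> 15) & 1u) != 0;

   if (!bit15)
      return "";

   return is_simd8 ? "masked" : "interleave";
}
```

The point of the sketch is that the same bit needs the message type as context before it can be printed correctly.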
* i965/disasm: Simplify the URB opcode printing with ?:.Kenneth Graunke2016-04-091-7/+6
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* i965/tiled_memcopy: Get rid of the direction parameter to get_memcpyJason Ekstrand2016-04-085-22/+5
| | | | | | | | | Now that we can use the much simpler rgba8_copy function, we don't need to hand different functions out based on direction. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/tiled_memcpy: Rework the RGBA -> BGRA mem_copy functionsJason Ekstrand2016-04-081-76/+63
| | | | | | | | | | | | | This splits the two copy functions into three: One for unaligned copies, one for aligned sources, and one for aligned destinations. Thanks to the previous commit, we are now guaranteed that the aligned ones will *only* operate on aligned memory so they should be safe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93962 Cc: "11.1 11.2" <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Chad Versace <[email protected]>
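As a rough illustration of the unaligned variant mentioned above, a byte-wise RGBA to BGRA copy could look like the sketch below. The function names are hypothetical, and the real aligned variants would presumably use wider loads and stores; the alignment predicate only shows the kind of guarantee those variants rely on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate: true if p is 16-byte aligned. */
static bool is_16_byte_aligned(const void *p)
{
   return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical portable copy that swaps the R and B channels of each
 * 4-byte pixel. It makes no alignment assumptions about src or dst. */
static void rgba8_to_bgra8_unaligned(uint8_t *dst, const uint8_t *src,
                                     size_t bytes)
{
   assert(bytes % 4 == 0);

   for (size_t i = 0; i < bytes; i += 4) {
      dst[i + 0] = src[i + 2]; /* B */
      dst[i + 1] = src[i + 1]; /* G */
      dst[i + 2] = src[i + 0]; /* R */
      dst[i + 3] = src[i + 3]; /* A */
   }
}
```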
* i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functionsJason Ekstrand2016-04-081-32/+43
| | | | | | | | | | | | | | | | | | Each of the [de]tiling functions has three mem_copy calls: 1) Left edge to tile boundary 2) Tile boundary to tile boundary in a loop 3) Tile boundary to right edge Copies 2 and 3 start at a tile edge so the pointer to tiled memory is guaranteed to be at least 16-byte aligned. Copy 1, on the other hand, starts at some arbitrary place in the tile so it doesn't have any such alignment guarantees. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Chad Versace <[email protected]>
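The three copies listed above amount to splitting a horizontal span at tile boundaries. A minimal sketch of that split follows, assuming a power-of-two tile width in bytes; `struct span_split`, `split_span`, and `tile_w` are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result: [x0, x1) is the left edge, [x1, x2) covers whole
 * tiles, and [x2, x3) is the right edge. Any piece may be empty. */
struct span_split {
   uint32_t x0, x1, x2, x3;
};

/* Split the byte span [x0, x3) at tile boundaries. tile_w must be a
 * power of two. Only the first piece can start at an arbitrary offset;
 * the other two always start on a tile boundary. */
static struct span_split split_span(uint32_t x0, uint32_t x3, uint32_t tile_w)
{
   struct span_split s;

   s.x0 = x0;
   s.x3 = x3;
   s.x1 = (x0 + tile_w - 1) & ~(tile_w - 1); /* round x0 up   */
   s.x2 = x3 & ~(tile_w - 1);                /* round x3 down */

   /* The whole span lies inside one tile: only the left edge remains. */
   if (s.x1 > x3) {
      s.x1 = x3;
      s.x2 = x3;
   }

   return s;
}
```

Because the second and third pieces begin at `x1` and `x2`, which are multiples of the tile width, a caller could hand them to a copy function that requires aligned tiled memory, while the first piece needs the unaligned one.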
* i965: Check eu/subslices are > 0Ben Widawsky2016-04-081-1/+1
| | | | | | | | Now that the check is restricted to gen8+, we should always get back a non-zero positive value for the EU and subslice counts. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix eu/subslice warningBen Widawsky2016-04-081-11/+23
| | | | | | | | | | | Older gen platforms do not actually return a value for subslice and EU total; (IMO, confusingly) they return -ENODEV. This patch defers the SSEU setup until we have the actual GPU generation to avoid useless warnings when running on older platforms with older kernels. Reported-by: Mark Janes <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
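The effect of deferring the check until the generation is known can be sketched as a single predicate. The function name and parameters are hypothetical and stand in for whatever the driver actually queries.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical predicate: decide whether a missing EU/subslice count
 * deserves a warning. Older generations are expected to report -ENODEV,
 * so only gen8+ results are checked, and there any value that is not
 * strictly positive is suspicious. */
static bool should_warn_about_sseu(int gen, int query_result)
{
   if (gen < 8)
      return false;

   return query_result <= 0;
}
```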
* i965: Extract SSEU configuration infoBen Widawsky2016-04-081-14/+21
| | | | | Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: fix glReadBuffer() assertion failureBrian Paul2016-04-081-0/+2
| | | | | | | | | | | | | | | | | | If the first call in a GL app is glReadPixels(GL_FRONT) we'd fail the assert(st->ctx->FragmentProgram._Current) at st_atom_shader.c:114 in update_fp(). This is because we were calling st_validate_state() without first updating Mesa state with _mesa_update_state(). The regression came from commit 83b589301f4a150f4 "st/mesa: fix frontbuffer glReadPixels regressions". The new piglit gl-1.0-simple-readbuffer test exercises this. Cc: "11.1 11.2" <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* st/glsl_to_tgsi: make samplers_used an uint32_t (v2)Nicolai Hähnle2016-04-071-3/+5
| | | | | | | | | | | | | It is used as a bitfield, so it seems cleaner to keep it unsigned. The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. v2: add an assert for bitfield size and use 1u << idx Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> (v1) Reviewed-by: Marek Olšák <[email protected]> (v1)
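A minimal sketch of the pattern, assuming a 32-bit mask; the helper name is made up for illustration and is not the function touched by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mark sampler idx as used in a 32-bit mask.
 * 1u is unsigned, so shifting it into bit 31 is well defined, whereas
 * (1 << 31) on a 32-bit signed int is undefined behavior in C. */
static uint32_t mark_sampler_used(uint32_t samplers_used, unsigned idx)
{
   assert(idx < 32); /* the mask only has 32 bits */
   return samplers_used | (1u << idx);
}
```

The assert mirrors the v2 note: the index must fit in the bitfield before the shift is performed.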
* mesa/st: Update framebuffer state with no.of samples,layersEdward O'Callaghan2016-04-071-3/+5
| | | | | | | | Handle the case of ARB_framebuffer_no_attachments. Also, kill off a dead debug printf() call while we are here. Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa/st: Set _NumSamples in update_framebuffer_state()Edward O'Callaghan2016-04-071-0/+46
| | | | | | | | | | | | | | | Using PIPE_FORMAT_NONE to indicate what MSAA modes are supported with a framebuffer using no attachment. V.2: Rewrite MSAA mode loop to be more general. V.3: Move comment to right place after loop was rewritten. V.4: [airlied] remove unneeded variable, and assert, and unneeded pipe assignment Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: Obtain ARB_framebuffer_no_attachment constantsEdward O'Callaghan2016-04-071-0/+28
| | | | | | | | | | | | | | | | | | Set default values for the constants required in ARB_framebuffer_no_attachments and obtain the number of layers from ``PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS``. We also obtain the MaxFramebufferSamples value using a query back to the driver for PIPE_FORMAT_NONE. V.1: Merge if branch predicates into one branch. Move const init into st_init_limits() [airlied: whitespace fixup] Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa/st: Use _mesa_geometric_ functions appropriatelyEdward O'Callaghan2016-04-074-8/+15
| | | | | | | | | | | | | | | | | | | | Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometric_ convenience functions for those places where the geometry of the gl_framebuffer is needed. This is in contrast to the geometry of the intersection of the attachments of the gl_framebuffer. This patch paves the way to enable GL_ARB_framebuffer_no_attachments for all gallium drivers. V.2: Remove intermediate variable state. Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: Add comment to framebuffer_parameteri()Edward O'Callaghan2016-04-071-0/+5
| | | | | | | | | V.2: Change 'N.B.,' to 'NOTE:'. Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* i965/sf_state: Pull flat_enables out of prog_dataJason Ekstrand2016-04-064-27/+5
| | | | | | | | | | Previously, we were walking over the shader source to figure out which inputs should be marked flat. Now, we can just pull it out of prog_data. This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it also means that it will get properly cached. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add a flat_inputs field to prog_dataJason Ekstrand2016-04-062-0/+37
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* brw/device_info: Add a helper for getting a device nameJason Ekstrand2016-04-062-0/+13
| | | | | | | This is needed by the Vulkan driver Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs_surface_builder: Mask signed integers after conversionJason Ekstrand2016-04-061-0/+18
| | | | | Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Make the repclear shader support either a uniform or a flat inputJason Ekstrand2016-04-061-5/+18
| | | | | | | | | | In the Vulkan driver we use a single flat input instead of a uniform because setting up push constants is more disruptive to the pipeline than setting up another vertex input. This uses the number of uniforms as a key to keep it working for the GL driver. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move get_hw_prim_for_gl_prim to brw_util.cJason Ekstrand2016-04-062-29/+28
| | | | | | | | It's used by brw_compile_gs in brw_vec4_gs_visitor.cpp so it needs to be in a file that's linked into libi965_compiler.la. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* drirc: add a workaround for blackness in WarsowMarek Olšák2016-04-061-0/+8
| | | | Cc: 11.1 11.2 <[email protected]>
* mesa: remove unused IsShaderStorage fieldTimothy Arceri2016-04-061-5/+0
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* glsl: fully split apart buffer block arraysTimothy Arceri2016-04-065-58/+19
| | | | | | | | | | | | With this change we create the UBO and SSBO arrays separately from the beginning rather than putting them into a combined array and splitting it apart later. A bug with UBO and SSBO stage reference querying is also fixed, as we now use the block index to look up the references in the separate arrays, not the combined buffer block array. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965/fs: Move the code for load/store_shared to emit_cs_intrinsicJason Ekstrand2016-04-041-76/+76
| | | | | | | They are compute-shader only and that's where the code for doing atomics on shared variables lives, so it seems to make sense. Reviewed-by: Jordan Justen <[email protected]>
* i965/nir: Provide a default LOD for buffer texturesJason Ekstrand2016-04-042-0/+8
| | | | | | | | | Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all. Reviewed-by: Jordan Justen <[email protected]>
* i965: Fix invalid pointer read in dead_control_flow_eliminate().Kenneth Graunke2016-04-041-0/+4
| | | | | | | | | | | There may not be a previous block. In this case, there's no real work to do, so just continue on to the next one. v2: Update for bblock->prev() API change. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Make bblock_t::next and friends return NULL at sentinels.Kenneth Graunke2016-04-042-1/+13
| | | | | | | | | The bblock_t::prev/prev_const/next/next_const API returns bblock_t pointers, rather than exec_nodes. So it's a bit surprising. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
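The idea of returning NULL instead of a sentinel can be sketched with a plain C doubly linked list that has head and tail sentinels. All names here are hypothetical; the real code is C++ and built on Mesa's own list types.

```c
#include <assert.h>
#include <stddef.h>

struct node {
   struct node *next, *prev;
};

/* Hypothetical list with two sentinels: head.prev and tail.next are
 * NULL, which is what identifies a node as a sentinel. */
struct list {
   struct node head, tail;
};

static void list_init(struct list *l)
{
   l->head.prev = NULL;
   l->head.next = &l->tail;
   l->tail.prev = &l->head;
   l->tail.next = NULL;
}

static void list_push_tail(struct list *l, struct node *n)
{
   n->next = &l->tail;
   n->prev = l->tail.prev;
   l->tail.prev->next = n;
   l->tail.prev = n;
}

/* Return the following real node, or NULL if n is the last one. */
static struct node *node_next_or_null(struct node *n)
{
   return n->next->next != NULL ? n->next : NULL;
}

/* Return the preceding real node, or NULL if n is the first one. */
static struct node *node_prev_or_null(struct node *n)
{
   return n->prev->prev != NULL ? n->prev : NULL;
}
```

A pass that inspects the previous block can then test the result against NULL and skip the first block, rather than treating the sentinel as if it were a block.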
* i965/peephole_ffma: Only match a mul+add if none of the ops are exactJason Ekstrand2016-04-041-0/+11
| | | | Reviewed-by: Ian Romanick <[email protected]>
* i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range.Kenneth Graunke2016-04-044-4/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.99997, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.00006 to about 0.00003. When enabled, fixes 16 dEQP precision tests, dEQP-GLES31.functional.shaders.builtin_functions.precision.{cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec3,vec4}, at the cost of making every sin and cos call more expensive (about twice the number of cycles on recent hardware). Enabling this option has been shown to reduce GPUTest Volplosion performance by about 10%. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
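The arithmetic behind the workaround can be checked with a few lines of C. The function name is hypothetical, and the worst-case input 1.000027 is taken from the commit message rather than measured.

```c
#include <assert.h>

/* Hypothetical model of the workaround: scale a raw SIN/COS result by
 * 0.99997 so that the reported worst case (1.000027) lands back inside
 * [-1.0, 1.0]. */
static float precise_trig_scale(float hw_result)
{
   return hw_result * 0.99997f;
}
```

With the worst case quoted above, 1.000027 * 0.99997 is roughly 0.999997, which is inside the valid range. The cost is one extra multiply per sin or cos call.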
* i965: Allow 8x MSAA on >= 64bpp formats on Gen8+.Kenneth Graunke2016-04-041-1/+2
| | | | | | | | | | | See commit 3b0279a69 - this restriction is documented in the "Surface Format" field of RENDER_SURFACE_STATE. Looking at newer documentation, this restriction appears to exist on Haswell, but no longer applies on Gen8+. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* mesa/get: fix MAX_GEOMETRY_SHADER_STORAGE_BLOCKSDave Airlie2016-04-041-1/+1
| | | | | | | this was returning the fragment shader value. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: expose EXT_base_instance in ES3 contextsIlia Mirkin2016-04-033-1/+7
| | | | | | | | This extension is identical to ARB_base_instance. Reuse the same entrypoints. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* mesa: expose EXT_polygon_offset_clamp in ES contextsIlia Mirkin2016-04-033-4/+10
| | | | | | | | The extension spec was extended to also support ES. This functionality is provided all the way back to ES 1.0. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* mesa: add always-false-for-now enables for GL 4.3, 4.4, 4.5.Ilia Mirkin2016-04-031-2/+49
| | | | | | | | | As the relevant extensions get implemented, the lines should be uncommented. I believe this is (almost) everything needed for those GL versions though. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* mesa: add ES3_1_compatibility extension enableIlia Mirkin2016-04-032-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* mesa: remove unrequired elseTimothy Arceri2016-04-031-42/+39
| | | | | | The if always returns so no need for an else. Reviewed-by: Brian Paul <[email protected]>
* glsl: store stage reference in gl_uniform_blockTimothy Arceri2016-04-023-15/+4
| | | | | | | | | This allows us to simplify the code and drop InterfaceBlockStageIndex which is a per stage array of integers the size of all blocks in the program combined including duplicates across stages. Adding a stage ref per block will use less memory. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Fix program interface query locations biasing for SSO.Kenneth Graunke2016-04-011-8/+3
| | | | | | | | | | | | | | | | | | | | | With SSO, the GL_PROGRAM_INPUT and GL_PROGRAM_OUTPUT interfaces refer to the first and last shader stage linked into a program. This may not be the vertex and fragment shader stages. So, subtracting VERT_ATTRIB_GENERIC0 and FRAG_RESULT_DATA0 is bogus. We need to subtract VERT_ATTRIB_GENERIC0 for VS inputs, FRAG_RESULT_DATA0 for FS outputs, and VARYING_SLOT_VAR0 for other cases. Note that built-in variables get a location of -1. Fixes 4 dEQP-GLES31.functional.program_interface_query tests: - program_input.location.separable_fragment.var_explicit_location - program_input.location.separable_fragment.var_array_explicit_location - program_output.location.separable_vertex.var_array_explicit_location - program_output.location.separable_vertex.var_array_explicit_location Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
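The biasing rule described above can be sketched as follows. The enum names and the three base values are placeholders invented for this example and do not match Mesa's actual definitions; only the selection logic is the point.

```c
#include <assert.h>

/* Placeholder stage and interface enums (hypothetical). */
enum stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT };
enum iface { IFACE_PROGRAM_INPUT, IFACE_PROGRAM_OUTPUT };

/* Placeholder base slots; NOT the real VERT_ATTRIB_GENERIC0,
 * FRAG_RESULT_DATA0 or VARYING_SLOT_VAR0 values. */
#define BASE_VERT_ATTRIB_GENERIC0 16
#define BASE_FRAG_RESULT_DATA0     4
#define BASE_VARYING_SLOT_VAR0    32

/* Convert an internal slot into the location reported to the API.
 * Negative slots (built-ins, no explicit location) report -1. */
static int query_location(int slot, enum stage stage, enum iface iface)
{
   if (slot < 0)
      return -1;

   if (stage == STAGE_VERTEX && iface == IFACE_PROGRAM_INPUT)
      return slot - BASE_VERT_ATTRIB_GENERIC0;

   if (stage == STAGE_FRAGMENT && iface == IFACE_PROGRAM_OUTPUT)
      return slot - BASE_FRAG_RESULT_DATA0;

   /* Every other case is a varying between stages. */
   return slot - BASE_VARYING_SLOT_VAR0;
}
```

With separable programs the first stage may be a fragment shader, whose inputs are varyings, so the base must be chosen from the stage and interface together rather than from the interface alone.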
* glsl: Return -1 for program interface query locations in many cases.Kenneth Graunke2016-04-011-53/+9
| | | | | | | | | | | | | | | | | | | We were recording locations for all variables, even ones without an explicit location set. Implement the rules from the spec, and record -1 in the resource list accordingly. Make program_resource_location stop doing math on negative values. Remove hacks that are no longer necessary now that we've stopped doing that. Fixes 4 dEQP-GLES31.functional.program_interface_query tests: - program_input.location.separable_fragment.var - program_input.location.separable_fragment.var_array - program_output.location.separable_vertex.var_array - program_output.location.separable_vertex.var_array v2: Delete more code Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: Consolidate gl_VertexIDMESA -> gl_VertexID query hacks.Kenneth Graunke2016-04-011-17/+0
| | | | | | | | | A program will either have gl_VertexID or gl_VertexIDMESA (the lowered zero-based version), not both. Just spoof it in the resource list so the hacks are done in a single place. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: Add all system variables to the input resource list.Kenneth Graunke2016-04-011-8/+1
| | | | | | | | | | | | | | | | | | | | System values are just built-in input variables that we've opted to special-case out of convenience. We need to consider all inputs, regardless of how we've classified them. Unfortunately, there's one exception: we shouldn't add gl_BaseVertex unless ARB_shader_draw_parameters is enabled, because it doesn't actually exist in the language, and shouldn't be counted in the GL_ACTIVE_RESOURCES query. Fixes dEQP-GLES31.functional.program_interface_query.program_input.resource_list.compute.empty, which expects gl_NumWorkGroups to appear in the resource list. v2: Delete more code Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Make _mesa_choose_tex_format() handle stencil textures.Kenneth Graunke2016-04-011-0/+5
| | | | | | | | This is necessary for ARB_texture_stencil8 support on classic drivers. Presumably Gallium works because it implements its own ChooseTexFormat. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* gallium: distinguish between shader IR in get_compute_paramBas Nieuwenhuizen2016-04-021-6/+7
| | | | | | | | | | | | | For radeonsi, native and TGSI use different compilers and this results in different limits for different IRs. The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few others as shader related that seemed like they would also typically depend on the compiler. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add threads per block TGSI propertyBas Nieuwenhuizen2016-04-021-0/+18
| | | | | | | | | | The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add compute shader IR typeBas Nieuwenhuizen2016-04-021-0/+1
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* i965: Add an implementation of nir_op_fquantize2f16Jason Ekstrand2016-04-012-0/+53
| | | | Reviewed-by: Matt Turner <[email protected]>