aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* glsl/cs: Add gl_NumWorkGroups as a system valueJordan Justen2015-09-292-1/+2
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Setup surface binding for gl_NumWorkGroupsJordan Justen2015-09-295-1/+53
| | | | | | | | | | | | | | | | | | | | | | | | | This will only be setup when the prog_data uses_num_work_groups boolean is set. At this point nothing will set uses_num_work_groups, but soon code will set it when emitting code for the intrinsic that loads gl_NumWorkGroups. We can't emit this surface information earlier at the start of the DispatchCompute* call because we may not have generated the program yet. Until we generate the program, we don't know if the gl_NumWorkGroups variable is accessed. We also can't emit the surface as part of the brw_cs_state atom, because we might not need the surface if gl_NumWorkGroups is not used by the program. Lastly, we cannot emit the surface later (after state upload) in the DispatchCompute* call, because it needs to be run before the brw_cs_state atom is emitted, since it changes the surface state. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Add a binding table entry for gl_NumWorkGroupsJordan Justen2015-09-293-5/+29
| | | | | | | | | | | | | If glDispatchComputeIndirect is used, then the value for this variable must be read from the indirect BO. To allow the same generated code to support indirect and glDispatchCompute, we will also setup a BO for the number of work groups using the intel_upload_data mechanism. This will only be required if the gl_NumWorkGroups variable is accessed. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Store compute invocation information in brw contextJordan Justen2015-09-292-24/+37
| | | | | | | | We will need this in an atom to setup a surface to read the gl_NumWorkGroups values from. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Re-emit cs_state when surfaces have changedJordan Justen2015-09-291-1/+2
| | | | | | | | | Unlike rendering (BINDING_TABLE_POINTERS_*S), compute doesn't have a binding table pointers command. Instead it is part of the MEDIA_INTERFACE_DESCRIPTOR structure loaded by the brw_cs_state atom. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/cs: Re-emit push constants and cs_state on new batchesJordan Justen2015-09-291-2/+4
| | | | | | | | | | We need to re-emit push constansts when a new batch is started since the push constants are stored in the batch. We also need to re-emit the MEDIA_INTERFACE_DESCRIPTOR (in brw_cs_state) since it is stored in the batch. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* mesa/cs: Add MESA_VERBOSE=api support in DispatchCompute*Jordan Justen2015-09-291-0/+7
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* util: Fix strndup prototype on C++.Jose Fonseca2015-09-292-1/+10
| | | | Trivial.
* mesa: fix ARRAY_SIZE query for GetProgramResourceivTapani Pälli2015-09-293-43/+62
| | | | | | | | | | | | | | | | Patch also refactors name length queries which were using array size in computation, this has to be done in same time to avoid regression in arb_program_interface_query-resource-query Piglit test. Fixes rest of the failures with ES31-CTS.program_interface_query.no-locations v2: make additional check only for GS inputs v3: create helper function for resource name length so that it gets calculated only in one place Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Martin Peres <[email protected]>
* glsl: Fix forward NULL dereference coverity warningIago Toral Quiroga2015-09-291-7/+6
| | | | | | | | | | | | The comment says that it should be impossible for decl_type to be NULL here, so don't try to handle the case where it is, simply add an assert. >>> CID 1324977: Null pointer dereferences (FORWARD_NULL) >>> Comparing "decl_type" to null implies that "decl_type" might be null. No piglit regressions observed. Reviewed-by: Timothy Arceri <[email protected]>
* glsl: Fix null return coverity warningIago Toral Quiroga2015-09-291-4/+6
| | | | | | | | | | | | | | Add an assert on the result of as_dereference() not being NULL: >>> CID 1324978: Null pointer dereferences (NULL_RETURNS) >>> Dereferencing a null pointer "deref_record->record->as_dereference()". Since we are introducing a new variable to hold the result of as_dereference(), take the opportunity to rename deref_record_type to interface_type and just name the new variable interface_deref, which is less confusing. Reviewed-by: Kristian Høgsberg <[email protected]>
* glsl: Fix unused value warning reported by CoverityIago Toral Quiroga2015-09-291-2/+0
| | | | | | | | | | We don't use param in this part of the code, so no point in advancing the pointer forward: >>> CID 1324983: Code maintainability issues (UNUSED_VALUE) >>> Assigning value from "param->get_next()" to "param" here, but that stored value is overwritten before it can be used. Reviewed-by: Kristian Høgsberg <[email protected]>
* util: implement strndup for WIN32Samuel Iglesias Gonsalvez2015-09-294-0/+84
| | | | | | | | | | | v2: - Add strndup.h to Makefile.sources (Emil) - Use calloc instead of malloc (Emil). - Check if allocation fails (Emil, Jose) - Add '#pragma once' and include stdlib.h to strndup.h (Jose) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92124 Reviewed-by: Jose Fonseca <[email protected]>
* glsl: use correct number of uniform blocks in error messageSamuel Iglesias Gonsalvez2015-09-291-1/+1
| | | | | Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa: rename gl_shader_program's NumUniformBlocks to NumBufferInterfaceBlocksSamuel Iglesias Gonsalvez2015-09-2911-26/+26
| | | | | | | | | | | Because it counts shader storage blocks too. v2: - Use NumBufferInterfaceBlocks instead (Jordan). Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* main: fix ACTIVE_UNIFORM_BLOCKS valueSamuel Iglesias Gonsalvez2015-09-291-1/+5
| | | | | | | | | | | | | | NumUniformBlocks also counts shader storage blocks. NumUniformBlocks variable will be renamed in a later patch to avoid misunderstandings. v2: - Modify the condition to use !IsShaderStorage and the list of uniform blocks (Timothy) Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* docs: add news item and link release notes for 11.0.2Emil Velikov2015-09-292-0/+7
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 11.0.2Emil Velikov2015-09-291-1/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 4c0b48461269d3ede0c5446d86ebe3e81f16788e)
* docs: add release notes for 11.0.2Emil Velikov2015-09-291-0/+84
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 51e0b06d9916e126060c0d218de1aaa4e5a4ce26)
* i965/gen9: Add a condition for starting pixel in fast copy blitAnuj Phogat2015-09-281-0/+4
| | | | | | | | | | | | This condition restricts the use of fast copy blit to cases where starting pixel of src and dst is oword (16 byte) aligned. Many piglit tests (if using fast copy blit in Mesa) failed earlier because I missed adding this condition.Fast copy blit is currently enabled for use only with Yf/Ys tiling. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* nouveau: wait to unref the transfer's bo until it's no longer usedIlia Mirkin2015-09-281-2/+3
| | | | | | | | | | The bo will often come from a slab in which case it doesn't matter. But for larger allocations this will be in its own bo, and we have to make sure to wait until it's no longer used in order for it to be freed. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Tested-by: Marcin Ślusarz <[email protected]>
* nouveau: delay deleting buffer with unflushed fenceIlia Mirkin2015-09-282-2/+10
| | | | | | | | | | | If there is an unflushed fence on the bo, then the resource may still be used in commands built up in the local pushbuf. Flushing can cause all sorts of unwanted effects, so just free the bo when the relevant fence is hit. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Tested-by: Marcin Ślusarz <[email protected]>
* nouveau: be more careful about freeing temporary transfer buffersIlia Mirkin2015-09-285-4/+30
| | | | | | | | | Deleting a buffer does not flush the command stream. Make sure that we wait for the copies to finish before deleting the temporary bo. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Tested-by: Marcin Ślusarz <[email protected]>
* i965: Rename intel_miptree_get_dimensions_for_image()Anuj Phogat2015-09-285-10/+14
| | | | | | | | | | | This function isn't specific to miptrees. So, drop the "miptree" from function name. V3: Add a comment explaining how the 1D Array texture height and depth is interpreted by Intel hardware. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/gen9: Fix {src, dst}_pitch alignment check for XY_FAST_COPY_BLTAnuj Phogat2015-09-281-11/+7
| | | | | | | | | I misinterpreted the alignmnet restriction in XY_FAST_COPY_BLT earlier. Instead of checking pitch for 64KB alignmnet we need to check it for tile widh alignment. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Fix {src, dst}_pitch alignment check for XY_SRC_COPY_BLTAnuj Phogat2015-09-281-2/+7
| | | | | | | | | | | | | | Current code checks the alignment restrictions only for Y tiling. From Broadwell PRM vol 10: "pitch is of 512Byte granularity for Tile-X: This means the tiled-x surface pitch can be (512, 1024, 1536, 2048...)/4 (in Dwords)." This patch adds the restriction for X tiling as well. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Move conversion of {src, dst}_pitch to dwords outside if/elseAnuj Phogat2015-09-281-16/+9
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Delete temporary variable 'src_pitch'Anuj Phogat2015-09-281-5/+1
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Use helper function intel_get_tile_dims() in surface setupAnuj Phogat2015-09-281-2/+12
| | | | | | | | | | It takes care of using the correct tile width if we later use other tiling patterns for aux miptree. V2: Remove the comment about using Yf for aux miptree. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Use intel_get_tile_dims() to get tile masksAnuj Phogat2015-09-284-33/+28
| | | | | | | | | | | | | | | This will require change in the parameters passed to intel_miptree_get_tile_masks(). V2: Rearrange the order of parameters. (Ben) Change the name to intel_get_tile_masks(). (Topi) V3: Use temporary variables in intel_get_tile_masks() for clarity. Fix mask_y computation. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Add a helper function intel_get_tile_dims()Anuj Phogat2015-09-282-22/+64
| | | | | | | | | | | | | | V2: - Do the tile width/height computations in the new helper function and use it later in intel_miptree_get_tile_masks(). - Change the name to intel_get_tile_dims(). V3: Return the tile_h in number of rows in place of bytes. Document the units of tile_w, tile_h parameters. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa: Use the effective internal format instead for validationEduardo Lima Mitev2015-09-281-0/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When validating format+type+internalFormat for texture pixel operations on GLES3, the effective internal format should be used if the one specified is an unsized internal format. Page 127, section "3.8 Texturing" of the GLES 3.0.4 spec says: "if internalformat is a base internal format, the effective internal format is a sized internal format that is derived from the format and type for internal use by the GL. Table 3.12 specifies the mapping of format and type to effective internal formats. The effective internal format is used by the GL for purposes such as texture completeness or type checks for CopyTex* commands. In these cases, the GL is required to operate as if the effective internal format was used as the internalformat when specifying the texture data." v2: Per the spec, Luminance8Alpha8, Luminance8 and Alpha8 should not be considered sized internal formats. Return the corresponding unsize format instead. v4: * Improved comments in _mesa_es3_effective_internal_format_for_format_and_type(). * Splitted patch to separate chunk about reordering of error_check_subtexture_dimensions() error check, which is not directly related with this patch. v5: Dropped the splitted patch because it was actually a work around 3 dEQP tests that are buggy: dEQP-GLES2.functional.negative_api.texture.texsubimage2d_neg_offset dEQP-GLES2.functional.negative_api.texture.texsubimage2d_offset_allowed dEQP-GLES2.functional.negative_api.texture.texsubimage2d_neg_wdt_hgt Cc: "11.0" <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Mark Janes <[email protected]>
* mesa: Move _mesa_base_tex_format() from teximage to glformats filesEduardo Lima Mitev2015-09-284-378/+507
| | | | | | | | | | | | | | | | | This function will be needed as part of validating the combination of format, type and internal format of texture pixel operations, which happens in glformats files. Specifically, we want to be able to obtain the base format of a resolved effective internal format, to compare it with the original internal format passed. Also, since this function deals solely with GL formats, it fits better in glformats where the rest of similar format functionality rests. The function is moved as-is, without any modification. Cc: "11.0" <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Mark Janes <[email protected]>
* mesa: Fix order of format+type and internal format checks for glTexImageXD opsEduardo Lima Mitev2015-09-281-16/+25
| | | | | | | | | | | | | | | | | | | The more specific GLES constrains should be checked after the general validation performed by _mesa_error_check_format_and_type(). This is also for consistency with the error checks order of glTexSubImage ops. v3: The change of order uncovered a bug that regresses a couple of piglit tests written against OpenGL-ES 1.1 spec, which expects an INVALID_VALUE instead of the INVALID_ENUM returned by _mesa_error_check_format_and_type() when an invalid format is passed to glTexImage2D. This version of the patch accounts for those cases. Fixes 1 dEQP test: * dEQP-GLES3.functional.negative_api.texture.teximage2d Cc: "11.0" <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Mark Janes <[email protected]>
* egl: Fix missing Haiku include pathAlexander von Gluck IV2015-09-281-0/+1
|
* state_trackers/hgl: Fix missing include pathAlexander von Gluck IV2015-09-281-0/+1
|
* i965/fs: Fix hang on IVB and VLV with image format mismatch.Francisco Jerez2015-09-281-4/+38
| | | | | | | | | | | | | | | | | | | IVB and VLV hang sporadically when an untyped surface read or write message is used to access a surface of format other than RAW, as may happen when there is a mismatch between the format qualifier of the image uniform and the format of the actual image bound to the pipeline. According to the spec this condition gives undefined results but may not lead to program termination (which is one of the possible outcomes of the hang). Fix it by checking at runtime whether the surface is of the right type. Fixes the "arb_shader_image_load_store.invalid/format mismatch" piglit subtest. Reported-by: Mark Janes <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91718 CC: [email protected] Reviewed-by: Ian Romanick <[email protected]>
* clover: Implement clCreateImage?D w/ clCreateImage.Serge Martin2015-09-281-52/+8
| | | | | | | Remplace clCreateImage2D and clCreateImage3D implementation with call to clCreateImage. Reviewed-by: Francisco Jerez <[email protected]>
* clover: Implement CL1.2 clCreateImage().Serge Martin2015-09-281-10/+91
| | | | Reviewed-by: Francisco Jerez <[email protected]>
* clover: Move down canonicalization of memory object flags into validate_flags().Francisco Jerez2015-09-281-39/+40
| | | | | | | | This will be used to share the same logic between buffer and image creation. v2: Make memory flag set constants local to validate_flags. (Serge Martin)
* docs: mention ARB_shader_storage_buffer_object on 11.1.0 release notesSamuel Iglesias Gonsalvez2015-09-281-0/+1
| | | | | Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: revert "glsl: atomic counters can be declared as buffer-qualified ↵Iago Toral Quiroga2015-09-281-3/+3
| | | | | | | | | | | | | | | | | | | | | | variables" This reverts commit 586142658e2927a68c. The specs are not explicit about any restrictions related to the types allowed on buffer variables, however, the description of opaque types (like atomic counters) is in conclict with the purpose of buffer variables: "The opaque types declare variables that are effectively opaque handles to other objects. These objects are accessed through built-in functions, not through direct reading or writing of the declared variable. (...) Opaque variables cannot be treated as l-values;(...)" Also, Mesa is already disallowing opaque types in interface blocks anyway, so that commit was not really achieving anything. Reviewed-by: Francisco Jerez <[email protected]>
* gallium/util: avoid unreferencing random memory on buffer alloc failureIlia Mirkin2015-09-281-1/+1
| | | | | | | | Found by Coverity Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Albert Freeman <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: don't leak interface_nameIlia Mirkin2015-09-281-0/+1
| | | | | | | Found by Coverity Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: fix component size calculation for tessellation and geom shadersTimothy Arceri2015-09-281-1/+1
| | | | | | Broken in commit abdab88b30ab when adding arrays of arrays support Reviewed-by: Dave Airlie <[email protected]>
* docs/GL3.txt: fix typoBoyan Ding2015-09-271-1/+1
| | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Albert Freeman <[email protected]>
* i965/gs: Optimize away the EOT write on Gen8+ with static vertex count.Kenneth Graunke2015-09-261-0/+15
| | | | | | | | | | | | | | | | | | | With static vertex counts, the final EOT write doesn't actually write any data - it's just there to end the thread. Typically, the last thing before ending the thread will be an EmitVertex() call, resulting in a URB write. We can just set EOT on that. Note that this isn't always possible - there might be an intervening SSBO write/image store, or the URB write may have been in a loop. shader-db statistics for geometry shaders only: total instructions in shared programs: 3173 -> 3149 (-0.76%) instructions in affected programs: 176 -> 152 (-13.64%) helped: 8 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/gs: Allow src0 immediates in GS_OPCODE_SET_WRITE_OFFSET.Kenneth Graunke2015-09-262-2/+14
| | | | | | | | | | | | | | | | | | | GS_OPCODE_SET_WRITE_OFFSET is a MUL with a constant src[1] and special strides. We can easily make the generator handle constant src[0] arguments by instead generating a MOV with the product of both operands. This isn't necessarily a win in and of itself - instead of a MUL, we generate a MOV, which should be basically the same cost. However, we can probably avoid the earlier MOV to put src[0] into a register. shader-db statistics for geometry shaders only: total instructions in shared programs: 3207 -> 3173 (-1.06%) instructions in affected programs: 3207 -> 3173 (-1.06%) helped: 11 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Implement "Static Vertex Count" geometry shader optimization.Kenneth Graunke2015-09-265-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Broadwell's 3DSTATE_GS contains new "Static Output" and "Static Vertex Count" fields, which control a new optimization. Normally, geometry shaders can output arbitrary numbers of vertices, which means that resource allocation has to be done on the fly. However, if the number of vertices is statically known, the hardware can pre-allocate resources up front, which is more efficient. Thanks to the new NIR GS intrinsics, this is easy. We just call the function introduced in the previous commit to get the vertex count. If it obtains a count, we stop emitting the extra 32-bit "Vertex Count" field in the VUE, and instead fill out the 3DSTATE_GS fields. Improves performance of Gl32GSCloth by 5.16347% +/- 0.12611% (n=91) on my Lenovo X250 laptop (Broadwell GT2) at 1024x768. shader-db statistics for geometry shaders only: total instructions in shared programs: 3227 -> 3207 (-0.62%) instructions in affected programs: 242 -> 222 (-8.26%) helped: 10 v2: Don't break non-NIR paths (just skip this optimization). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Move GS_THREAD_END mlen calculations out of the generator.Kenneth Graunke2015-09-262-2/+2
| | | | | | | | | | The visitor was setting a mlen that was wrong for Broadwell, but the generator was ignoring it and doing the right thing regardless. We may as well move the logic fully into the visitor. This will be useful in the next commit as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>