summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* gallium/radeon: add radeon_surf::is_linearMarek Olšák2016-11-018-13/+15
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove radeon_surf_level::pitch_bytesMarek Olšák2016-11-0113-44/+48
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't call u_format helpers if we have that info alreadyMarek Olšák2016-11-012-10/+8
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: replace radeon_surf_info::dcc_enabled with num_dcc_levelsMarek Olšák2016-11-016-15/+19
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a driver query for counting CP DMA callsMarek Olšák2016-11-014-0/+13
| | | | | | | CP DMA calls are synchronous with regard to shaders, but can be made asynchronous if needed. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a driver query for shader cache hitsMarek Olšák2016-11-014-1/+16
| | | | | | This is an 8-month old patch. Reviewed-by: Nicolai Hähnle <[email protected]>
* gbm: set up the interop extension for egl/drmMarek Olšák2016-11-013-0/+3
| | | | | | | breaking libgbm -> libEGL ABI? Acked-by: Alex Deucher <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* nvc0: do not duplicate similar performance metricsSamuel Pitoiset2016-11-011-43/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]>
* docs: add news item and link release notes for 13.0.0Emil Velikov2016-11-012-0/+8
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 13.0.0Emil Velikov2016-11-011-1/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 405dd26860719d800ed6134f8f985f1525f25502)
* docs: Update 13.0.0 release notesEmil Velikov2016-11-011-3/+228
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit df1b0a5a86bab8cd138f504942198a300753b005)
* anv/device: Return DEVICE_LOST if execbuf2 failsJason Ekstrand2016-11-011-6/+4
| | | | | | | | | | | This makes more sense than OUT_OF_HOST_MEMORY. Technically, you can recover from a failed execbuf2 but the batch you just submitted didn't fully execute so things are in an ill-defined state. The app doesn't want to continue from that point anyway. Signed-off-by: Jason Ekstrand <[email protected]> Cc: "13.0" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputsAntia Puentes2016-11-015-22/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The emission of vertex attributes corresponding to dvec3 and dvec4 vertex shader input variables was not correct when the <size> passed to the VertexAttribL* commands was <= 2. This was because we were using the vertex array size when emitting vertices to decide if we uploaded a 64-bit floating point attribute as 1 slot (128-bits) for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This caused problems when mapping the input variables to registers because, for deciding which registers contain the values uploaded for a certain variable, we use the size and type given to the variable in the shader, so we will be assigning 256-bits to dvec3/4 variables, even if we only uploaded 128-bits for them, which happened when the vertex array size was <= 2. The patch uses the shader information to only emit as 128-bits those 64-bit floating point variables that were declared as double or dvec2 in the vertex shader. Dvec3 and dvec4 variables will be always uploaded as 256-bits, independently of the <size> given to the VertexAttribL* command. From the ARB_vertex_attrib_64bit specification: "For the 64-bit double precision types listed in Table X.1, no default attribute values are provided if the values of the vertex attribute variable are specified with fewer components than required for the attribute variable. For example, the fourth component of a variable of type dvec4 will be undefined if specified using VertexAttribL3dv or using a vertex array specified with VertexAttribLPointer and a size of three." We are filling these unspecified components with zeros, which coincidentally is also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 expects. v2: Do not use bitcount (Kenneth Graunke) Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287 Reviewed-by: Kenneth Graunke <[email protected]>
* radv: drop some unused cmask info members.Dave Airlie2016-11-012-8/+0
| | | | | | | | These were assigned but never used. Inspired by similiar patch in radeonsi. Signed-off-by: Dave Airlie <[email protected]>
* intel: aubinator: fix printing missing gen optionLionel Landwerlin2016-10-311-2/+2
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel: aubinator: fix assumptions on amount of required dataLionel Landwerlin2016-10-311-1/+5
| | | | | | | We require 12 bytes of headers but in some cases we just need 4. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel: aubinator: don't print out blocks twiceLionel Landwerlin2016-10-311-1/+0
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Move gen8_disable_stages to brw_upload_initial_gpu_stateNanley Chery2016-10-314-56/+13
| | | | | | | | | | 3DSTATE_WM_CHROMAKEY isn't programmed anywhere else. 3DSTATE_WM_HZ_OP is programmed, then cleared by blorp during a HZ op, so repeatedly clearing it after every blorp execution is redundant. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965: Program 3DSTATE_AA_LINE_PARAMETERS in upload_invariant_stateNanley Chery2016-10-313-36/+10
| | | | | | | This packet is non-pipelined and doesn't ever change across emissions. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* st/omx/dec: disable tunnel for size different caseLeo Liu2016-10-313-1/+11
| | | | | | | | | | When the video coded size is different from frame size, we need the result buffers are same as coded size, which are not size compatible with encode required size, so that simply use no tunnel for this case instead of frame by frame converting. Signed-off-by: Leo Liu <[email protected]> Cc: 13.0 <[email protected]>
* st/omx/dec: result buffers size should match codec decoder sizeLeo Liu2016-10-313-19/+18
| | | | | | | | Otherwise fails the check of matching between decoder size and buffers size in kernel. Signed-off-by: Leo Liu <[email protected]> Cc: 13.0 <[email protected]>
* swr: [rasterizer] added EventHandlerFile contructorGeorge Kyriazis2016-10-311-1/+6
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Frontend dependency workGeorge Kyriazis2016-10-313-2/+18
| | | | | | | Add frontend dependency concept in the DRAW_CONTEXT, which allows serialization of frontend work if necessary. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Refactor/cleanup backendsGeorge Kyriazis2016-10-312-360/+351
| | | | | | Used for common code reuse and simplification Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer core] Remove deprecated simd intrinsicsGeorge Kyriazis2016-10-314-990/+1
| | | | | | Used in abandoned all-or-nothing approach to converting to AVX512 Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] Add thread tags to event files.George Kyriazis2016-10-315-4/+24
| | | | | | | | This allows the post-processor to easily detect the API thread and to process frame information. The frame information is needed to optimized how data is processed from worker threads. Reviewed-by: Bruce Cherniak <[email protected]>
* glsl: use a non-malloc'd storage for short ir_variable namesMarek Olšák2016-10-313-3/+22
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use the linear allocator in opt_constant_propagationMarek Olšák2016-10-311-3/+11
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use the linear allocator in opt_copy_propagationMarek Olšák2016-10-311-1/+6
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use the linear allocator in opt_copy_propagation_elementsMarek Olšák2016-10-311-4/+11
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use the linear allocator in opt_dead_code_localMarek Olšák2016-10-311-3/+9
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use the linear allocator in glsl_symbol_tableMarek Olšák2016-10-311-8/+8
| | | | | | | no ralloc_free occurences Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use the linear allocator for ast_node and derived classesMarek Olšák2016-10-316-113/+114
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl/lexer: use the linear allocatorMarek Olšák2016-10-313-8/+12
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glcpp: use the linear allocator for most objectsMarek Olšák2016-10-313-118/+91
| | | | | | | v2: cosmetic changes Tested-by: Edmondo Tommasina <[email protected]> (v1) Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
* ralloc: add a linear allocator as a child node of rallocMarek Olšák2016-10-312-4/+433
| | | | | | | v2: remove goto, cosmetic changes Tested-by: Edmondo Tommasina <[email protected]> (v1) Reviewed-by: Nicolai Hähnle <[email protected]>
* ralloc: remove memset from ralloc_sizeMarek Olšák2016-10-311-15/+11
| | | | | | | only do it in rzalloc_size as it was supposed to be Reviewed-by: Edward O'Callaghan <[email protected]> Tested-by: Edmondo Tommasina <[email protected]>
* ralloc: use rzalloc where it's necessaryMarek Olšák2016-10-3111-15/+19
| | | | | | | | | | | | | | | | | No change in behavior. ralloc_size is equivalent to rzalloc_size. That will change though. Calls not switched to rzalloc_size: - ralloc_vasprintf - glsl_type::name allocation (it's filled with snprintf) - C++ classes where valgrind didn't show uninitialized values I switched most of non-glsl stuff to rzalloc without checking whether it's really needed. Reviewed-by: Edward O'Callaghan <[email protected]> Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ralloc: add DECLARE_RZALLOC_CXX_OPERATORSMarek Olšák2016-10-311-2/+7
| | | | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: zero allocated memory where neededJuha-Pekka Heikkila2016-10-316-7/+7
| | | | Signed-off-by: Marek Olšák <[email protected]>
* i965/fs: fill allocated memory with zeros where neededJuha-Pekka Heikkila2016-10-312-3/+3
| | | | Signed-off-by: Marek Olšák <[email protected]>
* i965/vec4: zero allocated memory where neededJuha-Pekka Heikkila2016-10-311-2/+2
| | | | | Signed-off-by: Juha-Pekka Heikkila <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* glsl/glcpp: initialize all fields of glcpp_parser_t on creationTapani Pälli2016-10-311-0/+3
| | | | | | | | this fixes some of the regressions with "ralloc: remove memset from ralloc_size" Signed-off-by: Tapani Pälli <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* glsl: Fix reading of uninitialized memoryJuha-Pekka Heikkila2016-10-312-4/+4
| | | | | | | | | | Switch to use memory allocations which zero memory for places where needed. v2: modify and rebase on top of Marek's series (Tapani) Signed-off-by: Juha-Pekka Heikkila <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* glsl: initialize glsl_struct_field properlyMarek Olšák2016-10-312-38/+6
| | | | | | | | | don't rely on ralloc doing memset Reviewed-by: Edward O'Callaghan <[email protected]> Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* ralloc: don't memset ralloc_header, clear it manuallyMarek Olšák2016-10-311-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | time GALLIUM_NOOP=1 ./run shaders/private/alien_isolation/ >/dev/null Before (2 takes): real 0m8.734s 0m8.773s user 0m34.232s 0m34.348s sys 0m0.084s 0m0.056s After (2 takes): real 0m8.448s 0m8.463s user 0m33.104s 0m33.160s sys 0m0.088s 0m0.076s Average change in "real" time spent: -3.4% calloc should only do 2 things compared to malloc: - check for overflow of "n * size" - call memset I'm not sure if that explains the difference. v2: clear "parent" and "next" in the caller of add_child. Reviewed-by: Edward O'Callaghan <[email protected]> (v1) Tested-by: Edmondo Tommasina <[email protected]> (v1) Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
* clover: Implement clGetExtensionFunctionAddressForPlatform.Serge Martin2016-10-303-1/+21
| | | | | | Add clGetExtensionFunctionAddressForPlatform (CL 1.2). Reviewed-by: Francisco Jerez <[email protected]>
* clover: Introduce CLOVER_EXTRA_*_OPTIONS environment variablesVedran Miletić2016-10-302-3/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The options specified in the CLOVER_EXTRA_BUILD_OPTIONS shell variable are appended to the options specified by the OpenCL program in the clBuildProgram function call, if any. Analogously, the options specified in the CLOVER_EXTRA_COMPILE_OPTIONS and CLOVER_EXTRA_LINK_OPTIONS variables are appended to the options specified in clCompileProgram and clLinkProgram function calls, respectively. v2: * rename to CLOVER_EXTRA_COMPILER_OPTIONS * use debug_get_option * append to linker options as well v3: code cleanups v4: separate CLOVER_EXTRA_LINKER_OPTIONS options v5: * fix documentation typo * use CLOVER_EXTRA_COMPILER_OPTIONS in link stage v6: * separate in CLOVER_EXTRA_{BUILD,COMPILE,LINK}_OPTIONS * append options in cl{Build,Compile,Link}Program Signed-off-by: Vedran Miletić <[email protected]> Reviewed-by[v1]: Edward O'Callaghan <[email protected]> v7 [Francisco Jerez]: Slight simplification. Reviewed-by: Francisco Jerez <[email protected]>
* clover: Pass unquoted compiler arguments to ClangVedran Miletić2016-10-301-4/+36
| | | | | | | | | | | | | | | | | | | | OpenCL apps can quote arguments they pass to the OpenCL compiler, most commonly include paths containing spaces. If the Clang OpenCL compiler was called via a shell, the shell would split the arguments with respect to to quotes and then remove quotes before passing the arguments to the compiler. Since we call Clang as a library, we have to split the argument with respect to quotes and then remove quotes before passing the arguments. v2: move to tokenize(), remove throwing of CL_INVALID_COMPILER_OPTIONS v3: simplify parsing logic, use more C++11 v4: restore error throwing, clarify a comment Signed-off-by: Vedran Miletić <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/fs/generator: Don't use the address immediate for MOV_INDIRECTJason Ekstrand2016-10-281-28/+27
| | | | | | | | | | | | | The address immediate field is only 9 bits and, since the value is in bytes, the highest GRF we can point to with it is g15. This makes it pretty close to useless for MOV_INDIRECT. There were already piles of restrictions preventing us from using it prior to Broadwell, so let's get rid of the gen8+ code path entirely. Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779 Cc: "12.0 13.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>