summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* glsl: make linker error message more informativeTimothy Arceri2015-08-131-2/+3
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* i965: Stop aux data compare preventing program binary re-useTopi Pohjolainen2015-08-131-32/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Items in the program cache consist of three things: key, the data representing the instructions and auxiliary data representing uniform storage. The data consisting of instructions is stored into a drm buffer object while the key and the auxiliary data reside in malloced section. Now the cache uploading is equipped with a check that iterates over existing items and seeks to find a another item using identical instruction data than the one being just uploaded. If such is found there is no need to add another section into the drm buffer object holding identical copy of the existing one. The item just being uploaded should instead simply point to the same offset in the underlying drm buffer object. Unfortunately the check for the matching instruction data is coupled with a check for matching auxiliary data also. This effectively prevents the cache from ever containing two items that could share a section in the drm buffer object. The constraint for the instruction data and auxiliary data to match is, fortunately, unnecessary strong. When items are stored into the cache they will anyway contain their own copy of the auxiliary data (even if they matched - which they in real world never will). The only thing the items would be sharing is the instruction data and hence we should only check for that to match and nothing else. No piglit regression in jenkins. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Only write program to cache when it doesn't exist yetTopi Pohjolainen2015-08-131-7/+7
| | | | | | | | | | | Current logic re-writes the same data when existing data is found. Not that this actually matters at the moment in practice, the contraint for finding matching data is too severe to ever allow data to be shared between two items in the cache. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Rename brw_upload_item_data to brw_alloc_item_dataTopi Pohjolainen2015-08-131-9/+10
| | | | | | | | | | and simplify the interface to take directly the size and to return the offset. The routine does nothing more than allocate, it doesn't upload anything. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* mesa: update MaxShaderStorageBlockSize to 2^27Tapani Pälli2015-08-131-1/+1
| | | | | | | | | | | | Extension spec originally required 2^24 but 2^27 is the minimum value required by OpenGL 4.5 and OpenGL ES 3.1 specifications. Fixes: ES31-CTS.shader_storage_buffer_object.basic-max Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* mesa: fix name returned for XFB varyingsTapani Pälli2015-08-131-4/+16
| | | | | | | | | | | | | | | _mesa_get_program_resource_name has logic to append '[0]' in name if variable is an array, this should be skipped for XFB varyings that have array index already appended. v2: fix comment, change also GL_NAME_LENGTH query to match the behaviour Fixes: ES31-CTS.program_interface_query.transform-feedback-types Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Martin Peres <[email protected]>
* mesa: Fix printf format specifier warn of the ptrdiff_tEdward O'Callaghan2015-08-131-1/+1
| | | | | | | See §7.19.6.1, paragraph 7 of the ISO C specification. Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* r600g: allow setting geometry shader sampler statesMarek Olšák2015-08-131-5/+0
| | | | | | | | We were ignoring them. This is both hilarious and sad. Cc: [email protected] Reviewed-by: Edward O'Callaghan <eocallaghan at alterapraxis.com> Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix polygon offset scaleMarek Olšák2015-08-132-2/+2
| | | | | | | | | | | | The value was copied from r300g, which uses 1/12 subpixels, but this hw uses 1/16 subpixels. Should fix piglit: gl-1.4-polygon-offset (formerly a glean test) (untested, ported from radeonsi) Reviewed-by: Edward O'Callaghan <eocallaghan at alterapraxis.com> Reviewed-by: Alex Deucher <[email protected]> Cc: [email protected]
* radeonsi: fix polygon offset scaleMarek Olšák2015-08-131-1/+1
| | | | | | | | | | The value was copied from r300g, which uses 1/12 subpixels, but this hw uses 1/16 subpixels. Fixes piglit: gl-1.4-polygon-offset (formerly a glean test) Reviewed-by: Michel Dänzer <[email protected]> Cc: [email protected]
* radeonsi: enable VS_OUT_MISC_SIDE_BUS_ENAMarek Olšák2015-08-131-0/+1
| | | | | | | This is recommended for better performance. Diag tests always enable this. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add support for gl_PrimitiveID in the fragment shaderMarek Olšák2015-08-134-9/+49
| | | | | | | | | | It must be obtained from the VS. The GS scenario A must be enabled for PrimID to be generated for the VS. + 4 piglits Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move VGT_GS_MODE to the VS stateMarek Olšák2015-08-131-2/+6
| | | | | | The VS will want to select GS scenario A here (VS with PrimitiveID). Reviewed-by: Michel Dänzer <[email protected]>
* freedreno/a4xx: format updatesRob Clark2015-08-121-4/+4
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx+a4xx: add texture buffer object supportRob Clark2015-08-127-11/+41
| | | | | | | | | Basic texture buffer support. Should be straightforward to add first/ last_element support. And with a bit of work in ir3 emulate larger texture buffer sizes. But this seems to be enough for stk gl31 render paths. Signed-off-by: Rob Clark <[email protected]>
* ttn: add buffer texture typeRob Clark2015-08-121-0/+3
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: 'keeps' need neighbors found tooRob Clark2015-08-121-0/+5
| | | | | | | | | This shows up with a glamor shader, which does a TXF and uses the result for conditional kill. Before we wouldn't group the fanin (collect) neighbors which need to be allocated adjacently at RA, resulting in badness. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/print: print left/right neighbors tooRob Clark2015-08-121-0/+14
| | | | | | When debugging compiler, this is useful to see. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use nir pass to lower const to scalarRob Clark2015-08-121-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: point-size and spritelist fixesRob Clark2015-08-127-50/+46
| | | | | | | | | a4xx needs similar treatment as 995f55a6 Also fixup a few point-size and vpsrepl issues and drop fix_blit_fp() hack previously needed for mem2gmem. Signed-off-by: Rob Clark <[email protected]>
* freedreno: cap cleanupsRob Clark2015-08-122-16/+16
| | | | | | | | Move a few things around to group stuff that is common to a3xx/a4xx together. Also, introduce is_ir3() for things that are more specific to the compiler / shader-ISA than to the gpu generation. Signed-off-by: Rob Clark <[email protected]>
* mesa: add NV_read_{depth,stencil,depth_stencil} extensionsRob Clark2015-08-122-9/+42
| | | | | | | | These extensions allow reading depth/stencil for GLES contexts, which is useful for tools like apitrace. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965/shader: Don't use OptimizeForAOS for NIR vec4 vertex shadersJason Ekstrand2015-08-121-1/+1
| | | | | | | | | | | | | | Shader-db results for vec4 programs using NIR on HSW: total instructions in shared programs: 1838157 -> 1828469 (-0.53%) instructions in affected programs: 275978 -> 266290 (-3.51%) helped: 2827 HURT: 244 GAINED: 0 LOST: 0 Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* mesa/teximage: report the correct function which triggered the errorNanley Chery2015-08-121-4/+4
| | | | | | | | | This function would always report that a dimension or size error occurred in glTexImage even when it was called from glCompressedTexImage. Replace the static string with the dynamically determined caller name. Reviewed-by: Tapani Palli <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* mesa/formats: don't byteswap when building array formatsOded Gabbay2015-08-121-11/+3
| | | | | | | | | | | | | Because we build here an array format, we don't need to swap the bytes for big endian. If it isn't an array format, the bytes will be swapped in _mesa_format_convert. v2: remove temp variable Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "10.5 10.6" <[email protected]>
* mesa/formats: Don't flip channels of null array formatsJason Ekstrand2015-08-121-1/+2
| | | | | | | | | | Before, if we encountered an array format of 0 on a BE system, we would flip all the channels even though it's an invalid format. This would result in a mostly invalid format with a swizzle of yyyy or wwww. Instead, we should just return 0 if the array format stashed in the format info is invalid. Cc: "10.6 10.5" <[email protected]>
* mesa/formats: Fix swizzle flipping for big-endian targetsJason Ekstrand2015-08-121-4/+12
| | | | | | | | | | | | | The swizzle defines where in the format you should look for any given channel. When we flip the format around for BE targets, we need to change the destinations of the swizzles, not the sources. For example, say the format is an RGBX format with a swizzle of xyz1 on LE. Then it should be wzy1 on BE; however, the code as it was before, would have made it 1zyx on BE which is clearly wrong. Reviewed-by: Iago Toral <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Cc: "10.6 10.5" <[email protected]>
* mesa/formats: Only do byteswapping for packed formatsJason Ekstrand2015-08-121-3/+3
| | | | | Reviewed-by: Iago Toral <[email protected]> Cc: "10.6 10.5" <[email protected]>
* configure.ac: Always define __STDC_LIMIT_MACROS.Matt Turner2015-08-111-1/+1
| | | | | | | ... which ensures that we get defines like LONG_MAX in C++. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91591 Reviewed-by: Jose Fonseca <[email protected]>
* i965: Optimize brw_inst_set_bits() and brw_compact_inst_set_bits().Matt Turner2015-08-111-4/+4
| | | | | | | | | | Cuts about 2k of .text. text data bss dec hex filename 5017141 197160 27672 5241973 4ffc75 i965_dri.so before 5014981 197160 27672 5239813 4ff405 i965_dri.so after Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Optimize brw_inst_bits() and brw_compact_inst_bits().Matt Turner2015-08-111-4/+4
| | | | | | | | | | Cuts about 1k of .text. text data bss dec hex filename 5018165 197160 27672 5242997 500075 i965_dri.so before 5017141 197160 27672 5241973 4ffc75 i965_dri.so after Reviewed-by: Kenneth Graunke <[email protected]>
* docs: add news item and link release notes for 10.6.4Emil Velikov2015-08-112-0/+7
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 10.6.4Emil Velikov2015-08-111-1/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 99793e2541510fe208d29e69fedf97a6fff006f8)
* docs: add release notes for 10.6.4Emil Velikov2015-08-111-0/+136
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit 6b2fcee64edadbd4db2293f5f4fc1a70e80c7251)
* gallium/radeon: fix r600g build if LLVM is disabledMarek Olšák2015-08-111-4/+5
| | | | | | | | MESA_LLVM_VERSION_PATCH is undefined. Reviewed-by: Edward O'Callaghan <eocallaghan at alterapraxis.com> Tested-by: Benjamin Bellec <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* r600g: use a bitfield to track dirty atomsGrazvydas Ignotas2015-08-114-10/+56
| | | | | | | | | | | | | | | r600 currently has 73 atoms and looping through their dirty flags has become costly because checking each flag requires a pointer dereference before the read. To avoid having to do that add additional bitfield which can be checked really quickly thanks to tzcnt instruction. id field was added to struct r600_atom but that doesn't affect memory usage for both 32 and 64 bit CPUs because it was stuffed into padding. The performance improvement is ~2% for benchmarks that can have FPS in the thousands but is hardly measurable in "real" programs. Signed-off-by: Marek Olšák <[email protected]>
* r600g: don't mark unused atom dirtyGrazvydas Ignotas2015-08-111-1/+3
| | | | | | On evergreen config_state is not used, so don't mark it dirty. Signed-off-by: Marek Olšák <[email protected]>
* r600g: use a helper to add an initialized atomGrazvydas Ignotas2015-08-114-8/+16
| | | | | | | Instead of writing to rctx->atoms directly use a helper to take advantage of assert checks. Signed-off-by: Marek Olšák <[email protected]>
* gallium/radeon: use helper functions to mark atoms dirtyGrazvydas Ignotas2015-08-1119-145/+182
| | | | | | | | | | This is analogous to r300_mark_atom_dirty() used by r300, and will be used by later patches. For common radeon code, appropriate helper is called through a function pointer. No functional changes. Signed-off-by: Marek Olšák <[email protected]>
* docs: Mark ARB_shader_image_load_store as done on i965.Francisco Jerez2015-08-112-1/+2
|
* i965: Expose ARB_shader_image_load_store.Francisco Jerez2015-08-111-0/+1
| | | | | Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Clamp image array indices to the array bounds on IVB.Francisco Jerez2015-08-111-4/+21
| | | | | | | | | | | This fixes the spec@arb_shader_image_load_store@invalid index bounds piglit tests on IVB, which were causing a GPU hang and then a crash due to the invalid binding table index result of the array index calculation. Other generations seem to behave sensibly when an invalid surface is provided so it doesn't look like we need to care. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Translate image load, store and atomic NIR intrinsics.Francisco Jerez2015-08-111-0/+106
| | | | | | v2: Move array coordinate workaround into the surface builder. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Handle image uniforms in NIR programs.Francisco Jerez2015-08-112-8/+44
| | | | | | v2: Move the image_params array back to brw_stage_prog_data. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Implement logic to set up and upload an image uniform.Francisco Jerez2015-08-112-0/+32
| | | | | | v2: Move the image_params array back to brw_stage_prog_data. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Teach type_size() about the size of an image uniform.Francisco Jerez2015-08-112-0/+2
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/fs: Implement image load, store and atomic.Francisco Jerez2015-08-112-0/+264
| | | | | | | | v2: Drop VEC4 suport. v3: Rebase. v4: Move array coordinate workaround into the surface builder. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Import image format conversion primitives.Francisco Jerez2015-08-111-0/+265
| | | | | | | | | | | | | | | | | | | Define bitfield packing, unpacking and type conversion operations in terms of which the image format conversion code will be implemented. These don't directly know about image formats: The packing and unpacking functions take a 4-tuple of bit shifts and a 4-tuple of bit widths as arguments, determining the bitfield position of each component. Most of the remaining functions perform integer, fixed point normalized, and floating point type conversions, mapping between a target type with per-component bit widths given by a parameter and a matching native representation of the same type. v2: Drop VEC4 suport. v3: Rebase. v4: Fix clamping of negative floats in the unsigned case of emit_convert_to_scaled(). Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Import image format metadata queries.Francisco Jerez2015-08-111-0/+148
| | | | | | | | | | | Define some utility functions to query the bitfield layout of a given image format and whether it satisfies a number of more or less hardware-specific properties. v2: Drop VEC4 suport. v3: Add SKL support. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Import code to transform image coordinates into surface coordinates.Francisco Jerez2015-08-111-0/+52
| | | | | | Accounting for the padding required for 1D arrays in certain cases. Reviewed-by: Jason Ekstrand <[email protected]>