path: root/src/mesa
Commit message (Author, Age, Files, Lines)
* mesa/st: add BPTC formats, expose ARB_texture_compression_bptc (Ilia Mirkin, 2014-08-12, 3 files, -1/+49)
| | | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* i965: Delete the Gen8 code generators. (Kenneth Graunke, 2014-08-12, 9 files, -4076/+0)
| | | | | | | | We now use the brw_eu_emit.c code instead. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Never use the Gen8 code generators. (Kenneth Graunke, 2014-08-12, 3 files, -28/+10)
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Switch to the EU emit layer for code generation on Broadwell. (Kenneth Graunke, 2014-08-12, 3 files, -3/+3)
| | | | | | | | | | | | | Everything should be in place to unify code generation between Gen4-7 and Gen8+. We should be able to drop the Gen8 generators at this point. However, leave them hooked up for a brief moment, for testing and comparison purposes. Set GEN8=1 to use the old Gen8+ code generator paths. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Retype atomics to UD in Gen8 code generation. (Kenneth Graunke, 2014-08-12, 2 files, -4/+8)
| | | | | | | | | | Kind of a moot point since we're deleting Gen8 code generation, but this at least helps make it match the Gen4-7 code. It's probably more reasonable than using float. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vp: Use the sampler for pull constant loads on Gen7/7.5. (Kenneth Graunke, 2014-08-12, 1 file, -5/+12)
| | | | | | | | | | | | | This improves performance in Trine 2 at 1280x720 (windowed) on "Very High" settings by 30% (in the interactive menu) to 45% (in the forest by the giant frog) on Haswell GT3e. It also now generates the same assembly on Gen7 as it does on Gen8, which always used the sampler for both types. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Drop gen <= 7 assertion in pull constant load handling. (Kenneth Graunke, 2014-08-12, 1 file, -1/+0)
| | | | | | | | I don't see any reason for this to exist. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Set src0 file to IMM on Gen8+ flow control instructions. (Kenneth Graunke, 2014-08-12, 1 file, -9/+36)
| | | | | | | | | | | | | | | According to the documentation, we need to set the source 0 register type to IMM for flow control instructions that have both JIP and UIP. Out of paranoia, just make all flow control instructions use IMM; there's no benefit to using ARF anyway, and it could cause trouble that's difficult to diagnose. See commit 9584959123b0453cf5313722357e3abb9f736aa7, which did the analogous change in the gen8_generator code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Refactor brw_WHILE to share a bit more code on Gen6+. (Kenneth Graunke, 2014-08-12, 1 file, -16/+12)
| | | | | | | | | We're going to add a Gen8+ case shortly, which would need to duplicate this code again. Instead, share it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Emulate F32TO16 and F16TO32 on Broadwell. (Kenneth Graunke, 2014-08-12, 1 file, -2/+50)
| | | | | | | | | | | | | | | | | When we combine the Gen4-7 and Gen8+ generators, we'll need to handle half float packing/unpacking functions somehow. The Gen8+ generator code today just emulates the behavior of the Gen7 F32TO16/F16TO32 instructions, including the align16 mode bugs. Rather than messing with fs_generator/vec4_generator, I decided to just emulate the instructions at the brw_eu_emit.c layer. v2: Change gen >= 7 asserts to gen == 7 (suggested by Chris Forbes). Fix regressions on Haswell in VS tests due to type assertions. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Port Gen8 SET_VERTEX_COUNT handling to vec4_generator. (Kenneth Graunke, 2014-08-12, 1 file, -18/+25)
| | | | | | | | | | | | Broadwell requires the number of vertices written by the geometry shader to be specified in a separate register, as part of the terminating message's payload. This also means GS_OPCODE_THREAD_END needs to increment mlen. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Switch to MOV, not OR, for GS_OPCODE_THREAD_END on Gen8. (Kenneth Graunke, 2014-08-12, 1 file, -4/+3)
| | | | | | | | Either should work. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Use MOV, not OR, to set URB write channel mask bits. (Kenneth Graunke, 2014-08-12, 1 file, -4/+2)
| | | | | | | | | | | | | | g0.5 has nothing of value to contribute to m0.5. In both the VS and GS payload, g0.5 contains the scratch space pointer - which is definitely not of any use. The GS payload also contains FFTID, but the URB write message header doesn't want FFTID. The only reason I used OR was because Eric originally requested it. On Broadwell, I used MOV, and that's worked out fine. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Don't set flag_subreg_nr = 1 on predicated FB write setup. (Kenneth Graunke, 2014-08-12, 1 file, -0/+1)
| | | | | | | | | | | | | | | On Haswell, we implement "discard" via predicated SEND messages, using f0.1 instead of f0.0. To accomplish this, we set inst->flag_subreg to 1 on the FS_OPCODE_FB_WRITE. Most instructions using fs_inst::flag_subreg expand to a single assembly instruction. However, FS_OPCODE_FB_WRITE can generate several MOVs for setting up header information. We don't want to set flag_subreg on those, so override the default state back to 0. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965/vec4: Respect ir->force_writemask_all in Gen8 code generation. (Kenneth Graunke, 2014-08-12, 1 file, -0/+1)
| | | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+. (Kenneth Graunke, 2014-08-12, 1 file, -1/+3)
| | | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965: Return NONE from brw_swap_cmod on unknown input. (Matt Turner, 2014-08-12, 3 files, -3/+3)
| | | | | | | | Comparing ~0u with a packed enum (i.e., 1 byte) always evaluates to false. Shouldn't gcc warn about this? Reported-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
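The comparison problem described above can be reproduced in isolation. The following is a minimal sketch, not the i965 code: `cmod_t`, the `CMOD_*` values and the helper names are hypothetical stand-ins, and a `uint8_t` is assumed to model the packed 1-byte enum. A 1-byte value is promoted to at most 255 before the comparison, so it can never equal `~0u`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a packed (1-byte) conditional-modifier enum. */
typedef uint8_t cmod_t;
enum { CMOD_NONE = 0, CMOD_L = 1, CMOD_G = 2 };

/* Old style: signals "unknown" with ~0u, which truncates to 255 on return. */
static cmod_t swap_cmod_old(cmod_t c)
{
   switch (c) {
   case CMOD_L: return CMOD_G;
   case CMOD_G: return CMOD_L;
   default:     return (cmod_t) ~0u;
   }
}

/* New style: returns an in-range sentinel instead. */
static cmod_t swap_cmod_new(cmod_t c)
{
   switch (c) {
   case CMOD_L: return CMOD_G;
   case CMOD_G: return CMOD_L;
   default:     return CMOD_NONE;
   }
}

/* The 1-byte result is promoted to 255, which never equals 0xFFFFFFFF. */
static bool old_check_sees_unknown(cmod_t c)
{
   assert(sizeof(cmod_t) == 1); /* the problem depends on 1-byte storage */
   return swap_cmod_old(c) == ~0u;
}

/* Comparing against the in-range sentinel works as intended. */
static bool new_check_sees_unknown(cmod_t c)
{
   return swap_cmod_new(c) == CMOD_NONE;
}
```

With these definitions, `old_check_sees_unknown(7)` is false even though 7 is not a known value, while `new_check_sees_unknown(7)` is true.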
* mesa/meta: Support decompressing floating-point formats (Neil Roberts, 2014-08-12, 2 files, -33/+78)
| | | | | | | | | | | | | Previously the Meta implementation of glGetTexImage would fall back to _mesa_get_teximage if the texturing is not using an unsigned normalised format. However in order to support the half-float formats of BPTC textures we can make it render to a floating-point renderbuffer instead. This patch makes decompression_state have two FBOs, one for the GL_RGBA format and one for GL_RGBA32F. If a floating-point texture is encountered it will try setting up a floating-point FBO. It will now also check the status of the FBO and fall back to _mesa_get_teximage if the FBO is not complete. Reviewed-by: Ian Romanick <[email protected]>
* swrast: Enable GL_ARB_texture_compression_bptc (Neil Roberts, 2014-08-12, 1 file, -0/+1)
| | | | | | Enables BPTC texture compression on the software rasterizer. Reviewed-by: Ian Romanick <[email protected]>
* i965: Enable the GL_ARB_texture_compression_bptc extension (Neil Roberts, 2014-08-12, 2 files, -0/+7)
| | | | | | | Enables the BPTC extension on Gen>=7 and adds the necessary format mappings to get the right surface type value. Reviewed-by: Ian Romanick <[email protected]>
* mesa/main: Modify generate_mipmap_compressed to cope with float textures (Neil Roberts, 2014-08-12, 1 file, -5/+8)
| | | | | | | | | Once we add BPTC texture support we will need to generate mipmaps for compressed floating point textures too. Most of the code seems to already be there but it just needs a few extra lines to get it to use GL_FLOAT instead of GL_UNSIGNED_BYTE as the type for the temporary buffers. Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add texstore functions for BPTC-compressed textures (Neil Roberts, 2014-08-12, 3 files, -0/+709)
| | | | | | | | | | | | | | | | | | | | | | This adds compressors for all four of the BPTC compressed-texture formats. The compressor is written from scratch and takes a very simple approach. It always uses a single mode of the BPTC format (4 for unorm and 3 for half-floats) and picks the two endpoints by dividing the texels into those which have more or less than the average luminance of the block and then calculating an average color of the texels within each division. It's probably not really sensible to try to use BPTC compression at runtime because for example with the Nvidia offline compression tool it can take in the order of an hour to compress a full-screen image. With that in mind I don't think it's worth having a proper compressor in Mesa and this approach gives reasonable results for a usage that is basically a corner case. v2: Always use the custom compressor, even for the unorm formats. Fix the quantization step for the half-float format compressor. Fixed a typo which was breaking the right-hand edge of half-float textures with a width that isn't a multiple of four. Reviewed-by: Ian Romanick <[email protected]>
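The endpoint selection described above can be sketched roughly as follows. This is an illustration only, not the Mesa compressor: `pick_endpoints` is a hypothetical name, texels are assumed to be 8-bit RGB triples, and luminance is approximated as `r + g + b`, which is an assumption rather than the weighting the patch uses. Only the endpoint choice is covered, not the encoding of the block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split n RGB texels (rgb holds n * 3 bytes) into those
 * above and those at or below the average luminance, then average each
 * division to get the two endpoint colors. */
static void pick_endpoints(const uint8_t *rgb, int n,
                           uint8_t lo[3], uint8_t hi[3])
{
   assert(n > 0);

   /* Average luminance of the block (assumed approximation: r + g + b). */
   uint32_t lum_sum = 0;
   for (int i = 0; i < n; i++)
      lum_sum += rgb[i * 3] + rgb[i * 3 + 1] + rgb[i * 3 + 2];
   uint32_t avg = lum_sum / (uint32_t) n;

   /* Accumulate per-division color sums: 0 = darker, 1 = brighter. */
   uint32_t sum[2][3] = { { 0, 0, 0 }, { 0, 0, 0 } };
   uint32_t count[2] = { 0, 0 };
   for (int i = 0; i < n; i++) {
      uint32_t lum = rgb[i * 3] + rgb[i * 3 + 1] + rgb[i * 3 + 2];
      int side = lum > avg ? 1 : 0;
      for (int c = 0; c < 3; c++)
         sum[side][c] += rgb[i * 3 + c];
      count[side]++;
   }

   /* Division 0 is never empty; if division 1 is, reuse division 0. */
   int hi_src = count[1] ? 1 : 0;
   for (int c = 0; c < 3; c++) {
      lo[c] = (uint8_t) (sum[0][c] / count[0]);
      hi[c] = (uint8_t) (sum[hi_src][c] / count[hi_src]);
   }
}
```

For a 4x4 block that is half black and half white, this yields a black and a white endpoint; a uniform block yields two identical endpoints.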
* mesa: Add texel fetch functions for BPTC-compressed textures (Neil Roberts, 2014-08-12, 4 files, -0/+1001)
| | | | | | | | | | Adds functions to fetch from any of the four BPTC-compressed formats. v2: Set the alpha component to 1.0 when fetching from the half-float formats instead of leaving it uninitialised. Don't linearize the alpha component when fetching from sRGB. Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add the format enums for BPTC-compressed images (Neil Roberts, 2014-08-12, 8 files, -0/+112)
| | | | | | | | | | | | | | | | | | This adds the following four Mesa image format enums which correspond to the four BPTC compressed texture formats: MESA_FORMAT_BPTC_RGBA_UNORM MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT It also updates the format information functions to handle these and the corresponding GL enums. v2: Also modify _mesa_get_format_color_encoding, _mesa_get_srgb_format_linear and _mesa_get_uncompressed_format Reviewed-by: Ian Romanick <[email protected]>
* mesa/format_info: Add support for the BPTC layout (Neil Roberts, 2014-08-12, 1 file, -0/+3)
| | | | | | | | | | | | | | | | | | Adds the ‘bptc’ layout to get_channel_bits. The channel bits for BPTC depend on the mode but as it only has to be an approximation this sets it to 8 for the two UNORM formats and 16 for the two half-float formats. These represent the minimum number of bits of variation that can be generated by the interpolation of the two formats. This doesn't quite match what we do for S3TC which only returns 4 even though it can similarly generate 8 bits from the interpolation. However it does match what we return for ETC2. For reference, NVidia seems to return 8 bits for the UNORM formats and 32 bits for the half-float formats. v2: Change the number of bits to 8/8/8/8 for the UNORM formats and 16/16/16 for the half-float formats. Reviewed-by: Jason Ekstrand <[email protected]>
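The v2 channel-bit rule above amounts to a small lookup. A minimal sketch in C, assuming the format is identified by its name string; `bptc_channel_bits` is a hypothetical helper and not the actual `get_channel_bits` code. Reporting 0 bits for alpha on the half-float formats is an assumption based on those formats being RGB only.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: approximate bits per channel for a BPTC format.
 * chan is one of 'r', 'g', 'b', 'a'. */
static int bptc_channel_bits(const char *format_name, char chan)
{
   assert(format_name != NULL);

   /* Half-float formats: 16/16/16, no alpha channel (assumed 0 bits). */
   if (strstr(format_name, "FLOAT") != NULL)
      return chan == 'a' ? 0 : 16;

   /* UNORM formats: 8/8/8/8. */
   return 8;
}
```

For example, `bptc_channel_bits("MESA_FORMAT_BPTC_RGBA_UNORM", 'a')` gives 8 and `bptc_channel_bits("MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT", 'r')` gives 16.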
* mesa/format_info: Add support for compressed floating-point formats (Neil Roberts, 2014-08-12, 1 file, -1/+3)
| | | | | | | | | If the name of a compressed texture format has ‘FLOAT’ in it it will now set the data type of the format to GL_FLOAT. This will be needed for the BPTC half-float formats. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Fix the base format for GL_COMPRESSED_RGB_BPTC_*_FLOAT_ARB (Neil Roberts, 2014-08-12, 1 file, -2/+2)
| | | | | | | | The signed and unsigned half-float BPTC-compressed formats were being reported as having a base format of GL_RGBA but they don't store an alpha channel so it should be GL_RGB. Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add the GL_ARB_texture_compression_bptc extension (Neil Roberts, 2014-08-12, 2 files, -0/+2)
| | | | | | | This adds a boolean in the gl_extensions struct for GL_ARB_texture_compression_bptc as well as an entry in extension_table. Reviewed-by: Ian Romanick <[email protected]>
* mesa/st: Move declaration to top of block. (José Fonseca, 2014-08-12, 1 file, -1/+3)
| | | | | | To fix MSVC build failure. Trivial.
* mesa/st: add support for dynamic sampler offsets (Ilia Mirkin, 2014-08-12, 1 file, -17/+42)
| | | | | | | | | | | | | | Replace the plain sampler index with a register reference to a sampler. We also need to keep track of the sampler array size when there is a relative reference so that we can mark the whole array used. To facilitate implementation, we add a separate ADDR register that exclusively handles the sampler relative address. Other approaches would be more invasive. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
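Marking the whole array used for a relative reference can be pictured as a bitmask update. This is a sketch under assumptions, not the state tracker's code: `mark_samplers_used` is a hypothetical name, and a 32-bit mask with one bit per sampler is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return the samplers-used mask after a reference to
 * the sampler at index base. A relative (dynamically indexed) reference
 * marks all array_size elements starting at base; a plain reference marks
 * only the one sampler. */
static uint32_t mark_samplers_used(uint32_t used, unsigned base,
                                   unsigned array_size, bool relative)
{
   unsigned count = relative ? array_size : 1;
   assert(count >= 1 && base + count <= 32);

   /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
   uint32_t bits = (count == 32) ? 0xFFFFFFFFu : ((1u << count) - 1u);
   return used | (bits << base);
}
```

A relative reference into a four-element array based at sampler 2 sets bits 2 through 5 (`0x3C`), whereas a constant reference to sampler 2 sets only bit 2.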
* mesa: Add a new function for getting the nonconst sampler array index (Chris Forbes, 2014-08-12, 2 files, -0/+14)
| | | | | | | | | | | | | If the array index is not a constant expression, the existing support will assume a zero offset (giving us the sampler index of the base of the array). For dynamically uniform indexing of sampler arrays, we need both that and the indexing expression. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* i965: Revert part of f5cc3fdcf1680b116612fac7c39f1bd79f5e555e. (Kenneth Graunke, 2014-08-11, 1 file, -1/+1)
| | | | | | Fixes non-termination in various Piglit tests. Reviewed-by: Jason Ekstrand <[email protected]>
* gallium: remove PIPE_SHADER_CAP_MAX_ADDRS (Marek Olšák, 2014-08-11, 1 file, -2/+1)
| | | | | | | | | | | | | | | This limit is fixed in Mesa core and cannot be changed. It only affects ARB_vertex_program and ARB_fragment_program. The minimum value for ARB_vertex_program is 1 according to the spec. The maximum value for ARB_vertex_program is limited to 1 by Mesa core. The value should be zero for ARB_fragment_program, because it doesn't support ARL. Finally, drivers shouldn't mess with these values arbitrarily. Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: compute supported GL versions at DRIscreen creation (Marek Olšák, 2014-08-11, 1 file, -4/+27)
| | | | | | | This computes all GL versions before any context is created. It's a requirement for GLX_MESA_query_renderer. Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: pass st_config_options to query_versions (Marek Olšák, 2014-08-11, 1 file, -0/+1)
| | | | | | | So move it from dri_context to dri_screen. This will be needed for version computations. Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: return version 0 if the computed core profile version is too low (Marek Olšák, 2014-08-11, 1 file, -2/+7)
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: add _mesa_get_version, a ctx-independent variant of _mesa_compute_version (Marek Olšák, 2014-08-11, 2 files, -126/+152)
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: add a context-independent variant of _mesa_override_gl_version (Marek Olšák, 2014-08-11, 2 files, -10/+23)
| | | | | | | v2: changed GLboolean -> bool Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: make _mesa_init_constants context-independent and public (Marek Olšák, 2014-08-11, 2 files, -101/+104)
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: make _mesa_init_extensions context-independent (Marek Olšák, 2014-08-11, 3 files, -6/+6)
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: make st_init_limits context-independent (Marek Olšák, 2014-08-11, 3 files, -10/+14)
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: move ShaderCompilerOptions into gl_constants (Marek Olšák, 2014-08-11, 14 files, -31/+31)
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: make st_init_extensions context-independent (Marek Olšák, 2014-08-11, 3 files, -192/+241)
| | | | | | | | | Setting Const.MaxSamples needed a rework, so that it doesn't call st_choose_format, which depends on st_context. Other than that, there is no change in functionality. Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: make _mesa_override_glsl_version context-independent (Marek Olšák, 2014-08-11, 6 files, -7/+8)
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/stapi: move setting GL versions to the state tracker (Marek Olšák, 2014-08-11, 1 file, -0/+14)
| | | | | | | All flags are set for st/mesa, so the state tracker doesn't have to check them. Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: convert the ETC1 format to an uncompressed one if unsupported (Marek Olšák, 2014-08-11, 5 files, -12/+40)
| | | | | | | | I don't know of any hardware which supports it. With this, GL_OES_compressed_ETC1_RGB8_texture is supported if RGBA8 is supported. Reviewed-by: Glenn Kennard <[email protected]>
* st/mesa: add st_context parameter to st_mesa_format_to_pipe_format (Marek Olšák, 2014-08-11, 9 files, -29/+32)
| | | | | | This will be used by the next commit. Reviewed-by: Glenn Kennard <[email protected]>
* st/mesa: advertise ARB_ES3_compatibility if GLSL 3.30 and ETC2 are supported (Marek Olšák, 2014-08-11, 1 file, -0/+28)
|
* st/mesa: add support for ETC2 formats (Marek Olšák, 2014-08-11, 4 files, -8/+93)
| | | | | | | The formats are emulated by translating them into plain uncompressed formats, because I don't know of any hardware which supports them. This is required for GLES 3.0 and ARB_ES3_compatibility (GL 4.3).
* mesa: add helper _mesa_is_format_etc2 (Marek Olšák, 2014-08-11, 2 files, -0/+28)
| | | | | | | v2: renamed GLboolean -> bool Reviewed-by: Glenn Kennard <[email protected]> Reviewed-by: Ian Romanick <[email protected]>