summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* intel/decoders: handle decoding MI_BBS from ringLionel Landwerlin2019-03-071-1/+1
| | | | | | | | | An MI_BATCH_BUFFER_START in the ring buffer acts as a second level batchbuffer (aka jump back to ring buffer when running into a MI_BATCH_BUFFER_END). Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/decoders: add address space indicator to get BOsLionel Landwerlin2019-03-071-1/+1
| | | | | | | Some commands like MI_BATCH_BUFFER_START have this indicator. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* st/glsl: start spilling out common st glsl conversion codeTimothy Arceri2019-03-067-122/+222
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NIR and TGSI paths are currently intertwined which makes it not only hard to follow but also makes it hard to take advantage of the differences in IR. Here we take the first step to splitting that path apart. With this we take the opportunity to no longer call the GLSL IR optimisation passes after the final lowering calls for NIR. We can instead just use the NIR passes which can produce better code and should also result in faster compile times. The speed-up can be measured in some dolphin uber shaders due to no longer calling lower_if_to_cond_assign() for example dolphin/ubershaders/120.shader_test goes from ~1.63 -> ~1.53 seconds on my machine. There are some code changes as a result of not calling lower_if_to_cond_assign(), this is because it flattens ifs that contain UBOs where as NIR's peephole select doesn't. This is were most of the regressions in Max Waves happens with shader-db. shader-db results (VEGA): Totals from affected shaders: SGPRS: 2349056 -> 2349640 (0.02 %) VGPRS: 1322160 -> 1323300 (0.09 %) Spilled SGPRs: 21190 -> 21527 (1.59 %) Spilled VGPRs: 99 -> 99 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 72 -> 72 (0.00 %) dwords per thread Code Size: 57260904 -> 57270932 (0.02 %) bytes Compile Time: 1107186 -> 1022942 (-7.61 %) milliseconds LDS: 786 -> 786 (0.00 %) blocks Max Waves: 391932 -> 391619 (-0.08 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Eric Anholt <[email protected]>
* i965: stop calling nir_lower_returns()Timothy Arceri2019-03-061-3/+1
| | | | | | We now call this for all drivers in glsl_to_nir() instead. Reviewed-by: Eric Anholt <[email protected]>
* glsl: use NIR function inlining for drivers that use glsl_to_nir()Timothy Arceri2019-03-062-2/+2
| | | | | | | | glsl_to_nir() is still missing support for converting certain functions to NIR, so for those we use the GLSL IR optimisations to remove the functions. Reviewed-by: Eric Anholt <[email protected]>
* st/nir: Move 64-bit lowering laterJason Ekstrand2019-03-061-2/+5
| | | | | | | | | | | | | | Now that we have a loop unrolling cost function and loop unrolling isn't going to kill us the moment we have a 64-bit op in a loop, we can go ahead and move 64-bit lowering later. This gives us the opportunity to do more optimizations and actually let the full optimizer run even on 64-bit ops rather than hoping one round of opt_algebraic will fix everything. This substantially reduces both fp64 shader compile times and the resulting code size. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_doubles: Inline functions directly in lower_doublesJason Ekstrand2019-03-062-23/+8
| | | | | | | | | | | | Instead of trusting the caller to already have created a softfp64 function shader and added all its functions to our shader, we simply take the softfp64 shader as an argument and do the function inlining ouselves. This means that there's no more nasty functions lying around that the caller needs to worry about cleaning up. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Add a shared helper for building float64 shadersJason Ekstrand2019-03-063-99/+5
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Compile the fp64 program based on nir optionsJason Ekstrand2019-03-061-1/+2
| | | | | | | | | | Instead of looking the devinfo directly, look at the lowering options we provided to NIR. This is more accurate as it's now checking for "do we need full software lowering" rather than a hardware bit. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()Timothy Arceri2019-03-062-4/+4
| | | | | | | | | | Replace done using: find ./src -type f -exec sed -i -- \ 's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename record_location_offset() -> struct_location_offset()Timothy Arceri2019-03-062-2/+2
| | | | | | | | | | Replace done using: find ./src -type f -exec sed -i -- \ 's/record_location_offset(/struct_location_offset(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename is_record() -> is_struct()Timothy Arceri2019-03-063-8/+8
| | | | | | | | | | Replace was done using: find ./src -type f -exec sed -i -- \ 's/is_record(/is_struct(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* nir: Add multiplier argument to nir_lower_uniforms_to_ubo.Timur Kristóf2019-03-052-2/+2
| | | | | | | | | | | | | Note that locations can be set in different units, and the multiplier argument caters to supporting these different units. For example, st_glsl_to_nir uses dwords (4 bytes) so the multiplier should be 4, while tgsi_to_nir uses bytes, so the multiplier should be 16. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: Move nir_lower_uniforms_to_ubo to compiler/nir.Timur Kristóf2019-03-056-105/+2
| | | | | | | | | | | | The nir_lower_uniforms_to_ubo function is useful outside of mesa/state_tracker, and in fact is needed to produce NIR for drivers that have the PIPE_CAP_PACKED_UNIFORMS capability. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Implement threaded GL support.Kenneth Graunke2019-03-053-0/+51
| | | | | | | | | | | | | | | | | | | Now i965 supports mesa_glthread=true like Gallium drivers do. According to Markus (degasus), the Citra emulator now runs ~30% faster. Emmanuel (linkmauve) also reported that the Dolphin emulator improved by 2.8x on one game. (Both of those still need to be added to drirc.) An Intel Mesa CI run with mesa_glthread=true appears to be happy. Bioshock Infinite's benchmark mode seems to be around 15-20% faster on my Skylake GT4 at 1920x1080. Tested-by: Markus Wick <[email protected]> Tested-by: Emmanuel Gil Peyrot <[email protected]> Tested-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: [u/i]mulExtended optimization for GLSLSagar Ghuge2019-03-042-0/+2
| | | | | | | | | | | | | | | Optimize mulExtended to use 32x32->64 multiplication. Drivers which are not based on NIR, they can set the MUL64_TO_MUL_AND_MUL_HIGH lowering flag in order to have same old behavior. v2: Add missing condition check (Jason Ekstrand) Signed-off-by: Sagar Ghuge <[email protected]> Suggested-by: Matt Turner <Matt Turner <[email protected]> Suggested-by: Jason Ekstrand <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/mesa: whitespace fixes in st_texture.hBrian Paul2019-03-041-9/+13
| | | | Trivial.
* st/mesa: line wrapping, whitespace fixes in st_cb_texture.cBrian Paul2019-03-041-2/+4
| | | | Trivial.
* st/mesa: whitespace fixes in st_sampler_view.cBrian Paul2019-03-041-6/+10
| | | | | Replace tabs w/ spaces. 80-column wrapping. Trivial.
* st/mesa: Invalidate the gallium array atom only if needed.Mathias Fröhlich2019-03-041-2/+4
| | | | | | | | | | | | Now that the buffer object usage history tracks if it is being used as vertex buffer object, we can restrict setting the ST_NEW_VERTEX_ARRAYS bit to dirty on glBufferData calls to buffers that are potentially used as vertex buffer object. Also put a note that the same could be done for index arrays used in indexed draws. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* mesa: Track buffer object use also for VAO usage.Mathias Fröhlich2019-03-044-4/+15
| | | | | | | | | We already track the usage history for buffer objects in a lot of aspects. Add GL_ARRAY_BUFFER and GL_ELEMENT_ARRAY_BUFFER to gl_buffer_object::UsageHistory. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* mesa: Expose EXT_texture_query_lod and add support for its use shadersGert Wollny2019-03-031-0/+1
| | | | | | | | | | EXT_texture_query_lod provides the same functionality for GLES like the ARB extension with the same name for GL. v2: Set ES 3.0 as minimum GLES version as required by the extension Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: add support for lowering fp64/int64 for nir driversDave Airlie2019-03-021-1/+98
| | | | | | | | | | | | | | | | | | | | This might enough for iris and possible r600 (when it gets NIR) This appears to work for iris. v2: * change cap return so DOUBLES == 2 means sw emu v3: * Refactor using int64/doubles lowering options which were added into nir options * Remove DOUBLES == 2 added in v2 [jordan: Remove "2" value on PIPE_CAP_DOUBLES] [jordan: Use lowering options added to nir options] Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* st/nir: count num_uniforms for FS bultin shaderCaio Marcelo de Oliveira Filho2019-02-271-0/+2
| | | | | | | | Usually the uniforms will be assigned locations and have their slots counted automatically, but for builtin shaders the location assignment is manual. So count them too otherwise we get num_uniforms == 0. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: fix shader cache for packed param listTimothy Arceri2019-02-282-0/+7
| | | | | | | | | | | | | | | Some types of params such as some builtins are always padded. We need to keep track of this so we can restore the list correctly. Here we also remove a couple of cache entries that are not actually required as they get rebuilt by the _mesa_add_parameter() calls. This patch fixes a bunch of arb_texture_multisample and arb_sample_shading piglit tests for the radeonsi NIR backend. Fixes: edded1237607 ("mesa: rework ParameterList to allow packing") Reviewed-by: Marek Olšák <[email protected]>
* i965: Fix allow_higher_compat_version workaround limited by OpenGL 3.0Yevhenii Kolesnikov2019-02-281-6/+12
| | | | | | | | | | | | Added check for higher compat profile being allowed before assigning certain extensions. Fixes: 272fe9494232 (mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile) Signed-off-by: Danylo Piliaiev <[email protected]> Signed-off-by: Yevhenii Kolesnikov <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107052
* mesa: fix display list corner case assertionBrian Paul2019-02-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a failed assertion in glDeleteLists() for the following case: list = glGenLists(1); glDeleteLists(list, 1); when those are the first display list commands issued by the application. When we generate display lists, we plug in empty lists created with the make_list() helper. This function uses the OPCODE_END_OF_LIST opcode but does not call dlist_alloc() which would set the InstSize[OPCODE_END_OF_LIST] element to non-zero. When the empty list was deleted, we failed the InstSize[opcode] > 0 assertion. Typically, display lists are created with glNewList/glEndList so we set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc(). That's why this bug wasn't found before. To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST] element in make_list(). The game oolite was hitting this. Fixes: https://github.com/OoliteProject/oolite/issues/325 Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: whitespace/formatting fixes in st_cb_texture.cBrian Paul2019-02-261-32/+58
| | | | | | Remove trailing whitespace, replace tabs w/ spaces, etc. Trivial.
* i965: fixed clamping in set_scissor_bits when the y is flippedEleni Maria Stea2019-02-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calculating the scissor rectangle fields with the y flipped (0 on top) can generate negative values that will cause assertion failure later on as the scissor fields are all unsigned. We must clamp the bbox values again to make sure they don't exceed the fb_height. Also fixed a calculation error. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999 https://bugs.freedesktop.org/show_bug.cgi?id=109594 v2: - I initially clamped the values inside the if (Y is flipped) case and I made a mistake in the calculation: the clamp of the bbox[2] should be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I shouldn't have changed the ScissorRectangleYMax calculation. As the fixed code is equivalent with using CLAMP instead of MAX2 at the top of the function when bbox[2] and bbox[3] are calculated, and the 2nd is more clear, I replaced it. (Nanley Chery) v3: - Reversed the CLAMP change in bbox[3] as the API guarantees that the viewport height is positive. (Nanley Chery) v4: - Added nomination for the mesa-stable branch and the link to the second bugzilla bug (Nanley Chery) CC: <[email protected]> Tested-by: Paul Chelombitko <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* i965: Add support for sampling from XYUV imagesKasireddy, Vivek2019-02-262-0/+9
| | | | | | | | | Add support to the i965 DRI driver to sample from XYUV8888 buffers. Signed-off-by: Vivek Kasireddy <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: use nir_variable_create instead of open-coding the logicTapani Pälli2019-02-261-6/+4
| | | | | | | Fixes: 3d7611e9 "st/nir: use NIR for asm programs" Reported-by: Matthias Lorenz <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/mesa: Reduce array updates due to current changes.Mathias Fröhlich2019-02-262-1/+10
| | | | | | | | | | | | Since using bitmasks we can easily check if we have any current value that is potentially uploaded on array setup. So check for any potential vertex program input that is not already a vao enabled array. Only flag array update if there is a potential overlap. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* dri: meson: do not prefix user provided dri-drivers-pathSergii Romantsov2019-02-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | The user can select the location where there dri drivers are installed by the dri-drivers-path meson option. By default path will be $prefix/$libdir/dri. Currently we add $prefix to the user provided path. Resulting in an incorrect or even missing path. v2: fixed dri_search_path by default, rebased to master v3: new commit-message (Emil Velikov), cc mesa-stable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698 CC: Rafael Antognolli <[email protected]> CC: Dylan Baker <[email protected]> Cc: 18.3 19.0 <[email protected]> Fixes: 306914db92e1 (meson: Add dridriverdir variable to dri.pc.) Signed-off-by: Sergii Romantsov <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa/core: Enable EXT_depth_clamp for GLES >= 2.0Gert Wollny2019-02-252-4/+5
| | | | | | | | | | | | | | | | The extension NV_depth_clamp is written against OpenGL 1.2.1, and since GLES 2.0 is based on GL 2.0 there is no reason not to enable this extension also for GLES >= 2.0. v2: Use EXT_depth_clamp that has been proposed to Khronos v3: - Fix check for extension availability (Erik Faya-Lund) - Also fix the test in is_enabled v4: - Test both, ARB and EXT extension (Erik) v5: - Fix white space errors (Erik) Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]>
* mesa: Fix RGBBuffers for renderbuffers with sized internal formatsKenneth Graunke2019-02-251-1/+4
| | | | | | | | | | | | | | | For texture attachments, 'f' is texImg->_BaseFormat, but for renderbuffer attachments, 'f' is att->Renderbuffer->InternalFormat. InternalFormat may be something like GL_RGB8, which causes our (f == GL_RGB) check to fail. Switch to using a proper _BaseFormat, which drops the size. Fixes dEQP-GLES31.functional.draw_buffers_indexed.random. max_required_draw_buffers.15 on iris when combined with a driver fix. Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Timur Kristóf <[email protected]>
* st/mesa: remove unused header-fileErik Faye-Lund2019-02-243-43/+0
| | | | | | | | | | This header has been unused since f8f2520e88c ("st/mesa: Remove unnecessary headers"). And in the more than 8 years since, this hasn't been useful. So let's just get rid of it. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir, glsl: move pixel_center_integer/origin_upper_left to shader_info.fsAlejandro Piñeiro2019-02-217-32/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On GLSL that info is set as a layout qualifier when redeclaring gl_FragCoord, so somehow tied to a specific variable. But in practice, they behave as a global of the shader. On ARB programs they are set using a global OPTION (defined at ARB_fragment_coord_conventions), and on SPIR-V using ExecutionModes, that are also not tied specifically to the builtin. This patch moves that info from nir variable and ir variable to nir shader and gl_program shader_info respectively, so the map is more similar to SPIR-V, and ARB programs, instead of more similar to GLSL. FWIW, shader_info.fs already had pixel_center_integer, so this change also removes some redundancy. Also, as struct gl_program also includes a shader_info, we removed gl_program::OriginUpperLeft and PixelCenterInteger, as it would be superfluous. This change was needed because recently spirv_to_nir changed the order in which execution modes and variables are handled, so the variables didn't get the correct values. Now the info is set on the shader itself, and we don't need to go back to the builtin variable to set it. Fixes: e68871f6a ("spirv: Handle constants and types before execution modes") v2: (Jason) * glsl_to_nir: get the info before glsl_to_nir, while all the rest of the info gathering is happening * prog_to_nir: gather the info on a general info-gathering pass, not on variable setup. v3: (Jason) * Squash with the patch that removes that info from ir variable * anv: assert that OriginUpperLeft is true. It should be already set by spirv_to_nir. * blorp: set origin_upper_left on its core "compile fragment shader", not just on some specific places (for this we added an helper on a previous patch). * prog_to_nir: no need to gather specifically this fragcoord modes as the full gl_program shader_info is copied. * spirv_to_nir: assert that we are a fragment shader when handling this execution modes. v4: (reported by failing gitlab pipeline #18750) * state_tracker: update too due changes on ir.h/gl_program v5: * blorp: minor change after change on previous patch * radeonsi: update due this change. v6: (Timothy Arceri) * prog_to_nir: remove extra whitespace * shader_info: don't use :1 on origin_upper_left * glsl: program.fs.origin_upper_left/pixel_center_integer can be move out of the shader list loop
* st/mesa: always unmap the uploader in st_atom_array.cMarek Olšák2019-02-201-8/+6
| | | | | | | This is a no-op for drivers supporting persistent mappings. Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* i965: re-emit index buffer state on a reset option change.Andrii Simiklit2019-02-203-1/+13
| | | | | | | | | | | | | | Seems like we forget to update the index buffer (ib) status and IndexedDrawCutIndexEnable or CutIndexEnable flag is left unchanged it leads to ignoring of glEnable/glDisable functions for GL_PRIMITIVE_RESTART in some cases. The index buffer (ib) status should be re-emmited after the reset option change to avoid some unexpected behavior. Reviewed-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109451 Cc: <[email protected]> Signed-off-by: Andrii Simiklit <[email protected]> Signed-off-by: Andrii Simiklit <[email protected]>
* st/nir: use NIR for asm programsTimothy Arceri2019-02-192-1/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This uses prog_to_nir to translate ARB assembly programs to NIR. Co-authored by Tim Arceri, Dave Airlie, and Ken Graunke: - [Tim Arceri]: original patch - [Dave Airlie]: fix crashes with parameter names - [Ken Graunke]: - Rebase on SCALAR_ISA cap, lower wpos_ytransform too. - Rebase on streamout fixes. - Lower system values for fragcoord support. - Don't try to use prog_to_nir for ATI_fragment_shader programs. - Create TGSI for fixed-function or ARB vertex shaders even if the driver prefers NIR, so we can create draw module shaders for feedback/select emulation, which rely on TGSI. Tested on: - iris (Intel Skylake/Kabylake): Piglit & GL CTS - Ken Graunke - radeonsi (AMD Vega 64): Piglit - Ken Graunke - vc4/v3d - Piglit - Eric Anholt - freedreno - dEQP - Kristian Høgsberg Fixes lit_degenerate_case on vc4 and v3d, and vp-address-01, vp-arl-constant-array-huge-offset-neg, and vp-arl-neg-array on v3d. No Piglit regressions on radeonsi; no dEQP regressions on freedreno. Acked-by: Eric Anholt <[email protected]> Tested-by: Eric Anholt <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.Kenneth Graunke2019-02-191-2/+8
| | | | | | | | | | | | Even if the driver wants to use NIR shaders, we may need to have TGSI tokens for creating draw module vertex shaders for the feedback/select render modes. So...if the st_vertex_program has any TGSI...copy it to the variant. Acked-by: Eric Anholt <[email protected]> Tested-by: Eric Anholt <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Align doubles to a 64-bit starting boundary, even if packing.Kenneth Graunke2019-02-191-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | In the new Intel Iris driver, I am using Tim's new packed uniform storage system. It works great, with one caveat: our scalar compiler backend assumes that uniform offsets will be aligned to the underlying data type. For example, doubles must be 64-bit aligned, floats 32-bit, half-floats 16-bit, and so on. It does not need any other padding. Currently, _mesa_add_parameter aligns everything to 32-bit offsets, creating doubles that have an unaligned offset. This patch alters that code to align doubles to 64-bit offsets. This may be slightly less optimal for drivers which can support full packing, and allow reads from unaligned offsets at no penalty. We could make this extra alignment optional. However, it only comes into play when intermixing double and single precision uniforms. Doubles are already not too common, and intermixed values (floats then doubles) is probably even less common. At most, we burn a single 32-bit slot to the alignment, which is not that expensive. So, it doesn't seem worthwhile to add the extra complexity. Eventually, we'll likely want to update this code to allow half-float values to be packed tighter than 32-bit offsets. At that point, we'll probably want to revisit what drivers ultimately want, and add options. Acked-by: Timothy Arceri <[email protected]>
* compiler: Make is_64bit(GL_*) helper more broadly availableKenneth Graunke2019-02-191-0/+31
| | | | | | | | I'd like to use this in the prog_parameter.c code, so I need to move it into C, make it non-static, and so on. This probably isn't the ideal place for it, but I couldn't think of a better one. Acked-by: Timothy Arceri <[email protected]>
* i965: always enable EXT_float_blendIlia Mirkin2019-02-181-0/+1
| | | | | | | | | From the table in isl_format.c, it appears that all generations support blending on 32-bit float surfaces. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/mesa: enable GL_EXT_float_blend when possibleIlia Mirkin2019-02-181-0/+10
| | | | | | | | | If the driver supports PIPE_BIND_BLENABLE on RGBA32F, flip EXT_float_blend on (which will affect ES3 contexts). Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]>
* mesa: add explicit enable for EXT_float_blend, and error conditionIlia Mirkin2019-02-184-1/+26
| | | | | | | | | If EXT_float_blend is not supported, error out on blending of FP32 attachments in an ES2 context. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: scale factor changes should trigger recompileLionel Landwerlin2019-02-182-1/+16
| | | | | | | | Found by inspection. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 3da858a6b990c5 ("intel/compiler: add scale_factors to sampler_prog_key_data") Reviewed-by: Tapani Pälli <[email protected]>
* mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachmentTapani Pälli2019-02-181-2/+6
| | | | | | | | | | | | | | | This fixes invalid access to Attachment array which would occur if caller would exceed MaxColorAttachments. In practice this should not ever happen because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be valid and InvalidateFramebuffer will error out before but this should make coverity happy. v2: const, remove _EXT (Ian) CID: 1442559 Fixes: 0c42b5f3cb9 "mesa: wire up InvalidateFramebuffer" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Assert the execobject handles match for this deviceChris Wilson2019-02-161-0/+2
| | | | | | | Object handles are local to the device fd, so double check we are not mixing together objects from multiple screens on execbuf submission. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Removed the field etc_format from the struct intel_mipmap_treeEleni Maria Stea2019-02-153-18/+1
| | | | | | | | | After the previous changes to emulate the ETC/EAC formats using the secondary shadow miptree, the etc_format field of the intel_mipmap_tree struct became redundant and the remaining check that used it has been replaced. (Nanley Chery) Reviewed-by: Nanley Chery <[email protected]>