path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: Fix MapTextureImage for multi-slice/level stencil buffers. | Kenneth Graunke | 2016-04-26 | 1 | -2/+2
  We called intel_miptree_get_image_offset() to get the image offsets for the current level/slice, but then proceeded to ignore the results and clobber level/slice 0 every time.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94713
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
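The bug class described above can be illustrated with a minimal sketch. It assumes a simplified, linear (untiled) layout; `sketch_miptree`, `sketch_get_image_offset()` and `sketch_map_offset()` are hypothetical names for illustration and are not the driver's actual structures or functions.

```c
#include <stddef.h>

/* Hypothetical, simplified miptree: each (level, slice) image lives at an
 * x/y offset inside one linear buffer. Real stencil buffers are tiled. */
struct sketch_miptree {
   unsigned num_levels;
   unsigned num_slices;
   const unsigned *x_offsets; /* indexed by level * num_slices + slice */
   const unsigned *y_offsets; /* indexed by level * num_slices + slice */
   unsigned pitch;            /* bytes per row */
   unsigned cpp;              /* bytes per pixel */
};

/* Look up where the requested level/slice starts within the buffer. */
static void
sketch_get_image_offset(const struct sketch_miptree *mt,
                        unsigned level, unsigned slice,
                        unsigned *x, unsigned *y)
{
   const unsigned idx = level * mt->num_slices + slice;
   *x = mt->x_offsets[idx];
   *y = mt->y_offsets[idx];
}

/* Byte offset of pixel (x, y) within the requested level/slice. */
static size_t
sketch_map_offset(const struct sketch_miptree *mt,
                  unsigned level, unsigned slice,
                  unsigned x, unsigned y)
{
   unsigned image_x, image_y;
   sketch_get_image_offset(mt, level, slice, &image_x, &image_y);

   /* Leaving image_x/image_y out of this expression would address
    * level/slice 0 no matter which image was requested. */
   return (size_t)(image_y + y) * mt->pitch +
          (size_t)(image_x + x) * mt->cpp;
}
```

With the image offsets included, a request for level 1 lands in a different region of the buffer than the same pixel coordinates in level 0.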
* i965: Move TCS output indirect_offset.file check out a level. | Kenneth Graunke | 2016-04-26 | 1 | -42/+46
  I want to add another condition. Moving the indirect_offset.file check out a level should make this a little easier.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/fs: Reduce the response length of sampler messages on Skylake. | Kenneth Graunke | 2016-04-26 | 4 | -5/+28
  Often, we don't need a full 4 channels worth of data from the sampler. For example, depth comparisons and red textures only return one value. To handle this, the sampler message header contains a mask which can be used to disable channels, and reduce the response length (in SIMD16 mode on all hardware, and SIMD8 mode on Broadwell and later).

  We've never used it before, since it required setting up a message header. This meant trading a smaller response length for a larger message length and additional MOVs to set it up.

  However, Skylake introduces a terrific new feature: for headerless messages, you can simply reduce the response length, and it makes the implicit header contain an appropriate mask. So to read only RG, you would simply set the response length to 2 or 4 (SIMD8/16).

  This means we can finally take advantage of this at no cost.

  total instructions in shared programs: 9091831 -> 9073067 (-0.21%)
  instructions in affected programs: 191370 -> 172606 (-9.81%)
  helped: 2609
  HURT: 0

  total cycles in shared programs: 70868114 -> 68454752 (-3.41%)
  cycles in affected programs: 35841154 -> 33427792 (-6.73%)
  helped: 16357
  HURT: 8188

  total spills in shared programs: 3492 -> 1707 (-51.12%)
  spills in affected programs: 2749 -> 964 (-64.93%)
  helped: 74
  HURT: 0

  total fills in shared programs: 4266 -> 2647 (-37.95%)
  fills in affected programs: 3029 -> 1410 (-53.45%)
  helped: 74
  HURT: 0

  LOST: 1
  GAINED: 143

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
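The "2 or 4 (SIMD8/16)" arithmetic above can be written out as a small sketch. It assumes each returned channel occupies one register in SIMD8 and two in SIMD16; `sketch_sampler_response_length()` is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>

/* Number of registers the sampler would write back when only the first
 * channels_needed channels are read. Assumption: one register per channel
 * in SIMD8, two registers per channel in SIMD16. */
static unsigned
sketch_sampler_response_length(unsigned channels_needed, unsigned simd_width)
{
   assert(channels_needed >= 1 && channels_needed <= 4);
   assert(simd_width == 8 || simd_width == 16);

   return channels_needed * (simd_width / 8);
}
```

Reading only RG gives 2 registers in SIMD8 and 4 in SIMD16, matching the figures in the commit message; a full RGBA read in SIMD16 would be 8 under the same assumption.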
* i965/fs: Use inst->regs_written for rlen for texture instructions | Jason Ekstrand | 2016-04-26 | 2 | -9/+3
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/fs: Properly report regs_written from SAMPLEINFO | Jason Ekstrand | 2016-04-26 | 2 | -2/+9
  The previous behavior would only allocate one register and then write four, thus potentially stomping three innocent bystanders.

  Cc: [email protected]
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Set regs_written on texturing instructions | Jason Ekstrand | 2016-04-26 | 1 | -0/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Don't force a header for texture offsets of 0. | Kenneth Graunke | 2016-04-26 | 1 | -1/+3
  Calling textureOffset() with an offset of <0, 0, 0> is equivalent to calling texture(). We don't actually need to set up an offset, which causes a message header to be created.

  A fairly common pattern is to sample at a point with a bunch of offsets, and average them. It's natural to write all the lookups as textureOffset, but use <0, 0> for the center sample.

  shader-db results on Skylake:

  total instructions in shared programs: 9092095 -> 9092087 (-0.00%)
  instructions in affected programs: 2826 -> 2818 (-0.28%)
  helped: 12
  HURT: 2

  total cycles in shared programs: 70870166 -> 70870144 (-0.00%)
  cycles in affected programs: 15924 -> 15902 (-0.14%)
  helped: 2
  HURT: 0

  This also helps prevent code quality regressions in a future patch.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
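The check this commit describes amounts to testing whether every constant offset component is zero. A minimal sketch, assuming the offsets are available as a plain integer array; `sketch_offsets_need_header()` is a hypothetical name and does not reflect the driver's actual code:

```c
#include <stdbool.h>

/* Returns true only if at least one constant texel offset component is
 * non-zero, i.e. only if an offset (and thus a message header) is
 * actually needed. */
static bool
sketch_offsets_need_header(const int *offsets, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      if (offsets[i] != 0)
         return true;
   }
   return false;
}
```

For the center sample of an averaging kernel written as textureOffset() with <0, 0>, this returns false, so the lookup can be treated like a plain texture() call.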
* i965/blorp: Convert state setup to C | Jason Ekstrand | 2016-04-26 | 4 | -4/+3
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Make state setup C-safe | Jason Ekstrand | 2016-04-26 | 3 | -4/+4
  Previously they (very rarely) used C++isms that prevented them from being compiled as C. As of this commit, they can be compiled as either C or C++.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Convert brw_blorp.cpp to a C file | Jason Ekstrand | 2016-04-26 | 2 | -5/+2
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Make all of brw_blorp.h accessible to C | Jason Ekstrand | 2016-04-26 | 1 | -9/+8
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Turn brw_blorp_params into a C-style struct | Jason Ekstrand | 2016-04-26 | 7 | -80/+71
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Turn coord_transform into a C-style struct | Jason Ekstrand | 2016-04-26 | 2 | -17/+16
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Turn blorp_surface_info into a C-style struct | Jason Ekstrand | 2016-04-26 | 7 | -69/+69
  This commit is mostly mechanical except that it changes where we set the swizzle. Previously, the blorp_surface_info constructor defaulted the swizzle to SWIZZLE_XYZW. Now, we memset to zero and fill out the swizzle when we set up the rest of the struct.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Roll mip_info into surface_info | Jason Ekstrand | 2016-04-26 | 2 | -37/+17
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Get rid of the blorp_blit_params class | Jason Ekstrand | 2016-04-26 | 2 | -167/+131
  It was really just a wrapper around the function that constructed it.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Remove the hiz params class | Jason Ekstrand | 2016-04-26 | 2 | -37/+42
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Remove the clear params classes | Jason Ekstrand | 2016-04-26 | 1 | -132/+83
  They didn't really add anything other than a key and extra layers of function calls. This commit just inlines the extra functions and gets rid of the extra classes.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Remove the arguments to brw_blorp_params() | Jason Ekstrand | 2016-04-26 | 2 | -9/+5
  No one was using anything other than the defaults.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Refactor to get rid of the get_wm_prog virtual function | Jason Ekstrand | 2016-04-26 | 7 | -97/+58
  Instead of having a virtual member function for getting the WM/PS kernel, we simply add fields for prog_data and the kernel to brw_blorp_params and always make sure those get set as part of the different constructors.

  v2: Use prog_data != NULL to check for a valid program instead of a magic kernel offset value

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
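The v2 note above (checking a pointer instead of a magic kernel offset) can be sketched as follows. The struct and field names here are hypothetical stand-ins chosen for illustration; they are not the actual layout of brw_blorp_params.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for compiled program metadata. */
struct sketch_prog_data {
   unsigned dispatch_width;
};

/* Hypothetical parameter struct: plain fields replace a virtual getter. */
struct sketch_blorp_params {
   const struct sketch_prog_data *wm_prog_data; /* NULL means no program */
   uint32_t wm_prog_kernel;                     /* offset; 0 can be valid */
};

/* A valid program is signalled by the pointer, so no kernel offset value
 * has to be reserved as a "no program" marker. */
static bool
sketch_has_wm_program(const struct sketch_blorp_params *params)
{
   return params->wm_prog_data != NULL;
}
```

One reason to prefer this shape: a kernel offset of 0 may be a legitimate location, so reserving a sentinel offset is fragile, while a NULL pointer is unambiguous.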
* i965/meta: initialize values to avoid random behaviour on error path | Juha-Pekka Heikkila | 2016-04-26 | 1 | -1/+1
  If brw_meta_stencil_blit() errored at the wrong place, 'target' would be uninitialized and cause random behaviour on leaving the function.

  Signed-off-by: Juha-Pekka Heikkila <[email protected]>
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* meta: Avoid random memory access on error | Juha-Pekka Heikkila | 2016-04-26 | 1 | -1/+1
  Initialize drawFb to NULL in _mesa_meta_CopyImageSubData_uncompressed(). If getting readFb fails, an uninitialized drawFb will cause randomness on cleanup.

  Signed-off-by: Juha-Pekka Heikkila <[email protected]>
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
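The error-path pattern behind this fix can be shown with a generic sketch. It uses malloc()/free() in place of framebuffer objects, and `sketch_copy()` with its `fail_first` parameter is a hypothetical function made up for illustration:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Acquire two resources; on any failure, jump to a shared cleanup label.
 * fail_first simulates the first acquisition failing. */
static bool
sketch_copy(bool fail_first)
{
   bool ok = false;
   void *read_fb = NULL;
   void *draw_fb = NULL; /* initialised, because cleanup may run before it is assigned */

   read_fb = fail_first ? NULL : malloc(16);
   if (read_fb == NULL)
      goto cleanup;

   draw_fb = malloc(16);
   if (draw_fb == NULL)
      goto cleanup;

   ok = true;

cleanup:
   /* free(NULL) is a no-op, so both calls are safe on every path. */
   free(draw_fb);
   free(read_fb);
   return ok;
}
```

Without the NULL initialisation, the early `goto cleanup` would hand an indeterminate pointer to the cleanup code, which is the randomness the commit message refers to.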
* mesa: Remove every double semi-colon | Jakob Sinclair | 2016-04-26 | 2 | -2/+2
  Signed-off-by: Jakob Sinclair <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* mesa/main: removing double semi-colons | Jakob Sinclair | 2016-04-26 | 2 | -2/+2
  Trivial change. Removing unnecessary semi-colons from the code. I don't have push access so someone reviewing this can push it.

  Signed-off-by: Jakob Sinclair <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* scons: Whenever possible decide what to do based on platform and not compiler. | Jose Fonseca | 2016-04-26 | 1 | -2/+1
  Compilers like GCC and Clang are effectively available everywhere, so their presence/absence is seldom conclusive.

  Furthermore, all compilers we use now have stdint.h.

  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* glsl: add ability to use essl 3.20 | Ilia Mirkin | 2016-04-25 | 2 | -0/+2
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* main: select ES3.2 version when all extensions are available | Ilia Mirkin | 2016-04-25 | 1 | -1/+17
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* mesa/st: log some additional invalid-fbo cases | Rob Clark | 2016-04-25 | 1 | -0/+3
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* i965: Unroll SIMD16 DDY_FINE on Sandybridge. | Kenneth Graunke | 2016-04-25 | 1 | -1/+5
  This fixes 10 dEQP-GLES3 subtests: dEQP-GLES3.functional.shaders.derivate.dfdy.texture.float_nicest.*.

  Matt noticed that our Piglit tests for this use even numbered registers, while the failing dEQP tests use odd numbered registers. We believe that it works for even numbered registers, but not otherwise.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* mesa/gles: Allow format GL_RED to be used with MESA_FORMAT_R_UNORM | Jordan Justen | 2016-04-25 | 1 | -0/+2
  If the bound framebuffer has a format of MESA_FORMAT_R_UNORM, then IMPLEMENTATION_COLOR_READ_FORMAT will return GL_RED.

  This change applies to OpenGLES contexts where additional restrictions are placed on the formats that are allowed to be supported.

  Fixes OpenGLES 3.1 CTS tests:
  * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16
  * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16Linear
  * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32F
  * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32FLinear

  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Mark URB reads as volatile. | Kenneth Graunke | 2016-04-25 | 1 | -0/+3
  They can be affected by URB writes. In the upcoming scalar TCS backend, this prevents read-modify-write cycles from being broken by CSE removing reads.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965: Make a few tessellation related functions non-static. | Kenneth Graunke | 2016-04-25 | 3 | -47/+51
  Also, move them to brw_shader.cpp so they're in a location for code used by both the vec4 and fs worlds.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965/tex_image: Flush certain subnormal ASTC channel values | Nanley Chery | 2016-04-23 | 1 | -0/+87
  When uploading a linear, void-extent, ASTC LDR block on Skylake, we are required to flush to zero the UNORM16 channel values that would be denormalized. This is specifically required for the values: 1, 2, and 3.

  Fixes the 14 failing tests in: dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.*

  v2: Split out flushing function (Kristian Høgsberg)
  v3: Map with READ instead of INVALIDATE (Kenneth Graunke)

  Signed-off-by: Nanley Chery <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
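The flush rule stated above (UNORM16 values 1, 2, and 3 become zero) can be sketched in isolation. This deliberately leaves out how a void-extent block is detected and where its channels sit in the ASTC block; `sketch_flush_unorm16()` and `sketch_flush_channels()` are hypothetical helpers, not the driver's functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Flush the UNORM16 values 1, 2 and 3 to zero; everything else,
 * including 0 itself, passes through unchanged. */
static uint16_t
sketch_flush_unorm16(uint16_t value)
{
   return value < 4 ? 0 : value;
}

/* Apply the flush to every channel of an already-extracted color. */
static void
sketch_flush_channels(uint16_t *channels, size_t count)
{
   for (size_t i = 0; i < count; i++)
      channels[i] = sketch_flush_unorm16(channels[i]);
}
```

In a real upload path this would run only on blocks identified as linear, void-extent, LDR blocks, with the channels read from and written back to the mapped texture data.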
* i965/blorp: Enable for buffer resolves | Topi Pohjolainen | 2016-04-23 | 1 | -1/+1
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94181
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Enable for normal color clears | Topi Pohjolainen | 2016-04-23 | 1 | -0/+9
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Fix clear code for ignoring colormask for XRGB formats on Gen9+ | Topi Pohjolainen | 2016-04-23 | 1 | -7/+26
  This is the equivalent of 73b01e2711ff45a1f313d5372d6c8fa4fe55d4d2 for blorp.

  v2 (Ken): No need to call _mesa_format_has_color_component() now that the number of components is gotten from _mesa_base_format_component_count().

  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/formats: Take luminance into account in component count | Topi Pohjolainen | 2016-04-23 | 1 | -0/+1
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/blorp: Do not trigger re-emission of base state address | Topi Pohjolainen | 2016-04-23 | 2 | -2/+0
  In case blorp needs to configure it, it will be just as if the render or compute pipeline had configured it.

  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Reconfigure base state address only if needed | Topi Pohjolainen | 2016-04-23 | 3 | -3/+7
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Use BRW_NEW_BLORP instead of trashing all state bits | Topi Pohjolainen | 2016-04-23 | 2 | -5/+2
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make all atoms to track BRW_NEW_BLORP by default | Kenneth Graunke | 2016-04-23 | 62 | -46/+179
  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Introduce state flag for blorp | Topi Pohjolainen | 2016-04-23 | 2 | -0/+3
  In the past, BLORP has clobbered all BRW_NEW_* state flags, to trigger re-emission of the entire 3D pipeline on the next draw. However, there are some packets BLORP simply leaves alone, so there's no need to re-emit them.

  Trying to reduce the set of dirty bits flagged after BLORP runs is tricky. Instead, we introduce a BRW_NEW_BLORP flag. This should be set on any atom which emits a packet that BLORP also emits. When BLORP runs, it will flag BRW_NEW_BLORP, causing those packets to get re-emitted. This also makes it easy to avoid re-emitting specific atoms - we can simply drop the BRW_NEW_BLORP flag on those.

  To start, we assume that all packets need to be re-emitted. This is the safest approach and closest to the existing code's behavior. Many of these are obviously not required, and can be dropped in subsequent patches.

  Signed-off-by: Topi Pohjolainen <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
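The dirty-flag scheme described above can be modelled with a small sketch. All names here (`SKETCH_NEW_*`, `sketch_atom`, `sketch_blorp_exec()`, `sketch_upload()`) are hypothetical and only mimic the idea; they are not the driver's state-tracking structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dirty bits, one of which is reserved for BLORP. */
enum {
   SKETCH_NEW_VIEWPORT = 1u << 0,
   SKETCH_NEW_BLORP    = 1u << 1,
};

/* Hypothetical state atom: re-emitted when any bit it listens to is dirty. */
struct sketch_atom {
   uint32_t dirty;      /* bits this atom depends on */
   unsigned emit_count; /* how often its packet has been emitted */
};

/* BLORP flags only its own bit instead of dirtying everything. */
static void
sketch_blorp_exec(uint32_t *state_dirty)
{
   *state_dirty |= SKETCH_NEW_BLORP;
}

/* On the next draw, emit every atom whose mask overlaps the dirty bits. */
static void
sketch_upload(struct sketch_atom *atoms, size_t count, uint32_t *state_dirty)
{
   for (size_t i = 0; i < count; i++) {
      if (atoms[i].dirty & *state_dirty)
         atoms[i].emit_count++;
   }
   *state_dirty = 0;
}
```

An atom that lists the BLORP bit is re-emitted after BLORP runs, while an atom that drops it is left alone, which is the per-atom opt-out the commit message mentions.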
* i965/blorp/gen6: Use normal base state address setup | Topi Pohjolainen | 2016-04-23 | 3 | -54/+5
  This is identical to the blorp version, which differs only in case the fragment shader isn't used. In that case blorp would reset the batch buffer address to zero. This is not really needed, and having blorp use a base state address setup that is compatible with the normal upload allows one to skip resetting it.

  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove pointers to non-existing atoms | Topi Pohjolainen | 2016-04-23 | 1 | -8/+0
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Disable KHR_texture_compression_astc_hdr on Gen9 | Nanley Chery | 2016-04-22 | 2 | -4/+3
  Although Gen9 samples from most HDR ASTC surfaces correctly, there are currently no software workarounds to fix the incorrect sampling that occurs in others with certain color endpoint modes.

  With this change, we are no longer failing the 14 tests from: dEQP-GLES3.functional.texture.compressed.astc.endpoint_value_hdr_cem_15.*

  Signed-off-by: Nanley Chery <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: Disable channel expressions for scalar GS, TCS, TES. | Kenneth Graunke | 2016-04-22 | 1 | -1/+3
  On Broadwell, I get the following shader-db statistics:

  Tessellation Control Shaders:
  total instructions in shared programs: 57327 -> 57012 (-0.55%)
  instructions in affected programs: 27334 -> 27019 (-1.15%)
  helped: 45
  HURT: 0

  total cycles in shared programs: 265692 -> 255188 (-3.95%)
  cycles in affected programs: 263122 -> 252618 (-3.99%)
  helped: 184
  HURT: 26

  Tessellation Evaluation Shaders:
  total instructions in shared programs: 23236 -> 23157 (-0.34%)
  instructions in affected programs: 2791 -> 2712 (-2.83%)
  helped: 27
  HURT: 0

  total cycles in shared programs: 151858 -> 149704 (-1.42%)
  cycles in affected programs: 151858 -> 149704 (-1.42%)
  helped: 101
  HURT: 114

  Geometry Shaders:
  Orbital Explorer goes from 6442 -> 6356 instructions.
  Two Shadow of Mordor shaders increase by a single instruction.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Add support for 2x msaa | Topi Pohjolainen | 2016-04-22 | 2 | -10/+9
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Add support for encoding/decoding interleaved 2x msaa | Topi Pohjolainen | 2016-04-22 | 1 | -8/+36
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: don't lower mod() in glsl ir | Samuel Iglesias Gonsálvez | 2016-04-22 | 1 | -1/+0
  NIR will lower it in nir_opt_algebraic.

  No change in shader-db.

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/surface_state: Use libisl functions for image format lowering | Jason Ekstrand | 2016-04-21 | 3 | -120/+12
  This lets us delete some redundant code and keep all of the image_load_store format lowering logic in one place: libisl.

  Reviewed-by: Chad Versace <[email protected]>