aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
...
* i965: Stop using gen7_update_sampler_state; rm gen7_sampler_state.c.Kenneth Graunke2014-08-024-194/+2
| | | | | | | | The code in brw_sampler_state.c now handles all generations; we don't need the extra Gen7+ only code anymore. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make brw_update_sampler_state use 8 bits for LOD fields on Gen7+.Kenneth Graunke2014-08-021-3/+4
| | | | | | | | | This was the only actual difference between Gen4-6 and Gen7+ in terms of the values we program. The rest was just mechanical structure rearrangement. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make brw_update_sampler_state() use brw_emit_sampler_state().Kenneth Graunke2014-08-021-119/+92
| | | | | | | | | | | | | | Instead of stuffing bits directly into the brw_sampler_state structure, we now store them in local variables, then use brw_emit_sampler_state() to assemble the packet. This separates the decision about what values to use from the actual packet emission, which makes the code more reusable across generations. v2: Put const on a bunch of local variables and move declarations, as suggested by Topi Pohjolainen. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Introduce a function to emit a SAMPLER_STATE structure.Kenneth Graunke2014-08-022-0/+93
| | | | | | | | | | | | This simply assembles all the SAMPLER_STATE fields into their proper bit locations. Making it work on all generations was easy enough; some of the fields are even in the same place. Not used by anything yet, but will be soon. I made it non-static so BLORP can use it too. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add const to upload_default_color's sampler parameter.Kenneth Graunke2014-08-022-2/+2
| | | | | | | | It doesn't edit the value, and this lets us use const in more places. Needed to implement Topi's review comments for the next patch. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add #defines for SAMPLER_STATE fields.Kenneth Graunke2014-08-021-0/+54
| | | | | | | | | | | We'll use these to replace the existing structures. I've adopted the convention that "BRW" applies to all hardware, and "GENX" applies starting with generation X, but might be replaced by some later generation. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Convert wrap mode #defines to an enum.Kenneth Graunke2014-08-021-7/+9
| | | | | | | | This makes it easy to tell that they're grouped together, and also improves gdb printing. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Delete gen7_upload_sampler_state_table and vtable mechanism.Kenneth Graunke2014-08-025-70/+3
| | | | | | | | | | | brw_upload_sampler_state_table now handles all generations, so we don't need the vtable mechanism either. There's still a lot of code duplication; the next patches will address that. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make brw_upload_sampler_state_table handle Gen7+ as well.Kenneth Graunke2014-08-023-5/+49
| | | | | | | | | | | | | | | | This copies a few changes from gen7_upload_sampler_state_table; the next patch will delete that function. Gen7+ has per-stage sampler state pointer update packets, so we emit them as soon as we emit a new table for a stage. On Gen6 and earlier, we have a single packet, so we delay until we've changed everything that's going to be changed. v2: Split 3DSTATE_SAMPLER_STATE_POINTERS_XS packet emission into a helper function (suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Shift brw_upload_sampler_state_table away from structures.Kenneth Graunke2014-08-021-8/+15
| | | | | | | | | | | | The Gen4-6 and Gen7+ code is virtually identical, but both use different structure types. Switching to use a uint32_t pointer and operate on the number of DWords will make it possible to share code. It turns out that SURFACE_STATE is the same number of DWords on every platform currently; it will be easy to handle a change there, though. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Push computation for sampler state batch offsets up a level.Kenneth Graunke2014-08-021-10/+12
| | | | | | | | | | | | Other than this, brw_update_sampler_state only deals with a single SAMPLER_STATE structure, and doesn't need to know which position it is in the table. The caller takes care of dealing with multiple surface states. Pushing this up a level allows us to drop the ss_index parameter. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop unused 'ss_index' parameter from gen7_update_sampler_state.Kenneth Graunke2014-08-021-2/+2
| | | | | | | This was copied from the Gen4-6 code, but is unused. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Stop storing sdc_offset in brw_stage_state.Kenneth Graunke2014-08-023-18/+13
| | | | | | | | | | | | sdc_offset is produced and consumed in the same function, so there's no need to store it in the context, nor pass pointers to it through various call chains. Saves 128 bytes per brw_stage_state structure, and makes the code clearer as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop the degenerate brw_sampler_default_color structure.Kenneth Graunke2014-08-023-16/+8
| | | | | | | | | It's just an array of four floats, and we have an array of four floats, so this is literally just a memcpy...but with custom structs and strange macros to give the appearance of doing something more. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Write a better file comment for brw_sampler_state.c.Kenneth Graunke2014-08-021-7/+6
| | | | | | | The old one has been inaccurate for years. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rename brw_wm_sampler_state.c to brw_sampler_state.c.Kenneth Graunke2014-08-023-2/+2
| | | | | | | | | | | | | | | When the driver was originally written, it only supported texturing in the pixel shader backend; vertex and geometry shader texturing came much later. Originally, the pixel shader was referred to as "WM" (the Windowizer/Masker unit). So, this code happened to only be relevant for the WM stage, at the time. However, sampler state really applies to all stages, so putting "wm" in the filename doesn't make sense. I dropped it in gen7_sampler_state.c; at this point the asymmetry just trips people up. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Don't set min_mag_neq bit in Gen6 SAMPLER_STATE.Kenneth Graunke2014-08-021-2/+0
| | | | | | | | | The "Min/Mag State Not Equal" bit is supposed to be set when the min/mag filters or address rounding modes differ. BLORP uses identical min/mag settings, so the bit should be unset. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Layout 1D Array as 2D Array with height of 1Jordan Justen2014-08-011-0/+20
| | | | | | | | | | | | | | | | | | | 1D array miptrees were being laid out as a 2D texture with 1 slice. This happened due to the mesa core storing the 1D array slice count in the height field. On Intel hardware, we want to create a 2D array with a height of 1 for the 1D array case. Fixes assertion failure in piglit (gen6, gen8): spec/glsl-1.30/execution/tex-miplevel-selection textureOffset 1DArrayShadow In release builds of Mesa, this test was observed to cause a GPU hang on gen8. Signed-off-by: Jordan Justen <[email protected]> Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81450 Tested-by: Ben Widawsky <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* mesa: Add missing atomic buffer bindings and unbindingsAditya Atluri2014-08-011-0/+31
| | | | Reviewed-by: Marek Olšák <[email protected]>
* main/get_hash_params: Add GL_SAMPLE_SHADING_ARBJason Ekstrand2014-07-291-0/+1
| | | | | | | | | | | GL_SAMPLE_SHADING is specified as a valid pname for glGet in the GL_ARB_sample_shading extension. It seems as if we forgot to add it to the table of pnames. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected]
* xmlconfig: Use program_invocation_short_name when building for cygwinYaakov Selkowitz2014-07-291-0/+2
| | | | | | | | | mesa/mesa/src/mesa/drivers/dri/common/xmlconfig.c:104:10: warning: #warning "Per application configuration won't work with your OS version." [-Wcpp] # warning "Per application configuration won't work with your OS version." Signed-off-by: Yaakov Selkowitz <[email protected]> Reviewed-by: Jon TURNEY <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* glapi: add indexed blend functions (GL 4.0)Tapani Pälli2014-07-281-5/+5
| | | | | | | | | | | This makes some of the UE4 engine demos (Stylized, Mobile Temple) render correctly, tested on Intel Haswell machine. Signed-off-by: Tapani Pälli <[email protected]> Acked-by: Anuj Phogat <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78716
* gallium: rename shader cap MAX_CONSTS to MAX_CONST_BUFFER_SIZEMarek Olšák2014-07-281-2/+3
| | | | | | | | | | This new name isn't so confusing. I also changed the gallivm limit, because it looked wrong. Reviewed-by: Brian Paul <[email protected]> v2: use sizeof(float[4])
* main/cs: Add additional compute shader constant valuesJordan Justen2014-07-272-0/+18
| | | | | | | | With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit: * arb_compute_shader-minmax Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* Add an accelerated version of F_TO_I for x86_64Jason Ekstrand2014-07-241-1/+5
| | | | | | | | | | | | | According to a quick micro-benchmark, this new version is 20% faster on my Haswell laptop. v2: Removed the XXX note about x86_64 from the comment v3: Use an intrinsic instead of an __asm__ block. This should give us MSVC support for free. v4: Enable it for all x86_64 builds, not just with USE_X86_64_ASM Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Decide predicate/predicate_inverse outside of the for loop.Matt Turner2014-07-241-9/+14
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Swap if/else conditions in SEL peephole.Matt Turner2014-07-241-3/+3
| | | | | | Will clarify make the next commit easier to read. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Improve dead control flow elimination.Matt Turner2014-07-241-10/+15
| | | | | | | | ... to eliminate an ELSE instruction followed immediately by an ENDIF. instructions in affected programs: 704 -> 700 (-0.57%) Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/st: add support for interpolate_at_* opsIlia Mirkin2014-07-241-3/+9
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* mesa: add ARB_clear_texture.xml to file list, remove duplicate declsIlia Mirkin2014-07-241-12/+0
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: Accelerate uploads of RGBA and BGRA GL_UNSIGNED_INT_8_8_8_8_REV texturesJason Ekstrand2014-07-231-1/+5
| | | | | | | | | | Since intel is always going to be little-endian, GL_UNSIGNED_INT_8_8_8_8_REV is the same as GL_UNSIGNED_BYTE for RGBA and BGRA textures, so the same acceleration code will work. We might as well use it. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Fix the name in the error messageIan Romanick2014-07-231-1/+1
| | | | | | | Obvious copy-and-paste bug. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Set LastRT on the final FB write on Broadwell.Kenneth Graunke2014-07-231-4/+2
| | | | | | | | | | | | | | | | | In Piglit's EXT_framebuffer_multisample/alpha-to-coverage-dual-src-blend test, key->nr_color_regions == 2, but the dual source blend FB write has ir->target set to 0. So we failed to set "Last Render Target Select" on any FB write message. We only emit one FB write per render target, so my comment about setting LastRT on every FB write directed at the last color region is a bit... misinformed. According to the documentation, depth buffer writes and scoreboard updates happen on the FB write with LastRT set, so I believe we want to set it only once. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965: Port INTEL_DEBUG=optimizer to the vec4 backend.Kenneth Graunke2014-07-231-6/+36
| | | | | | | Largely via copy and paste. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Save the gl_shader_stage enum in backend_visitor.Kenneth Graunke2014-07-232-1/+4
| | | | | | | | | This will be useful for INTEL_DEBUG=optimizer in the vec4 backend, which needs to know whether it's currently processing a VS or GS. It isn't worth adding virtual methods for this case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Don't print WE_normal in disassembly.Kenneth Graunke2014-07-231-1/+1
| | | | | | | | | Dropping this helps most lines fit in an 80 column terminal. The absence of WE_normal also helps call attention to WE_all, where something unusual is going on. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* meta: Add a meta implementation of GL_ARB_clear_textureNeil Roberts2014-07-234-0/+198
| | | | | | | | | | | | | | | | | | | | Adds an implementation of the ClearTexSubImage driver entry point that tries to set up an FBO to render to the texture and then calls glClearBuffer with a scissor to perform the actual clear. If an FBO can't be created for the texture then it will fall back to using _mesa_store_ClearTexSubImage. When used in combination with _mesa_store_ClearTexSubImage this should provide an implementation that works for all DRI-based drivers. However as this has only been tested with the i965 driver it is currently only enabled there. v2: Only enable the extension for the i965 driver instead of all DRI drivers. Remove an unnecessary goto. Don't require GL_ARB_framebuffer_object. Add some more comments. v3: Use glClearBuffer* to avoid having to modify glClearColor and friends. Handle sRGB textures. Explicitly disable dithering. Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
* meta: Add a state flag for the GL_DITHERNeil Roberts2014-07-232-0/+12
| | | | | | | | | The Meta implementation of glClearTexSubImage is going to want to ensure that dithering is disabled so that it can get a consistent color across the whole texture when clearing. This adds a state flag to easily save it and set it to the default value when performing meta operations. Reviewed-by: Topi Pohjolainen <[email protected]>
* texstore: Add a generic implementation of GL_ARB_clear_textureNeil Roberts2014-07-232-0/+79
| | | | | | | | | Adds an implmentation of the ClearTexSubImage driver entry point that just maps the texture and writes the values in. The extension is not yet enabled by default because it doesn't work with multisample textures as they don't have a simple linear layout. Reviewed-by: Jason Ekstrand <[email protected]>
* mesa/main: Add generic bits of ARB_clear_texture implementationNeil Roberts2014-07-233-1/+271
| | | | | | | | | This adds the driver entry point for glClearTexSubImage and fills in the _mesa_ClearTexImage and _mesa_ClearTexSubImage functions that call it. v2: Don't clear some of the images if only one of them makes an error Reviewed-by: Jason Ekstrand <[email protected]>
* teximage: Add utility func for format/internalFormat compatibility checkNeil Roberts2014-07-231-21/+38
| | | | | | | In texture_error_check() there was a snippet of code to check whether the given format and internal format are basically compatible. This has been split out into its own static helper function so that it can be used by an implementation of glClearTexImage too.
* mesa/main: add ARB_clear_texture entrypointsIlia Mirkin2014-07-235-0/+32
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Neil Roberts <[email protected]>
* mesa: Don't use memcpy() in _mesa_texstore() for float depth texture dataAnuj Phogat2014-07-211-0/+15
| | | | | | | | | | | | | | | | because float depth texture data needs clamping to [0.0, 1.0]. Let the _mesa_texstore() fallback to slower path. Fixes Khronos GLES3 CTS tests: shadow_execution_vert shadow_execution_frag V2: Move the check to _mesa_texstore_can_use_memcpy() function. Add check for floating point data types. Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965/fs: Fix gl_SampleMask handling for SIMD16 on Gen8+.Kenneth Graunke2014-07-211-5/+0
| | | | | | | | | | | | We actually want to use mov(16), not mov(8). Fixes 7 Piglit tests: ARB_sample_shading/builtin-gl-sample-mask [2468] and ARB_sample_shading/builtin-gl-sample-mask-simple [468]. Signed-off-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991 Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965/fs: Fix gl_SampleID for 2x MSAA and SIMD16 mode.Kenneth Graunke2014-07-213-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | We might be able to do this without an extra program key field, but this is non-invasive and fixes the bug, for now. This fixes the following Piglit tests on Broadwell: - ARB_sample_shading/builtin-gl-sample-id 2 - ARB_sample_shading/builtin-gl-sample-position 2 - EXT_framebuffer_multisample/multisample-blit 2 color - EXT_framebuffer_multisample/multisample-blit 2 color linear - EXT_framebuffer_multisample/multisample-blit 2 depth - EXT_framebuffer_multisample/no-color 2 depth combined - EXT_framebuffer_multisample/no-color 2 depth separate - EXT_framebuffer_multisample/no-color 2 depth single - EXT_framebuffer_multisample/no-color 2 depth-computed combined - EXT_framebuffer_multisample/no-color 2 depth-computed separate - EXT_framebuffer_multisample/no-color 2 depth-computed single - EXT_framebuffer_multisample/unaligned-blit 2 color msaa - EXT_framebuffer_multisample/unaligned-blit 2 depth msaa Signed-off-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991 Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965: Add missing persample_shading field to brw_wm_debug_recompile.Kenneth Graunke2014-07-211-0/+2
| | | | | | | | Otherwise, the performance warning for shader recompiles will just say "something else". Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/disasm: Don't disassemble the URB complete field on Broadwell.Kenneth Graunke2014-07-211-2/+4
| | | | | | | | It doesn't exist, so attempting to read it will trigger generation assertions in the brw_inst API. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Disable hex offset printing in disassembly.Kenneth Graunke2014-07-211-1/+2
| | | | | | | | | | | | | | | Printing the hex offsets makes it basically impossible to diff assembly: if you add even a single instruction, the entire shader shows up as a difference. So, every time I want to compare assembly, I have to strip this out. The hex offsets might be useful when debugging compaction, or when inspecting the program cache buffer. Since it's occasionally useful, but uncommon, this patch disables it by default, but makes it easy to re-enable it temporarily when the need arises. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Use foreach_inst_in_block a couple more places.Matt Turner2014-07-212-8/+2
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Replace cfg instances with calls to calculate_cfg().Matt Turner2014-07-215-22/+22
| | | | | | | | | | | Avoids regenerating it unnecessarily. Every program in shader-db improved, none by an amount less than a 1/3 reduction. One Dota2 shader decreased from 62 -> 24. cfg calculations: 429492 -> 193197 (-55.02%) Reviewed-by: Topi Pohjolainen <[email protected]>