summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* osmesa: link with mesautilBrian Paul2014-08-041-0/+1
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* xlib: fix missing mesautil build breakageBrian Paul2014-08-041-0/+1
| | | | | | Fixes the non-DRI build. Reviewed-by: Jason Ekstrand <[email protected]>
* util: Gather some common macrosJason Ekstrand2014-08-044-0/+4
| | | | | | | | | | This gathers macros that have been included across components into util so that the include chain can be more vertical. In particular, this makes util stand on its own without any dependence whatsoever on the rest of mesa. Signed-off-by: "Jason Ekstrand" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* util: Move the open-addressing linear-probing hash_table to src/util.Kenneth Graunke2014-08-042-2/+2
| | | | | | | | | | | | | | | | | This hash table is used in core Mesa, the GLSL compiler, and the i965 driver, which makes it a good candidate for the new src/util module. It's much faster than program/hash_table.[ch] (see commit 6991c2922f5 for data), and José's u_hash_table.c has a comment saying Gallium should probably consider switching to a linear probing hash table at some point. So this seems like the best candidate for a shared data structure. Signed-off-by: Kenneth Graunke <[email protected]> v2 (Jason Ekstrand): Pick up another hash_table use and patch up scons Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* util: Move ralloc to a new src/util directory.Kenneth Graunke2014-08-0423-22/+23
| | | | | | | | | | | | | | | | | | For a long time, we've wanted a place to put utility code which isn't directly tied to Mesa or Gallium internals. This patch creates a new src/util directory for exactly that purpose, and builds the contents as libmesautil.la. ralloc seemed like a good first candidate. These days, it's directly used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl didn't make much sense. Signed-off-by: Kenneth Graunke <[email protected]> v2 (Jason Ekstrand): More realloc uses and some scons fixes Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965: Delete stale "pre-gen4" comment in texture validation code.Kenneth Graunke2014-08-021-5/+0
| | | | | | | | | | | In commit 16060c5adcd4d809f97e874fcde763260c17ac18, Eric changed the code to not relayout just for baselevel changes - only if the range of miplevels actually increases. So this comment is now wrong. Notably, the i915 version of the code actually does what the comment says. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Delete sampler state structures.Kenneth Graunke2014-08-021-99/+0
| | | | | | | | We've moved to using bitshifts (like we did for surface state); nothing uses the structures anymore. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Replace sizeof(struct gen7_sampler_state) with the size itself.Kenneth Graunke2014-08-024-8/+8
| | | | | | | | | | These are the last users of struct gen7_sampler_state. v2: Use a local sampler_state_size variable, to help distinguish the various 16s (suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop sizeof(struct brw_sampler_state) from estimated prim size.Kenneth Graunke2014-08-021-3/+3
| | | | | | | | | | This is the last user of the structure. v2: Use a local variable with a sensible name so people know what 16 is. (Suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make BLORP use brw_emit_sampler_state().Kenneth Graunke2014-08-023-95/+33
| | | | | | | | This simplifies the code, removes use of the old structures, and also allows us to combine the Gen6 and Gen7+ code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Delete redundant sampler state dumping code.Kenneth Graunke2014-08-021-34/+5
| | | | | | | | | | | Although the Gen4-6 and Gen7+ variants used different structure types, they didn't use any of the fields - only the size, which is identical. So both decoders did exactly the same thing. Someday we should implement useful decoders for SAMPLER_STATE. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make some brw_sampler_state.c functions static again.Kenneth Graunke2014-08-022-8/+2
| | | | | | | | Now that gen7_sampler_state.c is gone, everything is once again in a single file. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Stop using gen7_update_sampler_state; rm gen7_sampler_state.c.Kenneth Graunke2014-08-024-194/+2
| | | | | | | | The code in brw_sampler_state.c now handles all generations; we don't need the extra Gen7+ only code anymore. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make brw_update_sampler_state use 8 bits for LOD fields on Gen7+.Kenneth Graunke2014-08-021-3/+4
| | | | | | | | | This was the only actual difference between Gen4-6 and Gen7+ in terms of the values we program. The rest was just mechanical structure rearrangement. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make brw_update_sampler_state() use brw_emit_sampler_state().Kenneth Graunke2014-08-021-119/+92
| | | | | | | | | | | | | | Instead of stuffing bits directly into the brw_sampler_state structure, we now store them in local variables, then use brw_emit_sampler_state() to assemble the packet. This separates the decision about what values to use from the actual packet emission, which makes the code more reusable across generations. v2: Put const on a bunch of local variables and move declarations, as suggested by Topi Pohjolainen. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Introduce a function to emit a SAMPLER_STATE structure.Kenneth Graunke2014-08-022-0/+93
| | | | | | | | | | | | This simply assembles all the SAMPLER_STATE fields into their proper bit locations. Making it work on all generations was easy enough; some of the fields are even in the same place. Not used by anything yet, but will be soon. I made it non-static so BLORP can use it too. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add const to upload_default_color's sampler parameter.Kenneth Graunke2014-08-022-2/+2
| | | | | | | | It doesn't edit the value, and this lets us use const in more places. Needed to implement Topi's review comments for the next patch. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add #defines for SAMPLER_STATE fields.Kenneth Graunke2014-08-021-0/+54
| | | | | | | | | | | We'll use these to replace the existing structures. I've adopted the convention that "BRW" applies to all hardware, and "GENX" applies starting with generation X, but might be replaced by some later generation. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Convert wrap mode #defines to an enum.Kenneth Graunke2014-08-021-7/+9
| | | | | | | | This makes it easy to tell that they're grouped together, and also improves gdb printing. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Delete gen7_upload_sampler_state_table and vtable mechanism.Kenneth Graunke2014-08-025-70/+3
| | | | | | | | | | | brw_upload_sampler_state_table now handles all generations, so we don't need the vtable mechanism either. There's still a lot of code duplication; the next patches will address that. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make brw_upload_sampler_state_table handle Gen7+ as well.Kenneth Graunke2014-08-023-5/+49
| | | | | | | | | | | | | | | | This copies a few changes from gen7_upload_sampler_state_table; the next patch will delete that function. Gen7+ has per-stage sampler state pointer update packets, so we emit them as soon as we emit a new table for a stage. On Gen6 and earlier, we have a single packet, so we delay until we've changed everything that's going to be changed. v2: Split 3DSTATE_SAMPLER_STATE_POINTERS_XS packet emission into a helper function (suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Shift brw_upload_sampler_state_table away from structures.Kenneth Graunke2014-08-021-8/+15
| | | | | | | | | | | | The Gen4-6 and Gen7+ code is virtually identical, but both use different structure types. Switching to use a uint32_t pointer and operate on the number of DWords will make it possible to share code. It turns out that SURFACE_STATE is the same number of DWords on every platform currently; it will be easy to handle a change there, though. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Push computation for sampler state batch offsets up a level.Kenneth Graunke2014-08-021-10/+12
| | | | | | | | | | | | Other than this, brw_update_sampler_state only deals with a single SAMPLER_STATE structure, and doesn't need to know which position it is in the table. The caller takes care of dealing with multiple surface states. Pushing this up a level allows us to drop the ss_index parameter. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop unused 'ss_index' parameter from gen7_update_sampler_state.Kenneth Graunke2014-08-021-2/+2
| | | | | | | This was copied from the Gen4-6 code, but is unused. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Stop storing sdc_offset in brw_stage_state.Kenneth Graunke2014-08-023-18/+13
| | | | | | | | | | | | sdc_offset is produced and consumed in the same function, so there's no need to store it in the context, nor pass pointers to it through various call chains. Saves 128 bytes per brw_stage_state structure, and makes the code clearer as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop the degenerate brw_sampler_default_color structure.Kenneth Graunke2014-08-023-16/+8
| | | | | | | | | It's just an array of four floats, and we have an array of four floats, so this is literally just a memcpy...but with custom structs and strange macros to give the appearance of doing something more. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Write a better file comment for brw_sampler_state.c.Kenneth Graunke2014-08-021-7/+6
| | | | | | | The old one has been inaccurate for years. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rename brw_wm_sampler_state.c to brw_sampler_state.c.Kenneth Graunke2014-08-023-2/+2
| | | | | | | | | | | | | | | When the driver was originally written, it only supported texturing in the pixel shader backend; vertex and geometry shader texturing came much later. Originally, the pixel shader was referred to as "WM" (the Windowizer/Masker unit). So, this code happened to only be relevant for the WM stage, at the time. However, sampler state really applies to all stages, so putting "wm" in the filename doesn't make sense. I dropped it in gen7_sampler_state.c; at this point the asymmetry just trips people up. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/blorp: Don't set min_mag_neq bit in Gen6 SAMPLER_STATE.Kenneth Graunke2014-08-021-2/+0
| | | | | | | | | The "Min/Mag State Not Equal" bit is supposed to be set when the min/mag filters or address rounding modes differ. BLORP uses identical min/mag settings, so the bit should be unset. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Layout 1D Array as 2D Array with height of 1Jordan Justen2014-08-011-0/+20
| | | | | | | | | | | | | | | | | | | 1D array miptrees were being laid out as a 2D texture with 1 slice. This happened due to the mesa core storing the 1D array slice count in the height field. On Intel hardware, we want to create a 2D array with a height of 1 for the 1D array case. Fixes assertion failure in piglit (gen6, gen8): spec/glsl-1.30/execution/tex-miplevel-selection textureOffset 1DArrayShadow In release builds of Mesa, this test was observed to cause a GPU hang on gen8. Signed-off-by: Jordan Justen <[email protected]> Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81450 Tested-by: Ben Widawsky <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* xmlconfig: Use program_invocation_short_name when building for cygwinYaakov Selkowitz2014-07-291-0/+2
| | | | | | | | | mesa/mesa/src/mesa/drivers/dri/common/xmlconfig.c:104:10: warning: #warning "Per application configuration won't work with your OS version." [-Wcpp] # warning "Per application configuration won't work with your OS version." Signed-off-by: Yaakov Selkowitz <[email protected]> Reviewed-by: Jon TURNEY <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965/fs: Decide predicate/predicate_inverse outside of the for loop.Matt Turner2014-07-241-9/+14
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Swap if/else conditions in SEL peephole.Matt Turner2014-07-241-3/+3
| | | | | | Will clarify make the next commit easier to read. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Improve dead control flow elimination.Matt Turner2014-07-241-10/+15
| | | | | | | | ... to eliminate an ELSE instruction followed immediately by an ENDIF. instructions in affected programs: 704 -> 700 (-0.57%) Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Accelerate uploads of RGBA and BGRA GL_UNSIGNED_INT_8_8_8_8_REV texturesJason Ekstrand2014-07-231-1/+5
| | | | | | | | | | Since intel is always going to be little-endian, GL_UNSIGNED_INT_8_8_8_8_REV is the same as GL_UNSIGNED_BYTE for RGBA and BGRA textures, so the same acceleration code will work. We might as well use it. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Set LastRT on the final FB write on Broadwell.Kenneth Graunke2014-07-231-4/+2
| | | | | | | | | | | | | | | | | In Piglit's EXT_framebuffer_multisample/alpha-to-coverage-dual-src-blend test, key->nr_color_regions == 2, but the dual source blend FB write has ir->target set to 0. So we failed to set "Last Render Target Select" on any FB write message. We only emit one FB write per render target, so my comment about setting LastRT on every FB write directed at the last color region is a bit... misinformed. According to the documentation, depth buffer writes and scoreboard updates happen on the FB write with LastRT set, so I believe we want to set it only once. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965: Port INTEL_DEBUG=optimizer to the vec4 backend.Kenneth Graunke2014-07-231-6/+36
| | | | | | | Largely via copy and paste. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Save the gl_shader_stage enum in backend_visitor.Kenneth Graunke2014-07-232-1/+4
| | | | | | | | | This will be useful for INTEL_DEBUG=optimizer in the vec4 backend, which needs to know whether it's currently processing a VS or GS. It isn't worth adding virtual methods for this case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Don't print WE_normal in disassembly.Kenneth Graunke2014-07-231-1/+1
| | | | | | | | | Dropping this helps most lines fit in an 80 column terminal. The absence of WE_normal also helps call attention to WE_all, where something unusual is going on. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* meta: Add a meta implementation of GL_ARB_clear_textureNeil Roberts2014-07-234-0/+198
| | | | | | | | | | | | | | | | | | | | Adds an implementation of the ClearTexSubImage driver entry point that tries to set up an FBO to render to the texture and then calls glClearBuffer with a scissor to perform the actual clear. If an FBO can't be created for the texture then it will fall back to using _mesa_store_ClearTexSubImage. When used in combination with _mesa_store_ClearTexSubImage this should provide an implementation that works for all DRI-based drivers. However as this has only been tested with the i965 driver it is currently only enabled there. v2: Only enable the extension for the i965 driver instead of all DRI drivers. Remove an unnecessary goto. Don't require GL_ARB_framebuffer_object. Add some more comments. v3: Use glClearBuffer* to avoid having to modify glClearColor and friends. Handle sRGB textures. Explicitly disable dithering. Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
* meta: Add a state flag for the GL_DITHERNeil Roberts2014-07-232-0/+12
| | | | | | | | | The Meta implementation of glClearTexSubImage is going to want to ensure that dithering is disabled so that it can get a consistent color across the whole texture when clearing. This adds a state flag to easily save it and set it to the default value when performing meta operations. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/fs: Fix gl_SampleMask handling for SIMD16 on Gen8+.Kenneth Graunke2014-07-211-5/+0
| | | | | | | | | | | | We actually want to use mov(16), not mov(8). Fixes 7 Piglit tests: ARB_sample_shading/builtin-gl-sample-mask [2468] and ARB_sample_shading/builtin-gl-sample-mask-simple [468]. Signed-off-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991 Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965/fs: Fix gl_SampleID for 2x MSAA and SIMD16 mode.Kenneth Graunke2014-07-213-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | We might be able to do this without an extra program key field, but this is non-invasive and fixes the bug, for now. This fixes the following Piglit tests on Broadwell: - ARB_sample_shading/builtin-gl-sample-id 2 - ARB_sample_shading/builtin-gl-sample-position 2 - EXT_framebuffer_multisample/multisample-blit 2 color - EXT_framebuffer_multisample/multisample-blit 2 color linear - EXT_framebuffer_multisample/multisample-blit 2 depth - EXT_framebuffer_multisample/no-color 2 depth combined - EXT_framebuffer_multisample/no-color 2 depth separate - EXT_framebuffer_multisample/no-color 2 depth single - EXT_framebuffer_multisample/no-color 2 depth-computed combined - EXT_framebuffer_multisample/no-color 2 depth-computed separate - EXT_framebuffer_multisample/no-color 2 depth-computed single - EXT_framebuffer_multisample/unaligned-blit 2 color msaa - EXT_framebuffer_multisample/unaligned-blit 2 depth msaa Signed-off-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991 Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965: Add missing persample_shading field to brw_wm_debug_recompile.Kenneth Graunke2014-07-211-0/+2
| | | | | | | | Otherwise, the performance warning for shader recompiles will just say "something else". Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/disasm: Don't disassemble the URB complete field on Broadwell.Kenneth Graunke2014-07-211-2/+4
| | | | | | | | It doesn't exist, so attempting to read it will trigger generation assertions in the brw_inst API. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Disable hex offset printing in disassembly.Kenneth Graunke2014-07-211-1/+2
| | | | | | | | | | | | | | | Printing the hex offsets makes it basically impossible to diff assembly: if you add even a single instruction, the entire shader shows up as a difference. So, every time I want to compare assembly, I have to strip this out. The hex offsets might be useful when debugging compaction, or when inspecting the program cache buffer. Since it's occasionally useful, but uncommon, this patch disables it by default, but makes it easy to re-enable it temporarily when the need arises. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Use foreach_inst_in_block a couple more places.Matt Turner2014-07-212-8/+2
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Replace cfg instances with calls to calculate_cfg().Matt Turner2014-07-215-22/+22
| | | | | | | | | | | Avoids regenerating it unnecessarily. Every program in shader-db improved, none by an amount less than a 1/3 reduction. One Dota2 shader decreased from 62 -> 24. cfg calculations: 429492 -> 193197 (-55.02%) Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/cfg: Add a foreach_block_and_inst macro.Matt Turner2014-07-211-0/+4
| | | | | | Will let us abstract how the instructions are stored. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add cfg to backend_visitor.Matt Turner2014-07-219-33/+48
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>