path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* i965: Add XRGB to intel_texsubimage_tiled_memcpy() | Courtney Goeltzenleuchter | 2013-12-30 | 1 | -1/+2
    MESA_FORMAT_XRGB8888 is equivalent to MESA_FORMAT_ARGB8888 in terms of storage on the device, so it is okay to use this optimized copy routine.

    This series builds on work from Frank Henigman to optimize the process of uploading a texture to the GPU. This series adds support for MESA_XRGB_8888 and full miptrees, which were found to be common activities in the Smokin' Guns game. The issue was found while profiling the app but that part is not benchmarked. Smokin' Guns uses mipmap textures with an internal format of GL_RGB (MESA_XRGB_8888 in the driver).

    These changes need a performance tool to run against to show how they improve execution performance for specific texture formats. Using this benchmark I've measured the following improvement on my Ivybridge Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz.

    1024x1024 texture size
    internal-format   Before (MB/sec)   XRGB (MB/sec)
    GL_RGBA           628.15            627.15
    GL_RGB            265.95            456.35

    512x512 texture size
    internal-format   Before (MB/sec)   XRGB (MB/sec)
    GL_RGBA           600.23            597.00
    GL_RGB            255.50            440.62

    256x256 texture size
    internal-format   Before (MB/sec)   XRGB (MB/sec)
    GL_RGBA           489.08            487.80
    GL_RGB            229.03            376.63

    Benchmark has been sent to mesa-dev list: teximage

    Signed-off-by: Courtney Goeltzenleuchter <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
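The storage-equivalence argument above can be sketched in C. This is only an illustration under stated assumptions: the `sketch_*` names are hypothetical, the copy is a plain linear row copy rather than the driver's tiled routine, and 4 bytes per pixel is assumed for both 32-bit formats.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's format enums. */
enum sketch_format {
    SKETCH_FORMAT_ARGB8888,
    SKETCH_FORMAT_XRGB8888,
    SKETCH_FORMAT_RGB565
};

/* Both 32-bit formats occupy 4 bytes per pixel with the same layout;
 * XRGB simply ignores the byte that ARGB uses for alpha. */
static bool sketch_can_memcpy(enum sketch_format format)
{
    return format == SKETCH_FORMAT_ARGB8888 ||
           format == SKETCH_FORMAT_XRGB8888;
}

/* Copy `height` rows of `width_px` pixels. Pitches are in bytes.
 * Returns false when the format is not eligible for the fast path. */
static bool sketch_upload_rows(enum sketch_format format,
                               uint8_t *dst, size_t dst_pitch,
                               const uint8_t *src, size_t src_pitch,
                               size_t width_px, size_t height)
{
    if (!sketch_can_memcpy(format))
        return false;

    for (size_t y = 0; y < height; y++)
        memcpy(dst + y * dst_pitch, src + y * src_pitch, width_px * 4);

    return true;
}
```

A caller would fall back to a slower conversion path whenever `sketch_upload_rows` returns false.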
* Rename overloads of _mesa_glsl_shader_target_name(). | Paul Berry | 2013-12-30 | 1 | -2/+2
    Previously, _mesa_glsl_shader_target_name() had an overload for GLenum and an overload for the gl_shader_type enum, each of which behaved differently. However, since GLenum is a synonym for unsigned int, and unsigned ints are often used in place of gl_shader_type (e.g. in loop indices), there was a big risk of calling the wrong overload by mistake. This patch gives the two overloads different names so that it's always clear which one we mean to call.

    Reviewed-by: Brian Paul <[email protected]>
* i965: Remove unused depth_mode parameter from translate_tex_format(). | Kenneth Graunke | 2013-12-29 | 4 | -4/+0
    According to git blame, this hasn't been used in over two years:

        commit d2235b0f4681f75d562131d655a6d7b7033d2d8b
        Author: Eric Anholt <[email protected]>
        Date: Thu Nov 17 17:01:58 2011 -0800

            i965: Always handle GL_DEPTH_TEXTURE_MODE through the shader.

    Signed-off-by: Kenneth Graunke <[email protected]>
* i965/blorp: unit test compiling integer typed texture fetches | Topi Pohjolainen | 2013-12-27 | 1 | -0/+86
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling simple gen6 zero-src sampled | Topi Pohjolainen | 2013-12-27 | 1 | -0/+51
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling gen6 msaa-8 cms alpha blend | Topi Pohjolainen | 2013-12-27 | 1 | -0/+57
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling bilinear filtered | Topi Pohjolainen | 2013-12-27 | 1 | -0/+49
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling simple zero-src sampled | Topi Pohjolainen | 2013-12-27 | 1 | -0/+56
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling unaligned msaa-8 | Topi Pohjolainen | 2013-12-27 | 1 | -0/+135
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling msaa-8 cms alpha blend | Topi Pohjolainen | 2013-12-27 | 1 | -0/+145
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling msaa-4 ums to cms | Topi Pohjolainen | 2013-12-27 | 1 | -0/+59
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling msaa-8 cms to cms | Topi Pohjolainen | 2013-12-27 | 1 | -0/+62
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling msaa-8 ums to cms | Topi Pohjolainen | 2013-12-27 | 1 | -0/+61
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: unit test compiling blend and scaled | Topi Pohjolainen | 2013-12-27 | 3 | -1/+339
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: allow unit tests to compile and dump assembly | Topi Pohjolainen | 2013-12-27 | 1 | -3/+16
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: dump the disassembly to the given file | Topi Pohjolainen | 2013-12-27 | 1 | -10/+10
    instead of ignoring the argument and always dumping to standard output.

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
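The fix described above amounts to honouring the caller's `FILE *` instead of writing to standard output. A minimal sketch in C, assuming a hypothetical `sketch_inst` record and output format that are not taken from the driver:

```c
#include <stdio.h>

/* Hypothetical, simplified instruction record for illustration only. */
struct sketch_inst {
    unsigned opcode;
    unsigned dst;
};

/* Write one line per instruction to `out`. Returns the number of
 * characters written, or -1 on an output error. */
static int sketch_dump(FILE *out, const struct sketch_inst *insts, int count)
{
    int written = 0;

    for (int i = 0; i < count; i++) {
        /* fprintf(out, ...) rather than printf(...): the caller chooses
         * the destination, e.g. a log file or an in-test temp file. */
        int n = fprintf(out, "0x%02x -> r%u\n", insts[i].opcode, insts[i].dst);
        if (n < 0)
            return -1;
        written += n;
    }

    return written;
}
```

Passing `stdout` reproduces the old behaviour, which is what makes this shape convenient for unit tests that want to capture the dump.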
* i965/fs: allow fs-generator use without gl_fragment_program | Topi Pohjolainen | 2013-12-27 | 1 | -3/+6
    Prepares the generator to accept hand-crafted blorp programs.

    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/fs: generate fs programs also without any 8-width instructions | Topi Pohjolainen | 2013-12-27 | 1 | -2/+6
    Signed-off-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i915: Add support for gl_FragData[0] reads. | Henri Verbeet | 2013-12-22 | 1 | -0/+1
    Similar to 556a47a2621073185be83a0a721a8ba93392bedb, without this reading from gl_FragData[0] would cause a software fallback.

    Bugzilla: https://bugs.winehq.org/show_bug.cgi?id=33964
    Signed-off-by: Henri Verbeet <[email protected]>
    Cc: 10.0 9.2 9.1 <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* mesa: inline r200 radeon texture format macros to facilitate search and replace | Mark Mueller | 2013-12-21 | 2 | -102/+70
    Signed-off-by: Mark Mueller <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* mesa: GL_EXT_packed_depth_stencil is not optional | Ian Romanick | 2013-12-20 | 5 | -5/+0
    Every driver supports it. All current and future Gallium drivers always support it, and all existing classic drivers support it.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* radeon: Sort list of enabled extensions | Ian Romanick | 2013-12-20 | 1 | -6/+5
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* r200: Sort list of enabled extensions | Ian Romanick | 2013-12-20 | 1 | -16/+10
    Note that ARB_occlusion_query was previously enabled twice.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* i965: Set fast color clear mcs_state on newly allocated image miptrees | Keith Packard | 2013-12-20 | 1 | -3/+7
    Just copying code from the dri2 path to set up the fast color clear state.

    This also removes a couple of bogus intel_region_reference calls.

    Signed-off-by: Keith Packard <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Correct check for re-bound buffer in intel_update_image_buffer | Keith Packard | 2013-12-20 | 1 | -4/+15
    The buffer-object is the persistent thing passed through the loader, so when updating an image buffer, check to see if it is already bound to the provided bo. The region, on the other hand, is allocated separately for the miptree, and so will never be the same as that passed back from the loader.

    Signed-off-by: Keith Packard <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
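The distinction drawn above, comparing the persistent buffer object rather than the per-miptree region, might look roughly like this. All `sketch_*` types and fields are hypothetical simplifications, not the driver's actual structures:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's objects. */
struct sketch_bo { int handle; };
struct sketch_region { struct sketch_bo *bo; };
struct sketch_image_miptree { struct sketch_region *region; };

/* The region is allocated freshly for each miptree, so comparing region
 * pointers would never match. The bo is what the loader hands back, so
 * it is the thing to compare. */
static bool sketch_already_bound(const struct sketch_image_miptree *mt,
                                 const struct sketch_bo *loader_bo)
{
    return mt != NULL && mt->region != NULL && mt->region->bo == loader_bo;
}
```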
* i965: Use RED for depth texture formats rather than INTENSITY. | Kenneth Graunke | 2013-12-20 | 1 | -4/+4
    While looking through the documentation, I found this in the Sandybridge PRM (Volume 4, Part 1, Page 140):

        "Use of sample_c with SURFTYPE_CUBE surfaces is undefined with the following surface formats: I24X8_UNORM, L24X8_UNORM, A24X8_UNORM, I32_FLOAT, L32_FLOAT, A32_FLOAT."

    I haven't observed this to be true, but it suggests that we may want to use other formats.

    We already perform DEPTH_TEXTURE_MODE swizzling in the shaders, and don't rely on the surface format to splat things appropriately. So using RED should work just as well as INTENSITY.

    A few notes about the formats:
    - R24_UNORM_X8_TYPELESS has the exact same properties as I24X8_UNORM.
    - R16_UNORM and R32_FLOAT are additionally supported as a render target, while the old I16_UNORM/I32_FLOAT formats are not.
    - R32_FLOAT_X8X24_TYPELESS is not supported as a render target, while the old format (R32G32_FLOAT) was. However, it shares the same properties as the formats we use for Z24, so it should suffice.

    This makes translate_tex_format and brw_blorp_surface_info::set a bit more similar.

    No Piglit changes on Sandybridge or Ivybridge. No oglconform changes on Sandybridge.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965/gen6: Fix HiZ hang in WebGL Google Maps | Chad Versace | 2013-12-20 | 1 | -0/+15
    Emitting flushes before depth and hiz resolves at the top of blorp's state emission fixes the hang. Marchesin and I found the fix experimentally, as opposed to adhering to a documented hardware workaround. A more minimal fix likely exists, but this gets the job done.

    Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS. Tested by zooming in and out continuously for 2 hours.

    This patch is based on
    https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/8bc07bb70163c3706fb4ba5f980e57dc942f56dd

    CC: [email protected]
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740
    Signed-off-by: Stéphane Marchesin <[email protected]>
    Signed-off-by: Chad Versace <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Store QPitch in intel_mipmap_tree. | Kenneth Graunke | 2013-12-20 | 2 | -6/+10
    Broadwell allows us to specify an arbitrary value for QPitch, rather than baking a specific formula into the hardware and requiring software to lay things out to match. The only restriction is that the software provided QPitch needs to be large enough so successive array slices do not overlap.

    In order to support this flexibility, software needs to specify QPitch in a bunch of packets. Storing QPitch makes that easy, and allows us to adjust it in a single place should we wish to change it in the future.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
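The non-overlap restriction on QPitch can be illustrated with a small sketch. It assumes, purely for illustration, that QPitch is measured in rows and that each array slice occupies a known number of rows; the `sketch_*` struct is hypothetical and far simpler than intel_mipmap_tree:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified layout description. */
struct sketch_array_miptree {
    uint32_t slice_height_rows; /* rows occupied by one array slice */
    uint32_t qpitch;            /* rows between the starts of successive slices */
};

/* Successive slices do not overlap as long as the distance between
 * their starts is at least the height of one slice. */
static bool sketch_qpitch_valid(const struct sketch_array_miptree *mt)
{
    return mt->qpitch >= mt->slice_height_rows;
}

/* With QPitch stored in one place, every consumer derives slice
 * positions from the same value. */
static uint32_t sketch_slice_start_row(const struct sketch_array_miptree *mt,
                                       uint32_t slice)
{
    return slice * mt->qpitch;
}
```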
* i965: Add support for Broadwell's new register types. | Kenneth Graunke | 2013-12-20 | 3 | -1/+19
    Broadwell introduces support for Q, UQ, and HF types. It also extends DF support to allow immediate values.

    Irritatingly, although HF and DF both support immediates, they're represented by a different value depending on the register file.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Add BRW_REGISTER_TYPE_DF. | Kenneth Graunke | 2013-12-20 | 3 | -0/+6
    Ivybridge, Baytrail, and Haswell support double float register types, but do not support them as immediate values.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Abstract BRW_REGISTER_TYPE_* into an enum with unique values. | Kenneth Graunke | 2013-12-20 | 4 | -22/+95
    On released hardware, values 4-6 are overloaded. For normal registers, they mean UB/B/DF. But for immediates, they mean UV/VF/V.

    Previously, we just created #defines for each name, reusing the same value. This meant we could directly splat the brw_reg::type field into the assembly encoding, which was fairly nice, and worked well.

    Unfortunately, Broadwell makes this infeasible: the HF and DF types are represented as different numeric values depending on whether the source register is an immediate or not.

    To preserve sanity, I decided to simply convert BRW_REGISTER_TYPE_* to an abstract enum that has a unique value for each register type, and write translation functions. One nice benefit is that we can add assertions about register files and generations.

    I've chosen not to convert brw_reg::type to the enum, since converting it caused a lot of trouble due to C++ enum rules (even though it's defined in an extern "C" block...).

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
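The abstract-enum-plus-translation idea described above could be sketched as follows. This covers only the overloaded encodings 4-6 named in the message; the assignment of each type to a specific value assumes the lists "UB/B/DF" and "UV/VF/V" are given in encoding order, and all `sketch_*` names are hypothetical rather than the driver's:

```c
#include <stdbool.h>

/* Abstract types: every entry has its own unique value, so the same
 * number never means two different things. */
enum sketch_reg_type {
    SKETCH_TYPE_UB,
    SKETCH_TYPE_B,
    SKETCH_TYPE_DF,
    SKETCH_TYPE_UV,
    SKETCH_TYPE_VF,
    SKETCH_TYPE_V
};

/* Translate an abstract type to its hardware encoding. Returns -1 when
 * the type is not valid for the given register file, which is where a
 * real implementation could place an assertion instead. */
static int sketch_reg_type_to_hw(enum sketch_reg_type type, bool is_immediate)
{
    if (is_immediate) {
        switch (type) {
        case SKETCH_TYPE_UV: return 4;
        case SKETCH_TYPE_VF: return 5;
        case SKETCH_TYPE_V:  return 6;
        default:             return -1; /* not encodable as an immediate */
        }
    }

    switch (type) {
    case SKETCH_TYPE_UB: return 4;
    case SKETCH_TYPE_B:  return 5;
    case SKETCH_TYPE_DF: return 6;
    default:             return -1; /* immediate-only vector type */
    }
}
```

Keeping the file-dependent mapping inside one function is what lets later hardware change the numbers for one register file without touching the abstract enum.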
* i965: Decode three-source register types directly. | Kenneth Graunke | 2013-12-20 | 1 | -25/+14
    Three-source instructions use a different encoding for register types (and have a much more limited set to choose from).

    Previously, we translated those into BRW_REGISTER_TYPE_* values, then reused the existing reg_encoding mapping. Doing it directly is more straightforward and actually less code.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Disassemble UV types, not UB types. | Kenneth Graunke | 2013-12-20 | 1 | -2/+2
    UB types have never been supported as immediates. On Gen4-5, register encoding 4 is "Reserved." On Gen6+, it means UV.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Add missing BRW_REGISTER_TYPE_UV. | Kenneth Graunke | 2013-12-20 | 1 | -0/+1
    Sandybridge added support for packed unsigned vectors.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation. | Kenneth Graunke | 2013-12-20 | 1 | -1/+1
    When adding geometry shader support, we accidentally reversed the size and offset parameters.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
    Cc: "10.0" <[email protected]>
* i965: Use {point_sprite,flat}_enable variable names instead of dw*. | Kenneth Graunke | 2013-12-20 | 2 | -10/+14
    Calling the local variables flat_enable and point_sprite_enable is clearer than dw16 and such. It also matches the names used in calculate_attr_overrides, which computes them.

    v2: Add /* dw16 */ and /* dw10 */ comments, requested by Jordan.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* i965: Zero out {point_sprite,flat}_enables in calculate_attr_overrides. | Kenneth Graunke | 2013-12-20 | 2 | -6/+3
    calculate_attr_overrides is responsible for computing the point sprite and flat-shading enable bitfields. It does so by OR'ing in a bunch of bits.

    However, it relied on the caller to set the initial value to zero. This is pretty fragile - if the caller neglects to zero out those variables, then the enable bitfields end up full of garbage, which shows up as random things being flat-shaded.

    This patch moves the zero-initialization into calculate_attr_overrides, so that the computation is completely in one place.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
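The fragility described above, and the fix of zeroing inside the callee, can be shown with a small sketch. The `sketch_attr` struct and the function signature are hypothetical simplifications, not the real calculate_attr_overrides:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-attribute description. */
struct sketch_attr {
    bool is_point_coord;
    bool is_flat;
};

/* Compute both enable bitfields. The outputs are zeroed here, so the
 * result no longer depends on what the caller left in them. */
static void sketch_calculate_attr_overrides(const struct sketch_attr *attrs,
                                            int count,
                                            uint32_t *point_sprite_enables,
                                            uint32_t *flat_enables)
{
    *point_sprite_enables = 0;
    *flat_enables = 0;

    for (int i = 0; i < count && i < 32; i++) {
        if (attrs[i].is_point_coord)
            *point_sprite_enables |= 1u << i;
        if (attrs[i].is_flat)
            *flat_enables |= 1u << i;
    }
}
```

Without the two zeroing assignments, the OR operations would accumulate on top of whatever stale bits the caller's variables happened to hold.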
* i965: Delete bogus BRW_REGISTER_TYPE_HF define. | Kenneth Graunke | 2013-12-20 | 2 | -2/+0
    git blame ascribes this to the initial commit of the driver. No released hardware has ever supported half float, according to the documentation for SrcType in the ISA reference.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* swrast: silence driContextSetFlags() parameter type warning | Brian Paul | 2013-12-17 | 1 | -1/+1
* i965: Treat Haswell as 75 in the surface format table. | Kenneth Graunke | 2013-12-13 | 1 | -1/+1
    Much like we do for G45.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/fs: add support for gl_SampleMaskIn[] | Chris Forbes | 2013-12-14 | 5 | -1/+30
    v2:
    - add assert so we don't run into trouble on Gen6.
    - adjust for Tapani's rearrangement of ir_variable

    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add driver entry point for ARB_texture_view | Courtney Goeltzenleuchter | 2013-12-13 | 1 | -0/+3
    Signed-off-by: Courtney Goeltzenleuchter <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context | Kristian Høgsberg | 2013-12-13 | 14 | -10/+34
    The driverPrivate pointer is opaque to the driver and we can't assume it's a struct gl_context in dri_util.c. Instead provide a helper function to set the struct gl_context flags from the incoming DRI context flags.

    v2 (idr): Modify the other classic drivers to also use driContextSetFlags. I ran all the piglit GLX_ARB_create_context tests with i965 and classic swrast without regressions.

    Signed-off-by: Kristian Høgsberg <[email protected]>
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]> [v1]
    Reviewed-by: Eric Anholt <[email protected]>
    Tested-by: Ilia Mirkin <[email protected]> [v1 on Gallium nouveau]
    Cc: "10.0" <[email protected]>
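A helper of the kind described above takes a context the driver knows the real type of and translates the incoming flags onto it, so shared code never has to cast an opaque pointer. The sketch below is an illustration only: the struct, the flag names, their numeric values, and which flags get translated are all assumptions, since the commit message does not spell them out.

```c
#include <stdint.h>

/* Illustrative flag values; not taken from the DRI or GL headers. */
#define SKETCH_DRI_CTX_FLAG_DEBUG              0x1u
#define SKETCH_DRI_CTX_FLAG_FORWARD_COMPATIBLE 0x2u

#define SKETCH_GL_FLAG_FORWARD_COMPATIBLE_BIT  0x1u
#define SKETCH_GL_FLAG_DEBUG_BIT               0x2u

/* Hypothetical, heavily reduced stand-in for struct gl_context. */
struct sketch_gl_context {
    uint32_t context_flags;
};

/* Called by each driver with its own context, after the driver has
 * recovered it from its private data in whatever way it sees fit. */
static void sketch_context_set_flags(struct sketch_gl_context *ctx,
                                     uint32_t dri_flags)
{
    if (dri_flags & SKETCH_DRI_CTX_FLAG_FORWARD_COMPATIBLE)
        ctx->context_flags |= SKETCH_GL_FLAG_FORWARD_COMPATIBLE_BIT;
    if (dri_flags & SKETCH_DRI_CTX_FLAG_DEBUG)
        ctx->context_flags |= SKETCH_GL_FLAG_DEBUG_BIT;
}
```

The two flag spaces deliberately use different bit positions here to show why a translation step, rather than a plain copy, is needed.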
* swrast* (gallium, classic): add MESA_copy_sub_buffer support (v3) | Dave Airlie | 2013-12-13 | 3 | -1/+54
    This patch adds MESA_copy_sub_buffer support to the dri sw loader and then to gallium state tracker, llvmpipe, softpipe and other bits.

    It reuses the dri1 driver extension interface, and it updates the swrast loader interface for a new putimage which can take a stride.

    I've tested this with gnome-shell with a cogl hacked to reenable sub copies for llvmpipe and the one piglit test.

    I could probably split this patch up as well.

    v2: pass a pipe_box, to reduce the entrypoints, as per Jose's review, add to p_screen doc comments.
    v3: finish off winsys interfaces, add swrast classic support as well.

    Reviewed-by: Jose Fonseca <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>

    swrast: add support for copy_sub_buffer
* glsl: move variables in to ir_variable::data, part II | Tapani Pälli | 2013-12-12 | 5 | -25/+25
    This patch moves following bitfields and variables to the data structure:

    explicit_location, explicit_index, explicit_binding, has_initializer, is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray, from_named_ifc_block_array, depth_layout, location, index, binding, max_array_access, atomic

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* glsl: move variables in to ir_variable::data, part I | Tapani Pälli | 2013-12-12 | 7 | -15/+15
    This patch moves following bitfields in to the data structure:

    used, assigned, how_declared, mode, interpolation, origin_upper_left, pixel_center_integer

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* glsl: introduce data section to ir_variable | Tapani Pälli | 2013-12-12 | 1 | -2/+2
    Data section helps serialization and cloning of a ir_variable. This patch includes the helper bits used for read only ir_variables.

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* swrast: fix readback regression since inversion fix | Dave Airlie | 2013-12-10 | 1 | -1/+1
    This readback from the frontbuffer with swrast was broken, that bug just made it more obviously broken, this fixes it by inverting the sub image gets.

    Also fixes a few other piglits.

    Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72327
    Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72325

    (for 9.2 the patches this depends on were asked to be backported separately in an email).

    Cc: "9.2" "10.0" [email protected]
    Reviewed-by: Brian Paul <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* dri megadriver_stub: add compatibility for older DRI loaders | Jordan Justen | 2013-12-09 | 1 | -0/+126
    To help the transition period when DRI loaders are being updated to support the newer __driDriverExtensions_foo mechanism, we populate __driDriverExtensions with the extensions returned by __driDriverExtensions_foo during a library constructor function.

    We find the driver foo's name by using the dladdr function, which gives the path of the dynamic library that is being loaded.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Keith Packard <[email protected]>
    Cc: "10.0" <[email protected]>
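The name-derivation step described above can be sketched as pure string handling. The sketch takes the library path as a parameter (in the real stub it would come from dladdr) and assumes, without confirmation from the commit message, that driver libraries are named `<name>_dri.so`; `sketch_extensions_symbol` is a hypothetical helper:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build "__driDriverExtensions_<name>" from a library path such as
 * "/usr/lib/dri/i965_dri.so". Returns 0 on success, -1 if the file name
 * does not match the assumed pattern or `out` is too small. */
static int sketch_extensions_symbol(const char *lib_path,
                                    char *out, size_t out_size)
{
    static const char suffix[] = "_dri.so"; /* assumed naming convention */
    const size_t suffix_len = sizeof(suffix) - 1;

    /* Strip any directory components. */
    const char *base = strrchr(lib_path, '/');
    base = base ? base + 1 : lib_path;

    size_t len = strlen(base);
    if (len <= suffix_len || strcmp(base + len - suffix_len, suffix) != 0)
        return -1;

    int name_len = (int)(len - suffix_len);
    int n = snprintf(out, out_size, "__driDriverExtensions_%.*s",
                     name_len, base);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The resulting symbol name is what a stub could then look up in its own library to fetch the driver-specific extension list.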
* i965: Replace OUT_RELOC_FENCED with OUT_RELOC. | Kenneth Graunke | 2013-12-09 | 2 | -16/+10
    On Gen4+, OUT_RELOC_FENCED is equivalent to OUT_RELOC; libdrm silently ignores the fenced flag:

        /* We never use HW fences for rendering on 965+ */
        if (bufmgr_gem->gen >= 4)
                need_fence = false;

    Thanks to Eric for noticing this.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
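The equivalence claimed above follows directly from the quoted check. A tiny sketch, with a hypothetical `sketch_need_fence` function standing in for the libdrm logic:

```c
#include <stdbool.h>

/* Mirrors the quoted check: on generation 4 and later the fence request
 * is dropped, so a fenced and an unfenced relocation behave the same. */
static bool sketch_need_fence(int gen, bool fence_requested)
{
    if (gen >= 4)
        return false;
    return fence_requested;
}
```

Since the i965 driver only targets Gen4 and later, the requested flag can never take effect there, which is why the separate macro is redundant.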