Commit message  Author  Age  Files  Lines
* gallivm: Remove bogus assert.  José Fonseca  2013-07-05  1  -4/+1
| | | | | | | | | | | It is perfectly valid for the swizzle to be bigger than 2. For example the texel offsets could be SAMPLE ..., IMM[0].zzz What is not correct is for chan_index to be bigger than 2. Trivial.
* nvc0: enable very initial support for nvf0 (GK110)  Ben Skeggs  2013-07-05  5  -5/+76
| | | | | | | Shaders need a lot of work still. Basic stuff generally works, so this is basically just fine for gnome-shell, OA etc at this point. Signed-off-by: Ben Skeggs <[email protected]>
* gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources  Roland Scheidegger  2013-07-05  2  -2/+1
| | | | | | The assertion was always broken, but the code was unused until the per-element lod code was enabled. Fixes piglit texelFetch vs isampler1D and similar tests (only run with GL 3.0 version override).
* gallivm: do per-pixel lod calculations for explicit lod  Roland Scheidegger  2013-07-04  10  -126/+195
| | | | | | | | | | | | | | | | | | | | | d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too - at least if they are used from vertex or geometry shaders (so doesn't apply to lod bias) this doesn't just affect neighboring pixels. Some code was already there to handle this, so fix it up and enable it. There will no doubt be a performance hit, unfortunately; we could do better if we knew we had a real vector shift instruction (with variable shift count), but this requires AVX2 on x86 (or an AMD Bulldozer family cpu). Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them either. Likewise, the size query is still broken just the same. v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before). Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix overflows in the indexed rendering paths  Zack Rusin  2013-07-03  4  -43/+159
| | | | | | | | | | | The semantics for overflow detection are a bit tricky with indexed rendering. If the base index in the elements array overflows, then the index of the first element should be used; if the index with bias overflows, then it should be treated like a normal overflow. Also, overflows need to be checked for in all paths that use either the bias or the starting index location. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw/llvm: index overflows if it's greater than elt max  Zack Rusin  2013-07-03  1  -1/+1
| | | | | | | | | The comparison, incorrectly, was greater-than-or-equal to elt max. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
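The off-by-one described above can be illustrated with a minimal sketch. It assumes that "elt max" is the largest valid index (inclusive); the function name `index_overflows` is hypothetical and is not taken from the draw module.

```c
#include <stdbool.h>

/* Hypothetical helper, for illustration only.
 * Assumption: elt_max is the largest index that is still valid. */
bool
index_overflows(unsigned index, unsigned elt_max)
{
   /* Strictly greater-than: index == elt_max is still in range.
    * Using >= here would wrongly flag the last valid element. */
   return index > elt_max;
}
```

With `elt_max` equal to 7, index 7 is accepted and index 8 is reported as an overflow.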
* i965: Move the rest of intel_tex_layout.c into brw_tex_layout.c.  Kenneth Graunke  2013-07-03  6  -191/+102
| | | | | | | | | | | The texture alignment unit functions are called from brw_tex_layout.c, so it makes sense to put them there. Since the only caller of intel_get_texture_alignment_unit() is in brw_tex_layout.c, it could be made into a static function. However, this patch instead simply folds it into the caller, as it's only two lines anyway. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Push intel_get_texture_alignment_unit call into brw_miptree_layout  Kenneth Graunke  2013-07-03  2  -3/+3
| | | | | | | | | | | | | | intel_miptree_create_layout() calls intel_get_texture_alignment_unit() and then immediately calls brw_miptree_layout(). There are no other callers. intel_get_texture_alignment_unit() populates the miptree's alignment unit fields, which are used by brw_miptree_layout() to determine where to place each miplevel. Since brw_miptree_layout() needs those to be present, it makes sense to have it initialize them as the first step. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Declare for-loop counters in the loop in brw_tex_layout.c.  Kenneth Graunke  2013-07-03  1  -11/+7
| | | | | | | | The driver is compiled in C99 mode, so this is not a problem. It's slightly tidier. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Remove use of GLuint/GLint in brw_tex_layout.c.  Kenneth Graunke  2013-07-03  1  -19/+19
| | | | | | | Using GL types is silly; this isn't even remotely API-facing. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Tidy the brw_tex_layout.c copyright and file header comments.  Kenneth Graunke  2013-07-03  1  -34/+31
| | | | | | | | This uses Doxygen style for the file comments, and generally makes it more consistent with the rest of the driver. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Move i945_texture_layout_2d to brw_tex_layout.c  Kenneth Graunke  2013-07-03  3  -71/+72
| | | | | | | This consolidates the miptree layout logic in a single file. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Remove fallthrough for Gen4 cube map layout.  Kenneth Graunke  2013-07-03  1  -9/+7
| | | | | | | | | | | | | | | Now that both 2DArray and Cube layouts are taken care of by helper functions, it's easy to just call the right function for each generation. This is a little cleaner than falling through. This also reworks the comments. Referencing "Volume 1" of the BSpec isn't very helpful, since that's only available inside Intel, and it doesn't even use volume numbers. Also, "Ironlake...finally" sounds a bit strange considering that almost all hardware uses the 2D array approach. At this point, Gen4 is the only special case. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Combine GL_TEXTURE_CUBE_MAP_ARRAY case with the other array cases.  Kenneth Graunke  2013-07-03  1  -5/+2
| | | | | | | These do the exact same thing; combining them is tidier. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Pull 3D texture layout code out into a helper function.  Kenneth Graunke  2013-07-03  1  -77/+82
| | | | | | | A bit cleaner than having it in one giant function. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Replace maxBatchSize variable with BATCH_SZ define.  Kenneth Graunke  2013-07-03  4  -5/+3
| | | | | | | | maxBatchSize was only ever initialized to BATCH_SZ, and a few places used BATCH_SZ directly anyway. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Move annotate_aub out of the vtable.  Kenneth Graunke  2013-07-03  3  -5/+2
| | | | | | | | brw_annotate_aub() is the only implementation of this function, so it makes sense to just call it directly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Move debug_batch hook out of the vtable.  Kenneth Graunke  2013-07-03  3  -4/+2
| | | | | | | | brw_debug_batch() is the only implementation of this function, so it makes sense to just call it directly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Remove render_target_supported from the vtable.  Kenneth Graunke  2013-07-03  5  -6/+3
| | | | | | | | | | | | brw_render_target_supported() is the only implementation of this function, so it makes sense to just call it directly. Rather than adding an #include of brw_wm.h, this patch moves the prototype to brw_context.h. Prototypes seem to be in rather arbitrary places at the moment, and either place seems as good as the other. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Move is_hiz_depth_format out of the vtable.  Kenneth Graunke  2013-07-03  6  -31/+26
| | | | | | | | brw_is_hiz_depth_format() is the only implementation of this function, so it makes sense to just call it directly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Remove the invalidate_state() vtable hook.  Kenneth Graunke  2013-07-03  3  -12/+0
| | | | | | | The hook was a noop. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Replace fprintfs with assertions in GLenum comparison translators.  Kenneth Graunke  2013-07-03  1  -2/+2
| | | | | | | | | | | These functions translate GLenum comparison operations into the hardware enumerations. They should never be passed something other than a GL comparison operator, or something is very broken. Assertions seem more appropriate than fprintf. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
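As a rough illustration of the pattern described above, the sketch below translates a GL comparison enum into a hardware enum and asserts on anything else. The `GL_*` values are the standard OpenGL ones; the `HW_COMPARE_*` enum and the function name `translate_compare_func` are hypothetical stand-ins, not the definitions from brw_defines.h or the driver.

```c
#include <assert.h>

/* Standard OpenGL comparison function values. */
#define GL_NEVER    0x0200
#define GL_LESS     0x0201
#define GL_EQUAL    0x0202
#define GL_LEQUAL   0x0203
#define GL_GREATER  0x0204
#define GL_NOTEQUAL 0x0205
#define GL_GEQUAL   0x0206
#define GL_ALWAYS   0x0207

/* Hypothetical hardware encoding, for illustration only. */
enum hw_compare {
   HW_COMPARE_ALWAYS,
   HW_COMPARE_NEVER,
   HW_COMPARE_LESS,
   HW_COMPARE_EQUAL,
   HW_COMPARE_LEQUAL,
   HW_COMPARE_GREATER,
   HW_COMPARE_NOTEQUAL,
   HW_COMPARE_GEQUAL
};

enum hw_compare
translate_compare_func(unsigned func)
{
   switch (func) {
   case GL_NEVER:    return HW_COMPARE_NEVER;
   case GL_LESS:     return HW_COMPARE_LESS;
   case GL_EQUAL:    return HW_COMPARE_EQUAL;
   case GL_LEQUAL:   return HW_COMPARE_LEQUAL;
   case GL_GREATER:  return HW_COMPARE_GREATER;
   case GL_NOTEQUAL: return HW_COMPARE_NOTEQUAL;
   case GL_GEQUAL:   return HW_COMPARE_GEQUAL;
   case GL_ALWAYS:   return HW_COMPARE_ALWAYS;
   }

   /* Callers should only ever pass a GL comparison operator, so reaching
    * this point is a programming error rather than a runtime condition. */
   assert(!"Invalid comparison function.");
   return HW_COMPARE_ALWAYS;
}
```

Compared with printing a message via fprintf, the assertion stops a debug build at the point where the invalid value arrives, while release builds (compiled with NDEBUG) fall back to a fixed value.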
* i965: Replace intel_state.c enums with those from brw_defines.h.  Kenneth Graunke  2013-07-03  3  -102/+46
| | | | | | | | | | | | Both intel_context.h and brw_defines.h have #defines for comparison functions, stencil ops, blending logic ops, and blending factors. They're exactly the same values, so it makes sense to pick one. brw_defines.h is the logical place for this kind of stuff, so this patch converts intel_state.c to use the set defined there. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Delete pre-DRI2.3 viewport hacks.  Kenneth Graunke  2013-07-03  3  -25/+1
| | | | | | | | | | The __DRI_USE_INVALIDATE extension was added on May 11th, 2010 by commit 4258e3a2e1c327. At this point, it's unlikely that anyone's using the right mix of new and old components to hit this path. Deleting it removes an untested code path and cleans up the driver a bit. Cc: Kristian Høgsberg <[email protected]> Cc: Keith Packard <[email protected]>
* i965: Remove "There are probably better ways" comment.  Kenneth Graunke  2013-07-03  1  -5/+0
| | | | | | | There are always better ways to do things. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Delete brw_print_reg() function.  Kenneth Graunke  2013-07-03  3  -99/+0
| | | | | | | | | | This wasn't called from anywhere; presumably it was used to examine brw_regs when debugging shader assembly. However, it prints registers in a different notation than brw_disasm.c which everyone is used to...which means I doubt anyone will want to use it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Move contents of intel_clear.h to intel_context.h.  Kenneth Graunke  2013-07-03  4  -40/+2
| | | | | | | | | Having a header file for a single prototype seems rather excessive. Plus, the actual function is in brw_clear.c, not intel_clear.c, so there isn't even the .c/.h filename symmetry one might expect. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Move contents of intel_extensions.h to intel_context.h.  Kenneth Graunke  2013-07-03  4  -37/+3
| | | | | | | | Having an entire header file for a single prototype seems a bit excessive. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Remove some dead code.  Kenneth Graunke  2013-07-03  17  -228/+0
| | | | | | | A random smattering of things that just aren't used anymore. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Delete dead intel_buffer_object::range_map_size field.  Kenneth Graunke  2013-07-03  1  -1/+0
| | | | | | | Nothing uses this, apparently. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Remove intel_buffer_object::source.  Kenneth Graunke  2013-07-03  2  -6/+0
| | | | | | | | | This was only used for BOs backed by system memory on i915. With that gone, there's nothing that even sets source to non-zero, so this is purely dead code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Fix buffer object segfault since removal of system memory BOs.  Kenneth Graunke  2013-07-03  1  -0/+3
| | | | | | | | | | | | | | | | | Commit cf31a19300cbcecddb6bd0f878abb9316ebad2a1 removed support for BOs backed by system memory, as it was only useful for i915. However, it removed a little too much code: intel_bufferobj_buffer() used to call intel_bufferobj_alloc_buffer(), and after that commit, it didn't. This led to NULL pointer dereferences in several test cases, such as es3conform's transform_feedback_state_variables test. This commit restores the allocation, preserving the original behavior. It may not be the cleanest approach, but tidying should come later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66432 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* postprocess: move second temporary assertion into isolated configuration  Matthew McClure  2013-07-03  1  -2/+2
| | | | | | | | | With this patch we will only assert that the second temporary is allocated, when there are more than two active filters. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423 Signed-off-by: Brian Paul <[email protected]>
* glsl: Ensure snprintf is defined on MSVC builds.  José Fonseca  2013-07-03  1  -0/+1
| | | | | | | Should fix: src\glsl\opt_dead_builtin_varyings.cpp(244) : error C3861: 'snprintf': identifier not found ...
* targets/xvmc-nouveau: add in missing nv30 lib  Ilia Mirkin  2013-07-03  1  -0/+1
| | | | | | | Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so that it may be dlopen'd. Signed-off-by: Ilia Mirkin <[email protected]>
* mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies  Marek Olšák  2013-07-02  16  -43/+10
| | | | | | Not needed with do_dead_builtin_varyings. Reviewed-by: Ian Romanick <[email protected]>
* st/mesa: disable EXT_separate_shader_objects  Marek Olšák  2013-07-02  2  -1/+11
| | | | | | The extension disallows elimination of set-but-unused varyings. Reviewed-by: Ian Romanick <[email protected]>
* glsl/linker: eliminate unused and set-but-unused built-in varyings  Marek Olšák  2013-07-02  5  -2/+496
| | | | | | | | | | | | | This eliminates built-in varyings such as gl_Color, gl_SecondaryColor, gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is broken down into separate vec4s if needed. v2: - use a switch statement in varying_info_visitor::visit(ir_variable*) - use snprintf - disable the optimization for GLES2 Reviewed-by: Ian Romanick <[email protected]>
* glsl/linker: check against varying limit after unused varyings are eliminated  Marek Olšák  2013-07-02  3  -11/+33
| | | | | | | We counted even the varyings which were later eliminated, which was suboptimal. Reviewed-by: Ian Romanick <[email protected]>
* glsl/linker: link shaders in the opposite order (from fragment to vertex)  Marek Olšák  2013-07-02  1  -50/+58
| | | | | | | | | | | | This ensures that inter-shader outputs and inputs are properly eliminated across 3 or more shader stages. The behavior is unchanged with 2 or fewer shader stages. For example, elimination of unused FS inputs causes elimination of matching GS outputs, which causes elimination of the GS inputs that were needed for evaluation of the eliminated GS outputs, which causes elimination of matching VS outputs. An unused FS input is all that's needed to trigger this chain reaction. Reviewed-by: Ian Romanick <[email protected]>
* mesa: renumber shader indices according to their placement in pipeline  Marek Olšák  2013-07-02  9  -44/+32
| | | | | | | | | See my explanation in mtypes.h. v2: don't do this in gallium v3: also updated the comment at the gl_shader_type definition Reviewed-by: Ian Romanick <[email protected]>
* gallivm: Simplify intrinsic name construction.  José Fonseca  2013-07-02  1  -23/+10
| | | | | | Just noticed this could be slightly shortened when fixing MSVC build. Trivial.
* glsl/builtins: Fix ARB_texture_cube_map_array built-in availability.  Kenneth Graunke  2013-07-02  2  -1/+8
| | | | | | | | | | | | | | | | | | | | | This patch adds texture() for isamplerCubeArray and usamplerCubeArray, which were entirely missing. It also makes texture() with a LOD bias fragment shader specific. The main GLSL specification explicitly says that texturing with LOD bias should not be allowed for vertex shaders. Affects Piglit's ARB_texture_cube_map_array/compiler/tex_bias-01.vert, which tries to use bias in a vertex shader. Currently, it expects this to pass (so this patch regresses the test), but I've sent a patch to reverse the expected behavior (so this patch would fix the updated test): http://lists.freedesktop.org/archives/piglit/2013-June/006123.html NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallivm: Fix MSVC build.  José Fonseca  2013-07-02  1  -8/+7
|
* gallivm: Fix indirect immediate registers.  José Fonseca  2013-07-02  1  -2/+2
| | | | | | | | | | | If reg->Register.Indirect is true then the immediate is not truly a constant LLVM expression. There is no performance regression in using LLVMBuildBitCast, as it will fallback to LLVMConstBitCast internally when the argument is a constant. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
* gallium/tests: fix the translate test  Zack Rusin  2013-06-28  1  -4/+4
|
* i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w  Anuj Phogat  2013-07-01  1  -0/+1
| | | | | | | | | This patch enables ext_framebuffer_multisample_blit_scaled extension on intel h/w >= gen6. Signed-off-by: Anuj Phogat <[email protected]> Acked-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/blorp: Add bilinear filtering of samples for multisample scaled blits  Anuj Phogat  2013-07-01  2  -11/+264
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation of ext_framebuffer_multisample_blit_scaled in i965/blorp uses nearest filtering for multisample scaled blits. Nearest filtering produces blocky artifacts and negates the benefits of MSAA, which is why the extension was not enabled on i965. This patch implements bilinear filtering of samples in the blorp engine. Images generated with this patch are free from blocky artifacts and show a big improvement in visual quality. Observed no piglit and gles3 regressions. V3: - Algorithm used for filtering assumes a rectangular grid of samples roughly corresponding to sample locations. - Test the boundary conditions on the edges of texture. V4: - Clip texcoords and use conditional MOVs. - Send texture dimensions as push constants. - Remove the optimization in case of scaled multisample blits. V5: - Move mcs_fetch() inside the 'for' loop after computing pixel coordinates. Signed-off-by: Anuj Phogat <[email protected]> Acked-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]>
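For readers unfamiliar with the term, bilinear filtering over a rectangular grid can be sketched in plain C as below. This is a generic textbook formulation on the CPU, not the blorp shader code from the patch; the names `mix_f` and `bilinear` and the argument layout are assumptions made for this example.

```c
/* Linear interpolation between a and b with weight t in [0, 1]. */
float
mix_f(float a, float b, float t)
{
   return a + t * (b - a);
}

/* Hypothetical helper: blend four neighbouring sample values.
 *   s00 = top-left,    s10 = top-right,
 *   s01 = bottom-left, s11 = bottom-right.
 * fx and fy are the fractional positions inside the cell, in [0, 1]. */
float
bilinear(float s00, float s10, float s01, float s11, float fx, float fy)
{
   float top = mix_f(s00, s10, fx);      /* interpolate along x, upper row */
   float bottom = mix_f(s01, s11, fx);   /* interpolate along x, lower row */
   return mix_f(top, bottom, fy);        /* then interpolate along y */
}
```

Nearest filtering would instead return just one of the four inputs, which is what produces the blocky look the commit message mentions.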
* docs: Import 9.1.4 release notes, add news item.  Ian Romanick  2013-07-01  3  -0/+328
| | | | Signed-off-by: Ian Romanick <[email protected]>
* draw/translate: fix instancing  Zack Rusin  2013-06-28  17  -40/+109
| | | | | | | | | | | | | | | | | | We were incorrectly computing the buffer offset when using instances. The buffer offset is always equal to: start_instance * stride + (instance_num / instance_divisor) * stride We were completely ignoring the start instance quite often, producing instances that were completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be: 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be: 1 * stride, not 0 as we'd have. This fixes it and adjusts all the code to the changes. Signed-off-by: Zack Rusin <[email protected]>
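The offset formula above can be written out as a small C sketch. The function name `instance_offset` and its parameter types are hypothetical and only mirror the terms used in the commit message; the actual draw/translate code is structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the per-instance attribute data,
 * following the formula
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 */
size_t
instance_offset(unsigned start_instance, unsigned instance_num,
                unsigned instance_divisor, size_t stride)
{
   /* Assumption: a divisor of 0 (per-vertex data) is handled by the caller. */
   assert(instance_divisor > 0);

   /* The integer division applies to instance_num only, never to
    * start_instance; dividing the sum was the bug described above. */
   return (size_t)start_instance * stride +
          (size_t)(instance_num / instance_divisor) * stride;
}
```

With a stride of 16 bytes, the two cases from the commit message give `instance_offset(5, 0, 2, 16)` = 80 (5 * stride) and `instance_offset(1, 0, 3, 16)` = 16 (1 * stride).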