summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* r600g: reduce quant mode on evergreen+Alex Deucher2012-09-131-1/+1
| | | | | | | | | | Seems to have an affect on the allowable range of values. Set evergreen+ to 1/256 to match 6xx/7xx. fixes: https://bugs.freedesktop.org/show_bug.cgi?id=54877 Signed-off-by: Alex Deucher <[email protected]>
* radeonsi: don't use a staging resource for large transfersMarek Olšák2012-09-131-10/+0
| | | | It kills performance if the resource is linear.
* r600g: don't use a staging resource for large transfersMarek Olšák2012-09-131-10/+0
| | | | It kills performance if the resource is linear.
* r600g: convert the remnants of VGT state into immediate register writes/atoms v4Marek Olšák2012-09-138-57/+65
| | | | | | | | | v2: Group vgt register together to avoid lockup v3: Split multi primitive register and index bias register v4: Bump R600_NUM_ATOMS Signed-off-by: Marek Olšák <[email protected]> Signed-off-by: Jerome Glisse <[email protected]>
* r600g: emit the primitive type and associated regs only if the type is changedMarek Olšák2012-09-135-48/+38
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: add clip_misc_state for clip registers emitted in draw_vboMarek Olšák2012-09-138-22/+44
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: fix computing how much space is needed for a draw commandMarek Olšák2012-09-132-6/+12
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: fix the number of CS dwords of cb_misc_stateMarek Olšák2012-09-132-2/+2
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: atomize clip stateMarek Olšák2012-09-136-148/+38
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: atomize blend colorMarek Olšák2012-09-136-27/+25
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: atomize viewport stateMarek Olšák2012-09-137-40/+28
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: atomize stencil ref stateMarek Olšák2012-09-137-51/+56
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: remove unused state ID definitionsMarek Olšák2012-09-131-8/+0
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: initialize the first CS just like any other CSMarek Olšák2012-09-136-26/+24
| | | | | | by reusing the CS initialization in r600_context_flush. Reviewed-by: Jerome Glisse <[email protected]>
* r600g: add support for geometry shader samplers and constant buffersMarek Olšák2012-09-135-1/+52
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: put sampler states and views into an array indexed by shader typeMarek Olšák2012-09-136-72/+44
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: do fine-grained sampler state updatesMarek Olšák2012-09-136-51/+110
| | | | | | | | | | | | Update only those sampler states which are changed in a shader stage, instead of always updating all sampler states in the shader stage. That requires keeping a bitmask of those states which are enabled, and those states which are dirty at a given point (subset of enabled states). This is similar to how sampler views, constant buffers, and vertex buffers are handled. Reviewed-by: Jerome Glisse <[email protected]>
* r600g: consolidate set_viewport_state functionsMarek Olšák2012-09-133-48/+24
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: consolidate set_sampler_views functionsMarek Olšák2012-09-134-38/+17
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: put constant buffer state into an array indexed by shader typeMarek Olšák2012-09-136-40/+33
| | | | | | to easily and robustly handle multiple shader stages Reviewed-by: Jerome Glisse <[email protected]>
* r600g: cleanup state function namesMarek Olšák2012-09-133-37/+37
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: consolidate initialization of common state functionsMarek Olšák2012-09-135-150/+81
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: simplify flushingMarek Olšák2012-09-1312-190/+210
| | | | | | | | | | | | Based on the patch called "simplify and fix flushing and synchronization" by Jerome Glisse. Rebased, removed unneded code, simplified more and cleaned up. Also, SH_ACTION_ENA is not set when changing shaders (hw doesn't seem to need it). It's only used to flush constant buffers. Reviewed-by: Jerome Glisse <[email protected]>
* radeon/llvm: Fix lowering of vbuildTom Stellard2012-09-137-93/+19
| | | | | | Some of the old AMDIL code was hard-coding subreg indices when creating the VBUILD node, which was making it difficult to match the vector_insert patterns.
* radeon/llvm: Support fmul on SITom Stellard2012-09-131-1/+4
|
* i965: Fix out-of-order sampler unit usage in ARB fragment programs.Kenneth Graunke2012-09-122-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | ARB fragment programs use texture unit numbers directly, unlike GLSL which has an extra indirection. If a fragment program only uses one texture assigned to GL_TEXTURE1, SamplersUsed will only contain a single bit, which would make us only upload a single surface/sampler state entry. However, it needs to be the second entry. Using _mesa_fls() instead of _mesa_bitcount() solves this. For ARB programs, this makes num_samplers the ID of the highest texture unit used. Since GLSL uses consecutive integers assigned by the linker, _mesa_fls() should give the same result as _mesa_bitcount().. Fixes a regression since 85e8e9e000732908b259a7e2cbc1724a1be2d447, which caused GPU hangs in ETQW (and probably others), as well as breaking piglit test fp-fragment-position. v2: Add a comment, as suggested by Matt. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54098 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54179 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Tested-by: meng <[email protected]>
* mesa: Add a _mesa_fls() function to find the last bit set in a word.Kenneth Graunke2012-09-121-0/+22
| | | | | | | | | | | ffs() finds the least significant bit set; _mesa_fls() finds the /most/ significant bit. v2: Make it an inline function in imports.h, per Brian's suggestion. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/blorp: Fix offsets and width/height for stencil blits.Paul Berry2012-09-121-9/+37
| | | | | | | | Fixes piglit test "framebuffer-blit-levels draw stencil". NOTE: This is a candidate for stable release branches. Acked-by: Eric Anholt <[email protected]>
* i965/blorp: Reduce alignment restrictions for stencil blits.Paul Berry2012-09-121-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | Previously, we aligned all stencil blit operations to multiples of the size of a tile, since stencil buffers use W-tiling, and blorp has to approximate this by configuring the 3D pipeline for Y-tiling and swizzling coordinates. However, this was unnecessarily conservative; it turns out that the differences between W-tiling and Y-tiling are confined to 32-byte sub-tiles within the 4k tiling pattern; the layout of these 32-byte sub-tiles within the larger 4k tile is the same (8 sub-tiles across by 16 sub-tiles down, in column-major order). Therefore we only need to align stencil blit operations to multiples of the sub-tile size. Note: although the performance improvement of this change is probably quite small, the fact that W-tiling and Y-tiling formats only differ within 32-byte sub-tiles will be essential in a future patch to ensure that stencil blits work correctly between parts of the miptree other than level/layer 0. Making this change provides handy documentation (and validation) of this fact. NOTE: This is a candidate for stable release branches. Acked-by: Eric Anholt <[email protected]>
* i965/blorp: don't reduce stencil alignment restrictions when multisampling.Paul Berry2012-09-121-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | When blitting to a stencil buffer, we need to align the rectangle we send down the rendering pipeline, to account for the fact that the stencil buffer uses a W-tiled layout, but we are configuring its surface state as Y-tiled. Previously, when the stencil buffer was multisampled, we assumed that we could reduce the amount of alignment that was necessary, since each pixel occupies a block of 2x2 or 4x2 samples in the stencil buffer. That would have been correct if the coordinates we were adjusting were measured in pixels. However, the conversion from pixel coordinates to coordinates within the interleaved buffer has already been done; therefore the full alignment restriction applies. Note: the reason this mistake wasn't previously uncovered by piglit tests is because it is being masked by another mistake: the blorp engine is using overly conservative alignment restrictions when doing stencil blits. The overly conservative alignment restrictions will be removed in the patch that follows. Doing this fix now will prevent the subsequent patch from introducing regressions. NOTE: This is a candidate for stable release branches. Acked-by: Eric Anholt <[email protected]>
* intel: Add map_stencil_as_y_tiled to intel_region_get_aligned_offset.Paul Berry2012-09-128-13/+31
| | | | | | | | | | This patch modifies intel_region_get_aligned_offset() to make the appropriate calculation when the blorp engine sets up a W-tiled stencil buffer using a Y-tiled SURFACE_STATE. NOTE: This is a candidate for stable release branches. Acked-by: Eric Anholt <[email protected]>
* intel: Add map_stencil_as_y_tiled to intel_region_get_tile_masks.Paul Berry2012-09-128-13/+21
| | | | | | | | | | | When the blorp engine is performing a blit from one stencil buffer to another, it sets up the surface state for these buffers as Y-tiled, so it needs to be able to force intel_region_get_tile_masks() to return the appropriate masks for a Y-tiled region. NOTE: This is a candidate for stable release branches. Acked-by: Eric Anholt <[email protected]>
* i965/blorp: Account for offsets when emitting SURFACE_STATE.Paul Berry2012-09-124-4/+48
| | | | | | | | Fixes piglit tests "framebuffer-blit-levels {read,draw} depth". NOTE: This is a candidate for stable release branches. Reviewed-by: Eric Anholt <[email protected]>
* i965/blorp: Thread level and layer through brw_blorp_blit_miptrees().Paul Berry2012-09-123-6/+19
| | | | | | | | | | | | | | | | | | Previously, when performing a blit using the blorp engine, we failed to account for the level and layer of the source and destination. As a result, all blits would occur between miplevel 0 and layer 0 of the corresponding textures, regardless of which level/layer was bound to the framebuffer. This patch passes the correct level and layer through brw_blorp_miptrees() into the brw_blorp_blit_params data structure. Further patches in the series will adapt gen{6,7}_blorp_emit_surface_state to make use of these parameters. NOTE: This is a candidate for stable release branches. Reviewed-by: Eric Anholt <[email protected]>
* i965/blorp: Don't create a dummy renderbuffer just to fetch image offsets.Paul Berry2012-09-121-8/+1
| | | | | This is unnecessary--the image offsets can be read directly out of the miptree using intel_miptree_get_image_offset.
* i965/blorp: store x and y offsets in brw_blorp_mip_info.Paul Berry2012-09-124-28/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, gen{6,7}_blorp_emit_surface_state assumes that the src and dst surfaces are mapped to miplevel 0 and layer 0 (thus no surface offset is required). This is a bug, since the user might try to blit to and from levels/layers other than 0. To fix this bug, it will not be sufficient to have gen6_{6,7}_blorp_emit_surface_state look up the surface offset at the time they set up the surface state, since these offsets will need to be tweaked when blitting stencil buffers (due to the fact that stencil buffer blits have to swizzle between W and Y tiling formats). So, to pave the way for the bug fix, this patch causes the x and y offsets to be computed during blit setup and stored in brw_blorp_mip_info. As a result of this change, brw_blorp_mip_info doesn't need to store the level and layer anymore. For consistency, this patch makes a similar change to the handling of depth buffers when doing HiZ operations. NOTE: This is a candidate for stable release branches. Reviewed-by: Eric Anholt <[email protected]>
* i965/blorp: store surface width/height in brw_blorp_mip_info.Paul Berry2012-09-125-37/+48
| | | | | | | | | | | | | | | | | | | | Previously, gen{6,7}_blorp_emit_surface_state would look up the width and height of the surface at the time they set up the surface state, and then tweak it if necessary (it's necessary when a W-tiled surface is being mapped as Y-tiled). With this patch, we look up the width and height when setting up the blit, and store them in brw_blorp_mip_info. This allows us to do the necessary tweak in the brw_blorp_blit_params constructor (where it makes more sense). It also reduces the need to keep track of level and layer in brw_blorp_mip_info, so that a future patch can eliminate them entirely. For consistency, this patch makes a similar change to the handling of depth buffers when doing HiZ operations. NOTE: This is a candidate for stable release branches. Reviewed-by: Eric Anholt <[email protected]>
* i965/blorp: Change gl_renderbuffer* params to intel_renderbuffer*.Paul Berry2012-09-121-28/+32
| | | | | | | | | This makes it more convenient for blorp functions to get access to Intel-specific data inside the renderbuffer objects. NOTE: This is a candidate for stable release branches. Reviewed-by: Eric Anholt <[email protected]>
* i965/blorp: Clarify why width/height must be adjusted for Gen6 IMS surfaces.Paul Berry2012-09-122-1/+10
| | | | | | | | | Also add a clarifying comment for why the width/height doesn't need adjustment for Gen7. NOTE: This is a candidate for stable release branches. Reviewed-by: Eric Anholt <[email protected]>
* i965/gen6+: Adjust stencil buffer size after computing miptree layout.Paul Berry2012-09-121-12/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since Gen6+ stencil buffers use W-tiling (a tiling arrangement which drm and the kernel are not aware of) we need to round up the width and height of a stencil buffer to multiples of the W-tile size (64x64) before allocating a stencil buffer. Previously, we rounded up the size of the base miplevel, and then computed the miptree layout based on the rounded up size. This was incorrect, because it meant that the total size of the miptree would not be properly W-tile aligned, and therefore we would not always allocate enough pages. (Note: even though the GL API doesn't allow creation of mipmapped stencil textures, it does allow mipmapping of a combined depth/stencil texture, and on Gen6+, a combined depth/stencil texture is internally implemented as a pair of separate depth and stencil buffers.) For example, on Sandy Bridge, when allocating a mipmapped stencil texture of size 128x128, we would first round up to the nearest multiple of 64x64 (causing no change to the size), and then compute the miptree layout (whose size worked out to 128x196). Then we would request an allocation of 128*196 bytes (6.125 pages), causing 7 pages to be allocated to the texture. However, the texture needs 8 pages, since each W-tile occupies a page, and it takes 2 W-tiles to cover a width of 128 and 4 W-tiles to cover a height of 196. This patch changes the order of operations so that the miptree layout is computed first and then the total size of the miptree is rounded up to be W-tile aligned. NOTE: This is a candidate for stable release branches. Reviewed-by: Eric Anholt <[email protected]>
* radeonsi: Properly handle NULL sampler views.Michel Dänzer2012-09-121-3/+3
| | | | | | | | | | Fixes piglit shaders/glsl-fs-uniform-sampler-array and many other similar tests. In fact, I just completed a piglit quick-driver.tests run without any GPU lockups or even VM protection faults. Yay! Signed-off-by: Michel Dänzer <[email protected]>
* radeonsi: Fix calculation of number of records in buffer resource.Michel Dänzer2012-09-121-1/+1
| | | | | | | | | | | The value was too small by 1 in some cases (non-first of several vertex elements interleaved in a single buffer). Fixes intermittent incorrect geometry in many apps, e.g. piglit spec/EXT_texture_snorm/fbo-generatemipmap-formats. Signed-off-by: Michel Dänzer <[email protected]> Reviewed-by: Christian König <[email protected]>
* mesa: glGet: fix API check for EGL_image_external enumsImre Deak2012-09-111-6/+9
| | | | | | | | These enums are valid only in ES1 and ES2. So far they were marked valid incorrectly, depending on the previous API mask in the enum list. Signed-off-by: Imre Deak <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* mesa: glGet: fix indentation of print_table_statsImre Deak2012-09-111-9/+9
| | | | | | | No functional change. Signed-off-by: Imre Deak <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* mesa: glGet: fix indentation of find_valueImre Deak2012-09-111-4/+4
| | | | | | | No functional change. Signed-off-by: Imre Deak <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* mesa: glGet: fix indentation of _mesa_init_get_hashImre Deak2012-09-111-9/+9
| | | | | | | No functional change. Signed-off-by: Imre Deak <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* mesa: fix proxy texture error handling in glTexStorage()Brian Paul2012-09-111-37/+41
| | | | | | | | | This is basically a follow-on to 1f5b1f98468d5e80be39e619ed15c422fbede8d3. Basically, generate GL errors for ordinary invalid parameters for proxy targets the same as for non-proxy targets. Only texture size and OOM errors should be handled specially for proxies. Note: This is a candidate for the stable branches.
* mesa: make _mesa_get_proxy_target() non-staticBrian Paul2012-09-112-6/+8
| | | | | | Needed for the next patch. Note: This is a candidate for the stable branches.
* mesa: do internal format error checking for glTexStorage()Brian Paul2012-09-111-0/+48
| | | | | | | | Turns out we weren't doing any format checking before. Now check the internal format and, in particular, make sure that unsized internal formats aren't accepted. Note: This is a candidate for the stable branches.
* mesa/msaa: Allow X and Y flips in multisampled blits.Paul Berry2012-09-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From the GL 4.3 spec, section 18.3.1 "Blitting Pixel Rectangles": If SAMPLE_BUFFERS for either the read framebuffer or draw framebuffer is greater than zero, no copy is performed and an INVALID_OPERATION error is generated if the dimensions of the source and destination rectangles provided to BlitFramebuffer are not identical, or if the formats of the read and draw framebuffers are not identical. It is not clear from the spec whether "dimensions" should mean both sign and magnitude, or just magnitude. Previously, Mesa interpreted "dimensions" as meaning both sign and magnitude, so any multisampled blit that attempted to flip the image in the X and/or Y direction would fail. However, Y flips are likely to be commonplace in OpenGL applications that have been ported from DirectX applications, as a result of the fact that DirectX and OpenGL differ in their orientation of the Y axis. Furthermore, at least one commercial driver (nVidia) permits Y filps, and L4D2 relies on them being permitted. So it seems prudent for Mesa to permit them. This patch changes Mesa to allow both X and Y flips, since there is no language in the spec to indicate that X and Y flips should be treated differently. NOTE: This is a candidate for stable release branches. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>