Commit message Author Age Files Lines
* r300g: rework command submission and resource space checkingMarek Olšák2011-01-0813-260/+505
| The motivation behind this rework is to get some speed by reducing CPU overhead. The performance increase depends on many factors, but it's measurable (I think it's about a 10% increase in Torcs).
|
| This commit replaces libdrm's radeon_cs_gem with our own implementation. It's optimized specifically for r300g, but r600g could use it as well. Reloc writes and space checking are faster and simpler than their counterparts in libdrm (the time complexity of all the functions is O(1) in nearly all scenarios, thanks to hashing). (libdrm's radeon_bo_gem is still being used in the driver.)
|
| It works like this:
|
| cs_add_reloc(cs, buf, read_domain, write_domain) adds a new relocation and also adds the size of 'buf' to the used_gart and used_vram winsys variables based on the domains, which are simply or'd for accounting purposes. The adding is skipped if the reloc is already present in the list, but any newly-referenced domains are still accounted for.
|
| cs_validate is then called, which just checks: used_vram/gart < vram/gart_size * 0.8. The 0.8 factor allows for some memory fragmentation. If the validation fails, the pipe driver flushes the CS and tries to do the validation again, i.e. it validates only that one operation. If it fails again, it drops the operation on the floor and prints some nasty message to stderr.
|
| cs_write_reloc(cs, buf) just writes a reloc that has been added using cs_add_reloc. The read_domain and write_domain parameters have been removed, because we already specify them in cs_add_reloc.
|
| The space checking has been tested by putting small values in the vram/gart_size variables.
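The accounting described in this message can be sketched roughly as follows. This is a minimal illustration, not the r300g winsys code: the struct layouts, the fixed-size open-addressing hash table, the `account` helper and the `DOMAIN_*` constants are all assumptions made for the example. Only the function names cs_add_reloc and cs_validate, the or'ing of domains, and the 0.8 threshold come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types and constants, for illustration only. */
enum { DOMAIN_GTT = 1u << 0, DOMAIN_VRAM = 1u << 1 };

#define HASH_SLOTS 256 /* power of two, larger than MAX_RELOCS */
#define MAX_RELOCS 128

struct buf { uint32_t handle; uint64_t size; };
struct reloc { const struct buf *buf; unsigned domains; };

struct cs {
    struct reloc relocs[MAX_RELOCS];
    int hash[HASH_SLOTS]; /* reloc index + 1, or 0 for an empty slot */
    unsigned num_relocs;
    uint64_t used_gart, used_vram;
    uint64_t gart_size, vram_size;
};

/* Charge the buffer size to every domain in new_domains. */
static void account(struct cs *cs, uint64_t size, unsigned new_domains)
{
    if (new_domains & DOMAIN_GTT)
        cs->used_gart += size;
    if (new_domains & DOMAIN_VRAM)
        cs->used_vram += size;
}

/* Add a reloc, or update an existing one. The hash lookup is O(1) on
 * average; an already-present buffer is only charged for domains it
 * did not reference before. Returns false if the reloc list is full. */
static bool cs_add_reloc(struct cs *cs, const struct buf *buf,
                         unsigned read_domain, unsigned write_domain)
{
    unsigned domains = read_domain | write_domain;
    unsigned slot = buf->handle & (HASH_SLOTS - 1);

    while (cs->hash[slot] != 0) {
        struct reloc *r = &cs->relocs[cs->hash[slot] - 1];
        if (r->buf == buf) {
            account(cs, buf->size, domains & ~r->domains);
            r->domains |= domains;
            return true;
        }
        slot = (slot + 1) & (HASH_SLOTS - 1); /* linear probing */
    }

    if (cs->num_relocs == MAX_RELOCS)
        return false;

    cs->relocs[cs->num_relocs] = (struct reloc){ buf, domains };
    cs->hash[slot] = (int)++cs->num_relocs;
    account(cs, buf->size, domains);
    return true;
}

/* used < size * 0.8, written with integers. */
static bool cs_validate(const struct cs *cs)
{
    return cs->used_vram * 10 < cs->vram_size * 8 &&
           cs->used_gart * 10 < cs->gart_size * 8;
}
```

A caller would add the relocs for one operation, call cs_validate, and on failure flush and retry once, as the message describes. The flush itself is left out of the sketch.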
* intel: Make renderbuffer tiling choice match texture tiling choice.Eric Anholt2011-01-071-4/+9
| | | | | | There really shouldn't be any difference between the two for us. Fixes a bug where Z16 renderbuffers would be untiled on gen6, likely leading to hangs.
* intel: Use the _BaseFormat from MESA_FORMAT_* in renderbuffer setup.Eric Anholt2011-01-071-36/+1
|
* docs: fix messed up names with special characters in relnotes-7.9.1Marek Olšák2011-01-081-2/+4
| | | | (cherry picked from commit 67aeab0b77fb6be864088e69ea74a010b6543fa1)
* docs: fix messed up names with special characters in relnotes-7.10Marek Olšák2011-01-081-2/+4
| | | | (cherry picked from commit 36009724fdd652ab29aa928ba78891afd650e768)
* i915: Drop old checks for the settexoffset hack.Eric Anholt2011-01-072-17/+6
|
* i915: Don't claim to support AL1616 when neither 830 nor 915 does it.Eric Anholt2011-01-071-1/+2
| | | | Fixes an abort in fbo-generatemipmap-formats.
* intel: Add a vtbl hook for determining if a format is renderable.Eric Anholt2011-01-077-38/+68
| | | | | | | By relying on just intel_span_supports_format, some formats that aren't supported pre-gen4 were not reporting FBO incomplete. And we also complained in stderr when it happened on i915 because draw_region gets called before framebuffer completeness validation.
* intel: expose ARB_framebuffer_object in the i915 driver.Eric Anholt2011-01-071-1/+1
| | | | | | | | | | | ARB_fbo no longer disallows mismatched width/height on attachments (shouldn't be any problem), mixed format color attachments (we only support 1), and L/A/LA/I color attachments (we already reject them on 965 too). It requires Gen'ed names (driver doesn't care), and adds FramebufferTextureLayer (we don't do texture arrays). So it looks like we're already in the position we need to be for this extension. Bug #27468, #32381.
* nvc0: fix reloc domain conflict on buffer migrationChristoph Bumiller2011-01-081-12/+12
| | | | | Occurred because the code assumed that buf->domain would remain equal to old_domain.
* nvc0: upload user buffers only from draw info min to max indexChristoph Bumiller2011-01-082-3/+9
| | | | There are actually applications that profit immensely from this.
* nvc0: fix emission of first 3 u8 indices to RING_NIChristoph Bumiller2011-01-081-1/+1
|
* nvc0: reset mt transfer address after read loop over layersChristoph Bumiller2011-01-081-0/+1
|
* nvc0: tie buffer memory release to the buffer fenceChristoph Bumiller2011-01-081-4/+7
| | | | | ... instead of the next fence to be emitted. This way we have a chance to reclaim the storage earlier.
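A toy model of why the buffer's own fence allows earlier reclaim. Everything here is hypothetical (sequence-numbered fences, the `gpu` and `buffer` structs, the helper names); it is not the nvc0 fence code, only an illustration of the idea in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: fences are numbered in emission order and the GPU
 * reports the highest sequence number it has completed. */
struct gpu {
    unsigned next_seq;      /* sequence number of the next fence to emit */
    unsigned completed_seq; /* last fence the GPU has signalled */
};

struct buffer {
    unsigned fence_seq; /* fence covering the buffer's last use */
};

/* Old scheme: storage is released once the next emitted fence signals. */
static unsigned release_seq_next_fence(const struct gpu *gpu)
{
    return gpu->next_seq;
}

/* New scheme: storage is released once the buffer's own fence signals. */
static unsigned release_seq_buffer_fence(const struct buffer *buf)
{
    return buf->fence_seq;
}

static bool can_reclaim(const struct gpu *gpu, unsigned release_seq)
{
    /* A fence cannot complete before it has been emitted. */
    assert(gpu->completed_seq < gpu->next_seq);
    return gpu->completed_seq >= release_seq;
}
```

With the buffer last used under fence 5, fence 6 completed and fence 9 next to be emitted, the buffer fence already permits reclaim while the next-fence rule does not.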
* r300g: Remove invalid assertion.Łukasz Krotowski2011-01-081-1/+0
| | | | | | | Invalid after be1af4394e060677b7db6bbb8e3301e38a3363da (user buffer creation with width0 == ~0). Signed-off-by: Marek Olšák <[email protected]>
* docs: Import 7.10 release notes from 7.10 branchIan Romanick2011-01-071-5/+2742
|
* i965: Avoid double-negation of immediate values in the VS.Eric Anholt2011-01-071-4/+3
| | | | | | | | | | | | | | In general, we have to fold the negation into the immediate values we pass in, because the src1 negate field in the register description is in the bits3 slot that the 32-bit value is loaded into, so it's ignored by the hardware. However, the src0 negate field is in bits1, so after we'd negated the immediate value loaded in, it would also get negated through the register description. This broke this VP instruction in the position calculation in civ4: MAD TEMP[1], TEMP[1], CONST[256].zzzz, CONST[256].-y-y-y-y; Bug #30156
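The double negation can be illustrated with a small model. The `imm_src` struct and both helpers are hypothetical and do not reflect the i965 register encoding; the message does not say exactly how the fix is implemented, so clearing the flag after folding is an assumption chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand model: an immediate value plus a negate modifier. */
struct imm_src {
    float value;
    bool negate;
};

/* What a src0-style operand yields: the hardware honours the negate flag. */
static float hw_read_src0(struct imm_src s)
{
    return s.negate ? -s.value : s.value;
}

/* Fold the negate modifier into the immediate and clear the flag, so the
 * hardware cannot apply the negation a second time. */
static struct imm_src fold_negate(struct imm_src s)
{
    if (s.negate) {
        s.value = -s.value;
        s.negate = false;
    }
    assert(!s.negate); /* the flag must not survive folding */
    return s;
}
```

Negating the value but leaving the flag set, as in `{ -2.0f, true }`, reads back as 2.0f, which is the kind of sign error the message describes.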
* docs: Import 7.9.1 release notes from 7.9 branchIan Romanick2011-01-071-0/+404
|
* r600g: Also set const_offset if the buffer is not a user buffer in r600_upload_const_buffer().Henri Verbeet2011-01-071-0/+2
|
* r600g: Update some comments for Evergreen.Henri Verbeet2011-01-071-1/+3
|
* r600g: Split ALU clauses based on used constant cache lines.Henri Verbeet2011-01-072-21/+129
|
* r600g: Consistently use the copy of the alu instruction in r600_bc_add_alu_type().Henri Verbeet2011-01-071-9/+9
|
* r600g: Store kcache settings as an array.Henri Verbeet2011-01-073-24/+25
|
* r300g: derive user buffer sizes at draw timeMarek Olšák2011-01-079-104/+144
| | | | | | | This only uploads the [min_index, max_index] range instead of [0, userbuf size], which greatly speeds up user buffer uploads. This is also a prerequisite for atomizing vertex arrays in st/mesa.
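A rough sketch of the range computation, assuming a simple interleaved layout. The `upload_range` struct, the function name and its parameters are hypothetical and not taken from r300g.

```c
#include <assert.h>
#include <stddef.h>

struct upload_range {
    size_t offset; /* first byte of the user buffer to upload */
    size_t size;   /* number of bytes to upload */
};

/* Byte range of a user vertex buffer touched by a draw that only uses
 * vertices min_index..max_index. vertex_size is the size of one vertex
 * element, which may be smaller than the stride. */
static struct upload_range
user_buffer_range(size_t buffer_offset, size_t stride, size_t vertex_size,
                  unsigned min_index, unsigned max_index)
{
    struct upload_range r;

    assert(max_index >= min_index);
    r.offset = buffer_offset + (size_t)min_index * stride;
    r.size = (size_t)(max_index - min_index) * stride + vertex_size;
    return r;
}
```

For a draw that touches vertices 100 to 199 with a 16-byte stride and 12-byte vertices, this uploads 1596 bytes starting at byte 1600, regardless of how large the user buffer is.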
* mesa: fix an error in uniform arrays in row calculating.Jian Zhao2011-01-071-1/+1
| | | | | | | | Fix an error in the uniform row calculation: it could allocate one row too many, which could lead to out-of-range memory use and sometimes made the program abort when freeing the memory. NOTE: This is a candidate for the 7.9 and 7.10 branches. Signed-off-by: Brian Paul <[email protected]>
* mesa: Directly include mfeatures.h in files that perform feature tests.Vinson Lee2011-01-0754-0/+54
|
* r600c: fix up SQ setup in blit code for Ontario/NIAlex Deucher2011-01-071-1/+87
|
* r600g: allow constant buffers to be user buffers.Dave Airlie2011-01-076-4/+44
| | | | | | | | | This provides an upload facility for the constant buffers since Marek's constants in user buffers changes. gears at least work on my evergreen now. Signed-off-by: Dave Airlie <[email protected]>
* r600c: add support for NI asicsAlex Deucher2011-01-065-1/+118
|
* r600g: add support for NI (Northern Islands) GPUsAlex Deucher2011-01-066-0/+115
| | | | This adds support for Barts, Turks, and Caicos asics.
* i965: Rename various gen6 #defines to match the documentation.Kenneth Graunke2011-01-0612-33/+33
| | | | | | | | This should make it easier to cross-reference the code and hardware documentation, as well as clear up any confusion on whether constants like CMD_3D_WM_STATE mean WM_STATE (pre-gen6) or 3DSTATE_WM (gen6+). This does not rename any pre-gen6 defines.
* svga: Ensure that the wrong vdecls don't get used in swtnl pathJakob Bornecrantz2011-01-063-0/+19
| | | | | | | The draw module set new state that didn't require swtnl, which caused need_swtnl to be unset. This caused the call to svga_update_state(svga, SVGA_STATE_SWTNL_DRAW) from the vbuf backend to overwrite the vdecls we set up there with the real buffers' vdecls.
* glsl: Refresh autogenerated lexer and parser files.Ian Romanick2011-01-063-2649/+2700
| | | | For the previous commit.
* glsl: Support the 'invariant(all)' pragmaIan Romanick2011-01-064-0/+42
| | | | | | | | | | | | | Previously the 'STDGL invariant(all)' pragma added in GLSL 1.20 was simply ignored by the compiler. This adds support for setting all variables invariant. In GLSL 1.10 and GLSL ES 1.00 the pragma is ignored, per the specs, but a warning is generated. Fixes piglit test glsl-invariant-pragma and bugzilla #31925. NOTE: This is a candidate for the 7.9 and 7.10 branches.
* glsl: Allow less restrictive uses of sampler array indexing in GLSL <= 1.20Ian Romanick2011-01-061-4/+24
| | | | | | | | | | | | | | | | | | GLSL 1.10 and 1.20 allow any sort of sampler array indexing. Restrictions were added in GLSL 1.30. Commit f0f2ec4d added support for the 1.30 restrictions, but it broke some valid 1.10/1.20 shaders. This changes the error to a warning in GLSL 1.10, GLSL 1.20, and GLSL ES 1.00. There are some spurious whitespace changes in this commit. I changed the layout (and wording) of the error message so that all three cases would be similar. The 1.10/1.20 and 1.30 text is the same. The only difference is that one is an error, and the other is a warning. The GLSL ES 1.00 wording is similar but not quite the same. Fixes piglit test spec/glsl-1.10/compiler/constant-expressions/sampler-array-index-02.frag and bugzilla #32374.
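The version-dependent diagnostic can be sketched like this. The enum, function name and parameters are hypothetical and not taken from the GLSL compiler; the sketch also assumes that the 1.30 restriction concerns indexing a sampler array with a non-constant expression, which the message itself does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

enum severity { SEV_NONE, SEV_WARNING, SEV_ERROR };

/* Pick the diagnostic level for a sampler array index expression.
 * glsl_version uses the usual 110/120/130 numbering; GLSL ES 1.00 is
 * passed as 100 with is_es set. */
static enum severity
sampler_array_index_severity(unsigned glsl_version, bool is_es,
                             bool index_is_constant)
{
    assert(glsl_version >= 100);

    if (index_is_constant)
        return SEV_NONE;

    /* GLSL 1.10, GLSL 1.20 and GLSL ES 1.00: warn only. */
    if (is_es || glsl_version <= 120)
        return SEV_WARNING;

    /* GLSL 1.30 and later: hard error. */
    return SEV_ERROR;
}
```

Keeping the decision in one function matches the message's point that the 1.10/1.20 and 1.30 cases share their text and differ only in severity.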
* r300g: fix corruption when nr_cbufs==0 and multiwrites enabledMarek Olšák2011-01-061-1/+2
| | | | https://bugs.freedesktop.org/show_bug.cgi?id=32634
* r300g: remove the buffer range checkingMarek Olšák2011-01-062-60/+1
| | | | | | It's no longer needed because the upload buffer remains mapped while the CS is being filled (openarena, ut2004 and others that this code was for do not use VBOs by default).
* r300g: skip buffer validation of upload buffers when appropriateMarek Olšák2011-01-065-8/+36
| | | | because the upload buffers are reused for subsequent draw operations.
* util: add comments to u_upload_mgr and u_inlinesMarek Olšák2011-01-063-15/+38
|
* vbo: remove a redundant call to _ae_invalidate_stateMarek Olšák2011-01-061-1/+0
| | | | It's called in vbo_exec_invalidate_state too.
* st/mesa: remove unused members in st_contextMarek Olšák2011-01-061-9/+0
| | | | What were these for?
* tgsi: remove redundant name tables from tgsi_text, use those from tgsi_dumpMarek Olšák2011-01-063-56/+33
| | | | | I also specified the array sizes in the header so that one can use the Elements macro on it.
* gallium: drivers should reference vertex buffersMarek Olšák2011-01-0617-52/+89
| | | | So that a state tracker can unreference them after set_vertex_buffers.
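The ownership rule can be illustrated with a minimal reference-counting sketch. The `resource` and `driver_ctx` types and both functions are hypothetical stand-ins, not the Gallium interfaces, and destruction is modelled with a flag instead of freeing memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted resource. */
struct resource {
    int refcount;
    bool destroyed;
};

/* Make *dst point at src, taking a reference on src and dropping the
 * reference previously held through *dst. */
static void resource_reference(struct resource **dst, struct resource *src)
{
    struct resource *old = *dst;

    if (src)
        src->refcount++;
    if (old) {
        assert(old->refcount > 0);
        if (--old->refcount == 0)
            old->destroyed = true; /* real code would free the resource */
    }
    *dst = src;
}

struct driver_ctx {
    struct resource *vertex_buffer;
};

/* The driver keeps its own reference to the bound vertex buffer. */
static void driver_set_vertex_buffer(struct driver_ctx *ctx,
                                     struct resource *buf)
{
    resource_reference(&ctx->vertex_buffer, buf);
}
```

Because the driver holds its own reference, the state tracker can drop its reference right after binding and the buffer stays alive until the driver unbinds it.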
* st/mesa: optimize constant buffer uploadsMarek Olšák2011-01-064-34/+20
| | | | | | | | | | | | | The overhead of resource_create, transfer_inline_write, and resource_destroy to upload constant data is very visible with some apps in sysprof, and as such should be eliminated. My approach uses a user buffer to pass a pointer to a driver. This gives the driver the freedom it needs to take the fast path, which may differ for each driver. This commit addresses the same issue as Jakob's one that suballocates out of a big constant buffer, but it also eliminates the copy to the buffer.
* st/mesa: do sanity checks on states only in debug buildsMarek Olšák2011-01-061-0/+4
|
* u_upload_mgr: new featuresMarek Olšák2011-01-066-36/+115
| - Added a parameter to specify a minimum offset that should be returned. r300g needs this to better implement user buffer uploads. This weird requirement comes from the fact that the Radeon DRM doesn't support negative offsets.
| - Added a parameter to notify a driver that the upload flush occurred. A driver may skip buffer validation if there was no flush, resulting in better performance.
| - Added a new upload function that returns a pointer to the upload buffer directly, so that the buffer can be filled e.g. by the translate module.
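The three features listed above might fit together roughly as in the sketch below. It is not the u_upload_mgr interface: the `upload_mgr` struct, the function names and signatures are assumptions, and a plain array stands in for a mapped GPU buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define UPLOAD_SIZE 4096

/* Hypothetical upload manager. */
struct upload_mgr {
    unsigned char storage[UPLOAD_SIZE]; /* stands in for the mapped buffer */
    size_t offset;                      /* next free byte */
};

/* Reserve size bytes at an offset >= min_out_offset and return a pointer
 * the caller can fill directly. *flushed reports whether a fresh buffer
 * had to be started. Returns NULL if the request can never fit. */
static void *upload_alloc(struct upload_mgr *u, size_t min_out_offset,
                          size_t size, size_t *out_offset, bool *flushed)
{
    size_t start = u->offset > min_out_offset ? u->offset : min_out_offset;

    *flushed = false;
    if (start > UPLOAD_SIZE || size > UPLOAD_SIZE - start) {
        /* Out of space: flush and start again in an empty buffer. */
        u->offset = 0;
        *flushed = true;
        start = min_out_offset;
        if (start > UPLOAD_SIZE || size > UPLOAD_SIZE - start)
            return NULL;
    }

    u->offset = start + size;
    *out_offset = start;
    return u->storage + start;
}

/* Copying variant built on top of upload_alloc. */
static bool upload_data(struct upload_mgr *u, size_t min_out_offset,
                        const void *data, size_t size,
                        size_t *out_offset, bool *flushed)
{
    void *dst = upload_alloc(u, min_out_offset, size, out_offset, flushed);

    if (!dst)
        return false;
    memcpy(dst, data, size);
    return true;
}
```

A driver following the second bullet would re-run buffer validation only when `*flushed` comes back true, and a caller such as a vertex translation step could write straight into the pointer returned by upload_alloc instead of staging the data first.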
* u_upload_mgr: keep the upload buffer mapped until it is flushedMarek Olšák2011-01-061-52/+14
| | | | | | | The map/unmap overhead can be significant even though there is no waiting on busy buffers. There is simply a huge number of uploads. This is a performance optimization for Torcs, a car racing game.
* mesa: fix build for NetBSDPierre Allegraud2011-01-063-25/+8
| | | | | | | | See http://bugs.freedesktop.org/show_bug.cgi?id=32859 NOTE: This is a candidate for the 7.9 and 7.10 branches. Signed-off-by: Brian Paul <[email protected]>
* glext: upgrade to version 67Brian Paul2011-01-061-2/+12
|
* mesa: Clean up header file inclusion in version.c.Vinson Lee2011-01-061-1/+1
| | | | | Include imports.h directly instead of indirectly through context.h. version.c does not use any symbols that are added by context.h.