path: root/src
Commit message | Author | Age | Files | Lines
* mesa: remove duplicated init of MaxViewports | Maxence Le Doré | 2014-02-09 | 1 | -3/+0
    Already declared 5 lines before.
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* gallium: add geometry shader output limits | Grigori Goronzy | 2014-02-09 | 15 | -1/+54
    v2: adjust limits for radeonsi and llvmpipe
    v3: add documentation
    Cc: "10.1" <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* mesa: Removed unnecessary check for NULL pointer when freeing memory | Siavash Eliasi | 2014-02-09 | 1 | -2/+1
    Note that it is OK to pass NULL pointers to this function since this commit:
    mesa: modified _mesa_align_free() to accept NULL pointer
    http://cgit.freedesktop.org/mesa/mesa/commit/?id=f0cc59d68a9f5231e8e2111393a1834858820735
    Reviewed-by: Marek Olšák <[email protected]>
* nv30: report 8 maximum inputs | Ilia Mirkin | 2014-02-08 | 1 | -1/+1
    nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for nv40/nv30. This fixes compilation of the varying-packing tests.
    Furthermore it appears that the last 2 inputs on nv4x don't seem to work in those tests, so just report 8 everywhere for now.
    Tested on NV42, NV44. NV4B appears to have additional problems.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 9.1 9.2 10.0 10.1 <[email protected]>
* nv50/ir/ra: some register spilling fixes | Christoph Bumiller | 2014-02-09 | 1 | -5/+34
    Cc: 10.1 <[email protected]>
* mesa: update assertion in detach_shader() for geom shaders | Brian Paul | 2014-02-08 | 1 | -0/+1
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74723
    Cc: "10.0" "10.1" <[email protected]>
    Tested-by: Andreas Boll <[email protected]>
* mesa: allocate gl_debug_state on demand | Brian Paul | 2014-02-08 | 9 | -153/+274
    We don't need to allocate all the state related to GL_ARB_debug_output until some aspect of that extension is actually needed.
    The sizeof(gl_debug_state) is huge (~285KB on 64-bit systems), not even counting the 54(!) hash tables and lists that it contains.
    This change reduces the size of gl_context alone from 431KB to 145KB on 64-bit systems and from 277KB to 78KB on 32-bit systems.
    Reviewed-by: Kenneth Graunke <[email protected]>
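The on-demand allocation described in this commit can be sketched as follows. This is a minimal illustration, not Mesa's code: the names `struct context`, `struct debug_state`, `get_debug_state()` and `free_debug_state()` are hypothetical, and the payload size is only a stand-in for the ~285KB figure quoted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large, rarely used block of state. */
struct debug_state {
    char payload[285 * 1024];
};

/* Hypothetical context: it holds a pointer instead of embedding the state,
 * so an unused debug_state costs one pointer rather than ~285KB. */
struct context {
    struct debug_state *debug;
};

/* Return the debug state, allocating it (zeroed) on first use.
 * May return NULL if the allocation fails. */
static struct debug_state *get_debug_state(struct context *ctx)
{
    assert(ctx != NULL);
    if (ctx->debug == NULL)
        ctx->debug = calloc(1, sizeof *ctx->debug);
    return ctx->debug;
}

/* Release the state; safe to call even if it was never allocated. */
static void free_debug_state(struct context *ctx)
{
    assert(ctx != NULL);
    free(ctx->debug);
    ctx->debug = NULL;
}
```

Every code path that touches the state has to go through the accessor and cope with a NULL return, which is the price of keeping the context itself small.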
* mesa: trivial clean-ups in errors.c | Brian Paul | 2014-02-08 | 1 | -41/+84
    Whitespace changes, 78-column rewrapping, comment clean-ups, add some braces, etc.
    Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: remove _mesa_ prefix from some static functions | Brian Paul | 2014-02-08 | 1 | -27/+23
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Label JIP and UIP in Broadwell shader disassembly. | Kenneth Graunke | 2014-02-07 | 1 | -2/+6
    This makes it obvious which number is which.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Don't disassemble UIP field for Broadwell WHILE instructions. | Kenneth Graunke | 2014-02-07 | 1 | -2/+1
    The WHILE instruction doesn't have UIP. It only has JIP.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Don't print source registers for Broadwell flow control. | Kenneth Graunke | 2014-02-07 | 1 | -13/+14
    The bits which normally contain the source register descriptions actually contain the JIP/UIP jump targets, which we already printed.
    Interpreting JIP/UIP as source registers results in some really creepy looking output, like IF statements with acc14.4<0,1,0>UD sources.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix fast depth clear values on Broadwell. | Kenneth Graunke | 2014-02-07 | 1 | -1/+4
    Broadwell's 3DSTATE_CLEAR_PARAMS packet expects a floating point value regardless of format. This means we need to stop converting it to UNORM.
    Storing the value as float would make sense, but since we already have a uint32_t field, this patch continues shoehorning it into that. In a sense, this makes mt->depth_clear_value the DWord you emit in the packet, rather than the clear value itself.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
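The difference between converting a clear value to UNORM and storing the float's raw bits in a uint32_t can be sketched like this. Both helper names are hypothetical and the 24-bit UNORM width is an assumption chosen for illustration; neither is taken from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: scale a depth in [0, 1] to a 24-bit UNORM integer.
 * The arithmetic is done in double so that rounding near 1.0 stays exact. */
static uint32_t depth_clear_as_unorm24(float depth)
{
    assert(depth >= 0.0f && depth <= 1.0f);
    return (uint32_t)((double)depth * 16777215.0 + 0.5);
}

/* Hypothetical helper: keep the IEEE-754 bit pattern of the float in a
 * uint32_t, so the stored value is the DWord to emit, not a number. */
static uint32_t depth_clear_as_float_dword(float depth)
{
    uint32_t dword;
    memcpy(&dword, &depth, sizeof dword);
    return dword;
}
```

For a clear value of 1.0 the first helper yields 0x00FFFFFF and the second 0x3F800000, which shows why the same uint32_t field means different things depending on which convention filled it.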
* nvc0: handle TGSI_SEMANTIC_LAYER | Christoph Bumiller | 2014-02-07 | 5 | -5/+4
    Cc: 10.1 <[email protected]>
* nvc0: create the SW object | Christoph Bumiller | 2014-02-07 | 2 | -0/+10
    It's required for being able to use software methods now.
* nvc0/ir/emit: hardcode vertex output stream to 0 for now | Christoph Bumiller | 2014-02-07 | 1 | -2/+7
* i965: Enable ARB_texture_gather for one component on Gen6. | Chris Forbes | 2014-02-08 | 2 | -1/+3
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Emit shader w/a for Gen6 gather | Chris Forbes | 2014-02-08 | 2 | -0/+32
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Emit shader w/a for Gen6 gather | Chris Forbes | 2014-02-08 | 2 | -0/+35
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add surface format overrides for Gen6 gather | Chris Forbes | 2014-02-08 | 1 | -5/+32
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add Gen6 gather wa to sampler key | Chris Forbes | 2014-02-08 | 2 | -0/+32
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Optimize triop_csel with all-true or all-false. | Eric Anholt | 2014-02-07 | 1 | -0/+7
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize various cases of fma (aka MAD). | Eric Anholt | 2014-02-07 | 1 | -0/+13
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize lrp(x, x, coefficient) --> x. | Eric Anholt | 2014-02-07 | 1 | -0/+2
    total instructions in shared programs: 1627754 -> 1624534 (-0.20%)
    instructions in affected programs: 45748 -> 42528 (-7.04%)
    GAINED: 3
    LOST: 0
    (serious sam, humus domino demo)
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize pow(x, 1) -> x. | Eric Anholt | 2014-02-07 | 1 | -0/+4
    total instructions in shared programs: 1627826 -> 1627754 (-0.00%)
    instructions in affected programs: 6640 -> 6568 (-1.08%)
    GAINED: 0
    LOST: 0
    (HoN and savage2)
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize log(exp(x)) and exp(log(x)) into x. | Eric Anholt | 2014-02-07 | 1 | -0/+36
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize ~~x into x. | Eric Anholt | 2014-02-07 | 1 | -0/+5
    v2: Fix pasteo of an extra abs being inserted (caught by many). Rewrite to drop the silly switch statement.
    Reviewed-by: Matt Turner <[email protected]> (v1)
* i965: Add some informative debug when the X Server botches DRI2 GetBuffers. | Eric Anholt | 2014-02-07 | 1 | -1/+11
    We've had various bug reports over the years where miptrees are missing, and when I screwed it up while adding DRI2 to the modesetting driver, I figured I should put the info necessary for debug here.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove redundant check in blitter-based glBlitFramebuffer(). | Eric Anholt | 2014-02-07 | 1 | -10/+0
    The intel_miptree_blit() code checks the format for us now, plus it handles xrgb vs argb for us.
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix Gen8+ disassembly of half float subregister numbers. | Kenneth Graunke | 2014-02-07 | 1 | -0/+1
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* i965: Use the new brw_load_register_mem helper for draw indirect. | Kenneth Graunke | 2014-02-07 | 1 | -31/+22
    This makes it work on Broadwell, too.
    v2: Drop bogus double write to 3DPRIM_BASE_VERTEX register (caught by Chris Forbes).
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965: Implement a brw_load_register_mem helper function. | Kenneth Graunke | 2014-02-07 | 2 | -0/+32
    This saves some boilerplate and hides the OUT_RELOC/OUT_RELOC64 distinction.
    Placing the function in intel_batchbuffer.c is rather arbitrary; there wasn't really an obvious place for it.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
* i965: Fix INTEL_DEBUG=vs for fixed-function/ARB programs. | Kenneth Graunke | 2014-02-07 | 2 | -4/+4
    Since commit 9cee3ff562f3e4b51bfd30338fd1ba7716ac5737, INTEL_DEBUG=vs has caused a NULL pointer dereference for fixed-function/ARB programs.
    In the vec4 generators, "prog" is a gl_program, and "shader_prog" is the gl_shader_program. This is different than the FS visitor.
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Don't lose precision qualifiers when encountering "centroid". | Kenneth Graunke | 2014-02-07 | 1 | -1/+1
    Mesa fails to retain the precision qualifier when parsing:

        #version 300 es
        centroid in mediump vec2 v;

    Consider how the parser's type_qualifier production is applied. First, the precision_qualifier rule creates a new ast_type_qualifier:

        <precision: mediump>

    Then the storage_qualifier rule creates a second one:

        <flags: in>

    and calls merge_qualifier() to fold in any previous qualifications, returning:

        <flags: in, precision: mediump>

    Finally, the auxiliary_storage_qualifier creates one for "centroid":

        <flags: centroid>

    it then does $$ = $1 and $$.flags |= $2.flags, resulting in:

        <flags: centroid, in>

    Since precision isn't stored in the flags bitfield, it is lost. We need to instead call merge_qualifier to combine all the fields.
    Cc: [email protected]
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reported-by: Kevin Rogovin <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
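The loss described above can be reproduced with a small model of the qualifier record. This is a sketch under assumptions: `struct qualifier`, its two fields, and both combining functions are hypothetical stand-ins, not the real ast_type_qualifier or merge_qualifier().

```c
#include <assert.h>

/* Hypothetical flag bits and precision values, for illustration only. */
enum { Q_IN = 1u << 0, Q_CENTROID = 1u << 1 };
enum precision { PREC_NONE = 0, PREC_MEDIUMP = 1 };

/* Simplified model: a flags bitfield plus a separate precision field. */
struct qualifier {
    unsigned flags;
    enum precision precision;
};

/* Models "$$ = $1; $$.flags |= $2.flags": only the flags of the second
 * operand are carried over, so its precision is dropped. */
static struct qualifier combine_flags_only(struct qualifier first,
                                           struct qualifier second)
{
    struct qualifier out = first;
    out.flags |= second.flags;
    return out;
}

/* Models a merge that folds in every field, not just the flags. */
static struct qualifier merge_all_fields(struct qualifier first,
                                         struct qualifier second)
{
    struct qualifier out = first;
    out.flags |= second.flags;
    if (second.precision != PREC_NONE) {
        /* Two different precisions on one declaration would be an error. */
        assert(out.precision == PREC_NONE ||
               out.precision == second.precision);
        out.precision = second.precision;
    }
    return out;
}
```

With `first` set to the centroid qualifier and `second` to the already merged `in` plus `mediump` qualifier, the flags-only version returns no precision while the full merge keeps `mediump`.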
* st/mesa: avoid sw fallback for getting/decompressing textures | Brian Paul | 2014-02-07 | 1 | -1/+3
    If st_GetTexImage() is to decompress the texture, avoid the fallback path even if prefer_blit_based_texture_transfer = false.
    For drivers that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we were always taking the fallback path for texture decompression rather than rendering a quad. The latter is a lot faster.
    Cc: "10.0" "10.1" <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/tgsi: correct typo propagated from NV_vertex_program1_1 | Erik Faye-Lund | 2014-02-07 | 2 | -3/+3
    In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F800000 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation.
    Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead.
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
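The claim that 0x5F800000 is exactly 2^64 can be checked by reinterpreting the bit pattern as an IEEE-754 single: the exponent field is 0xBF = 191, and 191 - 127 = 64, with a zero mantissa. The sketch below does that check and shows one way a magnitude clamp to [2^-64, 2^64] could look. The helper names are hypothetical, the lower-limit pattern 0x1F800000 is derived from the 2^-64 statement above rather than quoted from the specification, and the clamp is an illustration, not the TGSI implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE-754 single-precision float. */
static float float_from_bits(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical clamp: keep the sign of r and force its magnitude into
 * [2^-64, 2^64], using the binary forms of the limits. */
static float clamp_reciprocal(float r)
{
    const float lo = float_from_bits(0x1F800000u); /* 2^-64 */
    const float hi = float_from_bits(0x5F800000u); /* 2^+64 */
    float mag = r < 0.0f ? -r : r;

    if (mag < lo)
        mag = lo;
    if (mag > hi)
        mag = hi;
    return r < 0.0f ? -mag : mag;
}
```

Since 2^64 = 18446744073709551616, the decimal form is about 1.84467e+19, which matches the binary value and not the 1.884467e+19 printed in the specification text.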
* gallium/tgsi: use CLAMP instead of open-coded clamps | Erik Faye-Lund | 2014-02-07 | 1 | -22/+4
    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* egl: Unhide functionality in _eglInitSurface() | Juha-Pekka Heikkila | 2014-02-07 | 1 | -1/+3
    _eglInitResource() was used to memset the entire _EGLSurface by writing more than the size of the pointed-to target. This works only as long as Resource is the first element in _EGLSurface; this patch removes that dependency.
    Signed-off-by: Juha-Pekka Heikkila <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* egl: Unhide functionality in _eglInitContext() | Juha-Pekka Heikkila | 2014-02-07 | 1 | -1/+2
    _eglInitResource() was used to memset the entire _EGLContext by writing more than the size of the pointed-to target. This works only as long as Resource is the first element in _EGLContext; this patch removes that dependency.
    Signed-off-by: Juha-Pekka Heikkila <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
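The hidden dependency described in the two egl commits can be sketched with a pair of simplified structs. Everything here is hypothetical: `struct resource`, `struct surface`, and the three functions are illustrative stand-ins and do not reproduce the real _EGLResource, _EGLSurface, or _eglInitResource() definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical base record embedded in larger objects. */
struct resource {
    void *display;
    int ref_count;
};

/* Hypothetical object that embeds the base record as its first member. */
struct surface {
    struct resource res;
    int width;
    int height;
};

/* Zero `size` bytes starting at res, then fill in the base fields. */
static void init_resource(struct resource *res, size_t size, void *display)
{
    assert(size >= sizeof *res);
    memset(res, 0, size);
    res->display = display;
    res->ref_count = 1;
}

/* Hidden dependency: passing the size of the whole surface clears it only
 * because res sits at offset 0. Moving the member would corrupt memory. */
static void init_surface_implicit(struct surface *surf, void *display)
{
    init_resource(&surf->res, sizeof *surf, display);
}

/* Explicit version: clear the whole object here, and ask init_resource()
 * to touch only the bytes that belong to the base record. */
static void init_surface_explicit(struct surface *surf, void *display)
{
    memset(surf, 0, sizeof *surf);
    init_resource(&surf->res, sizeof surf->res, display);
}
```

The explicit form does one extra memset over the base record, but it stays correct wherever `res` is placed inside the struct.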
* glx: Add missing null check in __glX_send_client_info() | Juha-Pekka Heikkila | 2014-02-07 | 1 | -0/+4
    Signed-off-by: Juha-Pekka Heikkila <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* i965: Add missing null check in fs_visitor::dead_code_eliminate_local() | Juha-Pekka Heikkila | 2014-02-07 | 1 | -0/+4
    Signed-off-by: Juha-Pekka Heikkila <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* glx: Add some missing null checks in glx_pbuffer.c | Juha-Pekka Heikkila | 2014-02-07 | 1 | -4/+15
    Signed-off-by: Juha-Pekka Heikkila <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* glsl: Fix null access on file read error | Juha-Pekka Heikkila | 2014-02-07 | 1 | -1/+2
    Signed-off-by: Juha-Pekka Heikkila <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* glx: Add missing null check in __glXCloseDisplay | Juha-Pekka Heikkila | 2014-02-07 | 1 | -1/+2
    Signed-off-by: Juha-Pekka Heikkila <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* glx: Add missing null checks in glxcmds.c | Juha-Pekka Heikkila | 2014-02-07 | 1 | -8/+20
    Signed-off-by: Juha-Pekka Heikkila <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* main/get: support ARB_gpu_shader5 | Jordan Justen | 2014-02-06 | 5 | -1/+26
    If a driver enables ARB_gpu_shader5 and sets Const.MaxVertexStreams >= 4, then piglit's arb_gpu_shader5-minmax test should now pass.
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glapi: add definitions for ARB_gpu_shader5 | Jordan Justen | 2014-02-06 | 2 | -0/+17
    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nouveau/codegen: allow tex offsets on non-TXF instructions (e.g. TXL) | Ilia Mirkin | 2014-02-06 | 1 | -0/+8
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Christoph Bumiller <[email protected]>
* nv50: only over-allocate by a page for code | Ilia Mirkin | 2014-02-06 | 1 | -4/+5
    The pre-fetching doesn't go too far. Tested with over-allocating by only a page, and didn't see any errors in dmesg. Saves ~512KB of VRAM.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: 10.1 <[email protected]>
    Reviewed-by: Christoph Bumiller <[email protected]>
* nv50: fix layerid to be the fp input number rather than vp output number | Ilia Mirkin | 2014-02-06 | 3 | -7/+9
    In the tests they were the same so it didn't matter, but indications are that this is the correct behaviour.
    Also take this opportunity to (trivially) support using gl_Layer in fp.
    Cc: 10.1 <[email protected]>
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Christoph Bumiller <[email protected]>