path: root/src/gallium
Commit message | Author | Age | Files | Lines
* radeonsi: generate register and packet tables for an IB parser from sid.h | Marek Olšák | 2015-08-26 | 4 | -0/+190
  This makes writing a good IB parser a lot easier. It generates 2 tables:
  - packet3 table
  - register table with all registers, fields, and named values

  Acked-by: Christian König
  Acked-by: Alex Deucher
* radeonsi: remove duplicated register definitions and instruction definitions | Marek Olšák | 2015-08-26 | 1 | -3160/+0
  Instruction encoding isn't needed in Mesa.
  The border color address registers were duplicated.

  Acked-by: Christian König
  Acked-by: Alex Deucher
* r600g,radeonsi: remove unused ill-formed register field definitions | Marek Olšák | 2015-08-26 | 2 | -2/+0
  Acked-by: Christian König
  Acked-by: Alex Deucher
* radeonsi: add an initial dump_debug_state implementation dumping shaders | Marek Olšák | 2015-08-26 | 4 | -0/+64
  This is usually called after a draw call.

  Acked-by: Christian König
  Acked-by: Alex Deucher
* radeonsi: allow si_dump_key to write to a file | Marek Olšák | 2015-08-26 | 2 | -18/+19
  Acked-by: Christian König
  Acked-by: Alex Deucher
* gallium/ddebug: new pipe for hang detection and driver state dumping (v2) | Marek Olšák | 2015-08-26 | 10 | -0/+2134
  v2: lots of improvements

  This is like identity or trace, but simpler. It doesn't wrap most states.

  Run with:
    GALLIUM_DDEBUG=1000 [executable]

  where "executable" is the app and "1000" is in milliseconds, meaning that the context will be considered hung if a fence fails to signal in 1000 ms. If that happens, all shaders, context states, bound resources, draw parameters, and driver debug information (if any) will be dumped into:
    /home/$username/dd_dumps/$processname_$pid_$index.

  Note that the context is flushed after every draw/clear/copy/blit operation and then waited for to find the exact call that hangs.

  You can also do:
    GALLIUM_DDEBUG=always
  to do the dumping after every draw/clear/copy/blit operation without flushing and waiting.

  Examples of driver states that can be dumped are:
  - Hardware status registers saying which hw block is busy (hung).
  - Disassembled shaders in a human-readable form.
  - The last submitted command buffer in a human-readable form.

  v2: drop pipe-loader changes, drop SConscript
      rename dd.h -> dd_pipe.h

  Acked-by: Christian König
  Acked-by: Alex Deucher
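  The two forms of `GALLIUM_DDEBUG` described in the ddebug commit above (a timeout in milliseconds, or the word `always`) could be distinguished roughly as follows. This is only a sketch: every identifier prefixed with `sketch_`/`SKETCH_` is hypothetical and is not taken from the ddebug sources, and the real parsing may differ.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; not the identifiers used by ddebug. */
enum sketch_dd_mode {
    SKETCH_DD_OFF,          /* variable unset or not understood */
    SKETCH_DD_DETECT_HANGS, /* flush and wait on a fence after each call */
    SKETCH_DD_DUMP_ALWAYS   /* dump after each call, no flush or wait */
};

struct sketch_dd_config {
    enum sketch_dd_mode mode;
    unsigned timeout_ms;    /* only meaningful for SKETCH_DD_DETECT_HANGS */
};

/* Interpret the value of GALLIUM_DDEBUG as the commit message describes. */
static struct sketch_dd_config sketch_dd_parse(const char *value)
{
    struct sketch_dd_config cfg = { SKETCH_DD_OFF, 0 };
    char *end = NULL;
    unsigned long ms;

    if (value == NULL)
        return cfg;

    if (strcmp(value, "always") == 0) {
        cfg.mode = SKETCH_DD_DUMP_ALWAYS;
        return cfg;
    }

    /* Assumption: accept only a plain positive decimal number. */
    if (value[0] < '0' || value[0] > '9')
        return cfg;

    ms = strtoul(value, &end, 10);
    if (*end != '\0' || ms == 0 || ms > UINT_MAX)
        return cfg;

    cfg.mode = SKETCH_DD_DETECT_HANGS;
    cfg.timeout_ms = (unsigned)ms;
    return cfg;
}
```

  A caller would pass in `getenv("GALLIUM_DDEBUG")`; an unset variable yields the "off" configuration.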
* gallium: add flags parameter to pipe_screen::context_create | Marek Olšák | 2015-08-26 | 56 | -64/+95
  This allows creating compute-only and debug contexts.

  Reviewed-by: Brian Paul
  Acked-by: Christian König
  Acked-by: Alex Deucher
* gallium: add an interface for dumping debug driver state | Marek Olšák | 2015-08-26 | 2 | -0/+17
  Reviewed-by: Brian Paul
  Acked-by: Christian König
  Acked-by: Alex Deucher
* radeonsi: mark unreachable paths to avoid warnings | Grazvydas Ignotas | 2015-08-26 | 2 | -3/+3
  Otherwise we get:
    warning: 'num_user_sgprs' may be used uninitialized in this function
  ...

  Reviewed-by: Michel Dänzer
  Signed-off-by: Marek Olšák
* gallium/auxiliary: optimize rgb9e5 helper some more | Roland Scheidegger | 2015-08-26 | 1 | -45/+42
  I used this as some testing ground for investigating some compiler bits initially (e.g. lrint calls etc.), figured I could do much better in the end just for fun...
  This is mathematically equivalent, but uses some tricks to avoid doubles and also replaces some float math with ints. Good for another performance doubling or so. As a side note, some quick tests show that llvm's loop vectorizer would be able to properly vectorize this version (which it failed to do earlier due to doubles, producing a mess), giving another 3 times performance increase with sse2 (more with sse4.1), but this may not apply to mesa.
  No piglit change.

  Acked-by: Marek Olšák
* gallium/auxiliary: optimize rgb9e5 helper a bit | Roland Scheidegger | 2015-08-26 | 1 | -18/+17
  This code (lifted straight from the extension) was doing things the most inefficient way you could think of.
  This drops some of the more expensive float operations, in particular
  - int-cast floors (pointless, values always positive)
  - 2 raised to (signed) integers (replace with simple exponent manipulation), getting rid of a misguided comment in the process (implement with table...)
  - float division (replace with multiplication by the inverse of those exponents)

  This is like 3 times faster (measured for float3_to_rgb9e5), though it depends (e.g. llvm is clever enough to replace exp2 with ldexp whereas gcc is not, division is not too bad on cpus with early-exit divs).
  Note that this keeps the double math for now (float x + 0.5), as the results may otherwise differ.

  Acked-by: Marek Olšák
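  The "simple exponent manipulation" mentioned in the rgb9e5 commit above can be illustrated with IEEE 754 single precision, where a power of two is just a biased exponent with a zero mantissa. The sketch below is not the code from the Mesa helper; `sketch_pow2` is a hypothetical name, and it assumes a 32-bit IEEE 754 `float` and an exponent in the normal range.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build 2^n directly from the float bit pattern instead of calling exp2/pow.
 * Assumption: float is IEEE 754 binary32 and -126 <= n <= 127 (normal
 * numbers only; no denormals, infinities or range checking).
 */
static float sketch_pow2(int n)
{
    /* Biased exponent goes into bits 30..23; sign and mantissa stay 0. */
    uint32_t bits = (uint32_t)(n + 127) << 23;
    float result;

    /* memcpy avoids the aliasing problems of a pointer cast. */
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

  With such a helper, a division `x / 2^n` can be written as the multiplication `x * sketch_pow2(-n)`, which is exact as long as the result stays in the normal range.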
* gallium/ttn: Use nir_builder_insert() rather than poking at cf_list. | Kenneth Graunke | 2015-08-25 | 1 | -16/+16
  I intend to remove nir_builder::cf_node_list, so I can't have this code poking at it directly. The proper way is to set the insertion point and then simply insert things there.

  Signed-off-by: Kenneth Graunke
  Reviewed-by: Eric Anholt
* nir: Store gl_shader_stage in nir_shader. | Kenneth Graunke | 2015-08-25 | 1 | -4/+21
  This makes it easy for NIR passes to inspect what kind of shader they're operating on.

  Thanks to Michel Dänzer for helping me figure out where TGSI stores the shader stage information.

  Signed-off-by: Kenneth Graunke
  Reviewed-by: Jason Ekstrand
  Reviewed-by: Eric Anholt
* freedreno/ir3: fix compile break after splitting out nir_control_flow.h | Rob Clark | 2015-08-25 | 1 | -0/+1
  The commit:

    commit b49371b8ede380f10ea3ab333246a3b01ac6aca5
    Author: Connor Abbott
    AuthorDate: Tue Jul 21 19:54:18 2015 -0700

        nir: move control flow modification to its own file

  split out some control flow related APIs into a separate header, but did not update drivers.

  Signed-off-by: Rob Clark
* freedreno/ir3: fix compile break after fxn->start_block removal | Rob Clark | 2015-08-25 | 1 | -1/+1
  The commit:

    commit 8e0d4ef3410ea07d9621df3e083bc3e7c1ad2ab0
    Author: Kenneth Graunke
    AuthorDate: Thu Aug 6 18:18:40 2015 -0700

        nir: Delete the nir_function_impl::start_block field.

  removed the start_block field without fixing up drivers.

  Signed-off-by: Rob Clark
* nir: move control flow modification to its own file | Connor Abbott | 2015-08-24 | 1 | -0/+1
  We want to start reworking and expanding this code, but it'll be a lot easier to do once we disentangle it from the rest of the stuff in nir.c. Unfortunately, there are a few unavoidable dependencies in nir.c on methods we'd rather not expose publicly, since if not used in very specific situations they can cause Bad Things (tm) to happen. Namely, we need to do some magical control flow munging when adding/removing jumps. In the future, we may disallow adding/removing jumps in nir_instr_insert_*() and nir_instr_remove(), and use separate functions that are part of the control flow modification code, but for now we expose them and put them in a separate, private header.

  Signed-off-by: Connor Abbott
  Reviewed-by: Kenneth Graunke
* freedreno/a4xx: formats update | Rob Clark | 2015-08-24 | 1 | -5/+5
  Fixes glamor, which wants to use R8 integer textures.

  Signed-off-by: Rob Clark
* freedreno: update generated headers | Rob Clark | 2015-08-24 | 5 | -5/+8
  Signed-off-by: Rob Clark
* Revert "radeon/winsys: increase the IB size for VM" | Marek Olšák | 2015-08-23 | 4 | -17/+6
  This reverts commit 567394112d904096abff1d994ab952f475dfb444.

  It regressed performance. It looks like smaller IBs are better, because the GPU goes idle quicker and there is less waiting for buffers and fences.

  Cc: 11.0
* nv50: fix 2d engine blits for 64- and 128-bit formats | Ilia Mirkin | 2015-08-23 | 1 | -0/+4
  This fixes bin/ext_framebuffer_multisample-formats all_samples

  Signed-off-by: Ilia Mirkin
  Cc: "11.0"
* nv50: account for the int RT0 rule for alpha-to-one/cov | Ilia Mirkin | 2015-08-23 | 3 | -11/+23
  Same as commit 1af0641db but for nvc0. If an integer texture is bound to RT0, don't do alpha-to-one or alpha-to-coverage.

  Signed-off-by: Ilia Mirkin
  Cc: "11.0"
* nv50,nvc0: disable depth bounds test on blit | Ilia Mirkin | 2015-08-23 | 2 | -0/+3
  Signed-off-by: Ilia Mirkin
  Cc: "11.0"
* r600g: Fix assert in tgsi_cmp | Glenn Kennard | 2015-08-23 | 1 | -2/+2
  Fixes https://bugs.freedesktop.org/show_bug.cgi?id=91726

  Signed-off-by: Glenn Kennard
  Cc: "11.0"
  Signed-off-by: Dave Airlie
* nouveau: add codegen/unordered_set.h to the tarball | Emil Velikov | 2015-08-22 | 1 | -1/+2
  Signed-off-by: Emil Velikov
* winsys/sw/kms-dri: don't attempt to bundle the sconscript | Emil Velikov | 2015-08-22 | 1 | -2/+0
  The build file was removed by an earlier commit, but its EXTRA_DIST entry was forgotten.

  Fixes: 66d77cd71c6 (scons: don't build the kms-dri winsys)
  Signed-off-by: Emil Velikov
* winsys/amdgpu: automake: remove missing headers | Emil Velikov | 2015-08-22 | 1 | -2/+0
  The files are not referenced anywhere else in the whole of mesa. They are likely remnants of the early development stage.

  Signed-off-by: Emil Velikov
* android: enable amdgpu winsys in radeonsi driver | Mauro Rossi | 2015-08-22 | 1 | -2/+2
  Reviewed-by: Edward O'Callaghan
  Reviewed-by: Emil Velikov
* android: fix cflags and includes for amdgpu winsys | Mauro Rossi | 2015-08-22 | 1 | -0/+10
  Reviewed-by: Edward O'Callaghan
  Reviewed-by: Emil Velikov
* vc4: Actually allow math results to allocate into r4. | Eric Anholt | 2015-08-21 | 2 | -1/+7
  I switched us to tracking whether the results *could* go to r4, but then didn't make a separate register class for the class bits that included r4. Switch the "any" class to actually be "any", and name the "any but r4" class more appropriately.

  total instructions in shared programs: 96798 -> 94680 (-2.19%)
  instructions in affected programs:     62736 -> 60618 (-3.38%)
* vc4: Fold the 16-bit integer pack into the instructions generating it. | Eric Anholt | 2015-08-21 | 5 | -30/+22
  total instructions in shared programs: 97580 -> 96798 (-0.80%)
  instructions in affected programs:     52826 -> 52044 (-1.48%)
* vc4: Reuse QPU dumping for packing bits in QIR. | Eric Anholt | 2015-08-21 | 3 | -22/+26
* vc4: Make _dest variants of qir ALU helpers to provide an explicit dest. | Eric Anholt | 2015-08-21 | 2 | -4/+20
* vc4: Use the SSA defs list for figuring out eligible MOVs for copy prop. | Eric Anholt | 2015-08-21 | 1 | -12/+10
  I thought I'd converted this over previously. It was copy propagating MOVs badly with the new destination packing flags.
* st/nine: Always use user constant buffers | Krzysztof Sobiecki | 2015-08-21 | 1 | -1/+3
  We had several reports of users hitting bugs with the other path to upload constants, and switching to the user constant buffer path solves the bugs.

  User constant buffers are expected to be slower for Nvidia cards, so ideally this patch should be reverted when the path is fixed.

  Reviewed-by: Axel Davy
  Signed-off-by: Krzysztof Sobiecki
* st/nine: Silent warning in nine_ff | Axel Davy | 2015-08-21 | 1 | -0/+2
  The release build was complaining.

  Signed-off-by: Axel Davy
* st/nine: Silent warning in sm1_declusage_to_tgsi | Axel Davy | 2015-08-21 | 1 | -1/+1
  The release build was complaining.

  Signed-off-by: Axel Davy
* st/nine: Silent warning in NineCubeTexture9_ctor | Axel Davy | 2015-08-21 | 1 | -1/+1
  The compiler was complaining that the value may be uninitialised when it is used (which is wrong). Initialize to NULL to silence the warning.
* st/nine: Silent warning in update_vertex_buffer | Axel Davy | 2015-08-21 | 1 | -1/+0
  There was an unused variable.
* st/nine: Catch setting the same shader | Axel Davy | 2015-08-21 | 1 | -0/+6
  It is quite rare for an app to set the same shaders again, but it isn't an expensive check either.

  Signed-off-by: Axel Davy
* st/nine: Avoid Constant upload when there is no change | Axel Davy | 2015-08-21 | 1 | -0/+42
  It is very common for d3d9 apps to set again the constants they need before every draw call, even if nothing changed.

  Since we are mostly gpu bound, it is better to check for change, and upload constants again (and thus use gpu bandwidth) only if the constants changed.

  Signed-off-by: Axel Davy
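  The check-before-upload idea from the commit above can be sketched as a compare against the cached copy, marking the state dirty only when something really changed. This is an illustration, not the st/nine code: the struct, the function and the `SKETCH_MAX_CONSTS` limit are all hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_CONSTS 256 /* hypothetical limit, in vec4 registers */

/* Hypothetical CPU-side shadow copy of the float shader constants. */
struct sketch_const_state {
    float consts[SKETCH_MAX_CONSTS][4];
    bool dirty; /* true when the constants must be uploaded again */
};

/*
 * Store `count` vec4 constants starting at register `start`.
 * Returns true only if the cached values changed, so the (comparatively
 * expensive) upload can be skipped for redundant calls.
 */
static bool sketch_set_constants(struct sketch_const_state *state,
                                 unsigned start, const float *data,
                                 unsigned count)
{
    size_t size = (size_t)count * 4 * sizeof(float);

    if (start > SKETCH_MAX_CONSTS || count > SKETCH_MAX_CONSTS - start)
        return false; /* out of range: reject instead of writing */

    /* Bitwise comparison: cheap, and good enough to detect "no change". */
    if (memcmp(state->consts[start], data, size) == 0)
        return false;

    memcpy(state->consts[start], data, size);
    state->dirty = true;
    return true;
}
```

  Because the comparison is bitwise, `0.0f` and `-0.0f` count as different values; that only costs an occasional unnecessary upload and never skips a needed one.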
* st/nine: Fix the number of texture stages | Axel Davy | 2015-08-21 | 2 | -4/+6
  The number of texture stages is 8.

  'tex_stage' array was too big, and thus the checks with 'Elements(state->ff.tex_stage)' were passing, causing some invalid API calls to pass, and crash because of out of bounds write since bumpmap_vars was just the correct size.

  Signed-off-by: Axel Davy
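  The bug class described in the commit above (a bounds check derived from an oversized array, guarding a write into a correctly sized one) can be avoided by sizing both arrays from one constant. The sketch below is hypothetical throughout: the `sketch_` names, the per-stage state count and the field layout are assumptions, and only the stage count of 8 comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_TEXTURE_STAGES 8 /* from the commit message */
#define SKETCH_STATES_PER_STAGE 4   /* hypothetical */
#define SKETCH_ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

/* Both per-stage arrays share one size constant, so a check against
 * either array is also valid for the other. */
struct sketch_ff_state {
    unsigned tex_stage[SKETCH_NUM_TEXTURE_STAGES][SKETCH_STATES_PER_STAGE];
    float bumpmap_vars[SKETCH_NUM_TEXTURE_STAGES];
};

/* Rejects invalid stage/type indices instead of writing out of bounds. */
static bool sketch_set_texture_stage_state(struct sketch_ff_state *ff,
                                           unsigned stage, unsigned type,
                                           unsigned value)
{
    if (stage >= SKETCH_ELEMENTS(ff->tex_stage) ||
        type >= SKETCH_STATES_PER_STAGE)
        return false;

    ff->tex_stage[stage][type] = value;
    ff->bumpmap_vars[stage] = (float)value; /* same index, same bound */
    return true;
}
```

  With this layout, stage 7 is accepted and stage 8 is rejected before either array is touched.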
* st/nine: Use CSO cache for sampler views | Axel Davy | 2015-08-21 | 4 | -23/+5
  The CSO cache unbinds views that are not needed anymore, which we don't do. It checks for change before committing the views.

  Signed-off-by: Axel Davy
* st/nine: Calculate dummy sampler state only once | Axel Davy | 2015-08-21 | 3 | -35/+24
  Signed-off-by: Axel Davy
* st/nine: Better check shader constant limits | Axel Davy | 2015-08-21 | 1 | -9/+27
  Signed-off-by: Axel Davy
* st/nine: Remove NINED3DRS_ZBIASSCALE | Axel Davy | 2015-08-21 | 3 | -23/+12
  It wasn't giving the expected result.

  This fixes some objects being transparent in games like FEAR.

  Signed-off-by: Axel Davy
* st/nine: Implement special DOTPRODUCT3 behaviour | Axel Davy | 2015-08-21 | 1 | -0/+4
  Taken from wine tests

  Signed-off-by: Axel Davy
* st/nine: Implement ff vertex data passthrough | Axel Davy | 2015-08-21 | 1 | -7/+61
  Fixes Wine tests

  Signed-off-by: Axel Davy
* st/nine: Change nine_state_update order | Axel Davy | 2015-08-21 | 1 | -63/+76
  nine_update_state is called every draw call. This patch attempts to change the order of the checks to have better control flow.

  Signed-off-by: Axel Davy
* st/nine: Programmable ps D3DTTSS_PROJECTED support | Axel Davy | 2015-08-21 | 7 | -8/+74
  The implementation used Wine tests for conformance

  Signed-off-by: Axel Davy
* st/nine: Complete ff texture transform implementation | Axel Davy | 2015-08-21 | 3 | -70/+174
  Wine tests were used to get it right.

  Signed-off-by: Axel Davy