summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* main/cs: Add gl_context::ComputeProgramPaul Berry2014-09-011-0/+15
| | | | Reviewed-by: Jordan Justen <[email protected]>
* mesa: Convert NewDriverState to 64-bitsJordan Justen2014-09-016-13/+25
| | | | | | | i965 will have more than 32 bits when BRW_STATE_COMPUTE_PROGRAM is added. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965: Modify state upload to allow 2 different sets of state atoms.Paul Berry2014-09-012-25/+32
| | | | | | | The set of state atoms for compute shaders is currently empty; it will be filled in by future patches. Reviewed-by: Jordan Justen <[email protected]>
* i965: Modify dirty bit handling to support 2 pipelines.Paul Berry2014-09-014-16/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The hardware state for compute shaders is almost entirely orthogonal to the hardware state for 3D rendering. To avoid sending unnecessary state to the hardware, we'll need to have a separate set of state atoms for the compute pipeline and the 3D pipeline. That means we need to maintain two separate sets of dirty bits to determine which state atoms need to be run. But the dirty bits are not completely independent; for example, if BRW_NEW_SURFACES is flagged while doing 3D rendering, then not only do we need to re-run 3D state atoms that depend on BRW_NEW_SURFACES, but we also need to re-run compute state atoms that depend on BRW_NEW_SURFACES. But we'll also need to re-run those state atoms the next time the compute pipeline is run. To accomplish this, we record two sets of dirty bits, one for each pipeline. When bits are dirtied (via SET_DIRTY_BIT() or SET_DIRTY_ALL()) we set them to the dirty state in both pipelines. When brw_state_upload() is run, we clear the dirty bits just for the pipeline that was run. Note that since the number of pipelines is known at compile time to be 2, the compiler should unroll the loops in SET_DIRTY_BIT() and SET_DIRTY_ALL(). Reviewed-by: Jordan Justen <[email protected]>
* i965: Create a macro for checking a dirty bit.Paul Berry2014-09-012-1/+7
| | | | | | | This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <[email protected]>
* i965: Create a macro for setting all dirty bits.Paul Berry2014-09-014-7/+18
| | | | | | | This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <[email protected]>
* i965: Create a macro for setting a dirty bit.Paul Berry2014-09-0132-67/+74
| | | | | | | This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <[email protected]>
* i965: add missing parens in vec4 visitorDave Airlie2014-09-021-1/+2
| | | | | | | | | coverity reported this, Matt said it look like missing parens, not bad identing, so lets try that. Cc: "10.2 10.3" <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nouveau: don't leak dec struct on errorDave Airlie2014-09-021-1/+1
| | | | | | | This one path doesn't goto fail, so it seems to leak dec. Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* xvmc/tests: %C isn't a valid printf specifier.Dave Airlie2014-09-021-2/+2
| | | | | | Reported-by: Coverity scanner. Signed-off-by: Dave Airlie <[email protected]>
* nouveau/nv40: quiten coverity warning in unused vertex texture code.Dave Airlie2014-09-021-0/+1
| | | | | | This fixes the code, but we never run it anyways, so silence coverity. Signed-off-by: Dave Airlie <[email protected]>
* nv50: remove unused variablesIlia Mirkin2014-09-012-2/+0
| | | | | | Recent code changes have caused these to no longer be used. Remove them. Signed-off-by: Ilia Mirkin <[email protected]>
* mesa: force height of 1D textures to be 1 in texture viewsIlia Mirkin2014-09-011-0/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* nv50: attach the buffer bo to the miptree structuresIlia Mirkin2014-09-011-8/+5
| | | | | | | | | The current code... makes no sense. Use nouveau_bo_ref to attach the bo to the exposed resource so as to have the proper lifetime guarantees. Tested-by: Emil Velikov <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* nv50: mt address may not be the underlying bo's start addressIlia Mirkin2014-09-013-12/+14
| | | | | | | | | | | | | | With VP2, nv50_miptree is faked because the underlying bo's have to be laid out in a certain way. This is done by adjusting the address. Make sure that blits (and everything else for consistency) use the mt address rather than the bo address as a base. This fixes retrieving chroma plane with VDPAU. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82255 Tested-by: Emil Velikov <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* nv50: set the miptree address when clearing bo's in vp2 initIlia Mirkin2014-09-011-0/+2
| | | | | | | | | | The mt address is about to be used more, make sure it's set appropriately. Reported-by: Emil Velikov <[email protected]> Tested-by: Emil Velikov <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* nv50/ir: avoid creating instructions that can't be emittedIlia Mirkin2014-09-011-0/+4
| | | | | | | | | | | When constant folding a MAD operation, we first fold the multiply and generate an ADD. However we do so without making sure that the immediate can be handled in the saturate case. If it can't, load the immediate in a separate instruction. Reported-by: Tiziano Bacocco <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* nvc0: don't make 1d staging textures linearIlia Mirkin2014-09-011-1/+0
| | | | | | | | | | | Experimentally, the sampler doesn't appear to like these, neither as buffer nor as rect textures. So remove 1D from the list of texture types to make linear when used for staging. This fixes the OSD in mplayer for VDPAU. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* nv50: zero out unbound samplersIlia Mirkin2014-09-011-2/+5
| | | | | | | | | Samplers are only defined up to num_samplers, so set all samplers above nr to NULL so that we don't try to read them again later. Tested-by: Christian Ruppert <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* nvc0/ir: avoid infinite recursion when finding first uses of texIlia Mirkin2014-09-012-8/+29
| | | | | | | | | | | In certain circumstances, findFirstUses could end up doubling back on instructions it had already processed, resulting in an infinite recursion. Avoid this by keeping track of already-visited instructions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079 Tested-by: Tobias Klausmann <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* freedreno/ir3: add DDX/DDYRob Clark2014-09-011-4/+53
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: don't keep IR aroundRob Clark2014-09-011-1/+6
| | | | | | | Once we've assembled the shader, no need to keep the intermediate around. Signed-off-by: Rob Clark <[email protected]>
* i965/fs: Don't segfault when debug-logging a null programJason Ekstrand2014-09-011-2/+2
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Don't segfault when debug-logging a null programJason Ekstrand2014-09-011-2/+2
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radeonsi: implement EXPCLEAR optimization for depthMarek Olšák2014-09-015-2/+23
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: initialize HTILE to fully-expanded stateMarek Olšák2014-09-011-1/+3
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement fast depth clearMarek Olšák2014-09-014-2/+21
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move DB_RENDER_CONTROL into draw_vboMarek Olšák2014-09-015-58/+46
| | | | | | So that I can add fast depth clear. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: disable occlusion queries if they are not neededMarek Olšák2014-09-011-0/+8
| | | | | | | We always left them enabled, which turned off HiZ in some cases. This should improve performace with Hyper-Z. Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: force fast stencil and HTILE stencil off, fixing a Hyper-Z hangMarek Olšák2014-09-012-9/+14
| | | | | | | | | | | | | This should be as fast as no HTILE for stencil. I think we can still get full performance with depth-only rendering even if stencil is present in the buffer but not used, but I'm not 100% sure. This may be revisited when HiS and fast stencil clear are implemented. This fixes a hang in Brutal Legend. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64471 Reviewed-by: Michel Dänzer <[email protected]>
* r600g: set VGT_ENHANCE=4 on R7xxMarek Olšák2014-09-012-0/+2
| | | | | | | This is a golden setting on RV740, but there is a hw bug which recommends setting it on all R7xx chipsets. Acked-by: Michel Dänzer <[email protected]>
* r600g: expose AMD_vertex_shader_layer and *_viewport_index on R600-R700Marek Olšák2014-09-011-1/+1
| | | | | | already implemented Acked-by: Michel Dänzer <[email protected]>
* r600g: fix layered clearMarek Olšák2014-09-011-1/+2
| | | | | Cc: [email protected] Acked-by: Michel Dänzer <[email protected]>
* r600g: some DB bug workarounds for R6xx DB flushingMarek Olšák2014-09-011-0/+7
| | | | Acked-by: Michel Dänzer <[email protected]>
* r600g: enable fast depth clear for array textures and cubemapsMarek Olšák2014-09-011-1/+2
| | | | | | I have a piglit test that hits this. Acked-by: Michel Dänzer <[email protected]>
* r600g: use HTILE allocator from SIMarek Olšák2014-09-013-47/+23
| | | | | | | | | | | | It's almost the same. This enables tiling for HTILE. It also enables Hyper-Z for other texture targets (1D, 1D_ARRAY, 2D_ARRAY, CUBE, CUBE_ARRAY, 3D, RECT). 2D array depth textures are tested by Unigine Sanctuary and my new piglit test. Acked-by: Michel Dänzer <[email protected]>
* r600g: set DB_DEPTH_SIZE.HEIGHT_TILE_MAX for EG/CM, inline other fieldsMarek Olšák2014-09-011-9/+12
| | | | | | | | This fixes rendering to non-zero layer/face/slice with HTILE. v2: added the assertion Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: set DB_DEPTH_SIZE.HEIGHT_TILE_MAX, inline other fieldsMarek Olšák2014-09-011-9/+8
| | | | | | | | | | This fixes rendering to a non-zero layer/face/slice with HTILE. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72685 v2: added the assertion Reviewed-by: Michel Dänzer <[email protected]>
* r600g: Implement sm5 geometry shader instancingGlenn Kennard2014-09-013-2/+14
| | | | | | Requires Evergreen or later hardware. Signed-off-by: Glenn Kennard <[email protected]>
* glsl_to_tgsi: allocate and enlarge arrays for temporaries on demandMarek Olšák2014-09-011-18/+33
| | | | | | | | | | | This fixes crashes if the number of temporaries is greater than 4096. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66184 v2: added fail paths for realloc failures Cc: 10.2 10.3 [email protected] Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/pb_bufmgr_cache: limit the size of cacheMarek Olšák2014-09-014-8/+30
| | | | | | This should make a machine which is running piglit more responsive at times. e.g. streaming-texture-leak can easily eat 600 MB because of how fast it creates new textures.
* pipe-loader: use the correct screen indexMarek Olšák2014-09-011-2/+18
|
* egl/dri2: use the correct screen indexMarek Olšák2014-09-012-10/+30
| | | | Required for multi-GPU configuration where each GPU has its own X screen.
* i965/fs: don't use ir->shadow_comparitor in emit_texture_*Connor Abbott2014-09-012-7/+5
| | | | | Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: don't pass ir_variable * to emit_samplepos_setup()Connor Abbott2014-09-013-5/+4
| | | | | | | | We were only using it to get at its type, which we already know because it's a builtin variable. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: don't pass ir_variable * to emit_frontfacing_interpolation()Connor Abbott2014-09-014-6/+6
| | | | | | | | | | We were only using it to get at its type, which we already know because it's a builtin variable. v2 (Ken): Rebase on Matt's optimized gl_FrontFacing calculations. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix GPU hangs when INTEL_DEBUG=no16 is set.Kenneth Graunke2014-08-311-1/+2
| | | | | | | | The replicated data clear shader needs to be SIMD16, or else the GPU will hang. So, compile it even if INTEL_DEBUG=no16 is set. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: fix make tarballsEmil Velikov2014-09-011-1/+2
| | | | | | | | | | | | | Current method of generating distribution tar-balls involves manually invoking make + target name in the appropriate places. This temporary solution is used until we get 'make dist' working. Currently it does not work, as in order to have the target (which is also a filename) available in the final Makefile we need to add a PHONY target + use the correct target name. Cc: "10.2 10.3" <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* i965/vec4: Remove try_emit_saturateAbdiel Janulgue2014-08-312-22/+0
| | | | | | | | | Now that saturate is implemented natively as an instruction, we can cut down on unneeded functionality. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/fs: Refactor try_emit_saturateAbdiel Janulgue2014-08-311-15/+8
| | | | | | | | | | | | v3: Since the fs backend can emit saturate as a separate instruction, there is no need to detect for min/max instructions and to rewrite the instruction tree accordingly. On the other hand, we don't need to emit a separate saturated mov either when the expression generating src can do saturate directly. v4: Add can_do_saturate() check before enabling saturate modifer (Ken) Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>