| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Provide the real vendor and and hardcode the device id as
0xffffffff as the devices currently using freedreno are non-pci.
The device features UMA.
Cc: Rob Clark <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]> (v1)
Reviewed-by: Roland Scheidegger <[email protected]> (v1)
v2: Reuse opcode gaps as suggested by Marek
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For all the people interested in testing the freedreno driver on
their Android devices. The next commit will hook these up within
the libEGL driver (via the gallium-egl backend).
There may be some rough edges but those can be sorted when a
willing builder/tester comes along.
v2:
- s/freefreno/freedreno/. Spotted by Matt Turner.
- Use the installed libdrm headers.
Cc: "10.1 10.2" <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than including two extra folders only for two headers,
just prefix the headers and be done with it.
Cc: "10.1 10.2" <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This limit is fixed in Mesa core and cannot be changed.
It only affects ARB_vertex_program and ARB_fragment_program.
The minimum value for ARB_vertex_program is 1 according to the spec.
The maximum value for ARB_vertex_program is limited to 1 by Mesa core.
The value should be zero for ARB_fragment_program, because it doesn't
support ARL.
Finally, drivers shouldn't mess with these values arbitrarily.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This new name isn't so confusing.
I also changed the gallivm limit, because it looked wrong.
Reviewed-by: Brian Paul <[email protected]>
v2: use sizeof(float[4])
|
|
|
|
|
|
|
| |
Opps, I should use larger fonts, I guess.
Reported-by: Ilia Mirkin <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move the bits we want to share between generations from fd3_program to
ir3_shader. So overall structure is:
fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3
|- ...
\- ir3_shader_variant -> ir3
So the ir3_shader becomes the topmost generation neutral object, which
manages the set of variants each of which generates, compiles, and
assembles it's own ir.
There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/,
etc.
Keep the split between the gallium level stateobj and the shader helper
object because it might be a good idea to pre-compute some generation
specific register values (ie. anything that is independent of linking).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
First step of reoganization split out compiler (so it can be shared
between a3xx and a4xx). Rename ir3_shader -> ir3 (since we'll want
the name ir3_shader for a higher level object).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The scheduler also needs to be aware of predicate register (p0) in
addition to address register (a0).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Remove some obsolete comments, rename deref->addr.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
It seems like for the most part, different behaviors, workarounds, etc,
should be conditional on GPU patch revision (ie. a320.0 vs a320.2)
rather than GPU id (a320 vs a330).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fixed size heap is a remnant of the fdre-a3xx assembler. Yet it is
convenient for being able to free the entire data structure in one shot
without worrying about leaking nodes.
Change it to dynamically grow the heap size (adding chunks) as needed so
we don't have an artificial upper limit on shader size (other than hw
limits) and don't always have to allocate worst-case size.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Don't assert (debug builds) or assign random uninitialized value for
predicate register (p0).. that screws up kill, etc.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Actually what we currently handle is just the SCALED versions, and not
the int versions. The difference probably matters more when we actually
support integer in the compiler.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Teach new compiler scheduling and register assignment how to deal with
relative addressing. This gets us what we need to avoid falling back to
old compiler for CONST[ADDR[0].x+n]. It is also a prerequisite for temp
file relative addressing, although that is going to also need some
cleverness in register assignment to keep arrays grouped together.
NOTE: doing address calculation in full precision and then narrowing to
s16 in the mov to addr reg seems to sometimes cause lockups (and
sometimes work?!). It seems more reliable to do the address calculation
in s16, like the blob does. Which means teaching RA how to deal with
mixed half and full precision allocation. Fortunately that didn't turn
out to be too hard, so that is a nice bonus which we could probably take
better advantage of elsewhere.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Technically we should not need these. CP_LOAD_STATE can be pipelined.
But removing them broke a few piglit tests, like fbo-depth-
GL_DEPTH_COMPONENT24-readpixels. I expect these are just masking a
problem elsewhere, or perhaps they are only needed under some more
specific circumstances. But until that is understood properly, give
back a bit of the perf boost we got from c63450e8.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that this cap is used to determine the availability of both, adjust
its name to reflect the new reality.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
v2:
Added comments to util_draw_indirect, clarified and fixed map size.
Removed unlikely().
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Blob driver seems to need WFI in some cases after CP_EVENT_WRITE,
implying that this is asynchronous and should reset needs_wfi.
Also, CP_INVALIDATE_STATE seems to need WFI. But CP_LOAD_STATE
does not.
The blob driver also puts WFIs before writing GRAS_CL_VPORT registers.
The latter may be a work-around, as these registers should be banked/
context registers. I haven't yet found a lockup that this averts, but
I expect viewport to change infrequently so out of paranoia I will
keep these for now.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Add support for more vertex buffer formats.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Worth about ~0.5fps in xonotic, for example.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Some apps seem to give us a null sampler/view for texture slots which
come before the last used texture slot. In particular 0ad triggers
this.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Marek v2: add a cap
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Seems the opcodes are slightly different from a2xx. Resync headers and
move blend_func() helper into hw generation specific code.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
We already multiply by bytes per pixel for this, so f3ba7611 broke
mem2gmem for depth/stencil. Drop the now-redundant mutiply by cpp.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
In cases where there was no color buf bound, there were inconsistancies
in register settings related to position of depth/stencil inside GMEM.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
These aren't buffers we ever read back from CPU, so using incorrect
reloc fxn wasn't really harming anything. But might as well be correct.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
I think a3xx and later should support (it is part of GLES3), but this
isn't needed for the time being and still needs to be reversed.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Split it up into some smaller fxns so it doesn't grow into a huge
monster as we add things.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Gallium already gives us height==1 for these, so the texture state is
already setup correctly to emulate 1D textures as a Nx1 2D texture. We
just need to supply the .y coord.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
In particular, we want mesa to emulate primitive restart for us.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
That was easy. Turns out it is just a matter of setting one bit.
Enable sampling from sRGB texture, and therefore enable GL 2.1 :-)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The loops for updating the multiple packed fields in SP_VS_OUT[] and
SP_VS_VPC_DST[] will zero out one register beyond the last that on
required. Which is normally not a problem (and is kinda convenient
when looking at cmdstream dumps) unless we have maximum (16) varyings.
Fix loop termination condition so that this does not happen.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We need to size input/output tables big enough for special inputs/
outputs (gl_Position, gl_FrontFacing, etc) which, while they don't
count towards the hw limit of 16 attributes or 16 varyings, we do
still need to track them all the same.
Signed-off-by: Rob Clark <[email protected]>
|