On 16-11-25 03:40 PM, Christian König wrote:
On 25.11.2016 at 20:32, Jason Gunthorpe wrote:
> This assumes the commands are fairly short lived of course, the
> expectation of the mmu notifiers is that a flush is reasonably prompt
Correct, this is another problem. GFX command submissions usually
don't take longer than a few milliseconds, but compute command
submission can easily take multiple hours.
I can easily imagine what would happen when kswapd is blocked by a GPU
command submission for an hour or so while the system is under memory
pressure.
I've been thinking about this problem for about a year now and have been
going in circles for quite a while. So if you have ideas on this, even if
they sound totally crazy, feel free to bring them up.
Our GPUs (at least starting with VI) support compute-wave-save-restore
and can swap out compute queues with fairly low latency. Yes, there is
some overhead (both memory usage and time), but it's a fairly regular
thing with our hardware scheduler (firmware, actually) when we need to
preempt running compute queues to update runlists or when we overcommit
the hardware queue resources.