Conversation

xrq-phys

This patch ensures that when sageattn is run under

with torch.cuda.stream(stream):
    sageattn(q, k, v)

all kernels are enqueued onto the correct CUDA stream.

@walker-ai

Hi, I'm currently working on CUDA graph support for SageAttention, but I've encountered some output errors. I'd like to know whether this PR is related to my problem: could this stream issue cause correctness errors?
