Bug description
I'm reproducing the Single-cell RNA-seq and ATAC-seq integration tutorial on the brain3k data, but my results differ from those in the original notebook in both the GEX and ATAC modalities.
Reproduction
Packages used:
numpy==1.24.2
pandas==1.5.3
scanpy==1.9.6
anndata==0.8.0
muon==0.1.5
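For anyone comparing environments, the versions above can be collected with a short snippet like the one below. This is only a sketch using the standard library; the `installed_versions` helper and the `PACKAGES` list are names made up for illustration, and a package that is not installed is reported as `None`.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages listed in this report; extend the list as needed.
PACKAGES = ["numpy", "pandas", "scanpy", "anndata", "muon"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


# Print in the same name==version form used above.
for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}=={ver}" if ver is not None else f"{name}: not installed")
```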
GEX
Differences start when computing PCA (reference, then mine - notice the difference on the bar scale):
and they continue into clustering (reference, then mine):
In this modality I'm able to classify cells so the final result is almost the same as in the notebook, but differences still exist (reference, then mine):




ATAC
Even though the interval field in the rna modality is the same, the TSS enrichment plots look different (reference, then mine):
Shapes differ greatly in PCA space (reference, then mine):
and clustering bears little resemblance, which makes it difficult for me to label cell types (reference, then mine):
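To put a number on the PCA differences rather than comparing plots by eye, something like the sketch below could be used. It computes the per-component Pearson correlation between two embeddings: values near +1 or -1 would mean a component matches up to a sign flip (PCA component signs are arbitrary), while values near 0 would point to a real difference. It assumes both embeddings contain the same cells in the same order, for example the arrays scanpy stores in `adata.obsm["X_pca"]`. The `component_correlations` helper is a made-up name, and the data here is synthetic, not taken from the tutorial.

```python
import numpy as np


def component_correlations(ref_pcs, my_pcs):
    """Pearson correlation between matching components of two embeddings.

    Both inputs are (n_cells, n_components) arrays with cells in the same
    order. Returns one correlation value per component.
    """
    ref_pcs = np.asarray(ref_pcs, dtype=float)
    my_pcs = np.asarray(my_pcs, dtype=float)
    if ref_pcs.shape != my_pcs.shape:
        raise ValueError("embeddings must have the same shape")
    # Center each component, then compute the correlation column by column.
    ref_c = ref_pcs - ref_pcs.mean(axis=0)
    my_c = my_pcs - my_pcs.mean(axis=0)
    numerator = (ref_c * my_c).sum(axis=0)
    denominator = np.sqrt((ref_c ** 2).sum(axis=0) * (my_c ** 2).sum(axis=0))
    return numerator / denominator


# Synthetic stand-in: the second component is sign-flipped, nothing else differs.
rng = np.random.default_rng(0)
ref = rng.normal(size=(200, 3))
mine = ref * np.array([1.0, -1.0, 1.0])

corr = component_correlations(ref, mine)
print(np.round(corr, 3))
```

If every component correlates at close to ±1, the embeddings agree and the visual difference is only orientation; lower values would suggest that the inputs to PCA already differ.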
Conclusion
I'm new to this area, so I'm not sure whether identical results are expected on every run; I would think that minor differences are acceptable as long as they don't change the cell annotation. However, especially with the ATAC data, I'm not able to work around them.
I've had similar issues reproducing the PBMC10k tutorial, but that is a more complicated dataset (in terms of the number of different clusters/cell types), so I thought this tutorial would be easier to reproduce.
Can you please help me identify the source of these differences? If it's related to the age of the tutorials, would you update them? They are very informative and detailed and, as such, represent a valuable resource for beginners in the field.
Thank you in advance!