Unable to replicate results from brain3k tutorial #6

@bozidar-obradovic

Bug description

I'm reproducing the Single-cell RNA-seq and ATAC-seq integration tutorial on the brain3k data, but my results differ from those in the original notebook, in both the GEX and ATAC modalities.

Reproduction

Packages used:
numpy==1.24.2
pandas==1.5.3
scanpy==1.9.6
anndata==0.8.0
muon==0.1.5
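
In case it helps with comparing environments, the list above can be regenerated with a small standard-library sketch like the one below. The helper name `report_versions` is only for illustration and is not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages listed in this report.
PACKAGES = ["numpy", "pandas", "scanpy", "anndata", "muon"]


def report_versions(packages):
    """Return {package: installed version, or 'not installed'}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    for name, ver in report_versions(PACKAGES).items():
        print(f"{name}=={ver}")
```

Running the same snippet in the environment that produced the reference notebook would show directly whether the two setups differ.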

GEX

The differences start at the PCA step (reference, then mine; note the difference in the bar scale):

Screenshot 2024-02-22 at 11 54 13 Screenshot 2024-02-22 at 11 53 53

and they continue into clustering (reference, then mine):

Screenshot 2024-02-22 at 11 55 377 Screenshot 2024-02-22 at 11 55 47
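
To put a number on how far the two clusterings diverge instead of judging it from the plots, something like the following could be used. This is only a rough sketch: `best_match_overlap` is a hypothetical helper, and it assumes both runs contain the same cells in the same order, which may not hold if the filtering steps gave different results. Cluster names are arbitrary between runs, so it matches each reference cluster to the cluster in my run that shares the most cells.

```python
from collections import Counter


def best_match_overlap(reference_labels, my_labels):
    """For each reference cluster, find the cluster in my run sharing the most cells.

    Returns {reference_cluster: (best_matching_cluster, fraction_of_reference_cells)}.
    """
    if len(reference_labels) != len(my_labels):
        raise ValueError("both label lists must cover the same cells, in the same order")

    # Count how many cells fall into each (reference cluster, my cluster) pair.
    pair_counts = Counter(zip(reference_labels, my_labels))
    reference_sizes = Counter(reference_labels)

    best = {}
    for (ref, mine), count in pair_counts.items():
        fraction = count / reference_sizes[ref]
        if ref not in best or fraction > best[ref][1]:
            best[ref] = (mine, fraction)
    return best


if __name__ == "__main__":
    # Toy labels for six cells; real labels would come from the two notebooks.
    reference = ["0", "0", "0", "1", "1", "2"]
    mine = ["1", "1", "0", "0", "0", "2"]
    for ref, (match, fraction) in sorted(best_match_overlap(reference, mine).items()):
        print(f"reference cluster {ref} -> my cluster {match} ({fraction:.0%} of cells)")
```

Fractions close to 1 for every reference cluster would mean the runs differ mostly in cluster naming, while low fractions would point to a genuinely different partition.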

In this modality I can still classify the cells, so the final result is almost the same as in the notebook, but some differences remain (reference, then mine):

Screenshot 2024-02-22 at 11 59 41 Screenshot 2024-02-22 at 12 00 08 Screenshot 2024-02-22 at 12 00 52 Screenshot 2024-02-22 at 12 01 08

ATAC

Even though the interval field in the rna modality is the same, the TSS enrichment plots look different (reference, then mine):

Screenshot 2024-02-22 at 12 08 11 Screenshot 2024-02-22 at 12 09 29

Shapes differ greatly in PCA space (reference, then mine):

Screenshot 2024-02-22 at 12 13 08 Screenshot 2024-02-22 at 12 13 19

and the clustering bears little resemblance to the reference, which makes it difficult for me to label cell types (reference, then mine):

Screenshot 2024-02-22 at 12 15 16 Screenshot 2024-02-22 at 12 15 27

Conclusion

I'm new to this area, so I'm not sure whether identical results should be expected on every run; I would think that some minor differences are acceptable as long as they don't change the cell annotation. However, especially with the ATAC data, the differences are too large for me to work around.
I've had similar issues reproducing the PBMC10k tutorial, but that is a more complicated dataset (in terms of the number of different clusters/cell types), so I thought this tutorial would be easier to reproduce.
Could you please help me identify the source of these differences? If they are caused by the age of the tutorials, would you be able to update them? They are very informative and detailed and, as such, a valuable resource for beginners in the field.

Thank you in advance!
