Test Run
We try the method on some test data.
Available Data
We use data from the BRCA cohort in TCGA. To speed up the analysis, we limited the data to chromosome 17.
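The page does not say how the chromosome 17 subset was produced. As a minimal sketch of one way to do it, the function below keeps only the rows of a tab-separated table whose Chromosome column (the standard MAF column name) equals a given value. The function name filter_chromosome and the toy rows are illustrative and are not part of the library or of the test data.

```python
def filter_chromosome(lines, chromosome="17", column="Chromosome"):
    """Keep comment lines, the header, and rows whose `column` equals `chromosome`."""
    kept = []
    col_index = None
    for line in lines:
        if line.startswith("#"):
            # MAF files may start with comment lines such as "#version 2.4"
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if col_index is None:
            # First non-comment line is the header: locate the column once
            col_index = fields.index(column)
            kept.append(line)
            continue
        if fields[col_index] == chromosome:
            kept.append(line)
    return kept


# Toy rows for illustration only (not taken from the test data)
example = [
    "#version 2.4\n",
    "Hugo_Symbol\tChromosome\tStart_Position\n",
    "TP53\t17\t7577120\n",
    "PTEN\t10\t89692905\n",
    "BRCA1\t17\t41245000\n",
]
filtered = filter_chromosome(example)
print("".join(filtered), end="")
```

For a gzip-compressed file, the lines can be read with gzip.open(path, "rt") from the standard library. Some files label the chromosome as chr17 rather than 17, so the value passed as chromosome may need adjusting.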
Running the tool
Manifest
The manifest.yaml file defines the input files and the order of the discovery analyses. For the rules to follow when writing the manifest.yaml file, see the Manifest section.
Here is an example manifest file:
title: Test biallelic with some cBio portal data
date: 07/03/2022
ref:
  genes:
    path: ../ref/gencode_sort.v19.bed.gz
    format_driver: bed
  sample_donors:
    path: data_mutations_mskcc_17.txt.gz
    format_driver: maf
input:
  - path: data_cna_hg19_17.seg.gz
    type: scna
    format_driver: simple_segments
    extra_driver_args: {}
  - path: data_mutations_mskcc_17.txt.gz
    type: snv
    format_driver: maf
    extra_driver_args: {}
  - path: data_mutations_mskcc_17.txt.gz
    type: indel
    format_driver: maf
    extra_driver_args: {}
analyses:
  - name: write_aberrations
  - name: write_sample_donor
  - name: annotate_snv
  - name: annotate_double_snv
  - name: annotate_indel
  - name: summary_oncoprint_png
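Before running the tool, it can help to check that every file named in the manifest exists. The sketch below is not part of the library: list_manifest_paths is a hypothetical helper, the dictionary mirrors the example manifest by hand (a YAML parser such as PyYAML's yaml.safe_load would produce an equivalent dictionary), and it assumes that paths are resolved relative to the folder containing manifest.yaml, which is what the log output further down suggests.

```python
from pathlib import Path


def list_manifest_paths(manifest, manifest_dir):
    """Return (label, path) pairs for every file the manifest references."""
    base = Path(manifest_dir)
    entries = []
    # Reference files: genes and sample/donor information
    for name, ref in manifest.get("ref", {}).items():
        entries.append((f"ref:{name}", base / ref["path"]))
    # Input files: one entry per aberration type
    for item in manifest.get("input", []):
        entries.append((f"input:{item['type']}", base / item["path"]))
    return entries


# Hand-written mirror of the example manifest (only the keys used above)
manifest = {
    "ref": {
        "genes": {"path": "../ref/gencode_sort.v19.bed.gz", "format_driver": "bed"},
        "sample_donors": {"path": "data_mutations_mskcc_17.txt.gz", "format_driver": "maf"},
    },
    "input": [
        {"path": "data_cna_hg19_17.seg.gz", "type": "scna", "format_driver": "simple_segments"},
        {"path": "data_mutations_mskcc_17.txt.gz", "type": "snv", "format_driver": "maf"},
        {"path": "data_mutations_mskcc_17.txt.gz", "type": "indel", "format_driver": "maf"},
    ],
}

for label, path in list_manifest_paths(manifest, "../test/data/test_cbio"):
    status = "ok" if path.exists() else "MISSING"
    print(f"{label:20} {path} [{status}]")
```

The check only reports missing files; it does not validate the format_driver values or the analyses list, which are interpreted by the tool itself.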
Execute the biallelic_inactivation command:
$ biallelic_inactivation ../test/data/test_cbio/manifest.yaml
INFO : 2023-09-13 12:29:50,146 : Writing bi logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,146 : Parse MANIFEST from ../test/data/test_cbio/manifest.yaml
INFO : 2023-09-13 12:29:50,150 : Writing ref_genes_bed logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,476 : Read Genes from /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/ref/gencode_sort.v19.bed.gz in BED format
INFO : 2023-09-13 12:29:50,517 : Recorded 20345 genes
INFO : 2023-09-13 12:29:50,542 : Writing ref_sample_donors_maf logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,542 : Read Sample/Donor information from /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/data_mutations_mskcc_17.txt.gz
INFO : 2023-09-13 12:29:50,684 : Recorded 932 Sample/Donor information
INFO : 2023-09-13 12:29:50,687 : Writing scna_simple_segments logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,687 : Read Segments file from ../test/data/test_cbio/data_cna_hg19_17.seg.gz
INFO : 2023-09-13 12:29:50,715 : Selected 3830 segments over 13249
INFO : 2023-09-13 12:29:50,722 : Writing snv_maf logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,723 : Read file ../test/data/test_cbio/data_mutations_mskcc_17.txt.gz as MAF format
INFO : 2023-09-13 12:29:50,832 : Selected 4191 SNVs over 7353 variants
INFO : 2023-09-13 12:29:50,841 : Writing indel_maf logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,841 : Read file ../test/data/test_cbio/data_mutations_mskcc_17.txt.gz as MAF format
INFO : 2023-09-13 12:29:50,940 : Selected 708 INDELs over 7353 variants
Matplotlib is building the font cache; this may take a moment.
INFO : 2023-09-13 12:30:37,301 : Writing discovery_write_aberrations logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:30:37,301 : Write aberration file to /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/results/aberration_0.tsv
INFO : 2023-09-13 12:30:37,316 : Write aberration file to /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/results/aberration_1.tsv
INFO : 2023-09-13 12:30:37,336 : Write aberration file to /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/results/aberration_2.tsv
INFO : 2023-09-13 12:30:37,341 : Writing discovery_write_sample_donor logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:30:37,341 : Write Sample info file to /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/results/samples_donor_info.tsv
INFO : 2023-09-13 12:30:37,346 : Writing discovery_annotate_snv logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:30:37,359 : Divide the aberrations over 907 samples
INFO : 2023-09-13 12:31:08,255 : Found 987 biallelic inactivation over 907 samples
INFO : 2023-09-13 12:31:08,265 : Writing discovery_annotate_double_snv logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:31:08,278 : Divide the aberrations over 907 samples
INFO : 2023-09-13 12:31:20,943 : Found 59 biallelic inactivation over 907 samples
INFO : 2023-09-13 12:31:20,947 : Writing discovery_annotate_indel logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:31:20,957 : Divide the aberrations over 907 samples
INFO : 2023-09-13 12:31:34,669 : Found 317 biallelic inactivation over 907 samples
INFO : 2023-09-13 12:31:34,674 : Writing discovery_summary_oncoprint_png logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:31:34,674 : Found 3 output results for the summary
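The log above reports, for each discovery step, how many biallelic inactivations were found. As a small illustration, the sketch below collects those counts from log lines with regular expressions. summarize_log is a hypothetical helper, not a library function, and the patterns are inferred from the wording of this one log, so they may not match other versions of the tool. The excerpt reuses lines from the log above, with the folder path shortened to <logs> for readability.

```python
import re

# Patterns inferred from the log messages shown above
STEP_RE = re.compile(r"Writing discovery_(\w+) logs to folder")
FOUND_RE = re.compile(r"Found (\d+) biallelic inactivation over (\d+) samples")


def summarize_log(lines):
    """Map each discovery step to (inactivations found, samples)."""
    summary = {}
    current_step = None
    for line in lines:
        step = STEP_RE.search(line)
        if step:
            # Remember which discovery step the following lines belong to
            current_step = step.group(1)
            continue
        found = FOUND_RE.search(line)
        if found and current_step is not None:
            summary[current_step] = (int(found.group(1)), int(found.group(2)))
    return summary


log_excerpt = [
    "INFO : 2023-09-13 12:30:37,346 : Writing discovery_annotate_snv logs to folder <logs>",
    "INFO : 2023-09-13 12:30:37,359 : Divide the aberrations over 907 samples",
    "INFO : 2023-09-13 12:31:08,255 : Found 987 biallelic inactivation over 907 samples",
    "INFO : 2023-09-13 12:31:08,265 : Writing discovery_annotate_double_snv logs to folder <logs>",
    "INFO : 2023-09-13 12:31:20,943 : Found 59 biallelic inactivation over 907 samples",
    "INFO : 2023-09-13 12:31:20,947 : Writing discovery_annotate_indel logs to folder <logs>",
    "INFO : 2023-09-13 12:31:34,669 : Found 317 biallelic inactivation over 907 samples",
    "INFO : 2023-09-13 12:31:34,674 : Writing discovery_summary_oncoprint_png logs to folder <logs>",
    "INFO : 2023-09-13 12:31:34,674 : Found 3 output results for the summary",
]

for step, (found, samples) in summarize_log(log_excerpt).items():
    print(f"{step}: {found} inactivations over {samples} samples")
```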
After the command executes without errors, the results are available in the results folder, located next to the manifest.yaml file.
$ ls ../test/data/test_cbio/results
aberration_0.tsv
aberration_1.tsv
aberration_2.tsv
result_discovery_annotated_double_snv.tsv
result_discovery_annotated_indel.tsv
result_discovery_annotated_snv.tsv
samples_donor_info.tsv
summary_oncoprint.png
The content of the results folder depends on the analyses included in the manifest file.
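A quick way to inspect whatever the chosen analyses produced is to count the rows in each result table. The sketch below is a generic illustration, not a library feature: count_rows is a hypothetical helper, and it assumes that each .tsv file is tab-separated with a single header row, which this page does not confirm.

```python
import csv
from pathlib import Path


def count_rows(results_dir):
    """Map each .tsv file name in results_dir to its number of data rows."""
    counts = {}
    for path in sorted(Path(results_dir).glob("*.tsv")):
        with path.open(newline="") as handle:
            rows = list(csv.reader(handle, delimiter="\t"))
        # Assumption: the first row is a header, so it is not counted
        counts[path.name] = max(len(rows) - 1, 0)
    return counts


results = Path("../test/data/test_cbio/results")
if results.is_dir():
    for name, n_rows in count_rows(results).items():
        print(f"{name}: {n_rows} rows")
```

Files that are not tables, such as summary_oncoprint.png, are skipped because only names ending in .tsv are read.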
In our test example, we included an oncoprint overview of the cohort.
Fix the png
To show the image, we have to copy it to the documentation root folder. This is a workaround for the documentation build and is unrelated to the library itself.
$ cp ../test/data/test_cbio/results/summary_oncoprint.png ./_static/summary_oncoprint.png