Test Run

We try the method on some test data.

Available Data

We use data from the BRCA cohort in TCGA. To speed up the analysis, we limited the input files to chromosome 17.
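
How the chromosome 17 subset was produced is not described here. As a rough sketch only, a gzipped MAF file could be restricted to one chromosome as shown below. It assumes tab-separated fields and the standard MAF Chromosome column; the function name and its arguments are hypothetical and not part of the library.

```python
import gzip


def filter_maf_by_chromosome(src_path, dst_path, chromosome="17"):
    """Copy a gzipped MAF file, keeping only the rows on one chromosome.

    Hypothetical helper, not part of biallelic_inactivation.
    Returns the number of variant rows written.
    """
    kept = 0
    chrom_idx = None  # position of the "Chromosome" column, set from the header
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        for line in src:
            # Leading comment lines (e.g. "#version 2.4") are copied unchanged.
            if line.startswith("#"):
                dst.write(line)
                continue
            fields = line.rstrip("\n").split("\t")
            if chrom_idx is None:
                # First non-comment line is the header row.
                chrom_idx = fields.index("Chromosome")
                dst.write(line)
                continue
            # Accept both "17" and "chr17" naming styles.
            if fields[chrom_idx] in (chromosome, "chr" + chromosome):
                dst.write(line)
                kept += 1
    return kept
```

The segment file would need a similar filter on its own chromosome column, whose name depends on the segment format in use.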

Running the tool

Manifest

The manifest.yaml file defines the input files and the order of the discovery analyses. For the rules to follow when writing a manifest.yaml file, see the Manifest section.

Here is an example manifest file:

title: Test biallelic with some cBio portal data
date: 07/03/2022
ref:
  genes:
    path: ../ref/gencode_sort.v19.bed.gz
    format_driver: bed
  sample_donors:
    path: data_mutations_mskcc_17.txt.gz
    format_driver: maf
input:
  - path: data_cna_hg19_17.seg.gz
    type: scna
    format_driver: simple_segments
    extra_driver_args: {}
  - path: data_mutations_mskcc_17.txt.gz
    type: snv
    format_driver: maf
    extra_driver_args: {}
  - path: data_mutations_mskcc_17.txt.gz
    type: indel
    format_driver: maf
    extra_driver_args: {}

analyses:
  - name: write_aberrations
  - name: write_sample_donor
  - name: annotate_snv
  - name: annotate_double_snv
  - name: annotate_indel
  - name: summary_oncoprint_png
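
Before launching a run, it can help to confirm that every file named in the manifest exists. The sketch below works on a manifest that has already been parsed into a Python dictionary (for example with a YAML parser). It assumes relative paths are resolved against the folder containing manifest.yaml, which is what the log of the test run suggests; the helper names are hypothetical and not part of the library.

```python
from pathlib import Path


def referenced_paths(manifest):
    """Collect every `path` value from the `ref` and `input` sections."""
    paths = [entry["path"] for entry in manifest.get("ref", {}).values()]
    paths += [entry["path"] for entry in manifest.get("input", [])]
    return paths


def missing_files(manifest, manifest_dir):
    """Return the referenced paths that do not point to an existing file.

    Assumption: relative paths are resolved against the manifest's folder.
    """
    base = Path(manifest_dir)
    return [p for p in referenced_paths(manifest) if not (base / p).is_file()]


# A trimmed-down copy of the example manifest, already parsed into a dict.
example_manifest = {
    "ref": {
        "genes": {"path": "../ref/gencode_sort.v19.bed.gz", "format_driver": "bed"},
        "sample_donors": {"path": "data_mutations_mskcc_17.txt.gz", "format_driver": "maf"},
    },
    "input": [
        {"path": "data_cna_hg19_17.seg.gz", "type": "scna", "format_driver": "simple_segments"},
        {"path": "data_mutations_mskcc_17.txt.gz", "type": "snv", "format_driver": "maf"},
    ],
}
```

Calling missing_files(example_manifest, "../test/data/test_cbio") would then list any path that cannot be found; an empty list means all referenced files are in place.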

Execute the biallelic_inactivation command:

$ biallelic_inactivation ../test/data/test_cbio/manifest.yaml
INFO : 2023-09-13 12:29:50,146 : Writing bi logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,146 : Parse MANIFEST from ../test/data/test_cbio/manifest.yaml
INFO : 2023-09-13 12:29:50,150 : Writing ref_genes_bed logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,476 : Read Genes from /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/ref/gencode_sort.v19.bed.gz in BED  format
INFO : 2023-09-13 12:29:50,517 : Recorded 20345 genes
INFO : 2023-09-13 12:29:50,542 : Writing ref_sample_donors_maf logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,542 : Read Sample/Donor information from /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/data_mutations_mskcc_17.txt.gz
INFO : 2023-09-13 12:29:50,684 : Recorded 932 Sample/Donor information
INFO : 2023-09-13 12:29:50,687 : Writing scna_simple_segments logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,687 : Read Segments file from ../test/data/test_cbio/data_cna_hg19_17.seg.gz
INFO : 2023-09-13 12:29:50,715 : Selected 3830 segments over 13249
INFO : 2023-09-13 12:29:50,722 : Writing snv_maf logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,723 : Read file ../test/data/test_cbio/data_mutations_mskcc_17.txt.gz as MAF format
INFO : 2023-09-13 12:29:50,832 : Selected 4191 SNVs over 7353 variants
INFO : 2023-09-13 12:29:50,841 : Writing indel_maf logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:29:50,841 : Read file ../test/data/test_cbio/data_mutations_mskcc_17.txt.gz as MAF format
INFO : 2023-09-13 12:29:50,940 : Selected 708 INDELs over 7353 variants
Matplotlib is building the font cache; this may take a moment.
INFO : 2023-09-13 12:30:37,301 : Writing discovery_write_aberrations logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:30:37,301 : Write aberration file to /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/results/aberration_0.tsv
INFO : 2023-09-13 12:30:37,316 : Write aberration file to /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/results/aberration_1.tsv
INFO : 2023-09-13 12:30:37,336 : Write aberration file to /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/results/aberration_2.tsv
INFO : 2023-09-13 12:30:37,341 : Writing discovery_write_sample_donor logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:30:37,341 : Write Sample info file to /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/results/samples_donor_info.tsv
INFO : 2023-09-13 12:30:37,346 : Writing discovery_annotate_snv logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:30:37,359 : Divide the aberrations over 907 samples
INFO : 2023-09-13 12:31:08,255 : Found 987 biallelic inactivation over 907 samples
INFO : 2023-09-13 12:31:08,265 : Writing discovery_annotate_double_snv logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:31:08,278 : Divide the aberrations over 907 samples
INFO : 2023-09-13 12:31:20,943 : Found 59 biallelic inactivation over 907 samples
INFO : 2023-09-13 12:31:20,947 : Writing discovery_annotate_indel logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:31:20,957 : Divide the aberrations over 907 samples
INFO : 2023-09-13 12:31:34,669 : Found 317 biallelic inactivation over 907 samples
INFO : 2023-09-13 12:31:34,674 : Writing discovery_summary_oncoprint_png logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs
INFO : 2023-09-13 12:31:34,674 : Found 3 output results for the summary
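
The counts reported by each discovery step can be pulled out of the console output with a couple of regular expressions. This is a minimal sketch that assumes the log wording matches the run shown above (other versions of the tool may word these messages differently); summarize_log is a hypothetical helper, not part of the library.

```python
import re

# Assumption: these patterns mirror the messages printed in the run above.
STEP_RE = re.compile(r"Writing discovery_(\w+) logs to folder")
FOUND_RE = re.compile(r"Found (\d+) biallelic inactivation over (\d+) samples")


def summarize_log(lines):
    """Map each discovery step to (events found, number of samples)."""
    summary = {}
    current_step = None
    for line in lines:
        step = STEP_RE.search(line)
        if step:
            # Remember which discovery step the following lines belong to.
            current_step = step.group(1)
            continue
        found = FOUND_RE.search(line)
        if found and current_step is not None:
            summary[current_step] = (int(found.group(1)), int(found.group(2)))
    return summary


# Two lines copied from the output above.
log_lines = [
    "INFO : 2023-09-13 12:30:37,346 : Writing discovery_annotate_snv logs to folder /home/docs/checkouts/readthedocs.org/user_builds/biallelic-py/checkouts/latest/test/data/test_cbio/logs",
    "INFO : 2023-09-13 12:31:08,255 : Found 987 biallelic inactivation over 907 samples",
]
print(summarize_log(log_lines))  # {'annotate_snv': (987, 907)}
```

Applied to the full output, the same function would also report the annotate_double_snv and annotate_indel steps.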

After the command executes without errors, the results are available in the results folder, located in the same directory as the manifest.yaml file:

$ ls ../test/data/test_cbio/results
aberration_0.tsv
aberration_1.tsv
aberration_2.tsv
result_discovery_annotated_double_snv.tsv
result_discovery_annotated_indel.tsv
result_discovery_annotated_snv.tsv
samples_donor_info.tsv
summary_oncoprint.png

The content of the results folder depends on the analyses included in the manifest file.
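
For a quick look at what a run produced, the TSV files in the results folder can be summarized with the standard library. The column layout of these files is not described in this section, so the sketch below only assumes that each file is tab-separated and that its first line is a header row; tsv_overview is a hypothetical helper, not part of the library.

```python
import csv
from pathlib import Path


def tsv_overview(results_dir):
    """Return {file name: (header columns, number of data rows)} per TSV file.

    Assumption: the first line of each file is a header row.
    """
    overview = {}
    for tsv_path in sorted(Path(results_dir).glob("*.tsv")):
        with tsv_path.open(newline="") as handle:
            reader = csv.reader(handle, delimiter="\t")
            header = next(reader, [])  # empty list for an empty file
            n_rows = sum(1 for _ in reader)
        overview[tsv_path.name] = (header, n_rows)
    return overview
```

For the test run, tsv_overview("../test/data/test_cbio/results") would return one entry for each of the seven TSV files listed above; non-TSV outputs such as summary_oncoprint.png are skipped.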

In our test example we included an oncoprint overview of the cohort (the summary_oncoprint_png analysis).

Fix the png

To show the image on this page, we have to copy it into the documentation's _static folder. This is a workaround for the documentation build and is unrelated to the library itself.

$ cp ../test/data/test_cbio/results/summary_oncoprint.png ./_static/summary_oncoprint.png
Figure: _static/summary_oncoprint.png (oncoprint overview of the cohort)