Posts: 440
Threads: 1
Joined: Sep 2024
10-05-2024, 10:37 AM
(This post was last modified: 10-05-2024, 10:37 AM by Light.)
Alright, I'll work on this later. The nice part is that download speed on Colab is extremely good, and the conversion runs entirely on the Colab runtime rather than on the local system, so it could make BAMtoBED a lot easier and even doable on mobile.
My .zip PACKEDPED to Merged v62 is working flawlessly.
Posts: 503
Threads: 72
Joined: Nov 2023
Gender: Male
Ethnicity: Arab
(03-16-2024, 05:37 PM)teepean Wrote: So I have had questions over the years about creating datasets and here I present a new program called aDNA to dataset (AKA make-myself-redundant)
There are both Linux and Windows versions available. The instructions are very simple so I would like to get feedback from anyone interested in creating their own datasets from BAMs.
Notice: Windows version includes all the files necessary to run except references. For Linux you need to have samtools and pileupCaller in your path.
pileupCaller can be downloaded from here and for samtools I assume you know how to use apt, pacman, yum etc.
https://github.com/stschiff/sequenceTools
Main page:
https://github.com/teepean/adna_to_dataset
Download:
https://github.com/teepean/adna_to_datas.../v.0.2.zip
PileupCaller uses default settings and if you want to modify them you have to edit the .bat or .sh.
EDIT: the program supports only hs37d5 and hg19 as references as those are the most commonly used in aDNA papers. hg38/T2T support can be added if AADR starts supporting those references.
Will this work on a cram file?
Posts: 410
Threads: 13
Joined: Oct 2023
Gender: Undisclosed
(10-20-2024, 10:58 AM)Genetics189291 Wrote: Will this work on a cram file?
It should work.
Posts: 503
Threads: 72
Joined: Nov 2023
Gender: Male
Ethnicity: Arab
(10-20-2024, 11:05 AM)teepean Wrote: (10-20-2024, 10:58 AM)Genetics189291 Wrote: Will this work on a cram file?
It should work.
I’ll try it out today; I’m off work.
Posts: 503
Threads: 72
Joined: Nov 2023
Gender: Male
Ethnicity: Arab
(10-20-2024, 11:05 AM)teepean Wrote: (10-20-2024, 10:58 AM)Genetics189291 Wrote: Will this work on a cram file?
It should work.
It says the reference is not compatible. I thought it would use liftover or something like that. That’s a shame; it’s a great tool, though.
Posts: 410
Threads: 13
Joined: Oct 2023
Gender: Undisclosed
(10-20-2024, 11:05 AM)teepean Wrote: (10-20-2024, 10:58 AM)Genetics189291 Wrote: Will this work on a cram file?
It should work.
Took me a while to find a CRAM with hs37d5 reference and it does work.
Posts: 503
Threads: 72
Joined: Nov 2023
Gender: Male
Ethnicity: Arab
(10-20-2024, 12:19 PM)teepean Wrote: (10-20-2024, 11:05 AM)teepean Wrote: (10-20-2024, 10:58 AM)Genetics189291 Wrote: Will this work on a cram file?
It should work.
Took me a while to find a CRAM with hs37d5 reference and it does work.
I think mine is on hg38
Posts: 410
Threads: 13
Joined: Oct 2023
Gender: Undisclosed
(10-20-2024, 12:17 PM)Genetics189291 Wrote: (10-20-2024, 11:05 AM)teepean Wrote: (10-20-2024, 10:58 AM)Genetics189291 Wrote: Will this work on a cram file?
It should work.
It says the reference is not compatible. I thought it would use liftover or something like that. That’s a shame; it’s a great tool, though.
It supports only hs37d5 and hg19, and references based on those. Your CRAM is probably aligned to hg38, which is why it does not work: the 1240K panel only has positions based on GRCh37/hg19.
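If you want to check which build a BAM/CRAM was aligned to before running the tool, here is a rough sketch (not part of aDNA to dataset; `guess_build` is a made-up helper name). It assumes you have already dumped the header to text, for example with `samtools view -H yourfile.cram`, and it only looks at the chromosome 1 length in the `@SQ` lines: 249250621 for GRCh37/hg19/hs37d5 and 248956422 for GRCh38/hg38. Anything else is reported as unknown.

```python
# Rough guess of the reference build from SAM/BAM/CRAM header text.
# Assumption: the header text came from `samtools view -H yourfile.cram`.
# guess_build is a hypothetical helper, not part of any tool in this thread.

# Chromosome 1 lengths for the two builds discussed here.
CHR1_LENGTHS = {
    249250621: "GRCh37 (hg19/hs37d5)",
    248956422: "GRCh38 (hg38)",
}


def guess_build(header_text):
    """Return a build label based on the chr1 @SQ line, or 'unknown'."""
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # @SQ fields are tab-separated TAG:VALUE pairs, e.g. SN:chr1, LN:248956422
        fields = dict(
            f.split(":", 1) for f in line.split("\t")[1:] if ":" in f
        )
        if fields.get("SN", "") in ("1", "chr1"):
            try:
                length = int(fields.get("LN", ""))
            except ValueError:
                return "unknown"
            return CHR1_LENGTHS.get(length, "unknown")
    return "unknown"


if __name__ == "__main__":
    example = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n"
    print(guess_build(example))
```

If this reports GRCh38, the "reference not compatible" message is expected, since the tool does not lift coordinates over to GRCh37/hg19.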