Posts: 188
Threads: 6
Joined: Sep 2023
Suppose we have whole-genome sequencing results from a medical project (in various formats). What is the best way to convert them into a format that works well for G25? The researcher is clearly unlikely to agree to produce 23andMe-style files. What would be better to ask him for: an EIGENSTRAT or PLINK dataset for the whole sample collection, or a multi-sample VCF? And which toolkit is best for filtering the files?
I wrote about the Ukrainian-Romanian project, which has not yet been published
https://genarchivist.com/showthread.php?tid=674 . The project to collect samples from Ukrainians of different regions is also continuing; they have already collected 10,000
https://english.elpais.com/international...FH6l_iXzPg . The goals of these projects are primarily medical; they are doing whole-genome sequencing and will publish no earlier than 2026. I would like to ask the authors whether they would be willing to share samples for G25, but I need to explain to them what they would have to do and how to filter the data.
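To make the filtering part of the question concrete, here is a minimal Python sketch of one possible approach: keep only biallelic SNP records from a multi-sample VCF that fall on a list of target positions. The list of positions G25 actually uses is not given in this thread, so the panel file here (one "chrom, position" pair per line) is purely hypothetical, as are the function names. The positions must also be on the same genome build as the VCF. A real pipeline would more likely use an established toolkit; this only illustrates the logic.

```python
import io


def _norm(chrom):
    """Strip a leading 'chr' so 'chr1' and '1' compare equal."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def load_target_sites(handle):
    """Read a hypothetical panel file: 'chrom<whitespace>pos' per line."""
    sites = set()
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, pos = line.split()[:2]
        sites.add((_norm(chrom), int(pos)))
    return sites


def filter_vcf(lines, target_sites):
    """Yield header lines plus biallelic SNP records located on the panel."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep meta-information and the column header
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # One-base REF and one-base ALT excludes indels and multiallelic sites.
        is_snp = len(ref) == 1 and len(alt) == 1 and ref in "ACGT" and alt in "ACGT"
        if is_snp and (_norm(chrom), pos) in target_sites:
            yield line


# Tiny made-up example: two panel positions, three VCF records.
panel = load_target_sites(io.StringIO("1\t1000\n2\t5000\n"))
vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n",
    "chr1\t1000\trs1\tA\tG\t.\tPASS\t.\tGT\t0/1\n",   # on panel, SNP
    "chr1\t2000\trs2\tC\tT\t.\tPASS\t.\tGT\t1/1\n",   # not on panel
    "chr2\t5000\trs3\tAT\tA\t.\tPASS\t.\tGT\t0/1\n",  # on panel, but an indel
]
kept = list(filter_vcf(vcf, panel))
print("".join(kept), end="")
```

Matching on chromosome and position rather than on rsID is a design choice in this sketch, since variant IDs are often missing from freshly called VCFs.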
Posts: 23
Threads: 2
Joined: Feb 2024
Gender: Male
Y-DNA (P): N-MF207071
(09-25-2024, 02:01 PM)Gordius Wrote: Suppose we have whole-genome sequencing results from a medical project (in various formats). What is the best way to convert them into a format that works well for G25? The researcher is clearly unlikely to agree to produce 23andMe-style files. What would be better to ask him for: an EIGENSTRAT or PLINK dataset for the whole sample collection, or a multi-sample VCF? And which toolkit is best for filtering the files?
I wrote about the Ukrainian-Romanian project, which has not yet been published https://genarchivist.com/showthread.php?tid=674 . The project to collect samples from Ukrainians of different regions is also continuing; they have already collected 10,000 https://english.elpais.com/international...FH6l_iXzPg . The goals of these projects are primarily medical; they are doing whole-genome sequencing and will publish no earlier than 2026. I would like to ask the authors whether they would be willing to share samples for G25, but I need to explain to them what they would have to do and how to filter the data.
If it's a medical project, whether they can release anything publicly at all depends on the consents obtained from the subjects and approved by the institutional ethics board. Data collected in medical projects typically requires institutional ethical clearance for access: you have to state your purpose and are expected to use the data for that purpose only. I'd say it's unlikely you could get clearance without an affiliation with a legitimate medical or academic institution.
Posts: 188
Threads: 6
Joined: Sep 2023
09-26-2024, 07:52 AM
(This post was last modified: 09-26-2024, 07:52 AM by Gordius.)
(09-26-2024, 04:31 AM)ronin92 Wrote: If it's a medical project, whether they can release anything publicly at all depends on the consents obtained from the subjects and approved by the institutional ethics board. Data collected in medical projects typically requires institutional ethical clearance for access: you have to state your purpose and are expected to use the data for that purpose only. I'd say it's unlikely you could get clearance without an affiliation with a legitimate medical or academic institution.
They plan to make the data public after publication. This is essentially a continuation of the "Genome diversity in Ukraine" project
https://academic.oup.com/gigascience/art...59/6079618 ; it was the data from that project that provided the majority of the Ukrainian samples for Vahaduo. Here is the dataset from that older project:
http://gigadb.org/dataset/100835 . It contains a large genomic file of 12.81 GB and a reduced file in PLINK format of 10.77 MB. If only they could make a similar reduced PLINK file from the new samples; it would need to be filtered so that it is well suited for G25. In theory, they could be asked to share such a reduced file before the official publication of the data.
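One way to judge whether a reduced PLINK file is usable would be to check how many target positions its .bim file covers. Below is a minimal Python sketch of that check. The .bim column layout (chromosome, variant ID, cM position, base-pair position, allele 1, allele 2) is the standard PLINK one, but the target panel is an assumption: the actual positions G25 relies on are not given here, so the set in the example is made up, and the function names are hypothetical. Both files are assumed to use the same genome build.

```python
import io


def read_bim_sites(handle):
    """Collect (chrom, bp position) pairs from a PLINK .bim file.

    Columns: chrom, variant ID, cM position, bp position, allele 1, allele 2.
    """
    sites = set()
    for line in handle:
        cols = line.split()
        if len(cols) < 6:
            continue  # skip blank or malformed lines
        sites.add((cols[0], int(cols[3])))
    return sites


def panel_coverage(bim_sites, panel_sites):
    """Return the fraction of panel positions present in the .bim file."""
    if not panel_sites:
        return 0.0
    return len(bim_sites & panel_sites) / len(panel_sites)


# Made-up example: three variants in the .bim, four positions on the panel.
bim = io.StringIO(
    "1\trs1\t0\t1000\tA\tG\n"
    "1\trs2\t0\t2000\tC\tT\n"
    "2\trs3\t0\t5000\tG\tA\n"
)
panel = {("1", 1000), ("2", 5000), ("3", 7000), ("4", 9000)}
coverage = panel_coverage(read_bim_sites(bim), panel)
print(f"{coverage:.0%} of panel positions are present")
```

If the coverage of a real panel turned out to be low, that would be a concrete thing to raise with the authors when asking for a reduced file from the new samples.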