Merging BED with v62 dataset with Google Colab
#46
.fam files are really all you need to open for selecting purposes
Inquirer likes this post
Reply
#47
(04-09-2025, 06:49 PM)AimSmall Wrote: .fam files are really all you need to open for selecting purposes

I check the .bim file all the time, since I use it to select specific SNPs, check the SNP count, and so on.
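
For anyone who wants to do that kind of check without opening the file by hand, here is a minimal sketch. It assumes the standard six-column PLINK .bim layout (chromosome, SNP ID, genetic distance, base-pair position, allele 1, allele 2); the function name and the sample rows are made up for illustration.

```python
from collections import Counter

def summarize_bim(lines):
    """Return (total SNP count, per-chromosome counts) for .bim-style lines."""
    per_chrom = Counter()
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            # Skip blank or malformed lines instead of failing
            continue
        per_chrom[fields[0]] += 1
        total += 1
    return total, per_chrom

# Tiny made-up example in .bim layout (not real data)
sample_bim = [
    "1 rs0001 0.0 752566 G A",
    "1 rs0002 0.0 776546 A G",
    "2 rs0003 0.0 10133 C T",
]
total, per_chrom = summarize_bim(sample_bim)
print(total, dict(per_chrom))
```

On a real dataset you would pass an open file handle instead of the list, for example `summarize_bim(open("merged.bim"))`, where `merged.bim` is a placeholder file name.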
Reply
#48
This topic became very interesting all of a sudden. I have so many other things to share... Hopefully this is just the beginning.
Reply
#49
(04-09-2025, 06:51 PM)TanTin Wrote:
(04-09-2025, 06:49 PM)AimSmall Wrote: .fam files are really all you need to open for selecting purposes

I check the .bim file all the time, since I use it to select specific SNPs, check the SNP count, and so on.

I usually use plink for that
Reply
#50
(04-09-2025, 06:49 PM)AimSmall Wrote: .fam files are really all you need to open for selecting purposes

(04-09-2025, 06:54 PM)TanTin Wrote: This topic became very interesting all of a sudden. I have so many other things to share... Hopefully this is just the beginning.

I was away from my computer for a few hours. However, I returned and managed to get things working thanks to you two (though perhaps not perfectly because I received error messages). Anyhow, here's a model that I got to work, somewhat. What do you guys think?

   
Reply
#51
Here's another one.

   
Reply
#52
I'm running into an issue here. I downloaded the samples @TanTin provided and converted them to Eigenstrat format (the conversion kept setting the populations to "ignore", but I resolved that). Yet every time I try to merge them with the larger dataset, I get this message: "fatalx: OOPS snp file has changed since genotype file was created Aborted (core dumped)". I'm not sure how to get it to work. I'm using Eigensoft to merge, by the way. Does anyone have the samples in Eigenstrat format?
Reply
#53
(04-10-2025, 07:07 AM)ModusOperandi Wrote: I'm running into an issue here. I downloaded the samples @TanTin provided and converted them to Eigenstrat format (the conversion kept setting the populations to "ignore", but I resolved that). Yet every time I try to merge them with the larger dataset, I get this message: "fatalx: OOPS snp file has changed since genotype file was created Aborted (core dumped)". I'm not sure how to get it to work. I'm using Eigensoft to merge, by the way. Does anyone have the samples in Eigenstrat format?

For the "ignore" error, open the .fam file and find-and-replace every instance of 9 (or -9? Either way, it's the value in the last column) with 1.
I don't remember running into the OOPS snp error, but there is a similar OOPS indiv error; maybe the same fix works?
https://genarchivist.net/showthread.php?...8#pid30538
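
A blind find-and-replace can also hit a 9 inside a family or sample ID, so here is a minimal sketch that only touches the last column. It assumes the usual six-column .fam layout (family ID, sample ID, father, mother, sex, phenotype) and treats -9, PLINK's default missing-phenotype code, as the value to replace; the function name and the sample row are hypothetical.

```python
def set_fam_phenotype(lines, missing=("-9",), new_value="1"):
    """Replace a missing phenotype code in column 6 of .fam-style lines.

    Only the sixth column is edited, so IDs that happen to contain
    a 9 are left alone. Lines without exactly six fields are passed
    through with their whitespace normalized.
    """
    fixed = []
    for line in lines:
        fields = line.split()
        if len(fields) == 6 and fields[5] in missing:
            fields[5] = new_value
        fixed.append(" ".join(fields))
    return fixed

# Made-up row: the IDs contain 9 on purpose to show they are untouched
print(set_fam_phenotype(["FAM9 ID-9 0 0 1 -9"]))
```

If the last column in your file turns out to be 9 rather than -9, pass `missing=("9", "-9")` instead.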
ModusOperandi likes this post
Reply
#54
(04-10-2025, 08:36 PM)Kale Wrote: For the "ignore" error, open the .fam file and find-and-replace every instance of 9 (or -9? Either way, it's the value in the last column) with 1.
I don't remember running into the OOPS snp error, but there is a similar OOPS indiv error; maybe the same fix works?
https://genarchivist.net/showthread.php?...8#pid30538

I set hashcheck to "NO" and the error went away; however, the process ended abruptly and returned the message "killed". I figured it might be a low-RAM issue, so I increased my computer's memory and ran it again. It got a lot further along in the merge process, and the program was actually returning "OK" after reading the geno file. Unfortunately, the process was once again cut short with the same "killed" message. I'm not sure what to do at this point.
Reply
#55
How much RAM do you have?
Reply
#56
(04-15-2025, 02:41 AM)Kale Wrote: How much RAM do you have?

16GB total
Reply
#57
I was able to merge v62 with a few hundred other samples and convert to Eigenstrat with 16GB RAM.
That was using Linux, plink to merge, convertf to convert.
Reply
#58
(04-15-2025, 05:11 AM)Kale Wrote: I was able to merge v62 with a few hundred other samples and convert to Eigenstrat with 16GB RAM.
That was using Linux, plink to merge, convertf to convert.

I suspect it's this particular sample. It's strange, because in the past I've been able to merge samples into datasets this way with little to no issue, even on PCs with less RAM.
Reply
#59
I think I solved the issue. It turns out I wasn't allocating enough memory to the VM for it to get the job done. It needed about 10GB of RAM dedicated to it just to keep the process from being killed early; anything less just did not work.
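
If anyone wants to check this before starting a long merge, here is a rough sketch that reads the MemAvailable line from Linux's /proc/meminfo inside the VM. The 10 GiB threshold is just the figure from my case, not a documented requirement, and the function name and sample text are made up.

```python
def mem_available_gib(meminfo_text):
    """Return MemAvailable from /proc/meminfo-style text in GiB, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB (KiB)
            return kib / (1024 ** 2)
    return None

# Made-up sample; on a real Linux VM use open("/proc/meminfo").read()
sample_meminfo = "MemTotal:       16384000 kB\nMemAvailable:   12582912 kB\n"

available = mem_available_gib(sample_meminfo)
needed_gib = 10  # assumption taken from this post
if available is not None and available < needed_gib:
    print("Probably not enough memory for the merge")
else:
    print("Memory looks sufficient")
```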
TanTin and Kale like this post
Reply
#60
Out of curiosity, what is the file size of everyone's merged genotype data file? I've brought this up before, but I still don't know why: whenever I merge a sample with a larger dataset, the file size always explodes to 20+ GB. This makes every step slower than before, since admixtools seems to read the entire dataset before calculating statistics. It wouldn't be much of an issue if it were just a 5GB file, but the merged data is over 4 times as large. Where is the extra data in the file even coming from?
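
One possible explanation, sketched as arithmetic: an unpacked Eigenstrat .geno file is plain text with one character per genotype plus a newline per SNP, while the packed formats (PACKEDANCESTRYMAP, PLINK .bed) store each genotype in 2 bits, which is about a 4x difference. The SNP and sample counts below are illustrative round numbers, not the actual v62 figures, and the packed estimate ignores the file header.

```python
import math

def eigenstrat_geno_bytes(n_snps, n_inds):
    """Unpacked Eigenstrat: one ASCII character per individual, plus a newline per SNP."""
    return n_snps * (n_inds + 1)

def packed_geno_bytes(n_snps, n_inds):
    """Packed estimate: 2 bits per genotype (4 per byte), rows of at least 48 bytes.

    The header record is ignored, so this slightly underestimates the real size.
    """
    row_bytes = max(48, math.ceil(n_inds / 4))
    return n_snps * row_bytes

# Illustrative counts only (assumed, not taken from the dataset)
n_snps, n_inds = 1_200_000, 17_000
print(eigenstrat_geno_bytes(n_snps, n_inds) / 1e9, "GB unpacked")
print(packed_geno_bytes(n_snps, n_inds) / 1e9, "GB packed")
```

If that is what is happening, writing the merged output in a packed format (for example `outputformat: PACKEDANCESTRYMAP` in the convertf parameter file) should bring the size back down, assuming the downstream tools accept it.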
Reply

