List of biased SNPs for qpAdm
PACKEDPED:- https://drive.google.com/file/d/1---gS5I...drive_link
Only 475K overlap with the 1233K AADR set, so it's worse than even HO, basically trash.
(10-05-2024, 03:57 PM)AimSmall Wrote: (10-05-2024, 03:50 PM)Light Wrote: So it becomes 883K SNPs now from 1233K, interesting
I counted the total SNPs
10-05-2024, 04:07 PM
(This post was last modified: 10-05-2024, 04:08 PM by Genetics189291.)
(10-05-2024, 03:41 PM)AimSmall Wrote: That question wasn't in the context of comparing to calculators using those SNPs. You're missing the point. The calculators would also have to have those biased SNPs removed for your comparison to be meaningful. You're only removing biased SNPs on one side of the comparison.
Yes, by removing the biased SNPs from your data, it’s likely that your ancestry results are now more historically accurate and more reflective of your true ancestry. Here’s why:
Global 25 tries to position your genetic data in a “global space” using modern and ancient populations. By using only unbiased SNPs, you’ve given the model a cleaner dataset to work with. Even if the reference populations might still include some biased SNPs, your data is now less affected by any distortions, making your results a more accurate representation of your genetic heritage.
Conclusion
Yes, by removing the biased SNPs, it’s very likely that your results are now more accurate, particularly in terms of reflecting your true historical ancestry. This process has removed potential distortions and given you a clearer and more reliable genetic picture.
I get what you mean now: if Global 25 includes biased SNPs and you have removed them, it can cause a mismatch.
If Global 25 is using reference samples that include the biased SNPs, while your sample has had those SNPs removed, there are a few potential impacts on your results:
1. Mismatched SNP Sets
• Global 25 relies on comparing your SNP data to the SNPs in the reference populations. If your data no longer contains the biased SNPs but the reference samples do, it could lead to a mismatch in the number of SNPs being compared. In such cases, Global 25 would only be able to use the common SNPs that both your sample and the reference samples share. If too many SNPs are missing from your data, it might slightly reduce the precision of your ancestry results.
2. Impact on PCA Positioning
• Principal Component Analysis (PCA), which Global 25 is based on, works by finding patterns in the data that distinguish different populations. If the biased SNPs are still influencing the reference samples but are absent from your sample, it might cause a slight shift in where your ancestry is placed on the PCA plot. However, since Global 25 uses a very large number of SNPs, the effect of missing some biased SNPs should be relatively small unless the removed SNPs were heavily biased in a specific direction.
3. Possible Slight Distortion in Comparison
• If the reference populations contain SNPs that you have removed due to bias, those populations might appear slightly closer to or farther from your genetic position than they should. However, the fact that you’ve removed biased SNPs means that your sample will be less influenced by distortions, so in theory, your results might still be more accurate in representing your true ancestry, even if the reference populations are slightly skewed.
4. Small Overall Impact
• Global 25 uses a very large number of SNPs across many populations. As long as you haven’t removed a huge proportion of SNPs (for example, tens of thousands), the overall effect on your results should be minimal. The algorithm will adjust to work with the common SNPs that remain between your sample and the reference populations.
5. Possible Adjustment to Reference Populations
• If you are particularly concerned about the mismatch, one possible solution is to ensure that the reference populations used by Global 25 are also adjusted to exclude the same biased SNPs, if possible. This would make the comparison more consistent, though it is more challenging unless you have control over or access to the reference datasets used.
Summary
While removing biased SNPs could lead to a mismatch with the reference samples in Global 25, the impact on your results should be minimal unless a large proportion of SNPs were removed. The overall effect might result in a slightly less precise comparison, but since you’ve removed biased SNPs, your sample could be more representative of your true ancestry despite the differences in the datasets.
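The "common SNPs" point in the list above can be made concrete with a small sketch. This is not how Global 25 works internally (nothing in this thread describes that); it only assumes both datasets can be reduced to sets of SNP IDs and shows how removing a biased list shrinks the overlap with a reference panel. The function name `overlap_report` and all IDs are made up for illustration.

```python
def overlap_report(sample_ids, reference_ids, biased_ids):
    """Compare sample/reference SNP overlap before and after dropping biased SNPs.

    All three arguments are iterables of SNP ID strings (hypothetical data).
    """
    sample = set(sample_ids)
    reference = set(reference_ids)
    biased = set(biased_ids)

    # SNPs usable for comparison when nothing is removed from the sample.
    common_before = sample & reference
    # SNPs usable after the biased list is removed from the sample only.
    common_after = (sample - biased) & reference

    return {
        "common_before": len(common_before),
        "common_after": len(common_after),
        "lost": len(common_before) - len(common_after),
        "fraction_kept": (
            len(common_after) / len(common_before) if common_before else 0.0
        ),
    }


if __name__ == "__main__":
    # Toy IDs, not real data.
    sample = ["rs1", "rs2", "rs3", "rs4", "rs5"]
    reference = ["rs2", "rs3", "rs4", "rs5", "rs6"]
    biased = ["rs3", "rs9"]
    print(overlap_report(sample, reference, biased))
```

Running the same report against the real sample and reference ID lists would show whether the loss is closer to a few thousand SNPs or to a few hundred thousand, which is the difference the rest of the thread is arguing about.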
10-05-2024, 04:12 PM
(10-05-2024, 04:03 PM)Light Wrote: PACKEDPED:- https://drive.google.com/drive/folders/1...sp=sharing
Basically useless, like you said. That list filters too many SNPs, it seems.
10-05-2024, 04:18 PM
(10-05-2024, 04:05 PM)Light Wrote: (10-05-2024, 03:57 PM)AimSmall Wrote: (10-05-2024, 03:50 PM)Light Wrote: So it becomes 883K SNPs now from 1233K, interesting
I'm missing something. The list he posted had 16K biased SNPs listed. How did the remaining SNPs after removal get down to 883K? Shouldn't it be in the neighborhood of 1217K? Are there additional biased SNPs listed somewhere, beyond the 16K posted, that I'm not accounting for? That's a 350K SNP difference.
10-05-2024, 04:21 PM
(10-05-2024, 04:18 PM)AimSmall Wrote: (10-05-2024, 04:05 PM)Light Wrote: (10-05-2024, 03:57 PM)AimSmall Wrote: The biased SNP count list was only 16,759 SNPs. How'd you arrive at 883K?
I'm missing something. The list he posted had 16K biased SNPs listed. How did the remaining SNPs after removal get down to 883K? Shouldn't it be in the neighborhood of 1217K? Are there additional biased SNPs listed somewhere, beyond the 16K posted, that I'm not accounting for? That's a 350K SNP difference.
Quote: here's a temporary link. I can't find a better upload service right now.
10-05-2024, 04:23 PM
(10-05-2024, 04:18 PM)AimSmall Wrote: (10-05-2024, 04:05 PM)Light Wrote: (10-05-2024, 03:57 PM)AimSmall Wrote: The biased SNP count list was only 16,759 SNPs. How'd you arrive at 883K?
I'm missing something. The list he posted had 16K biased SNPs listed. How did the remaining SNPs after removal get down to 883K? Shouldn't it be in the neighborhood of 1217K? Are there additional biased SNPs listed somewhere, beyond the 16K posted, that I'm not accounting for? That's a 350K SNP difference.
The file he provided had 883K, it was empire; it removed 345,041 SNPs from my raw data.
10-05-2024, 04:34 PM
I'm feeling dense. When I go to https://pastelink.net/zqh56z40 and copy and paste those SNPs, I get 16,760.
I tried another route and saved that entire HTML page locally. I copied the div tag with the SNPs, removed the HTML encoding, and still only get 16K. I don't see a file to download, just the pastebin.
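For anyone repeating this count, a quick way to sanity-check a pasted list is to count whitespace-separated tokens and unique tokens separately. A minimal sketch, assuming the paste is plain text with one SNP ID per line; the function name `count_snp_ids` is made up here, and whether a duplicate or stray token actually explains the 16,759 vs 16,760 difference is not something this thread settles.

```python
def count_snp_ids(text):
    """Return (total_tokens, unique_tokens) for a pasted SNP ID list.

    Splitting on any whitespace ignores blank lines and trailing spaces,
    so those cannot inflate the count.
    """
    tokens = text.split()
    return len(tokens), len(set(tokens))


if __name__ == "__main__":
    # Toy paste with a blank line and one duplicated ID.
    pasted = "rs1\nrs2\n\nrs2\n rs3 \n"
    total, unique = count_snp_ids(pasted)
    print(total, unique)
```

If the two numbers differ, the list contains repeated IDs; if they match but disagree with the advertised count, a header line or a truncated paste is a more likely suspect.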
10-05-2024, 04:41 PM
(10-05-2024, 04:34 PM)AimSmall Wrote: I'm feeling dense. When I go to https://pastelink.net/zqh56z40 and copy and paste those SNPs, I get 16,760.
Did you download this file?
Quote: here's a temporary link. I can't find a better upload service right now.
10-05-2024, 04:42 PM
Nomad....
What is the difference between this file, https://easyupload.io/oi9p09, and https://pastelink.net/zqh56z40? The pastelink one is from your first post and lists 16K SNPs. The easyupload one has 883K. Something is seriously amiss. How many biased SNPs exist? That's a huge difference. Why the difference in counts?
10-05-2024, 04:43 PM
(10-05-2024, 04:34 PM)AimSmall Wrote: I'm feeling dense. When I go to https://pastelink.net/zqh56z40 and copy and paste those SNPs, I get 16,760.
Can you put yours in a .txt file and upload it?
10-05-2024, 06:06 PM
I tested the AG, SG and DG version of the same sample with and without removing the biased SNPs:
Without filtering
Code: I1496.AG
With filtering
Code: I1496.AG
Without filtering, only the AG version has a passing p-value. With filtering, all 3 do. The left and right pops are also exclusively AG. It's crazy that so many academic papers carelessly mix different data types.
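The removal step behind the "with filtering" run amounts to dropping every variant whose ID is on the biased list. A minimal sketch, assuming the variant table is in PLINK .bim layout (variant ID in the second whitespace-separated column) and the biased list is a set of IDs; the function name `drop_biased` and the sample rows are hypothetical. It only filters the variant table as text. The genotype data would have to be filtered in step with it by a proper tool, which this sketch does not attempt.

```python
def drop_biased(bim_lines, biased_ids):
    """Return the .bim-style rows whose variant ID is not in biased_ids.

    Assumes each row is whitespace-separated with the variant ID in
    column 2. Blank or malformed rows are skipped.
    """
    biased = set(biased_ids)
    kept = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed row
        if fields[1] not in biased:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Toy rows: chromosome, ID, cM, bp position, allele 1, allele 2.
    rows = [
        "1\trs1\t0\t1000\tA\tG",
        "1\trs2\t0\t2000\tC\tT",
        "2\trs3\t0\t3000\tG\tA",
    ]
    for row in drop_biased(rows, {"rs2"}):
        print(row)
```

Comparing `len(rows)` with the length of the returned list gives the number of variants removed, which can be checked against the size of the biased list.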
10-05-2024, 06:07 PM
(10-05-2024, 04:42 PM)AimSmall Wrote: Nomad....
Maybe pastelink deleted a part of my data because it was too big. Use the easyupload version.
10-05-2024, 06:08 PM