Welcome to the World Haplogroup & Haplo-I Subclade Predictor. Please don't look at the mess - we're still under construction - don't be surprised if the furniture gets moved around during the next few weeks. The movers seem to be taking their time and work at odd hours so I'll be sure to leave a light on for you. Besides these few inconveniences... come on in, we're open for visitors!

Let's start off by saying that Cullen's Predictor in no way replaces Whit Athey's Predictor, a program I still consider a marvel of applied statistics. I highly recommend Whit's Predictor and there's much to be learned by comparing the results obtained from both; we have two different predictors with slightly different goals and methods. To try out Whit's program for yourself, please follow this link to Whit Athey's Haplogroup Predictor.

The World Haplogroup & Haplo-I Subclade Predictor works on a Bootstrap WGD (weighted genetic distance) algorithm, a variation of a goodness-of-fit test, intended to narrow the trade-off between the size of the database, the complexity of the algorithm, and the accuracy of the prediction. In basic terms, the Predictor draws a large number of random samples of markers from the entered haplotype and predicts, for each sample, which modal haplotype best describes it. Each modal haplotype is then rated, as a percentage, by its ability to best describe the sampled markers across the trials.

Whit's Bayesian method is surprisingly accurate in predicting amongst 20-some haplogroups with only a few markers. My hope is to extend that to 100-some haplogroups / subhaplogroups and subclades without sacrificing too much in the way of accuracy. Currently I am utilising 86 modal haplotypes (Y-STR-37 markers) representing the world's haplogroups / subhaplogroups, and 56 modal haplotypes (Y-STR-67 markers) of Haplo-I subclades, for a total of 142 modal haplotypes used for comparison to your own Y-STR signature.
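
For readers who think better in code, here is a rough sketch of the bootstrap idea described above. It is not the Predictor's actual routine: the function names, the equal weights, the distance measure (a weighted sum of absolute repeat-count differences), and the "share of trials won" rating are all assumptions made for illustration, and the 6-marker modal values are made up.

```javascript
// Hypothetical sketch of a bootstrap weighted-genetic-distance (WGD) predictor.
// Weights, distance measure, and sampling scheme are assumptions for
// illustration, not the Predictor's real code.

// Small seedable PRNG (mulberry32) so this example is repeatable.
function makeRng(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Assumed distance: weighted sum of absolute differences over sampled markers.
function weightedDistance(haplotype, modal, weights, sampleIdx) {
  let d = 0;
  for (const i of sampleIdx) {
    d += weights[i] * Math.abs(haplotype[i] - modal[i]);
  }
  return d;
}

function bootstrapPredict(haplotype, modals, weights, trials, rng) {
  const n = haplotype.length;
  const wins = {};
  for (const name of Object.keys(modals)) wins[name] = 0;

  for (let t = 0; t < trials; t++) {
    // Resample marker positions with replacement.
    const sampleIdx = [];
    for (let k = 0; k < n; k++) sampleIdx.push(Math.floor(rng() * n));

    // Find the modal haplotype closest to this sample (first one wins ties).
    let best = null;
    let bestD = Infinity;
    for (const [name, modal] of Object.entries(modals)) {
      const d = weightedDistance(haplotype, modal, weights, sampleIdx);
      if (d < bestD) {
        bestD = d;
        best = name;
      }
    }
    wins[best] += 1;
  }

  // Rate each modal by the percentage of trials in which it was the best fit.
  const percent = {};
  for (const name of Object.keys(wins)) {
    percent[name] = (100 * wins[name]) / trials;
  }
  return percent;
}

// Made-up 6-marker values for illustration only (not real modal haplotypes).
const exampleModals = {
  "Modal-A": [13, 24, 14, 11, 11, 14],
  "Modal-B": [14, 23, 15, 10, 13, 14],
  "Modal-C": [13, 22, 14, 10, 13, 14],
};
const exampleWeights = [1, 1, 1, 1, 1, 1];
const exampleHaplotype = [13, 24, 14, 11, 11, 14]; // identical to Modal-A

const result = bootstrapPredict(
  exampleHaplotype, exampleModals, exampleWeights, 500, makeRng(1)
);
console.log(result);
```

Because the example haplotype matches Modal-A exactly, Modal-A is the best fit in every trial. With a real haplotype that sits between several modals, the percentages would be split and would shift a little from run to run.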

Updates: Beta-Version 0.93a
After finishing the recent internal modifications, the tweaks, and the addition of several modal sets, as well as addressing several issues, I'm calling this a minor revision, 0.93a. This will probably be the last version posted (except for minor touches) while the core routine is rewritten. The 'Corbett-Q' module will also be added during the rewrite. By the time Beta-Version 0.95 is released, all modals should be in full 67-marker format. The core routine must be rewritten because of timeout constraints on the JavaScript programming. As more modals are added, the number of trial runs must be reduced to avoid script timeouts. Fewer trial runs result in more variable prediction percentages from the random bootstrap prediction algorithm. Tolerable margins for prediction variability are now very difficult to maintain with 142 modal sets, and the problem will have to be addressed before I can contemplate taking all modal sets to full 67-marker format.
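
To put a rough number on why fewer trial runs make the percentages jumpier, here is a small sketch. It assumes the trials are independent and treats a modal's win percentage as a simple binomial proportion, which is a simplification of mine and not something taken from the Predictor's code; the function name, the trial counts, and the 0.6 win share are arbitrary illustrative choices.

```javascript
// Approximate standard error, in percentage points, of a win percentage
// estimated from independent bootstrap trials (binomial approximation).
// Hypothetical helper for illustration only.
function percentStdError(trueShare, trials) {
  return 100 * Math.sqrt((trueShare * (1 - trueShare)) / trials);
}

// Arbitrary trial counts; 0.6 is an arbitrary illustrative win share.
const trialCounts = [2000, 1000, 500, 250];
const spreadByTrials = trialCounts.map((n) => ({
  trials: n,
  stdErrorPct: percentStdError(0.6, n),
}));
console.log(spreadByTrials);
```

Under this approximation the spread grows with the inverse square root of the trial count, so cutting the trials to a quarter roughly doubles the run-to-run wobble in a reported percentage.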

World Haplogroup & Haplo-'I' Subclade Predictor
Concept and bootstrap WGD prediction algorithm programmed in JavaScript by Jim Cullen of the Cullen Genealogy Homepage
Based on subclade modal values and geographic distributions from the research of Ken Nordtvedt.
Marker input ranges and laboratory conventions adapted from those of Family Tree DNA.

Panel 1, Panel 2, Panel 3, Panel 4
Markers Selected in Panels 1 - 3
Markers Selected in Panels 1 - 4
  Y-37 Haplogroup / Sub-Haplogroup or Y-67 Haplo-I Subclade Prediction:

Haplo-I Subclade Prediction is disabled until you perform a Haplogroup Prediction on a changed set of markers. Your predicted Haplogroup(s) and probabilities will be displayed. If the predictor indicates that your haplotype is probably haplogroup 'I', then Haplo-I Subclade Prediction will be re-enabled.
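
The enable/disable rule above can be sketched as a small state tracker. This is only an illustration: the names are hypothetical, the percentages are made up, and I am assuming that "probably haplogroup 'I'" means the top-rated haplogroup label starts with "I", which may not be the criterion the Predictor really uses.

```javascript
// Hypothetical state tracker for enabling Haplo-I subclade prediction.
// Assumption: "probably haplogroup I" = top-rated label starts with "I".
function createSubcladeGate() {
  let predictedMarkers = null; // markers used in the last haplogroup prediction
  let topHaplogroup = null;

  return {
    // Call after a haplogroup prediction finishes.
    recordPrediction(markers, percentByHaplogroup) {
      predictedMarkers = markers.join(",");
      let best = null;
      let bestPct = -Infinity;
      for (const [name, pct] of Object.entries(percentByHaplogroup)) {
        if (pct > bestPct) {
          bestPct = pct;
          best = name;
        }
      }
      topHaplogroup = best;
    },
    // Enabled only for unchanged markers whose prediction came out as "I".
    isSubcladeEnabled(currentMarkers) {
      return (
        predictedMarkers !== null &&
        predictedMarkers === currentMarkers.join(",") &&
        topHaplogroup !== null &&
        topHaplogroup.startsWith("I")
      );
    },
  };
}

const gate = createSubcladeGate();
const markersEntered = [13, 24, 14, 11];
gate.recordPrediction(markersEntered, { "I1": 82, "R1b": 18 }); // made-up percentages
console.log(gate.isSubcladeEnabled(markersEntered));   // true
console.log(gate.isSubcladeEnabled([13, 25, 14, 11])); // false: markers changed
```
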
The Haplogroup 'I' Subclade Predictor was programmed in JavaScript by Jim Cullen of the Cullen Genealogy Homepage.
All rights are reserved jointly by Jim Cullen and Ken Nordtvedt. Reproduction of this code is prohibited without prior permission.
Beta Version 0.92 - Apr 19, 2008

Custom Order Marker Entry Tool ( C.O.M.E.T. )
INSTRUCTIONS for using COMET are now located on the Help Page.
Thanks go out to John Simpson for the C.O.M.E.T. acronym.
  Your Custom-Ordered Marker List:

  Type or Paste Your Marker Data Here and click Submit:


World Haplogroup & Haplo-I Subclade Predictor: Custom Order Marker Entry Tool ( COMET ) - Version 1.2 - April 20, 2008

* * * Acknowledgments * * *

The Acknowledgments are meant to serve as credit to those who have gone out of their way to freely provide the results of their research to make this project possible. However, it must be understood that the support and assistance consisted mainly of advice, feedback, and quality data in the form of modal haplotypes. Any questions, comments, or concerns regarding method, user interface, performance, or accuracy should be directed to me, as the one who alone conceived and hand-coded the algorithm that performs the predictions.

I gratefully acknowledge the support and technical assistance of the numerous individuals without whom this work would not have been possible: the tireless compilers, organizers, and researchers. I would also like to recognize the advice, support, and feedback provided by the members of the genetic genealogy community as a whole; these are the fine people who have made it all worthwhile. In particular, I would like to give credit and sincere thanks to the following individuals for their generous contributions to the Haplogroup & Haplo-I Subclade Predictor:

  • Ken Nordtvedt: Perhaps best known for his research into population varieties within Y-Haplogroup 'I' and their Extended Modal Haplotypes. These modal haplotypes, found on his web site, were the basis for the Haplogroup and Haplo-I Subclade Predictor.

  • Bonnie Schrack: Along with Jeff Schweitzer, Group Administrator of FTDNA's Y-Haplogroup J DNA Project. Thanks to both for their work with Haplo-J modal haplotypes.

  • Charles Kerchner: Administrator of FTDNA's Kerchner's Y-DNA R1b and Subclades Haplogroup Project, Kerchner's R1b1c10 (U152+) Project, geographical projects, numerous surname projects, and other online groups; he was also one of the co-founders of ISOGG. His research and expertise in Haplogroup R1b are invaluable.

  • E3b Project: Thanks go out to the administrators of FTDNA's E3b Y-DNA Project for their help with the E3b modal haplotypes.

  • Alfred A. Aburto Jr.: Many thanks also to Alfred Aburto for his advice and impressive data collection on haplotypes within Haplogroup-J.

  • Dennis Wright: In recognition of his advice, feedback, and helpful data on Haplogroup R1b. Dennis runs the Irish Type III Website.

  • David Weston, Dan Draghici, Mike Maddi, and Gary Corbett: In recognition of their efforts with the YDNA Haplogroup R1b-U106/S21 Research Group. S21 defines what is currently known as the R1b1c9 branch of the R1b-Tree. Thanks also to Christopher Meek, who runs the S29 Y-DNA Project. S29 defines the current R1b1c9b subclade.

  • Rebekah Canada & The yDNA Haplogroup Q Project: Modal haplotypes used by the Predictor for Haplo-Q are found at The yDNA Haplogroup Q Project, a very clear, well-organized, and very informative haplogroup project website. Clusters previously used in the Predictor had already been defined by the Haplo-Q Project and so their modal haplotypes have taken the place of mine. Thanks again to Rebekah for her help. See also the yDNA Haplogroup Q Project hosted by FTDNA.

  • John McEwan: I acknowledge the work John has done on STR and SNP analysis of many of the world's haplogroups. His modal haplotypes have been used in the predictor to cover those groups that have not been covered by anyone else. John's webpage is the Dal Riadic Migration Y Chromosome DNA Genealogy Page.

  • Victor Villarreal: For additional assistance with Haplogroup E3b from his website at Haplozone: E3b Project. If you haven't visited his site - you need to go there now and take a look around. Just about every haplogroup project could learn something from the Haplozone E3b website. Organization and presentation of data are absolutely key - and Haplozone is the best example I've ever seen.

  • Robert Tarin: Additional E3b material, including a paper on an Iberian Subcluster of E3b from 2005.

  • Rebekah Canada & Johan Swaerdenheim: Thanks to FTDNA's Scandinavian Y-DNA Project for the modal haplotypes within Haplogroup-N.

  • Vincent Vizachero, Ellen Levy, & John Simpson: For their assistance and advice on Haplogroup R2.

