We consider repeats with motif lengths of 1 to 6 bp. For example, AAAAA, ACACACACAC, CAGCAGCAGCAG, TATCTATCTATCTATCTATC, TTTTGTTTTGTTTTGTTTTG, and AAAAACAAAAACAAAAAC would all fall under this definition.
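As an illustration of this definition, the following minimal Python sketch finds the shortest repeating unit of a perfect repeat and checks that its length is between 1 and 6 bp. The function names `smallest_motif` and `fits_str_definition` and the minimum of two motif copies are assumptions made for this example; this is not how the genotyping tools behind WebSTR identify repeats, and it does not handle imperfect repeats or partial motif copies.

```python
def smallest_motif(seq: str) -> str:
    """Return the shortest unit that, repeated a whole number of times, equals seq."""
    n = len(seq)
    for k in range(1, n + 1):
        # k must divide the sequence length for a whole number of copies
        if n % k == 0 and seq[:k] * (n // k) == seq:
            return seq[:k]
    return seq  # only reached for an empty string


def fits_str_definition(seq: str, max_motif_len: int = 6, min_copies: int = 2) -> bool:
    """Check that seq is a perfect repeat whose motif is 1 to max_motif_len bp long.

    min_copies is an assumption for this sketch, not a WebSTR parameter.
    """
    if not seq:
        return False
    motif = smallest_motif(seq)
    copies = len(seq) // len(motif)
    return len(motif) <= max_motif_len and copies >= min_copies


examples = [
    "AAAAA",
    "ACACACACAC",
    "CAGCAGCAGCAG",
    "TATCTATCTATCTATCTATC",
    "TTTTGTTTTGTTTTGTTTTG",
    "AAAAACAAAAACAAAAAC",
]
for seq in examples:
    motif = smallest_motif(seq)
    print(seq, motif, len(motif), fits_str_definition(seq))
```

Run on the six sequences listed above, the sketch reports motifs of length 1 through 6 (A, AC, CAG, TATC, TTTTG, AAAAAC), each of which passes the check.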
We have helped develop HipSTR and GangSTR for genome-wide genotyping of STRs from next-generation sequencing data. Earlier work (mutation rates and constraint) is based on genotypes obtained using lobSTR. More recent work (GTEx eSTRs, imputation results) is based on genotypes obtained using HipSTR and GangSTR. A recent dataset, EnsembleTR, also incorporates other STR callers, such as Illumina's ExpansionHunter.
We are aiming to create the most comprehensive genome-wide dataset of STRs, so if the locus you are searching for has been previously described in the literature, it should be available on WebSTR. However, WebSTR is currently split into several subsets (STR panels) that use different methods both to identify repeats in the reference genome and to genotype repeat variation. Because of the statistical nature of these pipelines and the filters applied at different stages of the analysis (for instance, excluding overlapping repeats), an important locus may have been filtered out. We aim to add important missing loci manually, so please contact us if you find such a case.
We currently report only one motif for such STRs: a consensus motif based on the reference genome. This limitation in WebSTR is due to differences in how repeat motifs are handled by different genotyping tools. WebSTR reports STRs genotyped by many different tools, so for now we have decided to simplify this aspect of reporting, but we plan to improve it in the future.
Yes. If allele frequency statistics are available, they are displayed on the locus-level page for several cohorts of European, African, and East Asian ancestry.
We would love to include your dataset! Contact mgymrek AT ucsd DOT edu to discuss adding summary-level STR statistics or allele frequencies for a different cohort to the site.
Yes, we provide all the necessary data files on request if you would like to do so. The corresponding instructions can be found on GitHub for the frontend and backend tiers of WebSTR.