We consider repeats with motif lengths of 1 to 6 bp. For example, AAAAA, ACACACACAC, CAGCAGCAGCAG, TATCTATCTATCTATCTATC, TTTTGTTTTGTTTTGTTTTG, and AAAAACAAAAACAAAAAC all fall under this definition. The set of STRs used by WebSTR was obtained from the HipSTR hg19 reference STR file.
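To make the motif-length definition concrete, here is a minimal Python sketch that reports the shortest motif of 1 to 6 bp whose tandem copies spell out a sequence. The function name `smallest_motif` and the requirement of at least two copies are assumptions made for illustration. It only recognizes perfect repeats and is not the procedure used to build the HipSTR reference.

```python
def smallest_motif(seq, max_motif_len=6, min_copies=2):
    """Return the shortest motif (1..max_motif_len bp) that tiles seq exactly,
    or None if seq is not a perfect tandem repeat under these assumptions."""
    seq = seq.upper()
    for k in range(1, max_motif_len + 1):
        copies, remainder = divmod(len(seq), k)
        # The motif must tile the sequence exactly and occur at least min_copies times.
        if remainder == 0 and copies >= min_copies and seq == seq[:k] * copies:
            return seq[:k]
    return None


# The example sequences from the definition above.
examples = [
    "AAAAA",
    "ACACACACAC",
    "CAGCAGCAGCAG",
    "TATCTATCTATCTATCTATC",
    "TTTTGTTTTGTTTTGTTTTG",
    "AAAAACAAAAACAAAAAC",
]

for s in examples:
    motif = smallest_motif(s)
    print(f"{s}: motif={motif}, motif length={len(motif)} bp, copies={len(s) // len(motif)}")
```

Under these assumptions the six examples resolve to motifs of length 1 through 6 bp respectively (A, AC, CAG, TATC, TTTTG, AAAAAC).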
We have helped develop lobSTR and HipSTR for genome-wide genotyping of STRs from next-generation sequencing data. Earlier work (mutation rates and constraint) is based on genotypes obtained using lobSTR. More recent work (GTEx eSTRs, imputation results) is based on genotypes obtained using HipSTR.
Not yet, but this is coming soon! We plan to display allele frequency information for several cohorts of European, African, and East Asian ancestry. Stay tuned!
We would love to include your dataset! Contact mgymrek AT ucsd DOT edu to discuss adding summary-level STR statistics or allele frequencies for a different cohort to the site.