P-clouds
2013 - 2016
P-clouds is a method for identifying transposable elements in genomes and for estimating the false positive and false negative rates of that identification. While using the program, I found a number of critical bugs that led to incorrect results, and I tracked down and fixed each one as it was found. I then developed several new identification methods based on similar ideas from P-clouds and tested their efficacy against older methods.
Associated publication:
- Gu, Wanjun, et al. "Identification of repeat structure in large genomes using repeat probability clouds." Analytical Biochemistry 380.1 (2008): 77-83.
Tools: C, Perl, R, Git
Repository: https://github.com/PollockLaboratory/pclouds