Pages

Genome-wide Manhattan Plots in STATA

I have used this chunk of code on numerous occasions to plot GWAS data, so I thought I'd share!
twoway scatter logpval absposMB, msize(tiny) ysize(2) xsize(7) xaxis(1 2) yline(16.881438 2.9957323, lcolor(red))
xline(245.522847 488.541076 688.046816 879.458034 1060.3159 1231.291599 1389.919738 1536.194564 1674.623832 1810.03746 1944.489844 2076.939655 2191.082635 2297.45122 2397.790135 2486.617389 2565.392131 2641.509284 2705.320935 2767.756899 2814.701222 2864.255932, lpattern(dash) lcolor(gs12))
xlabel(122 "1" 367 "2" 588 "3" 783 "4" 969 "5" 1145 "6" 1310 "7" 1463 "8" 1605 "9" 1742 "10" 1877 "11" 2010 "12" 2134
"13" 2244 "14" 2347 "15" 2442 "16" 2526 "17" 2603 "18" 2673 "19" 2736 "20" 2791 "21" 2839 "22" 2941 "X", axis(2))
xlabel( 0(50)3000, axis(1) alternate) subtitle(,bcolor(white) ring(30) orientation(vertical) position(9) nobexpand) ytitle("-Log p-value") xtitle("Physical Distance (Mb)", axis(1))
The variables needed are a log p-value (or some other statistic) and the absolute genomic position of each SNP (distance from the beginning of chromosome 1). If you need the offsets to compute this absolute position, they are listed in MB in the xline(---) portion of the plot. It makes something like this:



Enjoy!