<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-1440516860833895494</id><updated>2012-01-24T09:48:59.740-08:00</updated><category term='string'/><category term='boxplot'/><category term='point pch'/><category term='regression'/><category term='histogram'/><category term='nnet tree'/><category term='iom'/><category term='bagging boosting SVM'/><category term='pairs'/><category term='unix'/><category term='color'/><category term='html'/><category term='symbol'/><category term='layout'/><category term='image'/><category term='axis'/><category term='general'/><category term='regression nonlinear gam'/><category term='figure'/><category term='3dplot'/><title type='text'>Experience with R</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>33</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-6283233650547979392</id><published>2010-09-03T17:44:00.000-07:00</published><updated>2010-09-03T17:44:20.297-07:00</updated><title type='text'>Venn Diagram</title><content type='html'>Method 1:&lt;br /&gt;A=LETTERS[1:15];B=LETTERS[5:20]&lt;br /&gt;C=LETTERS[12:18];D=LETTERS[7:17]&lt;br /&gt;All = unique(c(A, B, C, D))&lt;br /&gt;Distribution = matrix(0, ncol=4, nrow=length(All));&lt;br /&gt;colnames(Distribution)&amp;lt;-c('A','B','C','D');&lt;br /&gt;rownames(Distribution)&amp;lt;-All;&lt;br /&gt;Distribution[A, 1] = 1;&lt;br /&gt;Distribution[B, 2] = 1;&lt;br /&gt;Distribution[C, 3] = 1;&lt;br /&gt;Distribution[D, 4] = 1;&lt;br /&gt;library(limma);&lt;br /&gt;par(mfrow=c(2, 2))&lt;br /&gt;a &amp;lt;- vennCounts(Distribution[ ,c(1, 3,4)]); vennDiagram(a, main="", mar=c(0.5, 0.5, 0.5, 0.5), cex=0.9, lwd=1);&lt;br /&gt;a &amp;lt;- vennCounts(Distribution[ ,c(1, 2,4)]); vennDiagram(a, main="", mar=c(0.5, 0.5, 0.5, 0.5), cex=0.9, lwd=1);&lt;br /&gt;a &amp;lt;- vennCounts(Distribution[ ,c(2, 3,4)]); vennDiagram(a, main="", mar=c(0.5, 0.5, 0.5, 0.5), cex=0.9, lwd=1);&lt;br /&gt;&lt;br /&gt;#########################&lt;br /&gt;Method 2:&lt;br /&gt;&lt;pre class="prettyprint"&gt;&lt;code&gt;&lt;span class="pln"&gt;circle &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="kwd"&gt;function&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;x&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; 
y&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; r&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="pun"&gt;...)&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="pun"&gt;{&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; ang &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; seq&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;2&lt;/span&gt;&lt;span class="pun"&gt;*&lt;/span&gt;&lt;span class="pln"&gt;pi&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; length &lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;100&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; xx &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; x &lt;/span&gt;&lt;span class="pun"&gt;+&lt;/span&gt;&lt;span class="pln"&gt; r &lt;/span&gt;&lt;span class="pun"&gt;*&lt;/span&gt;&lt;span class="pln"&gt; cos&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;ang&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; yy &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; y &lt;/span&gt;&lt;span class="pun"&gt;+&lt;/span&gt;&lt;span class="pln"&gt; r &lt;/span&gt;&lt;span class="pun"&gt;*&lt;/span&gt;&lt;span class="pln"&gt; sin&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;ang&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; polygon&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;xx&lt;/span&gt;&lt;span 
class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; yy&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="pun"&gt;...)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span class="pun"&gt;}&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;br /&gt;venndia &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="kwd"&gt;function&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; B&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; C&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; getdata&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;FALSE&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="pun"&gt;...){&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; cMissing &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; missing&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;C&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;&lt;span class="kwd"&gt;if&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;cMissing&lt;/span&gt;&lt;span class="pun"&gt;){&lt;/span&gt;&lt;span class="pln"&gt; C &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; c&lt;/span&gt;&lt;span class="pun"&gt;()&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="pun"&gt;}&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; unionAB &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="kwd"&gt;union&lt;/span&gt;&lt;span 
class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; B&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; unionAC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="kwd"&gt;union&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; C&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; unionBC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="kwd"&gt;union&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;B&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; C&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; uniqueA &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; setdiff&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; unionBC&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; uniqueB &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; setdiff&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;B&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; unionAC&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; uniqueC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; setdiff&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;C&lt;/span&gt;&lt;span 
class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; unionAB&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; intersAB &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; setdiff&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;intersect&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; B&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; C&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; intersAC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; setdiff&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;intersect&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; C&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; B&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; intersBC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; setdiff&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;intersect&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;B&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; C&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; A&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; intersABC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; intersect&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;intersect&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span 
class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; B&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; intersect&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;B&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; C&lt;/span&gt;&lt;span class="pun"&gt;))&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; nA &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; length&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;uniqueA&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;br /&gt;&amp;nbsp; &amp;nbsp; nB &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; length&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;uniqueB&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; nC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; length&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;uniqueC&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; nAB &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; length&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;intersAB&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; nAC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; length&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;intersAC&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; nBC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span 
class="pln"&gt; length&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;intersBC&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; nABC &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; length&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;intersABC&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt; &amp;nbsp; &lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; par&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;mar&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;c&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;2&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;2&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;))&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; plot&lt;/span&gt;&lt;span class="pun"&gt;(-&lt;/span&gt;&lt;span class="lit"&gt;10&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="pun"&gt;-&lt;/span&gt;&lt;span class="lit"&gt;10&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; ylim&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;c&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;9&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; xlim&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;c&lt;/span&gt;&lt;span 
class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;9&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; axes&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;FALSE&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="pun"&gt;...)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; circle&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;x&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;3&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; y&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;6&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; r&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;3&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; col&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;rgb&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;1&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,.&lt;/span&gt;&lt;span class="lit"&gt;5&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; border&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;NA&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; circle&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;x&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;6&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; y&lt;/span&gt;&lt;span 
class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;6&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; r&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;3&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; col&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;rgb&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,.&lt;/span&gt;&lt;span class="lit"&gt;5&lt;/span&gt;&lt;span class="pun"&gt;,.&lt;/span&gt;&lt;span class="lit"&gt;1&lt;/span&gt;&lt;span class="pun"&gt;,.&lt;/span&gt;&lt;span class="lit"&gt;5&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; border&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;NA&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; circle&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;x&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;4.5&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; y&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;3&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; r&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;3&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; col&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;rgb&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="lit"&gt;0&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="lit"&gt;1&lt;/span&gt;&lt;span class="pun"&gt;,.&lt;/span&gt;&lt;span class="lit"&gt;5&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; border&lt;/span&gt;&lt;span 
class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;NA&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; text&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt; x&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;c&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;1.2&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;7.7&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;4.5&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; y&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;c&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;7.8&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;7.8&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;0.8&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; c&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="str"&gt;"A"&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="str"&gt;"B"&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="str"&gt;"C"&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; cex&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;3&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; col&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="str"&gt;"gray90"&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; text&lt;/span&gt;&lt;span 
class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; x&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;c&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;2&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;7&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;4.5&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;4.5&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;3&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;6&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;4.5&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; &lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; y&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;c&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="lit"&gt;7&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;7&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;2&lt;/span&gt;&lt;span class="pln"&gt; &amp;nbsp;&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;7&lt;/span&gt;&lt;span class="pln"&gt; &amp;nbsp;&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;4&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;/span&gt;&lt;span class="lit"&gt;4&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; 
&lt;/span&gt;&lt;span class="lit"&gt;5&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; &lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; c&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;nA&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; nB&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; nC&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; nAB&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; nAC&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; nBC&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; nABC&lt;/span&gt;&lt;span class="pun"&gt;),&lt;/span&gt;&lt;span class="pln"&gt; &lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; cex&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="lit"&gt;2&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;&lt;span class="kwd"&gt;if&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;getdata&lt;/span&gt;&lt;span class="pun"&gt;){&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; list&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;uniqueA&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; B&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;uniqueB&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; C&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;uniqueC&lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; AB&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;intersAB &lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; AC&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;intersAC &lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; BC&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;intersBC &lt;/span&gt;&lt;span class="pun"&gt;,&lt;/span&gt;&lt;span class="pln"&gt; &lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ABC&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;intersABC&lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;&lt;span class="pun"&gt;}&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span class="pun"&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;pre class="prettyprint"&gt;&lt;code&gt;&lt;span class="pun"&gt;&amp;nbsp;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;pre class="prettyprint"&gt;&lt;code&gt;&lt;span class="pun"&gt;&amp;nbsp;&lt;/span&gt;&lt;/code&gt;&lt;code&gt;&lt;span class="pln"&gt;venndia&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;LETTERS&lt;/span&gt;&lt;span class="pun"&gt;[&lt;/span&gt;&lt;span class="lit"&gt;1&lt;/span&gt;&lt;span class="pun"&gt;:&lt;/span&gt;&lt;span class="lit"&gt;15&lt;/span&gt;&lt;span class="pun"&gt;],&lt;/span&gt;&lt;span class="pln"&gt; B&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;LETTERS&lt;/span&gt;&lt;span class="pun"&gt;[&lt;/span&gt;&lt;span class="lit"&gt;5&lt;/span&gt;&lt;span class="pun"&gt;:&lt;/span&gt;&lt;span class="lit"&gt;20&lt;/span&gt;&lt;span class="pun"&gt;])&lt;/span&gt;&lt;span 
class="pln"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;pre class="prettyprint"&gt;&lt;code&gt;&lt;span class="pln"&gt;vd &lt;/span&gt;&lt;span class="pun"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="pln"&gt; venndia&lt;/span&gt;&lt;span class="pun"&gt;(&lt;/span&gt;&lt;span class="pln"&gt;A&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;LETTERS&lt;/span&gt;&lt;span class="pun"&gt;[&lt;/span&gt;&lt;span class="lit"&gt;1&lt;/span&gt;&lt;span class="pun"&gt;:&lt;/span&gt;&lt;span class="lit"&gt;15&lt;/span&gt;&lt;span class="pun"&gt;],&lt;/span&gt;&lt;span class="pln"&gt; B&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;LETTERS&lt;/span&gt;&lt;span class="pun"&gt;[&lt;/span&gt;&lt;span class="lit"&gt;5&lt;/span&gt;&lt;span class="pun"&gt;:&lt;/span&gt;&lt;span class="lit"&gt;20&lt;/span&gt;&lt;span class="pun"&gt;],&lt;/span&gt;&lt;span class="pln"&gt; getdata&lt;/span&gt;&lt;span class="pun"&gt;=&lt;/span&gt;&lt;span class="pln"&gt;TRUE&lt;/span&gt;&lt;span class="pun"&gt;)&lt;/span&gt;&lt;span class="pln"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;pre class="prettyprint"&gt;&lt;code&gt;&lt;span class="pln"&gt;Source:&amp;nbsp;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;pre class="prettyprint"&gt;&lt;code&gt;&lt;span class="pln"&gt;http://stackoverflow.com/questions/1428946/venn-diagrams-with-r&lt;/span&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;span class="pun"&gt;&lt;/span&gt;&lt;span class="pln"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-6283233650547979392?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/6283233650547979392/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=6283233650547979392&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/6283233650547979392'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/6283233650547979392'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2010/09/venn-diagram.html' title='Venn Diagram'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-948791302334603243</id><published>2009-12-24T21:00:00.001-08:00</published><updated>2009-12-24T21:06:12.046-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pairs'/><title type='text'>R pairs plot</title><content type='html'>RR &amp;lt;- matrix(rnorm(50), 10, 5)&lt;br /&gt;panel.hist &amp;lt;- function(x, ...)&lt;br /&gt;{ usr &amp;lt;- par("usr"); on.exit(par(usr = usr))&lt;br /&gt;par(usr = c(usr[1:2], 0, 1.5) )&lt;br /&gt;h &amp;lt;- hist(x, plot = FALSE)&lt;br /&gt;breaks &amp;lt;- h$breaks; nB &amp;lt;- length(breaks)&lt;br /&gt;y &amp;lt;- h$counts; y &amp;lt;- y/max(y)&lt;br /&gt;rect(breaks[-nB], 0, breaks[-1], y, col="cyan", ...)&lt;br /&gt;}&lt;br /&gt;panel.blank &amp;lt;- function(x, y)&lt;br /&gt;{ }&lt;br /&gt;panel.cor &amp;lt;- function(x, y, digits=2, prefix="", cex.cor)&lt;br /&gt;{&lt;br /&gt;usr &amp;lt;- par("usr"); on.exit(par(usr = usr))&lt;br /&gt;par(usr = c(0, 1, 0, 1))&lt;br /&gt;r &amp;lt;- abs(cor(x, y))&lt;br /&gt;txt &amp;lt;- format(c(r, 0.123456789), digits=digits)[1]&lt;br /&gt;txt &amp;lt;- paste(prefix, txt, sep="")&lt;br /&gt;cex &amp;lt;- if(missing(cex.cor)) 0.6/strwidth(txt) else cex.cor&lt;br 
/&gt;#text(0.5, 0.5, txt, cex = cex * r)&lt;br /&gt;text(0.5, 0.5, txt, cex = cex)&lt;br /&gt;}&lt;br /&gt;pairs(RR, lower.panel=panel.smooth, upper.panel=panel.cor, diag.panel=panel.hist)&lt;br /&gt;pairs(RR, lower.panel=panel.blank, upper.panel=panel.cor, diag.panel=panel.hist)&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_xLDEmoNB_RM/SzRINMDPNbI/AAAAAAAAFDc/N5gUwj7JHWA/s1600-h/screenshot_001.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_xLDEmoNB_RM/SzRINMDPNbI/AAAAAAAAFDc/N5gUwj7JHWA/s640/screenshot_001.png" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-948791302334603243?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/948791302334603243/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=948791302334603243&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/948791302334603243'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/948791302334603243'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2009/12/r-pairs-plot.html' title='R pairs plot'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_xLDEmoNB_RM/SzRINMDPNbI/AAAAAAAAFDc/N5gUwj7JHWA/s72-c/screenshot_001.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-8538701095079025841</id><published>2009-12-24T20:56:00.000-08:00</published><updated>2009-12-24T20:56:33.880-08:00</updated><title type='text'>R animation package</title><content type='html'>It is really interesting and cool: R animation&lt;br /&gt;&lt;br /&gt;=========================&lt;br /&gt;"Description This package consists of various functions for animations in statistics, covering many areas such as probability theory, mathematical statistics, multivariate statistics, nonparametric statistics, sampling survey, linear models, time series, computational statistics, data mining and machine learning. 
These functions might be of help in teaching statistics and data analysis."&lt;br /&gt;&lt;br /&gt;Example:&lt;br /&gt;# Animations inside R windows graphics devices &lt;br /&gt;# Bootstrapping &lt;br /&gt;library(animation) &lt;br /&gt;oopt = ani.options(interval = 0.3, nmax = 50) &lt;br /&gt;boot.iid() &lt;br /&gt;ani.options(oopt) &lt;br /&gt;&lt;br /&gt;&lt;a href="http://cran.r-project.org/web/packages/animation/animation.pdf"&gt;http://cran.r-project.org/web/packages/animation/animation.pdf&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-8538701095079025841?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/8538701095079025841/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=8538701095079025841&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/8538701095079025841'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/8538701095079025841'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2009/12/r-animation-package.html' title='R animation package'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-504844926883993187</id><published>2009-04-28T13:59:00.000-07:00</published><updated>2009-04-28T14:14:14.318-07:00</updated><title type='text'>Add a table to the Figure</title><content type='html'>library(gplots);&lt;br /&gt;# tmp3 (the table to draw) and col.data3 (its cell colours) are assumed to exist already&lt;br /&gt;png(filename = "exprs.png", width = 1024, height = 600)&lt;br /&gt;colnames(tmp3) &lt;- c("ORF", "Gene", "sch9/wt.d3", "ras2/wt.d3", "tor1/wt.d3", &lt;br /&gt;  "sch9sir2/wt.d3", "sch9sir2/sch9.d3", "CR 48/24", "CR/wt.24h", &lt;br /&gt;  "CR/wt.48h", "sir2/wt.logphase", "sir2/wt.d3", "sir2hmra/wt.d3")&lt;br /&gt;textplot(tmp3, halign="left", valign="top", cex=0.7, show.rownames = FALSE, &lt;br /&gt;  show.colnames=TRUE, col.data = col.data3, cmar = 0.9, mar = c(0, 0, 0, 0) + 0.1);&lt;br /&gt;dev.off();  # close the png device so that exprs.png is written to disk&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-504844926883993187?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/504844926883993187/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=504844926883993187&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/504844926883993187'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/504844926883993187'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2009/04/add-table-to-figure.html' title='Add a table to the Figure'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-5705755285681831281</id><published>2009-04-24T13:27:00.000-07:00</published><updated>2009-04-24T13:29:31.032-07:00</updated><title type='text'>Rwui</title><content type='html'>Rwui is a nice tool for creating a user-friendly web interface for an R script.&lt;br /&gt;Link: &lt;a href="http://rwui.cryst.bbk.ac.uk/"&gt;http://rwui.cryst.bbk.ac.uk/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Instructions:&lt;br /&gt;&lt;br /&gt;(1) You need a Tomcat installation up and running (http://tomcat.apache.org/). Test Tomcat by opening http://localhost:8080 in a browser.&lt;br /&gt;(2) Copy the file "xxx.war" to the Tomcat webapps directory, "..\Tomcat\webapps\", where Tomcat will automatically unpack and incorporate it.&lt;br /&gt;(3) Test the application by opening http://localhost:8080/xxx; by this point Tomcat should have unpacked and incorporated it.&lt;br /&gt;(4) Copy the two Rdata files into the folder "..\Tomcat\webapps\xxx\WEB-INF\".&lt;br /&gt;(5) Run the data interface at http://localhost:8080/xxx.&lt;br /&gt;&lt;br /&gt;Note: in the 'System Variables' box, scroll down, select the variable 'Path' and press 'Edit'. 
Add the path to R's bin directory, e.g. by adding something like C:\Program Files\R\R-2.1.0\bin to the ';'-separated list.&lt;br /&gt;If you run into problems, check the instructions at http://rwui.cryst.bbk.ac.uk/tutorial/Instructions.html#SECTION000161000000000000000&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-5705755285681831281?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/5705755285681831281/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=5705755285681831281&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/5705755285681831281'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/5705755285681831281'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2009/04/rwui.html' title='Rwui'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-7169453832038186866</id><published>2009-02-17T14:53:00.000-08:00</published><updated>2009-02-17T15:05:31.723-08:00</updated><title type='text'>combine redundant rows into one</title><content type='html'>&gt; exprs_133a[1:4, ]&lt;br /&gt;          Representative.Public.ID Gene.Symbol Chromosomal.Location CHP_SKN1.CEL   MDA.CEL     ratios                    HGU133.IDs..selected.4..&lt;br /&gt;1007_s_at                   U48705        DDR1            chr6p21.3    10.265215 10.554566  0.2893516 discoidin domain receptor tyrosine kinase 1&lt;br /&gt;1053_at                     M87338        RFC2           chr7q11.23     9.305431  9.463867  0.1584354 replication factor C (activator 1) 2, 40kDa&lt;br /&gt;117_at                      X51757       HSPA6              chr1q23     9.255379  9.053673 -0.2017056         heat shock 70kDa protein 6 (HSP70B)&lt;br /&gt;121_at                      X69699        PAX8          chr2q12-q14    10.405100 10.522243  0.1171425                                paired box 8&lt;br /&gt;&gt; geneID.133a &lt;- as.character(exprs_133a[ ,1]);&lt;br /&gt;&gt; length(geneID.133a)&lt;br /&gt;[1] 22283&lt;br /&gt;&gt; sum(geneID.plus2 %in% geneID.133a);&lt;br /&gt;[1] 22442&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;&gt; tmp1 &lt;- apply(exprs_133a[ ,4:6], 2, function(v) tapply(v, factor(geneID.133a), mean));&lt;br /&gt;&gt; tmp2 &lt;- apply(exprs_133a[ ,c(1, 2, 3, 7)], 2, function(v) tapply(v, factor(geneID.133a), function(v1) v1[1]));&lt;/span&gt;&lt;br /&gt;&gt; tmp1[1:3, ]&lt;br /&gt;         CHP_SKN1.CEL  MDA.CEL     ratios&lt;br /&gt;AA001552     9.414568 9.594727 0.18015902&lt;br /&gt;AA004579     8.302277 8.746708 0.44443149&lt;br /&gt;AA004757     9.328749 
9.383631 0.05488192&lt;br /&gt;&gt; tmp2[1:3, ]&lt;br /&gt;         Representative.Public.ID Gene.Symbol Chromosomal.Location HGU133.IDs..selected.4..                                                      &lt;br /&gt;AA001552 "AA001552"               "C19orf54"  "chr19q13.2"         "chromosome 19 open reading frame 54"                                         &lt;br /&gt;AA004579 "AA004579"               "TAF1B"     "chr2p25"            "TATA box binding protein (TBP)-associated factor, RNA polymerase I, B, 63kDa"&lt;br /&gt;AA004757 "AA004757"               "ZNF236"    "chr18q22-q23"       "zinc finger protein 236"                                                     &lt;br /&gt;&gt; &lt;br /&gt;&gt; exprs_133a.unique &lt;- data.frame(tmp1, tmp2);&lt;br /&gt;&gt; exprs_133a.unique[1:3, ];&lt;br /&gt;         CHP_SKN1.CEL  MDA.CEL     ratios Representative.Public.ID Gene.Symbol Chromosomal.Location&lt;br /&gt;AA001552     9.414568 9.594727 0.18015902                 AA001552    C19orf54           chr19q13.2&lt;br /&gt;AA004579     8.302277 8.746708 0.44443149                 AA004579       TAF1B              chr2p25&lt;br /&gt;AA004757     9.328749 9.383631 0.05488192                 AA004757      ZNF236         chr18q22-q23&lt;br /&gt;                                                             HGU133.IDs..selected.4..&lt;br /&gt;AA001552                                          chromosome 19 open reading frame 54&lt;br /&gt;AA004579 TATA box binding protein (TBP)-associated factor, RNA polymerase I, B, 63kDa&lt;br /&gt;AA004757                                                      zinc finger protein 236&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-7169453832038186866?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/7169453832038186866/comments/default' title='Post 
Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=7169453832038186866&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7169453832038186866'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7169453832038186866'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2009/02/combine-redundant-rows-into-one.html' title='combine redundant rows into one'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-4661206233767224944</id><published>2009-02-16T20:06:00.000-08:00</published><updated>2009-02-16T20:07:41.208-08:00</updated><title type='text'>use mtext() to label sub graph</title><content type='html'>windows(10, 4)&lt;br /&gt;ll = matrix(c(1:3), nrow=1, byrow=TRUE)&lt;br /&gt;width  = c(0.4, 0.4, 0.2);&lt;br /&gt;height = c(1);&lt;br /&gt;layout(ll, width, height);&lt;br /&gt;par(mar=c(4, 4, 2, 1));&lt;br /&gt;cols &lt;- c("green", "blue", "red")&lt;br /&gt;&lt;br /&gt;plot(0, type="n", ylim=c(0, 1), xlim=c(0, 10), main=expression("type A"), xlab="Time (hr)", &lt;br /&gt;     ylab="log2(expr)", axes = FALSE);&lt;br /&gt;axis(1, 1:10, (1:10)*12, cex.axis=0.7)&lt;br /&gt;axis(2)&lt;br /&gt;for(i in 1:3){&lt;br /&gt;   lines(1:10, sort(runif(10)), type = "b", col = cols[i], lty = 1, lwd=2, pch=i);&lt;br /&gt;}&lt;br /&gt;box();&lt;br /&gt;mtext("A", at= -1, line = 0.8);&lt;br /&gt;&lt;br /&gt;plot(0, type="n", ylim=c(0, 1), xlim=c(0, 10), main=expression("type B"), xlab="Time (hr)", &lt;br /&gt;     ylab="log2(expr)", axes = FALSE);&lt;br /&gt;axis(1, 1:10, 
(1:10)*12, cex.axis=0.7)&lt;br /&gt;axis(2)&lt;br /&gt;for(i in 1:3){&lt;br /&gt;   lines(1:10, sort(runif(10)), type = "b", col = cols[i], lty = 1, lwd=2, pch=i);&lt;br /&gt;}&lt;br /&gt;box();&lt;br /&gt;mtext("B", at= -1, line = 0.8);&lt;br /&gt;&lt;br /&gt;genelist &lt;- c("AA", "BB", "CC")&lt;br /&gt;par(mar=c(4, 0, 2, 0));&lt;br /&gt;plot(0, type="n", ylim=c(9, 14), xlim=c(0, 9), xlab="", ylab="", axes = FALSE);&lt;br /&gt;legend("topleft", genelist, col=cols, lty=1, cex=0.9, lwd=2, pch=1:3, box.lwd = 0 , box.lty = 0)&lt;br /&gt;&lt;br /&gt;---------------------------------------------&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_xLDEmoNB_RM/SZo3_nRO7PI/AAAAAAAAEP4/2A1Wc7ICVBU/s1600-h/ttt.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 161px;" src="http://1.bp.blogspot.com/_xLDEmoNB_RM/SZo3_nRO7PI/AAAAAAAAEP4/2A1Wc7ICVBU/s400/ttt.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5303613077194730738" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-4661206233767224944?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/4661206233767224944/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=4661206233767224944&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4661206233767224944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4661206233767224944'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2009/02/use-mtext-to-label-sub-graph.html' title='use mtext() to label sub 
graph'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_xLDEmoNB_RM/SZo3_nRO7PI/AAAAAAAAEP4/2A1Wc7ICVBU/s72-c/ttt.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-7731964619841113773</id><published>2009-02-09T21:50:00.000-08:00</published><updated>2009-02-09T21:52:49.782-08:00</updated><title type='text'>colors()</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_xLDEmoNB_RM/SZEWBfk8NZI/AAAAAAAAEPw/R7asJSaxjNE/s1600-h/ColorsChart1.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 398px; height: 400px;" src="http://1.bp.blogspot.com/_xLDEmoNB_RM/SZEWBfk8NZI/AAAAAAAAEPw/R7asJSaxjNE/s400/ColorsChart1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5301042451303904658" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;==============&lt;br /&gt;use colors() to specify the color&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-7731964619841113773?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/7731964619841113773/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=7731964619841113773&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7731964619841113773'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7731964619841113773'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2009/02/colors.html' title='colors()'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_xLDEmoNB_RM/SZEWBfk8NZI/AAAAAAAAEPw/R7asJSaxjNE/s72-c/ColorsChart1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-1614875085643786297</id><published>2009-02-09T21:17:00.000-08:00</published><updated>2009-02-09T21:48:44.023-08:00</updated><title type='text'>boxplot</title><content type='html'>x1 &lt;- rnorm(20, 1, 2);&lt;br /&gt;x2 &lt;- rnorm(20, 2, 3);&lt;br /&gt;y &lt;- list(x1, x2)&lt;br /&gt;boxplot(y)&lt;br /&gt;&lt;br /&gt;boxplot(y, horizontal = TRUE)&lt;br /&gt;&lt;br /&gt;boxplot(y, horizontal = TRUE, col=c("red", "green"))&lt;br /&gt;&lt;br /&gt;boxplot(y, horizontal = TRUE, col=c("red", "green"), notch=TRUE)&lt;br /&gt;&lt;br /&gt;boxplot(y, horizontal = TRUE, col=c("pink", "lightgreen"), notch=TRUE, border = c("darkred", "darkgreen"))&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_xLDEmoNB_RM/SZEVK6Pka0I/AAAAAAAAEPo/xBUgLRWiRBY/s1600-h/R+Graphics_+Device+2+(ACTIVE)-2.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 365px;" src="http://3.bp.blogspot.com/_xLDEmoNB_RM/SZEVK6Pka0I/AAAAAAAAEPo/xBUgLRWiRBY/s400/R+Graphics_+Device+2+(ACTIVE)-2.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5301041513569217346" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/1440516860833895494-1614875085643786297?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/1614875085643786297/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=1614875085643786297&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/1614875085643786297'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/1614875085643786297'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2009/02/boxplot.html' title='boxplot'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_xLDEmoNB_RM/SZEVK6Pka0I/AAAAAAAAEPo/xBUgLRWiRBY/s72-c/R+Graphics_+Device+2+(ACTIVE)-2.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-2997539729820997294</id><published>2008-10-26T14:06:00.000-07:00</published><updated>2008-10-26T14:50:39.670-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='bagging boosting SVM'/><title type='text'>Notes for Bagging and Boosting, SVM</title><content type='html'>###########################################################&lt;br /&gt;Bagging and Boosting&lt;br /&gt;Ideally we would draw many fresh samples from the population. The bootstrap approach does the next best thing by repeatedly taking random samples, with replacement, of the same size as the original sample.&lt;br /&gt;Bagging: use the bootstrap approach, i.e. take 
repeated samples from the training data. We therefore end up with B different training data sets. We can train our method on each data set and then average all the predictions. If the B fits were independent, averaging would reduce the variance by a factor of B (the standard deviation by sqrt(B)); bootstrap fits are correlated, so the actual reduction is smaller.&lt;br /&gt;Classification bagging: for any particular X, there are two possible approaches: 1) Record the class predicted by the fit from each bootstrapped data set and take the most commonly occurring class as the overall prediction (voting). 2) If our classifier produces probability estimates, average the probabilities and then predict the class with the highest average probability (averaging the probability estimates).&lt;br /&gt;&lt;br /&gt;library(ipred)&lt;br /&gt;bagging.fit &lt;- bagging(Salary~., data = hitters.noNA, subset = tr, nbagg = 1000)&lt;br /&gt;bagging.pred &lt;- predict(bagging.fit, hitters.noNA, nbagg = 1000)[-tr]&lt;br /&gt;mean((hitters.noNA$Salary[-tr] - bagging.pred)^2);&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;###########################################################&lt;br /&gt;Boosting works in a similar way, except that in each iteration of the algorithm (i.e. for each new data set) the fitting procedure places more weight on observations that were misclassified in the previous iterations.&lt;br /&gt;The algorithm:&lt;br /&gt;Boosting allows one to produce a more flexible decision boundary than bagging. The training error rate will go to zero if we keep boosting. Even after the training error reaches zero, the test error can still go down. (the plot)&lt;br /&gt;&lt;br /&gt;Relative Influence Plots: at each split, the variable chosen is the one that gives the maximum reduction in RSS compared with simply fitting a constant over the whole region. Call this reduction I. The relative influence (RI) of Xi is the sum of these I over all the splits for which Xi provides the best split. 
In boosting, we just sum the relative influence (RI) over the different trees.&lt;br /&gt;&lt;br /&gt;Partial plot: the relationship between the response and one or more predictors after accounting for, or averaging out, the effects of all the other predictors.&lt;br /&gt;function: &lt;br /&gt;Boosting often suffers from overfitting, so we also use shrinkage, as before, as a penalty and get much better predictions on test data.&lt;br /&gt;&lt;br /&gt;library(gbm);&lt;br /&gt;shrinkage.seq = seq(0, 0.5, length = 20);&lt;br /&gt;# Note: boost() is not a function exported by gbm; it is presumably a course-provided wrapper around gbm().&lt;br /&gt;boost.fit &lt;- boost(Salary~., data = hitters.noNA, shrinkage = shrinkage.seq, subset = tr, n.trees = 1000, trace = F, distribution = "gaussian")&lt;br /&gt;par(mfrow=c(2,2));&lt;br /&gt;plot(boost.fit$shrinkage, boost.fit$error, type = 'b');&lt;br /&gt;# Produce 2 partial dependence plots for the 2 most influential variables. Also produce a joint partial dependence plot for these 2 variables&lt;br /&gt;par(mfrow=c(1, 3));&lt;br /&gt;plot(boost.fit, i = "CHmRun");&lt;br /&gt;plot(boost.fit, i = "Walks");&lt;br /&gt;plot(boost.fit, i = c("CHmRun", "Walks"));&lt;br /&gt;&lt;br /&gt;###########################################################&lt;br /&gt;SVM: the basic idea of a support vector classifier is to find the straight line that gives the biggest separation between the classes, i.e. the points are as far from the line as possible. C is the minimum perpendicular distance between the points and the separating line. We find the line that maximizes C. This line is called the “optimal separating hyperplane”.&lt;br /&gt;In practice it is not usually possible to find a hyperplane that perfectly separates two classes. In this situation we try to find the plane that gives the best separation between the points that are correctly classified, subject to the points on the wrong side of the line not being off by too much. 
Let ξ*i represent the amount by which the ith point is on the wrong side of the margin (the dashed line). We want to maximize C subject to a restriction of the form ( &lt;= constant). The constant is a tuning parameter that we choose. &lt;br /&gt;&lt;br /&gt;Beyond the basic support vector classifier: instead of using the X's directly, we can create transformations (or a basis) b1(X), b2(X), …, bM(X) and find the optimal hyperplane in the space spanned by b1(X), b2(X), …, bM(X). In practice we choose something called a kernel function, which takes the place of the basis. Common kernel functions include: linear, polynomial, radial basis, sigmoid.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;library(e1071)&lt;br /&gt;svmfit &lt;- svm(Salary~., SalaryData, subset = tr);&lt;br /&gt;svmpred &lt;- predict(svmfit, SalaryData)[-tr];&lt;br /&gt;table1 &lt;- table(svmpred, SalaryData$Salary[-tr]);&lt;br /&gt;table1&lt;br /&gt;(table1[1, 1] + table1[2, 2])/sum(table1);&lt;br /&gt;(table1[1, 2] + table1[2, 1])/sum(table1);&lt;br /&gt;mean(svmpred != SalaryData$Salary[-tr])&lt;br /&gt;&lt;br /&gt;# opti.cost is assumed to hold a cost value chosen beforehand; it is not defined in these notes&lt;br /&gt;svmfit &lt;- svm(Salary~., SalaryData, subset = tr, kernel = "linear", cost = opti.cost);&lt;br /&gt;svmfit &lt;- svm(Salary~., SalaryData, subset = tr, kernel = "polynomial", cost = opti.cost);&lt;br /&gt;svmfit &lt;- svm(Salary~., SalaryData, subset = tr, kernel = "radial", cost = opti.cost);&lt;br /&gt;svmfit &lt;- svm(Salary~., SalaryData, subset = tr, kernel = "sigmoid", cost = opti.cost);&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-2997539729820997294?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/2997539729820997294/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=2997539729820997294&amp;isPopup=true' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/2997539729820997294'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/2997539729820997294'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/10/bagging-and-boosting.html' title='Notes for Bagging and Boosting, SVM'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-3560731083737498943</id><published>2008-10-21T16:07:00.000-07:00</published><updated>2008-10-26T13:56:52.627-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='nnet tree'/><title type='text'>Notes for nnet, tree</title><content type='html'># Use {nnet} to fit neural networks&lt;br /&gt;Neural networks were first developed as a model for the human brain. Getting a prediction for Y involves 2 steps.&lt;br /&gt;First we get predictions for Z1,…,ZM (hidden units) using the X’s. Second we predict Y using Z1,…,ZM.&lt;br /&gt;&lt;br /&gt;Regression Equation: &lt;br /&gt;the simple example: &lt;br /&gt;One common approach is to find the α’s and β’s that minimize RSS.&lt;br /&gt;Classification Equation&lt;br /&gt;algorithm:&lt;br /&gt;Given the Z’s and f(X)’s, one can compute derivatives for the change in RSS as the α’s and β’s change. Hence we can either increase or decrease each α and β (according to the sign of the derivative) so as to reduce RSS.&lt;br /&gt;&lt;br /&gt;By making M large enough, neural networks allow one to fit almost arbitrarily flexible functions. Making M too large can overfit: in one example of a neural network fit, the test error rate starts to increase as we increase the number of hidden units. 
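&lt;br /&gt;&lt;br /&gt;A minimal sketch of the two prediction steps and of the derivative-based update described above, in base R only. The sigmoid hidden units, the toy data, the step size and every name used here (predict.net, rss, grad, alpha, beta) are assumptions made for illustration; they are not taken from the course material or from the nnet package, and the derivatives are approximated numerically rather than worked out by hand.&lt;br /&gt;
&lt;pre class="prettyprint"&gt;&lt;code&gt;set.seed(1)
n = 50; p = 2; M = 3                      # observations, predictors, hidden units
X = matrix(rnorm(n * p), n, p)            # toy predictors
Y = sin(X[, 1]) + 0.1 * rnorm(n)          # toy response

sigmoid = function(v) 1 / (1 + exp(-v))

# Step 1: hidden units Z from the X's; step 2: prediction f(X) from the Z's
predict.net = function(w) {
  alpha = matrix(w[1:((p + 1) * M)], p + 1, M)   # weights from X to Z (with intercept)
  beta  = w[((p + 1) * M + 1):length(w)]          # weights from Z to Y (with intercept)
  Z = sigmoid(cbind(1, X) %*% alpha)
  as.vector(cbind(1, Z) %*% beta)                 # linear output, as in regression
}
rss = function(w) sum((Y - predict.net(w))^2)

# Numerical derivative of RSS with respect to each weight (central difference)
grad = function(w, h = 1e-6) {
  sapply(seq_along(w), function(k) {
    e = rep(0, length(w)); e[k] = h
    (rss(w + e) - rss(w - e)) / (2 * h)
  })
}

w = runif((p + 1) * M + M + 1, -0.5, 0.5)  # random starting weights
rss.start = rss(w)
for (iter in 1:200) {
  w = w - 0.001 * grad(w)   # move each weight against the sign of its derivative
}
rss.end = rss(w)
c(start = rss.start, end = rss.end)
&lt;/code&gt;&lt;/pre&gt;
The last line prints the RSS before and after the updates; with a step size this small it should go down.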
&lt;br /&gt;This penalty (called weight decay) forces the neural network fit to be smoother. The penalty function: once we use weight decay in the fit the error rate becomes fairly insensitive to the number of hidden units (as long as we have enough).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;dim(spam)&lt;br /&gt;tr = 1:2500;&lt;br /&gt;library(nnet)&lt;br /&gt;# The nnet() function implements a single hidden layer neural network. The&lt;br /&gt;# syntax is almost identical to that for lm() etc. &lt;br /&gt;# There are five possible additional variables to feed nnet(). First we must tell&lt;br /&gt;# it how many hidden units to use through the command size=?. We can also specify a &lt;br /&gt;# decay value by decay=? (by default this is zero if we don’t specify anything). &lt;br /&gt;# By default the maximum number of iterations it&lt;br /&gt;# does is 100. You can change the default using maxit=?. Finally, if you&lt;br /&gt;# are using nnet for a regression (rather than a classification) problem&lt;br /&gt;# you need to set linout=T to tell nnet to use a linear output (rather&lt;br /&gt;# than a sigmoid output that is used for a classification situation).&lt;br /&gt;&lt;br /&gt;# Fit a neural network to your training data using 2 hidden units and 0 decay.&lt;br /&gt;# decay parameter for weight decay. Default 0. 
It is the penalty on large coefficients.&lt;br /&gt;nnfit = nnet(email ~ ., spam, size = 2, subset = tr);&lt;br /&gt;summary(nnfit)&lt;br /&gt;# Refit the neural network with 10 hidden units and 0 decay.&lt;br /&gt;nnfit = nnet(email ~ ., spam, size = 10, subset = tr);&lt;br /&gt;&lt;br /&gt;# Use the nnet.cv() function (on the training data) to estimate the optimal decay for your data when using 10 hidden units.&lt;br /&gt;# (nnet.cv() is not part of the nnet package; it is presumably a course-provided helper.)&lt;br /&gt;decayseq &lt;- 10^seq(-3, 0, length = 10)&lt;br /&gt;nnfit &lt;- nnet.cv(email ~ ., data = spam, size = 10, trace = T, decay = decayseq, cv = 5);&lt;br /&gt;windows()&lt;br /&gt;plot(nnfit$decay, nnfit$cv, type = "l", ylim = c(0.03, 0.08))&lt;br /&gt;lines(nnfit$decay, nnfit$train, col = "red");&lt;br /&gt;## use the optimal decay&lt;br /&gt;nnfit=nnet(email~.,spam,subset=tr,size=10,decay=1.4678)&lt;br /&gt;nnpred &lt;- predict(nnfit, spam, type = "class");&lt;br /&gt;table1 &lt;- table(nnpred[-tr], spam[-tr, "email"])&lt;br /&gt;table1;&lt;br /&gt;(table1[1, 1] + table1[2, 2]) / sum(table1)&lt;br /&gt;mean(nnpred[-tr] == spam[-tr, "email"])&lt;br /&gt;mean(nnpred[-tr] != spam[-tr, "email"])&lt;br /&gt;&lt;br /&gt;#######################################################################&lt;br /&gt;## library(tree)&lt;br /&gt;Tree: one way to make predictions in a regression problem is to divide the predictor space (i.e. all the possible values for X1,X2,…,Xp) into distinct regions, say R1, R2,…,Rk. For every X that falls in a particular region (say Rj) we make the same prediction. We create the partitions by iteratively splitting one of the X variables into two regions, so we can always represent them using a tree structure. This provides a very simple way to explain the model to a non-expert.&lt;br /&gt;For region Rj the best prediction is simply the average of all the responses from our training data that fell in Rj.  &lt;br /&gt;We consider splitting into two regions, Xj larger than s and Xj less than s, for all possible values of s and j. 
We then choose the s and j that result in the lowest RSS on the training data.&lt;br /&gt;The end points of the tree are called “terminal nodes”.&lt;br /&gt;&lt;br /&gt;Classification tree: for each region (or node) we predict the most common category among the training data within that region. Minimizing RSS no longer makes sense, so the splitting criterion is to minimize the Gini index or the cross-entropy instead. If a node has similar probabilities for each category, the entropy will be larger.&lt;br /&gt;We can improve accuracy by “pruning” the tree, i.e. cutting off some of the terminal nodes. Use cross-validation to see which tree has the lowest error rate. &lt;br /&gt;#######################################################################&lt;br /&gt;library(tree)&lt;br /&gt;tr=1:400;&lt;br /&gt;OJ.tree &lt;- tree(Purchase ~., OJdata, subset = tr);&lt;br /&gt;summary(OJ.tree)&lt;br /&gt;OJ.tree&lt;br /&gt;plot(OJ.tree)&lt;br /&gt;text(OJ.tree , pretty = 0, cex = 0.8)&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5k8GS0nAI/AAAAAAAADdY/auIZvij3cuo/s1600-h/tree.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5k8GS0nAI/AAAAAAAADdY/auIZvij3cuo/s400/tree.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5259752398459804674" /&gt;&lt;/a&gt;&lt;br /&gt;tree.pred &lt;- predict(OJ.tree, OJdata, type = "class")&lt;br /&gt;table1 &lt;- table(tree.pred[-tr], OJdata$Purchase[-tr])&lt;br /&gt;table1&lt;br /&gt;(table1[1, 1] + table1[2, 2])/sum(table1)&lt;br /&gt;(table1[1, 2] + table1[2, 1])/sum(table1)&lt;br /&gt;### Pruning the Tree #####################&lt;br /&gt;## Use the cv.tree() function on the tree from question 1 to compute the cross validation statistics for different tree sizes. &lt;br /&gt;# By default K=10 (i.e. 
the data is divided into 10 parts).&lt;br /&gt;cv.OJ.tree &lt;- cv.tree(OJ.tree, K = 30)&lt;br /&gt;names(cv.OJ.tree);&lt;br /&gt;windows()&lt;br /&gt;par(mfrow = c(1, 2))&lt;br /&gt;plot(cv.OJ.tree$size, cv.OJ.tree$dev, type ='b')&lt;br /&gt;plot(cv.OJ.tree$k, cv.OJ.tree$dev, type ='b')&lt;br /&gt;cv.OJ.tree$k[order(cv.OJ.tree$dev)[1]]&lt;br /&gt;# k: cost-complexity parameter defining either a specific subtree of tree &lt;br /&gt;# The k from the optimal cross validation is around 12.2.&lt;br /&gt;prune.OJ.tree &lt;- prune.tree(OJ.tree, k =12.3)&lt;br /&gt;windows();&lt;br /&gt;plot(prune.OJ.tree)&lt;br /&gt;text(prune.OJ.tree , pretty = 0, cex = 0.8)&lt;br /&gt;&lt;br /&gt;summary(prune.OJ.tree)&lt;br /&gt;tree.pred &lt;- predict(prune.OJ.tree, OJdata, type = "class")&lt;br /&gt;table1 &lt;- table(tree.pred[-tr], OJdata$Purchase[-tr])&lt;br /&gt;table1&lt;br /&gt;(table1[1, 1] + table1[2, 2])/sum(table1)&lt;br /&gt;(table1[1, 2] + table1[2, 1])/sum(table1)&lt;br /&gt;# The mindev part of this line controls how far the tree is grown &lt;br /&gt;# (the default value is 0.01). 
As we choose larger values for mindev &lt;br /&gt;# the tree is made smaller and as we make mindev smaller we get bigger trees.&lt;br /&gt;# Find the best mindev&lt;br /&gt;mindev.seq = 10^seq(-3, -1,length=100)&lt;br /&gt;library(tree)&lt;br /&gt;tr=1:800;&lt;br /&gt;test.error &lt;- c();&lt;br /&gt;tree.size &lt;- c();&lt;br /&gt;for(j in 1:100){&lt;br /&gt; OJ.tree &lt;- tree(Purchase ~., OJdata, subset = tr, control = tree.control(nobs=800, mindev=mindev.seq[j]));&lt;br /&gt; tmp1 &lt;- summary(OJ.tree)$size&lt;br /&gt; tree.size &lt;- c(tree.size, tmp1);&lt;br /&gt; tree.pred &lt;- predict(OJ.tree, OJdata, type = "class")&lt;br /&gt; table1 &lt;- table(tree.pred[-tr], OJdata$Purchase[-tr])&lt;br /&gt; tmp2 &lt;- (table1[1, 2] + table1[2, 1])/sum(table1)&lt;br /&gt; test.error &lt;- c(test.error, tmp2);&lt;br /&gt;}&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-3560731083737498943?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/3560731083737498943/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=3560731083737498943&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/3560731083737498943'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/3560731083737498943'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/10/notes-for-nnet.html' title='Notes for nnet, tree'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5k8GS0nAI/AAAAAAAADdY/auIZvij3cuo/s72-c/tree.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-1109927009946158841</id><published>2008-10-21T15:28:00.000-07:00</published><updated>2008-10-26T15:11:30.210-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='regression nonlinear gam'/><title type='text'>notes for Polynomial Regression, Splines and GAM</title><content type='html'>Linear models have significant advantages in terms of interpretation/inference. However, standard linear regression has significant limitations in terms of prediction power. Polynomial regression is just one particular choice of “basis function”.&lt;br /&gt;# use MASS library&lt;br /&gt;# Use the poly() function to fit a polynomial regression&lt;br /&gt;library(MASS)&lt;br /&gt;attach(Boston);&lt;br /&gt;names(Boston)&lt;br /&gt;polyfit2 &lt;- lm(nox ~ poly(dis, 2));&lt;br /&gt;summary(polyfit2)&lt;br /&gt;polyfit3 &lt;- lm(nox ~ poly(dis, 3));&lt;br /&gt;summary(polyfit3)&lt;br /&gt;# Plot the cubic (degree 3) regression fit:&lt;br /&gt;windows();&lt;br /&gt;plot(dis, nox);&lt;br /&gt;lines(sort(dis), polyfit3$fit[order(dis)], col = 2, lwd = 3)&lt;br /&gt;###########################################################&lt;br /&gt;Instead of fitting a high-degree polynomial over the entire range of x, a spline works by fitting different low-degree polynomials over different regions of x.  For example a cubic spline works by fitting a cubic y = a*x^3 + b*x^2 + c*x + d, but the coefficients a, b, c and d may differ depending on which part of x we are looking at. The points where the coefficients change are called knots. 
The more knots that are used, the more flexible the spline is.&lt;br /&gt;&lt;br /&gt;With one knot there are two cubic pieces, so it appears that there are 8 parameters (or degrees of freedom) for us to choose. However, in reality there are a number of constraints. For example it makes sense to insist that the cubic curves meet at the knot. To make the curve smooth we also insist that the first and second derivatives are equal at the knot. These 3 constraints mean there are really only 8 - 3 = 5 parameters we get to choose. In general a cubic spline has 4 + #knots free parameters.&lt;br /&gt;# Use the bs() function to fit a spline regression to nox (Y) and dis (X). &lt;br /&gt;# requires the {gam} and {splines} packages&lt;br /&gt;#############################################################&lt;br /&gt;library(gam)&lt;br /&gt;# Generate the B-spline basis matrix for a polynomial spline using bs(), the basis function&lt;br /&gt;# df: degrees of freedom; df = length(knots) + 3 + intercept = #knots + 4 when the intercept is included, since the pieces and their first and second derivatives must agree at the knots&lt;br /&gt;# knots: the internal breakpoints that define the spline. The default is NULL, &lt;br /&gt;#    which results in a basis for ordinary polynomial regression. &lt;br /&gt;# degree: degree of the piecewise polynomial; the default is 3 for cubic splines. 
&lt;br /&gt;# df = 3, which means no interior knots&lt;br /&gt;splinefit3 &lt;- lm(nox ~ bs(dis, 3))&lt;br /&gt;summary(splinefit3);&lt;br /&gt;# df = 4, which means 1 knot&lt;br /&gt;splinefit4 &lt;- lm(nox ~ bs(dis, 4))&lt;br /&gt;summary(splinefit4);&lt;br /&gt;&lt;br /&gt;windows();&lt;br /&gt;plot(dis, nox);&lt;br /&gt;lines(sort(dis), splinefit4$fit[order(dis)], col = 2, lwd = 3)&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5aUnxaS8I/AAAAAAAADdI/zQB3rbHIleA/s1600-h/spline.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5aUnxaS8I/AAAAAAAADdI/zQB3rbHIleA/s400/spline.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5259740725135428546" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;# smoothing spline &lt;br /&gt;Smoothing Spline: What we really want to do is find some function, say g(x), that fits the observed data well. However, if we don’t put any constraints on g then we can always set RSS equal to zero simply by choosing a g that interpolates all the Y’s. What we really want is a g that makes RSS small but is also smooth, so we add a penalty on the integral of the squared second derivative of g.&lt;br /&gt;# cv: ordinary leave-one-out cross-validation (TRUE) or “generalized” cross-validation (GCV) when FALSE. &lt;br /&gt;smoothfit &lt;- smooth.spline(dis, nox, cv = T)&lt;br /&gt;&lt;br /&gt;###########################################################&lt;br /&gt;## Generalized additive model (GAM)&lt;br /&gt;When we have several predictors and want to achieve a non-linear fit, a natural way to extend the multiple linear regression model is to replace each linear part, βjXij, with fj(Xij) where fj is some smooth non-linear function.&lt;br /&gt;Pros of GAM (generalized additive model): &lt;br /&gt;1. By fitting a non-linear fj to each Xj, we can model non-linear relationships.&lt;br /&gt;2. We can potentially make more accurate predictions.&lt;br /&gt;3. 
Because we are fitting an additive model we can still examine the effect of each Xj on Y individually, so it is still good for inference.&lt;br /&gt;Cons: The model is restricted to be additive. Even a simple interaction between X1 and X2 can’t automatically be modeled using GAM. &lt;br /&gt;We can extend logistic regression to allow for non-linear relationships using the GAM framework.&lt;br /&gt;###########################################################&lt;br /&gt;gamfit = gam(nox ~ ., data = Boston); #same as lm, all linear functions&lt;br /&gt;summary(gamfit)&lt;br /&gt;par(mfrow=c(4, 4)); plot(gamfit);&lt;br /&gt;# GAM to use non-linear fits for most of the variables. &lt;br /&gt;# To do this we use the notation s(X) instead of X. &lt;br /&gt;gamfit=gam(medv~s(crim)+s(zn)+s(indus)+chas+s(nox)+s(rm)+s(dis)+s(rad)+s(tax)+s(ptratio)+s(black)+s(lstat),data=Boston)&lt;br /&gt;par(mfrow=c(4, 4)); plot(gamfit,se=T)&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5fpmF224I/AAAAAAAADdQ/GXU6q4x_2Mg/s1600-h/gam.PNG"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5fpmF224I/AAAAAAAADdQ/GXU6q4x_2Mg/s400/gam.PNG" border="0" alt=""id="BLOGGER_PHOTO_ID_5259746583019707266" /&gt;&lt;/a&gt;&lt;br /&gt;# Now use the significant non-linear variables 
and test the prediction power&lt;br /&gt;tr=1:400&lt;br /&gt;gamfit=gam(medv~crim+zn+indus+chas+nox+s(rm)+dis+rad+tax + ptratio+black+s(lstat),data=Boston,subset=tr) #s(x, df=4, spar=1)&lt;br /&gt;mean((predict(gamfit,Boston)[-tr]-medv[-tr])^2)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-1109927009946158841?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/1109927009946158841/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=1109927009946158841&amp;isPopup=true' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/1109927009946158841'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/1109927009946158841'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/10/notes-for-polynomial-regression-and.html' title='notes for Polynomial Regression, Splines and GAM'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5aUnxaS8I/AAAAAAAADdI/zQB3rbHIleA/s72-c/spline.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-4708644858750227130</id><published>2008-10-21T14:52:00.000-07:00</published><updated>2008-11-04T15:32:55.684-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='regression'/><title type='text'>note for best subset regression and ridge lasso regression</title><content type='html'>Best Subset Selection: run a linear regression for each possible combination of the X predictors. One simple approach would be to take the subset with the smallest RSS. Unfortunately, one can show that the model that includes all the variables will always have the smallest RSS. There are many measures that people use: Adjusted R2; AIC (Akaike information criterion); BIC (Bayesian information criterion); Cp (equivalent to AIC for linear regression)&lt;br /&gt;&lt;br /&gt;What we would really like to do is to find the set of variables that give the lowest test (not training) error rate. If we have a large data set we can achieve this goal by splitting the data into training, and validation parts. We would then use the training part to build each possible model (i.e. the different combinations of variables) and choose the model that gave the lowest error rate when applied to the validation data. We can also split the data into training, validation and testing parts. We would then use the training part to build each possible model and choose the model that gave the lowest error rate when applied to the validation data. 
Finally, the error rate on the test data would give us an estimate of how well the method would work on new observations.&lt;br /&gt;&lt;br /&gt;# lasso uses the {lars} package, ridge uses lm.ridge() from {MASS}, and optim.lm uses {leaps}&lt;br /&gt;#######################################################################&lt;br /&gt;# use the leaps package to find the best linear model based on BIC&lt;br /&gt;# select 300 of the 400 samples as the training set and the other 100 as the test set.&lt;br /&gt;###########################################################&lt;br /&gt;# names(carseats)&lt;br /&gt;# [1] "Sales" "CompPrice" "Income"  "Advertising" "Population"  "Price" "ShelveLoc" # "Age" "Education" "Urban" "US"&lt;br /&gt;td=sample(400,300,replace=F);&lt;br /&gt;carseat.train &lt;- carseats[td, ]&lt;br /&gt;carseat.test &lt;- carseats[-td, ]&lt;br /&gt;# use optim.lm function to try to select the “best” linear model from my training data:&lt;br /&gt;###########################################################&lt;br /&gt;"optim.lm" &lt;-&lt;br /&gt;function(data,y,val.size=NULL,cv=10,bic=T,test=T,really.big=F){&lt;br /&gt;    library(MASS)&lt;br /&gt;    library(leaps)&lt;br /&gt;    n.plots &lt;- sum(c(bic,test,cv!=0))&lt;br /&gt;    par(mfrow=c(n.plots,1))&lt;br /&gt;    n &lt;- nrow(data)&lt;br /&gt;    names(data)[(names(data)==y)] &lt;- "y"&lt;br /&gt;    X &lt;- lm(y~.,data,x=T)$x&lt;br /&gt;    data &lt;- as.data.frame(cbind(data$y,X[,-1]))&lt;br /&gt;    names(data)[1] &lt;- "y"&lt;br /&gt;    if (is.null(val.size))&lt;br /&gt;      val.size &lt;- round(n/4)&lt;br /&gt;    s &lt;- sample(n,val.size,replace=F)&lt;br /&gt;    data.train &lt;- data[-s,]&lt;br /&gt;    data.test &lt;- data[s,]&lt;br /&gt;    p &lt;- ncol(data)-1&lt;br /&gt;    data.names &lt;- names(data)[names(data)!="y"]&lt;br /&gt;    regfit.full &lt;-&lt;br /&gt;      regsubsets(y~.,data=data,nvmax=p,nbest=1,really.big=really.big)&lt;br /&gt;    bic.lm &lt;-&lt;br /&gt;      
lm(as.formula(paste("y~",paste(data.names[summary(regfit.full)$which[order(summary(regfit.full)$bic)[1],][-1]],collapse="+"))),data=data)&lt;br /&gt;    if (bic){&lt;br /&gt;      plot(summary(regfit.full)$bic,type='l',xlab="Number of Predictors",&lt;br /&gt;           ylab="BIC",main="BIC Method")&lt;br /&gt;      points(summary(regfit.full)$bic,pch=20)&lt;br /&gt;      points(order(summary(regfit.full)$bic)[1],min(summary(regfit.full)$bic),col=2,pch=20,cex=1.5)}&lt;br /&gt;    regfit &lt;- regsubsets(y~.,data=data.train,nvmax=p,nbest=1,really.big=really.big)&lt;br /&gt;    cv.rss &lt;- rootmse &lt;- rep(0,p)&lt;br /&gt;    for (i in 1:p){&lt;br /&gt;      data.lmfit &lt;-lm(as.formula(paste("y~",paste(data.names[summary(regfit)$which[i,][-1]],collapse="+"))),data=data.train)&lt;br /&gt;      rootmse[i]&lt;-sqrt(mean((predict(data.lmfit,data.test)-data.test$y)^2))}&lt;br /&gt;    if (test){&lt;br /&gt;      plot(rootmse,type='l',xlab="Number of Predictors",&lt;br /&gt;           ylab="Root Mean RSS on Validation Data",main="Validation Method")&lt;br /&gt;      points(rootmse,pch=20)&lt;br /&gt;      points(order(rootmse)[1],min(rootmse),col=2,pch=20,cex=1.5)}&lt;br /&gt;    validation.lm &lt;-&lt;br /&gt;      lm(as.formula(paste("y~",paste(data.names[summary(regfit)$which[order(rootmse)[1],][-1]],collapse="+"))),data=data)&lt;br /&gt;    if (cv!=0){&lt;br /&gt;      s &lt;- sample(cv,n,replace=T)&lt;br /&gt;      if (cv==n)&lt;br /&gt;        s &lt;- 1:n&lt;br /&gt;      for (i in 1:p)&lt;br /&gt;        for (j in 1:cv){&lt;br /&gt;          data.train &lt;- data[s!=j,]&lt;br /&gt;          data.test &lt;- data[s==j,]&lt;br /&gt;          data.lmfit&lt;-lm(as.formula(paste("y~",paste(data.names[summary(regfit.full)$which[i,][-1]],collapse="+"))),data=data.train)&lt;br /&gt;          cv.rss[i]&lt;-cv.rss[i]+sum((predict(data.lmfit,data.test)-data.test$y)^2)&lt;br /&gt;        }&lt;br /&gt;      cv.rss &lt;- sqrt(cv.rss/n)&lt;br /&gt;      
plot(cv.rss,type='l',xlab="Number of Predictors",&lt;br /&gt;           ylab="Cross-Validated Root Mean RSS",main="Cross-Validation Method")&lt;br /&gt;      points(cv.rss,pch=20)&lt;br /&gt;      points(order(cv.rss)[1],min(cv.rss),col=2,pch=20,cex=1.5)&lt;br /&gt;      cv.lm &lt;-&lt;br /&gt;        lm(as.formula(paste("y~",paste(data.names[summary(regfit.full)$which[order(cv.rss)[1],][-1]],collapse="+"))),data=data)}&lt;br /&gt;    else&lt;br /&gt;      cv.lm &lt;- NULL&lt;br /&gt;    list(bic.lm = bic.lm,validation.lm=validation.lm,cv.lm=cv.lm)&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;library(leaps);&lt;br /&gt;optim.lm.train &lt;- optim.lm(carseat.train, "Sales");&lt;br /&gt;names(optim.lm.train);&lt;br /&gt;# Report the model selected by BIC, validation, or CV (cross validation)&lt;br /&gt;summary(optim.lm.train$bic.lm)&lt;br /&gt;summary(optim.lm.train$validation.lm)&lt;br /&gt;summary(optim.lm.train$cv.lm)&lt;br /&gt;# Compute the mean squared prediction error on my test data set. &lt;br /&gt;# If we only use the mean of Y, the mean squared error is:&lt;br /&gt;mean((carseats$Sales[-td] - mean(carseats$Sales[td]))^2)&lt;br /&gt;# if we use the model selected by BIC:&lt;br /&gt;lm.bic=lm(Sales~CompPrice+Income+Advertising+Price+ShelveLoc+Age,data = carseats,subset=td);&lt;br /&gt;# the mean squared error is:&lt;br /&gt;mean((carseats$Sales[-td]-predict(lm.bic,carseats)[-td])^2);&lt;br /&gt;&lt;br /&gt;#### ridge regression #####################################&lt;br /&gt;Ridge Regression adds a “penalty” on the sum of squared betas. This has the effect of “shrinking” large values of beta towards zero. As a result the ridge regression estimates are often more accurate. Notice when lambda=0 we get OLS but as lambda gets larger the beta’s will get closer to zero: more shrinkage. The reason this helps: it turns out that the OLS estimates generally have low bias but can be highly variable. 
In particular when n and p are a similar size the OLS estimates will be extremely variable. The penalty term makes the ridge regression estimates biased but can also substantially reduce their variance.&lt;br /&gt;&lt;br /&gt;# lambda is the penalty coefficient on the beta squares&lt;br /&gt;# lambda 0, beta same as OLS&lt;br /&gt;###########################################################&lt;br /&gt;lambda.set &lt;- 10^(seq(-2, 8, length = 100));&lt;br /&gt;ridge.train &lt;- lm.ridge(Sales~., carseats, subset = td, lambda = lambda.set)&lt;br /&gt;select(ridge.train);&lt;br /&gt;# modified HKB estimator is 0.6875941 &lt;br /&gt;# modified L-W estimator is 1.311013 &lt;br /&gt;# smallest value of GCV  at 0.5214008&lt;br /&gt;# The best lambda from GCV is 0.5214.&lt;br /&gt;# Using this lambda, we can get the best model:&lt;br /&gt;&lt;br /&gt;ridge.train.cv &lt;- lm.ridge(Sales~., carseats, subset = td, lambda = 0.5214);&lt;br /&gt;ridge.train.cv$coef;&lt;br /&gt;ridge.pred.cv &lt;- pred.ridge(ridge.train.cv, Sales~., carseats)&lt;br /&gt;mean((carseats$Sales[-td] - ridge.pred.cv[-td])^2)&lt;br /&gt;&lt;br /&gt;------ ## iterations ###############&lt;br /&gt;rss.ridge &lt;- rep(0, 100);&lt;br /&gt;for(i in 1:100){&lt;br /&gt;ridge.train &lt;- lm.ridge(Sales~., carseats, subset = td, lambda = lambda.set[i]);&lt;br /&gt;ridge.pred &lt;- pred.ridge(ridge.train, Sales~., carseats);&lt;br /&gt;rss.ridge[i] &lt;- mean((carseats$Sales[-td] - ridge.pred[-td])^2);&lt;br /&gt;}&lt;br /&gt;min(rss.ridge);&lt;br /&gt;plot(rss.ridge, type = "l")&lt;br /&gt;best.lambda &lt;- lambda.set[order(rss.ridge)[1]]&lt;br /&gt;best.lambda;&lt;br /&gt;&lt;br /&gt;ridge.best &lt;- lm.ridge(Sales~., carseats, subset = td, lambda = best.lambda);&lt;br /&gt;ridge.best$coef&lt;br /&gt;&lt;br /&gt;####  LASSO #######################################################&lt;br /&gt;Ridge Regression isn’t perfect. 
One significant problem is that the squared penalty will never force any of the coefficients to be exactly zero. Hence the final model will include all variables, making it harder to interpret. A very modern alternative is the LASSO. The LASSO works in a similar way to ridge regression except that it uses an L1 penalty. LASSO is not quite as computationally efficient as ridge regression; however, efficient algorithms exist and it is still faster than subset selection.&lt;br /&gt;# s is the constraint: sum |beta| &lt; s; as s goes to infinity, beta is the same as OLS&lt;br /&gt;###################################################################&lt;br /&gt;"cv.lasso" &lt;-&lt;br /&gt;function(formula,data,subset=NULL,K=10){&lt;br /&gt;  if (!is.null(subset))&lt;br /&gt;    data &lt;- data[subset,]&lt;br /&gt;  y &lt;- data[,names(data)==as.character(formula)[2]]&lt;br /&gt;  x &lt;- model.matrix(as.formula(formula),data)[,-1]&lt;br /&gt;larsfit &lt;- cv.lars(x,y,K=K)&lt;br /&gt;larsfit&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;"lasso" &lt;-&lt;br /&gt;function(formula,data,subset=NULL){&lt;br /&gt;  if (!is.null(subset))&lt;br /&gt;    data &lt;- data[subset,]&lt;br /&gt;  y &lt;- data[,names(data)==as.character(formula)[2]]&lt;br /&gt;  x &lt;- model.matrix(as.formula(formula),data)[,-1]&lt;br /&gt;larsfit &lt;- lars(x,y,type="lasso")&lt;br /&gt;larsfit&lt;br /&gt;}&lt;br /&gt;library(lars);&lt;br /&gt;lasso.fit &lt;- lasso(Sales~., carseats, subset = td);&lt;br /&gt;plot(lasso.fit);&lt;br /&gt;lasso.fit&lt;br /&gt;##  use cv.lasso to get best s:&lt;br /&gt;lasso.cv &lt;- cv.lasso(Sales~., carseats, subset = td);&lt;br /&gt;s &lt;- lasso.cv$fraction[order(lasso.cv$cv)[1]]&lt;br /&gt;s&lt;br /&gt;lasso.pred &lt;- pred.lasso(lasso.fit, Sales~., carseats, s)&lt;br /&gt;mean((carseats$Sales[-td] - lasso.pred[-td])^2);&lt;br /&gt;pred.lasso(lasso.fit, Sales~., carseats, s, "coefficients")&lt;br /&gt;&lt;br /&gt;#### iterations #########&lt;br /&gt;s.set &lt;- seq(0, 1, length = 100);&lt;br 
/&gt;rss.lasso &lt;- rep(0, 100);&lt;br /&gt;for(i in 1:100){&lt;br /&gt;lasso.pred &lt;- pred.lasso(lasso.fit, Sales~., carseats, s = s.set[i]);&lt;br /&gt;rss.lasso[i] &lt;- mean((carseats$Sales[-td] - lasso.pred[-td])^2);&lt;br /&gt;}&lt;br /&gt;min(rss.lasso)&lt;br /&gt;plot(rss.lasso,type="l")&lt;br /&gt;s &lt;- s.set[order(rss.lasso)[1]];&lt;br /&gt;s&lt;br /&gt;pred.lasso(lasso.fit, Sales~., carseats, s, "coefficients")&lt;br /&gt;###########################################################&lt;br /&gt;# Plotting the mean-only prediction, OLS, BIC, ridge regression and the &lt;br /&gt;# LASSO in one plot makes the comparison clearer.&lt;br /&gt;rss.raw=mean((carseats$Sales[-td]-mean(carseats$Sales[td]))^2)&lt;br /&gt;rss.raw&lt;br /&gt;&lt;br /&gt;lmfit=lm(Sales~.,carseats,subset=td)&lt;br /&gt;rss.ols=mean((carseats$Sales[-td]-predict(lmfit,carseats)[-td])^2)&lt;br /&gt;rss.ols&lt;br /&gt;&lt;br /&gt;# test error of the BIC-selected model lm.bic fitted earlier&lt;br /&gt;rss.bic=mean((carseats$Sales[-td]-predict(lm.bic,carseats)[-td])^2)&lt;br /&gt;rss.bic&lt;br /&gt;&lt;br /&gt;plot(1:100,1:100,ylim=c(1,10),ylab="Test Mean RSS",xlab="Tuning Parameter", type="n")&lt;br /&gt;abline(rss.raw,0,lwd=1,lty=2, col = "green")&lt;br /&gt;abline(rss.ols,0,lwd=1,lty=3, col = "blue")&lt;br /&gt;abline(rss.bic,0,lwd=1,lty=4, col = "grey")&lt;br /&gt;lines(rss.lasso,lwd=1,lty=5, col = "red")&lt;br /&gt;lines(rss.ridge,lwd=1,lty=6, col = "orange")&lt;br /&gt;legend(70,7,c("Raw","OLS","BIC","LASSO","Ridge"),col = c("green", "blue", "grey", "red", "orange"), lty=c(2,3,4,5,6),lwd=1)&lt;br /&gt;&lt;br /&gt;# OLS with all the variables gives the smallest RSS, &lt;br /&gt;# while the simple mean gives the largest RSS&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5Wcl7aTeI/AAAAAAAADcw/_50UWMiV-WU/s1600-h/rss.PNG"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5Wcl7aTeI/AAAAAAAADcw/_50UWMiV-WU/s400/rss.PNG" border="0" alt=""id="BLOGGER_PHOTO_ID_5259736464032943586" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' 
height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-4708644858750227130?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/4708644858750227130/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=4708644858750227130&amp;isPopup=true' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4708644858750227130'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4708644858750227130'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/10/note-for-ridge-lasso-regression.html' title='note for best subset regression and ridge lasso regression'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_xLDEmoNB_RM/SP5Wcl7aTeI/AAAAAAAADcw/_50UWMiV-WU/s72-c/rss.PNG' height='72' width='72'/><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-3565517835062092090</id><published>2008-10-21T14:25:00.000-07:00</published><updated>2008-11-04T15:37:46.530-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='regression'/><category scheme='http://www.blogger.com/atom/ns#' term='iom'/><title type='text'>note for linear and logistic regression</title><content type='html'># general way of linear and logistic regression&lt;br /&gt;One common measure of accuracy is the residual sum of squares , U shape means non-linear relationship, normal quantile plot for normality, constant variance of 
error terms, independence of error terms.&lt;br /&gt;&lt;br /&gt;# simple regression&lt;br /&gt;car[1:10, ];&lt;br /&gt;library(MASS)&lt;br /&gt;lm.simple1 &lt;- lm(Mpg ~ Weight, data = car);&lt;br /&gt;summary(lm.simple1);&lt;br /&gt;###########################################################&lt;br /&gt;windows(); # plot the fitted line&lt;br /&gt;plot(car$Weight, car$Mpg, xlab = "Weight", ylab = "Mpg");&lt;br /&gt;abline(lm.simple1, col = "red");&lt;br /&gt;###########################################################&lt;br /&gt;windows(); # resid ~ x plots see any dependency&lt;br /&gt;plot(car$Weight, residuals(lm.simple1), xlab = "Weight", ylab = "resids");&lt;br /&gt;abline(h = 0, col = "red");&lt;br /&gt;lines(loess.smooth(car$Weight, residuals(lm.simple1)), col = "green", lty = 5);&lt;br /&gt;###########################################################&lt;br /&gt;windows(); # std error against the fitted value&lt;br /&gt;plot(fitted(lm.simple1), stdres(lm.simple1), xlab = "Fitted Mpg", ylab = "standardized resids");&lt;br /&gt;abline(h = 0, col = "red");&lt;br /&gt;lines(loess.smooth(fitted(lm.simple1), stdres(lm.simple1)), col = "green", lty = 5);&lt;br /&gt;###########################################################&lt;br /&gt;windows(20 ,20)&lt;br /&gt;par(mfrow=c(2,2))&lt;br /&gt;plot(density(stdres(lm.simple1)), main = "density of standardized residuals")&lt;br /&gt;qqnorm(stdres(lm.simple1)); # QQ plot of the std errors&lt;br /&gt;qqline(stdres(lm.simple1))&lt;br /&gt;###########################################################&lt;br /&gt;# Multiple Linear Regression&lt;br /&gt;###########################################################&lt;br /&gt;pairs(car);&lt;br /&gt;lm.mlti &lt;- lm(Mpg ~ Cylind. + Disp. + Horse. + Weight + Accel. 
+ Year + Origin, data = car)&lt;br /&gt;# the dot says: use all the variables in data except Mpg as predictors&lt;br /&gt;lm.mlti &lt;- lm(Mpg ~ ., data = car) &lt;br /&gt;summary(lm.mlti)&lt;br /&gt;# .-Weight says: use all the variables in data except Mpg and Weight as predictors&lt;br /&gt;lm.mlti &lt;- lm(Mpg ~ .-Weight, data = car) &lt;br /&gt;# Disp. * Weight means Disp. + Weight + Disp.:Weight (main effects plus interaction)&lt;br /&gt;lm.mlti.e &lt;- lm(Mpg ~ Disp. * Weight, data = car)&lt;br /&gt;summary(lm.mlti.e)&lt;br /&gt;###########################################################&lt;br /&gt;## Logistic Regression, use glm function&lt;br /&gt;###########################################################&lt;br /&gt;glm.CYT &lt;- glm(Loc_CYT ~ mcg + gvh + alm + mit + erl + pox + vac + nuc, family = binomial, data = dataset_new);&lt;br /&gt;summary(glm.CYT);&lt;br /&gt;contrasts(dataset_new$Loc_CYT);&lt;br /&gt;table1 &lt;- table(predict(glm.CYT, type = "response") &gt; 0.5, dataset_new$Loc_CYT); # confusion table&lt;br /&gt;table1&lt;br /&gt;size &lt;- dim(dataset_new);&lt;br /&gt;## check the prediction accuracy&lt;br /&gt;(table1[1, 1] + table1[2, 2]) / size[1];&lt;br /&gt;## use part of the dataset as the training set and the rest as the test set:&lt;br /&gt;glm.CYT5 &lt;- glm(Loc_CYT ~ gvh + alm + mit + nuc, family = binomial, &lt;br /&gt;data = dataset_new, subset = 1:900);&lt;br /&gt;table2 &lt;- table(predict(glm.CYT5, dataset_new, type = "response")[-(1:900)] &gt; 0.5, &lt;br /&gt;dataset_new$Loc_CYT[-(1:900)]);&lt;br /&gt;table2&lt;br /&gt;(table2[1, 1] + table2[2, 2]) / (size[1] - 900);&lt;br /&gt;&lt;br /&gt;####################################################&lt;br /&gt;A good website can be found here:&lt;br /&gt;&lt;a href="http://www.statmethods.net/stats/regression.html"&gt;http://www.statmethods.net/stats/regression.html&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/1440516860833895494-3565517835062092090?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/3565517835062092090/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=3565517835062092090&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/3565517835062092090'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/3565517835062092090'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/10/note-for-linear-and-logistic-regression.html' title='note for linear and logistic regression'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-2075161453317563567</id><published>2008-10-21T13:38:00.001-07:00</published><updated>2008-10-21T14:49:00.554-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='3dplot'/><category scheme='http://www.blogger.com/atom/ns#' term='figure'/><title type='text'>display 3d plots</title><content type='html'>x=seq(-pi,pi,len=50)&lt;br /&gt;y=x;&lt;br /&gt;f=outer(x,y,function(x,y)cos(y)/(1+x^2));&lt;br /&gt;f[1:5, 1:5]&lt;br /&gt;&lt;br /&gt;# The contour() function produces a contour map and is used for three&lt;br /&gt;# dimensional data (like mountains). You feed it three inputs. 
The first&lt;br /&gt;# is a vector of the x values, the second a vector of the y values and&lt;br /&gt;# the third is a matrix with each element corresponding to the Z value&lt;br /&gt;# (the third dimension) for each pair of (x,y) coordinates. Just like&lt;br /&gt;# plot there are many other things you can feed it to. See the help file.&lt;br /&gt;&lt;br /&gt;contour(x,y,f)&lt;br /&gt;contour(x,y,f,nlevels=15)&lt;br /&gt;&lt;br /&gt;# persp() works the same as image() and contour() but it actually&lt;br /&gt;# produces a 3d plot. theta an phi control the various angles you can look&lt;br /&gt;# at the plot from.&lt;br /&gt;persp(x,y,f)&lt;br /&gt;persp(x,y,f,theta=30)&lt;br /&gt;persp(x,y,f,theta=30,phi=20)&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_xLDEmoNB_RM/SP5AxPr39eI/AAAAAAAADcY/hf7Ujv10AZc/s1600-h/3dplots.JPG"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_xLDEmoNB_RM/SP5AxPr39eI/AAAAAAAADcY/hf7Ujv10AZc/s400/3dplots.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5259712629583640034" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;########################################################################&lt;br /&gt;## another example of 3d plot from my personal reserach, use rgl library&lt;br /&gt;########################################################################&lt;br /&gt;# 3D visualization device system&lt;br /&gt;&lt;br /&gt;library(rgl);&lt;br /&gt;data(volcano)&lt;br /&gt;dim(volcano)&lt;br /&gt;&lt;br /&gt;peak.height &lt;- volcano;&lt;br /&gt;ppm.index &lt;- (1:nrow(volcano));&lt;br /&gt;sample.index &lt;- (1:ncol(volcano));&lt;br /&gt;&lt;br /&gt;zlim &lt;- range(peak.height)&lt;br /&gt;zlen &lt;- zlim[2] - zlim[1] + 1&lt;br /&gt;colorlut &lt;- terrain.colors(zlen) # height color lookup table&lt;br /&gt;col &lt;- colorlut[(peak.height-zlim[1]+1)] # assign colors to heights for each point&lt;br /&gt;open3d()&lt;br /&gt;&lt;br /&gt;ppm.index1 
&lt;- ppm.index*zlim[2]/max(ppm.index);&lt;br /&gt;sample.index1 &lt;- sample.index*zlim[2]/max(sample.index)&lt;br /&gt;&lt;br /&gt;title.name &lt;- paste("plot3d ", "volcano", sep = "");&lt;br /&gt;surface3d(ppm.index1, sample.index1, peak.height, color=col, back="lines", main = title.name);&lt;br /&gt;grid3d(c("x", "y+", "z"), n =20)&lt;br /&gt;&lt;br /&gt;sample.name &lt;- paste("col.", 1:ncol(volcano), sep="");&lt;br /&gt;sample.label &lt;- as.integer(seq(1, length(sample.name), length = 5));&lt;br /&gt;&lt;br /&gt;axis3d('y+',at = sample.index1[sample.label], sample.name[sample.label], cex = 0.3);&lt;br /&gt;axis3d('y',at = sample.index1[sample.label], sample.name[sample.label], cex = 0.3)&lt;br /&gt;axis3d('z',pos=c(0, 0, NA))&lt;br /&gt;&lt;br /&gt;ppm.label &lt;- as.integer(seq(1, length(ppm.index), length = 10));&lt;br /&gt;axes3d('x', at=c(ppm.index1[ppm.label], 0, 0), abs(round(ppm.index[ppm.label], 2)), cex = 0.3);&lt;br /&gt;&lt;br /&gt;title3d(main = title.name, sub = "test", xlab = "ppm", ylab = "samples", zlab = "peak")&lt;br /&gt;rgl.bringtotop();&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_xLDEmoNB_RM/SP5FyGxtFxI/AAAAAAAADco/HeS9MLeF4JU/s1600-h/rgl.PNG"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_xLDEmoNB_RM/SP5FyGxtFxI/AAAAAAAADco/HeS9MLeF4JU/s400/rgl.PNG" border="0" alt=""id="BLOGGER_PHOTO_ID_5259718141930182418" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-2075161453317563567?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/2075161453317563567/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=2075161453317563567&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/2075161453317563567'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/2075161453317563567'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/10/some-r-functions.html' title='display 3d plots'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_xLDEmoNB_RM/SP5AxPr39eI/AAAAAAAADcY/hf7Ujv10AZc/s72-c/3dplots.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-7992341355364897865</id><published>2008-10-20T11:09:00.000-07:00</published><updated>2008-10-21T14:49:21.440-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='histogram'/><title type='text'>How to show the distribution of several data set together</title><content type='html'># simulate data&lt;br /&gt;x1 &lt;- rnorm(1000, 0.4, 0.8)&lt;br /&gt;x2 &lt;- rnorm(1000, 0.0, 1.0)&lt;br /&gt;x3 &lt;- rnorm(1000, -1.0, 1.0)&lt;br /&gt;&lt;br /&gt;# density plots&lt;br /&gt;plot(density(x1), xlim=range( c(x1, x2, x3) ), main="", xlab="" )&lt;br /&gt;lines(density(x2), col=2)&lt;br /&gt;lines(density(x3), col=3)&lt;br /&gt;&lt;br /&gt;# rug plots for displaying actual data points &lt;br /&gt;rug(x1, col=1, ticksize=0.01, line=2.5)&lt;br /&gt;rug(x2, col=2, ticksize=0.01, line=3.0)&lt;br /&gt;rug(x3, col=3, ticksize=0.01, line=3.5)&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://3.bp.blogspot.com/_xLDEmoNB_RM/SPzJ1NLUxcI/AAAAAAAADcI/8XZAIfoBATU/s1600-h/test.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_xLDEmoNB_RM/SPzJ1NLUxcI/AAAAAAAADcI/8XZAIfoBATU/s400/test.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5259300380769306050" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;#========================================================&lt;br /&gt;# or overlay histograms that share the same breaks and axes&lt;br /&gt;x1 &lt;- rnorm(1000, 0.4, 0.8)&lt;br /&gt;x2 &lt;- rnorm(1000, 0.0, 1.0)&lt;br /&gt;x3 &lt;- rnorm(1000, -1.0, 1.0)&lt;br /&gt;all &lt;- c(x1, x2, x3);&lt;br /&gt;hist(x1, breaks=seq(min(all), max(all)+0.999, by=0.1), &lt;br /&gt; xlim = c(min(all), max(all)+0.01), ylim=c(0, 55),&lt;br /&gt; main = "hist", xlab = "distribution", ylab= "", col=1);&lt;br /&gt;box();&lt;br /&gt;&lt;br /&gt;# par(new=TRUE) draws the next histogram on top of the current one&lt;br /&gt;par(new=TRUE);&lt;br /&gt;hist(x2, breaks=seq(min(all), max(all)+0.999, by=0.1), &lt;br /&gt; xlim = c(min(all), max(all)+0.01), ylim=c(0, 55),&lt;br /&gt; main = "", xlab = "", ylab= "", col=2);&lt;br /&gt;box();&lt;br /&gt;&lt;br /&gt;par(new=TRUE);&lt;br /&gt;hist(x3, breaks=seq(min(all), max(all)+0.999, by=0.1), &lt;br /&gt; xlim = c(min(all), max(all)+0.01), ylim=c(0, 55),&lt;br /&gt; main = "", xlab = "", ylab= "", col=3);&lt;br /&gt;box();&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_xLDEmoNB_RM/SPzOUr9WkII/AAAAAAAADcQ/75kgJFTg4xw/s1600-h/histstack.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_xLDEmoNB_RM/SPzOUr9WkII/AAAAAAAADcQ/75kgJFTg4xw/s400/histstack.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5259305319654658178" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-7992341355364897865?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://statisticsr.blogspot.com/feeds/7992341355364897865/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=7992341355364897865&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7992341355364897865'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7992341355364897865'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/10/how-to-show-distribution-of-several.html' title='How to show the distribution of several data set together'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_xLDEmoNB_RM/SPzJ1NLUxcI/AAAAAAAADcI/8XZAIfoBATU/s72-c/test.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-6419997482498162082</id><published>2008-10-08T15:06:00.000-07:00</published><updated>2008-11-04T15:40:22.989-08:00</updated><title type='text'>apply, with and by summary</title><content type='html'>apply(X, MARGIN, FUN) applies FUN over the rows (MARGIN = 1) or the columns (MARGIN = 2) of a matrix&lt;br /&gt;tapply(X, INDEX, FUN) applies FUN to each cell of a ragged array, i.e. to each group of values in X defined by the levels of INDEX&lt;br /&gt;lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X&lt;br /&gt;sapply is a “user-friendly” version of lapply that by default returns a vector or matrix if appropriate.&lt;br /&gt;mapply is a multivariate version of sapply; it applies FUN to the first elements of each ... argument, then to the second elements, and so on&lt;br /&gt;&lt;br /&gt;ex. 
apply(x, 2, sum) returns the column sums when x is a matrix&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;lapply&lt;/span&gt; returns a &lt;span style="font-weight: bold;"&gt;list &lt;/span&gt;of the same length as X, each element of which is the result of &lt;span style="font-weight: bold;"&gt;applying FUN to the corresponding element of X&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&gt; x &lt;- list(a = 1:10, beta = exp(-3:3), logic = c(TRUE,FALSE,FALSE,TRUE))&lt;br /&gt;&gt; x&lt;br /&gt;$a&lt;br /&gt;[1]  1  2  3  4  5  6  7  8  9 10&lt;br /&gt;$beta&lt;br /&gt;[1]  0.04978707  0.13533528  0.36787944  1.00000000  2.71828183  7.38905610 20.08553692&lt;br /&gt;$logic&lt;br /&gt;[1]  TRUE FALSE FALSE  TRUE&lt;br /&gt;&gt; # compute the mean of each list element&lt;br /&gt;&gt; lapply(x,mean)&lt;br /&gt;$a&lt;br /&gt;[1] 5.5&lt;br /&gt;$beta&lt;br /&gt;[1] 4.535125&lt;br /&gt;$logic&lt;br /&gt;[1] 0.5&lt;br /&gt;&lt;br /&gt;&gt; # median and quartiles for each list element&lt;br /&gt;&gt; lapply(x, quantile, probs = 1:3/4)&lt;br /&gt;$a&lt;br /&gt;25%  50%  75%&lt;br /&gt;3.25 5.50 7.75&lt;br /&gt;$beta&lt;br /&gt;   25%       50%       75%&lt;br /&gt;0.2516074 1.0000000 5.0536690&lt;br /&gt;$logic&lt;br /&gt;25% 50% 75%&lt;br /&gt;0.0 0.5 1.0&lt;br /&gt;&lt;br /&gt;&gt; sapply(x, quantile)&lt;br /&gt;      a        beta logic&lt;br /&gt;0%    1.00  0.04978707   0.0&lt;br /&gt;25%   3.25  0.25160736   0.0&lt;br /&gt;50%   5.50  1.00000000   0.5&lt;br /&gt;75%   7.75  5.05366896   1.0&lt;br /&gt;100% 10.00 20.08553692   1.0&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;sapply&lt;/span&gt; is a “user-friendly” version of lapply that by default returns a vector or matrix if appropriate.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;mapply&lt;/code&gt; is a multivariate version of &lt;code&gt;sapply&lt;/code&gt;. &lt;code&gt;mapply&lt;/code&gt; applies  &lt;code&gt;FUN&lt;/code&gt; to the first elements of each ... 
argument&lt;br /&gt;&gt; mapply(rep, 1:4, 4:1)&lt;br /&gt;[[1]]&lt;br /&gt;[1] 1 1 1 1&lt;br /&gt;[[2]]&lt;br /&gt;[1] 2 2 2&lt;br /&gt;[[3]]&lt;br /&gt;[1] 3 3&lt;br /&gt;[[4]]&lt;br /&gt;[1] 4&lt;br /&gt;&lt;br /&gt;&gt; mapply(rep, times=1:4, x=4:1)&lt;br /&gt;[[1]]&lt;br /&gt;[1] 4&lt;br /&gt;[[2]]&lt;br /&gt;[1] 3 3&lt;br /&gt;[[3]]&lt;br /&gt;[1] 2 2 2&lt;br /&gt;[[4]]&lt;br /&gt;[1] 1 1 1 1&lt;br /&gt;&lt;br /&gt;tapply(X, INDEX, FUN) applies a function to each cell of a ragged array, i.e. to each group of values defined by the levels of a factor&lt;br /&gt;&gt; n &lt;- 17; fac &lt;- factor(rep(1:3, len = n), levels = 1:5)&lt;br /&gt;&gt; table(fac)&lt;br /&gt;fac&lt;br /&gt;1 2 3 4 5&lt;br /&gt;6 6 5 0 0&lt;br /&gt;&gt; tapply(1:n, fac, sum)&lt;br /&gt;1  2  3  4  5&lt;br /&gt;51 57 45 NA NA&lt;br /&gt;&lt;br /&gt;###################################################&lt;br /&gt;From &lt;a href="http://www.statmethods.net/stats/withby.html"&gt;http://www.statmethods.net/stats/withby.html&lt;/a&gt;&lt;br /&gt;With&lt;br /&gt;&lt;br /&gt;The with( ) function applies an expression to a dataset. It is similar to DATA= in SAS.&lt;br /&gt;&lt;br /&gt;# with(data, expression)&lt;br /&gt;# example applying a t-test to dataframe mydata&lt;br /&gt;with(mydata, t.test(y1,y2))&lt;br /&gt;&lt;br /&gt;By&lt;br /&gt;&lt;br /&gt;The by( ) function applies a function to each level of a factor or factors. 
It is similar to BY processing in SAS.&lt;br /&gt;&lt;br /&gt;# by(data, factorlist, function)&lt;br /&gt;# example: apply a t-test separately for men and women&lt;br /&gt;# the third argument must be a function of each subset; gender, y1 and y2 are assumed to be columns of mydata&lt;br /&gt;by(mydata, mydata$gender, function(d) t.test(d$y1, d$y2))&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-6419997482498162082?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/6419997482498162082/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=6419997482498162082&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/6419997482498162082'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/6419997482498162082'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/10/apply-summary.html' title='apply, with and by summary'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-7525492023675531325</id><published>2008-09-22T15:50:00.001-07:00</published><updated>2008-09-23T13:59:45.508-07:00</updated><title type='text'>Add boxplots to a scatterplot</title><content type='html'>par(fig=c(0,0.8,0,0.8), new=TRUE)&lt;br /&gt;plot(mtcars$wt, mtcars$mpg, xlab="Car Weight",&lt;br /&gt;  ylab="Miles Per Gallon")&lt;br /&gt;par(fig=c(0,0.8,0.55,1), new=TRUE)&lt;br /&gt;boxplot(mtcars$wt, horizontal=TRUE, axes=FALSE)&lt;br /&gt;par(fig=c(0.65,1,0,0.8),new=TRUE)&lt;br /&gt;boxplot(mtcars$mpg, axes=FALSE)&lt;br /&gt;mtext("Enhanced Scatterplot", side=3, outer=TRUE, line=-3) &lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.statmethods.net/advgraphs/images/layout4.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px;" src="http://www.statmethods.net/advgraphs/images/layout4.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To understand this graph, think of the full graph area as going from (0,0) in the lower left corner to (1,1) in the upper right corner. The format of the fig= parameter is a numerical vector of the form c(x1, x2, y1, y2). The first fig= sets up the scatterplot going from 0 to 0.8 on the x axis and 0 to 0.8 on the y axis. The top boxplot goes from 0 to 0.8 on the x axis and 0.55 to 1 on the y axis. I chose 0.55 rather than 0.8 so that the top figure will be pulled closer to the scatterplot. The right hand boxplot goes from 0.65 to 1 on the x axis and 0 to 0.8 on the y axis. Again, I chose a value to pull the right hand boxplot closer to the scatterplot. 
You have to experiment to get it just right.&lt;br /&gt;&lt;br /&gt;fig= starts a new plot, so to add to an existing plot use new=TRUE.&lt;br /&gt;&lt;br /&gt;You can use this to combine several plots in any arrangement into one graph. &lt;br /&gt;&lt;br /&gt;Reposted from (&lt;a href="http://www.statmethods.net/advgraphs/layout.html"&gt;Quick R&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;=============================================&lt;br /&gt;== my own method:&lt;br /&gt;&lt;br /&gt;ll = matrix(c(2, 0, 5, 0, 1, 3, 4, 6, 8, 0, 11, 0, 7, 9, 10, 12), nrow=4, byrow=TRUE)&lt;br /&gt;width  = c(0.8, 0.17, 0.8, 0.17)&lt;br /&gt;height = c(0.17, 0.8, 0.17, 0.8) &lt;br /&gt;layout(ll, width, height)&lt;br /&gt;&lt;br /&gt;plot.data.list &lt;- list(&lt;br /&gt;cbind(rnorm(200), runif(200)), &lt;br /&gt;cbind(rgeom(200, prob=0.2), runif(200)),&lt;br /&gt;cbind(rgamma(200, shape=2), runif(200)),&lt;br /&gt;cbind(rpois(200, lambda=5), runif(200))&lt;br /&gt;);&lt;br /&gt;&lt;br /&gt;for(i in 1:4){&lt;br /&gt;plot.data &lt;- plot.data.list[[i]];&lt;br /&gt;xmax &lt;- max(plot.data[ ,1]);&lt;br /&gt;xmin &lt;- min(plot.data[ ,1]);&lt;br /&gt;ymax &lt;- max(plot.data[ ,2]);&lt;br /&gt;ymin &lt;- min(plot.data[ ,2]);&lt;br /&gt;#scatter plot&lt;br /&gt;par(mar=c(4, 4, 0, 0))&lt;br /&gt;plot(plot.data[ ,1], plot.data[ ,2], pch = 20, ylab = "Y", &lt;br /&gt;xlim = c(xmin, xmax), ylim=c(ymin, ymax), xlab = " ", main="");&lt;br /&gt;#boxplot of x, drawn above the scatter plot&lt;br /&gt;par(mar=c(0, 4, 1, 1))&lt;br /&gt;boxplot(plot.data[ ,1], horizontal=TRUE, axes=FALSE, ylim = c(xmin, xmax));&lt;br /&gt;mtext(text=expression("NAMESS"), side = 3, line=-0.5);&lt;br /&gt;#boxplot of y, drawn to the right of the scatter plot&lt;br /&gt;par(mar=c(4, 0, 1, 1))&lt;br /&gt;boxplot(plot.data[ ,2], axes=FALSE, ylim=c(ymin, ymax))&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;===================================================&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_xLDEmoNB_RM/SNlWGfyWFZI/AAAAAAAADa4/k4hcrcpECRI/s1600-h/sdf.png"&gt;&lt;img 
style="cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_xLDEmoNB_RM/SNlWGfyWFZI/AAAAAAAADa4/k4hcrcpECRI/s400/sdf.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5249321510288889234" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-7525492023675531325?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/7525492023675531325/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=7525492023675531325&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7525492023675531325'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7525492023675531325'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/09/add-boxplots-to-scatterplot.html' title='Add boxplots to a scatterplot'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_xLDEmoNB_RM/SNlWGfyWFZI/AAAAAAAADa4/k4hcrcpECRI/s72-c/sdf.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-7913759230916740342</id><published>2008-08-12T15:15:00.000-07:00</published><updated>2008-08-12T15:19:02.663-07:00</updated><title type='text'>a personal notes for SCH9 research projects</title><content type='html'>rm(list = ls());&lt;br /&gt;library(affy);&lt;br /&gt;source("C:/MASS/SVNs/AllCodes.svn/TimeSeries-3/main.funcs/0.0.main.func.R")&lt;br /&gt;input.PATH &lt;- c("C:/MASS/lab/TimeSeries-3/Profs/PTR.ver2");&lt;br /&gt;&lt;br /&gt;setwd(input.PATH);&lt;br /&gt;load(file = "PTR.TS.exprs.data.Rdata");&lt;br /&gt;&lt;br /&gt;WTdata &lt;- wt.exprs.data;&lt;br /&gt;SCHdata &lt;- sch.exprs.data;&lt;br /&gt;###########################################################&lt;br /&gt;genelist &lt;- c("BIO2", "BIO3", "BIO4", "BIO5" ,"PDC6", "HXK1", "HXT4", "HXT5")&lt;br /&gt;Probesetss &lt;- others.gene.lookup(genelist)[[1]];&lt;br /&gt;###########################################################&lt;br /&gt;windows(8, 4)&lt;br /&gt;par(mfrow=c(1,2))&lt;br /&gt;plot(0, type="n", ylim=c(7, 13), xlim=c(0, 10), main="Wild Type", xlab="Time Point (hr)", ylab="log2(expr)", axes = FALSE);&lt;br /&gt;axis(1, 1:10, (1:10)*12, cex.axis=0.7)&lt;br /&gt;axis(2)&lt;br /&gt;for(i in 1:length(Probesetss)){&lt;br /&gt; lines(1:length(WTdata[Probesetss[i], ]), WTdata[Probesetss[i], ], type = "b", col = i, lty = 1,lwd=1, pch=20);&lt;br /&gt;}&lt;br /&gt;box()&lt;br /&gt;text(0.3, 14, "A1");&lt;br /&gt;&lt;br /&gt;plot(0, type="n", ylim=c(7, 13), xlim=c(0, 9), main=expression(paste(italic(Sch9), Delta)), xlab="Time Point (hr)", 
ylab="log2(expr)", axes = FALSE);&lt;br /&gt;axis(1, 1:9, (1:9)*12, cex.axis=0.7)&lt;br /&gt;axis(2)&lt;br /&gt;for(i in 1:length(Probesetss)){&lt;br /&gt; lines(1:length(SCHdata[Probesetss[i], ]), SCHdata[Probesetss[i], ], type = "b", col = i, lty = 1,lwd=1, pch=20);&lt;br /&gt;}&lt;br /&gt;box()&lt;br /&gt;legend.list &lt;- as.character(ORF_GENE_PROBE[Probesetss, "Gene"]);&lt;br /&gt;legend("topright", legend.list, col = 1:length(Probesetss), lty = 1,lwd=1, pch=20, cex=0.7)&lt;br /&gt;&lt;br /&gt;###########################################################&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-7913759230916740342?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/7913759230916740342/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=7913759230916740342&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7913759230916740342'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/7913759230916740342'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/08/personal-notes-for-sch9-research.html' title='a personal notes for SCH9 research projects'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-369906988451896112</id><published>2008-07-17T16:58:00.000-07:00</published><updated>2008-07-18T12:15:27.285-07:00</updated><title type='text'>One of my R plot</title><content type='html'># a plot whose layout took me an hour to adjust&lt;br /&gt;# Jul 17, 2008;5:00:57 PM&lt;br /&gt;#&lt;br /&gt;# Author: Gary HY Ge, hge AT usc DOT edu; hhggee AT gmail DOT com&lt;br /&gt;# Copyright reserved&lt;br /&gt;###############################################################################&lt;br /&gt;&lt;br /&gt;mat &lt;- matrix(c(1, 0, 0, 2:4, 5, 0, 0, 6:8, 0, 9, 10), 5, 3, byrow = TRUE)&lt;br /&gt;nf &lt;- layout(mat, widths = c(3, 2.5, 3), heights = c(0.37, 3, 0.37, 3, 1.7), TRUE)&lt;br /&gt;#layout.show(nf)&lt;br /&gt;&lt;br /&gt;titlenames = c("Sample1", "Sample2"); #Just a name&lt;br /&gt;data1 &lt;- matrix(rnorm(66, mean = 0, sd = 0.5), 11, 6);&lt;br /&gt;data2 &lt;- matrix(rnorm(66, mean = 0.2, sd = 1), 11, 6);&lt;br /&gt;dataset &lt;- list(data1, data2);&lt;br /&gt;&lt;br /&gt;for(j in 1:2){&lt;br /&gt;#j = 1;&lt;br /&gt;titlename = titlenames[j];&lt;br /&gt;data1 &lt;- dataset[[j]];&lt;br /&gt;&lt;br /&gt;# the plot section name&lt;br /&gt;par(mar=c(0.5, 0.5, 0.5, 0)); #define the margin of the first plot&lt;br /&gt;plot(0, type="n", ylim=c(6, 15), xlim=c(0, 9), main="", xlab="", ylab="", axes = FALSE);&lt;br /&gt;texts &lt;- paste("(", titlename, ")", sep="")&lt;br /&gt;text(2, 10, texts, font = 2, cex = 1.3); #font 1: plain; 2: bold; 3: italic&lt;br /&gt;&lt;br /&gt;## histogram of the data:&lt;br /&gt;par(mar=c(4, 4, 3, 1));&lt;br /&gt;resids &lt;- as.vector(data1);&lt;br /&gt;ylabb = "Frequency";&lt;br /&gt;hist(resids, prob=FALSE, breaks=seq(min(resids), max(resids)+0.0599, by=0.06), ylim = 
c(0, 10), main = "Residual histogram", xlab = "Residuals", ylab= ylabb, xlim = c(-1.5, 1.5));#&lt;br /&gt;#lines(density(resids, kernel = c("gaussian"),adjust = 0.1), col = "red", lty=1);&lt;br /&gt;box();&lt;br /&gt; &lt;br /&gt;## heatmap of the residuals:&lt;br /&gt;#par(mar=c(3, 4, 0.2, 1));&lt;br /&gt;heatmapcols &lt;- rainbow(31);&lt;br /&gt;namess1 &lt;- paste("P_", 1:nrow(data1), sep="");&lt;br /&gt;namess1 &lt;- c(" ", namess1, " ")&lt;br /&gt; &lt;br /&gt;image.data &lt;- data1;&lt;br /&gt;cutoff1 &lt;- 0.1;&lt;br /&gt;cutoff2 &lt;- 0.5;&lt;br /&gt;index1 &lt;- abs(image.data) &lt; cutoff1 #small numbers&lt;br /&gt;index4 &lt;- image.data &lt; -cutoff2 #large negative outliers&lt;br /&gt;index5 &lt;- image.data &gt; cutoff2 #large positive outliers&lt;br /&gt; &lt;br /&gt;image.data[index1]  &lt;- 0;&lt;br /&gt;image.data[index4]  &lt;- -cutoff2;&lt;br /&gt;image.data[index5]  &lt;- cutoff2;&lt;br /&gt;image.data &lt;- t(image.data);&lt;br /&gt; &lt;br /&gt;len &lt;- nrow(image.data);&lt;br /&gt;bounders &lt;- rep(cutoff2, 6);&lt;br /&gt;image.data &lt;- cbind(bounders, image.data, -bounders)&lt;br /&gt;dim(image.data);&lt;br /&gt; &lt;br /&gt;image(x=1:nrow(image.data), y=1:ncol(image.data), image.data, col=heatmapcols, axes = FALSE, xlab="", ylab="", main="Residual heatmap");&lt;br /&gt;grid(6, 13)&lt;br /&gt;abline(v = 3.5, col="black", lwd=2);&lt;br /&gt;par(las=2)&lt;br /&gt;axis(2, 1:ncol(image.data), namess1, cex.axis=0.9, tick = FALSE);&lt;br /&gt;par(las=1)&lt;br /&gt;print(rownames(image.data))&lt;br /&gt;axis(1, 1:nrow(image.data), c("", "A", " ", " ", "B", " "), cex.axis=0.9, tick = FALSE);#Exp4_R1~3&lt;br /&gt; #box();&lt;br /&gt;rect(-0.60, -0.6, 3.475, 1.6, col = "white");&lt;br /&gt;rect(3.525, -0.6, 6.60, 1.6, col = "white");&lt;br /&gt;rect(-0.5, 12.4, 6.6, 13.7, col = "white")&lt;br /&gt;rect(0.5, -0.6, 0.55, 1.6, col = "black");&lt;br /&gt;rect(6.45, -0.6, 6.5, 1.6, col = "black");&lt;br /&gt; &lt;br /&gt;par(las=1)&lt;br 
/&gt;###########################################################&lt;br /&gt;##  curves&lt;br /&gt;#par(mar=c(4, 4, 0.2, 1));&lt;br /&gt;# good to have a set of plot type controller&lt;br /&gt;setcols &lt;- c(&lt;br /&gt; " ", "blue", "blue", "1", "20",&lt;br /&gt; " ", "darkblue", "blue", "2", "18",&lt;br /&gt; &lt;br /&gt; " ", "cadetblue", "red", "1", "20",&lt;br /&gt; " ", "cadetblue4", "red", "2", "20",&lt;br /&gt; &lt;br /&gt; " ", "darkcyan", "green", "1", "20",&lt;br /&gt; " ", "red", "green", "2", "20",&lt;br /&gt; &lt;br /&gt; " ", "brown", "black", "1", "18",&lt;br /&gt; " ", "darkmagenta", "black", "2", "20",&lt;br /&gt; &lt;br /&gt; " ", "darkred", "cyan", "1", "20",&lt;br /&gt; " ", "purple", "cyan", "2", "20",&lt;br /&gt; &lt;br /&gt; " ", "green", "orange", "1", "20",&lt;br /&gt; " ", "darkolivegreen", "orange", "2", "18",&lt;br /&gt; &lt;br /&gt; " ", "darkgoldenrod1", "grey","1", "20",&lt;br /&gt; " ", "tomato", "grey","2", "18"&lt;br /&gt; )&lt;br /&gt; setcols &lt;- matrix(setcols, ncol=5, byrow=TRUE);&lt;br /&gt; setcols &lt;- rbind(setcols, setcols)&lt;br /&gt; cols &lt;- setcols[ ,3];&lt;br /&gt; lty3 &lt;- as.numeric(setcols[ ,4])&lt;br /&gt; pchs &lt;- as.numeric(setcols[ ,5])&lt;br /&gt; ###########################################################&lt;br /&gt;&lt;br /&gt; #################################&lt;br /&gt; intensities &lt;- data1&lt;br /&gt; ylabb = "Values";&lt;br /&gt; plot(0, type="n", ylim=c(min(intensities), max(intensities)), xlim=c(1, ncol(intensities)), main = "curves", ylab = ylabb, xlab = "index")&lt;br /&gt; for(i in 1:nrow(intensities)){&lt;br /&gt;  lines(1:ncol(intensities), intensities[i, ], col=cols[i], lty=lty3[i], lwd=1)&lt;br /&gt; }&lt;br /&gt; abline(v=3, col="grey");&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;par(mar=c(5.7, 4, 3, 1));&lt;br /&gt;lens &lt;- length(heatmapcols)&lt;br /&gt;ColorLevels &lt;- seq(-5, 5, length=lens)&lt;br /&gt;image(ColorLevels, 1, matrix(data=ColorLevels, 
ncol=1,nrow=length(ColorLevels)), main="Color scale",&lt;br /&gt;  col=heatmapcols, xlab="resid values",ylab="", xaxt="n", axes = FALSE);&lt;br /&gt;qvals &lt;- c(0.001, 0.01, 0.05)&lt;br /&gt;par(las=1)&lt;br /&gt;axis(1, c(-5, -3, -1, 0, 1, 3, 5), c(-0.5, -0.3, -0.1, 0, 0.1, 0.3, 0.5), cex.axis=0.9, tick = TRUE)&lt;br /&gt;&lt;br /&gt;par(mar=c(0, 4, 3, 1));&lt;br /&gt;plot(0, type="n", ylim=c(6, 15), xlim=c(0, 9), main="Curve legend", &lt;br /&gt;   xlab="", ylab="", axes = FALSE);&lt;br /&gt;box();&lt;br /&gt;legend.list &lt;- paste("P", 1:11, sep="_")&lt;br /&gt;len &lt;- length(legend.list)&lt;br /&gt;selected &lt;- 1:6;&lt;br /&gt;legend("topleft", legend.list[selected], col=cols[selected], lty=lty3[selected], cex=1, lwd=1, box.lwd = 0, box.lty = 0)&lt;br /&gt;selected &lt;- 7:11;&lt;br /&gt;legend("topright", legend.list[selected], col=cols[selected], lty=lty3[selected], cex=1, lwd=1, box.lwd = 0 , box.lty = 0)&lt;br /&gt;&lt;br /&gt;###########################################################&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp1.blogger.com/_xLDEmoNB_RM/SIALf3BsiUI/AAAAAAAADXk/LT2XRBT8xI4/s1600-h/5.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://bp1.blogger.com/_xLDEmoNB_RM/SIALf3BsiUI/AAAAAAAADXk/LT2XRBT8xI4/s400/5.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5224188209724688706" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-369906988451896112?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/369906988451896112/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=369906988451896112&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1440516860833895494/posts/default/369906988451896112'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/369906988451896112'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/07/one-of-my-r-plot.html' title='One of my R plot'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://bp1.blogger.com/_xLDEmoNB_RM/SIALf3BsiUI/AAAAAAAADXk/LT2XRBT8xI4/s72-c/5.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-2135074571028556835</id><published>2008-07-16T17:40:00.001-07:00</published><updated>2008-07-17T16:57:20.507-07:00</updated><title type='text'>color scale in R</title><content type='html'>require(graphics)&lt;br /&gt;# A Color Wheel&lt;br /&gt;pie(rep(1,12), col=rainbow(12))&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp1.blogger.com/_xLDEmoNB_RM/SH_aZNX4KsI/AAAAAAAADXU/WdIzE8PL09I/s1600-h/3.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://bp1.blogger.com/_xLDEmoNB_RM/SH_aZNX4KsI/AAAAAAAADXU/WdIzE8PL09I/s400/3.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5224134219394460354" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;##------ Some palettes ------------##&lt;br /&gt;n = 32;&lt;br /&gt;main.name = paste("color palettes;  n=",n)&lt;br /&gt;ch.col = c("rainbow(n, start=.7, end=.1)", "heat.colors(n)", "terrain.colors(n)", "topo.colors(n)", "cm.colors(n)");&lt;br /&gt;&lt;br /&gt;nt &lt;- length(ch.col)&lt;br /&gt;i &lt;- 1:n; &lt;br /&gt;j &lt;- n/nt; &lt;br /&gt;d &lt;- j/6; &lt;br /&gt;dy &lt;- 2*d;&lt;br /&gt;&lt;br /&gt;plot(i,i+d, type="n", yaxt="n", xaxt="n", ylab="", 
xlab ="", main=main.name) # yaxt="n" and xaxt="n" suppress the axis labels and ticks&lt;br /&gt;for (k in 1:nt) {&lt;br /&gt;   rect(i-.5, (k-1)*j+ dy, i+.4, k*j, col = eval(parse(text=ch.col[k])), border = "grey");&lt;br /&gt;   text(2.5*j,  k * j + dy/2, ch.col[k])&lt;br /&gt;}&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp1.blogger.com/_xLDEmoNB_RM/SH_cQTvrmII/AAAAAAAADXc/dhYXDspkDKs/s1600-h/4.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://bp1.blogger.com/_xLDEmoNB_RM/SH_cQTvrmII/AAAAAAAADXc/dhYXDspkDKs/s400/4.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5224136265509345410" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-2135074571028556835?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/2135074571028556835/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=2135074571028556835&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/2135074571028556835'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/2135074571028556835'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/07/color-scale-in-r.html' title='color scale in R'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://bp1.blogger.com/_xLDEmoNB_RM/SH_aZNX4KsI/AAAAAAAADXU/WdIzE8PL09I/s72-c/3.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-6111347745040940399</id><published>2008-07-16T14:50:00.000-07:00</published><updated>2008-07-16T17:21:39.689-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='layout'/><title type='text'>layout and arrange the figures on plot</title><content type='html'>## divide the device into two rows and two columns&lt;br /&gt;## allocate figure 1 all of row 1&lt;br /&gt;## allocate figure 2 the intersection of column 2 and row 2&lt;br /&gt;# example 1&lt;br /&gt;nf &lt;- layout(mat = matrix(c(2,0,1,3),2,2,byrow=TRUE), widths = c(3,1), height = c(1,3), TRUE)&lt;br /&gt;layout.show(nf)&lt;br /&gt;&lt;br /&gt;## divide device into two rows and two columns&lt;br /&gt;## allocate figure 1 and figure 2 as above&lt;br /&gt;##-- Create a scatterplot with marginal histograms -----&lt;br /&gt;&lt;br /&gt;x &lt;- rnorm(50);&lt;br /&gt;y &lt;- runif(50);&lt;br /&gt;xhist &lt;- hist(x, plot=FALSE)&lt;br /&gt;yhist &lt;- hist(y, plot=FALSE)&lt;br /&gt;top &lt;- max(c(xhist$counts, yhist$counts))&lt;br /&gt;xrange &lt;- c(-1,1)&lt;br /&gt;yrange &lt;- c(0,1)&lt;br /&gt;&lt;br /&gt;par(mar=c(3,3,1,1))&lt;br /&gt;plot(x, y, xlim=xrange, ylim=yrange, xlab="", ylab="")&lt;br /&gt;par(mar=c(0,3,1,1))&lt;br /&gt;barplot(xhist$counts, axes=FALSE, ylim=c(0, top), space=0)&lt;br /&gt;par(mar=c(3,0,1,1))&lt;br /&gt;barplot(yhist$counts, axes=FALSE, xlim=c(0, top), space=0, horiz=TRUE)&lt;br /&gt;&lt;br /&gt;###############################################&lt;br /&gt;## example 2&lt;br /&gt;## define the figure size on plot&lt;br 
/&gt;mat &lt;- matrix(c(1:6, 0, 7, 8), 3, 3, byrow = TRUE)&lt;br /&gt;nf &lt;- layout(mat, widths = c(3, 3, 3), height = c(3,3, 1), TRUE)&lt;br /&gt;layout.show(nf)&lt;br /&gt;&lt;br /&gt;###############################################&lt;br /&gt;# example 3&lt;br /&gt;mat &lt;- matrix(c(1, 0, 0, 2:4, 5, 0, 0, 6:8, 0, 9, 10), 5, 3, byrow = TRUE)&lt;br /&gt;nf &lt;- layout(mat, widths = c(3, 2.5, 3), height = c(0.37, 3, 0.37, 3, 1.7), TRUE)&lt;br /&gt;layout.show(nf)&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp1.blogger.com/_xLDEmoNB_RM/SH6QWufrxzI/AAAAAAAADW8/Jwz3xZhQSzE/s1600-h/layouts.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://bp1.blogger.com/_xLDEmoNB_RM/SH6QWufrxzI/AAAAAAAADW8/Jwz3xZhQSzE/s400/layouts.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5223771337908799282" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-6111347745040940399?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/6111347745040940399/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=6111347745040940399&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/6111347745040940399'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/6111347745040940399'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/07/arrange-figures-on-plot.html' title='layout and arrange the figures on plot'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://bp1.blogger.com/_xLDEmoNB_RM/SH6QWufrxzI/AAAAAAAADW8/Jwz3xZhQSzE/s72-c/layouts.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-4370955711988855878</id><published>2008-07-08T13:48:00.000-07:00</published><updated>2008-07-08T13:57:52.142-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unix'/><title type='text'>submit R jobs to Unix cluster nodes</title><content type='html'>Sample 1:&lt;br /&gt;--------------------&lt;br /&gt;#!/bin/sh&lt;br /&gt;R --vanilla &lt;&lt; EOF&lt;br /&gt;cat("Hello", file="test.txt");&lt;br /&gt;EOF&lt;br /&gt;--------------------&lt;br /&gt;&lt;br /&gt;Sample 2:&lt;br /&gt;--------------------&lt;br /&gt;rm(list = ls());&lt;br /&gt;setwd("/auto/cmb-01/hge/");&lt;br /&gt;for(i in 1:3){&lt;br /&gt;filenamess = paste("JOB.", i, ".R", sep=""); &lt;br /&gt;cat("#!/bin/sh&lt;br /&gt;R --vanilla &lt;&lt; EOF&lt;br /&gt;#source(\"/auto/cmb-01/hge/somecode.R\");&lt;br /&gt;print(", i, ");&lt;br /&gt;EOF"&lt;br /&gt;, file=filenamess);&lt;br /&gt;commandlines = paste("qsub -q cmb -j oe -l walltime=120:00:00,nodes=1:myri:ppn=1 ", filenamess, sep = "");&lt;br /&gt;system(commandlines);&lt;br /&gt;}&lt;br /&gt;-----------------------------------------------------&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-4370955711988855878?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/4370955711988855878/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=4370955711988855878&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4370955711988855878'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4370955711988855878'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/07/running-r-on-unix-machine.html' title='submit R jobs to Unix cluster nodes'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-3517845189772312452</id><published>2008-07-08T13:40:00.001-07:00</published><updated>2008-07-08T13:47:06.317-07:00</updated><title type='text'>R 2 Latex</title><content type='html'>rm(list = ls());&lt;br /&gt;#install.packages("xtable")&lt;br /&gt;library(xtable)&lt;br /&gt;object &lt;- matrix(rnorm(50), 5, 10)&lt;br /&gt;newobject&lt;-xtable(object)&lt;br /&gt;setwd("C:/tmp");&lt;br /&gt;print(newobject, type="latex", file="filename.tex")&lt;br /&gt;&lt;br /&gt;-----------------------&lt;br /&gt;OUTPUT:&lt;br /&gt;&lt;br /&gt;% latex table generated in R 2.5.1 by xtable 1.5-1 package&lt;br /&gt;% Tue Jul 08 13:46:27 2008&lt;br /&gt;\begin{table}[ht]&lt;br /&gt;\begin{center}&lt;br /&gt;\begin{tabular}{rrrrr}&lt;br /&gt;  \hline&lt;br /&gt; &amp; 1 &amp; 2 &amp; 3 &amp; 4 \\&lt;br /&gt;  \hline&lt;br /&gt;1 &amp; 1.52 &amp; 1.74 &amp; $-$0.70 &amp; $-$2.77 \\&lt;br /&gt;  2 &amp; $-$1.37 &amp; 1.70 &amp; 1.27 &amp; 1.22 \\&lt;br /&gt;  3 &amp; 0.83 &amp; 0.43 &amp; $-$0.64 &amp; $-$1.85 \\&lt;br /&gt;   \hline&lt;br /&gt;\end{tabular}&lt;br /&gt;\end{center}&lt;br /&gt;\end{table}&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-3517845189772312452?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/3517845189772312452/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=3517845189772312452&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/3517845189772312452'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/3517845189772312452'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/07/r-2-latex.html' title='R 2 Latex'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-1470357187259900737</id><published>2008-07-08T13:29:00.000-07:00</published><updated>2008-07-08T13:34:59.305-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='html'/><title type='text'>R 2 html</title><content type='html'>Generate HTML files with the R package R2HTML&lt;br /&gt;&lt;br /&gt;# Sample Session&lt;br /&gt;# install.packages("R2HTML")&lt;br /&gt;rm(list = ls());&lt;br /&gt;setwd("C:/tmp");&lt;br /&gt;x = 1:10;&lt;br /&gt;y = 1:10;&lt;br /&gt;&lt;br /&gt;mydata = list(x=x, y=y);&lt;br /&gt;&lt;br /&gt;library(R2HTML)&lt;br /&gt;HTMLStart(outdir="C:/tmp", file="myreport", extension="html", echo=FALSE, HTMLframe=TRUE)&lt;br /&gt;HTML.title("My Report", HR=1)&lt;br /&gt;HTML.title("Description of my data", HR=3)&lt;br 
/&gt;summary(mydata)&lt;br /&gt;HTMLhr()&lt;br /&gt;HTML.title("X Y Scatter Plot", HR=2)&lt;br /&gt;plot(mydata$y~mydata$x)&lt;br /&gt;HTMLplot()&lt;br /&gt;HTMLStop()&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-1470357187259900737?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/1470357187259900737/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=1470357187259900737&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/1470357187259900737'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/1470357187259900737'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/07/r-2-html.html' title='R 2 html'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-4990047918621602370</id><published>2008-02-29T11:17:00.000-08:00</published><updated>2008-07-08T12:04:31.469-07:00</updated><title type='text'>R Create Venn diagram</title><content type='html'>Using the package limma to draw the venn diagram.&lt;br /&gt;&lt;br /&gt;Code:&lt;br /&gt;-----------------------------------------------&lt;br /&gt;A1 &lt;- c("A", "B");&lt;br /&gt;A2 &lt;- c("A", "B", "C");&lt;br /&gt;A3 &lt;- c("D", "B");&lt;br /&gt;&lt;br /&gt;library(limma);&lt;br /&gt;SET.list &lt;- list(A1=A1, A2=A2, A3=A3);&lt;br /&gt;len &lt;- length(SET.list);&lt;br /&gt;all.unique &lt;- unique(unlist(SET.list));&lt;br /&gt;tmp &lt;- matrix(0, ncol=len, nrow=length(all.unique));&lt;br /&gt;colnames(tmp) &lt;- names(SET.list);&lt;br /&gt;rownames(tmp) &lt;- all.unique;&lt;br /&gt;for(ith in 1:len){&lt;br /&gt;tmp[SET.list[[ith]], ith] &lt;- 1;&lt;br /&gt;}&lt;br /&gt;count.sum &lt;- apply(tmp, 1, sum);&lt;br /&gt;count.sum&lt;br /&gt;if(len &lt; 4){&lt;br /&gt;windows();&lt;br /&gt;vennDiagram(vennCounts(tmp), main = "gene.intersection", cex = 1)&lt;br /&gt;}&lt;br /&gt;-----------------------------------------------&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_xLDEmoNB_RM/SHO6Mjo6qbI/AAAAAAAADVM/zrl7yZCQvTk/s1600-h/venn.JPG"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://2.bp.blogspot.com/_xLDEmoNB_RM/SHO6Mjo6qbI/AAAAAAAADVM/zrl7yZCQvTk/s400/venn.JPG" alt="" id="BLOGGER_PHOTO_ID_5220721117940591026" border="0" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/1440516860833895494-4990047918621602370?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/4990047918621602370/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=4990047918621602370&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4990047918621602370'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4990047918621602370'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/02/r-create-venn-diagram.html' title='R Create Venn diagram'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_xLDEmoNB_RM/SHO6Mjo6qbI/AAAAAAAADVM/zrl7yZCQvTk/s72-c/venn.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-4601535593488444852</id><published>2008-02-27T10:40:00.000-08:00</published><updated>2008-07-08T12:11:54.495-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='color'/><title type='text'>R colors</title><content type='html'>R's colors() function lists the built-in color names, which helps in finding a color by name or by index. It contains 657 colors:&lt;br /&gt;For example:&lt;br /&gt;--------------------------------------------------------------------------&lt;br /&gt;&gt; colors()[c(552,254,26)]&lt;br /&gt;[1] "red"   "green" "blue"&lt;br /&gt;&gt; colors()[grep("red",colors())]&lt;br /&gt;[1] 
"darkred"         "indianred"       "indianred1"      "indianred2"&lt;br /&gt;[5] "indianred3"      "indianred4"      "mediumvioletred" "orangered"&lt;br /&gt;[9] "orangered1"      "orangered2"      "orangered3"      "orangered4"&lt;br /&gt;[13] "palevioletred"   "palevioletred1"  "palevioletred2"  "palevioletred3"&lt;br /&gt;[17] "palevioletred4"  "red"             "red1"            "red2"&lt;br /&gt;[21] "red3"            "red4"            "violetred"       "violetred1"&lt;br /&gt;[25] "violetred2"      "violetred3"      "violetred4"&lt;br /&gt;&gt; colors()[grep("sky",colors())]&lt;br /&gt;[1] "deepskyblue"   "deepskyblue1"  "deepskyblue2"  "deepskyblue3"&lt;br /&gt;[5] "deepskyblue4"  "lightskyblue"  "lightskyblue1" "lightskyblue2"&lt;br /&gt;[9] "lightskyblue3" "lightskyblue4" "skyblue"       "skyblue1"&lt;br /&gt;[13] "skyblue2"      "skyblue3"      "skyblue4"&lt;br /&gt;&gt; col2rgb("yellow")&lt;br /&gt;[,1]&lt;br /&gt;red    255&lt;br /&gt;green  255&lt;br /&gt;blue     0&lt;br /&gt;--------------------------------------------------------------------&lt;br /&gt;A full set of color image is here, download from the reference site.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_xLDEmoNB_RM/SHO7neFnw3I/AAAAAAAADVc/H1wxSm3dDTI/s1600-h/color.JPG"&gt;&lt;img style="cursor: pointer;" src="http://2.bp.blogspot.com/_xLDEmoNB_RM/SHO7neFnw3I/AAAAAAAADVc/H1wxSm3dDTI/s400/color.JPG" alt="" id="BLOGGER_PHOTO_ID_5220722679818470258" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The chart can be generate by the code:&lt;br /&gt;&lt;br /&gt;setwd("C:/");&lt;br /&gt;source("http://research.stowers-institute.org/efg/R/Color/Chart/ColorChart.R")&lt;br /&gt;&lt;br /&gt;and check  the pdf file under your your C: directory.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Reference:&lt;br /&gt;&lt;a 
href="http://research.stowers-institute.org/efg/R/Color/Chart/index.htm"&gt;http://research.stowers-institute.org/efg/R/Color/Chart/index.htm&lt;/a&gt;&lt;br /&gt;&lt;a href="http://www.stat.columbia.edu/%7Etzheng/files/Rcolor.pdf"&gt;&lt;span style=""&gt;&lt;span class="a"&gt;http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-4601535593488444852?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/4601535593488444852/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=4601535593488444852&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4601535593488444852'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4601535593488444852'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/02/r-colors.html' title='R colors'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_xLDEmoNB_RM/SHO7neFnw3I/AAAAAAAADVc/H1wxSm3dDTI/s72-c/color.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-304208989093446930</id><published>2008-02-25T21:53:00.001-08:00</published><updated>2008-07-17T16:39:18.612-07:00</updated><title type='text'>Letter counter using R</title><content type='html'>We can use R for tasks it was not designed for, such as counting letters.&lt;br /&gt;&lt;br /&gt;In R, the package "seqinr" is designed for the analysis of DNA and protein sequences.&lt;br /&gt;&lt;br /&gt;For example, in a txt file "tmp.txt", we have a&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-304208989093446930?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/304208989093446930/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=304208989093446930&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/304208989093446930'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/304208989093446930'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/02/letter-counter-using-r.html' title='Letter counter using R'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-3124524979407947418</id><published>2008-02-09T22:00:00.000-08:00</published><updated>2008-07-08T11:48:08.576-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='string'/><title type='text'>R string manipulation</title><content type='html'>&gt; x &lt;- c("a", "b", "c"); &gt; paste(x, 1:3, sep="..");&lt;br /&gt;[1] "a..1" "b..2" "c..3"&lt;br /&gt;&gt; paste(x, collapse="..");&lt;br /&gt;[1] "a..b..c"&lt;br /&gt;&gt; strsplit("a.b.c", ".", fixed = TRUE)&lt;br /&gt;[[1]]&lt;br /&gt;[1] "a" "b" "c"&lt;br /&gt;&lt;br /&gt;&gt; unlist(strsplit("a.b.c", ".", fixed = TRUE));&lt;br /&gt;[1] "a" "b" "c"&lt;br /&gt;&gt; strtrim(c("abcdef", "abcdef", "abcdef"), c(1,5,10))&lt;br /&gt;[1] "a"      "abcde"  "abcdef"&lt;br /&gt;&gt; substr("abcdef",2,4)&lt;br /&gt;[1] "bcd"&lt;br /&gt;&gt; substring("abcdef",c(1, 3), c(2, 6))&lt;br /&gt;[1] "ab"   "cdef"&lt;br /&gt;&gt; strtrim(c("abcdef", "abcdef", "abcdef"), c(1,5,10))&lt;br /&gt;[1] "a"      "abcde"  "abcdef"&lt;br /&gt;&lt;br /&gt;#############################################&lt;br /&gt;x &lt;- c("a", "b", "c");&lt;br /&gt;paste(x, 1:3, sep=".."); &lt;br /&gt;paste(x, collapse=".."); &lt;br /&gt;strsplit("a.b.c", ".", fixed = TRUE) &lt;br /&gt;unlist(strsplit("a.b.c", ".", fixed = TRUE)); &lt;br /&gt;strtrim(c("abcdef", "abcdef", "abcdef"), c(1,5,10)) &lt;br /&gt;substr("abcdef",2,4) &lt;br /&gt;substring("abcdef",c(1, 3), c(2, 6)) &lt;br /&gt;strtrim(c("abcdef", "abcdef", "abcdef"), c(1,5,10))&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-3124524979407947418?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/3124524979407947418/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=3124524979407947418&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/3124524979407947418'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/3124524979407947418'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/02/r-string-manipulation.html' title='R string manipulation'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-5028327568359377338</id><published>2008-02-04T11:26:00.000-08:00</published><updated>2009-05-21T09:39:29.658-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='point pch'/><title type='text'>R points</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_xLDEmoNB_RM/SHO-dMjhjlI/AAAAAAAADVk/QVY72RoCtS4/s1600-h/Rpoints.PNG"&gt;&lt;img style="cursor: pointer;" src="http://3.bp.blogspot.com/_xLDEmoNB_RM/SHO-dMjhjlI/AAAAAAAADVk/QVY72RoCtS4/s400/Rpoints.PNG" alt="" id="BLOGGER_PHOTO_ID_5220725801848245842" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;R uses pch to control the point symbol.  This can either be a single character or an integer code for one of a set of graphics symbols. 
The full set of S symbols is available with pch=0:18.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-5028327568359377338?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/5028327568359377338/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=5028327568359377338&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/5028327568359377338'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/5028327568359377338'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/02/r-points.html' title='R points'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_xLDEmoNB_RM/SHO-dMjhjlI/AAAAAAAADVk/QVY72RoCtS4/s72-c/Rpoints.PNG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-4006299122230529711</id><published>2008-01-04T12:20:00.000-08:00</published><updated>2008-10-29T19:58:50.707-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='symbol'/><category scheme='http://www.blogger.com/atom/ns#' term='figure'/><title type='text'>special symbols on R plot</title><content type='html'>Sometimes we need to put special symbols, like Greek letters and italic type, on the plots. Sometimes, we also want to put mathematical annotation on the plot. 
In R we can use the function expression() to do this job:&lt;br /&gt;&lt;br /&gt;Sample code&lt;br /&gt;&lt;br /&gt;xlab.name = expression(paste(italic(vti), Delta,  sep=""))&lt;br /&gt;ylab.name = expression(mu * "ml")&lt;br /&gt;main.name = expression(paste(plain(sin) * phi))&lt;br /&gt;plot(0, 0, xlab=xlab.name, ylab=ylab.name, main=main.name, xlim=c(-pi, pi), ylim=c(-1.5, 1.5),  axes=FALSE)&lt;br /&gt;axis(1, at = c(-pi, -pi/2, 0, pi/2, pi), labels = expression(-pi, -pi/2, 0, pi/2, pi))&lt;br /&gt;axis(2)&lt;br /&gt;box()&lt;br /&gt;text(-pi/2, 0, expression(hat(alpha) == (X^t * X)^{-1} * X^t * y))&lt;br /&gt;text(pi/2, 0, expression(paste(frac(1, sigma*sqrt(2*pi)),                plain(e)^{frac(-(x-mu)^2, 2*sigma^2)}, sep="")), cex = 1.2)&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_xLDEmoNB_RM/SHO_NBFfWPI/AAAAAAAADVs/u7KeD7feCgQ/s1600-h/Rsymbol.jpeg"&gt;&lt;img style="cursor: pointer;" src="http://1.bp.blogspot.com/_xLDEmoNB_RM/SHO_NBFfWPI/AAAAAAAADVs/u7KeD7feCgQ/s400/Rsymbol.jpeg" alt="" id="BLOGGER_PHOTO_ID_5220726623403202802" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;###  #######################&lt;br /&gt;PS: for a subscript expression,&lt;br /&gt;we should use &lt;br /&gt;&gt; ylab.name = expression(sigma[21])&lt;br /&gt;&lt;br /&gt;############################&lt;br /&gt;There are webpages that discuss this topic in detail. 
Mathematical Annotation in R: &lt;a href="http://rweb.stat.umn.edu/R/library/grDevices/html/plotmath.html"&gt;http://rweb.stat.umn.edu/R/library/grDevices/html/plotmath.html&lt;/a&gt;&lt;br /&gt;##############################&lt;br /&gt;others:&lt;br /&gt;main=expression(paste(italic(Sch9), Delta, " (18s and 5.8s)")), &lt;br /&gt;&lt;br /&gt;However, if we want to use a variable in the expression, we should use substitute:&lt;br /&gt;e.g.&lt;br /&gt;&lt;br /&gt;n &lt;- 20&lt;br /&gt;plot(0, 0, main = substitute(paste(n[i], " = ", k), list(k = n)));&lt;br /&gt;i=2;&lt;br /&gt;range.name = substitute(paste(italic(Sch9), Delta, " ", p, "/", k, " hr"), list(k = i*12, p = (i+1)*12));&lt;br /&gt;text(0, 0, range.name)&lt;br /&gt;# with a variable, we can use for iteration to produce multiple tags&lt;br /&gt;##############################&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-4006299122230529711?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/4006299122230529711/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=4006299122230529711&amp;isPopup=true' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4006299122230529711'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/4006299122230529711'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/01/special-symbols-on-r-plot.html' title='special symbols on R plot'/><author><name>Gary H.Y. 
Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_xLDEmoNB_RM/SHO_NBFfWPI/AAAAAAAADVs/u7KeD7feCgQ/s72-c/Rsymbol.jpeg' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-147223228516546289</id><published>2008-01-02T14:49:00.000-08:00</published><updated>2008-07-17T16:39:37.326-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='image'/><category scheme='http://www.blogger.com/atom/ns#' term='boxplot'/><category scheme='http://www.blogger.com/atom/ns#' term='axis'/><category scheme='http://www.blogger.com/atom/ns#' term='figure'/><title type='text'>Plot in details</title><content type='html'>When we plot, we can set the plot title and axis labels.&lt;br /&gt;We can also choose whether to show the axes. 
On an axis, we can set the label size with cex.axis and the label direction with las.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Sample code:&lt;br /&gt;&lt;br /&gt;image.data = matrix(rnorm(60), nrow=10, ncol=6)&lt;br /&gt;windows()&lt;br /&gt;par(mar=c(5, 5, 4, 1))&lt;br /&gt;image(x=1:nrow(image.data), y=1:ncol(image.data), image.data, axes = FALSE, xlab="X", ylab = "Y", main = "SAMPLE")&lt;br /&gt;par(las=2) # controls the direction of axis labels, e.g. las = 1 (horizontal) or 2 (perpendicular to the axis) &lt;br /&gt;namess1 = paste("row", 1:ncol(image.data), sep="-")&lt;br /&gt;axis(2, 1:ncol(image.data), namess1, cex.axis=0.9, tick = FALSE)&lt;br /&gt;par(las=2) &lt;br /&gt;namess2 = paste("col", 1:nrow(image.data), sep="-")&lt;br /&gt;axis(1, 1:nrow(image.data), namess2, cex.axis=0.9, tick = TRUE)&lt;br /&gt;box()&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_xLDEmoNB_RM/SH_XwzXaivI/AAAAAAAADXM/idNTv6QqBUI/s1600-h/2.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_xLDEmoNB_RM/SH_XwzXaivI/AAAAAAAADXM/idNTv6QqBUI/s400/2.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5224131326195174130" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;windows()&lt;br /&gt;par(mar=c(5, 5, 4, 1))&lt;br /&gt;boxplot(data.frame(image.data), axes = FALSE, xlab="cols", ylab = "Distribution", main = "SAMPLE")&lt;br /&gt;axis(2, tick = TRUE)&lt;br /&gt;par(las=1)&lt;br /&gt;namess2 = paste("col", 1:ncol(image.data), sep="-")&lt;br /&gt;axis(1, 1:ncol(image.data), namess2, cex.axis=0.9, tick = TRUE)&lt;br /&gt;box();&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_xLDEmoNB_RM/SH_XtTlEOgI/AAAAAAAADXE/xY0EJNCr9XQ/s1600-h/1.png"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_xLDEmoNB_RM/SH_XtTlEOgI/AAAAAAAADXE/xY0EJNCr9XQ/s400/1.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5224131266122889730" /&gt;&lt;/a&gt;&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-147223228516546289?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/147223228516546289/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=147223228516546289&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/147223228516546289'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/147223228516546289'/><link rel='alternate' type='text/html' href='http://statisticsr.blogspot.com/2008/01/plot-in-details.html' title='Plot in details'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_xLDEmoNB_RM/SH_XwzXaivI/AAAAAAAADXM/idNTv6QqBUI/s72-c/2.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1440516860833895494.post-1423570697507966170</id><published>2007-12-29T10:20:00.000-08:00</published><updated>2008-08-06T17:05:26.047-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='general'/><title type='text'>First Post for R</title><content type='html'>I have been using R since 2005, and I love it a lot.&lt;br /&gt;There are several tricks for using R syntax to do plotting and calculation.&lt;br /&gt;This blog is my notebook. 
I want to share my experience with other people.&lt;br /&gt;&lt;br /&gt;Here are some useful links:&lt;br /&gt;&lt;br /&gt;Download R:&lt;br /&gt;&lt;a href="http://www.r-project.org/"&gt;http://www.r-project.org/&lt;/a&gt;&lt;br /&gt;Bioconductor (bioinformatics):&lt;br /&gt;&lt;a href="http://www.bioconductor.org/download/"&gt;http://www.bioconductor.org/download/&lt;/a&gt;&lt;br /&gt;R graphics:&lt;br /&gt;&lt;a href="http://addictedtor.free.fr/graphiques/thumbs.php"&gt;http://addictedtor.free.fr/graphiques/thumbs.php&lt;/a&gt;&lt;br /&gt;Eclipse (editor platform):&lt;br /&gt;&lt;a href="http://www.eclipse.org/downloads/"&gt;http://www.eclipse.org/downloads/&lt;/a&gt;&lt;br /&gt;StatET (Eclipse plug-in):&lt;br /&gt;&lt;a href="http://www.walware.de/goto/statet"&gt;http://www.walware.de/goto/statet&lt;/a&gt;&lt;br /&gt;RPy (use R from Python):&lt;br /&gt;&lt;a href="http://rpy.sourceforge.net/"&gt;http://rpy.sourceforge.net/&lt;/a&gt;&lt;br /&gt;R note:&lt;br /&gt;1: &lt;a href="http://www.math.ncu.edu.tw/~chenwc/R_note/index.php?item=about"&gt;http://www.math.ncu.edu.tw/~chenwc/R_note/index.php?item=about&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1440516860833895494-1423570697507966170?l=statisticsr.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://statisticsr.blogspot.com/feeds/1423570697507966170/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1440516860833895494&amp;postID=1423570697507966170&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/1423570697507966170'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1440516860833895494/posts/default/1423570697507966170'/><link rel='alternate' type='text/html' 
href='http://statisticsr.blogspot.com/2007/12/first-post-for-r.html' title='First Post for R'/><author><name>Gary H.Y. Ge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
