<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Little World</title>
    <link>/</link>
      <atom:link href="/index.xml" rel="self" type="application/rss+xml" />
    <description>Little World</description>
    <generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>©Yihong WANG 2020</copyright><lastBuildDate>Sat, 01 Jun 2030 13:00:00 +0000</lastBuildDate>
    <image>
      <url>/img/icon-192.png</url>
      <title>Little World</title>
      <link>/</link>
    </image>
    
    <item>
      <title>Example Page 1</title>
      <link>/courses/example/example1/</link>
      <pubDate>Sun, 05 May 2019 00:00:00 +0100</pubDate>
      <guid>/courses/example/example1/</guid>
      <description>&lt;p&gt;In this tutorial, I&#39;ll share my top 10 tips for getting started with Academic:&lt;/p&gt;
&lt;h2 id=&#34;tip-1&#34;&gt;Tip 1&lt;/h2&gt;
&lt;p&gt;Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis posuere tellus ac convallis placerat. Proin tincidunt magna sed ex sollicitudin condimentum. Sed ac faucibus dolor, scelerisque sollicitudin nisi. Cras purus urna, suscipit quis sapien eu, pulvinar tempor diam. Quisque risus orci, mollis id ante sit amet, gravida egestas nisl. Sed ac tempus magna. Proin in dui enim. Donec condimentum, sem id dapibus fringilla, tellus enim condimentum arcu, nec volutpat est felis vel metus. Vestibulum sit amet erat at nulla eleifend gravida.&lt;/p&gt;
&lt;p&gt;Nullam vel molestie justo. Curabitur vitae efficitur leo. In hac habitasse platea dictumst. Sed pulvinar mauris dui, eget varius purus congue ac. Nulla euismod, lorem vel elementum dapibus, nunc justo porta mi, sed tempus est est vel tellus. Nam et enim eleifend, laoreet sem sit amet, elementum sem. Morbi ut leo congue, maximus velit ut, finibus arcu. In et libero cursus, rutrum risus non, molestie leo. Nullam congue quam et volutpat malesuada. Sed risus tortor, pulvinar et dictum nec, sodales non mi. Phasellus lacinia commodo laoreet. Nam mollis, erat in feugiat consectetur, purus eros egestas tellus, in auctor urna odio at nibh. Mauris imperdiet nisi ac magna convallis, at rhoncus ligula cursus.&lt;/p&gt;
&lt;p&gt;Cras aliquam rhoncus ipsum, in hendrerit nunc mattis vitae. Duis vitae efficitur metus, ac tempus leo. Cras nec fringilla lacus. Quisque sit amet risus at ipsum pharetra commodo. Sed aliquam mauris at consequat eleifend. Praesent porta, augue sed viverra bibendum, neque ante euismod ante, in vehicula justo lorem ac eros. Suspendisse augue libero, venenatis eget tincidunt ut, malesuada at lorem. Donec vitae bibendum arcu. Aenean maximus nulla non pretium iaculis. Quisque imperdiet, nulla in pulvinar aliquet, velit quam ultrices quam, sit amet fringilla leo sem vel nunc. Mauris in lacinia lacus.&lt;/p&gt;
&lt;p&gt;Suspendisse a tincidunt lacus. Curabitur at urna sagittis, dictum ante sit amet, euismod magna. Sed rutrum massa id tortor commodo, vitae elementum turpis tempus. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean purus turpis, venenatis a ullamcorper nec, tincidunt et massa. Integer posuere quam rutrum arcu vehicula imperdiet. Mauris ullamcorper quam vitae purus congue, quis euismod magna eleifend. Vestibulum semper vel augue eget tincidunt. Fusce eget justo sodales, dapibus odio eu, ultrices lorem. Duis condimentum lorem id eros commodo, in facilisis mauris scelerisque. Morbi sed auctor leo. Nullam volutpat a lacus quis pharetra. Nulla congue rutrum magna a ornare.&lt;/p&gt;
&lt;p&gt;Aliquam in turpis accumsan, malesuada nibh ut, hendrerit justo. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Quisque sed erat nec justo posuere suscipit. Donec ut efficitur arcu, in malesuada neque. Nunc dignissim nisl massa, id vulputate nunc pretium nec. Quisque eget urna in risus suscipit ultricies. Pellentesque odio odio, tincidunt in eleifend sed, posuere a diam. Nam gravida nisl convallis semper elementum. Morbi vitae felis faucibus, vulputate orci placerat, aliquet nisi. Aliquam erat volutpat. Maecenas sagittis pulvinar purus, sed porta quam laoreet at.&lt;/p&gt;
&lt;h2 id=&#34;tip-2&#34;&gt;Tip 2&lt;/h2&gt;
&lt;p&gt;Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis posuere tellus ac convallis placerat. Proin tincidunt magna sed ex sollicitudin condimentum. Sed ac faucibus dolor, scelerisque sollicitudin nisi. Cras purus urna, suscipit quis sapien eu, pulvinar tempor diam. Quisque risus orci, mollis id ante sit amet, gravida egestas nisl. Sed ac tempus magna. Proin in dui enim. Donec condimentum, sem id dapibus fringilla, tellus enim condimentum arcu, nec volutpat est felis vel metus. Vestibulum sit amet erat at nulla eleifend gravida.&lt;/p&gt;
&lt;p&gt;Nullam vel molestie justo. Curabitur vitae efficitur leo. In hac habitasse platea dictumst. Sed pulvinar mauris dui, eget varius purus congue ac. Nulla euismod, lorem vel elementum dapibus, nunc justo porta mi, sed tempus est est vel tellus. Nam et enim eleifend, laoreet sem sit amet, elementum sem. Morbi ut leo congue, maximus velit ut, finibus arcu. In et libero cursus, rutrum risus non, molestie leo. Nullam congue quam et volutpat malesuada. Sed risus tortor, pulvinar et dictum nec, sodales non mi. Phasellus lacinia commodo laoreet. Nam mollis, erat in feugiat consectetur, purus eros egestas tellus, in auctor urna odio at nibh. Mauris imperdiet nisi ac magna convallis, at rhoncus ligula cursus.&lt;/p&gt;
&lt;p&gt;Cras aliquam rhoncus ipsum, in hendrerit nunc mattis vitae. Duis vitae efficitur metus, ac tempus leo. Cras nec fringilla lacus. Quisque sit amet risus at ipsum pharetra commodo. Sed aliquam mauris at consequat eleifend. Praesent porta, augue sed viverra bibendum, neque ante euismod ante, in vehicula justo lorem ac eros. Suspendisse augue libero, venenatis eget tincidunt ut, malesuada at lorem. Donec vitae bibendum arcu. Aenean maximus nulla non pretium iaculis. Quisque imperdiet, nulla in pulvinar aliquet, velit quam ultrices quam, sit amet fringilla leo sem vel nunc. Mauris in lacinia lacus.&lt;/p&gt;
&lt;p&gt;Suspendisse a tincidunt lacus. Curabitur at urna sagittis, dictum ante sit amet, euismod magna. Sed rutrum massa id tortor commodo, vitae elementum turpis tempus. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean purus turpis, venenatis a ullamcorper nec, tincidunt et massa. Integer posuere quam rutrum arcu vehicula imperdiet. Mauris ullamcorper quam vitae purus congue, quis euismod magna eleifend. Vestibulum semper vel augue eget tincidunt. Fusce eget justo sodales, dapibus odio eu, ultrices lorem. Duis condimentum lorem id eros commodo, in facilisis mauris scelerisque. Morbi sed auctor leo. Nullam volutpat a lacus quis pharetra. Nulla congue rutrum magna a ornare.&lt;/p&gt;
&lt;p&gt;Aliquam in turpis accumsan, malesuada nibh ut, hendrerit justo. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Quisque sed erat nec justo posuere suscipit. Donec ut efficitur arcu, in malesuada neque. Nunc dignissim nisl massa, id vulputate nunc pretium nec. Quisque eget urna in risus suscipit ultricies. Pellentesque odio odio, tincidunt in eleifend sed, posuere a diam. Nam gravida nisl convallis semper elementum. Morbi vitae felis faucibus, vulputate orci placerat, aliquet nisi. Aliquam erat volutpat. Maecenas sagittis pulvinar purus, sed porta quam laoreet at.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Example Page 2</title>
      <link>/courses/example/example2/</link>
      <pubDate>Sun, 05 May 2019 00:00:00 +0100</pubDate>
      <guid>/courses/example/example2/</guid>
      <description>&lt;p&gt;Here are some more tips for getting started with Academic:&lt;/p&gt;
&lt;h2 id=&#34;tip-3&#34;&gt;Tip 3&lt;/h2&gt;
&lt;p&gt;Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis posuere tellus ac convallis placerat. Proin tincidunt magna sed ex sollicitudin condimentum. Sed ac faucibus dolor, scelerisque sollicitudin nisi. Cras purus urna, suscipit quis sapien eu, pulvinar tempor diam. Quisque risus orci, mollis id ante sit amet, gravida egestas nisl. Sed ac tempus magna. Proin in dui enim. Donec condimentum, sem id dapibus fringilla, tellus enim condimentum arcu, nec volutpat est felis vel metus. Vestibulum sit amet erat at nulla eleifend gravida.&lt;/p&gt;
&lt;p&gt;Nullam vel molestie justo. Curabitur vitae efficitur leo. In hac habitasse platea dictumst. Sed pulvinar mauris dui, eget varius purus congue ac. Nulla euismod, lorem vel elementum dapibus, nunc justo porta mi, sed tempus est est vel tellus. Nam et enim eleifend, laoreet sem sit amet, elementum sem. Morbi ut leo congue, maximus velit ut, finibus arcu. In et libero cursus, rutrum risus non, molestie leo. Nullam congue quam et volutpat malesuada. Sed risus tortor, pulvinar et dictum nec, sodales non mi. Phasellus lacinia commodo laoreet. Nam mollis, erat in feugiat consectetur, purus eros egestas tellus, in auctor urna odio at nibh. Mauris imperdiet nisi ac magna convallis, at rhoncus ligula cursus.&lt;/p&gt;
&lt;p&gt;Cras aliquam rhoncus ipsum, in hendrerit nunc mattis vitae. Duis vitae efficitur metus, ac tempus leo. Cras nec fringilla lacus. Quisque sit amet risus at ipsum pharetra commodo. Sed aliquam mauris at consequat eleifend. Praesent porta, augue sed viverra bibendum, neque ante euismod ante, in vehicula justo lorem ac eros. Suspendisse augue libero, venenatis eget tincidunt ut, malesuada at lorem. Donec vitae bibendum arcu. Aenean maximus nulla non pretium iaculis. Quisque imperdiet, nulla in pulvinar aliquet, velit quam ultrices quam, sit amet fringilla leo sem vel nunc. Mauris in lacinia lacus.&lt;/p&gt;
&lt;p&gt;Suspendisse a tincidunt lacus. Curabitur at urna sagittis, dictum ante sit amet, euismod magna. Sed rutrum massa id tortor commodo, vitae elementum turpis tempus. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean purus turpis, venenatis a ullamcorper nec, tincidunt et massa. Integer posuere quam rutrum arcu vehicula imperdiet. Mauris ullamcorper quam vitae purus congue, quis euismod magna eleifend. Vestibulum semper vel augue eget tincidunt. Fusce eget justo sodales, dapibus odio eu, ultrices lorem. Duis condimentum lorem id eros commodo, in facilisis mauris scelerisque. Morbi sed auctor leo. Nullam volutpat a lacus quis pharetra. Nulla congue rutrum magna a ornare.&lt;/p&gt;
&lt;p&gt;Aliquam in turpis accumsan, malesuada nibh ut, hendrerit justo. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Quisque sed erat nec justo posuere suscipit. Donec ut efficitur arcu, in malesuada neque. Nunc dignissim nisl massa, id vulputate nunc pretium nec. Quisque eget urna in risus suscipit ultricies. Pellentesque odio odio, tincidunt in eleifend sed, posuere a diam. Nam gravida nisl convallis semper elementum. Morbi vitae felis faucibus, vulputate orci placerat, aliquet nisi. Aliquam erat volutpat. Maecenas sagittis pulvinar purus, sed porta quam laoreet at.&lt;/p&gt;
&lt;h2 id=&#34;tip-4&#34;&gt;Tip 4&lt;/h2&gt;
&lt;p&gt;Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis posuere tellus ac convallis placerat. Proin tincidunt magna sed ex sollicitudin condimentum. Sed ac faucibus dolor, scelerisque sollicitudin nisi. Cras purus urna, suscipit quis sapien eu, pulvinar tempor diam. Quisque risus orci, mollis id ante sit amet, gravida egestas nisl. Sed ac tempus magna. Proin in dui enim. Donec condimentum, sem id dapibus fringilla, tellus enim condimentum arcu, nec volutpat est felis vel metus. Vestibulum sit amet erat at nulla eleifend gravida.&lt;/p&gt;
&lt;p&gt;Nullam vel molestie justo. Curabitur vitae efficitur leo. In hac habitasse platea dictumst. Sed pulvinar mauris dui, eget varius purus congue ac. Nulla euismod, lorem vel elementum dapibus, nunc justo porta mi, sed tempus est est vel tellus. Nam et enim eleifend, laoreet sem sit amet, elementum sem. Morbi ut leo congue, maximus velit ut, finibus arcu. In et libero cursus, rutrum risus non, molestie leo. Nullam congue quam et volutpat malesuada. Sed risus tortor, pulvinar et dictum nec, sodales non mi. Phasellus lacinia commodo laoreet. Nam mollis, erat in feugiat consectetur, purus eros egestas tellus, in auctor urna odio at nibh. Mauris imperdiet nisi ac magna convallis, at rhoncus ligula cursus.&lt;/p&gt;
&lt;p&gt;Cras aliquam rhoncus ipsum, in hendrerit nunc mattis vitae. Duis vitae efficitur metus, ac tempus leo. Cras nec fringilla lacus. Quisque sit amet risus at ipsum pharetra commodo. Sed aliquam mauris at consequat eleifend. Praesent porta, augue sed viverra bibendum, neque ante euismod ante, in vehicula justo lorem ac eros. Suspendisse augue libero, venenatis eget tincidunt ut, malesuada at lorem. Donec vitae bibendum arcu. Aenean maximus nulla non pretium iaculis. Quisque imperdiet, nulla in pulvinar aliquet, velit quam ultrices quam, sit amet fringilla leo sem vel nunc. Mauris in lacinia lacus.&lt;/p&gt;
&lt;p&gt;Suspendisse a tincidunt lacus. Curabitur at urna sagittis, dictum ante sit amet, euismod magna. Sed rutrum massa id tortor commodo, vitae elementum turpis tempus. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean purus turpis, venenatis a ullamcorper nec, tincidunt et massa. Integer posuere quam rutrum arcu vehicula imperdiet. Mauris ullamcorper quam vitae purus congue, quis euismod magna eleifend. Vestibulum semper vel augue eget tincidunt. Fusce eget justo sodales, dapibus odio eu, ultrices lorem. Duis condimentum lorem id eros commodo, in facilisis mauris scelerisque. Morbi sed auctor leo. Nullam volutpat a lacus quis pharetra. Nulla congue rutrum magna a ornare.&lt;/p&gt;
&lt;p&gt;Aliquam in turpis accumsan, malesuada nibh ut, hendrerit justo. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Quisque sed erat nec justo posuere suscipit. Donec ut efficitur arcu, in malesuada neque. Nunc dignissim nisl massa, id vulputate nunc pretium nec. Quisque eget urna in risus suscipit ultricies. Pellentesque odio odio, tincidunt in eleifend sed, posuere a diam. Nam gravida nisl convallis semper elementum. Morbi vitae felis faucibus, vulputate orci placerat, aliquet nisi. Aliquam erat volutpat. Maecenas sagittis pulvinar purus, sed porta quam laoreet at.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Example Talk</title>
      <link>/talk/example/</link>
      <pubDate>Sat, 01 Jun 2030 13:00:00 +0000</pubDate>
      <guid>/talk/example/</guid>
      <description>
&lt;p&gt;Slides can be added in a few ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Create&lt;/strong&gt; slides using Academic&#39;s &lt;a href=&#34;https://sourcethemes.com/academic/docs/managing-content/#create-slides&#34;&gt;&lt;em&gt;Slides&lt;/em&gt;&lt;/a&gt; feature and link them using the &lt;code&gt;slides&lt;/code&gt; parameter in the front matter of the talk file.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Upload&lt;/strong&gt; an existing slide deck to &lt;code&gt;static/&lt;/code&gt; and link it using the &lt;code&gt;url_slides&lt;/code&gt; parameter in the front matter of the talk file.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Embed&lt;/strong&gt; your slides (e.g. Google Slides) or presentation video on this page using &lt;a href=&#34;https://sourcethemes.com/academic/docs/writing-markdown-latex/&#34;&gt;shortcodes&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Further talk details can easily be added to this page using &lt;em&gt;Markdown&lt;/em&gt; and $\rm \LaTeX$ math code.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Replacing Stata and SAS with R</title>
      <link>/post/2020-01-20-r-stata-workflow/</link>
      <pubDate>Mon, 20 Jan 2020 00:00:00 +0000</pubDate>
      <guid>/post/2020-01-20-r-stata-workflow/</guid>
      <description>
&lt;script src=&#34;./rmarkdown-libs/jquery/jquery.min.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;./rmarkdown-libs/elevate-section-attrs/elevate-section-attrs.js&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#安装stata&#34;&gt;Installing Stata&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#在r中调用stata&#34;&gt;Calling Stata from R&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#三种环境下数据互通&#34;&gt;Exchanging Data Among the Three Environments&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;安装stata&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Installing Stata&lt;/h2&gt;
&lt;p&gt;First install the two packages &lt;code&gt;ncurses5-compat-libs&lt;/code&gt; and &lt;code&gt;libpng12&lt;/code&gt;, then run the following in a root shell:&lt;/p&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;sudo -s

cd /tmp/

mkdir statafiles

cd statafiles

tar -zxf /home/you/Downloads/Stata14Linux64.tar.gz

cd /usr/local

mkdir stata14

cd stata14

/tmp/statafiles/install&lt;/code&gt;&lt;/pre&gt;
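&lt;p&gt;After installation, add the install directory to the &lt;code&gt;PATH&lt;/code&gt; environment variable. I chose to edit &lt;code&gt;/etc/profile&lt;/code&gt; and add:&lt;/p&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;export PATH=&amp;quot;$PATH:/usr/local/stata14&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To apply the change without rebooting, run &lt;code&gt;source /etc/profile&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The license file can be copied directly into the install directory, or you can place &lt;code&gt;stata.lic.tar.gz&lt;/code&gt; in that directory.&lt;/p&gt;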
&lt;/div&gt;
&lt;div id=&#34;在r中调用stata&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Calling Stata from R&lt;/h2&gt;
&lt;p&gt;This is done with the &lt;a href=&#34;https://github.com/lbraglia/RStata&#34;&gt;&lt;code&gt;RStata&lt;/code&gt;&lt;/a&gt; package:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Run Stata in R ----
library(&amp;quot;RStata&amp;quot;)
# Set the path for the current machine only; otherwise the later option overrides the earlier one
if (.Platform$OS.type == &amp;quot;windows&amp;quot;) {
  options(&amp;quot;RStata.StataPath&amp;quot; = &amp;quot;D:\\Stata15\\StataSE-64&amp;quot;) # office PC
} else {
  options(&amp;quot;RStata.StataPath&amp;quot; = &amp;quot;/usr/local/stata14/stata&amp;quot;) # Linux; unclear whether stata-se can be used
}
options(&amp;quot;RStata.StataVersion&amp;quot; = 14)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;三种环境下数据互通&#34; class=&#34;section level2&#34;&gt;
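&lt;h2&gt;Exchanging Data Among the Three Environments&lt;/h2&gt;
&lt;p&gt;In R, this relies on two packages, &lt;code&gt;haven&lt;/code&gt; and &lt;code&gt;rio&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(haven)   # read_dta()/write_dta() for Stata files; haven::read_sas() can also import sas7bdat
library(rio)     # rio::import() reads SAS data
library(dplyr)   # provides %&amp;gt;% and mutate_if()
library(stringr) # provides str_c()
# data_loc is the directory that holds the data files (defined elsewhere)
f1 &amp;lt;- str_c(data_loc, &amp;quot;after2007.sas7bdat&amp;quot;, sep = &amp;quot;/&amp;quot;)
o1 &amp;lt;- str_c(data_loc, &amp;quot;after2007.dta&amp;quot;, sep = &amp;quot;/&amp;quot;)
after2007_raw &amp;lt;- import(f1)
after2007_raw %&amp;gt;%
  mutate_if(is.numeric, as.integer) %&amp;gt;%
  write_dta(o1, version = 12)
# Write version 12 because SAS only supports Stata 12 files (or earlier), while haven supports Stata versions 8-15.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If none of the methods above can read the sas7bdat file, go through SAS instead:&lt;/p&gt;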
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;/* Import a Stata data file; SAS only supports Stata 12 or earlier */
PROC IMPORT OUT= WORK.S1 
            DATAFILE= &amp;quot;E:\after2007.dta&amp;quot; 
            DBMS=STATA REPLACE;
RUN;

/* Export a SAS data set to a Stata file */
proc export data=raw1 outfile= &amp;quot;D:\sample.dta&amp;quot; dbms=stata replace;
run;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>The Catcher in the Rye</title>
      <link>/post/2019-11-17-catcher-in-rye/</link>
      <pubDate>Sun, 17 Nov 2019 00:00:00 +0000</pubDate>
      <guid>/post/2019-11-17-catcher-in-rye/</guid>
      <description>


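&lt;p&gt;It turns out that The Catcher in the Rye is not a story about scarecrows and crows; it is about an angsty teenager&#39;s failed attempt to run away from home. If I had read it during my own angsty teenage years, I probably would have loved it, though I like it quite a bit now too. Even more remarkable, after all these years nothing about it had been spoiled for me. The same goes for To Kill a Mockingbird, which is not a story about a hunter. Just how illiterate am I!&lt;/p&gt;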
</description>
    </item>
    
    <item>
      <title>Tinkering with Manjaro</title>
      <link>/post/manjaro/</link>
      <pubDate>Wed, 06 Nov 2019 00:00:00 +0000</pubDate>
      <guid>/post/manjaro/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#缘起&#34;&gt;How It Started&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#折腾备忘录&#34;&gt;Setup Notes&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#安好之后换中国源&#34;&gt;Switching to China Mirrors After Installation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#n卡驱动&#34;&gt;NVIDIA Driver&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#中文输入法&#34;&gt;Chinese Input Method&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#deepin桌面&#34;&gt;Deepin Desktop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#安装miniconda&#34;&gt;Installing Miniconda&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#搭建服务并打开端口&#34;&gt;Setting Up Services and Opening Ports&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#chrome-remote-desktop&#34;&gt;Chrome Remote Desktop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#系统备份和恢复&#34;&gt;System Backup and Restore&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;缘起&#34; class=&#34;section level2&#34;&gt;
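&lt;h2&gt;How It Started&lt;/h2&gt;
&lt;p&gt;It all started with tinkering on a home theater setup. My first solution used Windows as the server, which was not ideal, so I got a DIY Synology NAS (Synology&#39;s system running on non-Synology hardware). Along the way it unlocked countless new skills. I was also lucky enough to have a public IP address, which left far more room to tinker. After playing with Docker on the Synology system for a while, I wanted a Linux machine to play with but did not want to experiment on my own desktop. I read that the best Linux laptop is a Chromebook: a silky-smooth Chrome experience, the longest battery life of any Linux laptop, and cheap as well, so I bought one without hesitation. When it arrived it turned out to be really good. Chrome OS plus Android plus Linux is fantastic and covers basically everything I need when I go out, not that I like going out. Then I saw people saying that the best Linux distribution is WSL. WSL requires an up-to-date Windows 10, but my home desktop had been stuck on a 2015 build of Windows because every upgrade ended in a blue-screen loop. This time I gritted my teeth, upgraded the system, and got WSL running. I thought about using it to run Docker and deploy services, but just looking at the Windows firewall gave me a headache, so I dropped the idea. While using the Chromebook, though, I discovered how great its touchpad gestures are! Wanting the same on Windows, I bought a Lenovo touchpad, an old model that only supports Windows 8 gestures. Then I began to fantasize that Linux might have better touchpad drivers (dream on: in practice Manjaro recognized it only as a mouse, not a touchpad), so I set up dual boot…&lt;/p&gt;
&lt;p&gt;The result is fantastic. I feel like I have saved a lot of money on servers. I am so clever!&lt;/p&gt;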
&lt;/div&gt;
&lt;div id=&#34;折腾备忘录&#34; class=&#34;section level2&#34;&gt;
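&lt;h2&gt;Setup Notes&lt;/h2&gt;
&lt;p&gt;In case I ever need to reinstall, I am writing down the installation notes for my future self. The installation ISO is Manjaro KDE; do not use the Xfce edition.&lt;/p&gt;
&lt;div id=&#34;安好之后换中国源&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Switching to China Mirrors After Installation&lt;/h3&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;# Rank the China mirrors; usually pick the top two
sudo pacman-mirrors -i -c China -m rank
# Refresh the package databases
sudo pacman -Syy
# Add the archlinuxcn repository
sudo nano /etc/pacman.conf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Append the following at the end of the file:&lt;/p&gt;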
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;[archlinuxcn]
SigLevel = Optional TrustedOnly
Server = https://mirrors.tuna.tsinghua.edu.cn/archlinuxcn/$arch&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;sudo pacman -Syyu # refresh the package databases and upgrade
sudo pacman -S archlinuxcn-keyring # install the keyring that imports the GPG keys&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;n卡驱动&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;NVIDIA Driver&lt;/h3&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;sudo mhwd -a pci nonfree 0300
sudo reboot
nvidia-settings&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;中文输入法&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Chinese Input Method&lt;/h3&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;# Chinese fonts
sudo pacman -S adobe-source-han-sans-cn-fonts adobe-source-han-serif-cn-fonts
# fcitx input method
sudo pacman -S fcitx fcitx-googlepinyin fcitx-im fcitx-configtool

# Add the following lines to ~/.xprofile (or ~/.xinitrc), e.g. with: nano ~/.xprofile

export GTK_IM_MODULE=fcitx
export QT_IM_MODULE=fcitx
export XMODIFIERS=&amp;quot;@im=fcitx&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;deepin桌面&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Deepin Desktop&lt;/h3&gt;
&lt;p&gt;Install DDE:&lt;/p&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;sudo pacman -S deepin deepin-extra&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Edit /etc/lightdm/lightdm.conf:&lt;/p&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;sudo cp /etc/lightdm/lightdm.conf /etc/lightdm/lightdm.conf.bak

sudo sed -i &amp;#39;s/greeter-session=lightdm-.*/greeter-session=lightdm-deepin-greeter/g&amp;#39; /etc/lightdm/lightdm.conf

sudo sed -i &amp;#39;s/user-session=xfce/user-session=deepin/g&amp;#39; /etc/lightdm/lightdm.conf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To choose the desktop, log out and select the Deepin desktop icon in the lower-right corner of the login screen.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;安装miniconda&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Installing Miniconda&lt;/h3&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh

# Edit ~/.bash_profile and append the following environment variable (note that $PATH must come first)
export PATH=&amp;quot;$PATH:$HOME/miniconda3/bin&amp;quot;

# After editing, reload the profile
source ~/.bash_profile

# Activate the base environment (or a newly created Python environment)
source activate

# Use the Tsinghua mirror as the pip index
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After that, packages can be installed with conda and pip.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;搭建服务并打开端口&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Setting Up Services and Opening Ports&lt;/h3&gt;
&lt;p&gt;The firewall used is &lt;a href=&#34;https://wiki.archlinux.org/index.php/Uncomplicated_Firewall&#34;&gt;ufw&lt;/a&gt;.&lt;/p&gt;
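&lt;p&gt;A minimal sketch of opening the ports with ufw, assuming RStudio Server listens on its default port 8787 and Jupyter on port 8889 as set in &lt;code&gt;jupyter_notebook_config.py&lt;/code&gt; in this post; adjust the port numbers if your setup differs:&lt;/p&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;# Assumed ports: 8787 (RStudio Server default) and 8889 (Jupyter, from the config in this post)
# If you administer the machine over SSH, allow it first so you are not locked out
sudo ufw allow ssh
sudo ufw allow 8787/tcp
sudo ufw allow 8889/tcp
# Turn the firewall on (if it is not already) and list the active rules
sudo ufw enable
sudo ufw status verbose&lt;/code&gt;&lt;/pre&gt;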
&lt;div id=&#34;rstudio-server开机自动运行&#34; class=&#34;section level4&#34;&gt;
&lt;h4&gt;Starting RStudio Server Automatically at Boot&lt;/h4&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;sudo rstudio-server verify-installation

# Check status
systemctl status rstudio-server
# Start
sudo systemctl start rstudio-server
# Stop
sudo systemctl stop rstudio-server

# Start automatically at boot
sudo systemctl enable rstudio-server&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So satisfying! This post was written in RStudio Server.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;jupyter-lab&#34; class=&#34;section level4&#34;&gt;
&lt;h4&gt;JupyterLab&lt;/h4&gt;
&lt;p&gt;Add a &lt;code&gt;jupyter.service&lt;/code&gt; file under &lt;code&gt;/etc/systemd/system&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;#sudo nano /etc/systemd/system/jupyter.service
[Unit]
Description=Jupyter Lab

[Service]
Type=simple
PIDFile=/run/jupyter.pid
ExecStart=/home/wyih/anaconda3/bin/jupyter lab --ip 192.168.6.100 --config=/home/wyih/.jupyter/jupyter_notebook_config.py
User=wyih
Group=wyih
WorkingDirectory=/home/wyih/Jupyter Notebook
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Enable the service:&lt;/p&gt;
&lt;pre class=&#34;bash&#34;&gt;&lt;code&gt;# Reload unit files first so systemd picks up the new service
sudo systemctl daemon-reload
sudo systemctl enable jupyter.service
sudo systemctl restart jupyter.service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;a href=&#34;https://jupyter-notebook.readthedocs.io/en/stable/public_server.html&#34;&gt;&lt;code&gt;jupyter_notebook_config.py&lt;/code&gt; configuration&lt;/a&gt;:&lt;/p&gt;
&lt;pre class=&#34;python&#34;&gt;&lt;code&gt;c.NotebookApp.ip = &amp;#39;*&amp;#39;  # IP address the server listens on; an asterisk means all interfaces
c.NotebookApp.password = u&amp;#39;sha1:xxx:xxx&amp;#39; # password hash string generated earlier
c.NotebookApp.open_browser = False # do not open a local browser on startup
c.NotebookApp.port = 8889 # port to use
c.NotebookApp.allow_remote_access = True
# Whether to allow the notebook to run as the root user
c.NotebookApp.allow_root = True&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;chrome-remote-desktop&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Chrome Remote Desktop&lt;/h3&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Installed “chrome-remote-desktop” from the AUR, along with the Chrome extension.&lt;/li&gt;
&lt;li&gt;Ran &lt;code&gt;crd --setup&lt;/code&gt; in the terminal as a normal user; it asked for the sudo password.&lt;/li&gt;
&lt;li&gt;Edited the “.chrome-remote-desktop-session” file, deleting the # in front of the “exec /usr/bin/startkde” line.&lt;/li&gt;
&lt;li&gt;Accepted the screen resolution.&lt;/li&gt;
&lt;li&gt;Ran &lt;code&gt;crd --restart&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;CRD still does not seem to start automatically at boot.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;系统备份和恢复&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;System Backup and Restore&lt;/h3&gt;
&lt;p&gt;I have not figured this out yet.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Data Vis Chapter 8</title>
      <link>/post/data-vis-chapter-8/</link>
      <pubDate>Wed, 09 Oct 2019 00:00:00 +0000</pubDate>
      <guid>/post/data-vis-chapter-8/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#use-color-palette&#34;&gt;Use Color Palette&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#layer-color-and-text-together&#34;&gt;Layer Color and Text Together&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#themes&#34;&gt;Themes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#use-theme-elements&#34;&gt;Use Theme Elements&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#two-y-axes&#34;&gt;Two y-axes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(asasec)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##                                Section         Sname Beginning Revenues
## 1      Aging and the Life Course (018)         Aging     12752    12104
## 2     Alcohol, Drugs and Tobacco (030) Alcohol/Drugs     11933     1144
## 3 Altruism and Social Solidarity (047)      Altruism      1139     1862
## 4            Animals and Society (042)       Animals       473      820
## 5             Asia/Asian America (024)          Asia      9056     2116
## 6            Body and Embodiment (048)          Body      3408     1618
##   Expenses Ending Journal Year Members
## 1    12007  12849      No 2005     598
## 2      400  12677      No 2005     301
## 3     1875   1126      No 2005      NA
## 4     1116    177      No 2005     209
## 5     1710   9462      No 2005     365
## 6     1920   3106      No 2005      NA&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-
  ggplot(
    data = subset(asasec, Year == 2014),
    mapping = aes(x = Members,
                  y = Revenues, label = Sname)
  )

p + geom_point() + geom_smooth()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-
  ggplot(
    data = subset(asasec, Year == 2014),
    mapping = aes(x = Members,
                  y = Revenues, label = Sname)
  )

p + geom_point(mapping = aes(color = Journal)) + geom_smooth(method = &amp;quot;lm&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p0 &amp;lt;-
  ggplot(
    data = subset(asasec, Year == 2014),
    mapping = aes(x = Members,
                  y = Revenues, label = Sname)
  )

p1 &amp;lt;-
  p0 + geom_smooth(method = &amp;quot;lm&amp;quot;, se = FALSE, color = &amp;quot;gray80&amp;quot;) +
  geom_point(mapping = aes(color = Journal))
library(ggrepel)
p2 &amp;lt;- p1 + geom_text_repel(data = subset(asasec, Year == 2014 &amp;amp;
                                           Revenues &amp;gt; 7000),
                           size = 2)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p3 &amp;lt;- p2 + labs(
  x = &amp;quot;Membership&amp;quot;,
  y = &amp;quot;Revenues&amp;quot;,
  color = &amp;quot;Section has own Journal&amp;quot;,
  title = &amp;quot;ASA Sections&amp;quot;,
  subtitle = &amp;quot;2014 Calendar year.&amp;quot;,
  caption = &amp;quot;Source: ASA annual report.&amp;quot;
)
p4 &amp;lt;- p3 + scale_y_continuous(labels = scales::dollar) +
  theme(legend.position = &amp;quot;bottom&amp;quot;)
p4&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-5-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;div id=&#34;use-color-palette&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Use Color Palette&lt;/h2&gt;
&lt;p&gt;Use the &lt;code&gt;RColorBrewer&lt;/code&gt; package. Access the colors by specifying the &lt;code&gt;scale_color_brewer()&lt;/code&gt; or &lt;code&gt;scale_fill_brewer()&lt;/code&gt; functions, depending on the aesthetic you are mapping.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata,
            mapping = aes(x = roads, y = donors,
                          color = world))
p + geom_point(size = 2) + scale_color_brewer(palette = &amp;quot;Set2&amp;quot;) +
  theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p + geom_point(size = 2) + scale_color_brewer(palette = &amp;quot;Pastel2&amp;quot;) +
  theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-6-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p + geom_point(size = 2) + scale_color_brewer(palette = &amp;quot;Dark2&amp;quot;) +
  theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-6-3.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
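&lt;p&gt;To see which palette names &lt;code&gt;scale_color_brewer()&lt;/code&gt; accepts, the &lt;code&gt;RColorBrewer&lt;/code&gt; package can be queried directly. A minimal sketch, assuming only that the package is installed; the object names are illustrative:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(RColorBrewer)

# brewer.pal.info is a data frame with one row per palette
qual_palettes &amp;lt;- rownames(subset(brewer.pal.info, category == &amp;quot;qual&amp;quot;))
qual_palettes

# Hex codes for the five-color version of Set2
set2 &amp;lt;- brewer.pal(5, &amp;quot;Set2&amp;quot;)
set2

# display.brewer.all() draws every palette in the plotting window&lt;/code&gt;&lt;/pre&gt;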
&lt;p&gt;To specify colors manually, use &lt;code&gt;scale_color_manual()&lt;/code&gt; or &lt;code&gt;scale_fill_manual()&lt;/code&gt;. Try &lt;code&gt;demo(&#39;colors&#39;)&lt;/code&gt; to see the color names in R.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;cb_palette &amp;lt;-
  c(
    &amp;quot;#999999&amp;quot;,
    &amp;quot;#E69F00&amp;quot;,
    &amp;quot;#56B4E9&amp;quot;,
    &amp;quot;#009E73&amp;quot;,
    &amp;quot;#F0E442&amp;quot;,
    &amp;quot;#0072B2&amp;quot;,
    &amp;quot;#D55E00&amp;quot;,
    &amp;quot;#CC79A7&amp;quot;
  )

p4 + scale_color_manual(values = cb_palette)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
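&lt;p&gt;When the order of the levels might change, a named vector can tie each color to a specific level. A minimal sketch on the built-in &lt;code&gt;iris&lt;/code&gt; data; the palette name and the color assignments are illustrative choices, not part of the original example:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)

# Hypothetical named palette: names must match the levels of the mapped variable
species_colors &amp;lt;- c(
  setosa = &amp;quot;#E69F00&amp;quot;,
  versicolor = &amp;quot;#56B4E9&amp;quot;,
  virginica = &amp;quot;#009E73&amp;quot;
)

p_iris &amp;lt;- ggplot(data = iris,
                 mapping = aes(x = Sepal.Length, y = Sepal.Width,
                               color = Species)) +
  geom_point(size = 2) +
  scale_color_manual(values = species_colors)
p_iris&lt;/code&gt;&lt;/pre&gt;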
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dichromat)
library(RColorBrewer)

Default &amp;lt;- brewer.pal(5, &amp;quot;Set2&amp;quot;)

types &amp;lt;- c(&amp;quot;deutan&amp;quot;, &amp;quot;protan&amp;quot;, &amp;quot;tritan&amp;quot;)
names(types) &amp;lt;- c(&amp;quot;Deuteranopia&amp;quot;, &amp;quot;Protanopia&amp;quot;, &amp;quot;Tritanopia&amp;quot;)

color_table &amp;lt;- types %&amp;gt;% purrr::map(~ dichromat(Default, .x)) %&amp;gt;%
  as_tibble() %&amp;gt;% add_column(Default, .before = TRUE)

color_table&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 5 x 4
##   Default Deuteranopia Protanopia Tritanopia
##   &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt;        &amp;lt;chr&amp;gt;      &amp;lt;chr&amp;gt;     
## 1 #66C2A5 #AEAEA7      #BABAA5    #82BDBD   
## 2 #FC8D62 #B6B661      #9E9E63    #F29494   
## 3 #8DA0CB #9C9CCB      #9E9ECB    #92ABAB   
## 4 #E78AC3 #ACACC1      #9898C3    #DA9C9C   
## 5 #A6D854 #CACA5E      #D3D355    #B6C8C8&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;layer-color-and-text-together&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Layer Color and Text Together&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Democrat Blue and Republican Red
party_colors &amp;lt;- c(&amp;quot;#2E74C0&amp;quot;, &amp;quot;#CB454A&amp;quot;)
p0 &amp;lt;- ggplot(
  data = subset(county_data, flipped == &amp;quot;No&amp;quot;),
  mapping = aes(x = pop, y = black / 100)
)
p1 &amp;lt;-
  p0 + geom_point(alpha = 0.15, color = &amp;quot;gray50&amp;quot;) + scale_x_log10(labels =
                                                                    scales::comma)
p1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-9-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;party_colors &amp;lt;- c(&amp;quot;#2E74C0&amp;quot;, &amp;quot;#CB454A&amp;quot;)
p2 &amp;lt;- p1 + geom_point(
  data = subset(county_data, flipped == &amp;quot;Yes&amp;quot;),
  mapping = aes(x = pop, y = black / 100, color = partywinner16)
) +
  scale_color_manual(values = party_colors) 
p2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-10-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p3 &amp;lt;-
  p2 + scale_y_continuous(labels = scales::percent) + labs(
    color = &amp;quot;County flipped to ... &amp;quot;,
    x = &amp;quot;County Population (log scale)&amp;quot;,
    y = &amp;quot;Percent Black Population&amp;quot;,
    title = &amp;quot;Flipped counties, 2016&amp;quot;,
    caption = &amp;quot;Counties in gray did not flip.&amp;quot;
  )
p3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-11-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p4 &amp;lt;-
  p3 + geom_text_repel(
    data = subset(county_data, flipped == &amp;quot;Yes&amp;quot; &amp;amp; black &amp;gt; 25),
    mapping = aes(x = pop, y = black / 100, label = state),
    size = 2
  )
p4 + theme_minimal() + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-12-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;themes&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Themes&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;theme_set(theme_bw()) 
p4 + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-13-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;theme_set(theme_dark()) 
p4 + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-13-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p4 + theme_gray()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-14-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggthemes)
theme_set(theme_economist())
p4 + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-15-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;theme_set(theme_wsj())
p4 + theme(
  plot.title = element_text(size = rel(0.6)),
  legend.title = element_text(size = rel(0.35)),
  plot.caption = element_text(size = rel(0.35)),
  legend.position = &amp;quot;top&amp;quot;
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-15-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Claus O. Wilke’s &lt;a href=&#34;https://wilkelab.org/cowplot/articles/introduction.html&#34;&gt;&lt;code&gt;cowplot&lt;/code&gt; package&lt;/a&gt; contains a well-developed theme suitable for figures whose final destination is a journal article. Bob Rudis’s &lt;a href=&#34;https://github.com/hrbrmstr/hrbrthemes&#34;&gt;&lt;code&gt;hrbrthemes&lt;/code&gt; package&lt;/a&gt; has a distinctive and compact look and feel that takes advantage of some freely available typefaces.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(hrbrthemes)
theme_set(theme_ipsum())
p4 + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-16-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p4 + theme(
  legend.position = &amp;quot;top&amp;quot;,
  plot.title = element_text(
    size = rel(2),
    lineheight = .5,
    family = &amp;quot;Times&amp;quot;,
    face = &amp;quot;bold.italic&amp;quot;,
    colour = &amp;quot;orange&amp;quot;
  ),
  axis.text.x = element_text(
    size = rel(1.1),
    family = &amp;quot;Courier&amp;quot;,
    face = &amp;quot;bold&amp;quot;,
    color = &amp;quot;purple&amp;quot;
  )
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-16-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
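&lt;p&gt;Because &lt;code&gt;theme_set()&lt;/code&gt; changes the default for every later plot in the session, it can help to keep the previous theme and restore it afterwards. A minimal sketch; the name &lt;code&gt;old_theme&lt;/code&gt; is illustrative:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)

# theme_set() returns the previously active theme invisibly
old_theme &amp;lt;- theme_set(theme_bw())

# ... plots drawn here use theme_bw() ...

# Put the earlier default back
theme_set(old_theme)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Adding a theme to a single plot, as in &lt;code&gt;p4 + theme_gray()&lt;/code&gt;, affects only that plot and leaves the session default alone.&lt;/p&gt;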
&lt;/div&gt;
&lt;div id=&#34;use-theme-elements&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Use Theme Elements&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;yrs &amp;lt;- c(seq(1972, 1988, 4), 1993, seq(1996, 2016, 4))
mean_age &amp;lt;-
  gss_lon %&amp;gt;% filter(age %nin% NA &amp;amp;
                       year %in% yrs) %&amp;gt;% group_by(year) %&amp;gt;% summarize(xbar = round(mean(age, na.rm = TRUE), 0))
mean_age$y &amp;lt;- 0.3
yr_labs &amp;lt;- data.frame(x = 85, y = 0.8, year = yrs)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-
  ggplot(data = subset(gss_lon, year %in% yrs),
         mapping = aes(x = age))
p1 &amp;lt;-
  p + geom_density(
    fill = &amp;quot;gray20&amp;quot;,
    color = FALSE,
    alpha = 0.9,
    mapping = aes(y = ..scaled..)
  ) +
  geom_vline(
    data = subset(mean_age, year %in% yrs),
    aes(xintercept = xbar),
    color = &amp;quot;white&amp;quot;,
    size = 0.5
  ) +
  geom_text(
    data = subset(mean_age, year %in% yrs),
    aes(x = xbar, y = y, label = xbar),
    nudge_x = 7.5,
    color = &amp;quot;white&amp;quot;,
    size = 3.5,
    hjust = 1
  ) +
  geom_text(data = subset(yr_labs, year %in% yrs), aes(x = x, y = y, label = year)) +
  facet_grid(year ~ ., switch = &amp;quot;y&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p1 + 
  theme(
    plot.title = element_text(size = 16),
    axis.text.x = element_text(size = 12),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    strip.background = element_blank(),
    strip.text.y = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(x = &amp;quot;Age&amp;quot;, y = NULL, title = &amp;quot;Age Distribution of\nGSS Respondents&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-19-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggridges)
p &amp;lt;-
  ggplot(data = gss_lon, mapping = aes(x = age, y = factor(
    year, levels = rev(unique(year)), ordered = TRUE
  )))
p + geom_density_ridges(alpha = 0.6,
                        fill = &amp;quot;lightblue&amp;quot;,
                        scale = 1.5) + scale_x_continuous(breaks = c(25, 50, 75)) + scale_y_discrete(expand = c(0.01, 0)) + labs(x = &amp;quot;Age&amp;quot;, y = NULL, title = &amp;quot;Age Distribution of\nGSS Respondents&amp;quot;) +
  theme_ridges() + theme(title = element_text(size = 16, face = &amp;quot;bold&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-20-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;two-y-axes&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Two y-axes&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;head(fredts)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##         date  sp500 monbase  sp500_i monbase_i
## 1 2009-03-11 696.68 1542228 100.0000  100.0000
## 2 2009-03-18 766.73 1693133 110.0548  109.7849
## 3 2009-03-25 799.10 1693133 114.7012  109.7849
## 4 2009-04-01 809.06 1733017 116.1308  112.3710
## 5 2009-04-08 830.61 1733017 119.2240  112.3710
## 6 2009-04-15 852.21 1789878 122.3245  116.0579&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;fredts_m &amp;lt;-
  fredts %&amp;gt;% select(date, sp500_i, monbase_i) %&amp;gt;% gather(key = series, value = score, sp500_i:monbase_i)
head(fredts_m)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##         date  series    score
## 1 2009-03-11 sp500_i 100.0000
## 2 2009-03-18 sp500_i 110.0548
## 3 2009-03-25 sp500_i 114.7012
## 4 2009-04-01 sp500_i 116.1308
## 5 2009-04-08 sp500_i 119.2240
## 6 2009-04-15 sp500_i 122.3245&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-
  ggplot(data = fredts_m,
         mapping = aes(
           x = date,
           y = score,
           group = series,
           color = series
         ))
p1 &amp;lt;-
  p + geom_line() + theme(legend.position = &amp;quot;top&amp;quot;) + labs(x = &amp;quot;Date&amp;quot;, y = &amp;quot;Index&amp;quot;, color = &amp;quot;Series&amp;quot;)
p &amp;lt;-
  ggplot(data = fredts,
         mapping = aes(x = date, y = sp500_i - monbase_i))
p2 &amp;lt;- p + geom_line() + labs(x = &amp;quot;Date&amp;quot;, y = &amp;quot;Difference&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;cowplot::plot_grid(p1, p2, nrow = 2, rel_heights = c(0.75, 0.25), align = &amp;quot;v&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-24-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Using two y-axes gives you an extra degree of freedom to mess about with the data that, in most cases, you really should not take advantage of.&lt;/p&gt;
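&lt;p&gt;One alternative to a second y-axis, and the approach the &lt;code&gt;sp500_i&lt;/code&gt; and &lt;code&gt;monbase_i&lt;/code&gt; columns appear to take, is to rescale each series so that its first observation equals 100. A minimal sketch using the first three &lt;code&gt;sp500&lt;/code&gt; values from the &lt;code&gt;head(fredts)&lt;/code&gt; output; the formula is an assumption inferred from that output, not taken from the data documentation:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# First three sp500 values copied from head(fredts)
sp500 &amp;lt;- c(696.68, 766.73, 799.10)

# Index to base 100 at the first observation
sp500_index &amp;lt;- 100 * sp500 / sp500[1]
round(sp500_index, 4)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The rounded values agree with the first three entries of the &lt;code&gt;sp500_i&lt;/code&gt; column, which is why both series can share one axis in the plot.&lt;/p&gt;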
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = yahoo, mapping = aes(x = Employees, y = Revenue))
p + geom_path(color = &amp;quot;gray80&amp;quot;) + geom_text(aes(color = Mayer, label = Year),
                                            size = 3,
                                            fontface = &amp;quot;bold&amp;quot;) +
  theme(legend.position = &amp;quot;bottom&amp;quot;) + labs(
    color = &amp;quot;Mayer is CEO&amp;quot;,
    x = &amp;quot;Employees&amp;quot;,
    y = &amp;quot;Revenue (Millions)&amp;quot;,
    title = &amp;quot;Yahoo Employees vs Revenues, 2004-2014&amp;quot;
  ) + scale_y_continuous(labels = scales::dollar) + scale_x_continuous(labels = scales::comma)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-25-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-
  ggplot(data = yahoo,
         mapping = aes(x = Year, y = Revenue / Employees))
p + geom_vline(xintercept = 2012) + geom_line(color = &amp;quot;gray60&amp;quot;, size = 2) + annotate(
  &amp;quot;text&amp;quot;,
  x = 2013,
  y = 0.44,
  label = &amp;quot; Mayer becomes CEO&amp;quot;,
  size = 2.5
) +
  labs(x = &amp;quot;Year\n&amp;quot;, y = &amp;quot;Revenue/Employees&amp;quot;, title = &amp;quot;Yahoo Revenue to Employee Ratio, 2004-2014&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-26-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Saying no to pie&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p_xlab &amp;lt;-
  &amp;quot;Amount Owed, in thousands of Dollars&amp;quot; 
p_title &amp;lt;- &amp;quot;Outstanding Student Loans&amp;quot; 
p_subtitle &amp;lt;- &amp;quot;44 million borrowers owe a total of $1.3 trillion&amp;quot; 
p_caption &amp;lt;- &amp;quot;Source: FRB NY&amp;quot;
f_labs &amp;lt;-
  c(`Borrowers` = &amp;quot;Percent of\nall Borrowers&amp;quot;, `Balances` = &amp;quot;Percent of\nall Balances&amp;quot;)
p &amp;lt;-
  ggplot(data = studebt,
         mapping = aes(x = Debt, y = pct / 100, fill = type))
p + geom_bar(stat = &amp;quot;identity&amp;quot;) + scale_fill_brewer(type = &amp;quot;qual&amp;quot;, palette = &amp;quot;Dark2&amp;quot;) + scale_y_continuous(labels = scales::percent) + guides(fill = FALSE) + theme(strip.text.x = element_text(face = &amp;quot;bold&amp;quot;)) + labs(
  y = NULL,
  x = p_xlab,
  caption = p_caption,
  title = p_title,
  subtitle = p_subtitle
) + facet_grid( ~ type, labeller = as_labeller(f_labs)) + coord_flip()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-27-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(viridis)
p &amp;lt;-
  ggplot(studebt, aes(y = pct / 100, x = type, fill = Debtrc)) 
p + geom_bar(stat = &amp;quot;identity&amp;quot;, color = &amp;quot;gray80&amp;quot;) + scale_x_discrete(labels = as_labeller(f_labs)) + scale_y_continuous(labels = scales::percent) + scale_fill_viridis(discrete = TRUE) + guides(
    fill = guide_legend(
      reverse = TRUE,
      title.position = &amp;quot;top&amp;quot;,
      label.position = &amp;quot;bottom&amp;quot;,
      keywidth = 3,
      nrow = 1
    )
  ) +
  labs(
    x = NULL,
    y = NULL,
    fill = &amp;quot;Amount Owed, in thousands of dollars&amp;quot;,
    caption = p_caption,
    title = p_title,
    subtitle = p_subtitle
  ) +
  theme(
    legend.position = &amp;quot;top&amp;quot;,
    axis.text.y = element_text(face = &amp;quot;bold&amp;quot;, hjust = 1, size = 12),
    axis.ticks.length = unit(0, &amp;quot;cm&amp;quot;),
    panel.grid.major.y = element_blank()
  ) +
  coord_flip()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-10-09-data-vis-chapter-8_files/figure-html/unnamed-chunk-28-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;See &lt;a href=&#34;http://r-graph-gallery.com/&#34; class=&#34;uri&#34;&gt;http://r-graph-gallery.com/&lt;/a&gt; for more examples.&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Data Vis Chapter 6</title>
      <link>/post/data-vis-chapter-6/</link>
      <pubDate>Thu, 26 Sep 2019 00:00:00 +0000</pubDate>
      <guid>/post/data-vis-chapter-6/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#show-several-fits-at-once-with-a-legend&#34;&gt;Show Several Fits at Once, with a Legend&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#model-based-graphics&#34;&gt;Model-based Graphics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#tidy-model-objects-with-broom&#34;&gt;Tidy Model Objects with Broom&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#get-component-level-statistics-with-tidy&#34;&gt;Get component-level statistics with tidy()&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#get-observation-level-statistics-with-augment&#34;&gt;Get observation-level statistics with augment()&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#grouped-analysis&#34;&gt;Grouped Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#plots-for-surveys&#34;&gt;Plots for Surveys&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-  ggplot(data = gapminder,
             mapping = aes(x = log(gdpPercap), y = lifeExp))

p + geom_point(alpha = 0.1) +
  geom_smooth(color = &amp;quot;tomato&amp;quot;,
              fill = &amp;quot;tomato&amp;quot;,
              method = MASS::rlm) + #robust regression line
  geom_smooth(color = &amp;quot;steelblue&amp;quot;,
              fill = &amp;quot;steelblue&amp;quot;,
              method = &amp;quot;lm&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-1-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p + geom_point(alpha = 0.1) +
  geom_smooth(
    color = &amp;quot;tomato&amp;quot;,
    method = &amp;quot;lm&amp;quot;,
    size = 1.2,
    formula = y ~ splines::bs(x, 3),
    se = FALSE
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-1-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p + geom_point(alpha = 0.1) +
  geom_quantile( # specialized version of geom_smooth that can fit quantile regressions
    color = &amp;quot;tomato&amp;quot;,
    size = 1.2,
    method = &amp;quot;rqss&amp;quot;,
    lambda = 1,
    quantiles = c(0.20, 0.5, 0.85)
  )&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Smoothing formula not specified. Using: y ~ qss(x, lambda = 1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-1-3.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;div id=&#34;show-several-fits-at-once-with-a-legend&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Show Several Fits at Once, with a Legend&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;model_colors &amp;lt;- RColorBrewer::brewer.pal(3, &amp;quot;Set1&amp;quot;)
model_colors&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;#E41A1C&amp;quot; &amp;quot;#377EB8&amp;quot; &amp;quot;#4DAF4A&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p0 &amp;lt;- ggplot(data = gapminder,
             mapping = aes(x = log(gdpPercap), y = lifeExp))

p1 &amp;lt;- p0 + geom_point(alpha = 0.2) +
  geom_smooth(method = &amp;quot;lm&amp;quot;, aes(color = &amp;quot;OLS&amp;quot;, fill = &amp;quot;OLS&amp;quot;)) +
  geom_smooth(
    method = &amp;quot;lm&amp;quot;,
    formula = y ~ splines::bs(x, df = 3),
    aes(color = &amp;quot;Cubic Spline&amp;quot;, fill = &amp;quot;Cubic Spline&amp;quot;)
  ) +
  geom_smooth(method = &amp;quot;loess&amp;quot;,
              aes(color = &amp;quot;LOESS&amp;quot;, fill = &amp;quot;LOESS&amp;quot;))

p1 + scale_color_manual(name = &amp;quot;Models&amp;quot;, values = model_colors) +
  scale_fill_manual(name = &amp;quot;Models&amp;quot;, values = model_colors) +
  theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;model-based-graphics&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Model-based Graphics&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;min_gdp &amp;lt;- min(gapminder$gdpPercap)
max_gdp &amp;lt;- max(gapminder$gdpPercap)
med_pop &amp;lt;- median(gapminder$pop)

pred_df &amp;lt;- expand.grid(gdpPercap = (seq(from = min_gdp, to = max_gdp,
length.out = 100)), pop = med_pop, continent = c(&amp;quot;Africa&amp;quot;,
&amp;quot;Americas&amp;quot;, &amp;quot;Asia&amp;quot;, &amp;quot;Europe&amp;quot;, &amp;quot;Oceania&amp;quot;))

dim(pred_df)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 500   3&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;out &amp;lt;- lm(formula = lifeExp ~ gdpPercap + pop + continent, data = gapminder)

pred_out &amp;lt;- predict(object = out, newdata = pred_df, interval = &amp;quot;predict&amp;quot;)
pred_df &amp;lt;- cbind(pred_df, pred_out)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-
  ggplot(
    data = subset(pred_df, continent %in% c(&amp;quot;Europe&amp;quot;, &amp;quot;Africa&amp;quot;)),
    aes(
      x = gdpPercap,
      y = fit,
      ymin = lwr,
      ymax = upr,
      color = continent,
      fill = continent,
      group = continent
    )
  )

p + geom_point(
  data = subset(gapminder,
                continent %in% c(&amp;quot;Europe&amp;quot;, &amp;quot;Africa&amp;quot;)),
  aes(x = gdpPercap, y = lifeExp,
      color = continent),
  alpha = 0.5,
  inherit.aes = FALSE
) +
  geom_line() +
  geom_ribbon(alpha = 0.2, color = FALSE) +
  scale_x_log10(labels = scales::dollar)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;tidy-model-objects-with-broom&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Tidy Model Objects with Broom&lt;/h2&gt;
&lt;div id=&#34;get-component-level-statistics-with-tidy&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Get component-level statistics with tidy()&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(broom)
out_comp &amp;lt;- tidy(out)
out_comp %&amp;gt;% round_df()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 7 x 5
##   term              estimate std.error statistic p.value
##   &amp;lt;chr&amp;gt;                &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;
## 1 (Intercept)          47.8      0.34     141.         0
## 2 gdpPercap             0        0         19.2        0
## 3 pop                   0        0          3.33       0
## 4 continentAmericas    13.5      0.6       22.5        0
## 5 continentAsia         8.19     0.570     14.3        0
## 6 continentEurope      17.5      0.62      28.0        0
## 7 continentOceania     18.1      1.78      10.2        0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The “not in” operator &lt;code&gt;%nin%&lt;/code&gt; is available via &lt;code&gt;socviz&lt;/code&gt;.
&lt;code&gt;prefix_strip()&lt;/code&gt;, also from &lt;code&gt;socviz&lt;/code&gt;, drops prefixes from term names.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#confidence interval
out_conf &amp;lt;- tidy(out, conf.int = TRUE)
out_conf &amp;lt;- subset(out_conf, term %nin% &amp;quot;(Intercept)&amp;quot;)
out_conf$nicelabs &amp;lt;- prefix_strip(out_conf$term, &amp;quot;continent&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
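&lt;p&gt;To make the two helpers concrete, here is a rough base R sketch of what they do. The functions &lt;code&gt;not_in()&lt;/code&gt; and &lt;code&gt;strip_prefix()&lt;/code&gt; are hypothetical stand-ins written for illustration; they are not the &lt;code&gt;socviz&lt;/code&gt; implementations and only mimic the filtering and prefix-dropping behavior:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical stand-ins, not the socviz functions themselves
not_in &amp;lt;- function(x, table) !(x %in% table)
strip_prefix &amp;lt;- function(x, prefix) sub(paste0(&amp;quot;^&amp;quot;, prefix), &amp;quot;&amp;quot;, x)

terms &amp;lt;- c(&amp;quot;(Intercept)&amp;quot;, &amp;quot;gdpPercap&amp;quot;, &amp;quot;continentAsia&amp;quot;, &amp;quot;continentEurope&amp;quot;)

# Drop the intercept, then remove the leading &amp;quot;continent&amp;quot;
kept &amp;lt;- terms[not_in(terms, &amp;quot;(Intercept)&amp;quot;)]
nice &amp;lt;- strip_prefix(kept, &amp;quot;continent&amp;quot;)
nice&lt;/code&gt;&lt;/pre&gt;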
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(out_conf,
            mapping = aes(
              x = reorder(nicelabs, estimate),
              y = estimate,
              ymin = conf.low,
              ymax = conf.high
            ))
p + geom_pointrange() + coord_flip() + labs(x = &amp;quot;&amp;quot;, y = &amp;quot;OLS Estimate&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;get-observation-level-statistics-with-augment&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Get observation-level statistics with augment()&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;out_aug &amp;lt;- augment(out)
p &amp;lt;- ggplot(data = out_aug, mapping = aes(x = .fitted, y = .resid))
p + geom_point()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-8-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;get-model-level-statistics-with-glance&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Get model-level statistics with glance()&lt;/h3&gt;
&lt;p&gt;Broom is able to &lt;code&gt;tidy&lt;/code&gt; (and &lt;code&gt;augment&lt;/code&gt;, and &lt;code&gt;glance&lt;/code&gt; at) a wide range of model types.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(survival)

out_cph &amp;lt;- coxph(Surv(time, status) ~ age + sex, data = lung)
out_surv &amp;lt;- survfit(out_cph)
out_tidy &amp;lt;- tidy(out_surv)
p &amp;lt;- ggplot(data = out_tidy, mapping = aes(time, estimate))
p + geom_line() + geom_ribbon(mapping = aes(ymin = conf.low,
                                            ymax = conf.high),
                              alpha = 0.2)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-9-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
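&lt;p&gt;For completeness, &lt;code&gt;glance()&lt;/code&gt; itself returns a one-row summary of a fitted model. A minimal sketch on the built-in &lt;code&gt;mtcars&lt;/code&gt; data; the model below is a hypothetical example, not one used elsewhere in these notes:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(broom)

# Hypothetical linear model on built-in data
fit_cars &amp;lt;- lm(mpg ~ wt + hp, data = mtcars)

# One row of model-level statistics (R-squared, sigma, AIC, ...)
model_stats &amp;lt;- glance(fit_cars)
model_stats[, c(&amp;quot;r.squared&amp;quot;, &amp;quot;adj.r.squared&amp;quot;, &amp;quot;sigma&amp;quot;, &amp;quot;AIC&amp;quot;)]&lt;/code&gt;&lt;/pre&gt;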
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;grouped-analysis&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Grouped Analysis&lt;/h2&gt;
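&lt;p&gt;Use &lt;code&gt;nest()&lt;/code&gt; to collapse each group into a list-column of data frames, fit a model to each one, and then &lt;code&gt;unnest()&lt;/code&gt; the tidied results.&lt;/p&gt;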
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;out_le &amp;lt;- gapminder %&amp;gt;%
  group_by(continent, year) %&amp;gt;%
  nest()&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;fit_ols &amp;lt;- function(df) {
  lm(lifeExp ~ log(gdpPercap), data = df)
}

out_le &amp;lt;- gapminder %&amp;gt;%
  group_by(continent, year) %&amp;gt;%
  nest() %&amp;gt;%
  mutate(model = map(data, fit_ols))



out_tidy &amp;lt;- gapminder %&amp;gt;%
  group_by(continent, year) %&amp;gt;%
  nest() %&amp;gt;%
  mutate(model = map(data, fit_ols),
         tidied = map(model, tidy)) %&amp;gt;%
  unnest(tidied, .drop = TRUE) %&amp;gt;%
  filter(term %nin% &amp;quot;(Intercept)&amp;quot; &amp;amp;
           continent %nin% &amp;quot;Oceania&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: The `.drop` argument of `unnest()` is deprecated as of tidyr 1.0.0.
## All list-columns are now preserved.
## This warning is displayed once per session.
## Call `lifecycle::last_warnings()` to see where this warning was generated.&lt;/code&gt;&lt;/pre&gt;
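&lt;p&gt;Since the warning says list-columns are now preserved, one way to get the old behavior without &lt;code&gt;.drop&lt;/code&gt; is to select the columns to keep before unnesting. A minimal sketch of the same nest, fit, tidy, unnest pattern on the built-in &lt;code&gt;mtcars&lt;/code&gt; data; &lt;code&gt;fit_wt()&lt;/code&gt; and &lt;code&gt;by_cyl&lt;/code&gt; are hypothetical names used for illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Hypothetical helper: one regression per group
fit_wt &amp;lt;- function(df) {
  lm(mpg ~ wt, data = df)
}

by_cyl &amp;lt;- mtcars %&amp;gt;%
  group_by(cyl) %&amp;gt;%
  nest() %&amp;gt;%
  mutate(model = map(data, fit_wt),
         tidied = map(model, tidy)) %&amp;gt;%
  select(cyl, tidied) %&amp;gt;%   # keep only what should be unnested
  unnest(tidied)
by_cyl&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The result has one row per group and model term: three cylinder groups times two terms.&lt;/p&gt;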
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(
  data = out_tidy,
  mapping = aes(
    x = year,
    y = estimate,
    ymin = estimate - 2 * std.error,
    ymax = estimate + 2 * std.error,
    group = continent,
    color = continent
  )
)

p + geom_pointrange(position = position_dodge(width = 1)) +
  scale_x_continuous(breaks = unique(gapminder$year)) +
  theme(legend.position = &amp;quot;top&amp;quot;) +
  labs(x = &amp;quot;Year&amp;quot;, y = &amp;quot;Estimate&amp;quot;, color = &amp;quot;Continent&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
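&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-11-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;plot-marginal-effects&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Plot Marginal Effects&lt;/h2&gt;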
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(margins)
gss_sm$polviews_m &amp;lt;- relevel(gss_sm$polviews, ref = &amp;quot;Moderate&amp;quot;)
out_bo &amp;lt;- glm(obama ~ polviews_m + sex * race,
              family = &amp;quot;binomial&amp;quot;,
              data = gss_sm)
summary(out_bo)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Call:
## glm(formula = obama ~ polviews_m + sex * race, family = &amp;quot;binomial&amp;quot;, 
##     data = gss_sm)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.9045  -0.5541   0.1772   0.5418   2.2437  
## 
## Coefficients:
##                                   Estimate Std. Error z value Pr(&amp;gt;|z|)    
## (Intercept)                       0.296493   0.134091   2.211  0.02703 *  
## polviews_mExtremely Liberal       2.372950   0.525045   4.520 6.20e-06 ***
## polviews_mLiberal                 2.600031   0.356666   7.290 3.10e-13 ***
## polviews_mSlightly Liberal        1.293172   0.248435   5.205 1.94e-07 ***
## polviews_mSlightly Conservative  -1.355277   0.181291  -7.476 7.68e-14 ***
## polviews_mConservative           -2.347463   0.200384 -11.715  &amp;lt; 2e-16 ***
## polviews_mExtremely Conservative -2.727384   0.387210  -7.044 1.87e-12 ***
## sexFemale                         0.254866   0.145370   1.753  0.07956 .  
## raceBlack                         3.849526   0.501319   7.679 1.61e-14 ***
## raceOther                        -0.002143   0.435763  -0.005  0.99608    
## sexFemale:raceBlack              -0.197506   0.660066  -0.299  0.76477    
## sexFemale:raceOther               1.574829   0.587657   2.680  0.00737 ** 
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 2247.9  on 1697  degrees of freedom
## Residual deviance: 1345.9  on 1686  degrees of freedom
##   (1169 observations deleted due to missingness)
## AIC: 1369.9
## 
## Number of Fisher Scoring iterations: 6&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bo_m &amp;lt;- margins(out_bo)
summary(bo_m)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##                            factor     AME     SE        z      p   lower
##            polviews_mConservative -0.4119 0.0283 -14.5394 0.0000 -0.4674
##  polviews_mExtremely Conservative -0.4538 0.0420 -10.7971 0.0000 -0.5361
##       polviews_mExtremely Liberal  0.2681 0.0295   9.0996 0.0000  0.2103
##                 polviews_mLiberal  0.2768 0.0229  12.0736 0.0000  0.2319
##   polviews_mSlightly Conservative -0.2658 0.0330  -8.0596 0.0000 -0.3304
##        polviews_mSlightly Liberal  0.1933 0.0303   6.3896 0.0000  0.1340
##                         raceBlack  0.4032 0.0173  23.3568 0.0000  0.3694
##                         raceOther  0.1247 0.0386   3.2297 0.0012  0.0490
##                         sexFemale  0.0443 0.0177   2.5073 0.0122  0.0097
##    upper
##  -0.3564
##  -0.3714
##   0.3258
##   0.3218
##  -0.2011
##   0.2526
##   0.4371
##   0.2005
##   0.0789&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The margins library comes with several plot methods of its own. If you wish, at this point you can just try &lt;code&gt;plot(bo_m)&lt;/code&gt; to see a plot of the average marginal effects, produced with the general look of a Stata graphic. Other plot methods in the margins
library include &lt;code&gt;cplot()&lt;/code&gt;, which visualizes marginal effects conditional on a second variable, and &lt;code&gt;image()&lt;/code&gt;, which shows predictions or marginal effects as a filled heatmap or contour plot.&lt;/p&gt;
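&lt;p&gt;To give a sense of what an average marginal effect measures: for a continuous predictor in a logit model, the marginal effect for one observation is the coefficient times p(1 - p), and the AME is the mean of that quantity over the data. A minimal base R sketch on the built-in &lt;code&gt;mtcars&lt;/code&gt; data; this is a hypothetical model, not the one fitted above, and for factor predictors such as those in &lt;code&gt;out_bo&lt;/code&gt; the effect is instead a difference in predicted probability relative to the reference level:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical logit model: transmission type as a function of weight
fit_am &amp;lt;- glm(am ~ wt, family = &amp;quot;binomial&amp;quot;, data = mtcars)

# Linear predictor for each observation
eta &amp;lt;- predict(fit_am, type = &amp;quot;link&amp;quot;)

# dlogis(eta) equals p * (1 - p), the slope of the logistic curve
ame_wt &amp;lt;- mean(coef(fit_am)[&amp;quot;wt&amp;quot;] * dlogis(eta))
ame_wt&lt;/code&gt;&lt;/pre&gt;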
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bo_gg &amp;lt;- as_tibble(summary(bo_m))
prefixes &amp;lt;- c(&amp;quot;polviews_m&amp;quot;, &amp;quot;sex&amp;quot;)
bo_gg$factor &amp;lt;- prefix_strip(bo_gg$factor, prefixes)
bo_gg$factor &amp;lt;- prefix_replace(bo_gg$factor, &amp;quot;race&amp;quot;, &amp;quot;Race: &amp;quot;)

bo_gg %&amp;gt;% select(factor, AME, lower, upper)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 9 x 4
##   factor                     AME    lower   upper
##   &amp;lt;chr&amp;gt;                    &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;
## 1 Conservative           -0.412  -0.467   -0.356 
## 2 Extremely Conservative -0.454  -0.536   -0.371 
## 3 Extremely Liberal       0.268   0.210    0.326 
## 4 Liberal                 0.277   0.232    0.322 
## 5 Slightly Conservative  -0.266  -0.330   -0.201 
## 6 Slightly Liberal        0.193   0.134    0.253 
## 7 Race: Black             0.403   0.369    0.437 
## 8 Race: Other             0.125   0.0490   0.200 
## 9 Female                  0.0443  0.00967  0.0789&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = bo_gg, aes(
  x = reorder(factor, AME),
  y = AME,
  ymin = lower,
  ymax = upper
))

p + geom_hline(yintercept = 0, color = &amp;quot;gray80&amp;quot;) +
  geom_pointrange() + coord_flip() +
  labs(x = NULL, y = &amp;quot;Average Marginal Effect&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-13-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;pv_cp &amp;lt;- cplot(out_bo, x = &amp;quot;sex&amp;quot;, draw = FALSE)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##    xvals     yvals     upper     lower
## 1   Male 0.5735849 0.6378653 0.5093045
## 2 Female 0.6344507 0.6887845 0.5801169&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = pv_cp, aes(
  x = reorder(xvals, yvals),
  y = yvals,
  ymin = lower,
  ymax = upper
))

p + geom_hline(yintercept = 0, color = &amp;quot;gray80&amp;quot;) +
  geom_pointrange() + coord_flip() +
  labs(x = NULL, y = &amp;quot;Conditional Effect&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-14-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;plots-for-surveys&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Plots for Surveys&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(survey)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Loading required package: grid&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Loading required package: Matrix&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Attaching package: &amp;#39;Matrix&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The following objects are masked from &amp;#39;package:tidyr&amp;#39;:
## 
##     expand, pack, unpack&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Attaching package: &amp;#39;survey&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The following object is masked from &amp;#39;package:graphics&amp;#39;:
## 
##     dotchart&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(srvyr)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Attaching package: &amp;#39;srvyr&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The following object is masked from &amp;#39;package:stats&amp;#39;:
## 
##     filter&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;options(survey.lonely.psu = &amp;quot;adjust&amp;quot;)
options(na.action = &amp;quot;na.pass&amp;quot;)

gss_wt &amp;lt;- subset(gss_lon, year &amp;gt; 1974) %&amp;gt;%
  mutate(stratvar = interaction(year, vstrat)) %&amp;gt;%
  as_survey_design(
    ids = vpsu,
    strata = stratvar,
    weights = wtssall,
    nest = TRUE
  )

out_grp &amp;lt;- gss_wt %&amp;gt;%
  filter(year %in% seq(1976, 2016, by = 4)) %&amp;gt;%
  group_by(year, race, degree) %&amp;gt;%
  summarize(prop = survey_mean(na.rm = TRUE)) # Calculate properly weighted survey means&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Factor `degree` contains implicit NA, consider using
## `forcats::fct_explicit_na`&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;out_mrg &amp;lt;- gss_wt %&amp;gt;%
  filter(year %in% seq(1976, 2016, by = 4)) %&amp;gt;%
  mutate(racedeg = interaction(race, degree)) %&amp;gt;%
  group_by(year, racedeg) %&amp;gt;%
  summarize(prop = survey_mean(na.rm = TRUE))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Factor `racedeg` contains implicit NA, consider using
## `forcats::fct_explicit_na`&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;out_mrg &amp;lt;- gss_wt %&amp;gt;%
  filter(year %in% seq(1976, 2016, by = 4)) %&amp;gt;%
  mutate(racedeg = interaction(race, degree)) %&amp;gt;%
  group_by(year, racedeg) %&amp;gt;%
  summarize(prop = survey_mean(na.rm = TRUE)) %&amp;gt;%
  separate(racedeg, sep = &amp;quot;\\.&amp;quot;, into = c(&amp;quot;race&amp;quot;, &amp;quot;degree&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Factor `racedeg` contains implicit NA, consider using
## `forcats::fct_explicit_na`&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(
  data = subset(out_grp, race %nin% &amp;quot;Other&amp;quot;),
  mapping = aes(
    x = degree,
    y = prop,
    ymin = prop - 2 * prop_se,
    ymax = prop + 2 * prop_se,
    fill = race,
    color = race,
    group = race
  )
)

dodge &amp;lt;- position_dodge(width = 0.9)

p + geom_col(position = dodge, alpha = 0.2) +
  geom_errorbar(position = dodge, width = 0.2) +
  scale_x_discrete(labels = scales::wrap_format(10)) +
  scale_y_continuous(labels = scales::percent) +
  scale_color_brewer(type = &amp;quot;qual&amp;quot;, palette = &amp;quot;Dark2&amp;quot;) +
  scale_fill_brewer(type = &amp;quot;qual&amp;quot;, palette = &amp;quot;Dark2&amp;quot;) +
  labs(
    title = &amp;quot;Educational Attainment by Race&amp;quot;,
    subtitle = &amp;quot;GSS 1976-2016&amp;quot;,
    fill = &amp;quot;Race&amp;quot;,
    color = &amp;quot;Race&amp;quot;,
    x = NULL,
    y = &amp;quot;Percent&amp;quot;
  ) +
  facet_wrap( ~ year, ncol = 2) +
  theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 13 rows containing missing values (geom_col).&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 13 rows containing missing values (geom_errorbar).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-15-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(
  data = subset(out_grp, race %nin% &amp;quot;Other&amp;quot;),
  mapping = aes(
    x = year,
    y = prop,
    ymin = prop - 2 * prop_se,
    ymax = prop + 2 * prop_se,
    fill = race,
    color = race,
    group = race
  )
)

p + geom_ribbon(alpha = 0.3, aes(color = NULL)) + # Use a ribbon to show the error range
  geom_line() + # Use a line to show the time trend
  facet_wrap( ~ degree, ncol = 1) +
  scale_y_continuous(labels = scales::percent) +
  scale_color_brewer(type = &amp;quot;qual&amp;quot;, palette = &amp;quot;Dark2&amp;quot;) +
  scale_fill_brewer(type = &amp;quot;qual&amp;quot;, palette = &amp;quot;Dark2&amp;quot;) +
  labs(
    title = &amp;quot;Educational Attainment by Race&amp;quot;,
    subtitle = &amp;quot;GSS 1976-2016&amp;quot;,
    fill = &amp;quot;Race&amp;quot;,
    color = &amp;quot;Race&amp;quot;,
    x = NULL,
    y = &amp;quot;Percent&amp;quot;
  ) +
  theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 13 rows containing missing values (geom_path).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-vis-chapter-6_files/figure-html/unnamed-chunk-16-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Other useful packages: &lt;code&gt;infer&lt;/code&gt;, &lt;code&gt;GGally&lt;/code&gt;.&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Data Visualization Chapter 2-4</title>
      <link>/post/test/</link>
      <pubDate>Thu, 26 Sep 2019 00:00:00 +0000</pubDate>
      <guid>/post/test/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#chapter-2&#34;&gt;Chapter 2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#chapter-3&#34;&gt;Chapter 3&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#wrong-way-to-set-color&#34;&gt;Wrong way to set color&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#aesthetics-can-be-mapped-per-geom&#34;&gt;Aesthetics Can Be Mapped per Geom&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#save-plots&#34;&gt;Save plots&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#chapter-4&#34;&gt;Chapter 4&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#group-data-and-the-group-aesthetic&#34;&gt;Group data and the “Group” Aesthetic&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#facet-to-make-small-multiples&#34;&gt;Facet to make small multiples&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#geoms-can-transform-data&#34;&gt;Geoms can transform data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#histgrams-and-density-plots&#34;&gt;Histograms and Density Plots&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#avoid-transformations-when-necessary&#34;&gt;Avoid Transformations When Necessary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;chapter-2&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Chapter 2&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;geom_point&lt;/code&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/cars-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;chapter-3&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Chapter 3&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;geom_smooth&lt;/code&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## `geom_smooth()` using method = &amp;#39;gam&amp;#39; and formula &amp;#39;y ~ s(x, bs = &amp;quot;cs&amp;quot;)&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/pressure-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point() + geom_smooth()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## `geom_smooth()` using method = &amp;#39;gam&amp;#39; and formula &amp;#39;y ~ s(x, bs = &amp;quot;cs&amp;quot;)&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-1-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;scale_x_log10&lt;/code&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point() + geom_smooth(method = &amp;quot;gam&amp;quot;) + scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;scales::dollar&lt;/code&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point() +
geom_smooth(method = &amp;quot;gam&amp;quot;) +
scale_x_log10(labels = scales::dollar)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;div id=&#34;wrong-way-to-set-color&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Wrong way to set color&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp,
color = &amp;quot;purple&amp;quot;))
p + geom_point() + geom_smooth(method = &amp;quot;loess&amp;quot;) + scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
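&lt;p&gt;The &lt;code&gt;aes()&lt;/code&gt; function is for mappings only. Do not use it to set a property to a particular value: in the plot above, &lt;code&gt;&amp;quot;purple&amp;quot;&lt;/code&gt; is treated as a one-level variable rather than as a color. To set a property, do it in the &lt;code&gt;geom_&lt;/code&gt; function we are using, outside the &lt;code&gt;mapping = aes(...)&lt;/code&gt; step.&lt;/p&gt;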
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point(color = &amp;quot;purple&amp;quot;) + geom_smooth(method = &amp;quot;loess&amp;quot;) + scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-5-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The various &lt;code&gt;geom_&lt;/code&gt; functions can take many other arguments that affect how the plot looks but do not involve mapping variables to aesthetic elements. For example, &lt;code&gt;alpha&lt;/code&gt; is an aesthetic property that points (and some other plot elements) have, and to which variables can be mapped. It controls how transparent the object appears when drawn, on a scale from zero (fully transparent) to one (fully opaque).&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point(alpha = 0.3) + geom_smooth(color = &amp;quot;orange&amp;quot;, se = FALSE,
                                          size = 8, method = &amp;quot;lm&amp;quot;) + scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y=lifeExp))
p + geom_point(alpha = 0.3) +
  geom_smooth(method = &amp;quot;gam&amp;quot;) +
  scale_x_log10(labels = scales::dollar) +
  labs(x = &amp;quot;GDP Per Capita&amp;quot;, y = &amp;quot;Life Expectancy in Years&amp;quot;,
       title = &amp;quot;Economic Growth and Life Expectancy&amp;quot;,
       subtitle = &amp;quot;Data points are country-years&amp;quot;,
       caption = &amp;quot;Source: Gapminder.&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp,
                                            color = continent))
p + geom_point() + geom_smooth(method = &amp;quot;loess&amp;quot;) + scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-8-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The color of the standard error ribbon is controlled by the &lt;code&gt;fill&lt;/code&gt; aesthetic.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp,
                                            color = continent, fill = continent))
p + geom_point() + geom_smooth(method = &amp;quot;loess&amp;quot;) + scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-9-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;aesthetics-can-be-mapped-per-geom&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Aesthetics Can Be Mapped per Geom&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point(mapping = aes(color = factor(year))) + 
  geom_smooth(method = &amp;quot;loess&amp;quot;) +
  scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-10-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
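&lt;p&gt;The order in which geoms and scales are added does not change what is computed, although layers are drawn in the order they are added, so later geoms sit on top of earlier ones.
Besides &lt;code&gt;scale_x_log10()&lt;/code&gt;, you can try &lt;code&gt;scale_x_sqrt()&lt;/code&gt; and &lt;code&gt;scale_x_reverse()&lt;/code&gt;.&lt;/p&gt;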
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = pop, y = lifeExp))
p + geom_smooth(method = &amp;quot;loess&amp;quot;) + 
  geom_point(mapping = aes(color = continent)) + 
  scale_x_reverse(labels = scales::number)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-11-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point(mapping = aes(color = log(pop))) + scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-12-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;save-plots&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Save plots&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p_out &amp;lt;- p + geom_point() + geom_smooth(method = &amp;quot;loess&amp;quot;) + scale_x_log10()
ggsave(&amp;quot;my_figure.pdf&amp;quot;, plot = p_out)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;chapter-4&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Chapter 4&lt;/h2&gt;
&lt;div id=&#34;group-data-and-the-group-aesthetic&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Group data and the “Group” Aesthetic&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = year, y = gdpPercap))
p + geom_line()&lt;/code&gt;&lt;/pre&gt;
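&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-14-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Without grouping, &lt;code&gt;geom_line()&lt;/code&gt; joins every observation into a single line, ordered by year. Use the &lt;code&gt;group&lt;/code&gt; aesthetic to tell ggplot explicitly about the country-level structure of the data.&lt;/p&gt;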
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = year, y = gdpPercap))
p + geom_line(aes(group = country))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-15-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;facet-to-make-small-multiples&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Facet to make small multiples&lt;/h3&gt;
&lt;p&gt;Use &lt;code&gt;facet_wrap()&lt;/code&gt; to split the plot by &lt;code&gt;continent&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = year, y = gdpPercap))
p + geom_line(aes(group = country)) + facet_wrap(~continent)&lt;/code&gt;&lt;/pre&gt;
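&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-16-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Add a few more enhancements: a gray line color, a smoothed trend, a log scale with dollar labels, and axis labels.&lt;/p&gt;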
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gapminder, mapping = aes(x = year, y = gdpPercap))
p + geom_line(color = &amp;quot;gray70&amp;quot;, aes(group = country)) +
  geom_smooth(size = 1.1, method = &amp;quot;loess&amp;quot;, se = FALSE) +
  scale_y_log10(labels = scales::dollar) +
  facet_wrap(~continent, ncol = 5) +
  labs(x = &amp;quot;Year&amp;quot;,
       y = &amp;quot;GDP per capita on Five Continents&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-17-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Use &lt;code&gt;facet_grid&lt;/code&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = age, y = childs))
p + geom_point(alpha = 0.2) +
  geom_smooth() + 
  facet_grid(sex ~ race)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## `geom_smooth()` using method = &amp;#39;gam&amp;#39; and formula &amp;#39;y ~ s(x, bs = &amp;quot;cs&amp;quot;)&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 18 rows containing non-finite values (stat_smooth).&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 18 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-18-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = age, y = childs))
p + geom_point(alpha = 0.2) +
  geom_smooth() + 
  facet_grid(sex ~ race + degree)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## `geom_smooth()` using method = &amp;#39;loess&amp;#39; and formula &amp;#39;y ~ x&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 18 rows containing non-finite values (stat_smooth).&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : span too small. fewer data values than degrees of freedom.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : pseudoinverse used at 62.87&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : neighborhood radius 2.13&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : reciprocal condition number 0&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : There are other near singularities as well. 582.26&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in predLoess(object$y, object$x, newx = if
## (is.null(newdata)) object$x else if (is.data.frame(newdata))
## as.matrix(model.frame(delete.response(terms(object)), : span too small.
## fewer data values than degrees of freedom.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in predLoess(object$y, object$x, newx = if
## (is.null(newdata)) object$x else if (is.data.frame(newdata))
## as.matrix(model.frame(delete.response(terms(object)), : pseudoinverse used
## at 62.87&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in predLoess(object$y, object$x, newx = if
## (is.null(newdata)) object$x else if (is.data.frame(newdata))
## as.matrix(model.frame(delete.response(terms(object)), : neighborhood radius
## 2.13&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in predLoess(object$y, object$x, newx = if
## (is.null(newdata)) object$x else if (is.data.frame(newdata))
## as.matrix(model.frame(delete.response(terms(object)), : reciprocal
## condition number 0&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in predLoess(object$y, object$x, newx = if
## (is.null(newdata)) object$x else if (is.data.frame(newdata))
## as.matrix(model.frame(delete.response(terms(object)), : There are other
## near singularities as well. 582.26&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 18 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-19-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;geoms-can-transform-data&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Geoms can transform data&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = bigregion))
p + geom_bar()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-20-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;geom_bar()&lt;/code&gt; calls the default &lt;code&gt;stat_&lt;/code&gt; function associated with it, &lt;code&gt;stat_count()&lt;/code&gt;, which counts the number of rows in each category of &lt;code&gt;x&lt;/code&gt;.&lt;/p&gt;
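&lt;p&gt;As a rough sketch of what that default transformation amounts to, the counts can also be computed by hand and plotted with &lt;code&gt;geom_col()&lt;/code&gt;, which draws the values it is given without transforming them. This assumes &lt;code&gt;gss_sm&lt;/code&gt; comes from the &lt;code&gt;socviz&lt;/code&gt; package; the name &lt;code&gt;region_n&lt;/code&gt; is only illustrative.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(dplyr)
library(socviz) # assumed source of gss_sm

# Count the rows in each region by hand (illustrative name: region_n)
region_n &amp;lt;- gss_sm %&amp;gt;% count(bigregion)

# geom_col() plots the precomputed counts as they are
p &amp;lt;- ggplot(data = region_n, mapping = aes(x = bigregion, y = n))
p + geom_col()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The bar heights should match those drawn by &lt;code&gt;geom_bar()&lt;/code&gt; above.&lt;/p&gt;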
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = bigregion))
p + geom_bar(mapping = aes(y = ..prop..))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-21-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = bigregion))
p + geom_bar(mapping = aes(y = ..prop.., group = 1))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-22-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table(gss_sm$religion)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
## Protestant   Catholic     Jewish       None      Other 
##       1371        649         51        619        159&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = religion, color = religion))
p + geom_bar()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-24-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = religion, fill = religion))
p + geom_bar() + guides(fill = FALSE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-24-2.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p + geom_bar()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-24-3.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = bigregion, fill = religion))
p + geom_bar()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-25-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = bigregion, fill = religion))
p + geom_bar(position = &amp;quot;fill&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-26-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
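&lt;p&gt;If you want separate bars for each religion within a region, use &lt;code&gt;position = &amp;quot;dodge&amp;quot;&lt;/code&gt;.&lt;/p&gt;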
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = bigregion, fill = religion))
p + geom_bar(position = &amp;quot;dodge&amp;quot;, mapping = aes(y = ..prop..,
                                               group = religion))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-27-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;However, the proportions don’t sum to one within each region. Because they are computed within each religion group, they sum to one across regions instead.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = gss_sm, mapping = aes(x = religion))
p + geom_bar(position = &amp;quot;dodge&amp;quot;, mapping = aes(y = ..prop..,
                                               group = bigregion)) +
  facet_wrap(~bigregion, ncol=1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-28-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;histgrams-and-density-plots&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Histograms and Density Plots&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = midwest, mapping = aes( x = area))
p + geom_histogram()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-29-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = midwest, mapping = aes( x = area))
p + geom_histogram(bins = 10)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-30-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;oh_wi &amp;lt;- c(&amp;quot;OH&amp;quot;, &amp;quot;WI&amp;quot;)
p &amp;lt;- ggplot(data = subset(midwest, subset = state %in% oh_wi),
            mapping = aes(x = percollege, fill = state))
p + geom_histogram(alpha = 0.4, bins = 20)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-31-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = midwest, mapping = aes( x = area))
p + geom_density()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-32-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = midwest, mapping = aes( x = area, fill = state,
                                           color = state))
p + geom_density(alpha = 0.3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-33-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;avoid-transformations-when-necessary&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Avoid Transformations When Necessary&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = titanic, mapping = aes(x = fate, y = percent,
                                          fill = sex))
p + geom_bar(position = &amp;quot;dodge&amp;quot;, stat = &amp;quot;identity&amp;quot;) + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-34-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = oecd_sum,
            mapping = aes(x = year, y = diff, fill = hi_lo))
p + geom_col() + guides(fill = FALSE) + 
  labs(x = NULL, y = &amp;quot;Difference in Years&amp;quot;,
       title = &amp;quot;The US Life Expectancy Gap&amp;quot;,
       subtitle = &amp;quot;Difference between US and OECD 
       average life expectancies, 1960-2015&amp;quot;,
       caption = &amp;quot;Data: OECD. After a chart by Christopher Ingraham,
       Washington Post, December 27th 2017.&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 1 rows containing missing values (position_stack).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-Data-Visualization-Chapter-2-4_files/figure-html/unnamed-chunk-35-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Data Visualization Chapter 5</title>
      <link>/post/data-visualization-chapter-5/</link>
      <pubDate>Thu, 26 Sep 2019 00:00:00 +0000</pubDate>
      <guid>/post/data-visualization-chapter-5/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#chapter-5&#34;&gt;Chapter 5&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#use-pipes-to-summerize-data&#34;&gt;Use Pipes to Summarize Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#continuous-variables-by-group-or-category&#34;&gt;Continuous Variables by Group or Category&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#write-and-draw-in-the-plot-area&#34;&gt;Write and Draw in the Plot Area&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#scales-guides-and-themes&#34;&gt;Scales, Guides, and Themes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;chapter-5&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Chapter 5&lt;/h2&gt;
&lt;div id=&#34;use-pipes-to-summerize-data&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Use Pipes to Summarize Data&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;rel_by_region &amp;lt;- gss_sm %&amp;gt;%
  group_by(bigregion, religion) %&amp;gt;%
  summarize(N = n()) %&amp;gt;%
  mutate(freq = N / sum(N),
         pct = round((freq*100), 0))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Factor `religion` contains implicit NA, consider using
## `forcats::fct_explicit_na`&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;rel_by_region&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 24 x 5
## # Groups:   bigregion [4]
##    bigregion religion       N    freq   pct
##    &amp;lt;fct&amp;gt;     &amp;lt;fct&amp;gt;      &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
##  1 Northeast Protestant   158 0.324      32
##  2 Northeast Catholic     162 0.332      33
##  3 Northeast Jewish        27 0.0553      6
##  4 Northeast None         112 0.230      23
##  5 Northeast Other         28 0.0574      6
##  6 Northeast &amp;lt;NA&amp;gt;           1 0.00205     0
##  7 Midwest   Protestant   325 0.468      47
##  8 Midwest   Catholic     172 0.247      25
##  9 Midwest   Jewish         3 0.00432     0
## 10 Midwest   None         157 0.226      23
## # … with 14 more rows&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;rel_by_region %&amp;gt;% group_by(bigregion) %&amp;gt;% summarize(total = sum(pct))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 4 x 2
##   bigregion total
##   &amp;lt;fct&amp;gt;     &amp;lt;dbl&amp;gt;
## 1 Northeast   100
## 2 Midwest     101
## 3 South       100
## 4 West        101&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(rel_by_region, aes(x = bigregion, y = pct, fill = religion))
p + geom_col(position = &amp;quot;dodge2&amp;quot;) +
  labs(x = &amp;quot;Region&amp;quot;,y = &amp;quot;Percent&amp;quot;, fill = &amp;quot;Religion&amp;quot;) +
  theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Use &lt;code&gt;coord_flip()&lt;/code&gt; to swap the x and y axes so the bars run horizontally, and facet by region:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(rel_by_region, aes(x = bigregion, y = pct, fill = religion))
p + geom_col(position = &amp;quot;dodge2&amp;quot;) +
  labs(x = &amp;quot;Region&amp;quot;,y = &amp;quot;Percent&amp;quot;, fill = &amp;quot;Religion&amp;quot;) +
  guides(fill = FALSE) + 
  coord_flip() + 
  facet_grid(~ bigregion)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(rel_by_region, aes(x = religion, y = pct, fill = religion))
p + geom_col(position = &amp;quot;dodge2&amp;quot;) +
  labs(x = NULL,y = &amp;quot;Percent&amp;quot;, fill = &amp;quot;Religion&amp;quot;) +
  guides(fill = FALSE) + 
  coord_flip() + 
  facet_grid(~ bigregion)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-5-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;continuous-variables-by-group-or-category&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Continuous Variables by Group or Category&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata, mapping = aes(x = year, y = donors))
p + geom_line(aes(group = country)) + facet_wrap(~country)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_path).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata, mapping = aes(x = country, y = donors))
p + geom_boxplot() + coord_flip()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing non-finite values (stat_boxplot).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata, mapping = aes(x = reorder(country,
                                                        donors, na.rm = TRUE), y = donors))
p + geom_boxplot() + labs(x = NULL) + coord_flip()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing non-finite values (stat_boxplot).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-8-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata, mapping = aes(x = reorder(country, donors, na.rm = TRUE), 
                                            y = donors, fill = world))
p + geom_boxplot() + labs(x = NULL) + 
  coord_flip() + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing non-finite values (stat_boxplot).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-9-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata, mapping = aes(x = reorder(country, donors, na.rm = TRUE), 
                                            y = donors, color = world))
p + geom_point() + labs(x = NULL) + 
  coord_flip() + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-10-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-
  ggplot(data = organdata,
         mapping = aes(
           x = reorder(country, donors, na.rm = TRUE),
           y = donors,
           color = world
         ))
p + geom_jitter() + labs(x = NULL) +
  coord_flip() + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-11-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;-
  ggplot(data = organdata,
         mapping = aes(
           x = reorder(country, donors, na.rm = TRUE),
           y = donors,
           color = world
         ))
p + geom_jitter(position = position_jitter(width = 0.15)) + labs(x = NULL) +
  coord_flip() + theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-12-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;by_country &amp;lt;-
  organdata %&amp;gt;% group_by(consent_law, country) %&amp;gt;% summarize(
    donors_mean = mean(donors, na.rm = TRUE),
    donors_sd = sd(donors, na.rm = TRUE),
    gdp_mean = mean(gdp, na.rm = TRUE),
    health_mean = mean(health, na.rm = TRUE),
    roads_mean = mean(roads, na.rm = TRUE),
    cerebvas_mean = mean(cerebvas, na.rm = TRUE)
  )&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;by_country&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 17 x 8
## # Groups:   consent_law [2]
##    consent_law country donors_mean donors_sd gdp_mean health_mean
##    &amp;lt;chr&amp;gt;       &amp;lt;chr&amp;gt;         &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;       &amp;lt;dbl&amp;gt;
##  1 Informed    Austra…        10.6     1.14    22179.       1958.
##  2 Informed    Canada         14.0     0.751   23711.       2272.
##  3 Informed    Denmark        13.1     1.47    23722.       2054.
##  4 Informed    Germany        13.0     0.611   22163.       2349.
##  5 Informed    Ireland        19.8     2.48    20824.       1480.
##  6 Informed    Nether…        13.7     1.55    23013.       1993.
##  7 Informed    United…        13.5     0.775   21359.       1561.
##  8 Informed    United…        20.0     1.33    29212.       3988.
##  9 Presumed    Austria        23.5     2.42    23876.       1875.
## 10 Presumed    Belgium        21.9     1.94    22500.       1958.
## 11 Presumed    Finland        18.4     1.53    21019.       1615.
## 12 Presumed    France         16.8     1.60    22603.       2160.
## 13 Presumed    Italy          11.1     4.28    21554.       1757 
## 14 Presumed    Norway         15.4     1.11    26448.       2217.
## 15 Presumed    Spain          28.1     4.96    16933        1289.
## 16 Presumed    Sweden         13.1     1.75    22415.       1951.
## 17 Presumed    Switze…        14.2     1.71    27233        2776.
## # … with 2 more variables: roads_mean &amp;lt;dbl&amp;gt;, cerebvas_mean &amp;lt;dbl&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;by_country &amp;lt;- organdata %&amp;gt;% group_by(consent_law, country) %&amp;gt;%
  summarize_if(is.numeric, lst(mean, sd), na.rm = TRUE) %&amp;gt;%
  ungroup()
by_country&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 17 x 28
##    consent_law country donors_mean pop_mean pop_dens_mean gdp_mean
##    &amp;lt;chr&amp;gt;       &amp;lt;chr&amp;gt;         &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;         &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;
##  1 Informed    Austra…        10.6   18318.         0.237   22179.
##  2 Informed    Canada         14.0   29608.         0.297   23711.
##  3 Informed    Denmark        13.1    5257.        12.2     23722.
##  4 Informed    Germany        13.0   80255.        22.5     22163.
##  5 Informed    Ireland        19.8    3674.         5.23    20824.
##  6 Informed    Nether…        13.7   15548.        37.4     23013.
##  7 Informed    United…        13.5   58187.        24.0     21359.
##  8 Informed    United…        20.0  269330.         2.80    29212.
##  9 Presumed    Austria        23.5    7927.         9.45    23876.
## 10 Presumed    Belgium        21.9   10153.        30.7     22500.
## 11 Presumed    Finland        18.4    5112.         1.51    21019.
## 12 Presumed    France         16.8   58056.        10.5     22603.
## 13 Presumed    Italy          11.1   57360.        19.0     21554.
## 14 Presumed    Norway         15.4    4386.         1.35    26448.
## 15 Presumed    Spain          28.1   39666.         7.84    16933 
## 16 Presumed    Sweden         13.1    8789.         1.95    22415.
## 17 Presumed    Switze…        14.2    7037.        17.0     27233 
## # … with 22 more variables: gdp_lag_mean &amp;lt;dbl&amp;gt;, health_mean &amp;lt;dbl&amp;gt;,
## #   health_lag_mean &amp;lt;dbl&amp;gt;, pubhealth_mean &amp;lt;dbl&amp;gt;, roads_mean &amp;lt;dbl&amp;gt;,
## #   cerebvas_mean &amp;lt;dbl&amp;gt;, assault_mean &amp;lt;dbl&amp;gt;, external_mean &amp;lt;dbl&amp;gt;,
## #   txp_pop_mean &amp;lt;dbl&amp;gt;, donors_sd &amp;lt;dbl&amp;gt;, pop_sd &amp;lt;dbl&amp;gt;, pop_dens_sd &amp;lt;dbl&amp;gt;,
## #   gdp_sd &amp;lt;dbl&amp;gt;, gdp_lag_sd &amp;lt;dbl&amp;gt;, health_sd &amp;lt;dbl&amp;gt;, health_lag_sd &amp;lt;dbl&amp;gt;,
## #   pubhealth_sd &amp;lt;dbl&amp;gt;, roads_sd &amp;lt;dbl&amp;gt;, cerebvas_sd &amp;lt;dbl&amp;gt;,
## #   assault_sd &amp;lt;dbl&amp;gt;, external_sd &amp;lt;dbl&amp;gt;, txp_pop_sd &amp;lt;dbl&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = by_country,
            mapping = aes(
              x = donors_mean,
              y = reorder(country, donors_mean),
              color = consent_law
            ))
p + geom_point(size = 3) +
  labs(x = &amp;quot;Donor Procurement Rate&amp;quot;,
       y = &amp;quot;&amp;quot;, color = &amp;quot;Consent Law&amp;quot;) +
  theme(legend.position = &amp;quot;top&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-16-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = by_country,
            mapping = aes(x = donors_mean,
                          y = reorder(country, donors_mean)))

p + geom_point(size = 3) +
  facet_wrap( ~ consent_law, scales = &amp;quot;free_y&amp;quot;, ncol = 1) +
  labs(x = &amp;quot;Donor Procurement Rate&amp;quot;,
       y = &amp;quot;&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-17-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = by_country,
            mapping = aes(x = reorder(country,
                                      donors_mean), y = donors_mean))

p + geom_pointrange(mapping = aes(ymin = donors_mean - donors_sd,
                                  ymax = donors_mean + donors_sd)) +
  labs(x = &amp;quot;&amp;quot;, y = &amp;quot;Donor Procurement Rate&amp;quot;) + coord_flip()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-18-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;h3&gt;Plot Text Directly&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = by_country,
            mapping = aes(x = roads_mean,
                          y = donors_mean))
p + geom_point() + geom_text(mapping = aes(label = country))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-19-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = by_country,
            mapping = aes(x = roads_mean,
                          y = donors_mean))
p + geom_point() + geom_text(mapping = aes(label = country), hjust = 0)&lt;/code&gt;&lt;/pre&gt;
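&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-20-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The ggrepel package usually labels points better than &lt;code&gt;geom_text()&lt;/code&gt;: its &lt;code&gt;geom_text_repel()&lt;/code&gt; pushes labels away from each other and from the points so that they do not overlap.&lt;/p&gt;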
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggrepel)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p_title &amp;lt;-
  &amp;quot;Presidential Elections: Popular &amp;amp; Electoral College Margins&amp;quot;
p_subtitle &amp;lt;- &amp;quot;1824-2016&amp;quot;
p_caption &amp;lt;- &amp;quot;Data for 2016 are provisional.&amp;quot;
x_label &amp;lt;- &amp;quot;Winner&amp;#39;s share of Popular Vote&amp;quot;
y_label &amp;lt;- &amp;quot;Winner&amp;#39;s share of Electoral College Votes&amp;quot;

p &amp;lt;- ggplot(elections_historic,
            aes(x = popular_pct, y = ec_pct,
                label = winner_label))

p + geom_hline(yintercept = 0.5,
               size = 1.4,
               color = &amp;quot;gray80&amp;quot;) +
  geom_vline(xintercept = 0.5,
             size = 1.4,
             color = &amp;quot;gray80&amp;quot;) +
  geom_point() +
  geom_text_repel() +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) +
  labs(
    x = x_label,
    y = y_label,
    title = p_title,
    subtitle = p_subtitle,
    caption = p_caption
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-22-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;h3&gt;Label Outliers&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = by_country,
            mapping = aes(x = gdp_mean, y = health_mean))

p + geom_point() +
  geom_text_repel(data = subset(by_country, gdp_mean &amp;gt; 25000),
                  mapping = aes(label = country))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-23-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = by_country,
            mapping = aes(x = gdp_mean, y = health_mean))

p + geom_point() +
  geom_text_repel(
    data = subset(
      by_country,
      gdp_mean &amp;gt; 25000 | health_mean &amp;lt; 1500 |
        country %in% &amp;quot;Belgium&amp;quot;
    ),
    mapping = aes(label = country)
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-23-2.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;organdata$ind &amp;lt;- organdata$ccode %in% c(&amp;quot;Ita&amp;quot;, &amp;quot;Spa&amp;quot;) &amp;amp;
  organdata$year &amp;gt; 1998

p &amp;lt;- ggplot(data = organdata,
            mapping = aes(x = roads,
                          y = donors, color = ind))
p + geom_point() +
  geom_text_repel(data = subset(organdata, ind),
                  mapping = aes(label = ccode)) +
  guides(label = FALSE, color = FALSE)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-24-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;div id=&#34;write-and-draw-in-the-plot-area&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Write and Draw in the Plot Area&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata, mapping = aes(x = roads, y = donors))
p + geom_point() + annotate(
  geom = &amp;quot;text&amp;quot;,
  x = 91,
  y = 33,
  label = &amp;quot;A surprisingly high \n recovery rate.&amp;quot;,
  hjust = 0
)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-25-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata,
            mapping = aes(x = roads, y = donors))
p + geom_point() +
  annotate(
    geom = &amp;quot;rect&amp;quot;,
    xmin = 125,
    xmax = 155,
    ymin = 30,
    ymax = 35,
    fill = &amp;quot;red&amp;quot;,
    alpha = 0.2
  ) +
  annotate(
    geom = &amp;quot;text&amp;quot;,
    x = 157,
    y = 33,
    label = &amp;quot;A surprisingly high \n recovery rate.&amp;quot;,
    hjust = 0
  )&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-26-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;scales-guides-and-themes&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Scales, Guides, and Themes&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata,
            mapping = aes(x = roads,
                          y = donors,
                          color = world))
p + geom_point()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-27-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata,
            mapping = aes(x = roads,
                          y = donors,
                          color = world))
p + geom_point() + scale_x_log10() + scale_y_continuous(breaks = c(5,
                                                                   15, 25),
                                                        labels = c(&amp;quot;Five&amp;quot;, &amp;quot;Fifteen&amp;quot;, &amp;quot;Twenty Five&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-28-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata,
            mapping = aes(x = roads, y = donors,
                          color = world))
p + geom_point() + scale_color_discrete(labels = c(&amp;quot;Corporatist&amp;quot;,
                                                   &amp;quot;Liberal&amp;quot;, &amp;quot;Social Democratic&amp;quot;, &amp;quot;Unclassified&amp;quot;)) + 
  labs(x = &amp;quot;Road Deaths&amp;quot;,
       y = &amp;quot;Donor Procurement&amp;quot;, color = &amp;quot;Welfare State&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-29-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;p &amp;lt;- ggplot(data = organdata,
            mapping = aes(x = roads, y = donors,
                          color = world))
p + geom_point() + labs(x = &amp;quot;Road Deaths&amp;quot;, y = &amp;quot;Donor Procurement&amp;quot;) +
  guides(color = FALSE)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: Removed 34 rows containing missing values (geom_point).&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;./post/2019-09-26-data-visualization-chapter-5_files/figure-html/unnamed-chunk-30-1.png&#34; width=&#34;768&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Meta-Analysis Note 1</title>
      <link>/post/meta-analysis-note-1/</link>
      <pubDate>Wed, 25 Sep 2019 00:00:00 +0000</pubDate>
      <guid>/post/meta-analysis-note-1/</guid>
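      <description>&lt;p&gt;The first chapter of the book mainly defines some terms and distinguishes meta-analysis from other kinds of literature review. Meta-analysis differs from qualitative summaries and from some quantitative approaches (informal vote counting, which generally summarizes conclusions by majority rule, and formal vote counting, which adds some statistical analysis on top of that in the hope of reaching a statistically significant conclusion) in that, beyond asking whether an effect exists, it focuses mainly on the size of the effect (effect size).&lt;/p&gt;
&lt;p&gt;The workflow of a meta-analysis has five stages:&lt;/p&gt;
&lt;h2 id=&#34;确定问题formulate-a-problem&#34;&gt;Formulate a Problem&lt;/h2&gt;
&lt;p&gt;When formulating the problem and starting the review, clarify what the focus is. For example, do you want a more general conclusion and sample, or a conclusion and sample with a restricted scope? This determines which literature is kept or dropped in the second and third stages.&lt;/p&gt;
&lt;h2 id=&#34;取得相关文献&#34;&gt;Obtain the Relevant Literature&lt;/h2&gt;
&lt;p&gt;Sample from as many sources as possible so that the literature is representative and unbiased. The latter is hard to achieve because academic journals prefer to publish papers with significant results, so do not forget unpublished working papers, theses, and the like.&lt;/p&gt;
&lt;h2 id=&#34;对文献进行评估精选&#34;&gt;Evaluate and Select the Literature&lt;/h2&gt;
&lt;p&gt;This stage assesses the relevance of the literature obtained in the previous stage. The research question settled in the first stage is refined further here.&lt;/p&gt;
&lt;h2 id=&#34;对文献进行分析和解释&#34;&gt;Analyze and Interpret the Literature&lt;/h2&gt;
&lt;p&gt;This is the hardest and most time-consuming stage, in which the data from the literature have to be organized and entered.&lt;/p&gt;
&lt;h2 id=&#34;文献综述的写作&#34;&gt;Write the Literature Review&lt;/h2&gt;
&lt;p&gt;A few points to keep in mind:&lt;/p&gt;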
&lt;ul&gt;
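&lt;li&gt;Describe the whole review process completely, and record every inclusion and exclusion decision made along the way&lt;/li&gt;
&lt;li&gt;The key is to answer the question of interest; if it cannot be answered, explain why, and what would need to be done in the future to answer it&lt;/li&gt;
&lt;li&gt;Avoid simply piling up a list of references&lt;/li&gt;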
&lt;/ul&gt;
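&lt;h2 id=&#34;作者的几点建议&#34;&gt;The Author’s Suggestions&lt;/h2&gt;
&lt;p&gt;Collecting, evaluating, organizing, and analyzing the literature takes a lot of time and energy, so planning is essential. Some specific suggestions:&lt;/p&gt;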
&lt;ul&gt;
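&lt;li&gt;Keep records while collecting the literature&lt;/li&gt;
&lt;li&gt;Store the literature systematically; if you collaborate with others, make sure everyone uses the same system to store and process it&lt;/li&gt;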
&lt;/ul&gt;
&lt;h2 id=&#34;我对于工具的建议&#34;&gt;My Suggestions on Tools&lt;/h2&gt;
&lt;p&gt;&lt;del&gt;Literature can be collected with the private group feature of &lt;a href=&#34;https://www.mendeley.com/&#34;&gt;Mendeley&lt;/a&gt;, so that members of the same group can open and annotate PDFs directly in the client. Mendeley limits the number of private groups for free users, but the limit can be worked around by adding subfolders under a group.&lt;/del&gt;&lt;/p&gt;
&lt;h2 id=&#34;最新的建议&#34;&gt;Latest Suggestion&lt;/h2&gt;
&lt;p&gt;Switch to &lt;a href=&#34;https://www.zotero.org/&#34;&gt;Zotero&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>SEM and GSEM</title>
      <link>/post/sem-and-gsem/</link>
      <pubDate>Wed, 25 Sep 2019 00:00:00 +0000</pubDate>
      <guid>/post/sem-and-gsem/</guid>
      <description>&lt;h2 id=&#34;sem&#34;&gt;SEM&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;sem bmi &amp;lt;- age children incomeln educ quickfood
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This would give us the unstandardized solution. This command uses &lt;strong&gt;maximum likelihood estimation&lt;/strong&gt; rather than the ordinary least-squares (OLS) estimation used by the &lt;code&gt;regress&lt;/code&gt; command. Add the &lt;code&gt;standardized&lt;/code&gt; option to get the standardized solution, just as you would add the &lt;code&gt;beta&lt;/code&gt; option to &lt;code&gt;regress&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Option &lt;code&gt;method(mlmv)&lt;/code&gt; (maximum likelihood with missing values):
Estimation is less robust to the assumption of multivariate normality when using the method(mlmv) option than when using maximum likelihood estimation with listwise deletion of observations with missing values. Because some of the five variables in our model are not normally distributed, the method(mlmv) option needs to be used with caution. The estimation performed when we use the method(mlmv) option also assumes that the missing values are MAR&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;. By contrast, when listwise deletion is used, we are assuming that missing values are MCAR&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;, and this is a much more restrictive assumption.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sem bmi &amp;lt;- age children incomeln educ quickfood, method(mlmv) standardized

estat eqgof
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The OLS regression solution and the SEM solution without MLMV, which uses listwise deletion, produce the same standardized parameter estimates and $R^2$ values. As noted, the z values are slightly larger than the t values, and the p-values are slightly smaller. The z tests for the SEM solution directly test the standardized solution. The t tests from &lt;code&gt;regress&lt;/code&gt; test the significance of the unstandardized B coefficients and do not directly test the significance of the Betas; the &lt;code&gt;regress&lt;/code&gt; command does not provide such a direct test.&lt;/p&gt;
&lt;p&gt;Notice that the $R^2$ using sem with method(mlmv) is actually slightly smaller. Using all the available information in the SEM solution with MLMV is not cheating if the assumptions are met. The &lt;strong&gt;MAR&lt;/strong&gt; assumption for the SEM solution is more realistic than the &lt;strong&gt;MCAR&lt;/strong&gt; assumption required for listwise deletion to be unbiased.&lt;/p&gt;
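&lt;p&gt;The difference between the two assumptions can be illustrated with a small simulation. The R sketch below is only an illustration on made-up data: the variables &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; and both missingness mechanisms are assumptions, not part of the model above. It shows that dropping incomplete observations leaves a simple summary such as the mean of &lt;code&gt;y&lt;/code&gt; roughly unbiased under MCAR, but not under MAR.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(123)
n &amp;lt;- 10000
x &amp;lt;- rnorm(n)
y &amp;lt;- x + rnorm(n)                      # true mean of y is 0

# MCAR: every y has the same 30% chance of being missing
y_mcar &amp;lt;- ifelse(runif(n) &amp;lt; 0.3, NA, y)
# MAR (assumed mechanism): missingness depends only on the observed x
y_mar &amp;lt;- ifelse(runif(n) &amp;lt; plogis(2 * x), NA, y)

mean(y_mcar, na.rm = TRUE)             # close to 0
mean(y_mar, na.rm = TRUE)              # clearly below 0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Under the MAR mechanism the observations with high &lt;code&gt;x&lt;/code&gt; (and therefore high &lt;code&gt;y&lt;/code&gt;) are dropped more often, so the complete-case mean is pulled downward.&lt;/p&gt;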
&lt;p&gt;There are three rules to follow when using the maximum likelihood with missing values estimation.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Generate an indicator variable for each variable in your model to reflect whether an observation has a missing value.&lt;/li&gt;
&lt;li&gt;Correlate potential auxiliary variables to see whether they predict missing value indicator variables.&lt;/li&gt;
&lt;li&gt;Include additional auxiliary variables that are substantially correlated with a person’s score on a variable that has missing values.&lt;/li&gt;
&lt;/ul&gt;
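&lt;p&gt;The first two rules can be sketched outside Stata as well. The following R snippet is a minimal illustration on simulated data, not the procedure from the book: the data frame &lt;code&gt;d&lt;/code&gt;, the auxiliary variable &lt;code&gt;aux&lt;/code&gt;, and the way missingness is generated are all assumptions made for the example. It builds a missing-value indicator for each model variable and then correlates the auxiliary variable with an indicator.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(1)
n &amp;lt;- 500
d &amp;lt;- data.frame(
  bmi  = rnorm(n, mean = 27, sd = 4),
  educ = rnorm(n, mean = 13, sd = 2),
  aux  = rnorm(n)                      # hypothetical auxiliary variable
)
# Assumed mechanism: bmi is more likely to be missing when aux is high
d$bmi[runif(n) &amp;lt; plogis(d$aux - 1)] &amp;lt;- NA

# Rule 1: one indicator per model variable (1 = missing, 0 = observed)
miss &amp;lt;- as.data.frame(lapply(d[c(&amp;quot;bmi&amp;quot;, &amp;quot;educ&amp;quot;)],
                             function(v) as.integer(is.na(v))))
names(miss) &amp;lt;- paste0(names(miss), &amp;quot;_miss&amp;quot;)

# Rule 2: does the auxiliary variable predict missingness on bmi?
cor(d$aux, miss$bmi_miss)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A clearly non-zero correlation suggests that the auxiliary variable carries information about why values are missing, which is the case in which including it helps.&lt;/p&gt;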
&lt;p&gt;Getting auxiliary variables into your SEM command: I have not understood this part yet.&lt;/p&gt;
&lt;h2 id=&#34;gsem&#34;&gt;GSEM&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;logit obese age children incomeln educ quickfood
listcoef
glm obese age children incomeln educ quickfood, family(binomial) link(logit)
glm, eform
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The logit command is a special application of the generalized linear model. We can obtain the same results by using the glm command. The glm command requires us to specify the family of our model, family(binomial), and the link function, link(logit). To obtain the odds ratio, we can replay these results by using glm, eform.&lt;/p&gt;
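&lt;p&gt;The same relationship can be sketched in R. This is an illustration on simulated data, not a reproduction of the Stata output: the data frame &lt;code&gt;d&lt;/code&gt; and the coefficients used to simulate &lt;code&gt;obese&lt;/code&gt; are assumptions. &lt;code&gt;glm()&lt;/code&gt; with a binomial family and logit link fits the logit model, and exponentiating the coefficients gives odds ratios, which is what &lt;code&gt;glm, eform&lt;/code&gt; displays.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(42)
n &amp;lt;- 1000
d &amp;lt;- data.frame(age = rnorm(n, 40, 10), quickfood = rpois(n, 2))
# Simulated outcome; the values -2, 0.02 and 0.3 are arbitrary assumptions
d$obese &amp;lt;- rbinom(n, 1, plogis(-2 + 0.02 * d$age + 0.3 * d$quickfood))

fit &amp;lt;- glm(obese ~ age + quickfood, data = d,
           family = binomial(link = &amp;quot;logit&amp;quot;))
coef(fit)        # coefficients on the log-odds scale
exp(coef(fit))   # odds ratios
&lt;/code&gt;&lt;/pre&gt;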
&lt;p&gt;I did not understand the rest; I will come back to it later.&lt;/p&gt;
&lt;section class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34; role=&#34;doc-endnote&#34;&gt;
&lt;p&gt;Missing at Random (MAR): this is where the unfortunate names come in. Missing at Random means the propensity for a data point to be missing is not related to the missing data, but it is related to some of the observed data. Missing Completely at Random (MCAR) is stricter: the propensity to be missing is related to neither the missing nor the observed data. &lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;
</description>
    </item>
    
    <item>
      <title>Panel data in R vs in Stata</title>
      <link>/post/panel-data-in-r-vs-in-stata/</link>
      <pubDate>Tue, 27 Aug 2019 00:00:00 +0000</pubDate>
      <guid>/post/panel-data-in-r-vs-in-stata/</guid>
      <description>&lt;h2 id=&#34;panel-data-with-one-way-fixed-effect&#34;&gt;Panel data with one way fixed effect&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-R&#34; data-lang=&#34;R&#34;&gt;mm1 &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; invforward &lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt; TOBINQ &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; inv &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; top3 &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; size &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; lev &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; cash &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; loss &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; lnage &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; cfo &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; sd &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; ic &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;factor&lt;/span&gt;(year)
zzz &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;plm&lt;/span&gt;(mm1,data&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;sample,model&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;within&amp;#34;&lt;/span&gt;,index&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;stkcd&amp;#34;&lt;/span&gt;))
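&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This is the same as Stata’s &lt;code&gt;xtreg&lt;/code&gt; with &lt;code&gt;i.year&lt;/code&gt; and the &lt;code&gt;fe&lt;/code&gt; option, without a robust vcetype.
The $R^2$ computed this way matches the within $R^2$ that Stata reports.&lt;/p&gt;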
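&lt;p&gt;What the within estimator does can be seen without either package. The sketch below uses base R on simulated data, and the names &lt;code&gt;firm&lt;/code&gt;, &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; are hypothetical. It assumes a balanced panel with a single regressor: OLS on variables demeaned by firm gives the same slope as OLS with a dummy for every firm, and the $R^2$ of the demeaned regression is the within $R^2$.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-R&#34; data-lang=&#34;R&#34;&gt;set.seed(7)
firm &amp;lt;- rep(1:20, each = 5)          # 20 hypothetical firms, 5 years each
x &amp;lt;- rnorm(100) + firm / 10
y &amp;lt;- 0.5 * x + firm + rnorm(100)     # firm-specific intercepts

# Within transformation: subtract each firm&amp;#39;s mean
x_w &amp;lt;- x - ave(x, firm)
y_w &amp;lt;- y - ave(y, firm)
fit_within &amp;lt;- lm(y_w ~ x_w - 1)

# Equivalent dummy-variable (LSDV) regression
fit_lsdv &amp;lt;- lm(y ~ x + factor(firm))

coef(fit_within)[&amp;#34;x_w&amp;#34;]            # slope from the demeaned data
coef(fit_lsdv)[&amp;#34;x&amp;#34;]                # same slope from the dummy regression
summary(fit_within)$r.squared        # within R-squared
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;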
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-R&#34; data-lang=&#34;R&#34;&gt;m1 &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; invforward &lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt; TOBINQ &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; inv &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; top3 &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; size &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; lev &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; cash &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; loss &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; lnage &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; cfo &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; sd &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; ic
zz &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;plm&lt;/span&gt;(m1,data&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;sample,model&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;within&amp;#34;&lt;/span&gt;,index&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;stkcd&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;year&amp;#34;&lt;/span&gt;),effect &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;twoways&amp;#34;&lt;/span&gt;)
&lt;span style=&#34;color:#a6e22e&#34;&gt;summary&lt;/span&gt;(zz)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Same as Stata&#39;s &lt;code&gt;xtreg ... i.year, fe&lt;/code&gt; without a robust vcetype, but the $R^2$ is smaller than the within $R^2$ reported by Stata.&lt;/p&gt;
&lt;h2 id=&#34;vcetype-robust&#34;&gt;vcetype robust&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-R&#34; data-lang=&#34;R&#34;&gt;zz_r &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;coeftest&lt;/span&gt;(zz, vcov.&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;function&lt;/span&gt;(x) &lt;span style=&#34;color:#a6e22e&#34;&gt;vcovHC&lt;/span&gt;(x, type&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;sss&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#75715e&#34;&gt;# same as stata xtreg i.year, fe r&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;# OR&lt;/span&gt;
zzz_r &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;coeftest&lt;/span&gt;(zzz, vcov.&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;function&lt;/span&gt;(x) &lt;span style=&#34;color:#a6e22e&#34;&gt;vcovHC&lt;/span&gt;(x, type&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;sss&amp;#34;&lt;/span&gt;))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;组间系数比较&#34;&gt;Comparing coefficients across groups&lt;/h2&gt;
&lt;p&gt;For OLS, the following works:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-R&#34; data-lang=&#34;R&#34;&gt;sur_diff &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt;  MVBV &lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt; (Dm &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; Dh &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; EBV &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; DmEBV &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;DhEBV)&lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;g_layer
h2t &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; h2 &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#a6e22e&#34;&gt;filter&lt;/span&gt;(g_layer &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;)&lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#a6e22e&#34;&gt;mutate&lt;/span&gt;(g_layer &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;ifelse&lt;/span&gt;(g_layer &lt;span style=&#34;color:#f92672&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;))
mm &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;lm&lt;/span&gt;(sur_diff,data&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;h2t)
ttt &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;coeftest&lt;/span&gt;(mm, vcov.&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;function&lt;/span&gt;(x) &lt;span style=&#34;color:#a6e22e&#34;&gt;vcovHC&lt;/span&gt;(x, cluster&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;group&amp;#34;&lt;/span&gt;, type&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;HC1&amp;#34;&lt;/span&gt;))

&lt;span style=&#34;color:#a6e22e&#34;&gt;stargazer&lt;/span&gt;(fpm,models_growth_layer,type &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;text&amp;#34;&lt;/span&gt;, column.labels &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; table4_label)
&lt;span style=&#34;color:#a6e22e&#34;&gt;stargazer&lt;/span&gt;(fpm_r,robusts_growth_layer,type &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;text&amp;#34;&lt;/span&gt;, column.labels &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; table4_label,
          add.lines&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;DhEBV(4)-(2)&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;str_c&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;round&lt;/span&gt;(ttt[12,&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;],&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;),&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;**(p=&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#a6e22e&#34;&gt;round&lt;/span&gt;(ttt[12,&lt;span style=&#34;color:#ae81ff&#34;&gt;4&lt;/span&gt;],&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;),&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;)&amp;#34;&lt;/span&gt;)))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This does not work for panel data! Neither one-way nor two-way fixed effects work.
Adding the interaction terms directly is recommended instead.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Difference in Difference</title>
      <link>/post/difference-in-difference/</link>
      <pubDate>Wed, 10 Jul 2019 00:00:00 +0000</pubDate>
      <guid>/post/difference-in-difference/</guid>
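      <description>&lt;h2 id=&#34;效應評估模型&#34;&gt;Effect evaluation model&lt;/h2&gt;

&lt;p&gt;“Does raising the minimum wage reduce employment?”&lt;/p&gt;

&lt;p&gt;“Does a minimum wage increase reduce the number of full-time employees at restaurants?”&lt;/p&gt;

&lt;p&gt;Let $MinWage$ be a dummy variable indicating that the minimum wage was raised, and let $FEmp$ be the number of full-time employees at a restaurant.&lt;/p&gt;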

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
FEmp_i=FEmp_{0,i}+\beta^*MinWage_i
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
FEmp_i=\beta_0+\beta_1 MinWage_i+\epsilon_i
\]&lt;/span&gt;&lt;/p&gt;

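&lt;p&gt;OLS is a consistent estimator when $FEmp_{0,i}$, the number of employees a restaurant would have without the minimum wage increase, is unrelated to whether the restaurant is affected by the increase.&lt;/p&gt;

&lt;p&gt;Let $s$ denote the state a restaurant belongs to. The effect model can then be written as: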
&lt;span  class=&#34;math&#34;&gt;\(
\begin{eqnarray}
FEmp_{is}=FEmp_{0,is}+\beta^*MinWage_{s}
\tag{7.1}
\end{eqnarray}
\)&lt;/span&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Pre&lt;/th&gt;
&lt;th&gt;Post&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Control&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;$MinWage=0$:PA&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Treatment&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;$MinWage=1$:NJ&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&#34;複迴歸模型&#34;&gt;Multiple regression model&lt;/h2&gt;

&lt;p&gt;The type of restaurant (large chain, coffee shop, snack bar, and so on) affects how many staff it employs.
&lt;span  class=&#34;math&#34;&gt;\(
\begin{eqnarray}
FEmp_{is} =FEmp_{0,-type,is}+\beta^*MinWage_s+\gamma&#39;type_{is}
\tag{7.2}
\end{eqnarray}
\)&lt;/span&gt;
where
&lt;span  class=&#34;math&#34;&gt;\(
FEmp_{0,-type,is}=FEmp_{0,is}-\mathbb{E}(FEmp_{0,is}|type_{is})
\)&lt;/span&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When thinking about omitted variable bias, every potential confounder must be considered at the aggregate level (aggregated by treatment/control group).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&#34;固定效果&#34;&gt;Fixed effects&lt;/h2&gt;

&lt;h3 id=&#34;組固定效果&#34;&gt;Group fixed effects&lt;/h3&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
FEmp_{is}=FEmp_{0,is}+\beta^*MinWage_{s}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;In most cases the treatment and control groups already differ in their characteristics before the policy is implemented, that is,
&lt;span  class=&#34;math&#34;&gt;\(
FEmp_{0,is}=FEmp_{0,-\alpha_s,is}+\alpha_s
\)&lt;/span&gt;
where $\alpha_s$ is the group-specific confounder effect.&lt;/p&gt;

&lt;p&gt;If there are no other confounders, we can estimate the following regression model:
&lt;span  class=&#34;math&#34;&gt;\(
FEmp_{ist}=\alpha_s+\beta^* MinWage_{st}+\epsilon_{ist}
\)&lt;/span&gt;&lt;/p&gt;

&lt;h3 id=&#34;時間固定效果&#34;&gt;Time fixed effects&lt;/h3&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
FEmp_{ist}=FEmp_{0,-(\alpha_s,\delta_t),ist}+\alpha_s+\delta_t+\beta^*MinWage_{st}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;The corresponding regression model is:
&lt;span  class=&#34;math&#34;&gt;\(
FEmp_{ist}=\alpha_s+\delta_t+\beta^* MinWage_{st}+\epsilon_{ist}
\)&lt;/span&gt;&lt;/p&gt;

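&lt;h3 id=&#34;資料追踪不追踪&#34;&gt;Tracked vs. untracked data&lt;/h3&gt;

&lt;p&gt;Although $FEmp_{ist}$ is measured at the level of the individual restaurant (subscript $i$), the fixed effects only go down to the group level (subscript $s$). Estimation therefore does not require tracking the same restaurants over time: the restaurants sampled in each period may differ.&lt;/p&gt;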

&lt;h2 id=&#34;did-估计法&#34;&gt;DiD estimation&lt;/h2&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{eqnarray}
FEmp_{ist}=\alpha_s+\delta_t+\beta^*MinWage_{st}+\epsilon_{ist}
\tag{7.3}
\end{eqnarray}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
FEmp_{ist}=\beta_0+\alpha_1D1_s+\delta_1B1_t+\beta_1MinWage_{st}+\epsilon_{ist}
\]&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$D1=1$ is a dummy variable indicating the first state (NJ).&lt;/li&gt;
&lt;li&gt;$B1 = 1$ is a dummy variable indicating the period after the policy takes effect.&lt;/li&gt;
&lt;li&gt;$MinWage_{st}=D1_s\times B1_t$&lt;/li&gt;
&lt;/ul&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;State&lt;/th&gt;
&lt;th&gt;t=0&lt;/th&gt;
&lt;th&gt;t=1&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;NJ&lt;/td&gt;
&lt;td&gt;D1=1,B1=0&lt;/td&gt;
&lt;td&gt;D1=1,B1=1&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;PA&lt;/td&gt;
&lt;td&gt;D1=0,B1=0&lt;/td&gt;
&lt;td&gt;D1=0,B1=1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
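
&lt;p&gt;As a worked check of why $\beta_1$ is a difference in differences, take conditional means of the dummy-variable regression above in each cell of the table, assuming $\mathbb{E}(\epsilon_{ist}|D1_s,B1_t)=0$ (an assumption added here for illustration):&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
\mathbb{E}(FEmp_{ist}|NJ,t=0) &amp;= \beta_0+\alpha_1\\
\mathbb{E}(FEmp_{ist}|NJ,t=1) &amp;= \beta_0+\alpha_1+\delta_1+\beta_1\\
\mathbb{E}(FEmp_{ist}|PA,t=0) &amp;= \beta_0\\
\mathbb{E}(FEmp_{ist}|PA,t=1) &amp;= \beta_0+\delta_1
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Subtracting the change in PA from the change in NJ removes both the state effect $\alpha_1$ and the common time effect $\delta_1$:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\left[(\beta_0+\alpha_1+\delta_1+\beta_1)-(\beta_0+\alpha_1)\right]-\left[(\beta_0+\delta_1)-\beta_0\right]=(\delta_1+\beta_1)-\delta_1=\beta_1
\]&lt;/span&gt;&lt;/p&gt;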

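&lt;h2 id=&#34;cluster-standard-error&#34;&gt;Clustered standard errors&lt;/h2&gt;

&lt;p&gt;There are four groups of error terms, G1 to G4, whose variances and cross-group covariances need attention. When the error terms may be clustered, the standard errors of the estimator must be adjusted accordingly.&lt;/p&gt;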
</description>
    </item>
    
    <item>
      <title>Panel Data</title>
      <link>/post/panel-data/</link>
      <pubDate>Wed, 10 Jul 2019 00:00:00 +0000</pubDate>
      <guid>/post/panel-data/</guid>
      <description>&lt;h2 id=&#34;效應評估模型&#34;&gt;Effect evaluation model&lt;/h2&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall=mrall_{-BeerTax}+\beta^*BeerTax
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Does raising the beer tax (BeerTax) help reduce the traffic fatality rate (mrall)?&lt;/p&gt;

&lt;h2 id=&#34;固定效應模型&#34;&gt;Fixed effects model&lt;/h2&gt;

&lt;p&gt;Let $W$ denote how much a state likes to drink.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$W$ is related to $mrall_{-BeerTax}$&lt;/li&gt;
&lt;li&gt;$W$ is related to $BeerTax$&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall=(mrall_{-BT}-\mathbb{E}(mrall_{-BT}|W))+\mathbb{E}(mrall_{-BT}|W) + \beta^*BeerTax
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall_{-BT,-W}\equiv mrall_{-BT}-\mathbb{E}(mrall_{-BT}|W)
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall=mrall_{-BT,-W}+\mathbb{E}(mrall_{-BT}|W)+\beta^*BeerTax
\]&lt;/span&gt;&lt;/p&gt;

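&lt;p&gt;$mrall_{-BT,-W}$ is the part of traffic fatalities not caused by the beer tax, with the influence of $W$ removed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It is unrelated to $W$.&lt;/li&gt;
&lt;li&gt;If two observations share the same drinking culture, i.e. the same $W$, their $\mathbb{E}(mrall_{-BT}|W)$ is the same.&lt;/li&gt;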
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Assume that a place&#39;s drinking culture does not change over time, i.e. the same state has the same $W$ at different points in time.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let &lt;span  class=&#34;math&#34;&gt;\(\mathbb{E}(mrall_{-BT,it}|W_i)=\alpha_i\)&lt;/span&gt;. The effect model can then be written as:
&lt;span  class=&#34;math&#34;&gt;\(
mrall_{it}=mrall_{-BT,-W,it}+\alpha_i+\beta^*BeerTax_{it}
\)&lt;/span&gt;
where $\alpha_i$ is the fixed effect of state $i$:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$BeerTax$ is unrelated to $mrall_{-BT,-W}$&lt;/li&gt;
&lt;li&gt;$BeerTax$ is related to $\alpha$&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;組內差異最小平方法&#34;&gt;Within-group least squares&lt;/h2&gt;

&lt;p&gt;Differencing before running OLS gets around the fact that $\alpha_i$ is unobservable:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall_{i1}-mrall_{i0}=\beta^* (BeerTax_{i1}-BeerTax_{i0})+(mrall_{-BT,-W,i1}-mrall_{-BT,-W,i0})
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;If there are more than two periods, use the within-group mean as the reference point for differencing.&lt;/p&gt;

&lt;p&gt;That is, &lt;span  class=&#34;math&#34;&gt;\(x_1-\bar{x},x_2-\bar{x},...,x_n-\bar{x}, \bar{x}=\sum_{i=1}^n x_i/n\)&lt;/span&gt;
&lt;span  class=&#34;math&#34;&gt;\(
\bar{mrall}_i=\sum_{t=1}^T mrall_{it}/T \\
\bar{BeerTax}_i=\sum_{t=1}^T BeerTax_{it}/T\\
\bar{mrall}_{-BT,-W,i}=\sum_{t=1}^T mrall_{-BT,-W,it}/T
\)&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall_{it}-\bar{mrall}_i=\beta^*\left( BeerTax_{it}-\bar{BeerTax}_i\right)+(mrall_{-BT,-W,it}-\bar{mrall}_{-BT,-W,i})
\]&lt;/span&gt;&lt;/p&gt;
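
&lt;p&gt;To see where the demeaned equation comes from, average the effect model over $t$ for a given state, using the fact that $\alpha_i$ does not vary with $t$, and subtract the result from the original equation. A short sketch in the notation above:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
mrall_{it} &amp;= mrall_{-BT,-W,it}+\alpha_i+\beta^*BeerTax_{it}\\
\bar{mrall}_i &amp;= \bar{mrall}_{-BT,-W,i}+\alpha_i+\beta^*\bar{BeerTax}_i\\
mrall_{it}-\bar{mrall}_i &amp;= (mrall_{-BT,-W,it}-\bar{mrall}_{-BT,-W,i})+(\alpha_i-\alpha_i)+\beta^*(BeerTax_{it}-\bar{BeerTax}_i)
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;The fixed effect cancels in the last line, so it never has to be observed or estimated.&lt;/p&gt;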

&lt;p&gt;Under the fixed effects model, we can estimate the following regression by least squares:
&lt;span  class=&#34;math&#34;&gt;\(
mrall_{it}-\bar{mrall}_i=\beta_0+\beta_1\left( BeerTax_{it}-\bar{BeerTax}_i\right)+\epsilon_{it}
\)&lt;/span&gt;
where $\hat{\beta}_1$ is a consistent estimate of $\beta^*$.&lt;/p&gt;

&lt;h2 id=&#34;常見的固定效果模型&#34;&gt;Common fixed effects models&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Identity fixed effect: $\alpha_i$&lt;/li&gt;
&lt;li&gt;Time fixed effect: $\delta_t$&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall_{-BT,it}=mrall_{-BT,-W_i,-Z_t}+\alpha_i+\delta_t
\]&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
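&lt;li&gt;$W_i$ is a variable that biases the estimate of the effect coefficient; it is constant within each $i$.&lt;/li&gt;
&lt;li&gt;$Z_t$ is a variable that biases the estimate of the effect coefficient; it is constant within each $t$.&lt;/li&gt;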
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;For example, $Z_t$ could be the state of the overall US economy.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The corresponding regression model:
&lt;span  class=&#34;math&#34;&gt;\(
mrall_{it}=\alpha_i+\delta_t+\beta_1 BeerTax_{it}+\epsilon_{it}
\)&lt;/span&gt;&lt;/p&gt;

&lt;h2 id=&#34;廣義的固定效果模型&#34;&gt;Generalized fixed effects model&lt;/h2&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall=mrall_{-BeerTax}+\beta^*BeerTax
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;But
&lt;span  class=&#34;math&#34;&gt;\(
\begin{equation}
  mrall_{-BT,it}\not\perp BeerTax_{it}
  \tag{5.1}
\end{equation}
\)&lt;/span&gt;&lt;/p&gt;

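&lt;h3 id=&#34;複迴歸控制&#34;&gt;Multiple regression controls&lt;/h3&gt;

&lt;p&gt;First consider which variables give rise to (5.1); in statistics these variables are called confounders. Confounders for which data are available (call them $Z$) can be used to extend the model to: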
&lt;span  class=&#34;math&#34;&gt;\(
mrall_{it}=mrall_{-BT,-Z,it}+\beta^*BeerTax_{it}+\gamma&#39;Z_{it}
\)&lt;/span&gt;
where:
&lt;span  class=&#34;math&#34;&gt;\(
mrall_{-BT,-Z}=mrall_{-BT}-\mathbb{E}(mrall_{-BT}|Z)
\)&lt;/span&gt;&lt;/p&gt;

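&lt;h3 id=&#34;固定效果模型&#34;&gt;Fixed effects model&lt;/h3&gt;

&lt;p&gt;Confounders that have no data but are fixed along some dimension are assumed to fall into two categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$W_i$: fixed within the same identity.&lt;/li&gt;
&lt;li&gt;$V_t$: fixed within the same time period.&lt;/li&gt;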
&lt;/ul&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{eqnarray}
mrall_{it}=mrall_{-BT,-(Z,W,V),it}+\beta^*BeerTax_{it}+\\
\alpha_i+\delta_t+\gamma&#39;Z_{it}
\tag{5.2}
\end{eqnarray}
\]&lt;/span&gt;&lt;/p&gt;

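&lt;p&gt;(5.2) is a fairly general fixed effects model of the effect: it has fixed effects along two dimensions plus control variables.&lt;/p&gt;

&lt;h2 id=&#34;隨機效果模型&#34;&gt;Random effects model&lt;/h2&gt;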

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
mrall_{it}=mrall_{-BT,-Z,it}+\beta^*BeerTax_{it}+\gamma&#39;Z_{it}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Setup of the random effects (RE) model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use the regression model:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{eqnarray}
  mrall_{it}=\beta_0+\beta_{1}BeerTax_{it}+\gamma&#39;Z_{it}+\nu_{it}
  \tag{5.3}
\end{eqnarray}
\]&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Assume that $\nu_{it}$ has a certain structure: $\nu_{it}=\alpha_i+\epsilon_{it}$.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;with the assumptions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$\nu_{it}\perp BeerTax_{it}$&lt;/li&gt;
&lt;li&gt;&lt;span  class=&#34;math&#34;&gt;\(var(\alpha_i|X)=\sigma_{\alpha}^2\)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;$var(\epsilon_{it}|X)=\sigma^2$&lt;/li&gt;
&lt;li&gt;$cov(\epsilon_{it},\epsilon_{is}|X)=0$ for $t\neq s$&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;The random effects model relies on strong assumptions about the error term, so it is not recommended.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&#34;hausman檢定&#34;&gt;Hausman test&lt;/h2&gt;

&lt;h3 id=&#34;固定效果模型fe&#34;&gt;Fixed effects model (FE)&lt;/h3&gt;

&lt;p&gt;This means estimating &lt;span  class=&#34;math&#34;&gt;\(\beta_1\)&lt;/span&gt; in the following regression model by within-group least squares:
&lt;span  class=&#34;math&#34;&gt;\(
mrall_{it}=\beta_0+\beta_{1}BeerTax_{it}+\gamma&#39;Z_{it}+\alpha_i+\epsilon_{it}
\)&lt;/span&gt;&lt;/p&gt;

&lt;h3 id=&#34;隨機效果模型re&#34;&gt;Random effects model (RE)&lt;/h3&gt;

&lt;p&gt;This means estimating &lt;span  class=&#34;math&#34;&gt;\(\beta_1\)&lt;/span&gt; in the following regression model by GLS:
&lt;span  class=&#34;math&#34;&gt;\(
mrall_{it}=\beta_0+\beta_{1}BeerTax_{it}+\gamma&#39;Z_{it}+\nu_{it}
\)&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;span  class=&#34;math&#34;&gt;\(\nu_{it}=\alpha_i+\epsilon_{it}\)&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Assume:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All of the variance and covariance assumptions of RE hold.&lt;/li&gt;
&lt;li&gt;&lt;span  class=&#34;math&#34;&gt;\(\epsilon_{it} \perp BeerTax_{it} | \alpha_i,Z_{it}\)&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;H0:&lt;/strong&gt; &lt;span  class=&#34;math&#34;&gt;\(\alpha_i \perp BeerTax_{it} |Z_{it}\)&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Under H0, use RE; if H0 is rejected, use FE.&lt;/p&gt;
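
&lt;p&gt;For reference, the statistic usually used for this comparison is sketched below. The symbols $\hat{\beta}_{FE}$, $\hat{\beta}_{RE}$ (the two coefficient estimates) and $k$ (the number of coefficients compared) are notation introduced here and do not appear in the text above:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
H=(\hat{\beta}_{FE}-\hat{\beta}_{RE})&#39;\left[\widehat{Var}(\hat{\beta}_{FE})-\widehat{Var}(\hat{\beta}_{RE})\right]^{-1}(\hat{\beta}_{FE}-\hat{\beta}_{RE})\ \xrightarrow{d}\ \chi^2_k \quad \text{under H0}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Under H0 both estimators are consistent and RE is the efficient one, so the two estimates should be close and $H$ small. A large $H$ is evidence against H0 and therefore in favour of FE.&lt;/p&gt;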
</description>
    </item>
    
    <item>
      <title>Linear Regression</title>
      <link>/post/linear-regression/</link>
      <pubDate>Thu, 04 Jul 2019 00:00:00 +0000</pubDate>
      <guid>/post/linear-regression/</guid>
      <description>&lt;h2 id=&#34;ols-estimator&#34;&gt;OLS estimator&lt;/h2&gt;

&lt;p&gt;The method to compute (or &lt;em&gt;estimate&lt;/em&gt;) $b_0$ and $b_1$ we illustrated above is called &lt;em&gt;Ordinary Least Squares&lt;/em&gt;, or OLS. $b_0$ and $b_1$ are therefore also often called the &lt;em&gt;OLS coefficients&lt;/em&gt;. By solving problem&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
e_i &amp; = y_i - \hat{y}_i = y_i - \underbrace{\left(b_0 + b_1 x_i\right)}_\text{prediction}\\
e_1^2 + \dots + e_N^2 &amp;= \sum_{i=1}^N e_i^2 \equiv \text{SSR}(b_0,b_1) \\
(b_0,b_1) &amp;= \arg \min_{\text{int},\text{slope}} \sum_{i=1}^N \left[y_i - \left(\text{int} + \text{slope } x_i\right)\right]^2 
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;one can derive an explicit formula for them:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\(
\begin{equation}
b_1 = \frac{cov(x,y)}{var(x)}
\end{equation}
\)&lt;/span&gt;
i.e. the estimate of the slope coefficient is the covariance between $x$ and $y$ divided by the variance of $x$, both computed from our sample of data. With $b_1$ in hand, we can get the estimate for the intercept as&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[\begin{equation}
b_0 = \bar{y} - b_1 \bar{x}
\end{equation}\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;where $\bar{z}$ denotes the sample mean of variable $z$. The interpretation of the OLS slope coefficient $b_1$ is as follows. Given a line as in $y = b_0 + b_1 x$,&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$b_1 = \frac{d y}{d x}$ measures the change in $y$ resulting from a one unit change in $x$&lt;/li&gt;
&lt;li&gt;For example, if $y$ is wage and $x$ is years of education, $b_1$ would measure the effect of an additional year of education on wages.&lt;/li&gt;
&lt;/ul&gt;
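
&lt;p&gt;The explicit formulas above can be derived from the first-order conditions of the minimisation problem. A short sketch, setting the partial derivatives of $\text{SSR}(b_0,b_1)$ to zero:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
\frac{\partial\, \text{SSR}}{\partial b_0} &amp;= -2\sum_{i=1}^N \left(y_i - b_0 - b_1 x_i\right) = 0 \quad\Rightarrow\quad b_0 = \bar{y} - b_1 \bar{x}\\
\frac{\partial\, \text{SSR}}{\partial b_1} &amp;= -2\sum_{i=1}^N x_i\left(y_i - b_0 - b_1 x_i\right) = 0
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Substituting $b_0$ into the second condition, and using $\sum_i \bar{x}(y_i-\bar{y})=0$ and $\sum_i \bar{x}(x_i-\bar{x})=0$, gives:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
b_1 = \frac{\sum_{i=1}^N x_i (y_i - \bar{y})}{\sum_{i=1}^N x_i (x_i - \bar{x})} = \frac{\sum_{i=1}^N (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^N (x_i - \bar{x})^2} = \frac{cov(x,y)}{var(x)}
\]&lt;/span&gt;&lt;/p&gt;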

&lt;p&gt;There is an alternative representation for the OLS slope coefficient which relates to the &lt;em&gt;correlation coefficient&lt;/em&gt; $r$. Remember that $r = \frac{cov(x,y)}{s_x s_y}$, where $s_z$ is the standard deviation of variable $z$. With this in hand, we can derive the OLS slope coefficient as&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
b_1 &amp;= \frac{cov(x,y)}{var(x)}\\
&amp;= \frac{cov(x,y)}{s_x s_x} \\
&amp;= r\frac{s_y}{s_x}
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;In other words, the slope coefficient is equal to the correlation coefficient $r$ times the ratio of standard deviations of $y$ and $x$.&lt;/p&gt;

&lt;h3 id=&#34;linear-regression-without-regressor&#34;&gt;Linear Regression without Regressor&lt;/h3&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{equation}
y = b_0
\end{equation}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;This means that our minimization problem becomes very simple: We only have to choose $b_0$! We have&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\(
b_0 = \arg\min_{\text{int}} \sum_{i=1}^N \left[y_i - \text{int}\right]^2,
\)&lt;/span&gt;
which is a quadratic equation with a unique optimum such that
&lt;span  class=&#34;math&#34;&gt;\(
b_0 = \frac{1}{N} \sum_{i=1}^N y_i = \overline{y}.
\)&lt;/span&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Least Squares &lt;strong&gt;without regressor&lt;/strong&gt; $x$ estimates the sample mean of the outcome variable $y$, i.e. it produces $\overline{y}$.&lt;/p&gt;
&lt;/blockquote&gt;
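
&lt;p&gt;The step from the minimisation problem to the sample mean is a one-line first-order condition. A short sketch:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\frac{d}{d\,\text{int}} \sum_{i=1}^N \left[y_i - \text{int}\right]^2 = -2\sum_{i=1}^N \left(y_i - \text{int}\right) = 0 \quad\Rightarrow\quad N\cdot\text{int} = \sum_{i=1}^N y_i \quad\Rightarrow\quad b_0 = \bar{y}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;The second derivative is $2N &amp;gt; 0$, so this stationary point is the unique minimum.&lt;/p&gt;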

&lt;h3 id=&#34;regression-without-an-intercept&#34;&gt;Regression without an Intercept&lt;/h3&gt;

&lt;p&gt;We follow the same logic here, except that this time we drop the intercept from our initial equation, and the minimisation problem becomes:
&lt;span  class=&#34;math&#34;&gt;\(
\begin{align}
b_1 &amp;= \arg\min_{\text{slope}} \sum_{i=1}^N \left[y_i - \text{slope } x_i \right]^2\\
\mapsto b_1 &amp;= \frac{\frac{1}{N}\sum_{i=1}^N x_i y_i}{\frac{1}{N}\sum_{i=1}^N x_i^2} = \frac{\overline{xy}}{\overline{x^2}} 
\end{align}
\)&lt;/span&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Least Squares &lt;strong&gt;without intercept&lt;/strong&gt; (i.e. with $b_0=0$) is a line that passes through the origin.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In this case we only get to choose the slope $b_1$ of this anchored line.&lt;sup class=&#34;footnote-ref&#34; id=&#34;fnref:fn1&#34;&gt;&lt;a class=&#34;footnote&#34; href=&#34;#fn:fn1&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h3 id=&#34;centering-a-regression&#34;&gt;Centering A Regression&lt;/h3&gt;

&lt;p&gt;By &lt;em&gt;centering&lt;/em&gt; or &lt;em&gt;demeaning&lt;/em&gt; a regression, we mean to subtract from both $y$ and $x$ their respective averages to obtain $\tilde{y}_i = y_i - \bar{y}$ and $\tilde{x}_i = x_i - \bar{x}$. We then run a regression &lt;em&gt;without intercept&lt;/em&gt; as above. That is, we use $\tilde{x}_i,\tilde{y}_i$ instead of $x_i,y_i$ in&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
b_1 &amp;= \arg\min_{\text{slope}} \sum_{i=1}^N \left[y_i - \text{slope } x_i \right]^2\\
\mapsto b_1 &amp;= \frac{\frac{1}{N}\sum_{i=1}^N x_i y_i}{\frac{1}{N}\sum_{i=1}^N x_i^2} = \frac{\overline{xy}}{\overline{x^2}} 
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;to obtain our slope estimate &lt;span  class=&#34;math&#34;&gt;\(b_1\)&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
b_1 &amp;= \frac{\frac{1}{N}\sum_{i=1}^N \tilde{x}_i \tilde{y}_i}{\frac{1}{N}\sum_{i=1}^N \tilde{x}_i^2}\\
&amp;= \frac{\frac{1}{N}\sum_{i=1}^N (x_i - \bar{x}) (y_i - \bar{y})}{\frac{1}{N}\sum_{i=1}^N (x_i - \bar{x})^2} \\
&amp;= \frac{cov(x,y)}{var(x)}
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;This last expression is &lt;em&gt;identical&lt;/em&gt; to the standard OLS estimate of the slope coefficient derived earlier! We note the following:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Adding a constant to a regression produces the same slope estimate as centering all variables and estimating without intercept. So, unless all variables are centered, &lt;strong&gt;always&lt;/strong&gt; include an intercept in the regression.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3 id=&#34;reg-standard&#34;&gt;Standardizing A Regression&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Standardizing&lt;/em&gt; a variable $z$ means to demean as above, but in addition to divide the demeaned value by its own standard deviation. Similarly to what we did above for &lt;em&gt;centering&lt;/em&gt;, we define transformed variables $\breve{y}_i = \frac{y_i-\bar{y}}{\sigma_y}$ and $\breve{x}_i = \frac{x_i-\bar{x}}{\sigma_x}$ where $\sigma_z$ is the standard deviation of variable $z$. From here on, you should by now be used to what comes next! As above, we use $\breve{x}_i,\breve{y}_i$ instead of $x_i,y_i$:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
b_1 &amp;= \frac{\frac{1}{N}\sum_{i=1}^N \breve{x}_i \breve{y}_i}{\frac{1}{N}\sum_{i=1}^N \breve{x}_i^2}\\
&amp;= \frac{\frac{1}{N}\sum_{i=1}^N \frac{x_i - \bar{x}}{\sigma_x} \frac{y_i - \bar{y}}{\sigma_y}}{\frac{1}{N}\sum_{i=1}^N \left(\frac{x_i - \bar{x}}{\sigma_x}\right)^2} \\
&amp;= \frac{Cov(x,y)}{\sigma_x \sigma_y} \\
&amp;= Corr(x,y)
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;After we standardize both $y$ and $x$, the slope coefficient $b_1$ in the regression without intercept is equal to the &lt;strong&gt;correlation coefficient&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&#34;pred-resids&#34;&gt;Predictions and Residuals&lt;/h2&gt;

&lt;p&gt;Now we want to ask how our residuals $e_i$ relate to the prediction $\hat{y_i}$. Let us first think about the average of all predictions &lt;span  class=&#34;math&#34;&gt;\(\hat{y_i}\)&lt;/span&gt;, i.e. the number &lt;span  class=&#34;math&#34;&gt;\(\frac{1}{N} \sum_{i=1}^N \hat{y_i}\)&lt;/span&gt;. Let&#39;s just take&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{equation}
\hat{y}_i = b_0 + b_1 x_i 
\end{equation}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;and plug this into this average, so that we get&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
\frac{1}{N} \sum_{i=1}^N \hat{y_i} &amp;= \frac{1}{N} \sum_{i=1}^N \left(b_0 + b_1 x_i\right) \\
&amp;= b_0 + b_1  \frac{1}{N} \sum_{i=1}^N x_i \\
&amp;= b_0 + b_1  \bar{x}
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;But that last line is just equal to the formula for the OLS intercept  $b_0 = \bar{y} - b_1 \bar{x}$! That means of course that&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\(
\frac{1}{N} \sum_{i=1}^N \hat{y_i}  = b_0 + b_1  \bar{x} = \bar{y}
\)&lt;/span&gt;
in other words:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The average of our predictions $\hat{y_i}$ is identically equal to the mean of the outcome $y$. This implies that the average of the residuals is equal to zero.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Related to this result, we can show that the prediction $\hat{y}$ and the residuals are &lt;em&gt;uncorrelated&lt;/em&gt;, something that is often called &lt;strong&gt;orthogonality&lt;/strong&gt; between $\hat{y}_i$ and $e_i$. We would write this as&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
Cov(\hat{y},e) &amp;=\frac{1}{N} \sum_{i=1}^N (\hat{y}_i-\bar{y})(e_i-\bar{e}) =   \frac{1}{N} \sum_{i=1}^N (\hat{y}_i-\bar{y})e_i \\
&amp;=  \frac{1}{N} \sum_{i=1}^N \hat{y}_i e_i-\bar{y} \frac{1}{N} \sum_{i=1}^N e_i = 0
\end{align}
\]&lt;/span&gt;&lt;/p&gt;
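
&lt;p&gt;The final equality relies on the two first-order conditions of the least squares problem, $\sum_{i} e_i = 0$ and $\sum_{i} x_i e_i = 0$. A short sketch of that step:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\sum_{i=1}^N \hat{y}_i e_i = \sum_{i=1}^N \left(b_0 + b_1 x_i\right) e_i = b_0 \sum_{i=1}^N e_i + b_1 \sum_{i=1}^N x_i e_i = 0
\]&lt;/span&gt;&lt;/p&gt;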

&lt;h2 id=&#34;correlation-covariance-and-linearity&#34;&gt;Correlation, Covariance and Linearity&lt;/h2&gt;

&lt;p&gt;It is important to keep in mind that Correlation and Covariance relate to a &lt;em&gt;linear&lt;/em&gt; relationship between &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt;. Given how the regression line is estimated by OLS (see just above), you can see that the regression line inherits this property from the Covariance.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Always &lt;strong&gt;visually inspect&lt;/strong&gt; your data, and don&#39;t rely exclusively on summary statistics like &lt;em&gt;mean, variance, correlation and regression line&lt;/em&gt;. Correlation and the regression line in particular only capture a &lt;strong&gt;linear&lt;/strong&gt; relationship between the variables in your data.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&#34;analysing-vary&#34;&gt;Analysing $Var(y)$&lt;/h2&gt;

&lt;p&gt;Analysis of Variance (ANOVA) refers to a method to decompose variation in one variable as a function of several others. We can use this idea on our outcome $y$. Suppose we wanted to know the variance of $y$, keeping in mind that, by definition, $y_i = \hat{y}_i + e_i$. We would write&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}Var(y) &amp;= Var(\hat{y} + e)\\ &amp;= Var(\hat{y}) + Var(e) + 2 Cov(\hat{y},e)\\ &amp;= Var(\hat{y}) + Var(e) \end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;We have seen that the covariance between prediction $\hat{y}$ and error $e$ is zero, that&#39;s why we have $Cov(\hat{y},e)=0$. What this tells us in words is that we can decompose the variance in the observed outcome $y$ into a part that relates to variance as &lt;em&gt;explained by the model&lt;/em&gt; and a part that comes from unexplained variation. Finally, we know the definition of &lt;em&gt;variance&lt;/em&gt;, and can thus write down the respective formulae for each part:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[Var(y) = \frac{1}{N}\sum_{i=1}^N (y_i - \bar{y})^2\]&lt;/span&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\(Var(\hat{y}) = \frac{1}{N}\sum_{i=1}^N (\hat{y_i} - \bar{y})^2\)&lt;/span&gt;, because the mean of $\hat{y}$ is $\bar{y}$ as we know.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Finally, &lt;span  class=&#34;math&#34;&gt;\(Var(e) = \frac{1}{N}\sum_{i=1}^N e_i^2\)&lt;/span&gt;, because the mean of $e$ is zero.
We can thus formulate how the total variation in outcome $y$ is apportioned between model and unexplained variation:&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;The total variation in outcome $y$ (often called SST, or &lt;em&gt;total sum of squares&lt;/em&gt;) is equal to the sum of explained squares (SSE) plus the sum of squared residuals (SSR). We have thus &lt;strong&gt;SST = SSE + SSR&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&#34;assessing-the-goodness-of-fit&#34;&gt;Assessing the &lt;em&gt;Goodness of Fit&lt;/em&gt;&lt;/h2&gt;

&lt;p&gt;In our setup, there exists a convenient measure for how good a particular statistical model fits the data. It is called $R^2$ (&lt;em&gt;R squared&lt;/em&gt;), also called the &lt;em&gt;coefficient of determination&lt;/em&gt;. We make use of the just introduced decomposition of variance, and write the formula as&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{equation}R^2 = \frac{\text{variance explained}}{\text{total variance}} = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}\in[0,1]  \end{equation}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;It is easy to see that a &lt;em&gt;good fit&lt;/em&gt; is one where the sum of &lt;em&gt;explained&lt;/em&gt; squares (SSE) is large relative to the total variation (SST). In such a case, we observe an $R^2$ close to one. In the opposite case, we will see an $R^2$ close to zero. Notice that a small $R^2$ does not imply that the model is useless, just that it explains a small fraction of the observed variation.&lt;/p&gt;
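
&lt;p&gt;As a small worked example with made-up data (not taken from the text), consider $N=3$ observations $x=(1,2,3)$ and $y=(1,2,2)$, so that $\bar{x}=2$ and $\bar{y}=5/3$. Applying the formulas above, with SST, SSE and SSR computed as sums of squares:&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{align}
b_1 &amp;= \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2} = \frac{1}{2}, \qquad b_0 = \bar{y}-b_1\bar{x} = \frac{5}{3}-1 = \frac{2}{3}\\
\hat{y} &amp;= \left(\frac{7}{6},\frac{5}{3},\frac{13}{6}\right), \qquad e = \left(-\frac{1}{6},\frac{1}{3},-\frac{1}{6}\right)\\
\text{SST} &amp;= \frac{2}{3}, \qquad \text{SSE} = \frac{1}{2}, \qquad \text{SSR} = \frac{1}{6}\\
R^2 &amp;= \frac{\text{SSE}}{\text{SST}} = 1-\frac{\text{SSR}}{\text{SST}} = \frac{3}{4}
\end{align}
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;The residuals sum to zero and $\text{SSE}+\text{SSR}=\frac{1}{2}+\frac{1}{6}=\frac{2}{3}=\text{SST}$, as the decomposition requires.&lt;/p&gt;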
&lt;div class=&#34;footnotes&#34;&gt;

&lt;hr&gt;

&lt;ol&gt;
&lt;li id=&#34;fn:fn1&#34;&gt;This slope is related to the angle between vectors $\mathbf{a} =(\overline{x},\overline{y})$, and $\mathbf{b} = (\overline{x},0)$. Hence, it&#39;s related to the &lt;a href=&#34;https://en.wikipedia.org/wiki/Scalar_projection&#34;&gt;scalar projection&lt;/a&gt; of $\mathbf{a}$ on $\mathbf{b}$.
 &lt;a class=&#34;footnote-return&#34; href=&#34;#fnref:fn1&#34;&gt;&lt;sup&gt;^&lt;/sup&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Instrumental Variables</title>
      <link>/post/iv/</link>
      <pubDate>Thu, 04 Jul 2019 00:00:00 +0000</pubDate>
      <guid>/post/iv/</guid>
      <description>&lt;h2 id=&#34;效應評估模型&#34;&gt;Effect Evaluation Model&lt;/h2&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[Y_{i}={Y}_{-P,i}+\beta_i P_{i}\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
Y_i=Y_{-P,i}+\beta^* P_i
\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{equation}
Y_i=\beta_0+\beta_1P_i+w_i&#39;\gamma+\varepsilon_i
\tag{3.2}
\end{equation}
\]&lt;/span&gt;&lt;/p&gt;

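&lt;p&gt;Conditional on $w_{i}$, the cigarette price $P_{i}$ must be independent of the non-price component of cigarette sales $Y_{-P}$, that is, &lt;span  class=&#34;math&#34;&gt;\(P_i\perp Y_{-P,i} | w_i\)&lt;/span&gt;. An equivalent statement: the cigarette price $P_{i}$ must be independent of the non-price component of cigarette sales once $w_{i}$ is controlled for.&lt;/p&gt;

&lt;p&gt;Decompose $Y_{-P}$ conditional on $rincome$: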
&lt;span  class=&#34;math&#34;&gt;\(
\begin{equation}
Y_{i}=Y_{-P,i}-\mathbb{E}(Y_{-P,i}|rincome_{i})+\beta^{*}P_{i}+\mathbb{E}(Y_{-P,i}|rincome_{i})
\tag{3.3}
\end{equation}
\)&lt;/span&gt;&lt;/p&gt;

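&lt;p&gt;Split the data into groups according to the value of the conditioning variables $w_{i}$, and look at the slope between cigarette price $P_{i}$ and cigarette sales $Y_{i}$ within each group. If $w_{i}$ is well chosen, the association between $P_{i}$ and $Y_{i}$ within a group reflects the true effect slope. $Y_{-P,i}$ may still disturb the slope we observe through $Y_{i}$, but because $Y_{-P,i}$ is no longer related to $P_{i}$, these disturbances cancel out in large samples and the true effect slope is recovered.&lt;/p&gt;

&lt;p&gt;If no choice of $w_{i}$ can control the disturbance that $Y_{-P,i}$ causes in the association between $P_{i}$ and $Y_{i}$, we have to transform the data to remove that disturbance from the raw data directly. The two most common approaches are instrumental variables and fixed-effects models for panel data.&lt;/p&gt;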

&lt;ul&gt;
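&lt;li&gt;Instrumental variables: use an instrument to &lt;strong&gt;keep&lt;/strong&gt; the part of $P_{i}$ that is &lt;strong&gt;not&lt;/strong&gt; correlated with $Y_{-P,i}$.&lt;/li&gt;
&lt;li&gt;Panel data: use a variable transformation to &lt;strong&gt;remove&lt;/strong&gt; the part of $P_{i}$ that &lt;strong&gt;is&lt;/strong&gt; correlated with $Y_{-P,i}$.&lt;/li&gt;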
&lt;/ul&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
Y_i=Y_{-P,i}+\beta\mathbb{E}(P_i|z_i)+\beta (P_i-\mathbb{E}(P_i|z_i))
\]&lt;/span&gt;&lt;/p&gt;

&lt;h3 id=&#34;relevance-condition&#34;&gt;Relevance condition&lt;/h3&gt;

&lt;p&gt;$\mathbb{E}(P|z)\neq \text{constant}$, that is, $z$ has explanatory power for $P$.&lt;/p&gt;

&lt;h3 id=&#34;exclusion-condition&#34;&gt;Exclusion condition&lt;/h3&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\(Y_{-P,i}+\beta(P_i-\mathbb{E}(P_i|z_i))\)&lt;/span&gt; is unrelated to &lt;span  class=&#34;math&#34;&gt;\(z_{i}\)&lt;/span&gt;.&lt;/p&gt;

&lt;h2 id=&#34;三个假设&#34;&gt;Three Assumptions&lt;/h2&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
\begin{equation}
Y_i=\beta_0+\beta_1 P_i + \gamma_1 rincome_i + \epsilon_i
\tag{3.5}
\end{equation}
\]&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Q1: Does my instrument satisfy the exclusion condition (also called the exogeneity condition)?&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Is the cigarette tax unrelated to the non-price component of sales, conditional on the controls?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[
Y =\underset{(\times k)}{X}\beta+\underset{(\times p)}{W}\gamma +\epsilon
\]&lt;/span&gt;&lt;/p&gt;

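&lt;p&gt;Here $X$ is the set of variables whose effect we want to evaluate and $W$ is the set of control variables, so $\epsilon$ is the value of $Y$ net of the effect of $X$, conditional on $W$. In addition, we have found a set of instruments $\underset{(\times m)}{Z}$, and we want to test:&lt;/p&gt;

&lt;p&gt;$H_{0}$: the instruments $Z$ are unrelated to the regression error term $\epsilon$.&lt;/p&gt;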

&lt;ol&gt;
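&lt;li&gt;Run TSLS and obtain the residuals &lt;span  class=&#34;math&#34;&gt;\( \hat{\epsilon}_{_{TSLS}}=Y-\hat{Y}_{TSLS} \)&lt;/span&gt;.&lt;/li&gt;
&lt;li&gt;Regress &lt;span  class=&#34;math&#34;&gt;\( \hat{\epsilon}_{_{TSLS}} \)&lt;/span&gt; on the full instrument set (that is, $Z$ and $W$) and jointly test that all coefficients on $Z$ are zero. Compute the test statistic $J=mF\sim\chi^{2}(m-k)$, where $F$ is the F statistic from that joint test.&lt;/li&gt;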
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;The test has $m-k$ degrees of freedom, so $m$ must be &lt;strong&gt;greater than&lt;/strong&gt; $k$. When $m$ equals $k$, the test cannot be carried out.&lt;/p&gt;
&lt;/blockquote&gt;
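
&lt;p&gt;In Stata, the two steps above are bundled into a postestimation command. The following is only a sketch: the variable names (&lt;code&gt;packs&lt;/code&gt;, &lt;code&gt;price&lt;/code&gt;, &lt;code&gt;rincome&lt;/code&gt;, &lt;code&gt;salestax&lt;/code&gt;, &lt;code&gt;cigtax&lt;/code&gt;) are hypothetical placeholders, and it assumes one endogenous regressor with two instruments so that $m&amp;gt;k$.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Hypothetical variable names; k = 1 endogenous regressor, m = 2 instruments
ivregress 2sls packs rincome (price = salestax cigtax)
* Test of overidentifying restrictions (H0: the instruments are exogenous)
estat overid&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After &lt;code&gt;ivregress 2sls&lt;/code&gt;, &lt;code&gt;estat overid&lt;/code&gt; reports Sargan and Basmann chi-squared statistics rather than the $J=mF$ form written above, but they test the same null hypothesis.&lt;/p&gt;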

&lt;ul&gt;
&lt;li&gt;Q2: Is my instrument strongly enough related to the regressor?&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Is the cigarette tax really strongly related to the price?&lt;/p&gt;
&lt;/blockquote&gt;

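&lt;p&gt;The instruments $Z$ must be sufficiently strongly related to the explanatory variables of interest $X$; otherwise the large-sample asymptotic distribution of &lt;span  class=&#34;math&#34;&gt;\(\hat{\beta}_{_{TSLS}}\)&lt;/span&gt; will not be normal.&lt;/p&gt;

&lt;p&gt;Consider the first-stage regression in TSLS, $X=Z\alpha_z+W\alpha_w+u$. We want $\alpha_z$ to be jointly significant by a wide margin.&lt;/p&gt;

&lt;p&gt;Testing rule&lt;/p&gt;

&lt;p&gt;$H_0$: the instruments $Z$ are only weakly relevant.&lt;/p&gt;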

&lt;ol&gt;
&lt;li&gt;Regress $X$ on the full instrument set ($Z$, $W$) and run the joint F test of $\alpha_z=0$.&lt;/li&gt;
&lt;li&gt;Reject $H_0$ if $F&amp;gt;10$.&lt;/li&gt;
&lt;/ol&gt;
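
&lt;p&gt;A sketch of this check in Stata, with hypothetical variable names: &lt;code&gt;estat firststage&lt;/code&gt; is run after &lt;code&gt;ivregress&lt;/code&gt; and reports the first-stage F statistic for the excluded instruments, which can be compared with the rule-of-thumb value of 10.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Hypothetical variable names
ivregress 2sls packs rincome (price = salestax cigtax)
* First-stage summary, including the F statistic on the instruments
estat firststage&lt;/code&gt;&lt;/pre&gt;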

&lt;ul&gt;
&lt;li&gt;Q3: Is my worry about omitted variable bias (OVB) unnecessary?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Perhaps there is no need for instrumental variables at all: under regression model &lt;a href=&#34;https://bookdown.org/tpemartin/econometric_analysis/iv.html#eq:ch3-test&#34;&gt;(3.5)&lt;/a&gt;, $P$ is already unrelated to $\epsilon$ (the non-price component of sales, conditional on the controls), so &lt;a href=&#34;https://bookdown.org/tpemartin/econometric_analysis/iv.html#eq:ch3-test&#34;&gt;(3.5)&lt;/a&gt; can simply be estimated by ordinary least squares.
&lt;span  class=&#34;math&#34;&gt;\(
\begin{equation}
Y   =X\beta+W\gamma +\epsilon
\tag{3.6}
\end{equation}
\)&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;$H_0 $: the estimate of the $\beta$ coefficients in regression model &lt;a href=&#34;https://bookdown.org/tpemartin/econometric_analysis/iv.html#eq:ch3-model71&#34;&gt;(3.6)&lt;/a&gt; does not suffer from OVB, so either OLS or TSLS can be used: in large samples, &lt;span  class=&#34;math&#34;&gt;\(\hat{\beta}_{OLS}\approx\hat{\beta}_{TSLS}\)&lt;/span&gt;.&lt;/p&gt;

&lt;p&gt;$H_1 $: the estimate of the $\beta$ coefficients in regression model &lt;a href=&#34;https://bookdown.org/tpemartin/econometric_analysis/iv.html#eq:ch3-model71&#34;&gt;(3.6)&lt;/a&gt; does suffer from OVB, so only TSLS can be used: in large samples, &lt;span  class=&#34;math&#34;&gt;\(\hat{\beta}_{OLS}\neq \hat{\beta}_{TSLS}\)&lt;/span&gt;.&lt;/p&gt;

&lt;p&gt;Hausman test statistic:
&lt;span  class=&#34;math&#34;&gt;\(
H\equiv\left(\hat{\beta}_{IV}-\hat{\beta}_{OLS}\right)^{&#39;}\left[V(\hat{\beta}_{IV}-\hat{\beta}_{OLS})\right]^{-1}\left(\hat{\beta}_{IV}-\hat{\beta}_{OLS}\right)\sim\chi_{(df)}^{2}.
\)&lt;/span&gt;
Here df is the number of $\beta$ coefficients.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reject $H_0$ only when $H&amp;gt;\chi_{(df)}^{2}(\alpha)$.&lt;/li&gt;
&lt;/ul&gt;
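
&lt;p&gt;A sketch of the comparison in Stata, with hypothetical variable names. Two routes are shown: the built-in endogeneity test after &lt;code&gt;ivregress&lt;/code&gt;, and the &lt;code&gt;hausman&lt;/code&gt; command applied to stored TSLS and OLS estimates. Depending on the data, &lt;code&gt;hausman&lt;/code&gt; may need its &lt;code&gt;sigmamore&lt;/code&gt; option for the variance difference to be well behaved.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Hypothetical variable names
ivregress 2sls packs rincome (price = salestax)
estimates store tsls
* Durbin and Wu-Hausman tests of H0: price is exogenous
estat endogenous
regress packs rincome price
estimates store ols
* Hausman test: consistent estimator first, efficient estimator second
hausman tsls ols&lt;/code&gt;&lt;/pre&gt;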
</description>
    </item>
    
    <item>
      <title>Ghost Blog Workflow</title>
      <link>/post/ghost-blog-workflow/</link>
      <pubDate>Wed, 26 Jun 2019 00:00:00 +0000</pubDate>
      <guid>/post/ghost-blog-workflow/</guid>
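      <description>&lt;p&gt;Update, Sep 25, 2019: this workflow is not ideal, so I have switched to managing the blog with Blogdown and Git, and I am still figuring that out.&lt;/p&gt;
&lt;p&gt;&lt;del&gt;I finally have Ghost mostly set up, so from now on I should take proper notes. Things I read in the past were all forgotten after a while, which is frustrating.&lt;/del&gt;&lt;/p&gt;
&lt;h2 id=&#34;目前的workflow如下&#34;&gt;Current Workflow&lt;/h2&gt;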
&lt;ul&gt;
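&lt;li&gt;Keep drafts in the Draft directory under Synology Drive&lt;/li&gt;
&lt;li&gt;Write Markdown in Typora and save&lt;/li&gt;
&lt;li&gt;Save one copy each to Leanote and Evernote; this should be doable through &lt;a href=&#34;https://ifttt.com/&#34;&gt;IFTTT&lt;/a&gt;, to be looked into later&lt;/li&gt;
&lt;li&gt;Another option is to run &lt;code&gt;git init&lt;/code&gt; in the Draft directory and push to &lt;a href=&#34;https://github.com&#34;&gt;GitHub&lt;/a&gt; as a backup.&lt;/li&gt;
&lt;li&gt;Save to Ghost and publish&lt;/li&gt;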
&lt;/ul&gt;
&lt;h2 id=&#34;需要的代码注入&#34;&gt;Required Code Injection&lt;/h2&gt;
&lt;h3 id=&#34;公式&#34;&gt;Math formulas&lt;/h3&gt;
&lt;p&gt;Paste the following script into &lt;code&gt;Post Header&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-js&#34; data-lang=&#34;js&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;script&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;text/javascript&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;src&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://cdn.bootcss.com/mathjax/2.7.3/latest.js?config=TeX-AMS-MML_HTMLorMML&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;s&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;c&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;r&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;i&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;p&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;t&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt; &lt;/span&gt;
&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;script&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;text/x-mathjax-config&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;  
    &lt;span style=&#34;color:#a6e22e&#34;&gt;MathJax&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;Hub&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;Config&lt;/span&gt;({
        &lt;span style=&#34;color:#a6e22e&#34;&gt;tex2jax&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
            &lt;span style=&#34;color:#a6e22e&#34;&gt;inlineMath&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [[&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;$$&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;$$&amp;#39;&lt;/span&gt;], [&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;\\\\(&amp;#39;&lt;/span&gt;,&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;\\\\)&amp;#39;&lt;/span&gt;]],
            &lt;span style=&#34;color:#a6e22e&#34;&gt;processEscapes&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;true&lt;/span&gt;
        }
    });
&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;s&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;c&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;r&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;i&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;p&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;t&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt; &lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;语法高亮&#34;&gt;Syntax highlighting&lt;/h3&gt;
&lt;p&gt;Paste the following stylesheet link into &lt;code&gt;Post Header&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-js&#34; data-lang=&#34;js&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;link&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;rel&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;stylesheet&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;href&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://cdnjs.cloudflare.com/ajax/libs/prism/1.16.0/themes/prism-tomorrow.css&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Paste the following scripts into &lt;code&gt;Post Footer&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-js&#34; data-lang=&#34;js&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;script&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;src&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://cdnjs.cloudflare.com/ajax/libs/prism/1.16.0/prism.min.js&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;s&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;c&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;r&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;i&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;p&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;t&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;script&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;src&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://cdnjs.cloudflare.com/ajax/libs/prism/1.16.0/components/prism-python.js&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;s&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;c&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;r&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;i&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;p&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;t&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;script&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;src&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://cdnjs.cloudflare.com/ajax/libs/prism/1.16.0/components/prism-r.js&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;s&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;c&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;r&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;i&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;p&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;t&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;script&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;src&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://cdnjs.cloudflare.com/ajax/libs/prism/1.16.0/components/prism-sas.js&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;s&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;c&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;r&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;i&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;p&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;t&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;script&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;src&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://cdnjs.cloudflare.com/ajax/libs/prism/1.16.0/components/prism-bash.js&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;s&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;c&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;r&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;i&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;p&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;t&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;a href=&#34;https://prismjs.com/&#34;&gt;prism.js&lt;/a&gt; does not support Stata, so I make do with what is available. Which &lt;a href=&#34;https://cdnjs.com/libraries/prism&#34;&gt;components&lt;/a&gt; to load depends on what the post needs.&lt;/p&gt;
&lt;h2 id=&#34;需要注意的地方&#34;&gt;Things to Note&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Ghost cannot generate a TOC from H1 headings, so start from H2&lt;/li&gt;
&lt;li&gt;Some formulas in Markdown need &lt;a href=&#34;https://blog.yhong.wang/gong-shi/&#34;&gt;escaping&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Logistic Regression</title>
      <link>/post/logistic-regression/</link>
      <pubDate>Wed, 26 Jun 2019 00:00:00 +0000</pubDate>
      <guid>/post/logistic-regression/</guid>
      <description>&lt;h2 id=&#34;odds-ratios&#34;&gt;Odds ratios&lt;/h2&gt;

&lt;p&gt;An &lt;a href=&#34;https://en.wikipedia.org/wiki/Odds_ratio&#34;&gt;odds ratio&lt;/a&gt; of 1.0 is equivalent to a beta weight of 0.0.&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Group&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Diseased&lt;/th&gt;
&lt;th align=&#34;center&#34;&gt;Healthy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Exposed&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;$D_E$&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;$H_E$&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Not exposed&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;$D_N$&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;$H_N$&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;$OR={\frac {D_{E}/H_{E}}{D_{N}/H_{N}}}$&lt;/p&gt;

&lt;p&gt;The distribution of the odds ratio is far from normal. Taking the natural logarithm of the odds ratio gives a quantity whose distribution is much closer to normal.&lt;/p&gt;

&lt;p&gt;$logit = ln(OR)$&lt;/p&gt;
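
&lt;p&gt;A small worked example with made-up counts (not from any study): take $D_E = 20$, $H_E = 80$, $D_N = 10$ and $H_N = 90$. Then&lt;/p&gt;

&lt;p&gt;$OR = \frac{20/80}{10/90} = \frac{20 \times 90}{80 \times 10} = 2.25, \qquad \ln(OR) = \ln(2.25) \approx 0.81$&lt;/p&gt;

&lt;p&gt;An odds ratio above 1 therefore maps to a positive log odds ratio, and an odds ratio of exactly 1 maps to 0.&lt;/p&gt;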

&lt;p&gt;When the mean of the outcome is around 0.50, OLS regression and logistic regression produce similar results, but when the probability is close to 0 or 1, logistic regression is especially important.&lt;/p&gt;

&lt;h2 id=&#34;logistic-regression&#34;&gt;Logistic regression&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;logit&lt;/code&gt; command gives the regression coefficients to estimate the logit score. The &lt;code&gt;logistic&lt;/code&gt; command gives us the odds ratios we need to interpret the effect size of the predictors.&lt;/p&gt;

&lt;p&gt;Both commands give the same results, except that &lt;code&gt;logit&lt;/code&gt; gives the coefficients for estimating the &lt;strong&gt;logit score&lt;/strong&gt; and &lt;code&gt;logistic&lt;/code&gt; gives the &lt;strong&gt;odds ratios&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The McFadden pseudo-$R^2$ represents how much larger the log likelihood is for the final solution than for the intercept-only model. A value of 0.02, for example, means that the log likelihood for the fitted model is 2% larger than the log likelihood for the intercept-only model.
This is not explained variance. The pseudo-$R^2$ is often a small value, and many researchers do not report it. The biggest mistake is to report it and interpret it as explained variance.&lt;/p&gt;

&lt;p&gt;If you are interested in specific effects of individual variables, it is better to rely on odds ratios for interpreting results of logistic regression. &lt;del&gt;This shows that mothers who smoke have 2.02 times greater odds of having a low-birthweight child.&lt;/del&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Odds ratios&lt;/strong&gt; tell us what happens to the odds of an outcome, whereas &lt;strong&gt;risk ratios&lt;/strong&gt; tell us what happens to their probability.&lt;/p&gt;

&lt;p&gt;For binary predictor variables, you can interpret the odds ratios and percentages directly. For variables that are not binary, you need some other standard. One solution is to compare specific examples, such as having no dinners with the family versus having seven dinners with them each week. Another solution is to evaluate the effect of a 1-standard-deviation change. The &lt;code&gt;listcoef&lt;/code&gt; command, from the &lt;code&gt;spost13&lt;/code&gt; package, helps here: after a logit or logistic regression, run &lt;code&gt;listcoef, help&lt;/code&gt; or &lt;code&gt;listcoef, help percent&lt;/code&gt;.&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Group&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Experimental (E)&lt;/th&gt;
&lt;th align=&#34;center&#34;&gt;Control (C)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Events (E)&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;EE&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;CE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Non-events (N)&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;EN&lt;/td&gt;
&lt;td align=&#34;center&#34;&gt;CN&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

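&lt;p&gt;$ RR={\frac {EE/(EE+EN)}{CE/(CE+CN)}}={\frac {EE(CE+CN)}{CE(EE+EN)}}. $&lt;/p&gt;

&lt;p&gt;The relative risk (risk ratio) describes the risk of an event occurring under exposure to some condition, relative to the risk without that exposure (Stata command: &lt;code&gt;oddsrisk&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;$OR={\frac {EE/CE}{EN/CN}}={\frac {EE\cdot CN}{EN\cdot CE}}$&lt;/p&gt;

&lt;p&gt;The odds of an event are the ratio of the event occurring to the event not occurring.&lt;/p&gt;

&lt;p&gt;The risk ratio is different from the odds ratio, although it asymptotically approaches it for small probabilities of outcomes. If EE is substantially smaller than EN, then EE/(EE + EN) $ \scriptstyle \approx $ EE/EN. Similarly, if CE is much smaller than CN, then CE/(CN + CE) $ \scriptstyle \approx $ CE/CN. Thus:&lt;/p&gt;

&lt;p&gt;$ RR={\frac {EE(CE+CN)}{CE(EE+EN)}}\approx {\frac {EE\cdot CN}{EN\cdot CE}}=OR. $&lt;/p&gt;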
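
&lt;p&gt;To see the approximation at work, here are two sets of made-up counts (purely illustrative). With a rare outcome, $EE=10$, $EN=990$, $CE=5$, $CN=995$:&lt;/p&gt;

&lt;p&gt;$ RR={\frac {10/1000}{5/1000}}=2, \qquad OR={\frac {10\cdot 995}{990\cdot 5}}\approx 2.01 $&lt;/p&gt;

&lt;p&gt;With a common outcome, $EE=600$, $EN=400$, $CE=300$, $CN=700$:&lt;/p&gt;

&lt;p&gt;$ RR={\frac {600/1000}{300/1000}}=2, \qquad OR={\frac {600\cdot 700}{400\cdot 300}}=3.5 $&lt;/p&gt;

&lt;p&gt;The risk ratio is 2 in both cases, but the odds ratio stays close to it only when the outcome is rare.&lt;/p&gt;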

&lt;p&gt;The difference is small with a rare outcome. The relative risk is appealing, but it should not be used in a study that controls the number of people in each category.&lt;/p&gt;

&lt;h2 id=&#34;hypothesis-testing&#34;&gt;Hypothesis testing&lt;/h2&gt;

&lt;p&gt;The model chi-squared test, which has $k$ degrees of freedom, tells us only that the overall model has at least one significant predictor.&lt;/p&gt;

&lt;h3 id=&#34;testing-individual-coefficients&#34;&gt;Testing individual coefficients&lt;/h3&gt;

&lt;p&gt;The z statistic in the Stata output is actually the square root of the Wald chi-squared statistic.&lt;/p&gt;

&lt;p&gt;The likelihood-ratio chi-squared test for each parameter estimate is based on comparing two logistic models, one with the individual variable we want to test included and one without it. The likelihood-ratio test is the difference in the likelihood-ratio chi-squared values for these two models (this appears as LR chi2(1) near the upper right corner of the output). The difference between the two likelihood-ratio chi-squared values has 1 degree of freedom.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;use nlsy97_chapter11, clear
* Restricted model: leaves out age97, the variable being tested
logistic drank30 male dinner97 pdrink97
estimates store a
* Full model: adds age97
logistic drank30 age97 male dinner97 pdrink97
* lrtest subtracts the chi-squared values and reports the p-value of the difference
lrtest a&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or just use &lt;code&gt;lrdrop1&lt;/code&gt;&lt;/p&gt;

&lt;h3 id=&#34;testing-sets-of-coefficients&#34;&gt;Testing sets of coefficients&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;* Wald test that both coefficients are zero (run after fitting the full model)
test pdrink97 dinner97
* Likelihood-ratio version of the same test, fitted on the same sample:
logistic drank30 age97 male if !mi(dinner97) &amp; !mi(pdrink97)
estimates store a
logistic drank30 age97 male pdrink97 dinner97
lrtest a
lrdrop1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This overall test only tells us that at least one of them is significant.&lt;/p&gt;

&lt;h2 id=&#34;margins&#34;&gt;Margins&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;logit drank30 age97 i.black pdrink97 dinner97
margins, dydx(black) atmeans
margins black, atmeans
margins, at(pdrink97=(1 2 3 4 5)) atmeans
marginsplot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can run the logistic regression using the &lt;code&gt;i.&lt;/code&gt; prefix for this categorical variable, &lt;code&gt;i.black&lt;/code&gt;. This produces the same results for the logistic regression as if we had simply used &lt;code&gt;black&lt;/code&gt;, but the results will work properly if we follow this command with other postestimation commands such as &lt;code&gt;margins&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&#34;nested-logistic-regressions&#34;&gt;Nested logistic regressions&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;nestreg&lt;/code&gt; command is extremely general, applicable across a variety of regression models, including logistic, negative binomial, Poisson, probit, ordered logistic, tobit, and others. It also works with the complex sample designs for many regression models.&lt;/p&gt;
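&lt;p&gt;A minimal sketch of the syntax, reusing the variables from the examples above; the grouping into two blocks is only an illustrative assumption. Each parenthesized block is added in turn, and &lt;code&gt;nestreg&lt;/code&gt; reports a test for each block as it enters.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Block 1: demographics; block 2: family and peer variables (illustrative grouping)
nestreg: logistic drank30 (age97 male) (dinner97 pdrink97)&lt;/code&gt;&lt;/pre&gt;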

&lt;h2 id=&#34;power-analysis&#34;&gt;Power analysis&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;powerlog, p1(.70) p2(.75) alpha(.05)
powerlog, p1(.70) p2(.75) alpha(.05) rsq(.30) help&lt;/code&gt;&lt;/pre&gt;</description>
    </item>
    
    <item>
      <title>Measurement, reliability, and validity</title>
      <link>/post/measurement-reliability-and-validity/</link>
      <pubDate>Wed, 26 Jun 2019 00:00:00 +0000</pubDate>
      <guid>/post/measurement-reliability-and-validity/</guid>
      <description>&lt;h2 id=&#34;constructing-a-scale&#34;&gt;Constructing a Scale&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;recode empathy2 empathy4 empathy5 (1=5 &#34;Does not describe very well&#34;) ///
  (2=4) (3=3) (4=2) (5=1 &#34;Describes very well&#34;), pre(rev) label(empathy)
egen empathy = rowmean(empathy1 revempathy2 empathy3 revempathy4 ///
  revempathy5 empathy6 empathy7)
egen miss = rowmiss(empathy1 revempathy2 empathy3 revempathy4 ///
   revempathy5 empathy6 empathy7) 
egen empathya = rowmean(empathy1 revempathy2 empathy3 revempathy4 ///
   revempathy5 empathy6 empathy7) if miss &lt; 3&lt;/code&gt;&lt;/pre&gt;
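&lt;p&gt;One drawback to using the &lt;code&gt;rowmean()&lt;/code&gt; function is that it simply adds the scores on the items a person answers and divides by the number of items answered, so a respondent who answered only one or two items still receives a scale score. The &lt;code&gt;if miss &amp;lt; 3&lt;/code&gt; restriction above limits the scale to respondents who skipped fewer than three items.&lt;/p&gt;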

&lt;h2 id=&#34;reliability&#34;&gt;Reliability&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Stability&lt;/strong&gt; means that if you measure a variable today using a particular scale and then measure it again tomorrow using the same scale, your results will be consistent (correlation $r$ from &lt;code&gt;pwcorr&lt;/code&gt;, or the intraclass correlation $\rho_I$).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Equivalence&lt;/strong&gt; means that you have two measures of the same variable and they produce consistent results (correlation $r_{xx}$). A low correlation means either that the measure is not reliable or that the measures are not truly equivalent.&lt;/li&gt;
&lt;li&gt;A reliable test is &lt;strong&gt;internally consistent&lt;/strong&gt; if the score for the first half of the items is highly correlated with the score for the second half of the items (split-half correlation &lt;span  class=&#34;math&#34;&gt;\(r_{x_Ax_B}\)&lt;/span&gt;, or alpha, &lt;span  class=&#34;math&#34;&gt;\(\alpha\)&lt;/span&gt;). In general, an $\alpha&amp;gt;0.8$ is considered good reliability, and many researchers feel an $\alpha&amp;gt;0.7$ is adequate reliability. Alpha can be read as the share of true-score variance, &lt;span  class=&#34;math&#34;&gt;\(\alpha=\sigma^2_{True}/(\sigma^2_{True}+\sigma^2_{error})\)&lt;/span&gt;. However, for this interpretation to be used, we need to assume that the scale is valid.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;alpha empathy1 revempathy2 empathy3 revempathy4 revempathy5 ///
empathy6 empathy7, asis item min(5)&lt;/code&gt;
The &lt;code&gt;asis&lt;/code&gt; (as is) option means that we do not want Stata to change the signs of any of our variables.
The bottom row of the output table, &lt;em&gt;Test scale&lt;/em&gt;, reports the $\alpha$ for the scale (0.7462). Above this value is the $\alpha$ we would obtain if we dropped each item, one at a time. The &lt;em&gt;item-test correlation&lt;/em&gt; column reports the correlation of each item with the total score of the seven items. The &lt;em&gt;item-rest correlation&lt;/em&gt; column reports the correlation of each item with the total of the other items.
The equivalent of alpha for dichotomous items is the Kuder-Richardson measure of reliability, which &lt;code&gt;alpha&lt;/code&gt; reports when the items are dichotomous.&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rater consistency&lt;/strong&gt; is important when you have observers rating a video, observed behavior, essay, or something else where two or more people are rating the same information. Here reliability means that a pair of raters gives consistent results (kappa, $\kappa$: &lt;code&gt;kap coder1 coder2&lt;/code&gt;). $\kappa$ gives us credit only for the extent to which the agreement exceeds what we would have expected to get by chance alone. Kappa tends to be lower than alpha.&lt;/p&gt;

&lt;h2 id=&#34;validity&#34;&gt;Validity&lt;/h2&gt;

&lt;p&gt;A valid measure is one that measures what it is supposed to be measuring.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Face validity&lt;/strong&gt;: give the questionnaire to friends and family to fill in, and ask them whether it seems good. Face validity refers to how valid the instrument appears on its surface.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Content validity&lt;/strong&gt;: ask a group of people with relevant experience to review the items, and ask them whether the items are well designed and whether anything should be revised. Content validity ratio (CVR): judges rate each item as &lt;em&gt;essential, useful, or not necessary.&lt;/em&gt; $CVR=(N_e - N/2)/(N/2)$, where $N_e$ is the number of panelists indicating &amp;quot;essential&amp;quot; and $N$ is the total number of panelists. You can keep the items that have a relatively high CVR and drop those that do not.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Criterion validity&lt;/strong&gt;: correlate the instrument with another established measure. The correlation coefficient between the test score and a specific criterion indicates how valid the instrument is.&lt;/p&gt;

&lt;ul&gt;
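&lt;li&gt;(1) Concurrent validity: correlate the newly designed items with a standard instrument (same concept, same variables). For example, to measure pain tolerance, compare a four-item scale that takes one minute to complete with a standard 45-item instrument that takes one hour. If $r=0.92$ (a high correlation), the new items have concurrent validity.&lt;/li&gt;
&lt;li&gt;(2) Predictive validity: a survey can predict future events, behaviors, attitudes, or outcomes. For example, consider the demand for painkillers among patients after surgery: score 24 patients, where a higher score means a higher tolerance of surgical pain. Correlating the 24 scores with the amount of painkillers taken gives $r=-0.82$, meaning that high pain tolerance goes with low painkiller use. Correlating SAT scores (intended to predict first-semester college GPA) with first-semester college GPA gives $r=0.42$, which indicates a lack of predictive validity. If $r$ increases year over year, however, that indicates predictive validity.&lt;/li&gt;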
&lt;/ul&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Construct validity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We can assess the &lt;strong&gt;convergent&lt;/strong&gt; and &lt;strong&gt;divergent&lt;/strong&gt; validity of our measure, hope, by seeing whether it is positively correlated with variables with which we believe it converges and negatively correlated with variables with which we believe it diverges. Useful commands: &lt;code&gt;ttest&lt;/code&gt;, &lt;code&gt;esize&lt;/code&gt;, &lt;code&gt;pwcorr&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&#34;factor-analysis&#34;&gt;Factor analysis&lt;/h2&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Exploratory factor analysis, which Stata calls &lt;strong&gt;principal factor analysis&lt;/strong&gt; (PF): the variance is partitioned into the shared variance and unique or error variance. The shared variance is how much of the variance in any one item can be explained by the rest of the items.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Principal-component factor analysis&lt;/strong&gt; (PCF)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With &lt;code&gt;putdocx&lt;/code&gt;, Stata 15 can create Word documents!&lt;/p&gt;

&lt;h3 id=&#34;terminology&#34;&gt;Terminology&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Extraction&lt;/li&gt;
&lt;li&gt;Eigenvalues: In a PCF analysis with 10 items, the sum of the eigenvalues will be 10. The factors are ordered from the most important, which has the largest eigenvalue, to the least important, which has the smallest eigenvalue. In PF analysis, the sum of the eigenvalues will be less than the number of items, and the eigenvalues’ interpretation is complex.&lt;/li&gt;
&lt;li&gt;Communality and uniqueness: PF analysis tries to explain the shared variance. PCF analysis tries to explain all the variance, which is why it is ideal for the uniqueness to approach zero.&lt;/li&gt;
&lt;li&gt;Loadings: how clusters of items are most related to one or another of the factors. If an item has a loading over 0.4 on a factor, it is considered a good indicator of that factor.&lt;/li&gt;
&lt;li&gt;Simple structure: This is a pattern of loadings where each item loads strongly on just one factor and a subset of items load strongly on each factor. When an item loads strongly on more than one factor, it is factorially confounded.&lt;/li&gt;
&lt;li&gt;Scree plot: This is a graph showing the eigenvalue for each factor. When doing a PCF analysis, we usually drop factors that have eigenvalues in the neighborhood of 1.0 or smaller.&lt;/li&gt;
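&lt;li&gt;Rotation: There are many rotation methods, but they fall into two broad classes: orthogonal and oblique. The purpose of rotation is to make the factors more interpretable and to show how the factors relate to one another. More specifically, an orthogonal rotation assumes the factors are uncorrelated, whereas an oblique rotation assumes the factors are correlated to some degree.&lt;/li&gt;
&lt;li&gt;Factor score: weights each item based on how related it is to the factor. The factor score is also scaled to have a mean of 0.0 and a variance of 1.0.&lt;/li&gt;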
&lt;/ul&gt;

&lt;p&gt;Use PCF when you have a set of items that you believe all measure one concept. In this situation, you would be interested in the first principal factor. You would want to see if it explained a substantial part of the total variance for the entire set of items, and you would want most of the items to have a &lt;strong&gt;loading of 0.4 or above&lt;/strong&gt; on this factor. Because PCF analysis is trying to explain all the variance in the items, the &lt;strong&gt;uniqueness&lt;/strong&gt; for each item should approach zero. Generally, we should consider any factor that has an eigenvalue of more than 1. A visual way to examine the eigenvalues is with a scree plot.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;factor rnatspac rnatenvir rnatheal rnatcity rnatcrime rnatdrug ///
	rnateduc rnatrace rnatarms rnatfare rnatroad rnatsoc rnatchld rnatsci, pcf
screeplot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If, on the other hand, you want to identify two or more latent variables that represent interpretable dimensions of some concept, then PF analysis is probably best.&lt;/p&gt;

&lt;h3 id=&#34;rotation&#34;&gt;Rotation&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Orthogonal: &lt;code&gt;rotate&lt;/code&gt;. With a varimax rotation, we can think of the loadings as the estimated correlation between each item and each factor.&lt;/li&gt;
&lt;li&gt;Oblique: &lt;code&gt;rotate, promax&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use &lt;code&gt;estat common&lt;/code&gt; to get the correlation matrix of the promax-rotated common factors.&lt;/p&gt;
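&lt;p&gt;As a rough sketch of how these rotation commands fit together, the block below runs a PF analysis and then both rotations. It reuses the item names from the PCF example above; the &lt;code&gt;factors(2)&lt;/code&gt; option is an assumption made only for illustration, not a recommendation for these data.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Sketch only: retaining two factors is an illustrative assumption
factor rnatspac rnatenvir rnatheal rnatcity rnatcrime rnatdrug ///
	rnateduc rnatrace rnatarms rnatfare rnatroad rnatsoc rnatchld rnatsci, ///
	pf factors(2)
rotate              // orthogonal rotation (varimax is the default)
rotate, promax      // oblique rotation
estat common        // correlation matrix of the rotated common factors&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the factor correlations reported by &lt;code&gt;estat common&lt;/code&gt; are close to zero, the orthogonal and oblique solutions should tell much the same story.&lt;/p&gt;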

&lt;h2 id=&#34;get-one-factor-score&#34;&gt;Get one factor score&lt;/h2&gt;

&lt;p&gt;A factor score weights each item by how strongly it relates to the factor, whereas a mean or total score weights every item equally. However, this distinction rarely makes a lot of practical difference. The factor score may make a difference if there are some items with very large loadings, say, 0.9, and others with very small loadings, say, 0.2. But we would probably drop the weakest items. When the loadings do not vary a great deal, computing a factor score or a mean/total score will produce comparable results.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;factor rnatenvir rnatheal rnatcity rnatcrime rnatdrug rnateduc rnatrace ///
	rnatfare rnatsoc rnatchld, pcf
predict libfscore, norotate
egen libmean = rowmean(rnatenvir rnatheal rnatcity rnatcrime rnatdrug ///
	rnateduc rnatrace rnatfare rnatsoc rnatchld)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here the two scores correlate above 0.9.&lt;/p&gt;
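&lt;p&gt;One way to check this, assuming &lt;code&gt;libfscore&lt;/code&gt; and &lt;code&gt;libmean&lt;/code&gt; were created as in the block above, is to correlate the two scores directly:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Compare the factor score with the simple mean score
correlate libfscore libmean&lt;/code&gt;&lt;/pre&gt;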
</description>
    </item>
    
    <item>
      <title>Missing values</title>
      <link>/post/missing-values/</link>
      <pubDate>Wed, 26 Jun 2019 00:00:00 +0000</pubDate>
      <guid>/post/missing-values/</guid>
      <description>&lt;p&gt;Many advanced Stata estimation models can use multiple imputation for handling missing values.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.iriseekhout.com/missing-data/auxiliary-variables/&#34;&gt;Auxiliary variables&lt;/a&gt; are variables that can help to make estimates on incomplete data, while they are not part of the main analysis (Collins et al., 2001).&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Include all variables in the analysis model, including the dependent variable.&lt;/li&gt;
&lt;li&gt;Include auxiliary variables that predict patterns of missingness.&lt;/li&gt;
&lt;li&gt;Include additional variables that predict a person’s score on a variable that has missing values.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The imputation model is then used to generate a complete dataset.&lt;/p&gt;
&lt;p&gt;Once you have included a reasonably large number of variables, adding additional variables may not be helpful because of multicollinearity.&lt;/p&gt;
&lt;p&gt;The simplest approach is to drop any participant who does not have complete information on every item used in the analysis. This approach goes by several names, including &lt;strong&gt;full case analysis&lt;/strong&gt;, &lt;strong&gt;casewise deletion&lt;/strong&gt;, or &lt;em&gt;&lt;strong&gt;listwise deletion.&lt;/strong&gt;&lt;/em&gt; It has two drawbacks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;There will be a substantial loss of power because of the reduced sample size.&lt;/li&gt;
&lt;li&gt;Listwise deletion can introduce substantial bias. (survival bias)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One alternative to listwise deletion is to substitute the mean of a variable for anybody who does not have a response. This has two serious limitations. First, people who are average on a variable are often more likely to give an answer than are people who have an extreme value, so the missing values are probably not average. Second, when you give several people the same score on a variable, these people have zero variance on the variable. This artificially reduced variance will seriously bias our parameter estimates.&lt;/p&gt;
&lt;p&gt;The key to understanding multiple imputation is that the imputed missing values will not contain any unique information once the variables in the model and the auxiliary variables are allowed to explain the patterns of missing values and predict the score of the missing values. The imputed values for variables with missing values are simply consistent with the observed data. This allows us to use all available information in our analysis.&lt;/p&gt;
&lt;h2 id=&#34;multiple-imputation&#34;&gt;Multiple imputation&lt;/h2&gt;
&lt;p&gt;A powerful way of working with missing values is multiple imputation. Working with the &lt;em&gt;mi&lt;/em&gt; commands involves three straightforward steps:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Create &lt;em&gt;m&lt;/em&gt; complete datasets by imputing the missing values. Each dataset will have no missing values, but the values imputed for missing values will vary across the &lt;em&gt;m&lt;/em&gt; datasets.&lt;/li&gt;
&lt;li&gt;Do your analysis in each of the &lt;em&gt;m&lt;/em&gt;  complete datasets.&lt;/li&gt;
&lt;li&gt;Pool your &lt;em&gt;m&lt;/em&gt;  solutions to get one solution.
&lt;ul&gt;
&lt;li&gt;The parameter estimates (for example, regression coefficients) will be the mean of their corresponding values in the &lt;em&gt;m&lt;/em&gt; datasets.&lt;/li&gt;
&lt;li&gt;The standard errors used for testing significance will combine the standard errors from the &lt;em&gt;m&lt;/em&gt; solutions plus the variance of the parameter estimates across the &lt;em&gt;m&lt;/em&gt; solutions. If each solution is yielding a very different estimate, this uncertainty is added to the standard errors. The degrees of freedom are also adjusted based on the number of imputations and the proportion of data that have missing values.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
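&lt;p&gt;The pooling step can be written compactly. This is a sketch of the standard combining rules (Rubin&#39;s rules); the symbols $\hat{\theta}_j$, $SE_j$, $W$, $B$, and $T$ are notation introduced here for illustration, not taken from the notes above. With $\hat{\theta}_j$ the estimate and $SE_j$ its standard error in imputed dataset $j$:&lt;/p&gt;
&lt;p&gt;$$
\bar{\theta} = \frac{1}{m}\sum_{j=1}^{m}\hat{\theta}_j, \qquad
W = \frac{1}{m}\sum_{j=1}^{m} SE_j^2, \qquad
B = \frac{1}{m-1}\sum_{j=1}^{m}\left(\hat{\theta}_j-\bar{\theta}\right)^2, \qquad
T = W + \left(1+\frac{1}{m}\right)B
$$&lt;/p&gt;
&lt;p&gt;The pooled standard error is $\sqrt{T}$, so estimates that disagree across the imputed datasets (a large $B$) widen the standard error.&lt;/p&gt;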
&lt;p&gt;The most widely used approach is using multivariate normal regression (MVN). &lt;code&gt;mi impute mvn&lt;/code&gt; is designed for continuous variables. &lt;code&gt;mi impute chained&lt;/code&gt; is another useful alternative.&lt;/p&gt;
&lt;p&gt;A missing value will have a code of ., .a, .b, etc. Remember that a missing value is recorded in a Stata dataset as an extremely high value. Within mi, the missing-value code . (dot) has a special meaning: it denotes the missing values eligible for imputation. If you have a set of missing values that should not be imputed, you should record them as extended missing values, that is, as .a, .b, etc. Conversely, to make values coded .a eligible for imputation, recode them to dot: &lt;code&gt;recode agem (.a = .)&lt;/code&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;misstable summarize ln_wagem gradem agem ttl_expm tenurem not_smsa south blackm
misstable patterns ln_wagem gradem agem ttl_expm tenurem not_smsa south blackm
quietly misstable summarize ln_wagem gradem agem ttl_expm tenurem not_smsa south blackm, gen(miss_)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;then&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;logit miss_ln_wagem gradem agem ttl_expm tenurem not_smsa south blackm if ln_wagem &amp;lt;= .
logit miss_gradem ln_wagem agem ttl_expm tenurem not_smsa south blackm if gradem &amp;lt;= .
logit miss_agem ln_wagem gradem ttl_expm tenurem not_smsa south blackm if agem &amp;lt;= .
logit miss_ttl_expm ln_wagem gradem agem tenurem not_smsa south blackm if ttl_expm &amp;lt;= .
logit miss_tenurem ln_wagem gradem agem ttl_expm not_smsa south blackm if tenurem &amp;lt;= .
logit miss_blackm ln_wagem gradem agem ttl_expm tenurem not_smsa south if blackm &amp;lt;= .
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Or use &lt;code&gt;pwcorr , obs sig&lt;/code&gt; to find potential auxiliary variables.&lt;/p&gt;
&lt;p&gt;Any variable that is statistically significant in these logistic regressions should be included in the imputation step.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mi set flong
mi register imputed ln_wagem gradem agem ttl_expm tenurem blackm
mi register regular not_smsa south 
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The &lt;code&gt;mi set flong&lt;/code&gt; command tells Stata how to arrange our multiple datasets: flong (full and long) or mlong (marginal and long). The &lt;code&gt;mi register imputed&lt;/code&gt; command registers all the variables that have missing values and need to be imputed. The &lt;code&gt;mi register regular&lt;/code&gt; command registers all the variables that have no missing values or for which we do not want to impute values.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mi impute mvn ln_wagem gradem agem ttl_expm tenurem blackm, add(20) rseed(2121)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This generates m = 20 imputed datasets. The &lt;code&gt;_mi_m&lt;/code&gt; variable identifies the datasets and ranges from 0 to 20.&lt;/p&gt;
&lt;p&gt;To get pooled $R^2$ and standardized $\beta$s, use &lt;code&gt;mibeta&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mibeta ln_wagem gradem agem ttl_expm tenurem not_smsa south blackm, fisherz miopts(vartable)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;When &lt;strong&gt;impossible&lt;/strong&gt; values are imputed (the recommendation is to leave them unadjusted): binary variables, squares, and interactions (compute the product in the original dataset first, then impute).&lt;/p&gt;
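&lt;p&gt;The analysis and pooling steps are not shown in the commands above. A minimal sketch, assuming the same regression that &lt;code&gt;mibeta&lt;/code&gt; fits, would use the &lt;code&gt;mi estimate&lt;/code&gt; prefix, which runs the model in each imputed dataset and pools the results:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Assumed analysis model: same outcome and predictors as the mibeta call
mi estimate: regress ln_wagem gradem agem ttl_expm tenurem not_smsa south blackm
&lt;/code&gt;&lt;/pre&gt;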
</description>
    </item>
    
    <item>
      <title>Multilevel analysis</title>
      <link>/post/multilevel-analysis/</link>
      <pubDate>Wed, 26 Jun 2019 00:00:00 +0000</pubDate>
      <guid>/post/multilevel-analysis/</guid>
      <description>&lt;p&gt;Multilevel analysis can address the lack of independence of the observations when you are analyzing grouped data. See &lt;em&gt;Stata Multilevel Mixed-Effects Reference Manual&lt;/em&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;groups of individuals&lt;/li&gt;
&lt;li&gt;panel data&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;fixedeffects-regression-models&#34;&gt;Fixed-effects regression models&lt;/h2&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[y_{it} = \beta_0 +\beta x_{it}+\mu_i+\eta_{it}\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;If &lt;span  class=&#34;math&#34;&gt;\(\mu_i\)&lt;/span&gt; is correlated with &lt;span  class=&#34;math&#34;&gt;\(x_{it}\)&lt;/span&gt;, use a fixed-effects model.
If &lt;span  class=&#34;math&#34;&gt;\(\mu_i\)&lt;/span&gt; is independent of &lt;span  class=&#34;math&#34;&gt;\(x_{it}\)&lt;/span&gt;, a random-effects model gives consistent estimates.&lt;/p&gt;

&lt;p&gt;Use &lt;code&gt;xtreg&lt;/code&gt;; see the &lt;em&gt;Stata Longitudinal-Data/Panel-Data Reference Manual.&lt;/em&gt;&lt;/p&gt;
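&lt;p&gt;A minimal sketch of this choice in Stata, assuming long-format data with a panel identifier &lt;code&gt;id&lt;/code&gt;, a time variable &lt;code&gt;wave&lt;/code&gt;, and an outcome &lt;code&gt;drink&lt;/code&gt; (names borrowed from the examples later in this post, used here only for illustration). A Hausman test is one common way to compare the two sets of estimates:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Sketch only: the variable names are assumptions
xtset id wave
xtreg drink wave, fe
estimates store fe
xtreg drink wave, re
estimates store re
hausman fe re&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A significant Hausman statistic suggests the two sets of coefficients differ systematically, which favors the fixed-effects estimates.&lt;/p&gt;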

&lt;h2 id=&#34;randomeffects-regression-models&#34;&gt;Random-effects regression models&lt;/h2&gt;

&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[y_{it} = \beta_0 +\beta x_{it}+\gamma z_i +\mu_i+\eta_{it}\]&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;This model assumes that &lt;span  class=&#34;math&#34;&gt;\(\mu_i\)&lt;/span&gt; is independent of &lt;span  class=&#34;math&#34;&gt;\(x_{it}\)&lt;/span&gt;.&lt;/p&gt;

&lt;p&gt;The fixed component, &lt;span  class=&#34;math&#34;&gt;\( \beta_0 +\beta x_{it}+\gamma z_i\)&lt;/span&gt;, describes the overall relationship between our dependent variable and our independent variables. The random component, &lt;span  class=&#34;math&#34;&gt;\(\mu_i\)&lt;/span&gt;, represents the effects of the unobserved time-invariant variables.&lt;/p&gt;

&lt;p&gt;score = fixed part + random effects + error&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Going back and forth between wide and long formats&lt;/strong&gt; : &lt;code&gt;reshape wide&lt;/code&gt; and &lt;code&gt;reshape long&lt;/code&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;reshape long drink, i(id) j(wave)&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;randomintercept-model&#34;&gt;Random-intercept model&lt;/h2&gt;

&lt;h3 id=&#34;linear-model&#34;&gt;linear model&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;mixed drink c.wave || id:
estimates store linear
margins, at(wave=(0(2)10))
marginsplot&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&#34;quadratic-term&#34;&gt;quadratic term&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;mixed drink c.wave##c.wave || id:
estimates store quadratic
margins, at(wave=(0(2)10))
marginsplot
lrtest linear quadratic&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A proportional reduction in error (PRE) measuring how much the residual (error) variance is reduced by adding the quadratic term may be useful. We will call the random-intercept linear model “Model 1” and the random-intercept quadratic model “Model 2”.&lt;/p&gt;

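&lt;p&gt;&lt;span  class=&#34;math&#34;&gt;\[PRE = \frac{\mathrm{var}(\mathrm{Residual})_{\mathrm{Model\,1}} - \mathrm{var}(\mathrm{Residual})_{\mathrm{Model\,2}}}{\mathrm{var}(\mathrm{Residual})_{\mathrm{Model\,1}}}\]&lt;/span&gt;&lt;/p&gt;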
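&lt;p&gt;A hedged sketch of how the PRE could be computed from the stored results. It assumes both models were saved with &lt;code&gt;estimates store&lt;/code&gt; as &lt;code&gt;linear&lt;/code&gt; and &lt;code&gt;quadratic&lt;/code&gt; as above, and that &lt;code&gt;mixed&lt;/code&gt; stores the log of the residual standard deviation under the name &lt;code&gt;lnsig_e&lt;/code&gt;; replaying the model with &lt;code&gt;mixed, coeflegend&lt;/code&gt; is one way to confirm the name in your version. The scalar names &lt;code&gt;v1&lt;/code&gt; and &lt;code&gt;v2&lt;/code&gt; are placeholders.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Sketch only: relies on the coefficient name lnsig_e (log residual SD)
estimates restore linear
scalar v1 = exp(2*_b[lnsig_e:_cons])    // residual variance, Model 1
estimates restore quadratic
scalar v2 = exp(2*_b[lnsig_e:_cons])    // residual variance, Model 2
display &#34;PRE = &#34; (v1 - v2)/v1&lt;/code&gt;&lt;/pre&gt;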

&lt;h3 id=&#34;treating-time-as-a-categorical-variable&#34;&gt;Treating time as a categorical variable&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;mixed drink i.wave || id:
estimates store means
margins, at(wave=(0(2)10))
marginsplot
lrtest linear means
lrtest quadratic means&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;randomcoefficients-model&#34;&gt;Random-coefficients model&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;mixed drink c.wave || id: wave, cov(unstructured)
predict yhat_drink, fitted&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;including-a-timeinvariant-covariate&#34;&gt;Including a time-invariant covariate&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;* Random coefficients model with time invariant covariate
* gender coded as male = 1, female = 0
mixed drink c.wave i.male || id: wave
margins male, at(wave=(0(2)8))
marginsplot

* Random coefficients, with wave interacting with the
* time invariant covariate--gender coded
mixed drink c.wave##i.male || id: wave
margins male, at(wave=(0(2)8))
marginsplot

mixed drink c.wave##c.wave##i.male || id: wave
margins male, at(wave=(0(2)8))
marginsplot&lt;/code&gt;&lt;/pre&gt;</description>
    </item>
    
    <item>
      <title>Multiple Regressions</title>
      <link>/post/multiple-regressions/</link>
      <pubDate>Wed, 26 Jun 2019 00:00:00 +0000</pubDate>
      <guid>/post/multiple-regressions/</guid>
      <description>&lt;!-- raw HTML omitted --&gt;
&lt;p&gt;Note: toc is not compatible with &lt;code&gt;markup: mmark&lt;/code&gt;&lt;/p&gt;
&lt;h2 id=&#34;basic&#34;&gt;Basic&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;F: tests whether there is a significant relationship between the outcome and the set of predictors.&lt;/li&gt;
&lt;li&gt;R2: how much of the outcome variance is explained by the regression model.&lt;/li&gt;
&lt;li&gt;Adj-R2: R2 adjusted for the number of predictors, which removes chance effects.&lt;/li&gt;
&lt;li&gt;Coef.: &lt;em&gt;unstandardized regression coefficients&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;t: coef/standard error&lt;/li&gt;
&lt;li&gt;Std. Err.: the standard error of each coefficient, that is, the estimated standard deviation of its sampling distribution. The standard error of the regression (Root MSE) is a different quantity: it represents the average distance that the observed values fall from the regression line, so it tells you how wrong the regression model is on average in the units of the response variable.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;, beta&lt;/code&gt; option gives &lt;strong&gt;beta weights&lt;/strong&gt;, which are based on standardizing all variables to have a mean of 0 and a standard deviation of 1. A 1-standard-deviation change in the independent variable produces a beta-standard-deviation change in the dependent variable. Beta weights are interpreted much like correlations: beta &amp;lt; 0.2 is considered a weak effect, between 0.2 and 0.5 a moderate effect, and above 0.5 a strong effect. They normally range from -1 to +1; a value outside that range points to a multicollinearity problem.&lt;/li&gt;
&lt;li&gt;Increment in R2: also called the &lt;em&gt;part correlation squared&lt;/em&gt; or &lt;em&gt;semipartial R2&lt;/em&gt; (Semipartial Corr.^2 in &lt;code&gt;pcorr&lt;/code&gt;), because it measures only the part that is &lt;strong&gt;uniquely&lt;/strong&gt; explained by each predictor. Another way to compare predictors is the partial correlation.&lt;/li&gt;
&lt;li&gt;Distribution of the dependent variable: &lt;code&gt;histogram env_con, frequency normal kdensity&lt;/code&gt; (see &lt;a href=&#34;https://lotabout.me/2018/kernel-density-estimation/&#34;&gt;kernel density estimation&lt;/a&gt;). &lt;strong&gt;Skewness&lt;/strong&gt;: 0 is normal; &amp;lt;0 is negative (left) skew; &amp;gt;0 is positive (right) skew. &lt;strong&gt;Kurtosis&lt;/strong&gt;: 3 is normal; &amp;lt;3 means lighter tails than the normal distribution; &amp;gt;3 means heavier tails. Test both with &lt;code&gt;sktest&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Distribution of the residuals: for a large sample, normality is not a critical issue. &lt;code&gt;rvfplot, yline(0)&lt;/code&gt; draws the residual-versus-fitted plot.
If the residuals are not normally distributed, we can use robust standard errors, &lt;code&gt;reg y xs, vce(robust)&lt;/code&gt;, or bootstrap them, &lt;code&gt;reg y xs, vce(bootstrap, rep(1000))&lt;/code&gt;. Either choice changes the standard errors and hence the t-values, not the coefficients.
However, Andrew J. Leone, Miguel Minutti-Meza, and Charles E. Wasley (2019), Influential Observations and Inference in Accounting Research, The Accounting Review, in press, discuss robust regression using &lt;strong&gt;robreg&lt;/strong&gt;. &lt;strong&gt;Open question: what&#39;s the difference?&lt;/strong&gt;
Also check &lt;a href=&#34;https://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/se_programming.htm&#34;&gt;Correcting for Cross-Sectional and Time-Series Dependence in Accounting Research&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
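&lt;p&gt;The checks in the list above can be strung together roughly as follows. This is a sketch that reuses the example model from this post; the order of the commands is a choice made here so that &lt;code&gt;rvfplot&lt;/code&gt; runs directly after the regression it describes.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;* Sketch: basic checks on the example model
regress env_con educat inc com3 hlthprob epht3, beta
rvfplot, yline(0)      // residual-versus-fitted plot
sktest env_con         // skewness and kurtosis test for the outcome
pcorr env_con educat inc com3 hlthprob epht3   // partial and semipartial correlations
regress env_con educat inc com3 hlthprob epht3, vce(robust)
&lt;/code&gt;&lt;/pre&gt;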
&lt;pre&gt;&lt;code&gt;regress env_con educat inc com3 hlthprob epht3, beta
predict envhat
preserve
set seed 515
sample 100, count
twoway (scatter env_con envhat) (lfit env_con envhat)
restore
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;diagnostic-statistics&#34;&gt;Diagnostic statistics&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;http://www.r-tutor.com/elementary-statistics/simple-linear-regression/standardized-residual&#34;&gt;Rstandard:&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The standardized residual is the residual divided by its standard deviation.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;regress env_con educat inc com3 hlthprob epht3, beta
predict yhat
predict residual, residual
predict rstandard, rstandard
list respnum env_con yhat residual rstandard if abs(rstandard) &amp;gt; 2.58 &amp;amp; rstandard &amp;lt; .
dfbeta
list respnum rstandard _dfbeta_1 if abs(_dfbeta_1) &amp;gt; 2/sqrt(3769) &amp;amp; _dfbeta_1 &amp;lt; .
estat vif

&lt;/code&gt;&lt;/pre&gt;&lt;ul&gt;
&lt;li&gt;Influential observations: DFbeta. You could think of this as redoing the regression model, omitting just one observation at a time and seeing how much difference omitting each observation makes. &lt;strong&gt;A value of |DFbeta| &amp;gt; 2/sqrt(N) indicates that an observation has a large influence.&lt;/strong&gt; This is more specific than rstandard.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;. dfbeta
(739 missing values generated)
                       _dfbeta_1: dfbeta(educat)
(739 missing values generated)
                       _dfbeta_2: dfbeta(inc)
(739 missing values generated)
                       _dfbeta_3: dfbeta(com3)
(739 missing values generated)
                       _dfbeta_4: dfbeta(hlthprob)
(739 missing values generated)
                       _dfbeta_5: dfbeta(epht3)
&lt;/code&gt;&lt;/pre&gt;&lt;ul&gt;
&lt;li&gt;Multicollinearity: The more correlated the predictors, the more they overlap and, hence, the more difficult it is to identify their independent effects. In such situations, you can have multicollinearity, in which one or more of the predictors are virtually redundant.
Run &lt;code&gt;estat vif&lt;/code&gt; after the regression to get the variance inflation factor (VIF). If the VIF is above 10 for any variable, a multicollinearity problem may exist. If the average VIF is substantially greater than 1.00, there still could be a problem. Possible remedies are dropping a variable or creating a scale that combines the correlated variables into one.
1/VIF = 1 - R2, where R2 comes from regressing X1 on the other Xs. It tells how much of the variance in the independent variable is available to predict the outcome variable independently.&lt;/li&gt;
&lt;/ul&gt;
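&lt;p&gt;A quick worked example of the 1/VIF relation, using a hypothetical value of $R^2 = 0.90$ from regressing one predictor on the others:&lt;/p&gt;
&lt;p&gt;$$
\frac{1}{VIF} = 1 - R^2 = 1 - 0.90 = 0.10 \quad\Rightarrow\quad VIF = \frac{1}{0.10} = 10
$$&lt;/p&gt;
&lt;p&gt;So the VIF cutoff of 10 corresponds to a predictor that shares 90% of its variance with the other predictors.&lt;/p&gt;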
&lt;h2 id=&#34;weighted-data&#34;&gt;Weighted data&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;regress env_con educat inc com3 hlthprob epht3 [pweight=finalwt], beta
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;When you do a weighted regression this way, Stata automatically uses robust standard errors, whether you ask for them or not, because weighted data require them.&lt;/p&gt;
&lt;h2 id=&#34;categorical-predictors-and-hierarchical-regression&#34;&gt;Categorical predictors and hierarchical regression&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;regress smday97 age97 male psmoke97 aa hispanic other if !missing(smday97, ///
	age97, male, psmoke97, aa, hispanic, other), beta
test aa hispanic other
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;nested regressions&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nestreg: regress smday97 (age97 male) (psmoke97) (aa hispanic other), beta
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If you put i. as a stub in front of a categorical variable, Stata will make the first category the reference category and then generate a dummy variable for each of the remaining categories.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;regress smday97 age97 male psmoke97 i.race
* change the reference category, which Stata refers to as the base level
regress smday97 age97 male psmoke97 ib3.race
regress smday97 age97 male psmoke97 ib(last).race
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;interaction&#34;&gt;interaction&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;g ed_male = educ*male
reg inc educ male ed_male,beta
nestreg: regress inc (educ male) (ed_male), beta
regress inc i.male##c.educ, beta
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Some researchers choose to center quantitative independent variables, such as education, before computing the interaction terms.
Centering is important for independent variables where a value of zero may not be meaningful.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;summarize educ
generate educ_c = educ - r(mean)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;code&gt;margins&lt;/code&gt; helps us interpret the interaction term:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;margins male, at(educ=(8 10 12 14 16 18))
marginsplot
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;nonlinear&#34;&gt;nonlinear&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;regress ln_wage c.ttl_exp##c.ttl_exp, beta
margins, at(ttl_exp = (0(2)28))
marginsplot
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a href=&#34;https://stats.idre.ucla.edu/stata/dae/multiple-regression-power-analysis/&#34;&gt;Power analysis&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>We have no idea</title>
      <link>/post/we-have-no-idea/</link>
      <pubDate>Wed, 26 Jun 2019 00:00:00 +0000</pubDate>
      <guid>/post/we-have-no-idea/</guid>
      <description>&lt;p&gt;&lt;img src=&#34;https://yhong.wang/images/2019/06/26/ecab79a7a6e23208d6db55bbd70e478f.png&#34; alt=&#34;&amp;ldquo;Fundamental&amp;rdquo; Matter Particles&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yhong.wang/images/2019/06/26/95596bf7e88ea381ae21a1c87614320c.png&#34; alt=&#34;Mass Values&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yhong.wang/images/2019/06/26/707793eb1776621eb336bb1088f8329c.png&#34; alt=&#34;Force Carrier Particles&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yhong.wang/images/2019/06/26/1bc4eb76cf49df9c63897ed6879d5775.png&#34; alt=&#34;Forces&#34;&gt;&lt;/p&gt;
&lt;p&gt;Bosons make up one of the two classes of &lt;a href=&#34;https://en.wikipedia.org/wiki/Elementary_particle&#34;&gt;particles&lt;/a&gt;, the other being &lt;a href=&#34;https://en.wikipedia.org/wiki/Fermion&#34;&gt;fermions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So far, we have some hints and some ideas about what the smallest distance in the universe might be (the Planck length). We have a pretty good catalog of twelve matter particles that so far we haven’t been able to break further apart (the Standard Model). And we have a list of three possible ways that these particles can interact (the electroweak and strong forces and gravity).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yhong.wang/images/2019/06/26/1d7d5449e431246f76c2f99437720885.png&#34; alt=&#34;Mass of the Proton&#34;&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>An example preprint / working paper</title>
      <link>/publication/preprint/</link>
      <pubDate>Sun, 07 Apr 2019 00:00:00 +0000</pubDate>
      <guid>/publication/preprint/</guid>
      <description>&lt;!-- raw HTML omitted --&gt;
&lt;p&gt;Supplementary notes can be added here, including &lt;a href=&#34;https://sourcethemes.com/academic/docs/writing-markdown-latex/&#34;&gt;code and math&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Slides</title>
      <link>/slides/example/</link>
      <pubDate>Tue, 05 Feb 2019 00:00:00 +0000</pubDate>
      <guid>/slides/example/</guid>
      <description>&lt;h1 id=&#34;welcome-to-slides&#34;&gt;Welcome to Slides&lt;/h1&gt;
&lt;p&gt;&lt;a href=&#34;https://sourcethemes.com/academic/&#34;&gt;Academic&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&#34;features&#34;&gt;Features&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Efficiently write slides in Markdown&lt;/li&gt;
&lt;li&gt;3-in-1: Create, Present, and Publish your slides&lt;/li&gt;
&lt;li&gt;Supports speaker notes&lt;/li&gt;
&lt;li&gt;Mobile friendly slides&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id=&#34;controls&#34;&gt;Controls&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Next: &lt;code&gt;Right Arrow&lt;/code&gt; or &lt;code&gt;Space&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Previous: &lt;code&gt;Left Arrow&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Start: &lt;code&gt;Home&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Finish: &lt;code&gt;End&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Overview: &lt;code&gt;Esc&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Speaker notes: &lt;code&gt;S&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Fullscreen: &lt;code&gt;F&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Zoom: &lt;code&gt;Alt + Click&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/hakimel/reveal.js#pdf-export&#34;&gt;PDF Export&lt;/a&gt;: &lt;code&gt;E&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id=&#34;code-highlighting&#34;&gt;Code Highlighting&lt;/h2&gt;
&lt;p&gt;Inline code: &lt;code&gt;variable&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Code block:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;porridge &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;blueberry&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;
&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; porridge &lt;span style=&#34;color:#f92672&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;blueberry&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;:
    &lt;span style=&#34;color:#66d9ef&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;Eating...&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;h2 id=&#34;math&#34;&gt;Math&lt;/h2&gt;
&lt;p&gt;In-line math: $x + y = z$&lt;/p&gt;
&lt;p&gt;Block math:&lt;/p&gt;
&lt;p&gt;$$
f\left( x \right) = \;\frac{{2\left( {x + 4} \right)\left( {x - 4} \right)}}{{\left( {x + 4} \right)\left( {x + 1} \right)}}
$$&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&#34;fragments&#34;&gt;Fragments&lt;/h2&gt;
&lt;p&gt;Make content appear incrementally&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{{% fragment %}} One {{% /fragment %}}
{{% fragment %}} **Two** {{% /fragment %}}
{{% fragment %}} Three {{% /fragment %}}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Press &lt;code&gt;Space&lt;/code&gt; to play!&lt;/p&gt;
&lt;p&gt;One
&lt;strong&gt;Two&lt;/strong&gt;
Three&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;A fragment can accept two optional parameters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;class&lt;/code&gt;: use a custom style (requires definition in custom CSS)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;weight&lt;/code&gt;: sets the order in which a fragment appears&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id=&#34;speaker-notes&#34;&gt;Speaker Notes&lt;/h2&gt;
&lt;p&gt;Add speaker notes to your presentation&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-markdown&#34; data-lang=&#34;markdown&#34;&gt;{{% speaker_note %}}
&lt;span style=&#34;color:#66d9ef&#34;&gt;-&lt;/span&gt; Only the speaker can read these notes
&lt;span style=&#34;color:#66d9ef&#34;&gt;-&lt;/span&gt; Press &lt;span style=&#34;color:#e6db74&#34;&gt;`S`&lt;/span&gt; key to view
{{% /speaker_note %}}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Press the &lt;code&gt;S&lt;/code&gt; key to view the speaker notes!&lt;/p&gt;
&lt;aside class=&#34;notes&#34;&gt;
  &lt;ul&gt;
&lt;li&gt;Only the speaker can read these notes&lt;/li&gt;
&lt;li&gt;Press &lt;code&gt;S&lt;/code&gt; key to view&lt;/li&gt;
&lt;/ul&gt;
&lt;/aside&gt;
&lt;hr&gt;
&lt;h2 id=&#34;themes&#34;&gt;Themes&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;black: Black background, white text, blue links (default)&lt;/li&gt;
&lt;li&gt;white: White background, black text, blue links&lt;/li&gt;
&lt;li&gt;league: Gray background, white text, blue links&lt;/li&gt;
&lt;li&gt;beige: Beige background, dark text, brown links&lt;/li&gt;
&lt;li&gt;sky: Blue background, thin dark text, blue links&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;ul&gt;
&lt;li&gt;night: Black background, thick white text, orange links&lt;/li&gt;
&lt;li&gt;serif: Cappuccino background, gray text, brown links&lt;/li&gt;
&lt;li&gt;simple: White background, black text, blue links&lt;/li&gt;
&lt;li&gt;solarized: Cream-colored background, dark green text, blue links&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;

&lt;section data-noprocess data-shortcode-slide data-background-image=&#34;/img/boards.jpg&#34;&gt;

&lt;h2 id=&#34;custom-slide&#34;&gt;Custom Slide&lt;/h2&gt;
&lt;p&gt;Customize the slide style and background&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-markdown&#34; data-lang=&#34;markdown&#34;&gt;{{&amp;lt; slide background-image=&amp;#34;/img/boards.jpg&amp;#34; &amp;gt;}}
{{&amp;lt; slide background-color=&amp;#34;#0000FF&amp;#34; &amp;gt;}}
{{&amp;lt; slide class=&amp;#34;my-style&amp;#34; &amp;gt;}}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;h2 id=&#34;custom-css-example&#34;&gt;Custom CSS Example&lt;/h2&gt;
&lt;p&gt;Let&#39;s make headers navy colored.&lt;/p&gt;
&lt;p&gt;Create &lt;code&gt;assets/css/reveal_custom.css&lt;/code&gt; with:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-css&#34; data-lang=&#34;css&#34;&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;reveal&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;section&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;h1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;,&lt;/span&gt;
.&lt;span style=&#34;color:#a6e22e&#34;&gt;reveal&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;section&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;h2&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;,&lt;/span&gt;
.&lt;span style=&#34;color:#a6e22e&#34;&gt;reveal&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;section&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;h3&lt;/span&gt; {
  &lt;span style=&#34;color:#66d9ef&#34;&gt;color&lt;/span&gt;: &lt;span style=&#34;color:#66d9ef&#34;&gt;navy&lt;/span&gt;;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;h1 id=&#34;questions&#34;&gt;Questions?&lt;/h1&gt;
&lt;p&gt;&lt;a href=&#34;https://discourse.gohugo.io&#34;&gt;Ask&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://sourcethemes.com/academic/docs/&#34;&gt;Documentation&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Privacy Policy</title>
      <link>/privacy/</link>
      <pubDate>Thu, 28 Jun 2018 00:00:00 +0100</pubDate>
      <guid>/privacy/</guid>
      <description>&lt;p&gt;&amp;hellip;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Terms</title>
      <link>/terms/</link>
      <pubDate>Thu, 28 Jun 2018 00:00:00 +0100</pubDate>
      <guid>/terms/</guid>
      <description>&lt;p&gt;&amp;hellip;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>External Project</title>
      <link>/project/external-project/</link>
      <pubDate>Wed, 27 Apr 2016 00:00:00 +0000</pubDate>
      <guid>/project/external-project/</guid>
      <description></description>
    </item>
    
    <item>
      <title>Internal Project</title>
      <link>/project/internal-project/</link>
      <pubDate>Wed, 27 Apr 2016 00:00:00 +0000</pubDate>
      <guid>/project/internal-project/</guid>
      <description>&lt;p&gt;Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis posuere tellus ac convallis placerat. Proin tincidunt magna sed ex sollicitudin condimentum. Sed ac faucibus dolor, scelerisque sollicitudin nisi. Cras purus urna, suscipit quis sapien eu, pulvinar tempor diam. Quisque risus orci, mollis id ante sit amet, gravida egestas nisl. Sed ac tempus magna. Proin in dui enim. Donec condimentum, sem id dapibus fringilla, tellus enim condimentum arcu, nec volutpat est felis vel metus. Vestibulum sit amet erat at nulla eleifend gravida.&lt;/p&gt;
&lt;p&gt;Nullam vel molestie justo. Curabitur vitae efficitur leo. In hac habitasse platea dictumst. Sed pulvinar mauris dui, eget varius purus congue ac. Nulla euismod, lorem vel elementum dapibus, nunc justo porta mi, sed tempus est est vel tellus. Nam et enim eleifend, laoreet sem sit amet, elementum sem. Morbi ut leo congue, maximus velit ut, finibus arcu. In et libero cursus, rutrum risus non, molestie leo. Nullam congue quam et volutpat malesuada. Sed risus tortor, pulvinar et dictum nec, sodales non mi. Phasellus lacinia commodo laoreet. Nam mollis, erat in feugiat consectetur, purus eros egestas tellus, in auctor urna odio at nibh. Mauris imperdiet nisi ac magna convallis, at rhoncus ligula cursus.&lt;/p&gt;
&lt;p&gt;Cras aliquam rhoncus ipsum, in hendrerit nunc mattis vitae. Duis vitae efficitur metus, ac tempus leo. Cras nec fringilla lacus. Quisque sit amet risus at ipsum pharetra commodo. Sed aliquam mauris at consequat eleifend. Praesent porta, augue sed viverra bibendum, neque ante euismod ante, in vehicula justo lorem ac eros. Suspendisse augue libero, venenatis eget tincidunt ut, malesuada at lorem. Donec vitae bibendum arcu. Aenean maximus nulla non pretium iaculis. Quisque imperdiet, nulla in pulvinar aliquet, velit quam ultrices quam, sit amet fringilla leo sem vel nunc. Mauris in lacinia lacus.&lt;/p&gt;
&lt;p&gt;Suspendisse a tincidunt lacus. Curabitur at urna sagittis, dictum ante sit amet, euismod magna. Sed rutrum massa id tortor commodo, vitae elementum turpis tempus. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean purus turpis, venenatis a ullamcorper nec, tincidunt et massa. Integer posuere quam rutrum arcu vehicula imperdiet. Mauris ullamcorper quam vitae purus congue, quis euismod magna eleifend. Vestibulum semper vel augue eget tincidunt. Fusce eget justo sodales, dapibus odio eu, ultrices lorem. Duis condimentum lorem id eros commodo, in facilisis mauris scelerisque. Morbi sed auctor leo. Nullam volutpat a lacus quis pharetra. Nulla congue rutrum magna a ornare.&lt;/p&gt;
&lt;p&gt;Aliquam in turpis accumsan, malesuada nibh ut, hendrerit justo. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Quisque sed erat nec justo posuere suscipit. Donec ut efficitur arcu, in malesuada neque. Nunc dignissim nisl massa, id vulputate nunc pretium nec. Quisque eget urna in risus suscipit ultricies. Pellentesque odio odio, tincidunt in eleifend sed, posuere a diam. Nam gravida nisl convallis semper elementum. Morbi vitae felis faucibus, vulputate orci placerat, aliquet nisi. Aliquam erat volutpat. Maecenas sagittis pulvinar purus, sed porta quam laoreet at.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>An example journal article</title>
      <link>/publication/journal-article/</link>
      <pubDate>Tue, 01 Sep 2015 00:00:00 +0000</pubDate>
      <guid>/publication/journal-article/</guid>
      <description>&lt;p&gt;Supplementary notes can be added here, including &lt;a href=&#34;https://sourcethemes.com/academic/docs/writing-markdown-latex/&#34;&gt;code and math&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>An example conference paper</title>
      <link>/publication/conference-paper/</link>
      <pubDate>Mon, 01 Jul 2013 00:00:00 +0000</pubDate>
      <guid>/publication/conference-paper/</guid>
      <description>&lt;p&gt;Supplementary notes can be added here, including &lt;a href=&#34;https://sourcethemes.com/academic/docs/writing-markdown-latex/&#34;&gt;code and math&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>
