<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Posts | Adam Bibler</title><link>https://www.adambibler.com/post/</link><atom:link href="https://www.adambibler.com/post/index.xml" rel="self" type="application/rss+xml"/><description>Posts</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><image><url>https://www.adambibler.com/media/sharing.png</url><title>Posts</title><link>https://www.adambibler.com/post/</link></image><item><title>Income vs. Rent Growth, Part 2</title><link>https://www.adambibler.com/post/income-vs-rent-growth-part-2/</link><pubDate>Fri, 04 Oct 2024 00:00:00 +0000</pubDate><guid>https://www.adambibler.com/post/income-vs-rent-growth-part-2/</guid><description>
&lt;p>In the previous post, I looked at median gross rent and median household income growth from 2005 - 2023 according to the American Community Survey. But that analysis was only at the national level. What about the state level? Let’s find out.&lt;/p>
&lt;p>Again, I’ll pull the median household income and median gross rent from the ACS. But this time I’ll select states. Additionally, I’ll switch to looking at median household income only for renter households. Finally, rather than looking at the full time series, I’ll only pull the starting and ending years.&lt;/p>
&lt;pre class="r">&lt;code>state_hh_income &amp;lt;- get_ACS(&amp;quot;NAME,B25119_003E&amp;quot;,&amp;quot;state&amp;quot;,2005,2005,1)
state_hh_income2 &amp;lt;- get_ACS(&amp;quot;B25119_003E&amp;quot;,&amp;quot;state&amp;quot;,2023,2023,1)
state_hh_income$inc05 &amp;lt;- as.numeric(state_hh_income$B25119_003E)
state_hh_income2$inc23 &amp;lt;- as.numeric(state_hh_income2$B25119_003E)
state_hh_income &amp;lt;- cbind(state = state_hh_income$NAME,
inc05 = state_hh_income$inc05,
inc23 = state_hh_income2$inc23)
state_hh_income &amp;lt;- as_tibble(state_hh_income)
state_hh_income &amp;lt;- state_hh_income %&amp;gt;%
mutate(inc_growth = as.numeric(inc23) / as.numeric(inc05) - 1)
state_hh_rent &amp;lt;- get_ACS(&amp;quot;B25064_001E,NAME&amp;quot;,&amp;quot;state&amp;quot;,2005,2005,1)
state_hh_rent2 &amp;lt;- get_ACS(&amp;quot;B25064_001E&amp;quot;,&amp;quot;state&amp;quot;,2023,2023,1)
state_hh_rent$rent05 &amp;lt;- as.numeric(state_hh_rent$B25064_001E)
state_hh_rent2$rent23 &amp;lt;- as.numeric(state_hh_rent2$B25064_001E)
state_hh_rent &amp;lt;- cbind(state = state_hh_rent$NAME,
rent05 = state_hh_rent$rent05,
rent23 = state_hh_rent2$rent23)
state_hh_rent &amp;lt;- as_tibble(state_hh_rent)
state_hh_rent &amp;lt;- state_hh_rent %&amp;gt;%
mutate(rent_growth = as.numeric(rent23) / as.numeric(rent05) - 1
)
state_income_rent &amp;lt;- merge(state_hh_income, state_hh_rent, by = &amp;quot;state&amp;quot;)
state_income_rent &amp;lt;- state_income_rent %&amp;gt;%
mutate(diff = rent_growth - inc_growth)
state_income_rent &amp;lt;- arrange(state_income_rent, desc(diff))
state_income_rent2 &amp;lt;- state_income_rent %&amp;gt;%
select(state, inc_growth, rent_growth, diff) %&amp;gt;%
mutate(across(c(&amp;quot;inc_growth&amp;quot;, &amp;quot;rent_growth&amp;quot;, &amp;quot;diff&amp;quot;), function(x) (paste0(round(x, 4) * 100,&amp;quot;%&amp;quot;))))&lt;/code>&lt;/pre>
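&lt;p>A side note on the repeated ‘as.numeric’ calls above: ‘cbind’ applied to a character vector and numeric vectors returns a character matrix, so the numbers have to be converted again afterwards. A minimal sketch with made-up values (not ACS figures) shows the coercion, along with a ‘data.frame’ alternative that keeps the numeric type:&lt;/p>
&lt;pre class="r">&lt;code># Hypothetical values, for illustration only
states &amp;lt;- c(&amp;quot;State A&amp;quot;, &amp;quot;State B&amp;quot;)
inc05 &amp;lt;- c(30000, 25000)
inc23 &amp;lt;- c(57000, 45000)
# cbind() coerces everything to character
m &amp;lt;- cbind(state = states, inc05 = inc05, inc23 = inc23)
is.character(m)
# data.frame() keeps the numeric columns numeric
d &amp;lt;- data.frame(state = states, inc05 = inc05, inc23 = inc23)
d$inc_growth &amp;lt;- d$inc23 / d$inc05 - 1
is.numeric(d$inc_growth)&lt;/code>&lt;/pre>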
&lt;p>States with the largest difference:&lt;/p>
&lt;pre class="r">&lt;code>knitr::kable(head(state_income_rent2))&lt;/code>&lt;/pre>
&lt;table>
&lt;thead>
&lt;tr class="header">
&lt;th align="left">state&lt;/th>
&lt;th align="left">inc_growth&lt;/th>
&lt;th align="left">rent_growth&lt;/th>
&lt;th align="left">diff&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr class="odd">
&lt;td align="left">Arizona&lt;/td>
&lt;td align="left">90.42%&lt;/td>
&lt;td align="left">124.27%&lt;/td>
&lt;td align="left">33.85%&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">Florida&lt;/td>
&lt;td align="left">81.34%&lt;/td>
&lt;td align="left">112.48%&lt;/td>
&lt;td align="left">31.14%&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">Hawaii&lt;/td>
&lt;td align="left">66.36%&lt;/td>
&lt;td align="left">94.97%&lt;/td>
&lt;td align="left">28.62%&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">Delaware&lt;/td>
&lt;td align="left">45.8%&lt;/td>
&lt;td align="left">71.25%&lt;/td>
&lt;td align="left">25.45%&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">Nevada&lt;/td>
&lt;td align="left">63.83%&lt;/td>
&lt;td align="left">88.39%&lt;/td>
&lt;td align="left">24.56%&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">Wyoming&lt;/td>
&lt;td align="left">61.81%&lt;/td>
&lt;td align="left">86.22%&lt;/td>
&lt;td align="left">24.41%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Arizona and Florida had some of the fastest rent growth post-Covid, so this makes sense.&lt;/p>
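&lt;p>To make the diff column concrete, here is the arithmetic for Arizona, using the rounded growth rates from the table above rather than the underlying ACS estimates:&lt;/p>
&lt;pre class="r">&lt;code># Rounded values copied from the Arizona row of the table
inc_growth &amp;lt;- 0.9042
rent_growth &amp;lt;- 1.2427
# Difference in cumulative growth, in percentage points
diff_pp &amp;lt;- (rent_growth - inc_growth) * 100
round(diff_pp, 2)&lt;/code>&lt;/pre>
&lt;p>The difference is measured in percentage points: rent more than doubled while renter income fell just short of doubling.&lt;/p>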
&lt;p>States with the smallest difference:&lt;/p>
&lt;pre class="r">&lt;code>knitr::kable(tail(state_income_rent2))&lt;/code>&lt;/pre>
&lt;table>
&lt;thead>
&lt;tr class="header">
&lt;th align="left">&lt;/th>
&lt;th align="left">state&lt;/th>
&lt;th align="left">inc_growth&lt;/th>
&lt;th align="left">rent_growth&lt;/th>
&lt;th align="left">diff&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr class="odd">
&lt;td align="left">47&lt;/td>
&lt;td align="left">Ohio&lt;/td>
&lt;td align="left">74.57%&lt;/td>
&lt;td align="left">65.42%&lt;/td>
&lt;td align="left">-9.15%&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">48&lt;/td>
&lt;td align="left">Puerto Rico&lt;/td>
&lt;td align="left">55.94%&lt;/td>
&lt;td align="left">46.58%&lt;/td>
&lt;td align="left">-9.37%&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">49&lt;/td>
&lt;td align="left">Vermont&lt;/td>
&lt;td align="left">86.61%&lt;/td>
&lt;td align="left">75.99%&lt;/td>
&lt;td align="left">-10.62%&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">50&lt;/td>
&lt;td align="left">West Virginia&lt;/td>
&lt;td align="left">87.45%&lt;/td>
&lt;td align="left">75.98%&lt;/td>
&lt;td align="left">-11.47%&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">51&lt;/td>
&lt;td align="left">Illinois&lt;/td>
&lt;td align="left">80.26%&lt;/td>
&lt;td align="left">68.66%&lt;/td>
&lt;td align="left">-11.59%&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">52&lt;/td>
&lt;td align="left">District of Columbia&lt;/td>
&lt;td align="left">144.39%&lt;/td>
&lt;td align="left">128.85%&lt;/td>
&lt;td align="left">-15.54%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Let’s redo the national analysis, but this time with the renter median household income.&lt;/p>
&lt;pre class="r">&lt;code>us_hh_income &amp;lt;- get_ACS(&amp;quot;B25119_003E&amp;quot;,&amp;quot;us&amp;quot;,2005,2019,1)
us_hh_income2 &amp;lt;- get_ACS(&amp;quot;B25119_003E&amp;quot;,&amp;quot;us&amp;quot;,2021,2023,1)
us_hh_income &amp;lt;- rbind(us_hh_income, us_hh_income2)
us_rent &amp;lt;- get_ACS(&amp;quot;B25064_001E&amp;quot;,&amp;quot;us&amp;quot;,2005,2019,1)
us_rent2 &amp;lt;- get_ACS(&amp;quot;B25064_001E&amp;quot;,&amp;quot;us&amp;quot;,2021,2023,1)
us_rent &amp;lt;- rbind(us_rent, us_rent2)
us_hh_income$B25119_003E[1]&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## [1] &amp;quot;28251&amp;quot;&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code>us_hh_income &amp;lt;- us_hh_income %&amp;gt;%
mutate(income_index = as.numeric(B25119_003E) * 100 / 28251)
rent_start &amp;lt;- us_rent$B25064_001E[1]
us_rent &amp;lt;- us_rent %&amp;gt;%
mutate(rent_index = as.numeric(B25064_001E) * 100/ 728)
rent_income &amp;lt;- inner_join(us_hh_income, us_rent, by = &amp;quot;year&amp;quot;)
rent_income$Difference &amp;lt;- rent_income$rent_index - rent_income$income_index
total_change &amp;lt;- rent_income %&amp;gt;%
select(rent_index, income_index, Difference)
total_change &amp;lt;- total_change[18,]
total_change2 &amp;lt;- total_change %&amp;gt;%
mutate(across(everything(), function(x) (paste0(round(x, 2),&amp;quot;%&amp;quot;))))
knitr::kable(total_change2)&lt;/code>&lt;/pre>
&lt;table>
&lt;thead>
&lt;tr class="header">
&lt;th align="left">rent_index&lt;/th>
&lt;th align="left">income_index&lt;/th>
&lt;th align="left">Difference&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr class="odd">
&lt;td align="left">193.13%&lt;/td>
&lt;td align="left">183.07%&lt;/td>
&lt;td align="left">10.06%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;pre class="r">&lt;code>df &amp;lt;- rent_income %&amp;gt;%
select(year, rent_index, income_index) %&amp;gt;%
rename(renter_income_index = income_index) %&amp;gt;%
gather(key = &amp;quot;variable&amp;quot;, value = &amp;quot;value&amp;quot;, -year)
x &amp;lt;- ggplot(df, aes(x = year, y = value)) +
geom_line(aes(color = variable), size = 1.5) +
scale_color_manual(values = c(&amp;quot;Orange&amp;quot;, &amp;quot;cyan3&amp;quot;)) +
labs(caption = &amp;quot;Source: American Community Survey \n @abibler.bsky.social&amp;quot;,
title =
&amp;quot;Median Gross Rent vs. Median Household Income (Renters), \n 2005 = 100&amp;quot;) +
ylab(&amp;quot;Index&amp;quot;) +
xlab(&amp;quot;Year&amp;quot;) +
theme_minimal() +
scale_y_continuous(breaks=(seq(100, 200, 25)), limits = c(100, 200)) +
theme(legend.title = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
plot.title = element_text(size = 18, face = &amp;quot;bold&amp;quot;))
x&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://www.adambibler.com/post/income-vs-rent-growth-part-2/index_files/figure-html/unnamed-chunk-5-1.png" width="672" />&lt;/p>
&lt;p>We can see that this time the difference is “only” 10%, and now the difference seems to be more due to the post-Covid rent spike.&lt;/p></description></item><item><title>Income vs. Rent Growth, 2005 - 2023</title><link>https://www.adambibler.com/post/income-vs-rent-growth-2005-2023/</link><pubDate>Tue, 01 Oct 2024 00:00:00 +0000</pubDate><guid>https://www.adambibler.com/post/income-vs-rent-growth-2005-2023/</guid><description>
&lt;div id="background" class="section level2">
&lt;h2>Background&lt;/h2>
&lt;p>The 1-year ACS data was released &lt;a href="https://www.census.gov/newsroom/press-kits/2024/acs-1-year-estimates.html">last month&lt;/a>. Every year, the release is exciting to me since it means the time series of available data gets a little longer. As someone interested in rent affordability, I thought I would take a look at the overall change in rents and incomes from the inception of the ACS (2005) to the most recent data year (2023).&lt;/p>
&lt;p>This analysis will make use of the ‘get_ACS’ function, described in an earlier &lt;a href="../a-very-simple-function-for-getting-census-acs-data-into-r/">post&lt;/a>.&lt;/p>
&lt;p>First, get the median household income and median gross rent tables. Because the 2020 ACS data was &lt;a href="https://www.census.gov/newsroom/press-releases/2021/changes-2020-acs-1-year.html">not released&lt;/a>, I split each into two separate calls.&lt;/p>
&lt;pre class="r">&lt;code>us_hh_income &amp;lt;- get_ACS(&amp;quot;B19013_001E&amp;quot;,&amp;quot;us&amp;quot;,2005,2019,1)
us_hh_income2 &amp;lt;- get_ACS(&amp;quot;B19013_001E&amp;quot;,&amp;quot;us&amp;quot;,2021,2023,1)
us_hh_income &amp;lt;- rbind(us_hh_income, us_hh_income2)
us_rent &amp;lt;- get_ACS(&amp;quot;B25064_001E&amp;quot;,&amp;quot;us&amp;quot;,2005,2019,1)
us_rent2 &amp;lt;- get_ACS(&amp;quot;B25064_001E&amp;quot;,&amp;quot;us&amp;quot;,2021,2023,1)
us_rent &amp;lt;- rbind(us_rent, us_rent2)&lt;/code>&lt;/pre>
&lt;p>Next, get the starting year values, and convert each time series into an index.&lt;/p>
&lt;pre class="r">&lt;code>us_hh_income$B19013_001E[1]
us_hh_income &amp;lt;- us_hh_income %&amp;gt;%
mutate(income_index = as.numeric(B19013_001E) * 100 / 46242)
rent_start &amp;lt;- us_rent$B25064_001E[1]
us_rent &amp;lt;- us_rent %&amp;gt;%
mutate(rent_index = as.numeric(B25064_001E) * 100/ 728)&lt;/code>&lt;/pre>
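&lt;p>The base values above are typed in by hand (46242 and 728). A small helper that divides by the first element avoids the hard-coding. This is a sketch, not part of the original analysis: the name ‘index_to_base’ is made up, and the sample values are the first three Median Family Income estimates from the ‘get_ACS’ post, kept as character strings the way the API returns them.&lt;/p>
&lt;pre class="r">&lt;code># Hypothetical helper: index a series to its first value (= 100)
index_to_base &amp;lt;- function(x) {
  x &amp;lt;- as.numeric(x)
  x * 100 / x[1]
}
# First three B19113_001E values (2005-2007) from the get_ACS post
fam_income &amp;lt;- c(&amp;quot;55832&amp;quot;, &amp;quot;58526&amp;quot;, &amp;quot;61173&amp;quot;)
round(index_to_base(fam_income), 2)&lt;/code>&lt;/pre>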
&lt;p>Join the income and rent series together.&lt;/p>
&lt;pre class="r">&lt;code>rent_income &amp;lt;- inner_join(us_hh_income, us_rent, by = &amp;quot;year&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>Look at the total difference between the two.&lt;/p>
&lt;pre class="r">&lt;code>rent_income$Difference &amp;lt;- rent_income$rent_index - rent_income$income_index
total_change &amp;lt;- rent_income %&amp;gt;% select(rent_index, income_index, Difference)
total_change &amp;lt;- total_change[18,]
total_change2 &amp;lt;- total_change %&amp;gt;%
mutate(across(everything(), function(x) (paste0(round(x, 2),&amp;quot;%&amp;quot;))))
knitr::kable(total_change2)&lt;/code>&lt;/pre>
&lt;table>
&lt;thead>
&lt;tr class="header">
&lt;th align="left">rent_index&lt;/th>
&lt;th align="left">income_index&lt;/th>
&lt;th align="left">Difference&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr class="odd">
&lt;td align="left">193.13%&lt;/td>
&lt;td align="left">168.07%&lt;/td>
&lt;td align="left">25.06%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
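&lt;p>Row 18 is the 2023 row because the combined series skips 2020: 15 years from 2005 to 2019 plus 3 years from 2021 to 2023. A quick check of that count:&lt;/p>
&lt;pre class="r">&lt;code># Years covered by the two get_ACS calls (no 2020 release)
years &amp;lt;- setdiff(2005:2023, 2020)
length(years)
which(years == 2023)&lt;/code>&lt;/pre>
&lt;p>Both lines return 18, so the hard-coded index would need to change when another year of data is added.&lt;/p>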
&lt;p>Cumulatively, rent has grown about 25 percentage points more than income. But how does this compare over time?&lt;/p>
&lt;pre class="r">&lt;code>library(ggplot2)
df &amp;lt;- rent_income %&amp;gt;%
select(year, rent_index, income_index) %&amp;gt;%
gather(key = &amp;quot;variable&amp;quot;, value = &amp;quot;value&amp;quot;, -year)
x &amp;lt;- ggplot(df, aes(x = year, y = value)) +
geom_line(aes(color = variable), size = 1.5) +
scale_color_manual(values = c(&amp;quot;dodgerblue1&amp;quot;, &amp;quot;Orange&amp;quot;)) +
labs(caption = &amp;quot;Source: American Community Survey \n @abibler.bsky.social&amp;quot;,
title =
&amp;quot;Median Gross Rent vs. Median Household Income, \n 2005 = 100&amp;quot;) +
ylab(&amp;quot;Index&amp;quot;) +
xlab(&amp;quot;Year&amp;quot;) +
theme_minimal() +
scale_y_continuous(breaks=(seq(100, 200, 25)), limits = c(100, 200)) +
theme(legend.title = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
plot.title = element_text(size = 18, face = &amp;quot;bold&amp;quot;))
x&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://www.adambibler.com/post/income-vs-rent-growth-2005-2023/index_files/figure-html/unnamed-chunk-6-1.png" width="672" />&lt;/p>
&lt;p>It looks like much of the difference is explained by the Great Recession, when income fell and rent continued to rise.&lt;/p>
&lt;p>Note that this is just at the national level. It would be interesting to look at state differences. Also, this is comparing rent to income, across all types of households (renters and owners). I’ll tackle just renters in a future post.&lt;/p>
&lt;/div></description></item><item><title>Voucher Locations Part 1</title><link>https://www.adambibler.com/post/voucher-locations-part-1/</link><pubDate>Thu, 08 Feb 2024 00:00:00 +0000</pubDate><guid>https://www.adambibler.com/post/voucher-locations-part-1/</guid><description>
&lt;div id="background" class="section level2">
&lt;h2>Background&lt;/h2>
&lt;p>The Housing Choice Voucher program is the United States’s largest rental assistance program, providing rental subsidies to over 2.3 million households.&lt;/p>
&lt;/div>
&lt;div id="location-of-voucher-households" class="section level2">
&lt;h2>Location of Voucher Households&lt;/h2>
&lt;p>HUD provides geographic data on its assisted households in a variety of ways. HUD’s enterprise GIS service provides voucher locations by &lt;a href="https://hudgis-hud.opendata.arcgis.com/datasets/8d45c34f7f64433586ef6a448d00ca12_17/explore?location=37.982646%2C-112.717602%2C4.58">Census tract&lt;/a>. HUD also provides data on Housing Choice Voucher households (and households in its other direct rental assistance programs) through an annual data set known as the ‘&lt;a href="https://www.huduser.gov/portal/datasets/assthsg.html">Picture of Subsidized Households&lt;/a>.’ For this analysis we’ll look at the Picture data at the state level.&lt;/p>
&lt;p>First we’ll want to set up our necessary packages.&lt;/p>
&lt;pre class="r">&lt;code># set libraries
library(httr)
library(readxl)
library(rjson)
library(tidycensus)
library(tidyverse)
library(tigris)
#&lt;/code>&lt;/pre>
&lt;p>Next, we’ll download the Picture data.&lt;/p>
&lt;pre class="r">&lt;code># # Download Picture of Subsidized Household Data
# # icesTAF::mkdir(&amp;quot;Data&amp;quot;)
# # download.file(&amp;quot;https://www.huduser.gov/portal/datasets/pictures/files/STATE_2023_2020census.xlsx&amp;quot;, &amp;quot;Data/STATE_2023_2020census.xlsx&amp;quot;, mode = &amp;quot;wb&amp;quot;)
#&lt;/code>&lt;/pre>
&lt;p>Then, we’ll read the data into R and examine it.&lt;/p>
&lt;pre class="r">&lt;code>#Read in picture data
state_picture &amp;lt;- read_excel(&amp;quot;Data/STATE_2023_2020census.xlsx&amp;quot;)
#Filter the data to only the HCV program
vouchers_state &amp;lt;- state_picture %&amp;gt;% filter (program_label == &amp;quot;Housing Choice Vouchers&amp;quot;)
#Look at states with the most and least vouchers
vouchers_state &amp;lt;- vouchers_state %&amp;gt;% arrange(desc(number_reported))
head(vouchers_state$States)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## [1] &amp;quot;CA California&amp;quot; &amp;quot;NY New York&amp;quot; &amp;quot;TX Texas&amp;quot; &amp;quot;FL Florida&amp;quot;
## [5] &amp;quot;IL Illinois&amp;quot; &amp;quot;MA Massachusetts&amp;quot;&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code>tail(vouchers_state$States)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## [1] &amp;quot;DE Delaware&amp;quot; &amp;quot;AK Alaska&amp;quot;
## [3] &amp;quot;WY Wyoming&amp;quot; &amp;quot;GU Guam&amp;quot;
## [5] &amp;quot;VI U.S. Virgin Islands&amp;quot; &amp;quot;MP Northern Mariana Islands&amp;quot;&lt;/code>&lt;/pre>
&lt;p>Not surprisingly, California, New York, and Texas have the most voucher households as these states are the most populous. However, the relationship between vouchers and population isn’t &lt;em>quite&lt;/em> perfect, as Texas (and Florida) actually have greater populations than New York. We can download population from the Census Bureau and attach it to the HUD data to examine this more closely.&lt;/p>
&lt;pre class="r">&lt;code># Get population from the ACS, using the tidycensus package, including a shapefile for mapping
state_population &amp;lt;- get_acs(
geography = &amp;quot;state&amp;quot;,
variables = &amp;quot;B01003_001&amp;quot;,
year = 2022,
survey = &amp;quot;acs1&amp;quot;,
geometry = TRUE,
resolution = &amp;quot;20m&amp;quot;
) %&amp;gt;% shift_geometry()&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code># Attach the population to the Picture data
vouchers_pop &amp;lt;- inner_join(state_population, vouchers_state, by = c(&amp;quot;GEOID&amp;quot; = &amp;quot;code&amp;quot;))
vouchers_pop &amp;lt;- vouchers_pop %&amp;gt;% rename(vouchers = number_reported)
# Plot the voucher data by state
ggplot(data = vouchers_pop, aes(fill = vouchers)) +
geom_sf() +
labs(title = &amp;quot;Vouchers By State&amp;quot;,
caption = &amp;quot;Source: HUD Picture of Subsidized Households&amp;quot;) +
scale_fill_continuous(name = &amp;quot;&amp;quot;, label = scales::comma_format()) +
theme_void()&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://www.adambibler.com/post/voucher-locations-part-1/index_files/figure-html/unnamed-chunk-5-1.png" width="672" />&lt;/p>
&lt;pre class="r">&lt;code>ggplot(vouchers_pop, aes(x=estimate, y=vouchers)) +
geom_text(label = vouchers_pop$State) +
geom_smooth(method=lm) +
labs(title = &amp;quot;Population vs. Total Vouchers by State&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://www.adambibler.com/post/voucher-locations-part-1/index_files/figure-html/unnamed-chunk-5-2.png" width="672" />&lt;/p>
&lt;pre class="r">&lt;code>lmpop &amp;lt;- lm(vouchers ~ estimate, data = vouchers_pop)
summary(lmpop)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>##
## Call:
## lm(formula = vouchers ~ estimate, data = vouchers_pop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -60948 -8550 -155 5643 90054
##
## Coefficients:
## Estimate Std. Error t value Pr(&amp;gt;|t|)
## (Intercept) -1.868e+03 3.923e+03 -0.476 0.636
## estimate 7.420e-03 4.023e-04 18.444 &amp;lt;2e-16 ***
## ---
## Signif. codes: 0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1
##
## Residual standard error: 21160 on 50 degrees of freedom
## Multiple R-squared: 0.8719, Adjusted R-squared: 0.8693
## F-statistic: 340.2 on 1 and 50 DF, p-value: &amp;lt; 2.2e-16&lt;/code>&lt;/pre>
&lt;p>Even though population explains about 87% of the variation in vouchers by state, we see that New York and Massachusetts, for example, are overrepresented while states like Texas and Florida are underrepresented.&lt;/p>
&lt;p>Of course, vouchers are not available to everyone but rather to low-income households. It is more likely that differences in low-income population sizes would explain differences in voucher counts better than overall population. We can get low-income population estimates from HUD’s Comprehensive Housing Affordability Strategy (CHAS) data.&lt;/p>
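&lt;p>To read the regression output: the slope of 7.420e-03 works out to roughly 7.4 vouchers per 1,000 residents. A short sketch using the rounded coefficients printed above, applied to a hypothetical state of 10 million people:&lt;/p>
&lt;pre class="r">&lt;code># Rounded coefficients copied from summary(lmpop)
intercept &amp;lt;- -1.868e+03
slope &amp;lt;- 7.420e-03
# Fitted number of vouchers for a given population
fitted_vouchers &amp;lt;- function(population) intercept + slope * population
# Hypothetical population of 10 million
fitted_vouchers(10e6)&lt;/code>&lt;/pre>
&lt;p>That gives about 72,000 vouchers. States above the fitted line, like New York, have more vouchers than their population alone would suggest.&lt;/p>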
&lt;pre class="r">&lt;code># Read in the CHAS data
chas_states &amp;lt;- read_csv(&amp;quot;Data\\CHAS\\2005thru2009-040-csv\\table1.csv&amp;quot;)
#Calculate totals (Adding renters with and without conditions)
chas_states &amp;lt;- chas_states %&amp;gt;% mutate(
fips = substr(geoid, 8, 9),
Total_HHs_LE_30pct = T1_est77 + T1_est113,
Total_HHs_LE_30pct_moe = (T1_moe77^2 + T1_moe113^2)^.5,
Share_HHs_LE_30pct = Total_HHs_LE_30pct / T1_est75,
Total_HHs_LE_50pct = Total_HHs_LE_30pct + T1_est84 + T1_est120,
Total_HHs_LE_50pct_moe = (Total_HHs_LE_30pct_moe^2 + T1_moe84^2 + T1_moe120^2)^.5,
Share_HHs_LE_50pct = Total_HHs_LE_50pct / T1_est75,
Share_HHs_LE_50pct_moe = Total_HHs_LE_50pct_moe / T1_est1,
Total_HHs_LE_80pct = Total_HHs_LE_50pct + T1_est91 + T1_est127,
Share_HHs_LE_80pct = Total_HHs_LE_80pct / T1_est75)
# Attach the CHAS data to the population and voucher data
pop_program &amp;lt;- inner_join(vouchers_pop, chas_states, by = c(&amp;quot;GEOID&amp;quot; = &amp;quot;ST&amp;quot;))
lm_vli &amp;lt;- lm(vouchers ~ Total_HHs_LE_80pct, data = pop_program)
summary(lm_vli)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>##
## Call:
## lm(formula = vouchers ~ Total_HHs_LE_80pct, data = pop_program)
##
## Residuals:
## Min 1Q Median 3Q Max
## -41785 -9400 1133 7250 66621
##
## Coefficients:
## Estimate Std. Error t value Pr(&amp;gt;|t|)
## (Intercept) -6.276e+03 3.644e+03 -1.722 0.0912 .
## Total_HHs_LE_80pct 4.800e-02 2.305e-03 20.826 &amp;lt;2e-16 ***
## ---
## Signif. codes: 0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1
##
## Residual standard error: 19000 on 50 degrees of freedom
## Multiple R-squared: 0.8966, Adjusted R-squared: 0.8946
## F-statistic: 433.7 on 1 and 50 DF, p-value: &amp;lt; 2.2e-16&lt;/code>&lt;/pre>
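&lt;p>A note on the ‘_moe’ columns in the CHAS step: the margin of error for a sum of estimates is approximated as the square root of the sum of the squared margins of error, which is what the ‘(a^2 + b^2)^.5’ expressions compute. A minimal sketch with made-up margins of error (the helper name is an assumption, not a CHAS or tidycensus function):&lt;/p>
&lt;pre class="r">&lt;code># Hypothetical helper: approximate MOE of a sum of estimates
moe_of_sum &amp;lt;- function(...) sqrt(sum(c(...)^2))
# Made-up margins of error for two estimates being added
moe_of_sum(300, 400)&lt;/code>&lt;/pre>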
&lt;pre class="r">&lt;code>#Plot the relationship
ggplot(pop_program, aes(x=Total_HHs_LE_80pct, y=vouchers)) +
geom_text(label = vouchers_pop$State) +
geom_smooth(method=lm) +
labs(title = &amp;quot;Low Income Population vs. Total Vouchers by State&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://www.adambibler.com/post/voucher-locations-part-1/index_files/figure-html/unnamed-chunk-6-1.png" width="672" />&lt;/p>
&lt;p>It turns out that low-income renter population explains the variation in state vouchers only modestly more than overall population.&lt;/p>
&lt;p>We can also look at the share of the very low-income population receiving vouchers by state.&lt;/p>
&lt;pre class="r">&lt;code>pop_program &amp;lt;- pop_program %&amp;gt;% mutate(voucher_share_VLI = vouchers / Total_HHs_LE_50pct)
ggplot(data = pop_program, aes(fill = voucher_share_VLI)) +
geom_sf() +
labs(title = &amp;quot;Vouchers By State&amp;quot;,
caption = &amp;quot;Source: HUD Picture of Subsidized Households&amp;quot;) +
scale_fill_continuous(name = &amp;quot;&amp;quot;, label = scales::comma_format()) +
theme_void()&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://www.adambibler.com/post/voucher-locations-part-1/index_files/figure-html/unnamed-chunk-7-1.png" width="672" />&lt;/p>
&lt;pre class="r">&lt;code>summary(pop_program$voucher_share_VLI)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.02770 0.04459 0.05174 0.06100 0.07199 0.17931&lt;/code>&lt;/pre>
&lt;/div></description></item><item><title>A (Very) Simple Function for Getting Census ACS Data into R</title><link>https://www.adambibler.com/post/a-very-simple-function-for-getting-census-acs-data-into-r/</link><pubDate>Sat, 23 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.adambibler.com/post/a-very-simple-function-for-getting-census-acs-data-into-r/</guid><description>
&lt;script src="https://www.adambibler.com/post/a-very-simple-function-for-getting-census-acs-data-into-r/index_files/header-attrs/header-attrs.js">&lt;/script>
&lt;style type="text/css">
&lt;/style>
&lt;p>The Census Bureau provides data to the public in a number of ways, the most direct being &lt;a href="https://data.census.gov">data.census.gov&lt;/a>. While data.census.gov has improved since its initial launch, browsing and downloading data there is still frustrating. As one example, it seems that as of April 2022 the American Community Survey (ACS) estimates from 2005-2009 are not available.&lt;/p>
&lt;div class="figure">
&lt;img src="data.census.gov_screenshot1.jpg" alt="" />
&lt;p class="caption">The “Years” option only goes back to 2010&lt;/p>
&lt;/div>
&lt;p>Fortunately, the Census Bureau also provides an API for more “sophisticated” users to query and download data. In this post, I share and demonstrate a simple function for downloading ACS data in R via the Census API. Note that there are, of course, entire R packages that do this as well, like &lt;a href="https://cran.r-project.org/web/packages/acs/index.html">acs&lt;/a> and &lt;a href="https://walker-data.com/tidycensus/">tidycensus&lt;/a>. However, if I simply want to grab some data, I find this simple function is sufficient.&lt;/p>
&lt;pre class="r">&lt;code>library(janitor)
library(jsonlite)
library(tidyverse)
get_ACS &amp;lt;- function(vars, geo, start_year, stop_year, vintage) {
out_tibble &amp;lt;- tibble()
years &amp;lt;- seq(from = start_year, to = stop_year)
for (year in years) {
query &amp;lt;- paste0(&amp;quot;https://api.census.gov/data/&amp;quot;,
year,
&amp;quot;/acs/acs&amp;quot;,
vintage,
&amp;quot;?get=&amp;quot;,
vars,
&amp;quot;&amp;amp;for=&amp;quot;,
geo)
myJSON &amp;lt;- fromJSON(query)
myTibble &amp;lt;- as_tibble(myJSON, .name_repair = &amp;quot;minimal&amp;quot;)
myTibble &amp;lt;- janitor::row_to_names(myTibble,1)
myTibble &amp;lt;- myTibble %&amp;gt;% mutate(year = year)
out_tibble &amp;lt;- rbind(out_tibble, myTibble)
}
return(out_tibble)
}&lt;/code>&lt;/pre>
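&lt;p>To see what the function requests, here is the URL it assembles for a single year, using the same arguments as the example that follows (2019, 1-year estimates, table B19113 for the whole country). This only builds the string and does not call the API:&lt;/p>
&lt;pre class="r">&lt;code># Same pieces the function pastes together, for one year
query &amp;lt;- paste0(&amp;quot;https://api.census.gov/data/&amp;quot;,
                2019,
                &amp;quot;/acs/acs&amp;quot;,
                1,
                &amp;quot;?get=&amp;quot;,
                &amp;quot;B19113_001E&amp;quot;,
                &amp;quot;&amp;amp;for=&amp;quot;,
                &amp;quot;us&amp;quot;)
query&lt;/code>&lt;/pre>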
&lt;p>Having defined the function we can now use it! Let’s get the estimate of Median Family Income in the United States (Table B19113&lt;a href="#fn1" class="footnote-ref" id="fnref1">&lt;sup>1&lt;/sup>&lt;/a>) from 2005-2019.&lt;/p>
&lt;pre class="r">&lt;code>us_fam_income &amp;lt;- get_ACS(&amp;quot;B19113_001E&amp;quot;,&amp;quot;us&amp;quot;,2005,2019,1)
knitr::kable(us_fam_income)&lt;/code>&lt;/pre>
&lt;table>
&lt;thead>
&lt;tr class="header">
&lt;th align="left">B19113_001E&lt;/th>
&lt;th align="left">us&lt;/th>
&lt;th align="right">year&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr class="odd">
&lt;td align="left">55832&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2005&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">58526&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2006&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">61173&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2007&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">63366&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2008&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">61082&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2009&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">60609&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2010&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">61455&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2011&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">62527&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2012&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">64030&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2013&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">65910&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2014&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">68260&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2015&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">71062&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2016&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">73891&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2017&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">76401&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2018&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">80944&lt;/td>
&lt;td align="left">1&lt;/td>
&lt;td align="right">2019&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Note that the estimate value is being stored as a character variable. The Census uses character codes like “*” to indicate missing or topcoded values. As is often the case, it is necessary to perform additional cleaning after retrieving the data.&lt;/p>
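&lt;p>As a sketch of that cleaning step, ‘as.numeric’ turns non-numeric codes into NA (with a warning), so the conversion can be done in one pass. The second value below is a made-up placeholder, not an actual API response:&lt;/p>
&lt;pre class="r">&lt;code># First value is the 2005 estimate from the table; &amp;quot;*&amp;quot; is a made-up code
raw &amp;lt;- c(&amp;quot;55832&amp;quot;, &amp;quot;*&amp;quot;)
# Non-numeric strings become NA; suppress the coercion warning
clean &amp;lt;- suppressWarnings(as.numeric(raw))
clean&lt;/code>&lt;/pre>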
&lt;div class="footnotes">
&lt;hr />
&lt;ol>
&lt;li id="fn1">&lt;p>For a list of these table codes, see &lt;a href="https://api.census.gov/data/2019/acs/acs5/groups.html">here&lt;/a>&lt;a href="#fnref1" class="footnote-back">↩︎&lt;/a>&lt;/p>&lt;/li>
&lt;/ol>
&lt;/div></description></item><item><title>Exploring Census Household Pulse Survey Part 1</title><link>https://www.adambibler.com/post/exploring-census-household-pulse-survey-part-1/</link><pubDate>Sun, 30 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.adambibler.com/post/exploring-census-household-pulse-survey-part-1/</guid><description>
&lt;script src="https://www.adambibler.com/post/exploring-census-household-pulse-survey-part-1/index_files/header-attrs/header-attrs.js">&lt;/script>
&lt;div id="background" class="section level2">
&lt;h2>Background&lt;/h2>
&lt;p>The Census Bureau began the Household Pulse Survey to measure the impacts of the Coronavirus Pandemic on the U.S. Household Population. This post will demonstrate some basics of downloading the data, getting it into R, and doing some simple analysis.&lt;/p>
&lt;p>Download this post as an R Markdown file &lt;a href="https://github.com/abibler/Census_Household_Pulse">here&lt;/a>.&lt;/p>
&lt;/div>
&lt;div id="getting-the-data" class="section level2">
&lt;h2>Getting the data&lt;/h2>
&lt;p>This analysis will be based around using the Public Use File (PUF). The PUF contains the person-level responses to the survey and can be used to produce custom estimates. The PUFs for each week are published at &lt;a href="https://www.census.gov/programs-surveys/household-pulse-survey/datasets.html" class="uri">https://www.census.gov/programs-surveys/household-pulse-survey/datasets.html&lt;/a>.&lt;/p>
&lt;p>The code below will download and unzip the data.&lt;/p>
&lt;pre class="r">&lt;code># icesTAF::mkdir(&amp;quot;Data&amp;quot;)
# download.file(&amp;quot;https://www2.census.gov/programs-surveys/demo/datasets/hhp/2020/wk1/HPS_Week01_PUF_CSV.zip&amp;quot;, &amp;quot;Data/HPS_Week01_PUF_CSV.zip&amp;quot;)
# unzip(&amp;quot;Data/HPS_Week01_PUF_CSV.zip&amp;quot;, exdir = &amp;quot;Data/HPS_Week01_PUF_CSV&amp;quot;)&lt;/code>&lt;/pre>
&lt;/div>
&lt;div id="working-with-the-data" class="section level2">
&lt;h2>Working with the data&lt;/h2>
&lt;pre class="r">&lt;code>library(forcats)
library(scales)
library(srvyr)
library(tidyverse)&lt;/code>&lt;/pre>
&lt;p>First, we read in the PUF.&lt;/p>
&lt;pre class="r">&lt;code>puf &amp;lt;- read_csv(file = &amp;quot;Data/HPS_Week01_PUF_CSV/pulse2020_puf_01.csv&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>The PUF contains the &lt;strong>PWEIGHT&lt;/strong> variable to produce (weighted) estimates. To calculate standard errors, though, we also need the “Replicate Weights” file, which we attach to the PUF.&lt;/p>
&lt;pre class="r">&lt;code>repweights &amp;lt;- read_csv(file = &amp;quot;Data/HPS_Week01_PUF_CSV/pulse2020_repwgt_puf_01.csv&amp;quot;)
puf_w_weights &amp;lt;- inner_join(puf, repweights, by = c(&amp;quot;SCRAM&amp;quot;,&amp;quot;WEEK&amp;quot;))&lt;/code>&lt;/pre>
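&lt;p>Because &lt;code>inner_join&lt;/code> silently drops rows that have no match, it can be worth confirming that nothing was lost. Here is a minimal sketch of such a check on toy data. The data frames and the &lt;code>REP1&lt;/code> column are made up for illustration; with the real files, &lt;code>puf&lt;/code> and &lt;code>repweights&lt;/code> would take their place.&lt;/p>
&lt;pre class="r">&lt;code>library(dplyr)

# Toy stand-ins for the PUF and replicate-weight files (values are made up)
toy_puf &amp;lt;- tibble(SCRAM = c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;), WEEK = 1, TENURE = c(3, 1, 3))
toy_rep &amp;lt;- tibble(SCRAM = c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;), WEEK = 1, REP1 = c(10, 20, 30))

toy_joined &amp;lt;- inner_join(toy_puf, toy_rep, by = c(&amp;quot;SCRAM&amp;quot;, &amp;quot;WEEK&amp;quot;))

# Rows present in one file but not the other; both should be empty
unmatched_puf &amp;lt;- anti_join(toy_puf, toy_rep, by = c(&amp;quot;SCRAM&amp;quot;, &amp;quot;WEEK&amp;quot;))
unmatched_rep &amp;lt;- anti_join(toy_rep, toy_puf, by = c(&amp;quot;SCRAM&amp;quot;, &amp;quot;WEEK&amp;quot;))

# Stop with an error if the join lost or failed to match any rows
stopifnot(nrow(toy_joined) == nrow(toy_puf),
          nrow(unmatched_puf) == 0,
          nrow(unmatched_rep) == 0)&lt;/code>&lt;/pre>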
&lt;p>Now, we convert the data frame to a survey object. This allows for calculating summary statistics without re-specifying the weights each time.&lt;/p>
&lt;pre class="r">&lt;code>wgts &amp;lt;- colnames(repweights)[3:length(colnames(repweights))]
survey_puf &amp;lt;- as_survey_rep(puf_w_weights, id = SCRAM, weights = PWEIGHT,
repweights = all_of(wgts), type = &amp;quot;Fay&amp;quot;, rho = 0.5005)&lt;/code>&lt;/pre>
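&lt;p>To make the role of the replicate weights more concrete, here is a minimal base R sketch of a Fay-type replicate standard error. It assumes the usual formula, in which the squared deviations of the replicate estimates from the full-sample estimate are summed and divided by &lt;code>R * (1 - rho)^2&lt;/code>, where &lt;code>R&lt;/code> is the number of replicates. The toy data, the adjustment factors, and the function name &lt;code>fay_se&lt;/code> are made up for illustration; this is not the survey package’s internal code.&lt;/p>
&lt;pre class="r">&lt;code># Toy data: 6 respondents with a 0/1 outcome and a full-sample weight (made up)
y &amp;lt;- c(1, 0, 1, 1, 0, 1)
w &amp;lt;- c(10, 12, 8, 15, 9, 11)

# Four made-up replicate weight columns: each scales the full weight by 0.5 or 1.5
factors &amp;lt;- cbind(c(1.5, 0.5, 1.5, 0.5, 1.5, 0.5),
                 c(0.5, 1.5, 1.5, 0.5, 0.5, 1.5),
                 c(1.5, 1.5, 0.5, 0.5, 1.5, 0.5),
                 c(0.5, 0.5, 1.5, 1.5, 0.5, 1.5))
repw &amp;lt;- w * factors

# Standard error of a weighted total under the assumed Fay formula
fay_se &amp;lt;- function(y, w, repw, rho) {
  theta_full &amp;lt;- sum(w * y)          # full-sample estimate
  theta_reps &amp;lt;- colSums(repw * y)   # one estimate per replicate
  sqrt(sum((theta_reps - theta_full)^2) / (ncol(repw) * (1 - rho)^2))
}

fay_se(y, w, repw, rho = 0.5)
fay_se(y, w, repw, rho = 0.5005)&lt;/code>&lt;/pre>
&lt;p>Under this formula, &lt;code>rho = 0.5&lt;/code> gives a multiplier of &lt;code>4 / R&lt;/code>, while &lt;code>rho = 0.5005&lt;/code> gives a standard error about 0.1 percent larger. The sketch takes deviations around the full-sample estimate; the survey package’s replicate designs have an &lt;code>mse&lt;/code> argument that controls whether deviations are taken around that estimate or around the mean of the replicates.&lt;/p>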
&lt;p>And now we should be all set to start analyzing the data. First, let’s make sure we know what we are doing by estimating something that already appears in the Detailed Tables. Specifically, we’ll look at the &lt;a href="https://www2.census.gov/programs-surveys/demo/tables/hhp/2020/wk1/housing2b_week1.xlsx">Housing 2b table&lt;/a>. The table states that there are &lt;strong>8,918,242&lt;/strong> persons in renter-occupied housing units with No Confidence in the Ability to Pay Next Month’s Rent, &lt;strong>12,571,649&lt;/strong> with Slight Confidence, and so on. We can replicate these numbers and attach descriptive labels to the category codes in the data.&lt;/p>
&lt;pre class="r">&lt;code>renters_payment_confidence &amp;lt;-
survey_puf %&amp;gt;%
filter(WEEK == &amp;quot;1&amp;quot; &amp;amp; TENURE == &amp;quot;3&amp;quot;) %&amp;gt;%
group_by(MORTCONF) %&amp;gt;%
survey_count() %&amp;gt;%
mutate_if(is.numeric, round, digits = 0)
renters_payment_confidence$MORTCONF &amp;lt;- factor(renters_payment_confidence$MORTCONF, labels =
c( &amp;quot;Question Seen But Category Not Collected&amp;quot;,
&amp;quot;Missing / Did Not Report&amp;quot;,
&amp;quot;No Confidence&amp;quot;,
&amp;quot;Slight Confidence&amp;quot;,
&amp;quot;Moderate Confidence&amp;quot;,
&amp;quot;High Confidence&amp;quot;,
&amp;quot;Payment Deferred&amp;quot;))
knitr::kable(renters_payment_confidence, format.args = list(big.mark = &amp;quot;,&amp;quot;))&lt;/code>&lt;/pre>
&lt;table>
&lt;thead>
&lt;tr class="header">
&lt;th align="left">MORTCONF&lt;/th>
&lt;th align="right">n&lt;/th>
&lt;th align="right">n_se&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr class="odd">
&lt;td align="left">Question Seen But Category Not Collected&lt;/td>
&lt;td align="right">170,927&lt;/td>
&lt;td align="right">61,811&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">Missing / Did Not Report&lt;/td>
&lt;td align="right">153,139&lt;/td>
&lt;td align="right">37,374&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">No Confidence&lt;/td>
&lt;td align="right">8,918,242&lt;/td>
&lt;td align="right">377,552&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">Slight Confidence&lt;/td>
&lt;td align="right">12,571,649&lt;/td>
&lt;td align="right">374,676&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">Moderate Confidence&lt;/td>
&lt;td align="right">18,070,862&lt;/td>
&lt;td align="right">480,523&lt;/td>
&lt;/tr>
&lt;tr class="even">
&lt;td align="left">High Confidence&lt;/td>
&lt;td align="right">30,643,777&lt;/td>
&lt;td align="right">609,009&lt;/td>
&lt;/tr>
&lt;tr class="odd">
&lt;td align="left">Payment Deferred&lt;/td>
&lt;td align="right">938,815&lt;/td>
&lt;td align="right">153,909&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The estimates match the published table. However, the standard errors are slightly off. (If anyone knows why, please let me know.)&lt;/p>
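&lt;p>One caveat on the labelling step: &lt;code>factor(x, labels = ...)&lt;/code> matches labels to the sorted codes that actually appear in the data, so the call breaks if any code is absent after filtering. A minimal sketch of a more defensive version passes &lt;code>levels&lt;/code> explicitly. The code-to-label mapping below is inferred from the order of the labels in the call above, with -99 and -88 assumed to be the two missing codes, and should be checked against the data dictionary.&lt;/p>
&lt;pre class="r">&lt;code># Assumed mapping of MORTCONF codes to labels (inferred from the label order above)
mortconf_levels &amp;lt;- c(-99, -88, 1, 2, 3, 4, 5)
mortconf_labels &amp;lt;- c(&amp;quot;Question Seen But Category Not Collected&amp;quot;,
                     &amp;quot;Missing / Did Not Report&amp;quot;,
                     &amp;quot;No Confidence&amp;quot;,
                     &amp;quot;Slight Confidence&amp;quot;,
                     &amp;quot;Moderate Confidence&amp;quot;,
                     &amp;quot;High Confidence&amp;quot;,
                     &amp;quot;Payment Deferred&amp;quot;)

# Toy vector in which several codes never occur (values are made up)
codes &amp;lt;- c(1, 3, -88, 4, 1)

# Explicit levels keep the mapping fixed regardless of which codes are observed
labelled &amp;lt;- factor(codes, levels = mortconf_levels, labels = mortconf_labels)
table(labelled)&lt;/code>&lt;/pre>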
&lt;p>Now we can produce custom estimates. For example, the Pulse asks respondents a series of questions about their mental health. &lt;a href="https://www2.census.gov/programs-surveys/demo/tables/hhp/2020/wk1/health2a_week1.xlsx">Health Table 2a&lt;/a> lists Symptoms of Anxiety By Selected Characteristics. However, the respondent’s housing situation is not one of the characteristics.&lt;/p>
&lt;p>Let’s look at the symptoms of anxiety for renters.&lt;/p>
&lt;pre class="r">&lt;code>anxiety_for_all_renters &amp;lt;- survey_puf %&amp;gt;%
filter(WEEK == &amp;quot;1&amp;quot; &amp;amp; TENURE == &amp;quot;3&amp;quot;) %&amp;gt;%
group_by(ANXIOUS) %&amp;gt;%
summarise(proportion = survey_mean())
anxiety_for_all_renters$Group &amp;lt;- &amp;quot;All Renters&amp;quot;
anxiety_for_renters_w_no_conf &amp;lt;- survey_puf %&amp;gt;%
filter( WEEK == &amp;quot;1&amp;quot; &amp;amp;
TENURE == &amp;quot;3&amp;quot; &amp;amp;
MORTCONF == &amp;quot;1&amp;quot; ) %&amp;gt;%
group_by(ANXIOUS) %&amp;gt;%
summarise(proportion = survey_mean())
anxiety_for_renters_w_no_conf$Group &amp;lt;- &amp;quot;Renters With No Confidence
in Paying Next Month&amp;#39;s Rent&amp;quot;
anxiety_for_renters &amp;lt;- rbind(anxiety_for_all_renters, anxiety_for_renters_w_no_conf)
anxiety_for_renters$ANXIOUS &amp;lt;- factor(anxiety_for_renters$ANXIOUS, labels = c(&amp;quot;Missing&amp;quot;,
&amp;quot;Not at all&amp;quot;,
&amp;quot;Several days&amp;quot;,
&amp;quot;More than half the days&amp;quot;,
&amp;quot;Nearly every day&amp;quot;))
ggplot(anxiety_for_renters, aes(x = ANXIOUS, y = proportion, fill = Group)) +
geom_bar(stat = &amp;quot;identity&amp;quot;, position = &amp;quot;dodge&amp;quot;) +
theme(axis.text.x = element_text(angle = 45)) +
xlab(&amp;quot;Over the last 7 days, how often have you been bothered by the
following problems: Feeling nervous, anxious, or on edge? Would you
say not at all, several days, more than half the days, or nearly every
day?&amp;quot;) +
ylab(&amp;quot;Share of Respondents&amp;quot;) +
scale_y_continuous(labels = scales::percent)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://www.adambibler.com/post/exploring-census-household-pulse-survey-part-1/index_files/figure-html/unnamed-chunk-7-1.png" width="672" />&lt;/p>
&lt;p>As might be expected, renters who report not being confident in their ability to pay next month’s rent also report feeling anxious more often.&lt;/p>
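&lt;p>For intuition about what the grouped &lt;code>survey_mean()&lt;/code> calls return, a weighted share is the sum of the weights in a category divided by the sum of all weights. Here is a minimal base R sketch with made-up responses and weights. It covers the point estimates only; standard errors would still need the replicate weights.&lt;/p>
&lt;pre class="r">&lt;code># Toy responses and person weights (values are made up)
toy &amp;lt;- data.frame(
  ANXIOUS = c(1, 2, 2, 3, 4, 1),
  PWEIGHT = c(100, 250, 150, 200, 50, 250)
)

# Weighted share of each response category
weighted_share &amp;lt;- tapply(toy$PWEIGHT, toy$ANXIOUS, sum) / sum(toy$PWEIGHT)
weighted_share&lt;/code>&lt;/pre>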
&lt;p>One thing to note about the Pulse is that tenure (whether the home is owned or rented) is missing for a large number of the respondents.&lt;/p>
&lt;pre class="r">&lt;code>tenure_w_missing &amp;lt;- survey_puf %&amp;gt;%
filter(WEEK == &amp;quot;1&amp;quot;) %&amp;gt;%
group_by(TENURE) %&amp;gt;%
summarise(proportion = survey_mean())
tenure_w_missing$TENURE &amp;lt;- factor(tenure_w_missing$TENURE, labels =
c( &amp;quot;Question Seen But Category Not Collected&amp;quot;,
&amp;quot;Missing / Did Not Report&amp;quot;,
&amp;quot;Owned free and clear&amp;quot;,
&amp;quot;Owned with a mortgage&amp;quot;,
&amp;quot;Rented&amp;quot;,
&amp;quot;Occupied without payment of rent&amp;quot;))
tenure_w_missing$proportion &amp;lt;- scales::label_percent()(tenure_w_missing$proportion)
tenure_w_missing[,1:2]&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## # A tibble: 6 x 2
## TENURE proportion
## &amp;lt;fct&amp;gt; &amp;lt;chr&amp;gt;
## 1 Question Seen But Category Not Collected 0.5%
## 2 Missing / Did Not Report 10.2%
## 3 Owned free and clear 18.3%
## 4 Owned with a mortgage 40.8%
## 5 Rented 28.7%
## 6 Occupied without payment of rent 1.5%&lt;/code>&lt;/pre>
&lt;p>The output below shows that even though tenure is missing for a large share of respondents, the proportions of owners and renters appear in line with those reported in the &lt;a href="https://data.census.gov/cedsci/table?q=b25003&amp;amp;tid=ACSDT1Y2019.B25003">American Community Survey&lt;/a>.&lt;/p>
&lt;pre class="r">&lt;code>tenure_no_missing &amp;lt;- survey_puf %&amp;gt;%
filter(WEEK == &amp;quot;1&amp;quot; &amp;amp; TENURE != &amp;quot;-88&amp;quot; &amp;amp; TENURE != &amp;quot;-99&amp;quot;) %&amp;gt;%
group_by(TENURE) %&amp;gt;%
summarise(proportion = survey_mean())
tenure_no_missing$TENURE &amp;lt;- factor(tenure_no_missing$TENURE, labels =
c( &amp;quot;Owned free and clear&amp;quot;,
&amp;quot;Owned with a mortgage&amp;quot;,
&amp;quot;Rented&amp;quot;,
&amp;quot;Occupied without payment of rent&amp;quot;))
tenure_no_missing$proportion &amp;lt;- scales::label_percent()(tenure_no_missing$proportion)
tenure_no_missing[,1:2]&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## # A tibble: 4 x 2
## TENURE proportion
## &amp;lt;fct&amp;gt; &amp;lt;chr&amp;gt;
## 1 Owned free and clear 21%
## 2 Owned with a mortgage 46%
## 3 Rented 32%
## 4 Occupied without payment of rent 2%&lt;/code>&lt;/pre>
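&lt;p>The second table is, up to rounding, the first one rescaled: once the two missing categories are dropped, each remaining share is divided by the sum of the remaining shares. Here is a quick check using the rounded percentages printed earlier. Because the inputs are already rounded, the results can differ from the survey-based figures by a point.&lt;/p>
&lt;pre class="r">&lt;code># Rounded shares from the table that includes the missing categories
shares_with_missing &amp;lt;- c(
  &amp;quot;Owned free and clear&amp;quot; = 18.3,
  &amp;quot;Owned with a mortgage&amp;quot; = 40.8,
  &amp;quot;Rented&amp;quot; = 28.7,
  &amp;quot;Occupied without payment of rent&amp;quot; = 1.5
)

# Rescale so the non-missing categories sum to 1
shares_no_missing &amp;lt;- shares_with_missing / sum(shares_with_missing)
round(100 * shares_no_missing, 1)&lt;/code>&lt;/pre>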
&lt;/div></description></item></channel></rss>