This is a series of research modules designed both to teach beginning R users how to work with data and to lead undergraduates through research by constructing case studies of individual counties using public data. The modules are designed to be completed asynchronously. If students work on them independently, weekly debriefing sessions help work through any difficulties they encounter. Students in a class with 26 required research hours typically completed 3 to 4 modules, and also prepared and presented both a lightning talk and a team oral talk or poster of their research. An accompanying peer-reviewed publication about these materials and their delivery is in preparation.

Contents

Each research module is contained in a separate folder containing a PDF of the instructions provided to students, an editable Quarto source file for the instructions, and instructor notes that contain the data sources and notes on teaching the modules.

Module 1: Getting Started in R
Objectives:
Download R and RStudio on your personal computer.
Create a single-value object, a vector, and a data frame.
Use number and character data.
Produce a Quarto report in PDF format to save a record of your work.

Module 2: Navigating Data
Objectives:
Read data into R.
Use commands that tell us information about a data frame.
Create subsets of data based on positions and conditions.
Use subsets to calculate metrics from two data sources.
Report on the average age of residents, average age of agricultural producers, population size, percentage of residents in agricultural producer households, and percentage of acres used for agriculture in your case study county.

Module 3: Agricultural Statistics: Producers and Farms
Objectives:
Review percent calculations used in Module 2 to calculate statistics on producer gender and race.
Use unique to calculate the variety of commodities produced by a county.
Use ifelse to assign categories and aggregate to calculate summary statistics for those categories (farm size, farm ownership).

Making Figures
At this point, students may need to make figures for a presentation of their research. The attached tutorial is not an in-depth module on data visualization, but merely a template for students who need to make figures with limited time.
Objectives:
Make comparison plots with ggplot2.

Module 4: Agricultural Statistics: Data Linking and Trends Through Time
Objectives:
Use subsets, joins, and calculations across rows to quantify 20-year change in farm size and ownership.
Use calculations based on a previous row (with lag) to quantify 20-year change in 5-year intervals for agricultural land use and agricultural sales.
Use the dplyr package for data cleaning.

Module 5: Demographic Data
Objectives:
Clean data by extracting column names from a multi-row header, filtering with %in% statements, and matching variable codes to their meanings.
Increase reproducibility by using the tidycensus package to pull data from the online American Community Survey data portal.
Calculate employment metrics (percent residents employed in agriculture, manufacturing, hospitality, and technology), residents over retirement age, and net migration rate.

Module 6: Reproducible Data Retrieval: Connecting to the Cloud
Objectives:
Navigate the US Census API to locate variables of interest from the American Community Survey.
Use the tidycensus package to pull data from the American Community Survey and calculate a descriptive metric of interest.
Use the NASS Quick Stats page to locate variables of interest from the Census of Agriculture.
Use the rnassqs package to pull data from the Census of Agriculture and calculate a descriptive metric of interest.

Module 7: Working with Spatial Data in R
Objectives:
Use the sf package to explore and visualize vector data.
Use the terra package to explore and visualize raster data.
Use sf and tigris to retrieve a shapefile of a case study county.
Use a shapefile to crop a raster.
Calculate summary statistics for a raster.

Module 8: Spatial Data: Land Cover, Ownership, and Quality
Objectives:
Use spatial summary statistics to calculate average agricultural quality.
Crop vector data and calculate land management percentages.
Use freq to calculate land cover percentages.

Module 9: Integrating Migration Data
Objectives:
Calculate statistics for multiple counties at once using summarise.
Determine the difference in population size and density from source counties to study area counties.
Join land cost data to migration data and calculate the difference in land value, using terra::extract.
Determine the average household size of source counties.
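
As a rough illustration of the Module 3 objectives (unique, ifelse, and aggregate), the following base R sketch counts distinct commodities, assigns each farm a size category, and summarises sales by category. It is not taken from the module materials: the farms data frame, its column names, its values, and the 100-acre cutoff are all invented for the example.

```r
# Hypothetical farm-level data; column names and values are invented for illustration.
farms <- data.frame(
  commodity = c("corn", "wheat", "corn", "hay", "wheat", "hay"),
  acres     = c(40, 320, 1200, 15, 600, 90),
  sales_usd = c(12000, 85000, 410000, 3000, 150000, 20000)
)

# unique() keeps each distinct commodity once; length() counts them.
n_commodities <- length(unique(farms$commodity))

# ifelse() assigns a category to every row.
# The 100-acre cutoff is an assumption, not a definition from the modules.
farms$size_class <- ifelse(farms$acres < 100, "small", "large")

# aggregate() computes a summary statistic (here the mean) for each category.
mean_sales <- aggregate(sales_usd ~ size_class, data = farms, FUN = mean)

print(n_commodities)
print(mean_sales)
```

The same pattern would apply to a real county table by swapping in the actual column names and a category rule appropriate to the data.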
Carolyn Koehn
Boise State University
Carolyn Koehn (Tue,) studied this question.
synapsesocial.com/papers/6a211689d499ed480b16f81a — DOI: https://doi.org/10.25334/k6zp-5b29