MET581 Lecture 04 Homework
Wrangling Data 2
This document contains all questions for lesson 4. Please create a Quarto document containing all text, code, and output used to answer the questions.
Explain how escapes work in R with respect to regular expressions.
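As a starting point for this question, the sketch below shows the two layers of escaping on a made-up character vector (the vector x and the patterns are illustrative assumptions, not part of the homework data). It uses base R grepl(); raw strings need R 4.0.0 or later.

```r
# Illustrative input (hypothetical values): the third element is a, backslash, b
x <- c("a.b", "axb", "a\\b")

# An unescaped "." is a regex metacharacter that matches any single character
any_char <- grepl("a.b", x)

# "\\." in the R string becomes the regex \. which matches a literal dot
literal_dot <- grepl("a\\.b", x)

# A literal backslash is \\ in the regex, so four backslashes in the R string
literal_backslash <- grepl("a\\\\b", x)

# Raw strings (R >= 4.0.0) remove the string-level layer of escaping
literal_dot_raw <- grepl(r"(a\.b)", x)

# fixed = TRUE switches off regex interpretation altogether
literal_dot_fixed <- grepl("a.b", x, fixed = TRUE)
```

Printing each result next to x is a quick way to see which layer (the R string or the regular expression) consumed each backslash.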
List and describe the tidyverse commands used to join two datasets together. What two key arguments can you use to prevent mistakes in your joins?
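To make the differences between join verbs concrete, here is a minimal sketch on two hypothetical toy tables (flights_toy and carriers_toy are invented for illustration and are not the nycflights13 data). It covers only some of the dplyr join functions and passes by explicitly in each call.

```r
library(dplyr)

# Hypothetical toy tables, not the homework data
flights_toy <- tibble(
  flight  = c(1, 2, 3),
  carrier = c("AA", "UA", "ZZ")
)
carriers_toy <- tibble(
  carrier = c("AA", "UA", "DL"),
  name    = c("American", "United", "Delta")
)

# Mutating joins add columns from y and differ in which unmatched rows they keep
left  <- left_join(flights_toy, carriers_toy, by = "carrier")   # every row of x
inner <- inner_join(flights_toy, carriers_toy, by = "carrier")  # matched rows only
full  <- full_join(flights_toy, carriers_toy, by = "carrier")   # every row of x and y

# Filtering joins keep or drop rows of x and add no columns
matched   <- semi_join(flights_toy, carriers_toy, by = "carrier")
unmatched <- anti_join(flights_toy, carriers_toy, by = "carrier")
```

Comparing nrow() across the results shows how the unmatched carriers ("ZZ" in the first table, "DL" in the second) are handled by each verb.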
Load the nycflights13 package and join the datasets flights and airlines together, selecting only the columns year, month, day, hour, origin, tailnum and carrier from flights. Can you do the same but using the mutate() function instead?
In the starwars dataset and in one command, add two new columns:
- one column that converts the name column to lower case
- one column that converts the eye_color column to upper case
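The pattern of creating several columns in a single command can be sketched on a hypothetical toy table (people, and its columns, are invented for illustration). The sketch assumes the stringr case helpers str_to_lower() and str_to_upper(); base R tolower() and toupper() would work the same way.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; swap in the real dataset for the homework
people <- tibble(
  name   = c("Ada Lovelace", "Alan Turing"),
  colour = c("Blue", "brown")
)

# A single mutate() call can create several columns, separated by commas
people2 <- mutate(
  people,
  name_lower   = str_to_lower(name),
  colour_upper = str_to_upper(colour)
)
```

The original columns are left in place, and the two new columns are appended to the right of the table.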
- Download the gene annotation file from the NCBI found here, unzip the file and load NCBI37.3.gene.loc into R
- 5a. Add the column headers: Entrez_Gene_ID, CHR, BP_START, BP_END, STRAND, GENE_NAME
- 5b. How many genes are on the positive strand?
- 5c. How many genes have names that begin and end with a letter?
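The workflow for question 5 can be sketched in base R on a few made-up rows. This assumes the file is whitespace-delimited with six columns and no header row, which should be checked against the real file; toy_lines and its values are hypothetical. For the real data, replace the text argument with the path to the unzipped file.

```r
# Hypothetical rows laid out like a six-column, header-less gene location file
toy_lines <- c(
  "1 1 100 200 + GENE1",
  "2 1 300 400 - ABC",
  "3 2 500 600 + 7XYZ"
)

headers <- c("Entrez_Gene_ID", "CHR", "BP_START", "BP_END", "STRAND", "GENE_NAME")

# header = FALSE keeps the first row as data; col.names supplies the headers
genes <- read.table(
  text = toy_lines,
  header = FALSE,
  col.names = headers,
  colClasses = c("character", "character", "integer",
                 "integer", "character", "character")
)

# Count rows on the positive strand
n_positive <- sum(genes$STRAND == "+")

# Count names whose first and last characters are both letters;
# the optional group lets a single-letter name match as well
n_letter_ends <- sum(grepl("^[A-Za-z](.*[A-Za-z])?$", genes$GENE_NAME))
```

Reading CHR as character rather than integer is a deliberate choice here, since chromosome labels such as X and Y are not numeric.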