Homework #3 yippee!

  1. I used \s{2,} to search for every space that was 2 or larger. \s found the spaces, the 2 specified that it was two spaces, and the ,comma specified that it was two or more spaces. I then replaced it with a comma followed by a space.

  2. I used (\w+), (\w+), (.*) to capture the first word as the first capture, the second word as the second capture, and the rest of the line as the third capture. I then replaced it with \2 \1 \(\3) to swap the first and last names, as well as putting the university name (capture 3) in parentheses.

  3. I used mp3 as my find, and replaced it with mp3 \r to keep the mp3 and to add a line return after each number.

  4. I used (\d+)\s(.*)(\.mp3) to capture the first digits, ignore the space, capture the rest of the song name, and capture the .mp3 at the end of the song title. I then used \2_\1\3 to make the song name capture first, add an underscore before the digits capture, and keep the .mp3 at the end.

  5. I used (\w)\w+,(\w+),\d+\.\d+(,\d+) to keep the first letter of the genus, get rid of the rest of the genus, keep the species, get rid of the first number, and keep the final comma and number. In the replace box, I used \1_\2\3 to format by the first letter of the genus, underscore species, and keep the final number.

  6. Using the original data, I used (\w)(\w+),(\w{4})(\w+),(\d+).(\d+)(,\d+) to capture everything independently, making sure to capture the first 4 letters of the species name. Then, following the order of the captures, I replaced it with \1_\3\7 to obtain a final result with the first letter of the genus, underscore, first 4 letters of the species, comma, and final number.

  7. Using the original data, I found (\w{3})(\w+),(\w{3})(\w+),(\d+).(\d+)(,\d+) capturing everything independently making sure to specify the letters of each word I wanted to keep. For the replacement, in order to rearrange the digits I did \1\3\7,\5.\6, leaving me with the final correct format.

  8. For the spaces, I typed a space into the find bar and replaced it with nothing. For the special characters, I went character by character using versions of \% until all of the characters were gone. When it came to replacing the random underscores, I made sure to do \_ only on the lines of data. For the NA present in the binary column, I replaced all NA with 0.