Harnessing the power of regular expressions

Woods Hole Oceanographic Institution (online)

May 21, 2020

8:45 am - 12:30 pm EDT

Instructor: Amber York

Helpers: Karen Soenen, Brett Longworth, Stace Beaulieu

General Information

Are you tired of hand-editing your data to fix errors and formatting issues? Harness the power of regular expressions (regex) to search, match, and manipulate your data.

Whether you are new to regular expressions or would like an overview of capabilities to level up your regex game, this workshop is for you! No programming experience is required. These skills can be implemented in many programming languages, text editors, and the command line.

This workshop is aimed at WHOI technical staff, with the goal of improving project efficiency and building technical skills. Each session is limited to 10 people and is held online as a Zoom meeting. Registration is required. Please contact stace@whoi.edu for availability.

Workshop sponsorship: WHOI Academic Programs Doherty Award and a DDVPR Technical Staff Training Award.

When: May 21, 2020

Requirements:

Handouts: These are not required materials but they are useful for reference.

Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please notify the instructors in advance of the workshop if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.

Contact: Please email adyork@whoi.edu for more information.


Why should I learn regular expressions?

Regular expressions can seem like a mysterious super-power but there is no need to be intimidated. We are going to go through this together. Once you get a handle on them they can make life a lot easier for you!

Regular expression xkcd comic
Creative Commons License by xkcd https://xkcd.com/208/

Code of Conduct

We will be using the Carpentries code of conduct for this workshop.

Everyone who participates in this workshop is required to conform to the Code of Conduct.


Surveys

Please be sure to complete these surveys before and after the workshop. If you already filled out a survey for the WHOI workshop on May 22nd, you don't have to fill this out again.

Pre-workshop Survey

Post-workshop Survey



Setup

We recommend that you download and install the free Zoom conference software (see Joining a meeting in the Zoom Help Center). You will also need an up-to-date web browser. No other setup is required.

Schedule and Syllabus

This workshop will use content from the Library Carpentry Regular Expression lesson. See https://carpentries.org/ for more information about the Carpentries organization.

Participants will receive an overview of regular expression capabilities and where they can be used (languages, command line, text-editors, etc.).

We will cover regular expression patterns and syntax as well as strategies for forming regular expression patterns. Using an online regular expression editor we will see in real-time how our patterns match and manipulate data.

Participants will be given exercises to practice regular expression skills.
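As a small taste of what "match and manipulate" can look like outside the online editor, here is a minimal Python sketch. It is an illustration only, not the workshop's material: the sample text, the addresses, and the deliberately simple email-like pattern are all assumptions made up for this example.

```python
import re

# Hypothetical sample text; the addresses are invented for illustration.
text = "Contact alice@example.org or bob.smith@example.com for details."

# A deliberately simple email-like pattern (an assumption, not a validator):
# word/dot/plus/hyphen characters, an "@", then a dotted domain name.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Matching and extracting: collect every substring that fits the pattern.
emails = EMAIL_PATTERN.findall(text)

# Substitution: replace each comma (and any spaces around it) with a newline,
# turning a comma-delimited list into one item per line.
items = "CTD, bottle,  net tow"
lines = re.sub(r"\s*,\s*", "\n", items)

print(emails)
print(lines)
```

The same two patterns can be pasted into an online regex editor or a text editor's find-and-replace; only the surrounding Python is specific to this sketch.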


08:45 0. Intro Meet your instructor and helpers. Learn what resources and tools we will be using during this workshop.
09:00 1. Regular Expressions (RegEx) Intro What are regular expressions, and what can they do? How can you imagine using regular expressions in your work?
09:30 2. Matching & Extracting

How can you use regular expressions to match and extract strings?

  • Match words, emails, and phone numbers in the Code of Conduct using regex101.com
  • Exercise finding email addresses using regex101.com
09:45 Break 15 min break
10:00 2. Matching & Substitution

RegEx challenges based upon your data examples, and past data submissions to BCO-DMO.

  • Convert a comma delimited list into separate lines.
  • Modify timestamps to ISO format and add a time zone.
  • Transform a file list into a csv table. Split data in one column into separate data columns (e.g. KM1208_station1_cast5.csv -> csv table with columns cruise_id,station,cast).
  • Match genus, species, and an identifier in a sample name to add a scientific name and identifier column.
10:45 Break 15 min break
11:00 3. "Choose your own adventure!"

Pick activities and we will share our results.

  • Pick one or more exercises from Exercises
  • Or come up with your own example.
11:30 4. Going Further

We will cover some more concepts and resources you can consult when using regular expressions in your work.


11:45 Break 15 min break
12:00 Questions?
12:15 Wrap-Up & Resources
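
The file-list challenge in the 10:00 session (KM1208_station1_cast5.csv -> columns cruise_id,station,cast) could be approached along these lines in Python. This is a hedged sketch, not the workshop's solution: the function name `file_list_to_csv`, the second and third file names, and the assumption that every name follows the `<cruise>_station<N>_cast<N>.csv` shape are hypothetical.

```python
import re

# Assumed file name shape: <cruise_id>_station<digits>_cast<digits>.csv
# Named groups label the three pieces we want as columns.
FILENAME_PATTERN = re.compile(
    r"^(?P<cruise_id>[^_]+)_station(?P<station>\d+)_cast(?P<cast>\d+)\.csv$"
)

def file_list_to_csv(filenames):
    """Build csv text with columns cruise_id,station,cast from file names.

    Names that do not fit the assumed shape are skipped.
    """
    rows = ["cruise_id,station,cast"]
    for name in filenames:
        match = FILENAME_PATTERN.match(name)
        if match is None:
            continue  # not a station/cast file under our assumption
        rows.append(",".join(match.group("cruise_id", "station", "cast")))
    return "\n".join(rows)

# The first name comes from the schedule; the others are made up.
table = file_list_to_csv(
    ["KM1208_station1_cast5.csv", "KM1208_station2_cast1.csv", "notes.txt"]
)
print(table)
```

Named groups keep the pattern readable and make it clear which part of the file name feeds which column; numbered groups would work equally well.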