Python regex: match and replace the beginning and end of a string but keep the middle


I have a DataFrame of holiday names. The problem is that some holidays are observed on a different day rather than on the day of the holiday itself. Here are some example problems:

1  "independence day (observed)"
2  "christmas eve, christmas day (observed)"
3  "new year's eve, new year's day (observed)"
4  "martin luther king, jr. day"

I want to replace ' (observed)' with '', and also remove everything before the comma, but only if ' (observed)' is matched. The output should be:

1  "independence day"
2  "christmas day"
3  "new year's day"
4  "martin luther king, jr. day"

I am able to do both independently:

(foo['holiday']
    .replace(to_replace=r' \(observed\)', value='', regex=True)
    .replace(to_replace=r'.+, ', value='', regex=True))

But this causes a problem for 'martin luther king, jr. day', because the second replacement runs whether or not ' (observed)' was matched.
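The failure can be reproduced without pandas. This is a minimal sketch that uses plain re.sub as a stand-in for the two chained replace calls; the helper name chained_replace is made up for illustration.

```python
import re

def chained_replace(name):
    # Step 1: drop the " (observed)" suffix if it is present.
    name = re.sub(r' \(observed\)', '', name)
    # Step 2: drop everything up to the last ", ". This runs unconditionally,
    # so it also truncates names that legitimately contain a comma.
    return re.sub(r'.+, ', '', name)

print(chained_replace("christmas eve, christmas day (observed)"))  # christmas day
print(chained_replace("martin luther king, jr. day"))              # jr. day
```

The second call shows the unwanted truncation: the two steps are not tied together, so the comma rule fires even when there was no ' (observed)' suffix.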

replace.py

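import re

holidays = [
    "independence day (observed)",
    "christmas eve, christmas day (observed)",
    "new year's eve, new year's day (observed)",
    "martin luther king, jr. day",
]

for holiday in holidays:
    # Keep only group 2 (the holiday name). Strings that do not end in
    # " (observed)" do not match and are printed unchanged.
    print(re.sub(r'^(.*?, )?(.*?)( \(observed\))$', r'\2', holiday))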

output

> python replace.py
independence day
christmas day
new year's day
martin luther king, jr. day

explanation

  • ^: Match at the start of the string.
  • (.*?, )?: Match anything followed by a comma and a space. Make it a lazy match, so it doesn't consume any portion of the string we want to keep. The last ? makes the whole group optional, because some of the sample input doesn't have a comma at all.
  • (.*?): Grab the part we want to use later in a capturing group. This part is also a lazy match because...
  • ( \(observed\)): Only some strings have " (observed)" on the end, so we declare it in a separate, required group here. The lazy match in the prior piece won't consume it, and strings without it fail to match and are left unchanged.
  • $: Match at the end of the string.
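To see what each group captures, the pattern can be compiled and inspected directly. This is a small sketch; the names OBSERVED and strip_observed are hypothetical helpers, not part of the original answer.

```python
import re

# Same pattern as in replace.py, compiled once for reuse.
OBSERVED = re.compile(r'^(.*?, )?(.*?)( \(observed\))$')

def strip_observed(name):
    # Replace the whole match with group 2; names that do not end in
    # " (observed)" do not match and are returned unchanged.
    return OBSERVED.sub(r'\2', name)

match = OBSERVED.match("christmas eve, christmas day (observed)")
print(match.groups())  # ('christmas eve, ', 'christmas day', ' (observed)')

print(strip_observed("independence day (observed)"))  # independence day
print(strip_observed("martin luther king, jr. day"))  # martin luther king, jr. day
```

For the DataFrame column in the question, the same pattern and the r'\2' replacement should also work with foo['holiday'].str.replace(..., regex=True), assuming the column holds plain strings.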
