Python 2.7: extracting data from part of a string using a regex
Let's import the re module.
import re
Assume there's a string containing data.
data = '''mike: jan 25.1, feb 24.3, mar 29.0
rob: jan 22.3, feb 20.0, mar 22.0
nick: jan 23.4, feb 22.0, mar 23.4'''
For example, I want to extract the floats from rob's line only.
name = 'rob'
I'd do it like this:
def data_extractor(name, data):
    return re.findall(r'\d+\.\d+', re.findall(r'{}.*'.format(name), data)[0])
The output is ['22.3', '20.0', '22.0'].
Is this way Pythonic, or should it be improved somehow? It does the job, but I'm not sure about the appropriateness of such code.
Thanks for your time.
A non-regex way consists in splitting the data into lines, trimming them, and then checking which one starts with rob. Once it is found, grab the float values:
import re

data = '''mike: jan 25.1, feb 24.3, mar 29.0
rob: jan 22.3, feb 20.0, mar 22.0
nick: jan 23.4, feb 22.0, mar 23.4'''
name = 'rob'

# Split into lines and trim surrounding whitespace.
lines = [line.strip() for line in data.split("\n")]
for l in lines:
    # Only the line that starts with the name is scanned for floats.
    if l.startswith(name):
        print(re.findall(r'\d+\.\d+', l))  # => ['22.3', '20.0', '22.0']
See the Python demo.
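The loop above could also be wrapped in a reusable function. This is only a sketch under a few assumptions that are not in the original post: the function name extract_floats is hypothetical, every name is followed by a colon (so 'rob' does not also match a line starting with 'robert'), and a missing name should give an empty list rather than an IndexError.

```python
import re

DATA = '''mike: jan 25.1, feb 24.3, mar 29.0
rob: jan 22.3, feb 20.0, mar 22.0
nick: jan 23.4, feb 22.0, mar 23.4'''


def extract_floats(name, data):
    """Return the decimal numbers on the first line starting with 'name:'.

    Hypothetical helper: assumes one record per line and a colon after
    the name. Returns an empty list when no line matches.
    """
    prefix = name + ':'
    for line in data.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # Convert the matched strings to real floats.
            return [float(value) for value in re.findall(r'\d+\.\d+', line)]
    return []


print(extract_floats('rob', DATA))   # [22.3, 20.0, 22.0]
print(extract_floats('anna', DATA))  # []
```

Returning floats instead of strings is a design choice. Drop the float() call if the original string output is preferred.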
If you want to use a purely regex way, you may use the PyPI regex module with a \G based regex:
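import regex  # third-party module from PyPI, not the built-in re

data = '''mike: jan 25.1, feb 24.3, mar 29.0
rob: jan 22.3, feb 20.0, mar 22.0
nick: jan 23.4, feb 22.0, mar 23.4'''
name = 'rob'

# \G anchors each match at the end of the previous one, so matching starts
# at the name and continues along that line only.
rx = r'(?:\G(?!\A)|{}).*?(\d+\.\d+)'.format(regex.escape(name))
print(regex.findall(rx, data))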
This pattern matches:
(?:\G(?!\A)|{}) - the end of the last successful match or the name contents
.*? - 0+ chars other than line break chars, as few as possible
(\d+\.\d+) - Group 1 (just the value findall will return) matching 1+ digits, a ., and 1+ digits.
The regex.escape(name) call will escape chars like ( and ) that might appear in name.
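To illustrate why the name is escaped, here is a small sketch. It uses the standard library's re.escape in place of regex.escape so that it runs without the third-party module, and both the name 'rob (jr)' and the two-line data string are made up for the example.

```python
import re

# Made-up data: one name contains regex metacharacters.
data = '''rob (jr): jan 21.5, feb 19.0
rob: jan 22.3, feb 20.0'''

name = 'rob (jr)'

# Unescaped, "(jr)" is treated as a capturing group, so the pattern
# looks for the literal text "rob jr" and finds nothing.
unescaped = re.search(r'{}.*'.format(name), data)

# Escaped, the parentheses are matched literally.
escaped = re.search(r'{}.*'.format(re.escape(name)), data)

print(unescaped)         # None
print(escaped.group(0))  # rob (jr): jan 21.5, feb 19.0
```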