Python regex: find a dollar amount and a few surrounding words at the same time
I need to find a dollar amount and the few (3 or 4) words surrounding it at the same time, in one paragraph.
in-process research , development of $184.3 million , charges $120 of million impairment of long-lived assets. see notes 2, 16 and21 consolidated financial statements. income continuingoperations fiscal year ended september 30, 2001 includes netgain on sale of businesses , investments of $276.6 million , net gainon sale of common shares of subsidiary of $64.1 million.
What I want is below: [amount, amount + digit word, 3-4 words before and after the amount].
[$184.3, $184.3 million, research , development of $184.3 million], [$120, $120 of million, charges $120 of million impairment of long-lived assets], [$276.6, $276.6 million, investments of $276.6 million], [$64.1, $64.1 million, subsidiary of $64.1 million.]
Here is what I tried; it only finds the dollar amount.
[\$]{1}\d+\.?\d{0,2}
Thanks!
So let's name the pattern you have:
amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"
The digit word should be defined using the above:
digit_word_patt = amount_patt + r" (\w+)"
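A quick check of these two patterns, as a sketch. The sample string is a shortened piece of the paragraph from the question. Because digit_word_patt contains a capturing group, re.findall would return only the captured word, so re.finditer with group(0) is used here to get the whole match.

```python
import re

# Patterns as defined in the answer.
amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"
digit_word_patt = amount_patt + r" (\w+)"

# Shortened sample taken from the question's paragraph.
sample = "development of $184.3 million , investments of $276.6 million"

# No capturing group here, so findall returns the full matches.
amounts = re.findall(amount_patt, sample)

# group(0) is the whole match: the amount plus the word that follows it.
amount_words = [m.group(0) for m in re.finditer(digit_word_patt, sample)]

print(amounts)
print(amount_words)
```

Note that for a phrase like "$120 of million" this picks up only "$120 of", since the pattern takes a single word after the amount.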
Now, for the surrounding 3-4 words, use the following:
words_patt = r"(?:\S+\s+){3,4}" + amount_patt + r"(?:\s+\S+){3,4}"
You're done! Use these with the re module's methods for string extraction.
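One way to collect all three pieces for each amount, as a sketch rather than the answer's exact method: locate each amount with re.finditer, then take whitespace-separated words from the text on either side of the match. Slicing words this way still works when two amounts sit close together, or when fewer than three words follow an amount at the end of the paragraph. The function name amounts_with_context and the n_words parameter are made up for illustration.

```python
import re

# Amount pattern from the answer: "$", digits/commas, optional decimals.
amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"


def amounts_with_context(text, n_words=3):
    """Return [amount, amount + next word, surrounding words] per amount.

    Hypothetical helper: n_words is how many words to keep on each side.
    """
    results = []
    for m in re.finditer(amount_patt, text):
        amount = m.group(0)
        # Words before and after the match, split on whitespace.
        words_before = text[:m.start()].split()
        words_after = text[m.end():].split()
        before = words_before[-n_words:] if n_words else []
        after = words_after[:n_words]
        # Amount plus the single word that follows it (if any).
        amount_word = " ".join([amount] + words_after[:1])
        context = " ".join(before + [amount] + after)
        results.append([amount, amount_word, context])
    return results


sample = "development of $184.3 million , charges $120 of million impairment"
for row in amounts_with_context(sample):
    print(row)
```

Punctuation that stands alone in the text, such as " , ", counts as a word here because the split is on whitespace only.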