Regex for 0 or more of a set where at least one of a subset is mandatory
I'm using regex in Notepad++ (basically PCRE syntax) to find arbitrary-length runs of a set of characters. However, each run must contain at least one of a subset of those characters.
For example, using the set [abcdefg], the string can contain 0 or more of a, b, c, or d, but must contain at least one of e, f, or g.
Currently I'm using [abcd]*[efg][abcd]*, i.e. specifying the optional ones before and after the mandatory ones. Is there a more concise way to specify this? (I'm using sets of diacritics etc. that are a pain to modify, so I'd like to use them as little as possible... The string I actually use doesn't render well below. I could use the \x{0000} syntax, but that is verbose.)
[ּֽׁׂׅ֑ׄ]*[ ִ ֶ ַ ֻ][ּֽׁׂׅ֑ׄ]*
This is shorter, and more correct:
[a-g]*[efg][a-g]*
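One way to see why the wider class can be called more correct: the question's pattern allows only a single mandatory character per match, so a run that contains several of them gets cut short. The following is a minimal sketch in Python's re module (an assumption for illustration; the thread itself targets Notepad++), with a made-up sample string:

```python
import re

# Pattern from the question: optional set on both sides of ONE mandatory char.
narrow = re.compile(r"[abcd]*[efg][abcd]*")
# Pattern from the answer: the full set on both sides of a mandatory char.
wide = re.compile(r"[a-g]*[efg][a-g]*")

sample = "abefgcd"  # hypothetical run with three mandatory characters in a row

narrow_match = narrow.search(sample).group()
wide_match = wide.search(sample).group()

print(narrow_match)  # stops after the first mandatory character
print(wide_match)    # covers the whole run

# A run with no mandatory character is still rejected by the wide pattern.
print(wide.search("abcd"))
```

The wide pattern relies on backtracking: the first [a-g]* grabs everything, then gives characters back until [efg] can match.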
Plus, you can bookend it with a whitespace boundary:
(?<!\S)[a-g]*[efg][a-g]*(?!\S)
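As a rough illustration of the whitespace boundary, the sketch below assumes the intent is that a run must be delimited by whitespace or by the ends of the text, which is commonly written with negative lookarounds on \S (uppercase). It uses Python's re module and a made-up sample string:

```python
import re

# (?<!\S) : no non-whitespace character immediately before the run
# (?!\S)  : no non-whitespace character immediately after the run
bounded = re.compile(r"(?<!\S)[a-g]*[efg][a-g]*(?!\S)")

# Hypothetical input: "xabe" has a foreign character attached,
# and "abcd" lacks a mandatory character.
text = "abe xabe gfa abcd"

runs = bounded.findall(text)
print(runs)
```

Only the tokens made up entirely of set characters, with at least one of e, f, or g, are reported.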
Update for the Hebrew characters:
The equivalent literal regex is
[ִֶַֻּֽׁׂ֑ׅׄ]*[ִֶַֻ][ִֶַֻּֽׁׂ֑ׅׄ]*
but it doesn't render well.
The better choice is to convert it to codepoint notation:
[\x{591}\x{5b4}\x{5b6}-\x{5b7}\x{5bb}-\x{5bd}\x{5c1}-\x{5c2}\x{5c4}-\x{5c5}]*[\x{5b4}\x{5b6}-\x{5b7}\x{5bb}][\x{591}\x{5b4}\x{5b6}-\x{5b7}\x{5bb}-\x{5bd}\x{5c1}-\x{5c2}\x{5c4}-\x{5c5}]*
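Writing such classes by hand is error-prone, so one option is to generate them. The sketch below is only an illustration: the helper name to_codepoint_class is hypothetical, and the two character sets are taken from the codepoints in the regex above. Python is used purely to build the string; Python's own re module does not accept the \x{...} form, so the output is meant for pasting into Notepad++ or another PCRE-style engine.

```python
def to_codepoint_class(chars):
    """Build a PCRE-style character class using \\x{...} escapes.

    Consecutive codepoints are collapsed into ranges.
    """
    points = sorted({ord(c) for c in chars})
    runs = []  # list of [first, last] codepoint pairs
    for cp in points:
        if runs and cp == runs[-1][1] + 1:
            runs[-1][1] = cp          # extend the current range
        else:
            runs.append([cp, cp])     # start a new range
    parts = []
    for lo, hi in runs:
        if lo == hi:
            parts.append("\\x{%x}" % lo)
        else:
            parts.append("\\x{%x}-\\x{%x}" % (lo, hi))
    return "[" + "".join(parts) + "]"


# Mandatory marks, then the full set (mandatory plus optional marks).
mandatory = "\u05b4\u05b6\u05b7\u05bb"
full_set = mandatory + "\u0591\u05bc\u05bd\u05c1\u05c2\u05c4\u05c5"

full_class = to_codepoint_class(full_set)
mandatory_class = to_codepoint_class(mandatory)
pattern = full_class + "*" + mandatory_class + full_class + "*"
print(pattern)
```

Keeping the sets as plain strings means a character can be added or removed in one place and the three-part pattern regenerated.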
Expanded:
[\x{591}\x{5b4}\x{5b6}-\x{5b7}\x{5bb}-\x{5bd}\x{5c1}-\x{5c2}\x{5c4}-\x{5c5}]* [\x{5b4}\x{5b6}-\x{5b7}\x{5bb}] [\x{591}\x{5b4}\x{5b6}-\x{5b7}\x{5bb}-\x{5bd}\x{5c1}-\x{5c2}\x{5c4}-\x{5c5}]*