PowerShell - How can I extract a string from a text file and rename a file with it?
For a project I am working on, I have thousands of forms (.pdf) that I need to rename using the contents of the forms.
So far, I have run OCR on them and exported the content to text files. Each PDF form has a .txt file of the same name containing the information. I would like to use PowerShell (if possible) to extract a specific part of the text file and rename the PDF file with it, but I am not sure how to do that.
To give a better idea of what I'm working on, the form contained in the PDF and text files (e.g. 12345.pdf and 12345.txt) looks like this:
~~~
constituency: xxxyyyzzz
polling station: abc def ghi (001)
stream: 123
~~~
What I need is to extract the polling station name and rename the PDF file with it:
"12345.pdf" -> "abc_def_ghi_(001).pdf"
So I need to figure out how to extract the string between "station:" and "stream:" from 12345.txt. To make things a bit more complicated, the text files I want to extract the string from have some irregularities when it comes to spacing.
For example, the previous form may look like this in the text file:
~~~
constit uency: xxxyyyzzz
polling stat i on: abc de f ghi (00 1)
s tream: 12 3
~~~
Fortunately, all the letters seem to be intact.
So, I would like to learn how to extract the string containing the polling station name from these text files and rename the corresponding PDF files with it.
Thanks for the help.
'polling station: abc def ghi (001)' | Select-String ' station: (.+)' | ForEach-Object { "{0}.pdf" -f ($_.Matches[0].Groups[1].Value -replace ' ', '_') } # outputs 'abc_def_ghi_(001).pdf'
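To run the same idea over every file, something along these lines might work. It is only a sketch under a few assumptions: each PDF sits next to its .txt file in one folder, every letter of the "station" label survives OCR (as the question says), and the only irregularity is stray whitespace. `Get-StationFileName` and `Rename-FormPdf` are made-up names, not built-in cmdlets. The label pattern allows optional whitespace between its letters, and spaces inside a parenthesised number are dropped. Stray spaces inside the words of the station name itself (such as "de f") cannot be told apart from real ones, so those would still need checking against a list of known station names.

```powershell
# Hypothetical helper: builds the new PDF name from the OCR text lines of one form.
function Get-StationFileName {
    param([string[]]$Lines)

    # Allow stray whitespace between the letters of the label, e.g. "s tation:".
    $pattern = ('station'.ToCharArray() -join '\s*') + '\s*:\s*(.+)'

    # Removes whitespace from a matched "(...)" group: "(00 1)" -> "(001)".
    [System.Text.RegularExpressions.MatchEvaluator]$stripSpaces = {
        param($m) $m.Value -replace '\s', ''
    }

    foreach ($line in $Lines) {
        if ($line -match $pattern) {
            $name = $Matches[1].Trim()
            $name = [regex]::Replace($name, '\([\d\s]+\)', $stripSpaces)
            # Collapse each run of whitespace into a single underscore.
            return (($name -replace '\s+', '_') + '.pdf')
        }
    }
}

# Hypothetical wrapper: renames every PDF in $Folder that has a matching .txt file.
function Rename-FormPdf {
    [CmdletBinding(SupportsShouldProcess)]
    param([Parameter(Mandatory)][string]$Folder)

    Get-ChildItem -LiteralPath $Folder -Filter *.txt | ForEach-Object {
        $newName = Get-StationFileName -Lines (Get-Content -LiteralPath $_.FullName)
        $pdf = Join-Path $_.DirectoryName ($_.BaseName + '.pdf')
        # Skip forms with no station line or no matching PDF.
        if ($newName -and (Test-Path -LiteralPath $pdf)) {
            Rename-Item -LiteralPath $pdf -NewName $newName
        }
    }
}
```

Running `Rename-FormPdf -Folder <your folder> -WhatIf` first should list the planned renames without touching any file. If two forms belong to the same polling station, the second rename will fail because the target name already exists, so you may want to append the original number to the new name.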