list - Splitting column text file into smaller files based on value differences in Python

May 15, 2011

I am trying to split a text file with 3 columns into many smaller individual text files, based on the presence of jumps in the value in the first column. Here is an example of a small part of the file to be split:

2457062.30520078 1.00579146 1

2457062.30588184 1.00607543 1

2457062.30656300 1.00605515 1

2457062.71112193 1.00288150 1

2457062.71180299 1.00322454 1

2457062.71248415 1.00430136 1

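Between lines 3 and 4 there is a jump larger than usual. This is the point where the data should be split and the individually created text files separated, creating one file with the first 3 lines and one with the latter 3 lines. The jumps exceed a change of 0.1 in the first column. The goal is to have any jump like the one in this example be a split point for separate files. Any insight is appreciated, thanks.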
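To make the split criterion concrete, here is a minimal sketch that locates the jump in the six sample rows above. The names `times`, `threshold` and `jump_after` are hypothetical; the 0.1 threshold is the one stated in the question.

```python
# First-column values copied from the six sample rows above.
times = [
    2457062.30520078, 2457062.30588184, 2457062.30656300,
    2457062.71112193, 2457062.71180299, 2457062.71248415,
]
threshold = 0.1  # a change larger than this counts as a jump

# 0-based indices of the rows that are immediately followed by a jump.
jump_after = [
    k for k in range(len(times) - 1)
    if abs(times[k + 1] - times[k]) > threshold
]
print(jump_after)  # prints [2]: the jump sits between rows 3 and 4
```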

I would loop through the main file and keep writing lines as long as the condition is met. That fits the definition of a while loop perfectly. The main complexity is that you need two files open at the same time (the main one and the one you are currently writing to), but that's not a problem for Python.

    maintext = "big_file.txt"
    sfile_templ = "small_file_{:03d}.txt"
    # The delimiter is a space in the example you gave,
    #  but it might be a tab (\t) or a comma or anything.
    delimiter = " "
    lim = 0.1
    # Count how many files we have created.
    i = 0

    # Open the main file.
    with open(maintext) as mainfile:
        # Read the first line and set things up.
        line = mainfile.readline()
        # Note that we want the first element ([0]) before
        #  the delimiter (.split(delimiter)) of the row (line)
        #  as a number (float).
        v_cur = float(line.split(delimiter)[0])
        v_prev = v_cur

        # Stop the loop once we reach the end of the file (EOF),
        #  where readline() returns an empty string.
        while line:
            # Open the second file for writing (mode='w').
            with open(sfile_templ.format(i), mode="w") as subfile:
                # As long as the values are within the limit, keep
                #  writing lines to the current file.
                while line and abs(v_prev - v_cur) < lim:
                    subfile.write(line)
                    line = mainfile.readline()
                    # Only parse when a line was actually read;
                    #  float('') at EOF would raise a ValueError.
                    if line:
                        v_prev = v_cur
                        v_cur = float(line.split(delimiter)[0])
            # Increment the file counter.
            i += 1
            # Make sure we don't get stuck after one file.
            #  (If we don't replace v_prev here, the inner while loop
            #  will never execute after the first time.)
            v_prev = v_cur
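The same idea can also be written as a single pass with a `for` loop, which avoids the nested `while` bookkeeping. This is only a sketch under assumptions, not the method above: the helper names `split_on_jumps` and `write_groups` are hypothetical, blank lines are skipped, columns are split on any whitespace by default, and the groups are collected in memory before being written, which may not suit very large files.

```python
def split_on_jumps(lines, lim=0.1, delimiter=None):
    """Group lines; start a new group when the first column changes by lim or more."""
    groups = []
    prev = None
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines carry no data and are skipped
        value = float(line.split(delimiter)[0])
        # Start a new group on the first data line or after a jump.
        if prev is None or abs(value - prev) >= lim:
            groups.append([])
        groups[-1].append(line)
        prev = value
    return groups


def write_groups(groups, template="small_file_{:03d}.txt"):
    """Write each group of lines to its own numbered file."""
    for i, group in enumerate(groups):
        with open(template.format(i), mode="w") as subfile:
            subfile.writelines(group)


# The six sample rows from the question.
sample = [
    "2457062.30520078 1.00579146 1\n",
    "2457062.30588184 1.00607543 1\n",
    "2457062.30656300 1.00605515 1\n",
    "2457062.71112193 1.00288150 1\n",
    "2457062.71180299 1.00322454 1\n",
    "2457062.71248415 1.00430136 1\n",
]
groups = split_on_jumps(sample)
print([len(g) for g in groups])  # prints [3, 3]
```

For a real file, something like `with open("big_file.txt") as f: write_groups(split_on_jumps(f))` would produce the small files, since a file object iterates over its lines.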
