python - File read performance test yields interesting results. Possible explanations?


I'm stress-testing a system to determine how much punishment the filesystem can take. One test involves repeated reads on a single small (and thus presumably heavily cached) file to determine the overhead.

The following Python 3.6.0 script generates two lists of results:

    import random, string, time

    # 100,000 random lowercase letters, encoded as bytes
    stri = bytes(''.join(random.choice(string.ascii_lowercase) for i in range(100000)), 'latin-1')

    inf = open('bench.txt', 'w+b')
    inf.write(stri)

    # Test 1: variable repetition count, fixed read length (200 bytes)
    for t in range(0, 700, 5):
        readl = b''
        start = time.perf_counter()
        for i in range(t*10):
            inf.seek(0)
            readl += inf.read(200)
        print(t/10.0, time.perf_counter()-start)

    print()

    # Test 2: fixed repetition count (3000), variable read length
    for t in range(0, 700, 5):
        readl = b''
        start = time.perf_counter()
        for i in range(3000):
            inf.seek(0)
            readl += inf.read(t)
        print(t/10.0, time.perf_counter()-start)

    inf.close()

When plotted, the results give the following graph:

results graph

I find these results weird. The second test (blue in the picture, variable read length) starts off increasing linearly, as expected, but after a certain point it starts to climb more quickly. More surprisingly, the first test (pink, variable repetition count, fixed read length) shows a wild departure from linearity, which is interesting because the size of each read remains fixed there. It's also irregular, which is head-scratching at best. My system was idle when running the tests.

What plausible reason is there for such a major performance degradation after a certain number of repetitions?

Edit:

The fact that readl is a byte string (bytes) is apparently a major performance hog. Switching it to a string drastically improves everything. Yet when working with strings, calling the read and seek functions is a minor factor in comparison. Here are more variants of test 1 (variable repetitions). Test 2 is left out because its results turn out to be entirely explained by the byte string performance difference alone:

    import random, string, time

    strs = ''.join(random.choice(string.ascii_lowercase) for i in range(100000))
    strb = bytes(strs, 'latin-1')

    inf = open('bench.txt', 'w+b')
    inf.write(strb)

    # bytes and read
    for t in range(0, 700, 5):
        readl = b''
        start = time.perf_counter()
        for i in range(t*10):
            inf.seek(0)
            readl += inf.read(200)
        print(t/10.0, '%f' % (time.perf_counter()-start))
    print()

    # bytes, no read
    for t in range(0, 700, 5):
        readl = b''
        start = time.perf_counter()
        for i in range(t*10):
            readl += strb[0:200]
        print(t/10.0, '%f' % (time.perf_counter()-start))
    print()

    # string and read
    for t in range(0, 700, 5):
        readl = ''
        start = time.perf_counter()
        for i in range(t*10):
            inf.seek(0)
            readl += inf.read(200).decode('latin-1')
        print(t/10.0, '%f' % (time.perf_counter()-start))
    print()

    # string, no read
    for t in range(0, 700, 5):
        readl = ''
        start = time.perf_counter()
        for i in range(t*10):
            readl += strs[0:200]
        print(t/10.0, '%f' % (time.perf_counter()-start))
    print()

    inf.close()

results graph
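
For anyone who wants to isolate the accumulation cost from the file I/O entirely, here is a minimal sketch. It is not part of the original tests: the names CHUNK, concat_bytes, concat_bytearray, concat_join and timed are made up for illustration, and a constant 200-byte chunk stands in for inf.read(200). It rests on the documented behavior that concatenating immutable sequences such as bytes always produces a new object, so repeated += copies everything accumulated so far, whereas a bytearray or a single b''.join should grow roughly linearly. Actual timings will depend on the interpreter and machine.

```python
import time

# Hypothetical stand-in for inf.read(200); no file I/O is involved here.
CHUNK = b"x" * 200


def concat_bytes(n):
    """Accumulate n chunks with += on an immutable bytes object."""
    out = b""
    for _ in range(n):
        out += CHUNK  # creates a new bytes object, copying the old contents
    return out


def concat_bytearray(n):
    """Accumulate n chunks in a mutable bytearray, extended in place."""
    out = bytearray()
    for _ in range(n):
        out += CHUNK
    return bytes(out)


def concat_join(n):
    """Collect n chunks in a list and join them once at the end."""
    parts = []
    for _ in range(n):
        parts.append(CHUNK)
    return b"".join(parts)


def timed(func, n):
    """Return (elapsed seconds, result length) for one call of func(n)."""
    start = time.perf_counter()
    result = func(n)
    return time.perf_counter() - start, len(result)


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        for func in (concat_bytes, concat_bytearray, concat_join):
            elapsed, size = timed(func, n)
            print(n, func.__name__, size, "%f" % elapsed)
```

If the bytes variant is the only one whose time grows faster than linearly as n doubles, that would point at the accumulation rather than at seek and read.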

