performance - Writing large amounts of data using Akka in a more efficient way
I've implemented a Scala/Akka application that streams 4 different types of data from a biomodule sensor (ECG, EEG, breath, and general data). These data (timestamp, value) are typically stored in 4 different CSV files. However, since each sample has to be stored in 2 different files with different timestamps, the application ends up writing to 8 different CSV files at the same time. I implemented a single Akka actor responsible for persisting the data; it receives the path of the file to write to, a timestamp, and a value. However, this actor is a bottleneck, since the number of samples to store is large (e.g. one ECG sample is received every 4 ms). As a result, for a short 1-2 minute experiment, the actor only finished recording well after the experiment was over.
I've also tried 4 actors for the 4 different message types, with the idea of distributing the work, but I didn't notice any significant improvement in performance.
I'm wondering if anyone has an idea how to improve the performance. Is it better to use one actor for storing to all the files, a few actors, or is it most efficient to have one actor per file? Or maybe it doesn't make a difference? Can I improve the code that stores the data?
This is the method responsible for storing the data:
    def processValue(sample: WaveformValue): Unit = {
      val csvFileWriter = new PrintWriter(new BufferedWriter(new FileWriter(sample.filePath, true)))
      csvFileWriter.append(sample.timestamp.toString)
      csvFileWriter.append(",")
      csvFileWriter.append(sample.value.toString)
      csvFileWriter.append("\r\n")
      csvFileWriter.flush()
      csvFileWriter.close()
    }
It seems to me that the bottleneck is I/O -- disk access. It looks like you're opening, writing to, and closing the file for every single sample, which is expensive. I suggest:
- Open each file once, and close it at the end of processing. You might need to store the file writer in a member variable, or, if you have an arbitrary collection of files, store the writers in a map held in a member variable.
- Don't flush after every sample write.
- Use buffered writes for each file writer. This avoids flushing data to the filesystem on every write, which involves a system call and waiting for the data to reach the disk. I see you're already wrapping the `FileWriter` in a `BufferedWriter`, but the benefit is lost because you flush and close the file after each sample anyway.
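To illustrate, here is a minimal sketch of those three suggestions combined: a cache that keeps one buffered `PrintWriter` per file path, opened lazily on first use and closed exactly once at the end. The class name `SampleWriter` and its method signatures are my own invention for illustration; in your application this state would live inside the persisting actor, with `closeAll()` called when the experiment ends (e.g. from the actor's `postStop`).

```scala
import java.io.{BufferedWriter, FileWriter, PrintWriter}
import scala.collection.mutable

// Illustrative sketch: one buffered writer per file, reused across samples.
class SampleWriter {
  private val writers = mutable.Map.empty[String, PrintWriter]

  // Open each file at most once; subsequent calls reuse the cached writer.
  private def writerFor(path: String): PrintWriter =
    writers.getOrElseUpdate(
      path,
      new PrintWriter(new BufferedWriter(new FileWriter(path, true)))
    )

  // Append one (timestamp, value) row. No flush here: the BufferedWriter
  // batches small writes into larger, cheaper disk operations.
  def processValue(path: String, timestamp: Long, value: Double): Unit = {
    val w = writerFor(path)
    w.print(timestamp)
    w.print(',')
    w.print(value)
    w.print("\r\n")
  }

  // Flush and close every open file exactly once, at the end of processing.
  def closeAll(): Unit = {
    writers.values.foreach(_.close())
    writers.clear()
  }
}
```

Since this keeps all the file-handle state in one place, a single actor per experiment is usually enough once the per-sample open/flush/close overhead is gone; the writes themselves become cheap in-memory appends.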