search - ElasticSearch - Update or new index?


Requirements:

  • A single Elasticsearch index needs to be constructed from a bunch of flat files that get dropped every week.
  • Apart from the weekly feed, we also get intermittent diff files providing additional data that was not part of the original feed (insert or update, no delete).
  • The time to parse and load these files (weekly full feed or diff files) into Elasticsearch is not huge.
  • The weekly feeds received in 2 consecutive weeks are expected to have significant differences (deletes, additions, updates).
  • The index is critical for the apps to function and needs to have close to 0 downtime.
  • We are not concerned about the exact changes made in a feed, but we need the ability to roll back to a previous version in case the current load fails for some reason.
  • To state the obvious, searches need to be fast and responsive.

Given these requirements, we are planning the following:

  1. For incremental updates (diffs), we can insert or update records as-is using the bulk API.
  2. For full updates, we reconstruct a new index and swap the alias, as mentioned in this post. In case of a rollback, we can revert to the previous working index (backups are maintained in case the rollback needs to go back a few versions).
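As a rough illustration of step 1, the diff records could be turned into a bulk request body along the lines of the sketch below. It assumes each record carries its own `id` field and that an upsert through the bulk `update` action with `doc_as_upsert` is acceptable. The index name `products` and the record fields are hypothetical. The resulting string would be sent to the `_bulk` endpoint as newline-delimited JSON.

```python
import json


def build_bulk_upsert_body(index, records, id_field="id"):
    """Return an NDJSON body for the _bulk endpoint that upserts each record."""
    lines = []
    for record in records:
        # Action line: update the document with this id in the target index.
        lines.append(json.dumps({"update": {"_index": index, "_id": str(record[id_field])}}))
        # Source line: partial document; doc_as_upsert inserts it if the id is new.
        lines.append(json.dumps({"doc": record, "doc_as_upsert": True}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical diff records, for illustration only.
diff_records = [
    {"id": 1, "name": "widget", "price": 9.5},
    {"id": 2, "name": "gadget", "price": 20},
]
body = build_bulk_upsert_body("products", diff_records)
print(body)
```

Because every record is written as an upsert, the same function covers both the "insert" and the "update" case of a diff file, which matches the no-delete constraint in the requirements.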

Questions:

  1. Is this the best approach, or is it better to CRUD documents on an already created index using the built-in versioning, rather than reconstructing the index?
  2. What is the impact of modifying data (delete, update) on the underlying Lucene indices/shards? Can modifications cause fragmentation or inefficiency?

  1. At first glance, I'd say your overall approach is sound. Creating a new index every week with the new data and swapping an alias is a good approach if you need

    • zero downtime, and
    • to be able to roll back to previous indices for whatever reason.

If you keep only one index and CRUD your documents in there, you would not be able to roll back if anything goes wrong, and you could end up in a mixed state with data from the current week and data from the week earlier.
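The alias swap itself can be sketched as a single request body for the `_aliases` endpoint. When the `remove` and `add` actions are sent in the same request, Elasticsearch applies them atomically, so searches against the alias never see a gap. The alias `products` and the weekly index names below are hypothetical.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Return the body for POST /_aliases that moves `alias` from old to new."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical weekly index names; the applications only ever query "products".
swap = build_alias_swap("products", "products_week_01", "products_week_02")
# A rollback is the same call with the two indices reversed.
rollback = build_alias_swap("products", "products_week_02", "products_week_01")
print(json.dumps(swap, indent=2))
```

Keeping the previous week's index around until the new one has been verified is what makes the rollback a cheap alias change and not a reload.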

  2. Every time you update (even a single field) or delete a document, the previous version is flagged as deleted in the underlying Lucene segment. When the Lucene segments have grown sufficiently big, ES will merge them and wipe out the deleted documents. However, in your case, since you're creating a new index every week (and eventually deleting the index from the week prior), you won't land in a situation where you have space and/or fragmentation issues.
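If you want to check how many flagged-as-deleted documents the diff loads leave behind between rebuilds, one option is to read the `docs.count` and `docs.deleted` figures from the index stats response. The sketch below works on a trimmed, made-up response, and the index name is hypothetical. It only shows the arithmetic, not the HTTP call.

```python
def deleted_ratio(stats, index):
    """Fraction of documents in the primaries of `index` that are flagged as deleted."""
    docs = stats["indices"][index]["primaries"]["docs"]
    # docs.count holds live documents; docs.deleted holds those not yet merged away.
    total = docs["count"] + docs["deleted"]
    return docs["deleted"] / total if total else 0.0


# Trimmed, made-up stats response for illustration only.
sample_stats = {
    "indices": {
        "products_week_02": {
            "primaries": {"docs": {"count": 900, "deleted": 100}}
        }
    }
}
ratio = deleted_ratio(sample_stats, "products_week_02")
print(f"{ratio:.0%} of documents are flagged as deleted")
```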
