Explicit flush #116

Merged
merged 2 commits into from
Apr 30, 2021

Conversation

balmukundblr
Contributor

@balmukundblr balmukundblr commented Apr 29, 2021

Description

Please note: this is not a new PR. The original PR (apache/lucene-solr#2349) was raised against the old apache/lucene-solr GitHub repository; this is a copy in the new repository.

Long completion time for the CloseIndex call.

Once the AddDoc task completes, the benchmark algo calls the ForceMerge/CloseIndex task, which in turn lets all pending flushes complete. Because flushes during the CloseIndex call run sequentially, the call takes a long time and delays overall index completion. While indexing 1 million documents from reuters21578 (plain-text documents derived from the reuters21578 corpus), we observed that the CloseIndex call accounts for around 35% of the total time.

Solution

Developed a new FlushIndexTask that uses the Lucene API flushNextBuffer() to flush documents at the indexing-thread level without impacting any other indexing threads. Adding this task to the algo file immediately after the AddDoc task ensures that all docs are flushed before the ForceMerge/CloseIndex task is called.
With this solution in place, the CloseIndex task time was reduced significantly, and total indexing time improved as well.
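
As a rough illustration of where the new task would sit in an algo file, here is a minimal sketch. It assumes the task is referenced as FlushIndex (the benchmark convention of dropping the "Task" suffix from the class name). The thread count, document count, and the "Populate"/"MAddDocs" labels are illustrative placeholders, not the settings used for the measurements above.

```
# Sketch only: counts and labels are illustrative assumptions.
{ "Populate"
    CreateIndex
    # 4 indexing threads; each adds its documents, then flushes its own buffer
    [ { { "MAddDocs" AddDoc } : 2500 FlushIndex } ] : 4
    ForceMerge(1)
    CloseIndex
}
```

Because FlushIndex runs inside the parallel block, each indexing thread performs its flush itself once its AddDoc work is done, rather than leaving all pending flushes to be handled sequentially by CloseIndex.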

Tests

This change relies on the existing Lucene API flushNextBuffer(), which is already covered by test cases.
- Passed existing tests

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.

@mikemccand
Member

Thanks @balmukundblr -- looks great -- I'll try to push today.

@mikemccand mikemccand merged commit 66062e8 into apache:main Apr 30, 2021
@mikemccand
Member

Thank you @balmukundblr!

@balmukundblr
Contributor Author

> Thank you @balmukundblr!

Thank you very much Mike for your great support.
