upgrade tika to the 2.4.1 release #789

Closed
bamthomas opened this issue Apr 6, 2021 · 4 comments

@bamthomas
Collaborator

Is your feature request related to a problem? Please describe.

No.

Describe the solution you'd like

Datashare/extract depends on Tika 1.22 (released on 1 August 2019). Since then there have been four releases; the latest is 1.26, and there is also a 2.0.0-alpha.

For now it breaks the indexing features with a java.lang.NoSuchMethodError (see ICIJ/extract@55ff0cc).

We need to check all the Tika dependencies that are specified in the pom.xml (along with Datashare's transitive dependencies).
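One way to spot conflicting transitive dependencies is to ask the class loader how many copies of a given class it can see. The following is a minimal sketch, not part of Datashare or extract: the class name `ClasspathCopies` and its helper methods are hypothetical, and `org.apache.commons.codec.binary.Base64` is only used as an example of a class that several libraries pull in.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every copy of a class visible on the classpath.
public class ClasspathCopies {

    // Converts "a.b.C" into the resource path "a/b/C.class".
    public static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns the URL of every copy of the class that the class loader can find.
    public static List<URL> find(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(toResourcePath(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Example class only; pass another fully qualified name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.commons.codec.binary.Base64";
        List<URL> copies = find(name);
        System.out.println(name + ": " + copies.size() + " copy(ies) found");
        for (URL url : copies) {
            System.out.println("  " + url);
        }
    }
}
```

If more than one URL is printed, two jars ship the same class and the one loaded first wins, which is a common source of NoSuchMethodError. On the Maven side, `mvn dependency:tree -Dincludes=commons-codec` shows which dependencies bring each version in.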

The root cause of the NoSuchMethodError seemed to be commons-codec, which needed to be upgraded from 1.10 to 1.13. But after upgrading it we still saw the error.
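When an error like this persists after bumping a version, it can help to check which jar the class is actually loaded from at runtime, since an older copy may still come first on the classpath. A minimal sketch, assuming the suspect class is `org.apache.commons.codec.binary.Base64` (an illustrative choice; the issue does not say which class or method was missing) and using a hypothetical helper class `WhichJar`:

```java
import java.security.CodeSource;

// Hypothetical helper: reports where a class is loaded from at runtime.
public class WhichJar {

    public static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself have no code source
            return src == null ? "(JDK runtime, no code source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        } catch (LinkageError e) {
            return "(failed to load: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Example class only; pass another fully qualified name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.commons.codec.binary.Base64";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Running this with the same classpath as the indexing code should print the path of the jar that wins. If it still points at a 1.10 jar, the upgrade in the pom.xml is being overridden somewhere else.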

This may be related to https://issues.liferay.com/browse/LPS-120596

@tballison

Upgrading Tika early and often is a good idea. Let me know if you want to chat about migrating to >= 2.1.0.

@mvanzalu mvanzalu added the feat label Sep 14, 2022
@bamthomas bamthomas self-assigned this Sep 20, 2022
@bamthomas
Collaborator Author

@tballison thanks for your message. I'm digging into it.
Which approach do you think is best:

  • a progressive upgrade: 1.24, 1.26, 2.0 ...
  • going straight to 2.1 and solving problems one by one?
  • another strategy?

@tballison

If you have time, I'd recommend going straight to 2.4.1. There aren't that many diffs/changes within 2.x. This is the documentation we've put together: https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0

The 2.5.0 release should happen in the next few weeks, but that should be a drop-in replacement for 2.4.1.

Let me know if you have any questions on 2.x!

@mvanzalu mvanzalu changed the title from "upgrade tika to the latest 1.26 release" to "upgrade tika to the 2.4.1 release" Oct 13, 2022
@mvanzalu mvanzalu self-assigned this Oct 13, 2022
mvanzalu added a commit to ICIJ/extract that referenced this issue Oct 20, 2022
mvanzalu added a commit to ICIJ/extract that referenced this issue Oct 20, 2022
mvanzalu added a commit to ICIJ/extract that referenced this issue Oct 20, 2022
mvanzalu added a commit to ICIJ/datashare-api that referenced this issue Oct 20, 2022
mvanzalu added a commit to ICIJ/extract that referenced this issue Nov 21, 2022
mvanzalu added a commit to ICIJ/datashare-client that referenced this issue Nov 21, 2022
mvanzalu added a commit that referenced this issue Nov 21, 2022
@github-actions

This issue is stale because it has been open for 40 days with no activity.

@github-actions github-actions bot added the stale label Nov 23, 2022
@pirhoo pirhoo closed this as completed Nov 23, 2022
@mvanzalu mvanzalu removed the stale label Nov 23, 2022