CHECK-2437 add support for using analyzers by language #258
@@ -0,0 +1,183 @@
import json
from elasticsearch import Elasticsearch
from flask import request, current_app as app
Unable to import 'flask'
      }
    }
  },
  "hi": {
Similar blocks of code found in 2 locations. Consider refactoring.
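One way to address the duplication flagged here is to generate the per-language analyzer blocks from a single helper instead of writing each one out by hand. The sketch below is only an illustration under assumptions: the helper names, the `rebuilt_<code>` analyzer naming, the filter order, and the `"pt"` entry are all hypothetical and not taken from this pull request. It relies only on the stock Elasticsearch `stop`, `stemmer`, `lowercase`, and `asciifolding` token filters.

```python
def language_analyzer(es_language):
    """Build filter and analyzer definitions for one language.

    Hypothetical helper: the real settings file in this PR may be
    structured differently.
    """
    stop_name = f"{es_language}_stop"
    stemmer_name = f"{es_language}_stemmer"
    filters = {
        # Built-in stop word list, e.g. "_hindi_"
        stop_name: {"type": "stop", "stopwords": f"_{es_language}_"},
        # Built-in stemmer for the language
        stemmer_name: {"type": "stemmer", "language": es_language},
    }
    analyzer = {
        "tokenizer": "standard",
        "filter": ["lowercase", "asciifolding", stop_name, stemmer_name],
    }
    return filters, analyzer


def build_analysis_settings(languages):
    """Merge per-language definitions into one `analysis` settings dict.

    `languages` maps a short code (e.g. "hi") to the Elasticsearch
    language name (e.g. "hindi").
    """
    analysis = {"filter": {}, "analyzer": {}}
    for code, es_language in languages.items():
        filters, analyzer = language_analyzer(es_language)
        analysis["filter"].update(filters)
        analysis["analyzer"][f"rebuilt_{code}"] = analyzer
    return {"analysis": analysis}


# Assumed language table, for illustration only.
settings = build_analysis_settings({"hi": "hindi", "pt": "portuguese"})
```

Adding a language then becomes a one-line change to the mapping passed in, rather than another copied block.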
      # include_type_name=True,
      index=index_name
    )
    es.indices.open(index=index_name)
Bad indentation. Found 4 spaces, expected 8
      # include_type_name=True,
      index=index_name
    )
    es.indices.put_settings(
Bad indentation. Found 4 spaces, expected 8
    if index_name not in indices:
      es.indices.create(index=index_name)
    es.indices.close(index=index_name)
    es.indices.put_mapping(
Bad indentation. Found 4 spaces, expected 8
    index_name = app.config['ELASTICSEARCH_SIMILARITY'] + "_" + lang
    if index_name not in indices:
      es.indices.create(index=index_name)
    es.indices.close(index=index_name)
Bad indentation. Found 4 spaces, expected 8
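The fragments quoted in this review suggest a per-language sequence of create (if missing), close, `put_mapping`, `put_settings`, then open. Closing first is needed because Elasticsearch only accepts analyzer (static) settings changes on a closed index. The sketch below mirrors that order as it appears in the fragments. It is an illustration only: the function name, the `RecordingIndices` stub, and the `"alegre_similarity"` base name are assumptions, and the stub stands in for `es.indices` so the example runs without a cluster.

```python
class RecordingIndices:
    """Hypothetical stand-in for `es.indices`; it only records calls."""

    def __init__(self):
        self.calls = []

    def create(self, index):
        self.calls.append(("create", index))

    def close(self, index):
        self.calls.append(("close", index))

    def put_mapping(self, body, index):
        self.calls.append(("put_mapping", index))

    def put_settings(self, body, index):
        self.calls.append(("put_settings", index))

    def open(self, index):
        self.calls.append(("open", index))


def init_language_index(indices_client, existing, base_name, lang, mapping, settings):
    """Create (if needed) and configure the index for one language."""
    index_name = base_name + "_" + lang
    if index_name not in existing:
        indices_client.create(index=index_name)
    # Static settings such as analyzers require a closed index.
    indices_client.close(index=index_name)
    indices_client.put_mapping(body=mapping, index=index_name)
    indices_client.put_settings(body=settings, index=index_name)
    indices_client.open(index=index_name)
    return index_name


# Usage with the stub; a real client would pass `es.indices` instead.
stub = RecordingIndices()
name = init_language_index(stub, [], "alegre_similarity", "hi", {}, {})
```

Keeping the index-name construction and the close/open bracket in one function makes it harder to leave an index closed when a new language is added.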
Code Climate has analyzed commit 2474023 and detected 148 issues on this pull request.
Note: there is 1 critical issue. The test coverage on the diff in this pull request is 80.0% (50% is the threshold). This pull request will bring the total coverage in the repository to 88.5% (0.0% change).
* Bump ujson from 1.35 to 5.4.0

  Bumps [ujson](https://github.com/ultrajson/ultrajson) from 1.35 to 5.4.0.
  - [Release notes](https://github.com/ultrajson/ultrajson/releases)
  - [Commits](ultrajson/ultrajson@v1.35...5.4.0)

  ---
  updated-dependencies:
  - dependency-name: ujson
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>

* Meedan 2116 update image scoring (#244)
* CHECK-2116 update alegre image endpoint to return correctly ordered scoring, add init perl to start.sh file while we're here
* CHECK-2116 update alegre image endpoint to return correctly ordered scoring, add init perl to start.sh file while we're here
* CHECK-2116 fix typo
* CHECK-2116 update test for new scoring setup
* CHECK-2116 update contract test
* Meedan 2120 add limits (#245)
* CHECK-2120 initial push on adding limit to all search responses
* CHECK-2120 fix typo
* CHECK-2120 remove bad id after testing context hashes on dev
* CHECK-2120 update variable name
* CHECK-2120 update test
* CHECK-2120 fix typo
* CHECK-2120 refactor audio similarity to make search function less complex
* CHECK-2120 fix more minor code climate issue
* Change Alegre port to 3100 to avoid conflict on Mac Monterey (#246)

  Port 5000, which Alegre currently runs on, is now used by AirPlay on Macs running Monterey. As a result, there is an error that port is in use when our application tries to use that port in development. To fix this, I modified the external port to 3100, which it seems to have been at some point in the past (reflected by Readme). For internal consistency, I went ahead and updated the internal port to 5000, as well, even though it wasn't really necessary.

  Fixes CHECK-2147

* Meedan 2178 delete with context (#247)
* CHECK-2120 initial push on adding limit to all search responses
* CHECK-2120 fix typo
* CHECK-2120 remove bad id after testing context hashes on dev
* CHECK-2120 update variable name
* CHECK-2120 update test
* CHECK-2120 fix typo
* CHECK-2120 refactor audio similarity to make search function less complex
* CHECK-2120 fix more minor code climate issue
* CHECK-2178 add deletion conditional on context uniqueness
* CHECK-2178 fix code climate issues
* CHECK-2178 remove context on text until we are able to do something with it in next ticket
* add type checking
* and of course we want is list
* CHECK-2178 add prints to diagnose these last bugs
* CHECK-2178 work on type mismatch now
* CHECK-2178 fix tests with updated input data
* CHECK-2178 fix typo in function params and update tests to reflect added context
* CHECK-2178 add context to test
* CHECK-2178 remove prints
* CHECK-2139 add parameters to establish min cutoff score from ES as we… (#250)
* CHECK-2139 add parameters to establish min cutoff score from ES as well as per-model thresholding
* CHECK-2139 resolve codeclimate suggestion
* Use community version of Tensorflow that works with M1

  The TensorFlow binary downloaded from a normal TensorFlow 2.3.1 pip install (from requirements) was crashing when we used the linux/x86_64 emulated arch with M1 macs (which is needed because TensorFlow does not yet have an arm-supported version). To solve this, we are using a community wheel of Tensorflow 2.3.1 compiled as we need it. More on this here: tensorflow/tensorflow#52845

  Paired with Ahmed!

  CHECK-2147

* Fixes creating text graphs

  When I was trying to generate text clusters locally, it didn’t fail, but no clusters were returned. It worked well for images. Looks like some changes to text similarity were not reflects in the graph writer. Looks like "model" should now be "models" and "text" should be "content". I'm not sure, so I'll ask Devin to review it.

  Fixes CHECK-2212.

* CHECK-2179 initial push on using context in text like other media (#249)
* CHECK-2179 initial push on using context in text like other media
* CHECK-2179 alter logic of delete to allow to attempt to delete any not-multi-context doc
* CHECK-2179 re-add missing var
* CHECK-2131 add errbit notification for broken search result (#253)
* CHECK-2131 add errbit notification for broken search result
* CHECK-2131 remove now irrelevant test
* CHECK-2131 old test is changed due to minor change from API - fix maybe?
* CHECK-2131 make test more robust
* CHECK-2131 switch args
* CHECK-2131 More test fixes
* CHECK-2131 this set of tests man!
* CHECK-2131 more fixing on these tests
* CHECK-2387 don't allow nil thresholds (#255)
* CHECK-2387 don't allow nil thresholds
* CHECK-2387 ah the old zero is not game in python
* CHECK-2284 update documentation to more explicitly call out that swagger docs wont work out of box (#257)
* CHECK-2284 update documentation to more explicitly call out that swagger docs wont work out of box
* MEEDAN-2284 fix whitespace
* CHECK-2437 add support for using analyzers by language (#258)
* CHECK-2437 add support for using analyzers by language
* CHECK-2437 remove old dependencies from half-implementation of analyzers
* CHECK-2437 shift es client
* CHECK-2437 add tests for new use case
* CHECK-2437 add fix for tests to actually pass
* Meedan 2437 multiple analyzer indices (#261)
* CHECK-2437 add support for using analyzers by language
* CHECK-2437 remove old dependencies from half-implementation of analyzers
* CHECK-2437 shift es client
* CHECK-2437 add tests for new use case
* CHECK-2437 add fix for tests to actually pass
* CHECK-2437 resolve code review fixes
* Optionally allow language override
* CHECK-2437 add ascii folding and other minor tweaks (#262)
* Change order of analyzer filters
* remove draft lines
* CHECK-1716 Add explicit model returns for all responses, also sneak in some language analyzer changes (#264)
* CHECK-1716 Add explicit model returns for all responses, also sneak in some language analyzer changes
* CHECK-1716 add updates to test fixtures
* CHECK-1716 add more test fixes
* CHECK-2608 version bump cld (#265)
* CHECK-2608 add test function (#266)
* Fixing PostgreSQL Dockerfile

  All CI builds were failing with this error:

  ```
  W: The repository 'http://apt.postgresql.org/pub/repos/apt stretch-pgdg Release' does not have a Release file.
  E: Failed to fetch http://apt.postgresql.org/pub/repos/apt/dists/stretch-pgdg/11/binary-amd64/Packages 404 Not Found [IP: 147.75.85.69 80]
  E: Some index files failed to download. They have been ignored, or old ones used instead.
  The command '/bin/sh -c apt-get update && apt-get install -y gawk postgresql-plperl-$PG_MAJOR && localedef -i ru_RU -c -f UTF-8 -A /usr/share/locale/locale.alias ru_RU.UTF-8 && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
  Service 'postgres' failed to build : Build failed
  ```

  Here's an announcement: https://www.postgresql.org/message-id/[email protected]

  Fixed by installing the packages from the archive repository.

* CHECK-2690 remove vectors from responses for alegre text (#268)
* Meedan 2690 remove vectors from response (#269)
* CHECK-2690 remove vectors from responses for alegre text
* CHECK-2690 apply stripper to every case
* CHECK-2690 minor fix
* CHECK-2702 fix thresholding function for audio (#270)
* CHECK-2702 fix thresholding function for audio
* CHECK-2702 fix tests
* invert index
* CHECK-2782 update matching to reject mismatched lengths (#273)
* Bump pyjwt from 1.6.4 to 2.4.0 (#236)

  Bumps [pyjwt](https://github.com/jpadilla/pyjwt) from 1.6.4 to 2.4.0.
  - [Release notes](https://github.com/jpadilla/pyjwt/releases)
  - [Changelog](https://github.com/jpadilla/pyjwt/blob/master/CHANGELOG.rst)
  - [Commits](jpadilla/pyjwt@1.6.4...2.4.0)

  ---
  updated-dependencies:
  - dependency-name: pyjwt
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump joblib from 1.0.1 to 1.2.0 (#260)

  Bumps [joblib](https://github.com/joblib/joblib) from 1.0.1 to 1.2.0.
  - [Release notes](https://github.com/joblib/joblib/releases)
  - [Changelog](https://github.com/joblib/joblib/blob/master/CHANGES.rst)
  - [Commits](joblib/joblib@1.0.1...1.2.0)

  ---
  updated-dependencies:
  - dependency-name: joblib
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump certifi from 2018.10.15 to 2022.12.7 (#272)

  Bumps [certifi](https://github.com/certifi/python-certifi) from 2018.10.15 to 2022.12.7.
  - [Release notes](https://github.com/certifi/python-certifi/releases)
  - [Commits](certifi/python-certifi@2018.10.15...2022.12.07)

  ---
  updated-dependencies:
  - dependency-name: certifi
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump mako from 1.0.7 to 1.2.2 (#256)

  Bumps [mako](https://github.com/sqlalchemy/mako) from 1.0.7 to 1.2.2.
  - [Release notes](https://github.com/sqlalchemy/mako/releases)
  - [Changelog](https://github.com/sqlalchemy/mako/blob/main/CHANGES)
  - [Commits](https://github.com/sqlalchemy/mako/commits)

  ---
  updated-dependencies:
  - dependency-name: mako
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump protobuf from 3.9.2 to 3.18.3 (#259)

  Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.9.2 to 3.18.3.
  - [Release notes](https://github.com/protocolbuffers/protobuf/releases)
  - [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
  - [Commits](protocolbuffers/protobuf@v3.9.2...v3.18.3)

  ---
  updated-dependencies:
  - dependency-name: protobuf
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update article.py
* Update bulk_similarity_controller.py
* Update bulk_similarity_controller.py

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Devin Gaffney <[email protected]>
Co-authored-by: Christa Hartsock <[email protected]>
Co-authored-by: Christa Hartsock <[email protected]>
Co-authored-by: Caio Almeida <[email protected]>
No description provided.