inspect() returns incorrect taxon/row #101

Open
marknovak opened this issue Sep 11, 2017 · 8 comments

@marknovak

marknovak commented Sep 11, 2017

I'm encountering a problem using inspect() in that, whether or not I provide taxon_name or row_number, inspect(object,...) is returning the incorrect taxon/row.

Based on the description of inspect(), ("To inspect alternative taxonomic meanings of a given name, you need to provide the object resulting from a call to the tnrs_match_names function..."), I suspect my problem stems from the manner in which I am using tnrs_match_names() to create my object:

I am trying to create a tree for some 400 taxa. However, tnrs_match_names() apparently doesn't support calls for more than 250 taxa at a time. Furthermore, I am using fuzzy matching (which takes a while for long lists and is thus prone to interruption). I am therefore running my matching search in chunks of 50 taxa at a time. I'm doing so with a for loop, combining the chunks with rbind(). (Fyi: My list of taxa is continually expanding, so I'm trying to avoid hardcoded fixes.)
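For reference, the chunked workflow described above can be sketched roughly as follows. This is a minimal sketch rather than the actual script: `match_chunk()` is a hypothetical stand-in for the call to tnrs_match_names() (so the example runs offline), and the taxon names are made up.

```r
# Hypothetical stand-in for a name-matching call such as tnrs_match_names();
# it returns one row per input name so the sketch runs without network access.
match_chunk <- function(names) {
  data.frame(search_string = tolower(names),
             matched = TRUE,
             stringsAsFactors = FALSE)
}

taxon_names <- paste0("Taxon_", seq_len(120))  # made-up names
chunk_size <- 50

# Assign each name to a chunk (1, 1, ..., 2, 2, ...) and split accordingly
chunk_id <- ceiling(seq_along(taxon_names) / chunk_size)
chunks <- split(taxon_names, chunk_id)

# Match each chunk separately, then stack the per-chunk results
results <- lapply(chunks, match_chunk)
combined <- do.call(rbind, results)
rownames(combined) <- NULL  # reset row names to 1..n

nrow(combined)
```

Because the chunk boundaries are computed from the length of the input, nothing needs to be hardcoded as the list of taxa grows.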

My suspicion is thus that inspect() is using some kind of internal bookkeeping to keep track of the location of each taxon (i.e. its taxon_name and row_number), rather than using the "visible" taxon names and row names/numbers seen when one prints object.

For example, inspect(object, row_number = 1) does not actually return the first row of object. Renaming the row labels of object doesn't change anything either.
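One way such a mismatch could arise, purely as an illustration: if a function keeps its bookkeeping in attributes attached to the data frame (an assumption here, not something confirmed about the package internals), then rbind() stacks the rows but does not merge those attributes across chunks. The base-R sketch below uses a made-up attribute name, `source_info`, to show what can be checked.

```r
# Two small data frames, each tagged with its own custom attribute.
# "source_info" is a made-up attribute name used only for illustration.
a <- data.frame(name = c("x", "y"), stringsAsFactors = FALSE)
attr(a, "source_info") <- "chunk 1"

b <- data.frame(name = "z", stringsAsFactors = FALSE)
attr(b, "source_info") <- "chunk 2"

combined <- rbind(a, b)

nrow(combined)                 # rows from both chunks are present
attr(combined, "source_info")  # at most one chunk's metadata is left
```

If the combined object carries metadata from only one chunk, any lookup that relies on that metadata would no longer agree with the rows that are printed.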

I'm happy to provide a reproducible example if needed, but figure someone might already know of an easy fix.

[My need to use inspect() here stems from the fact that many of my 400 taxa have multiple matches (that are incorrect, despite using context). I presume I'll need to hardcode their corrections(?).]

thanks

@fmichonneau
Member

Thanks for the report, Mark. Are you using the CRAN version of the package or the GitHub version? I vaguely remember fixing something in this part of the code relatively recently, and I don't know whether it has made it into the CRAN release yet. If I still have power tomorrow (I'm in the middle of Irma), I'll investigate further.

@marknovak
Author

Thanks. I'm using the CRAN version.

@fmichonneau
Member

Could you please try the GitHub version to see if you still get the bug? I checked the commit logs, and I did update part of this code since the last CRAN release.

The easiest way to install rotl from GitHub is to type the following at the R prompt:

source("https://install-github.me/ropensci/rotl")

@marknovak
Author

marknovak commented Sep 12, 2017

I actually just logged in to let you know: I installed from GitHub (using devtools) this morning, but unfortunately the problem persists.

Just for kicks, I also reinstalled with source as suggested. No difference.

In case it helps, here's an example:

> head(taxa)
                  search_string                 unique_name approximate_match  ott_id
198     acanthephyra_acutifrons     Acanthephyra acutifrons              TRUE 2974261
199   acanthephyra_curtirostris   Acanthephyra curtirostris              TRUE  563063
200 acanthephyra_stylorostratis Acanthephyra stylorostratis              TRUE  367600
348        acanthina_punctulata   Acanthinucella punctulata              TRUE  294761
349           acanthina_spirata      Acanthinucella spirata              TRUE  294762
35           acetes_intermedius          Acetes intermedius              TRUE 2970917
    is_synonym          flags number_matches
198      FALSE SIBLING_HIGHER              1
199      FALSE SIBLING_HIGHER              3
200      FALSE SIBLING_HIGHER              2
348       TRUE                             1
349       TRUE                             1
35       FALSE                             3
> inspect(taxa,taxon_name = 'acanthephyra_acutifrons')
       search_string        unique_name approximate_match ott_id is_synonym flags
1 amietia_angolensis Amietia angolensis              TRUE  72654      FALSE      
  number_matches
1              1
> inspect(taxa,row_number = 1)
       search_string        unique_name approximate_match ott_id is_synonym flags
1 amietia_angolensis Amietia angolensis              TRUE  72654      FALSE      
  number_matches
1              1
> inspect(taxa,row_number = 198)
Error in check_args_match_names(response, row_number, taxon_name, ott_id) : 
  ‘row_number’ is not a valid row number.
> inspect(taxa,row_number = '198')
Error in check_args_match_names(response, row_number, taxon_name, ott_id) : 
  ‘row_number’ must be a numeric.

fmichonneau added a commit that referenced this issue Sep 15, 2017
fmichonneau added a commit that referenced this issue Sep 18, 2017
@fmichonneau
Member

A work-in-progress fix is available for testing: devtools::install_github("ropensci/rotl@fix-101")

@marknovak
Author

Thanks Francois,
Unfortunately I get this right off the bat...

> taxa <- tnrs_match_names(Spp$Spp, context_name = "Animals")
More than 100 taxa to match.Error in tnrs_match_names(Spp$Spp, context_name = "Animals") : 
  could not find function "split_by_n"

@fmichonneau
Member

Sorry about that, I had forgotten to add this function to the package. Please reinstall; it should work now.

@marknovak
Author

It's working(!), even after subsetting (e.g., to isolate taxa with multiple matches). It just doesn't work after re-sorting the rows. Thank you for the quick turnaround, Francois. Much appreciated.
