Duplicate detections #259
Is this from a specific deployment?
No, this is across the entire project. The issue also occurs in other projects. Here's the code to get all duplicates:
det <- get_acoustic_detections(animal_project_code = "2014_DEMER")
dups <-
  det %>%
  group_by(tag_serial_number, date_time) %>%
  filter(n() > 1)
dups
File: dups.csv
@peterdesmet the file
Thanks @PieterjanVerhelst. Unfortunately the issue is not limited to that or to INBO projects:
library(etn)
library(dplyr)
con <- connect_to_etn()
det1 <- get_acoustic_detections(animal_project_code = c(
"2010_PHD_REUBENS",
"2011_RIVIERPRIK",
"2012_LEOPOLDKANAAL",
"2013_ALBERTKANAAL"
))
det2 <- get_acoustic_detections(animal_project_code = c(
"2014_DEMER",
"2015_DIJLE",
"2015_HOMARUS",
"2015_PHD_VERHELST_COD",
"2015_PHD_VERHELST_EEL"
))
det <- bind_rows(det1, det2)
dups <-
det %>%
group_by(animal_project_code, tag_serial_number, date_time) %>%
filter(n() > 1)
dups %>%
group_by(animal_project_code) %>%
count()
@peterdesmet could you create a csv with duplicates for the projects
@PieterjanVerhelst here you go: dups_pj.csv.zip
I notice I already reported this issue before for
I do think we'll have to tackle this at some point, as users (like me) will bump into this again and again.
I can check this for the Demer project. Could you create a csv with all duplicates for that project too, please?
deployment_id (sometimes station_name)
@IPauwels here are the files with duplicates:
The station name is sometimes the same, the constant difference is
A consistent way to identify them and decide which one to keep would be good. E.g. delete all duplicate detections associated with
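To illustrate what a consistent keep-one rule could look like, here is a minimal sketch in dplyr. It is not a decision on which record is the valid one: the rule used here (keep the row with the lowest deployment_id within each tag_serial_number and date_time pair) is only an assumption for illustration, and the data frame is made up, not real ETN output.

```r
library(dplyr)

# Toy detections, made up for illustration (not real ETN data).
# The first two rows are duplicates that differ only in deployment_id
# and station_name.
det <- tibble(
  tag_serial_number = c("A1", "A1", "B2"),
  date_time = as.POSIXct(
    c("2014-11-21 10:00:00", "2014-11-21 10:00:00", "2014-11-21 11:00:00"),
    tz = "UTC"
  ),
  deployment_id = c(20L, 10L, 30L),
  station_name = c("s-2", "s-1", "s-3")
)

# Hypothetical rule: within each (tag_serial_number, date_time) pair,
# keep the row with the lowest deployment_id.
# distinct() with .keep_all = TRUE keeps the first row it encounters,
# so sorting first decides which duplicate survives.
dedup <- det %>%
  arrange(deployment_id) %>%
  distinct(tag_serial_number, date_time, .keep_all = TRUE)

dedup
```

Changing the arrange() call is enough to try a different rule, for example preferring a particular source file once it is known which upload is the correct one.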
deployment_id (sometimes station_name)
Correction: the
I checked the file
@PieterjanVerhelst I've checked and there are 2 files uploaded for the same deployment: once as "inbo_data_file" and once as "VR2W_122322_20141121_1.csv". But there is no trace of them in the database.
@aubrivliz I think the exercise you did for @PieterjanVerhelst should be done for all of these duplicates. I have, for instance, duplicates in 'Homarus' and 'PhD Reubens'. It would be interesting to know which files they originally come from.
@PieterjanVerhelst @jreubens Has this been solved? Is this a different issue from #283?
In 2014_demer (but likely in other projects), I discovered detections that are duplicates (same datetime, receiver, transmitter), except for their station_name and file (source of data):
@PieterjanVerhelst @IPauwels is this valid data? If not, @aubrivliz do you think this is an issue in the acoustic.detections_limited query?
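One way to check which columns actually differ inside each group of duplicates is to count the distinct values per column. The sketch below uses made-up data, and its column names (date_time, receiver, transmitter, station_name, file) only mirror the wording of this issue; they are assumptions and may not match the real column names returned by get_acoustic_detections().

```r
library(dplyr)

# Toy detections, made up for illustration (column names are assumptions).
# The first two rows share date_time, receiver and transmitter.
det <- tibble(
  date_time = as.POSIXct(
    c("2014-11-21 10:00:00", "2014-11-21 10:00:00", "2014-11-21 11:00:00"),
    tz = "UTC"
  ),
  receiver = c("VR2W-1", "VR2W-1", "VR2W-2"),
  transmitter = c("T-1", "T-1", "T-2"),
  station_name = c("s-1", "s-2", "s-3"),
  file = c("file_a.csv", "file_b.csv", "file_c.csv")
)

# Keep only duplicate groups, then count distinct values in every
# non-grouping column. A count above 1 means that column differs
# between the duplicates in that group.
diffs <- det %>%
  group_by(date_time, receiver, transmitter) %>%
  filter(n() > 1) %>%
  summarise(across(everything(), n_distinct), .groups = "drop")

diffs
```

Columns that show 1 everywhere are identical across the duplicates; the remaining columns are the ones worth tracing back to the uploaded files.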