# Cache calls #159
Instead of repeatedly bombing the overpass API, it'd be pretty easy to implement a local caching system that would record the call and store the pre-processed data returned from the API. Subsequent calls would then just re-load the local data and deliver anew.

The `R.cache` package has a hard-coded default that only allows enduring storage in `"~/.Rcache/"`, used in the `.onLoad` call. This package sticks a few things in `options()`
, but does not use any environment variables.

A bit more flexibility could be added here via environment variables, by defaulting to `~/.Rosmdata` (or maybe piggybacking on `~/.Rcache`, if it exists?), but allowing an override if `Sys.getenv("OSMDATA_CACHE_DIR")` exists.
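That environment-variable override could look something like the following sketch, in which `osmdata_cache_dir()` is a hypothetical helper, and `OSMDATA_CACHE_DIR` and `~/.Rosmdata` are just the names suggested above:

```r
# Hypothetical helper: resolve the cache directory, preferring the
# environment variable over a hard-coded default.
osmdata_cache_dir <- function () {
    cache_dir <- Sys.getenv ("OSMDATA_CACHE_DIR")
    if (!nzchar (cache_dir)) {
        cache_dir <- file.path (path.expand ("~"), ".Rosmdata")
    }
    if (!dir.exists (cache_dir)) {
        dir.create (cache_dir, recursive = TRUE)
    }
    cache_dir
}
```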
## cache duration

Because OSM is constantly updated, it will be important to allow control over cache duration, so that local versions will be automatically updated at some stage. While this could also be handled via an environment variable, `"OSMDATA_CACHE_DURATION"`, that would need to be explicitly set by a user to work, and so would impose an additional burden.

A less burdensome option would be an equivalent function parameter, which would best be placed in `overpass_query()`, because it's the overpass calls themselves that will actually be cached. The problem there is that that function is not exported. The general workflow is:
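(The workflow snippet itself appears to have been lost in extraction; what follows is a rough sketch of a typical osmdata pipeline, assuming the currently exported functions, not the original snippet.)

```r
library (osmdata)
library (magrittr)

# build a query, add a feature filter, then process the result;
# osmdata_sf() internally calls the non-exported overpass_query()
dat <- opq (bbox = "Trentham, Australia") %>%
    add_osm_feature (key = "highway") %>%
    osmdata_sf ()
```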
A `cache_duration` parameter could potentially be set in the initial `opq()` call, but that does not contain the full overpass query, and so this parameter would then need to be passed on to any and all subsequent functions. That suggests that the end-point calls are the best place for such a parameter. These currently only have two primary parameters (`q`, `doc`), so they wouldn't suffer from an additional one there. If that is the point at which caching is determined, then it will likely be better to cache the full processed result, rather than just the direct result of the API call. The call itself could be `digest`-ed, while the cached object would be the final processed end-point. The timestamp could simply be read (`file.info()$mtime`), and the cache updated if `difftime(...) > cache_duration`; otherwise just re-load the cached version.
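Putting those pieces together, the digest-then-check logic might look roughly like this sketch, in which `get_cached()`, its arguments, and the `.rds` storage format are all hypothetical:

```r
library (digest)

# Hypothetical caching wrapper: `query` is the full overpass query string,
# `process_fn` the end-point processing, and `cache_duration` a maximum
# cache age in hours.
get_cached <- function (query, process_fn, cache_dir = "~/.Rosmdata",
                        cache_duration = 24) {
    hash <- digest (query) # digest the call itself
    cache_file <- file.path (path.expand (cache_dir), paste0 (hash, ".rds"))
    if (file.exists (cache_file)) {
        age <- difftime (Sys.time (), file.info (cache_file)$mtime,
                         units = "hours")
        if (age <= cache_duration) {
            return (readRDS (cache_file)) # just re-load cached version
        }
    }
    res <- process_fn (query) # cache the full processed result
    saveRDS (res, cache_file)
    res
}
```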