-
Notifications
You must be signed in to change notification settings - Fork 560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tests demonstrating forward-slash behaviors in Turtle, JSON-LD, and SPARQL #1872
Conversation
…nd SPARQL Some correction is needed in the SPARQL engine to get these tests to pass. References: * RDFLib#1871 Signed-off-by: Alex Nelson <[email protected]>
Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist some tests here are failing, should they be marked as xfail maybe? https://github.com/RDFLib/rdflib/runs/6161549514?check_suite_focus=true
|
@aucampia No, I'd already designated xfail tests intentionally. The tests that are currently failing are highlighting behaviors that I think rdflib needs to change towards supporting. For instance, |
@ajnelson-nist appologies for not getting to this yet, will look at this on friday. |
Thanks, @aucampia . I have a hunch this could be something like a 1-line patch to code, but finding that 1 line could take a while. |
To me, if we are not following a spec that we say we comply with, it is a bug, and if someone is relying on a bug they should fix their code, though I can see how this is rather extreme, and maybe especially extreme in this case. PS: You can just comment with |
I agree it's a harsh resolution. However, not warning the user feels a bit like applying the ostrich algorithm to queries that are very likely not working as they should. I continue to think that parse errors are eventually the right answer. It's probably better to raise a |
Oh? Sorry, I didn't realize I'd done something |
The worry here is, if we take it as a breaking change the major version must change, and I somewhat want to stagger interface breaking changes so we don't release a new major version every time.
I could go for that.
You did not, just saw you made a commit to fix isort, I would also do the same but we want the isort and black to not be a nuisance to anyone and that is why we added pre-commit.ci, so I just let people know that it can black and isort their code if they want, just in case you did not know.. |
I'd picked up on your recent addition of pre-commit. I think I actually initialized pre-commit after my initial patch. Regardless, I'm using it now, no nuisance induced, no worries. I just ran a catch-up merge and checked afterwards with pre-commit, everything seems fine from a formatting front. |
I'm fine with the error being raised only after the next (or another upcoming) major release. I'll adjust the |
`_test_escapes_and_query` accidentally conflated the query "Running" as both compiling and returning results. This patch changes the `query_ran` variable to `query_compiled`. Signed-off-by: Alex Nelson <[email protected]>
AJN: @aucampia suggested this patch in Pull Request discussion, and I applied it while fixing a typo. Signed-off-by: Alex Nelson <[email protected]>
Some tests are now delayed for rdflib 7, as a result of discussion of balancing bug correction versus backwards compatibility. Suggested-by: Iwan Aucamp <[email protected]> Signed-off-by: Alex Nelson <[email protected]>
I've pushed a delay-until-rdflib-7 patch. The patch just prior should demonstrate from CI logs that these tests will expect parse failures. I realized just now that there is a bit of a coding-style mix, but the pass/fail results are correct. Could you comment on whether I did the version-7 detection correct, in terms of this project's version stamping policy? It's currently set to turn the tests on once version 7 releases. Thinking about it a bit more, you probably want these tests to turn on once you enter version 7 beta, rather than letting the new production release fall over immediately after stamping. ...having typed that out, I can't see you responding otherwise. Patch coming. |
Signed-off-by: Alex Nelson <[email protected]>
Ok, this PR is once again ready for discussion on fixing the bug. |
@ajnelson-nist I will have a look, I just tried with jena though and get this: $ sparql --query <(echo '
> PREFIX ex: <http://example.org/ontology/>
> CONSTRUCT { ex:s ex:p ex:\/o }
> WHERE { }
>
> ') --data ./test/data/rdfs.ttl
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/ontology/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
ex:s ex:p <http://example.org/ontology//o> . |
Also check RDF4J now:
|
Wikidata's query service also allows it, I think maybe the better option is to somehow implement a strict parsing flag. |
Three more implementations that disagree with me suggest I might not have read SPARQL's grammar correctly. Re-reviewing the grammar, A.2 and/or A.3 hint that I might not have understood whitespace processing possibly allowing for forward slashes. It appears I don't think I'm at a good energy level on a Friday afternoon to resolve this. If anyone's inclined, fresher eyes are welcome before I can return to this next week. |
I don't think you have, at least I understand it the same, and I can see even where the difference comes in: rdflib/rdflib/plugins/sparql/parser.py Line 223 in eba1373
If this line was instead:
Then it would behave as you expect. But I still think that given other implementations allow it we should maybe be similarly leniant. |
Just a checkin on this again, I think the most ideal solution here will be if we could somehow have a strict flag on the query parser, and then if it is set treat this as an error. I'm just not sure what the best way of passing this flag is right now, I guess we can go with *kwargs but will think about it a bit still. |
I think the best to do with these test is to merge them with xfail, and then once we have some solution to making the parser strict we can use that. |
I'd remarked on Issue 1890 on how well I think a keyword argument could work. I noticed #Current:
def prepareQuery(queryString, initNs={}, base=None) -> Query: ...
#After:
def prepareQuery(queryString, initNs={}, base=None, *args: typing.Any, strict: bool=False) -> Query: ... I'll be honest, Python's keyword vs positional arguments with defaults was confusing to me for many years, and still is when I don't see a |
@aucampia I've just done a catch-up merge with The sonarcloud service is complaining, to the tune of failing a test, about my usage of The new test passes (PASS and XFAILs work), so I am otherwise happy with this being merged and/or getting other maintainer feedback. |
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
I will process this later this week, hopefully tomorrow. |
@ajnelson-nist I'm going to make some changes to your branch, happy to adjust them based on your feedback, but in summary:
Regarding SonarCloud bot, it still needs tuning, at best it is a guide for potential issues, not something you should consider as authoritative at the moment, and the URI thing is going to be a problem until we figure out how to disable that permanently on SonarCloud. |
- I moved the Turtle and JSON-LD tests to `test/data/variants` as this allows for further expansion and uses a shared processor which somewhat simplifies testing. - I changed the JSON-LD to be compatible with Jena's riot and jsonld-cli as these do not allow for object valued `@type` directives. I also think this is in line with the JSON-LD spec. - I changed the conditional skips to xfails, as I think this is most appropriate in this case, we can provide a strict flag before version 7.0, and the strict flag's default value can be varied from release to release.
@ajnelson-nist made some minor changes in c387c78 Please let me know if you have any concerns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, we definitely should provide a way to make processing strict for everything we do, in part because there are many harmful consequences of the "robustness principle" [ref]:
I'm fine with this, and agree with the desired usages of skips. If at least you and I have a consensus on the strict flag, I think we should make sure Issue 1871 does NOT close when merging this PR. 1871 should slightly rescope to add a |
Making it now.
You don't have to merge anything into your branch, changes occur directly in your branch, so if you pull you will get the changes. |
It is allowed, but it is also technically reserved for future use and will be ignored by compliant processors (see this note) - I also checked that it works with riot, rdfpipe and jsonld-cli - however it is maybe best to avoid using a reserved keyword, so I will change this to |
- Use `_comment` instead of `@x-comment` to add a comment to JSONLD. - Fixed some typographical errors in tests. - Remove overpinning of output to simplify things a bit.
To prevent this you will have to change the description of this PR here #1872 (comment) to not contain |
Done now. |
I've revised the original comment, and that seems to have mechanically disassociated the issue. |
Is there any reason |
Aside from the extra unexplained change in |
In which commit? I don't see any non intended changes here: 519febc Maybe you are talking about changes in the value of $ task test -- test/test_sparql/test_forward_slash_escapes.py -rA
task: [test] .venv/bin/python -m pytest test/test_sparql/test_forward_slash_escapes.py -rA
============================================================================ test session starts ============================================================================
platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/iwana/sw/d/github.com/iafork/rdflib.cleanish, configfile: pyproject.toml
plugins: subtests-0.7.0, cov-3.0.0
collected 6 items
test/test_sparql/test_forward_slash_escapes.py .x.x.x [100%]
================================================================================== PASSES ===================================================================================
========================================================================== short test summary info ==========================================================================
PASSED test/test_sparql/test_forward_slash_escapes.py::test_query_prepares_expanded
PASSED test/test_sparql/test_forward_slash_escapes.py::test_escapes_and_query_turtle_expanded
PASSED test/test_sparql/test_forward_slash_escapes.py::test_escapes_and_query_jsonld_expanded
XFAIL test/test_sparql/test_forward_slash_escapes.py::test_query_prepares_prefixed
Contrary to the ratified SPARQL 1.1 grammar, the RDFlib SPARQL propcessor
accepts backslashes as part of PN_LOCAL which it treats as escape
characters.
There should be a way to instruct the SPARQL parser to operate in strict
mode, and in strict mode backslashes should not be permitted in PN_LOCAL.
See https://github.com/RDFLib/rdflib/issues/1871
XFAIL test/test_sparql/test_forward_slash_escapes.py::test_escapes_and_query_turtle_prefixed
Contrary to the ratified SPARQL 1.1 grammar, the RDFlib SPARQL propcessor
accepts backslashes as part of PN_LOCAL which it treats as escape
characters.
There should be a way to instruct the SPARQL parser to operate in strict
mode, and in strict mode backslashes should not be permitted in PN_LOCAL.
See https://github.com/RDFLib/rdflib/issues/1871
XFAIL test/test_sparql/test_forward_slash_escapes.py::test_escapes_and_query_jsonld_prefixed
Contrary to the ratified SPARQL 1.1 grammar, the RDFlib SPARQL propcessor
accepts backslashes as part of PN_LOCAL which it treats as escape
characters.
There should be a way to instruct the SPARQL parser to operate in strict
mode, and in strict mode backslashes should not be permitted in PN_LOCAL.
See https://github.com/RDFLib/rdflib/issues/1871
======================================================================= 3 passed, 3 xfailed in 0.30s ======================================================================== |
Ah, I think I've found it's out of scope of this PR, so no worries. I reviewed diff --git a/test/test_sparql/test_datetime_processing.py b/test/test_sparql/test_datetime_processing.py
index 8cec5cca..41da92be 100644
--- a/test/test_sparql/test_datetime_processing.py
b/test/test_sparql/test_datetime_processing.py
@@ -126,7 126,7 @@ def test_dateTime_duration_subs():
WHERE {
VALUES (?duration ?d) {
("P3DT1H15M"^^xsd:dayTimeDuration "2000-10-30T11:12:00"^^xsd:dateTime)
- ("P3DT1H15M"^^xsd:dayTimeDuration "2000-10-30"^^xsd:date)
("P3DT1H15M"^^xsd:dayTimeDuration "2000-10-30"^^xsd:date)
}
}
""" But, Github isn't showing it. When I reviewed the file with a text editor, I found it picked up tabs instead of spaces somewhere in recent work on the So, out of scope. I'm fine with this PR going in. |
This pull request supports Issue 1871.
A follow-on PR will be required to finish resolving that issue.
Summary of changes
Checklist
the same change.
./examples
for new features.CHANGELOG.md
).so maintainers can fix minor issues and keep your PR up to date.