add property oid
#3436
What types would we attach this to? Historically we've tried to minimize things at the Thing level, but it sounds like this would be at least Organization, Product, CreativeWork, Place, ... Is there anything that oid isn't used to identify? How much "oid data" is out there? |
Can these instead be encoded using the existing identifier property and PropertyValue? In which propertyID could perhaps be urn:oid:1.3.60 and value could be |
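A minimal sketch of that alternative, built as a Python dict and serialized to JSON-LD. The value is elided in the comment above, so this example borrows the DUNS number quoted elsewhere in this thread; the helper name oid_property_value is hypothetical, and nothing here establishes that consumers actually interpret a urn:oid: propertyID this way.

```python
import json


def oid_property_value(oid: str, value: str) -> dict:
    """Wrap an identifier in a schema.org PropertyValue keyed by an OID URN.

    Hypothetical helper: the urn:oid: convention for propertyID is the
    suggestion under discussion, not an established schema.org rule.
    """
    return {
        "@type": "PropertyValue",
        "propertyID": f"urn:oid:{oid}",
        "value": value,
    }


# Illustrative data only: 1.3.60 is the DUNS arc mentioned in this thread.
org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Ontotext",
    "identifier": [oid_property_value("1.3.60", "053393007")],
}

print(json.dumps(org, indent=2))
```

The identifier value stays a plain string, so leading zeros survive untouched.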
I'm mainly worried about the oid
|
For identifiers, we should always be worried about bit rot. |
@KalleOlaviNiemitalo I wouldn't worry about any of that if we promoted the URN convention within propertyID values? Its syntax has a long history (since 1997) and is already supported in tons of open source. https://datatracker.ietf.org/doc/html/rfc2141 |
@KalleOlaviNiemitalo OIDs are a continuous tree where the levels can be anything. (BTW there's an interesting story why
So in the global OID tree, it is unclear where you place the boundary between:
http://oid-info.com/get/1.3.60.053393007 doesn't resolve because:
I am a bit uneasy with my proposal to use an extension syntax like I know about the split
I don't think there's any conflict with RFC 3061 because Truth be told, because of this RFC there's very little loss if this proposal is rejected: I can just use
<https://kg.ontotext.com/resource/agent/ontotext>
  s:identifier <urn:oid:1.3.141:04KE8>;
  # or even
  s:sameAs <urn:oid:1.3.141:04KE8>.
My main motivation is that I researched https://en.wikipedia.org/wiki/ISO/IEC_6523 (a collection of identifier schemes) |
It matches the
If this remains invalid as a URN forever, and schema.org defines its own interpretation of the string "urn:oid:1.3.141:04KE8", then there is no conflict. But if IETF RFC 3061 is ever updated to make this syntax valid as a URN, with semantics different from what schema.org assigned, then that will be a conflict. |
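To make the validity question concrete, here is a rough check in Python. It assumes, as I read RFC 3061, that the part after urn:oid: must be dot-separated decimal numbers with no leading zeros; the function name is_rfc3061_oid_urn is made up for this sketch and the RFC text should be consulted before relying on it.

```python
import re

# Assumption (my reading of RFC 3061): the NSS is number *( "." number ),
# where a number is a single digit or a digit string without a leading zero.
_NUMBER = r"(?:0|[1-9][0-9]*)"
_OID_URN = re.compile(rf"urn:oid:{_NUMBER}(?:\.{_NUMBER})*", re.IGNORECASE)


def is_rfc3061_oid_urn(text: str) -> bool:
    """Return True if text looks like a urn:oid: URN under the assumption above."""
    return _OID_URN.fullmatch(text) is not None


for candidate in ("urn:oid:1.3.60", "urn:oid:1.3.141:04KE8", "urn:oid:1.3.60.053393007"):
    print(candidate, is_rfc3061_oid_urn(candidate))
```

Under that reading, both the colon-extended form and the form that appends an identifier with a leading zero as an extra arc fail the check, which is the tension discussed in this thread.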
Let's do that. Can you make a PR? |
@KalleOlaviNiemitalo OID syntax is strictly dotted integers ( I personally am ok with this because it seems useful to be able to mint URNs from any identifier:
@danbri I'll make a PR but first say what do you think of the above. |
I would distinguish between two uses of ISO 6523.
As mentioned above, the DUNS tree (
If you look at the list of ICDs, the registrars typically make it explicit when they want to use them for ISO 8348. From my position, the iso6523Code field should be restricted to Organization (and maybe Person). I would really avoid using identifier, because its usage is really vague and abstract for most users. |
The reference list of ICDs is missing the 9XXX codes which contain a lot of VAT number types. I read up a bit more and found the 9XXXs in the EAS list I found are not in the ICD list. So I guess they are not ICD codes. Wouldn't it be of value to be able to add things like VAT numbers for organization identifiers? https://ec.europa.eu/digital-building-blocks/sites/display/DIGITAL/Code lists |
@Tiggerito vatID is already a property on Organization if that helps vatID The Value-added Tax ID of the organization or person. |
@thadguidry Good point. I guess that combined with the organization's address/country would be enough. |
You assume two things:
Generally, taxID and vatID are fields which are difficult to parse, because the syntax is not specified (do you use the common formatting for the country?), you need the value from another field (country) to even know what it is, and even with the country value, parsing still has to handle ambiguities. ISO-6523 is much more constrained, and therefore more robust for parsing. The 9XXX are not official ICDs, these are part of the PEPPOL extension, which is a de facto standard. |
@MatthiasWiesmann I didn't assume any of those. I simply stated that Schema.org provides a property to hold the values. How to format the values, attach additional metadata to the values, that can all certainly be done when coordinating the vatID property with https://schema.org/PropertyValueSpecification could it not? Do you see gaps here if vatId context is that of PropertyValueSpecification or even using multi-typing to provide some external additional context? FYI, Schema.org typically doesn't get into the formatting weeds of values (since there's often not a need when we also have PropertyValueSpecification) unless absolutely necessary to make publisher/consumer lives easier. If you need help with PropertyValueSpecification, we can help, and move that discussion to our mailing list, or just directly in our GitHub Discussions button above. |
My main point is that ISO 6523 solves two problems: type identification and formatting. If I understand PropertyValueSpecification, we would need to have 1 per country and it would not help the identification issue, vatID and taxID are quite ambiguous and don't cover the space well, DUNS codes are neither. |
@MatthiasWiesmann Oops! So sorry, that should have been PropertyValue https://schema.org/PropertyValue . But the PropertyValueSpecification comes from Hydra and it's a way to specify a format (using its valuePattern). If using PropertyValue, to give more detail about an Organization's multiple Wouldn't that be enough to know that a or just give context (the ICD part) that it's a https://schema.org/iso6523Code directly on the vatID property. @danbri This is likely where we need to provide better docs and guidance on how best to use those. Hmm, and seems we missed adding |
Sadly, things are complicated.
The core problem is that users will just put whatever makes sense for them into the vatID, respectively taxID field. They could structure the values as PropertyValues, but they probably won't: it's complicated and the set of propertyID is not defined. Finding the list of ICDs is not trivial, the list of propertyId for vatIDs and taxIDs does not exist. So a parser will have to assume that taxID and vatID are basically synonyms and represent a badly defined variant of the iso6523 field: there are keys and values, except the set of keys is not really defined and neither is the format of the values. Proper validation is only possible by cross-referencing other fields (address, for the country), with magic fallbacks if this is missing. In turn this means online validation will be difficult, so reporting to the user their data does not parse is harder. Basically, which one do you think I would rather parse and validate?
or
Yes, you could do
Or
But this is more difficult to add to a web-page and more work to parse… |
I much prefer property/value pairs in JSON-LD, which is how I usually handle it. I have always found property/value pairs (semantics, which allows easier information exchange) to be easier to parse than values with separators that lack extra information. In fact, the term "parsing" indeed comes into play when separator characters are used to delineate information in a multi-value value. For JSON-LD, as a consumer of the data, the libraries handle the parsing for you, so all you have to do is make sense of values and perhaps custom extensions and RDFa nodes. But that's me. |
Australia is an interesting example. Let's see if I remember correctly. We have two business identifiers which are used to pay tax: ACN: Australian Company Number Both are numbers where the ACN is the ABN without the first few numbers. Only businesses registered as a company have an ACN. We also have TFN (Tax File Number), which businesses and individuals have. In Australia we pay GST not VAT. For businesses you track GST paid/charged via the ABN. ICD has an entry for the ABN (0151) but not the other two. So we can identify a business in Australia via the ABN. With the https://schema.org/PropertyValue idea, I guess we might be able to use the EAS codes.
I found what looks like a better EAS list that indicates what the source is: |
That "better EAS list" is incomplete :-( |
The data comes from systems which are probably not JSON-LD, and will be parsed into structures which are not JSON-LD. Breaking up the information in transit does not bring much, and adds risk of breakage. The whole point of standards like ISO 6523 is that values can be transported without any transformation in the same way as country codes (ISO 3166), language codes (ISO 639), date-times (ISO 8601). |
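For comparison, here is roughly what a consumer has to do with a single-string ISO 6523 value. This is a sketch under assumptions: a four-digit ICD, a ":" separator (as assumed elsewhere in this thread), and the 1.3 arc being the ISO 6523 branch of the OID tree. The names Iso6523Id, parse_iso6523 and icd_oid are hypothetical.

```python
from typing import NamedTuple


class Iso6523Id(NamedTuple):
    icd: str         # four-digit International Code Designator, e.g. "0060"
    identifier: str  # scheme-specific part, kept verbatim (leading zeros matter)


def parse_iso6523(code: str) -> Iso6523Id:
    """Split an 'ICD:identifier' string; raise ValueError if it does not fit."""
    icd, sep, identifier = code.partition(":")
    well_formed = sep and len(icd) == 4 and icd.isascii() and icd.isdigit() and identifier
    if not well_formed:
        raise ValueError(f"not an ICD:identifier string: {code!r}")
    return Iso6523Id(icd, identifier)


def icd_oid(icd: str) -> str:
    """Map an ICD to its arc under 1.3, assuming that branch is ISO 6523."""
    return f"1.3.{int(icd)}"


parsed = parse_iso6523("0060:053393007")
print(parsed, icd_oid(parsed.icd))
```

The identifier part is never converted to a number, which is why it can be carried through unchanged while the ICD alone tells the consumer which scheme it belongs to.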
This issue is being nudged due to inactivity. |
@danbri and @alex-jansen:
#2915 added https://schema.org/iso6523Code (see that issue and https://en.wikipedia.org/wiki/ISO/IEC_6523 for a description).
However, ISO 6523 is just the 1.3 branch of the ITU/ISO/IEC object ID (oid) hierarchy. See https://en.wikipedia.org/wiki/Object_identifier and https://www.wikidata.org/wiki/Property:P3743.
OIDs can be browsed and resolved at http://oid-info.com/, eg http://oid-info.com/get/1.3.60 is DUNS.
OID can also be used as URN, eg urn:oid:1.3.60.
We could even use this as a property (not that I'd recommend it), and eg declare:
s:duns owl:equivalentProperty urn:oid:1.3.60.
s:leiCode owl:equivalentProperty urn:oid:1.3.199.
# CAGE is urn:oid:1.3.141
Now consider https://www.wikidata.org/wiki/Q7095072 Ontotext and its DUNS "053393007" and CAGE "6H8F4".
We could express them as the following alternatives (assuming that ":" is used as a separator):
Here's a proposed definition: