Wikidata:Property proposal/earliest known OpenTimestamps
earliest known OpenTimestamps
Originally proposed at Wikidata:Property proposal/Commons
Description | Earliest known OpenTimestamps proof of an image, video or audio file |
---|---|
Data type | String |
Domain | OpenTimestamps proof (Q105855601) |
Allowed values | base64. Regex: [^-A-Za-z0-9 /=] |
Example 1 | File:Monumento_en_recuerdo_de_las_víctimas_de_la_pandemia_del_Covid-19_(Madrid),_18_May_2020.jpg →
AE9wZW5UaW1lc3RhbXBzAABQcm9vZgC/ieLohOiSlAEIqmhCuD/Hf4ZtJzsAHTuQh9y0nPkClhDz 1fdeopuI7h7wENd17BixRGTBvCRgkX/EAkcI8CBa w41aQZeT67MxV8a6yzLwM5 CGD1Ulj/ab4u 4EkpCAjxIB NHI1sm5WX8 US70Xgx4X8thEd5QPvpmRH4pwLn qECPEgwZerXDX3dIzm2TFEDW45 Q3AtCOochAehg/J5ogyYtdgI8CA8SlvQWaw7khoCbhoQtd0 PFDMpT8t/fGo5tMn6Ris4wjwIHmO KaNIQi3LwWhGADlVG/sN7sD6GNXtPN6jI5aqbOFMCPAg9QE6QOmANEFop4w5S6YGF8pBqOcbRxit gwwxawlIm9AI8BCKpKkqvs82FFUY pbH c/kCPEEXsLzg/AI5DlgRjbBYDcI8SCcKPfc5sG8TJjQ Xp kcNV/yIbcJ1TaPKuJltI1e1EdewjxIIywgDtDEexxANU6/hRjxfLVPyWpBRp9JKYBzypMvmtA CPEgh1kQ6Urwy6GpyR2Wa 53Emr8AvbXwmotNVh74fEhgYEI8CA1uwbzIusvFmTWqlf/9tg0ih3n p589opkQ/mz6Fmr0hAjxIFEwStiOuUVBzNWZXsaYM7h8rcJ56reJ9V06Ku9Wq38PCPAg5OygZlVq 00XtW3f3VyJhj7Qh7E8 sNrzshYgNw7KgMI8SDGNzqPE1 icnk1NRdBzNytsMguZAFgvogKDu5P 7CIP1AjxIN4HfpyjWeEWG4pnwXtxsEDa RAx4Vi/mNaU8u0Hts bCPEg6e5CofTGfhHfpM7RGXPE QjNY5Fw0CFTuCT3S5MN/WmcI8SB9I6WjLXjPYThqGqC/elvtbJu3t3gRw35gWp8jQRGNiwjxII9e CBzL40YnNfzyn4ORhBCFjpAwgtymAbK kVsup6jSCPEgcD7U8WAEmftpJcH4Yc/5csto8TCtILdv 65lWDjwqoCAI8CAL8o45rswne7cZbmrrXg3MQP565Ryn/KVoDUhzqduBkAjxICsjI 9/MlqYf2D7 2NMhUS9GIT0H5lB9J2voTDva7ZsJCPAgYHvLDgmDgzmITM8SapnzS1oGOwpZidYbHrkdy ElXoUI 8XEBAAAAAc QsL9Yqo4aUWGF5T9lojorCl42C7EaS9XwbWQ9XJ0IAAAAABcWABQ/ZZ3/jRuZ0NSz EGQUN0dZQ89VRv3///8CAUYEAAAAAAAXqRQhsSNjOkrjMU/sC TiStQnBwbTHocAAAAAAAAAACJq IPAEbKAJAAgI8CCvRyFW9QMGdd7kHReE0xQgG4/78U6FStJ51o25YB/1IwgI8SAQx2gEme6PKezd cePDHRV4oLSSgOjyG8jZDw/0S6nnZwgI8SB5Swqf8QwWAJfuX2Z3XrTQKAfnFFKTwyny51VYIxQy bQgI8CC/oEBbCu2Qe76XpB Wg4NL/0n21GxA4HHRCQe10PhhBwgI8SCLhMxRar7DRnPmaBz/GSxP g5pppSXWVPzmbj46Jk7 QAgI8CC1MpoACmclmjs4F NHHA1MGme/Io8mPYw 2yDlOhobkggI8SA3 edryoUc9SQw62Z7 u53TtK5ztEnGInUjWqCa/IZwLggI8SCt3iW9UHON4S11 6FdPMl5bIByFfoA OCfA/UYooNz37wgI8SDR EV5h9mQHeZJXDGy0MwB0v9TMUtozb4N7NnCcLTUAQgI8SAG3Tj7sWdc Og8BR3TeOFdWggcP7H3wrDRtVm5wEXhAUQgI8SDit/ppsbaYVBNs1c1OlOJHVK NrjktRaReuWDr QGvlnQgI8CB2PhvEnsOMgwvG7bzA1rR4q9 UFxzaIs8BXViUdPFeBQgIAAWIlg1z1xkBA 3AJg== |
Example 2 | MISSING |
Example 3 | MISSING |
Motivation
The timestamp in the Exif metadata of an image file is only as accurate as the clock in the camera, and it may be completely wrong. Moreover, it can easily be faked with a software app. An OpenTimestamps proof cannot be faked: it guarantees that a file existed at a specific time or earlier. This may be critical for historically relevant images and for new original discoveries.
It can also be applied to any other file format, including video and audio files.
OpenTimestamps is open source, scalable, permissionless and free to timestamp and to verify. FrankAndProust (talk) 19:06, 9 June 2021 (UTC)
Discussion
The size of a typical pruned OpenTimestamps proof is around 1500 bytes when encoded as base64 (String). This applies to any file, no matter how big.
The image in Example 1 was uploaded to Wikimedia Commons on 8th June 2021. The uploader might find it difficult to convince the community that the image dates from much earlier, specifically from 18th May 2020, because Exif data are easy to fake. However, the base64 representation of the OpenTimestamps proof irrefutably proves that the image existed on 19th May 2020 or earlier. --FrankAndProust (talk) 19:06, 9 June 2021 (UTC)
- Comment @FrankAndProust: Can you check whether these strings fit within the limits of Wikidata string type? It looks to me like it will be too long. You can test this on test.wikidata.org - create a new property there for this and try it out. ArthurPSmith (talk) 17:34, 10 June 2021 (UTC)
- Comment @ArthurPSmith: You are right. This OpenTimestamps (OTS) proof is slightly longer than the maximum length constraint for String on Commons (1500 characters) when encoded in base64.
- It would be possible to bypass the maximum length constraint by using the "Tabular data" datatype, but that would require a cluttered hack, so I do not consider it appropriate.
- Do you feel there is a suitable way of embedding the OTS proof on a property? If not, I will relinquish this property proposal and open a new one with URL as the suggested data type. --FrankAndProust (talk) 09:05, 11 June 2021 (UTC)
- I suppose the timestamp is itself a file, but it's not one of the formats that Commons accepts - I guess it could be done with a "cluttered hack" approach as you suggest! Other than that, do you think the more compact en:Ascii85 format rather than base64 would work here? ArthurPSmith (talk) 12:22, 11 June 2021 (UTC)
- Ascii85 provides some savings but it still falls short of the needs.
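As a rough illustration of why the saving is small: base64 emits 4 characters for every 3 bytes, while Ascii85 emits 5 characters for every 4 bytes. The sketch below uses Python's standard `base64` module to compare the two. The 1200-byte payload and the `encoded_lengths` helper are hypothetical stand-ins, not the actual proof from Example 1.

```python
import base64


def encoded_lengths(raw_size: int) -> tuple[int, int]:
    """Return (base64_length, ascii85_length) for a payload of raw_size bytes."""
    # Hypothetical filler bytes. The pattern never yields four zero bytes in a
    # row, so Ascii85's "z" shortcut for all-zero groups does not kick in.
    payload = bytes((i * 37 + 11) % 256 for i in range(raw_size))
    return len(base64.b64encode(payload)), len(base64.a85encode(payload))


if __name__ == "__main__":
    b64_len, a85_len = encoded_lengths(1200)  # assumed size, for illustration only
    print(f"base64: {b64_len} chars, Ascii85: {a85_len} chars")
```

For the assumed 1200-byte payload this gives 1600 characters in base64 against 1500 in Ascii85, a saving of roughly 6%.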
- A pruned OTS proof can be represented faithfully with the tabular data datatype. A three-column table is needed: the first column is the execution order, the second column is the command (mandatory), and the third column is the argument (optional). Commands are executed sequentially from first to last to generate the proof. The exact replica of the previously stated base64-encoded binary proof, expressed as tabular data, would be:
JPEG file sha256 hash (initial value): aa6842b83fc77f866d273b001d3b9087dcb49cf9029610f3d5f75ea29b88ee1e
Execution order | Command (mandatory) | Argument (non-mandatory) |
---|---|---|
1 | append | d775ec18b14464c1bc2460917fc40247 |
2 | sha256 | |
3 | prepend | 1f8d1c8d6c9b9597f3e512ef45e0c785fcb6111de503efa66447e29c0b9fea84 |
4 | sha256 | |
5 | prepend | c197ab5c35f7748ce6d931440d6e3943702d08ea1c8407a183f279a20c98b5d8 |
6 | sha256 | |
7 | append | 3c4a5bd059ac3b921a026e1a10b5dd3e3c50cca53f2dfdf1a8e6d327e918ace3 |
8 | sha256 | |
9 | append | 798e29a348422dcbc168460039551bfb0deec0fa18d5ed3cdea32396aa6ce14c |
10 | sha256 | |
11 | append | f5013a40e980344168a78c394ba60617ca41a8e71b4718ad830c316b09489bd0 |
12 | sha256 | |
13 | append | 8aa4a92abecf36145518fa96c7f9cfe4 |
14 | sha256 | |
15 | prepend | 5ec2f383 |
16 | append | e439604636c16037 |
17 | sha256 | |
18 | prepend | 9c28f7dce6c1bc4c98d05e9fa470d57fc886dc2754da3cab8996d2357b511d7b |
19 | sha256 | |
20 | prepend | 8cb0803b4311ec7100d53afe1463c5f2d53f25a9051a7d24a601cf2a4cbe6b40 |
21 | sha256 | |
22 | prepend | 875910e94af0cba1a9c91d966bee77126afc02f6d7c26a2d35587be1f1218181 |
23 | sha256 | |
24 | append | 35bb06f322eb2f1664d6aa57fff6d8348a1de7a79f3da29910fe6cfa166af484 |
25 | sha256 | |
26 | prepend | 51304ad88eb94541ccd5995ec69833b87cadc279eab789f55d3a2aef56ab7f0f |
27 | sha256 | |
28 | append | e4eca066556afb4d17b56ddfdd5c89863ed087b13cfac36bcec85880dc3b2a03 |
29 | sha256 | |
30 | prepend | c6373a8f135fa2727935351741ccdcadb0c82e640160be880a0eee4fec220fd4 |
31 | sha256 | |
32 | prepend | de077e9ca359e1161b8a67c17b71b040daf91031e158bf98d694f2ed07b6cf9b |
33 | sha256 | |
34 | prepend | e9ee42a1f4c67e11dfa4ced11973c4423358e45c340854ee093dd2e4c37f5a67 |
35 | sha256 | |
36 | prepend | 7d23a5a32d78cf61386a1aa0bf7a5bed6c9bb7b77811c37e605a9f2341118d8b |
37 | sha256 | |
38 | prepend | 8f5e081ccbe3462735fcf29f83918410858e903082dca601b2be915b2ea7a8d2 |
39 | sha256 | |
40 | prepend | 703ed4f1600499fb6925c1f861cff972cb68f130ad20b76feb99560e3c2aa020 |
41 | sha256 | |
42 | append | 0bf28e39aecc277bb7196e6aeb5e0dcc40fe7ae51ca7fca5680d4873a9db8190 |
43 | sha256 | |
44 | prepend | 2b2323ef7f325a987f60fbd8d321512f46213d07e6507d276be84c3bdaed9b09 |
45 | sha256 | |
46 | append | 607bcb0e09838339884ccf126a99f34b5a063b0a5989d61b1eb91dcbe1255e85 |
47 | sha256 | |
48 | prepend | 0100000001cf90b0bf58aa8e1a516185e53f65a23a2b0a5e360bb11a4bd5f06d643d5c9d0800000000171600143f659dff8d1b99d0d4b310641437478633cf5546fdffffff02014604000000000017a91421b123633a4ae3314fec0be4e24ad4270706d31e870000000000000000226a20 |
49 | append | 6ca00900 |
50 | sha256 | |
51 | sha256 | |
52 | append | af472156f5030675dee41d1784d314201b8ffbf14e854ad279d68db9601ff523 |
53 | sha256 | |
54 | sha256 | |
55 | prepend | 10c7680499ee8f29ecdd71e3c31d1578a0b49280e8f21bc8d90f0ff44ba9e767 |
56 | sha256 | |
57 | sha256 | |
58 | prepend | 794b0a9ff10c160097ee5f66775eb4d02807e7145293c329f2e755582314326d |
59 | sha256 | |
60 | sha256 | |
61 | append | bfa0405b0aed907bbe97a41f9683834bff49f6d46c40e071d10907b5d0f86107 |
62 | sha256 | |
63 | sha256 | |
64 | prepend | 8b84cc516abec34673e6681cff192c4f839a69a525d654fce66e3e3a264efe40 |
65 | sha256 | |
66 | sha256 | |
67 | append | b5329a000a67259a3b3817e3471c0d4c1a67bf228f263d8c3edb20e53a1a1b92 |
68 | sha256 | |
69 | sha256 | |
70 | prepend | 3779daf2a1473d490c3ad99efebb9dd3b4ae73b449c62275235aa09afc86702e |
71 | sha256 | |
72 | sha256 | |
73 | prepend | adde25bd50738de12d75fba15d3cc9796c807215fa003827c0fd4628a0dcf7ef |
74 | sha256 | |
75 | sha256 | |
76 | prepend | d1f8457987d9901de6495c31b2d0cc01d2ff53314b68cdbe0decd9c270b4d401 |
77 | sha256 | |
78 | sha256 | |
79 | prepend | 06dd38fbb1675c3a0f014774de38575682070fec7df0ac346d566e7011784051 |
80 | sha256 | |
81 | sha256 | |
82 | prepend | e2b7fa69b1b69854136cd5cd4e94e24754af8dae392d45a45eb960eb406be59d |
83 | sha256 | |
84 | sha256 | |
85 | append | 763e1bc49ec38c830bc6edbcc0d6b478abdf94171cda22cf015d589474f15e05 |
86 | sha256 | |
87 | sha256 |
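The table above can be replayed mechanically. The following Python sketch is only an illustration under stated assumptions, not the OpenTimestamps client: `replay` and the `(command, argument)` tuple layout are hypothetical, and only the first four rows of the table are included to keep it short.

```python
import hashlib
from typing import Optional

# One table row: (command, hex argument or None), listed in execution order.
Operation = tuple[str, Optional[str]]


def replay(initial_hash_hex: str, operations: list[Operation]) -> bytes:
    """Apply append / prepend / sha256 commands sequentially to a message."""
    message = bytes.fromhex(initial_hash_hex)
    for command, argument in operations:
        if command == "sha256":
            message = hashlib.sha256(message).digest()
        elif command in ("append", "prepend"):
            if argument is None:
                raise ValueError(f"{command} requires an argument")
            data = bytes.fromhex(argument)
            message = message + data if command == "append" else data + message
        else:
            raise ValueError(f"unknown command: {command}")
    return message


if __name__ == "__main__":
    # Initial value and rows 1 to 4 copied from the table above.
    file_hash = "aa6842b83fc77f866d273b001d3b9087dcb49cf9029610f3d5f75ea29b88ee1e"
    first_rows: list[Operation] = [
        ("append", "d775ec18b14464c1bc2460917fc40247"),
        ("sha256", None),
        ("prepend", "1f8d1c8d6c9b9597f3e512ef45e0c785fcb6111de503efa66447e29c0b9fea84"),
        ("sha256", None),
    ]
    print(replay(file_hash, first_rows).hex())
```

Feeding all 87 rows would be expected to produce the Merkle root of the Bitcoin block that anchors the proof. Block explorers usually display that root with its byte order reversed, so a comparison would need to account for that.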
- The great advantage of this format is that the proof is embedded in Wikimedia Commons: it is always available to everybody.
- The great disadvantage is that the OpenTimestamps client does not understand this format, so currently it cannot be used directly as input to the OTS validator.
- This is why I feel the correct solution, at least as a first step, would be a URL data type pointing to an OTS timestamp file. OTS timestamps would consequently be hosted outside the Wikimedia projects. The main disadvantage is that the host can withdraw those files at any time without prior notice, making them unavailable to Wikidata users. However, I don't feel there is a better solution.
- So as I said, if no better suggestion is provided, I will create a new property proposal for "earliest known OpenTimestamps URL". --FrankAndProust (talk) 19:04, 11 June 2021 (UTC)
- Comment I have no idea of the usefulness of the above data, but if Commons contributors want it stored as structured data, this is feasible.
- In Wikibase, the length of string property is configurable. So it's technically possible that the SDC team increases it for the above.
- Another aspect is the GUI. The value may not look that great in the GUI and isn't really useful for human readers. It may be worth defining a new sub-datatype of string values that just has some summary as displayed value and the full string value is only visible when querying or editing. There are already a few string subtypes with special display. This would need to be requested from the SDC team as well.
- At least in Wikidata, it's possible to convert between string datatypes, so it may be possible to start out with a string property while the new datatype is being developed. --- Jura 13:03, 13 June 2021 (UTC)
- Comment @Jura1: The solution you propose is the one I like the most: "a new sub-datatype of string values that just has some summary as displayed value and the full string value is only visible when querying or editing".
- And, like you said, "it may be possible to start out with a string property while the new datatype is being developed". As we have already tested out, the standard constraint of 1500 characters is not enough if we use base64 encoding. Maybe we could start out first with a string property with a max length constraint of 3000 characters?
- This is my implementation proposal. It can be "tuned" to better adapt to the current Wikimedia technical infrastructure:
- 1. The content of the OpenTimestamps file is pasted into a text box, encoded as base64, and stored in a sub-datatype of string.
- 2. Wikidata calculates the SHA256 hash of the highest-resolution image (or video, or audio).
- 3. Wikidata verifies that the SHA256 hash from step 2 matches all the way up to the Merkle root in the OpenTimestamps file. This is done with the "verify" command. There are implementations of this command in Python, Java, Rust and JavaScript at the OpenTimestamps GitHub.
- 4. Wikidata identifies the Bitcoin block that the Merkle root belongs to and checks its date and time. It is not necessary to enforce trustless verification here; that would make the process more cumbersome, so it would be easier not to use a Bitcoin node in this step and to rely on publicly available APIs instead. For the image in this example, we would need these two API calls:
$ curl https://blockstream.info/api/block-height/630893
0000000000000000000964ee0c5d67c74783626a058a76105f77ce0469ef379c
$ curl https://blockstream.info/api/block/0000000000000000000964ee0c5d67c74783626a058a76105f77ce0469ef379c
{"id":"0000000000000000000964ee0c5d67c74783626a058a76105f77ce0469ef379c", "height":630893, "version":536870912, "timestamp":1589857180, "tx_count":2065, "size":1396834, "weight":3993379, "merkle_root":"71cb3d8de9d964953d5f1a0ee00deb650a16cc055bed2366881a4574617af828", "previousblockhash":"000000000000000000068e9a1a96347f7815cd6f0a4d8c9e1b45305a9afd8364", "mediantime":1589851896, "nonce":3231938334, "bits":387021369, "difficulty":16104807485529}
- We take the "mediantime" value from the JSON response, which in this case is 1589851896. Treated as Unix time, it gives us "Tue May 19 2020 01:31:36 GMT+0000". This proves that the example image existed on that date or earlier.
- 5. The date obtained in the previous step is shown to the user. If the user presses a button somewhere, the content of the OTS file is made visible in base64 encoding.
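The block-date lookup described above amounts to reading one field from the block JSON and interpreting it as Unix time. A minimal Python sketch, assuming the JSON has already been fetched: the shortened `block_json` literal copies only the relevant fields from the response above, and `block_date` is a hypothetical helper name.

```python
import json
from datetime import datetime, timezone


def block_date(block_json: str, field: str = "mediantime") -> datetime:
    """Return the chosen block time field as a timezone-aware UTC datetime."""
    block = json.loads(block_json)
    return datetime.fromtimestamp(block[field], tz=timezone.utc)


if __name__ == "__main__":
    # Fields copied from the blockstream.info response for block 630893.
    block_json = '{"height": 630893, "timestamp": 1589857180, "mediantime": 1589851896}'
    print(block_date(block_json).isoformat())  # 2020-05-19T01:31:36+00:00
```

Keeping the field name as a parameter makes it easy to compare "mediantime" with the block's own "timestamp" if a different convention is preferred.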
- n.b.: I am not affiliated with OpenTimestamps or blockstream.info .
- I am a developer and I could freely help on my spare time if necessary.
- What are your thoughts? --FrankAndProust (talk) 19:26, 14 June 2021 (UTC)
- It is already more than two weeks with no responses. Would it be more practical to simply display the URL pointing to the external OpenTimestamps file? --FrankAndProust (talk) 09:40, 29 June 2021 (UTC)
- Oppose you should just build this as something separate from Commons. On Commons we have the upload history in which we can verify that something was uploaded at some point of time. That is enough for us. Bloating the Commons database for this is a bit too much. Multichill (talk) 16:58, 13 September 2021 (UTC)
- I'm marking this as not done because the data it's intended to store is too long. If that changes, it could be revisited, but I would recommend only doing that if there is evidence that other people would use this because, right now, there are no files on Commons which even mention this, not even the example given here. - Nikki (talk) 00:24, 24 December 2021 (UTC)