-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Syntax for Dictionary types #3355
Comments
I think the only bit of this that could provide opposing intuitions is pattern matching. One intuition is that the data structure is built inductively like everything else and therefore matching on one element is (arbitrarily) order-dependent. The other intuition is that dictionaries are not (generally) order dependent so pattern matching on one element means “this element is in the dictionary.” The latter is incorrect, of course. This probably isn’t a problem that outweighs the benefits of pattern matching with the new syntax I don’t think, but it certainly is a consideration not shared by Lists. |
Thanks for starting this in a well-thought out way. My thoughts:
|
@joelberkeley Thanks for your comments. For your concerns about 2 and 3, please refer to the "Scope" section:
Uniqueness of keys and dictionary indexing remains unchanged and is left as an implementation choice of the library writer. |
@joelberkeley You can use induction-recursion to define a dictionary import Data.So
record Distinct (x, y : String) where
constructor MkDistinct
runDistinct : So (x /= y)
data FreshDict : Type -> Type
isFresh : String -> FreshDict a -> Type
data FreshDict where
DNil : FreshDict a
DCons : (key : String) -> (value : a) ->
(dict : FreshDict a) ->
{auto 0 fresh : isFresh key dict} ->
FreshDict a
isFresh x DNil = ()
isFresh x (DCons y _ dict) = (Distinct x y, isFresh x dict)
failing #"Can't find an implementation for Distinct "1" "1"."#
test : FreshDict Nat
test = DCons "1" 1
$ DCons "1" 2
$ DNil |
Indeed, I mainly want to check if this functionality will lend itself well to these situations, esp when they strike me as a particularly common use-case. |
I like the alternative considered (using a pairing operator), I'm not so sure about the argument to reject it. Would defining this syntax ( Including it in a stdlib would signal it is the preferred/conventional syntax. Maybe there are other arguments against the pairing operator approach that would strengthen this proposal? |
I think the pairing operator is a very strong alternative, and it has two great benefits over this proposal:
Beside the subjective aesthetic difference, dictionary syntax has two benefits over operators lists:
On this last point, providing a preferred operator in the Prelude will not prevent users from using it for non-dictionary purposes or redefining it entirely, which I could qualify as "misuse". |
Summary
Develop new syntax to write dictionaries inspired from the list syntax sugar:
[:=]
[k := v]
[ k1 := v1, k2 := v2, ...]
DCons k v ds
Motivation
Contemporary programming involves manipulation a common set of
data structures, like Lists, Strings, and numbers. One of them is dictionaries.
I will call Dictionary the general category of data structure that involve a key-value pair.
In industry, dictionaries are seen as an efficient key-value storage that indexes the value
in
O(Log(n))
complexity. In idris, this datastructure isSortedMap
.In reality, there are many forms of key-value pair datastructures that
do not fulfill this performance promise but we would still like to use key-value pair syntax to represent them
those use-cases are:
List (a, b)
, lists of pairsBecause of their ubiquity, all those dictionary types should enjoy an easily accessible and
familiar syntax. This is further evidenced by other languages providing syntax sugar
specifically for dictionary types. Like we find in python
{a : b}
, Swift[a: b]
andRuby
{a => b}
The proposal
The dictionary syntax is developed as a desugaring step similar to the one used for lists. To
recap the list desugaring: Lists
[a, b, c]
are desugared into successive application of theconstructor
::
(pronounced "cons") andNil
. Such that[a, b, c]
desugars toa :: (b :: (c :: Nil))
.The proposal is to allow the parser to accept syntax of the form
[ a := b , c := d ]
.This is meant to look like a list where terms on the left of
:=
as assigned values to the right ofit. Another way to understand this syntax is to thing of it as record update but using square brackets
as enclosing symbols, rather than curly brackets.
Scope
The syntax sugar does not help with indexing dictionaries, dictionary update, or performance.
This proposal is only about syntax sugar to write terms that match key-value pair semantics.
They should allow pattern matching but not develop new pattern matching semantics to avoid
introducing complexity in case trees or in learning curve.
Examples
Empty dictionary:
[:=]
Desugars to:
DNil
Singleton dictionary:
[ key := val ]
Desugars to:
DCons key val DNil
Multiple key-values dictionary:
[ k1 := v1 , k2 := v2 ]
Desugars to:
DCons k1 v1 (DCons k2 v2 DNil)
Pattern matching:
How to use it
Currently the constructor for JSON is
Object : List (String, JSON) -> JSON
and so writing JSON values looks like this:There are two problem, we always need to nest
Object
andArray
and the second is that we use the full pair syntaxfor key-value pairs in objects.
We can fix the
Array
andObject
nesting by providing a suitable::
function, but we cannot remove the parenthesisnesting without introducing a bespoke operator for pairing.
Instead, the above can be written in direct style by designing a DSL for objects-as-dictionaries:
The JSON DSL is implemented in the tests
Complex record libraries using a schema as index can be update to support the new dictionary syntax as demonstrated
here: https://github.com/andrevidela/idris-records/blob/dictionary-syntax/src/Records.idr#L78-L79
Technical implementation
This feature is implemented as a desugaring step where the parser is extended to parse lists of pairs of expressions separated by
:=
in the same way as record updates but using square brackets instead.A prototype implementation is here where you can see that the diff is fairly small and the changes to the syntax are also small since all the components already exist. It can be found in the draft #3356 .
Alternatives considered
Use
[]
for empty dictionaryThis does work but it creates unwelcome ambiguities when mixing lists and dictionaries in the same data.
For example given a value of type
JSON
we cannot know if we are building an empty array or an empty objectif we write
and
Nil
stands both for empty dictionary and empty array.To remove this ambiguity, I propose to use
[:=]
as a reserved token to refer to empty dictionaries.We could use
[:]
or[=]
which better mimic[>]
and[<]
but they do not reflect reflect thesyntax used for dictionaries.
Same notation, old constructors
Another option would be to desugar
[k := v]
intoMkPair k v :: []
instead ofDCons k v []
.This would work for the most common case but falls apart whenever there is a type dependency between
the key and the value, or the key-value pair and the rest of the data. For example, we could not implement
the indexed data :
This is because the constructor for the key/value pair is dependent, and
MkPair
is not.Use a pairing operator
This problem already has a solution: Use list syntax with a custom operator to pair-up
key-values. If the pairing operator is called
::=
the json example can be written:Scala uses
->
as a pairing operator:Haskell's aeson uses
.=
for pairing:Other Idris libraries currently use
::=
,:>
,=:
,:=:
:->:
and more.The shortcoming of this approach is obvious by now: there is absolutely
no uniformity when building key-value pairs in Idris. However, in the absence
of a standard pairing operator, using
:=
would render all uses of dictionaries uniform,decreasing the cognitive overhead of learning a new operator for each bespoke dictionary
type.
Out of this, stems two counter points:
,
The current standard pairing operator is the
Pair
type, constructed in its type, and term,with the syntax
(_, _)
. One can use tuple sections to use it as a function (,). We canalso use the name
Pair
to build the type orMkPair
to build terms of the aforementioned type.The problem becomes apparent when attempting to use the standard constructor for pair
to build dictionaries: It is verbose, repetitive, and noisy:
While only makes use of prelude constructors, it is not
a good user experience to type that much boilerplate. This argument is supported by
the fact that library designers have been deploying their own pairing operators to make
constructing dictionaries easier.
The second argument is strongest when applied with domain-specific knowledge. There
are uses of dictionaries where the use of
:=
would be misleading and generally worse thanbespoke operators. For example a typing context is itself a form of dictionary, holding variables
as keys and types as values, but it is very much expected that we write them with a bespoke
operator to represent the fact that a variable is assigned a type, for example
[ x :- t ]
.The presence of this feature does not forbid the use of custom operators for such purposes,
and in fact, is completely backward compatible.
Do nothing
One last thing to consider is to do nothing. Other successful languages such as Java
Haskell or Rust do not have a bespoke dictionary syntax. This is clear evidence that such feature
is not necessary for wider adoption and its maintenance cost might exceed the overall
usage benefits we get from it. However, I personally think that the above argument about
uniformity are enough to justify the additional feature. As the prototype shows, the diff
for it is quite small, rendering the maintenance cost minimal, see #3356.
Conclusion
Dictionary syntax solidifies Idris as a modern programming language, enjoying the same ergonomics
as other programming environments that focus on user-friendliness and accessibility. What's more,
enabling a uniform syntax that accommodates for dictionaries, JSON, XML, dependent types all at once
represents an unprecedented opportunity to lead the way in programming language experience.
The text was updated successfully, but these errors were encountered: