Whitaker's Words in Python: Latin Dictionary and Morphology parser


blagae/open_words

 
 

Open Words

Notice: this project has moved

This project has moved to its own repository.

Because I am taking the code in a different direction than the original author of open_words intended, and to avoid a clash with them if I ever release the project as a package, I have moved away from this fork. A separate repository will host all subsequent changes to the code base.

Project history

Open Words is a port of William Whitaker's original Ada code to Python for future maintenance and improvement. The Ada code base that began with the original Whitaker's Words is still being developed in Martin Keegan's GitHub repository. More information about William Whitaker and the Words program is available there.

Changes since fork

This project is a fork of the initial effort by Luke Hollis. A few features have been removed:

  • the planned English-to-Latin lookups have been abandoned
  • multi-word lookups are no longer possible

Other changes include:

  • inefficient dictionary loops (O(n)) have been replaced by lookups (O(log n))
  • Parse has been renamed to Parser
  • the formatting logic has moved to its own module, named formatter
  • tests were added
  • format_data is now a data feeding program which reads Whitaker's file lists into Python dictionaries and lists
  • format_data logic is called automatically if necessary (should be only once)
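
As an illustration of the change from O(n) loops to O(log n) lookups, here is a minimal sketch using a sorted list and the standard `bisect` module. The data and the names `STEMS`, `find_linear` and `find_sorted` are hypothetical and not taken from the open_words code base; the project's actual data structures may differ.

```python
from bisect import bisect_left

# Hypothetical sample data: (stem, entry_id) pairs, kept sorted by stem.
STEMS = sorted([("reg", 2), ("am", 1), ("reg", 3), ("vid", 4)])
# Parallel list of keys so bisect can compare plain strings.
STEM_KEYS = [stem for stem, _ in STEMS]


def find_linear(stem):
    """O(n): inspect every entry in the list."""
    return [entry for key, entry in STEMS if key == stem]


def find_sorted(stem):
    """O(log n) to locate the first match, then collect adjacent matches."""
    i = bisect_left(STEM_KEYS, stem)
    matches = []
    while i < len(STEMS) and STEM_KEYS[i] == stem:
        matches.append(STEMS[i][1])
        i += 1
    return matches
```

Both functions return the same entries; only the cost of finding them differs, which matters once the list holds tens of thousands of stems.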

License

This project is released under the MIT license, which was inherited from Luke Hollis's project.

Usage

For a standard dictionary lookup, use the Parser class as follows:

from open_words.parse import Parser
parser = Parser()
parser.parse("regis")

The return value is a Python dictionary, easily serialized to JSON, structured as follows:

{
  "word": "regis",
  "defs": [
    { "orth": [ "rex", "reg" ],
      "senses": [ "king" ],
      "infls": [
        { "stem": "reg",  "ending": "is", "pos": "noun",
          "form": { "declension": "accusative", "number": "plural", "gender": "masculine" }
        }]
    },
    { "orth": [ "rego", "regere", "rexi", "rectus" ],
      "senses": [ "rule, guide", "manage, direct" ],
      "infls": [
        { "stem": "reg", "ending": "is", "pos": "verb",
          "form": { "tense": "present", "voice": "active", "mood": "indicative", "person": 2, "number": "singular" }
        }]
    }
  ]
}
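
A short sketch of how such a result might be consumed. The `result` dictionary below is copied by hand from the sample above rather than produced by calling the library, and the helper `summarize` is a hypothetical name introduced only for this illustration.

```python
import json

# Hand-copied from the sample result above; not produced by calling Parser.
result = {
    "word": "regis",
    "defs": [
        {"orth": ["rex", "reg"],
         "senses": ["king"],
         "infls": [
             {"stem": "reg", "ending": "is", "pos": "noun",
              "form": {"declension": "accusative", "number": "plural",
                       "gender": "masculine"}}]},
        {"orth": ["rego", "regere", "rexi", "rectus"],
         "senses": ["rule, guide", "manage, direct"],
         "infls": [
             {"stem": "reg", "ending": "is", "pos": "verb",
              "form": {"tense": "present", "voice": "active",
                       "mood": "indicative", "person": 2,
                       "number": "singular"}}]},
    ],
}


def summarize(result):
    """Return one (headword, part of speech, first sense) tuple per inflection."""
    rows = []
    for definition in result["defs"]:
        headword = definition["orth"][0]
        first_sense = definition["senses"][0]
        for infl in definition["infls"]:
            rows.append((headword, infl["pos"], first_sense))
    return rows


# The dictionary holds only strings, integers, lists and dicts,
# so json.dumps can serialize it without a custom encoder.
as_json = json.dumps(result, indent=2)
```

Because the structure is plain nested dictionaries and lists, it survives a round trip through `json.dumps` and `json.loads` unchanged.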
