Linguistics Software for BSD

Browse free open source Linguistics software and projects for BSD below. Use the toggles on the left to filter open source Linguistics software by OS, license, language, programming language, and project status.

  • Turn speech into text using Google AI Icon
    Turn speech into text using Google AI

    Accurately convert voice to text in over 125 languages and variants by applying Google's powerful machine learning models with an easy-to-use API.

    New customers get $300 in free credits to spend on Speech-to-Text. All customers get 60 minutes for transcribing and analyzing audio free per month, not charged against your credits.
    Try for free
  • Translate docs, audio, and videos in real time with Google AI Icon
    Translate docs, audio, and videos in real time with Google AI

    Make your content and apps multilingual with fast, dynamic machine translation available in thousands of language pairs.

    Google Cloud’s AI-powered APIs help you translate documents, websites, apps, audio files, videos, and more at scale with best-in-class quality and enterprise-grade control and security.
    Learn More
  • 1
    Virastyar

    Virastyar

    Virastyar is an spell checker for low-resource languages

    Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering, 19(2), 259-284. Rasooli, M. S., Kahefi, O., & Minaei-Bidgoli, B. (2011). Effect of adaptive spell checking in Persian. In NLP-KE Contributors: Omid Kashefi Azadeh Zamanifar Masoumeh Mashaiekhi Meisam Pourafzal Reza Refaei Mohammad Hedayati Kamiar Kanani Mehrdad Senobari Sina Iravanin Mohammad Sadegh Rasooli Mohsen Hoseinalizadeh Mitra Nasri Alireza Dehlaghi Fatemeh Ahmadi Neda PourMorteza
    Leader badge
    Downloads: 608 This Week
    Last Update:
    See Project
  • 2
    Mishkal: Arabic Text Vocalization

    Mishkal: Arabic Text Vocalization

    Arabic Text Vocalization system

    Automatic system of vocalization of arabic text.
    Leader badge
    Downloads: 157 This Week
    Last Update:
    See Project
  • 3
    iramuteq
    IRAMUTEQ : Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires. Logiciel de traitement de données pour des corpus texte ou de type individus/caractères. Permet notamment de réaliser des analyses de type "ALCESTE"
    Leader badge
    Downloads: 613 This Week
    Last Update:
    See Project
  • 4
    Artha ~ The Open Thesaurus
    Artha is a handy thesaurus based on WordNet with distinct features like global hotkey look-up, passive desktop notifications, regular expression based search, etc.. Artha may be used as a free open-source replacement to the proprietary WordWeb Pro.
    Leader badge
    Downloads: 108 This Week
    Last Update:
    See Project
  • Cloud data warehouse to power your data-driven innovation Icon
    Cloud data warehouse to power your data-driven innovation

    BigQuery is a serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data.

    BigQuery Studio provides a single, unified interface for all data practitioners of various coding skills to simplify analytics workflows from data ingestion and preparation to data exploration and visualization to ML model creation and use. It also allows you to use simple SQL to access Vertex AI foundational models directly inside BigQuery for text processing tasks, such as sentiment analysis, entity extraction, and many more without having to deal with specialized models.
    Try for free
  • 5
    Free Dictionaries
    Free translating dictionaries. Source format: TEI-P5 XML. Delivery formats: DICT, Stardict, etc. The dictionaries may include information on the pronunciation, etymology and such, in a platform-independent format. Access: web/plugins/standalone.
    Leader badge
    Downloads: 105 This Week
    Last Update:
    See Project
  • 6
    Apertium: Machine Translation Toolbox

    Apertium: Machine Translation Toolbox

    The free and open-source rule-based machine translation platform

    Apertium is a toolbox to build open-source shallow-transfer machine translation systems, especially suitable for related language pairs: it includes the engine, maintenance tools, and open linguistic data for several language pairs.
    Downloads: 20 This Week
    Last Update:
    See Project
  • 7

    Wordcorr

    Data management for comparative linguistics

    Wordcorr automates the tedious and risky process of tabulating and managing the sound correspondences used in working out the historical development of natural languages. Initial support was from NSF.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 8

    Presage

    the intelligent predictive text entry platform

    Presage (formerly Soothsayer) is an intelligent predictive text entry system. Presage generates predictions by modelling natural language as a combination of redundant information sources. Presage computes probabilities for words which are most likely to be entered next by merging predictions generated by the different predictive algorithms. Presage's modular and extensible architecture allows its language model to be extended and customized to utilize statistical, syntactic, and semantic predictive algorithms. Presage's predictive capabilities are implemented by predictive plugins. Predictive plugins use services provided by the platform to implement multiple prediction techniques.
    Leader badge
    Downloads: 24 This Week
    Last Update:
    See Project
  • 9
    srt-translator

    srt-translator

    Subtitle translator from one natural language to other.

    Translating subtitles in format SubRip from one natural language to other. It is based on Google Translate without API and therefore without payment. Translator have automatic and manual spell checkers.
    Downloads: 28 This Week
    Last Update:
    See Project
  • Create and run cloud-based virtual machines. Icon
    Create and run cloud-based virtual machines.

    Secure and customizable compute service that lets you create and run virtual machines on Google’s infrastructure.

    Computing infrastructure in predefined or custom machine sizes to accelerate your cloud transformation. General purpose (E2, N1, N2, N2D) machines provide a good balance of price and performance. Compute optimized (C2) machines offer high-end vCPU performance for compute-intensive workloads. Memory optimized (M2) machines offer the highest memory and are great for in-memory databases. Accelerator optimized (A2) machines are based on the A100 GPU, for very demanding applications.
    Try for free
  • 10
    JInsect
    The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11
    oopinyinguide
    OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
    Leader badge
    Downloads: 16 This Week
    Last Update:
    See Project
  • 12
    The Dictionary System
    The application Dictionary System (DS) is a web application designed for creation of one-way bilingual dictionaries or encyclopaedias offering a working environment for creation of a dictionary and a web page which enables the general public to search in the dictionary. It is so-called DWS application (Dictionary Writing System) or DPS (Dictionary Production / Publishing System). Aplikace Dictionary System (dále DS) je webová aplikace. Je to tzv. DWS aplikace (Dictionary Writing System) nebo také tvz. DPS (Dictionary Production/Publishing System). Aplikace Dictionary System nabízí pracovní prostředí pro tvorbu jednosměrných dvojjazyčných slovníků nebo encyklopedií a webové stránky, které umožňují vyhledávat ve slovníku široké veřejnosti.
    Leader badge
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    AzConvert is an open source program to convert different scripts of Azerbaijani language (Latin, Arabic and Cyrillic) to each other. It's written in Qt.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 14
    Fresh Memory

    Fresh Memory

    Flashcards application with Spaced Repetition method

    Fresh Memory is an application that helps to learn large amounts of any material with Spaced Repetition method. The most important subject is learning foreign words, but Fresh Memory can be also used to learn anything else. The learning data is stored as flash cards and dictionaries. The flash cards may have several fields, and the user controls what combination of fields to learn. The flashcards can have formatted text and images.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    A project that aims to create reusable components (C libraries, COM components, and Edit controls) for Phonetic Transliteration of Indian languages, such as Telugu, Tamil, Kannada etc.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    EyeMap - Eye Movement Data Analyzer
    EyeMap is a visualization and analysis tool for text reading eye movement data. It can process Unicode, proportion/non-proportion and spaced/unspaced reading materials, which supports various languages and experiment methods.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    Al-Mintiq: Arabic eSpeak

    Al-Mintiq: Arabic eSpeak

    Arabic voice files for eSpeak system

    Arabic files and voices for eSpeak Text to speech system, المنطيق : ملفات اللغة العربية لبرنامج توليد الكلام من النص إسبيك
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    HermeneutiX

    HermeneutiX

    Your graphical tool for Syntactic/Semantic Structure Analysis of texts

    HermeneutiX is a tool for diagramming syntactic and semantic structures of complex (not necessarily foreign-language) texts (e.g. bible or other historical excerpts). HermeneutiX is now part of SciToS (the scientific tool set). Starting with version 2.0.0, HermeneutiX can be found on GitHub. Please check out the release summary: https://github.com/scientific-tool-set/scitos/releases For an introduction, check out this video: https://youtu.be/uQjewyG0Ad8 PS: To run a Java application such as HermeneutiX (i.e. SciToS) you need a Java Runtime Environment (JRE). HermeneutiX is currently built to be compatible down to JRE version 6. You may download the current JRE here: http://www.java.com/en/download
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    JoBimText

    JoBimText

    Linking Language to Knowledge with Distributional Semantics

    JobimText is a software solution for automatic text expansion using contextualized distributional similarity. It provides text analysis tools for large corpora and has capabilities to create distributional semantic models (JoBimText models) and multi-word expressions.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20
    LaBB-CAT

    LaBB-CAT

    A linguistic annotation store

    LABB-CAT is a browser-based linguistics research tool that stores recordings and regular-expression searchable text transcripts of interviews. The search results, entire transcripts, and media, can be viewed or exported in a variety of format
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21

    TextComparer

    Small Java program to compare two texts

    Small Java program to compare two texts, originally designed to be used to find quotations in a Byzantine anthology. It can quite likely be used to detect plagiarism between two texts as well Graphical interface which allows easy navigation between corresponding parts in the two different texts. Uses the http://software.jessies.org/salma-hayek/ Java TextArea for this. Probably very much can be done to improve this program and the algorithm which it uses.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    ELIA(Eyegaze Language Integration Analysis) supports the analysis of eye-tracking data for studies in language processing. ELIA eases early analysis of data to enable iterative development of experiments in response to spoken language.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Lyreword
    Lyreword is a flexible word generator for writers, role players, conlangers and everybody who seeks some inspiration for inventing words and names.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Mani
    Coptic - English and Coptic - Czech dictionary related to Crum's coptic dictionary, written in C , based on MySql, with Qt GUI. Is developed as part of project Marcion, containing only coptic data without study environment.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    Maskouk : Arabic Collocations
    Maskouk: Arabic Collocations Dictionary المسكوكات اللفظية العربيو، المتلازمات المتواردات
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next