ARBML is a group of researchers working on democratizing Arabic NLP research and development:

  • 🙋‍♀️ All about Arabic NLP and ML, open source for the win!
  • 🏵️ Contribution guidelines: open an issue and, once given the go-ahead, submit a PR.
  • 👩‍💻 Some repos have specific contribution guidelines.
  • 📝 Remember to cite if you use one of our resources.

Pinned

  1. ARBML Public

    Implementations of many Arabic NLP and CV projects, offering a real-time experience through several interfaces: web, command line, and notebooks.

    JavaScript, 382 stars, 45 forks

  2. klaam Public

    Arabic speech recognition, classification and text-to-speech.

    Jupyter Notebook, 325 stars, 66 forks

  3. masader Public

    The largest public catalogue for Arabic NLP and speech datasets. There are 500 datasets annotated with more than 25 attributes.

    JavaScript, 140 stars, 22 forks

  4. Calliar Public

    A dataset for online Arabic calligraphy. A collection of 2500 annotated calligraphic styles.

    Jupyter Notebook, 140 stars, 16 forks

  5. tkseem Public

    Arabic Tokenization Library. It provides many tokenization algorithms.

    Jupyter Notebook, 82 stars, 17 forks

  6. CIDAR Public

    Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.

    Jupyter Notebook, 30 stars, 3 forks

Repositories

Showing 10 of 30 repositories
  • masader Public

    The largest public catalogue for Arabic NLP and speech datasets. There are 500 datasets annotated with more than 25 attributes.

    JavaScript, 140 stars, GPL-3.0 license, 22 forks, 3 issues, 0 pull requests. Updated Jul 19, 2024
  • Calliar Public

    A dataset for online Arabic calligraphy. A collection of 2500 annotated calligraphic styles.

    Jupyter Notebook, 140 stars, MIT license, 16 forks, 2 issues, 0 pull requests. Updated Jun 24, 2024
  • dar Public

    A simple semi-supervised approach for creating Hugging Face dataset loading scripts and uploading them to the Hub.

    Python, 11 stars, Apache-2.0 license, 1 fork, 1 issue, 0 pull requests. Updated Jun 23, 2024
  • masader-webservice
    Python, 5 stars, MIT license, 4 forks, 2 issues, 0 pull requests. Updated Jun 22, 2024
  • arbml.github.io
    HTML, 0 stars, 2 forks, 1 issue, 0 pull requests. Updated May 10, 2024
  • .github Public
    1 star, 1 fork, 0 issues, 0 pull requests. Updated Apr 13, 2024
  • CIDAR-v2 Public
    Jupyter Notebook, 4 stars, 1 fork, 2 issues, 0 pull requests. Updated Mar 30, 2024
  • cidar_human_eval
    Python, 1 star, 1 fork, 1 issue, 0 pull requests. Updated Mar 3, 2024
  • ARBML Public

    Implementations of many Arabic NLP and CV projects, offering a real-time experience through several interfaces: web, command line, and notebooks.

    JavaScript, 382 stars, MIT license, 45 forks, 9 issues (4 issues need help), 0 pull requests. Updated Mar 1, 2024
  • CIDAR Public

    Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.

    Jupyter Notebook, 30 stars, 3 forks, 0 issues, 0 pull requests. Updated Feb 22, 2024