A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 13,331 994 Updated Oct 28, 2024

Docker image for Ubuntu Desktop that supports hardware GPU-accelerated GUI apps. You can access the container over SSH or remote desktop, just like a cloud VM.

Dockerfile 284 63 Updated Oct 24, 2024

An easy-to-use framework for modular RAG

Python 289 43 Updated Oct 28, 2024

An experimental project for researching mainstream OCR algorithms; it currently implements the CNN + BLSTM + CTC architecture.

C 1,266 536 Updated Jun 13, 2020

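Agricultural knowledge graph (AgriKG): information retrieval, named entity recognition, relation extraction, intelligent question answering, and decision support for the agricultural domain.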

Python 3,998 1,563 Updated Jul 19, 2024

RabbitMQ running on Docker

145 138 Updated Jun 7, 2017

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 49,821 7,119 Updated Oct 28, 2024

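A Python module for extracting provinces, cities, and districts from Simplified Chinese strings, with support for mapping, validation, and simple plotting.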

Python 1,671 397 Updated Mar 19, 2024

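Administrative divisions of the People's Republic of China: province level (provinces), prefecture level (cities), county level (districts and counties), township level (townships and subdistricts), and village level (village and residents' committees). Two-, three-, four-, and five-level cascading address data for China's provinces, cities, districts, towns, and villages.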

JavaScript 18,536 7,039 Updated Sep 13, 2023

Unify Efficient Fine-tuning of RAG Retrieval, including Embedding, ColBERT, and Cross Encoder

Python 483 44 Updated Oct 27, 2024

The parser of a rule-matching-based QA system

Python 11 4 Updated Mar 13, 2020

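A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at lower cost, including base models, vertical-domain fine-tuning and applications, datasets, and tutorials.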

15,659 1,451 Updated Sep 19, 2024

Address recognition, error correction, and completion

Python 4 3 Updated Mar 26, 2020

Chinese address segmentation and address completion

Python 35 10 Updated Feb 15, 2019

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 5,791 524 Updated Oct 24, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 30,119 3,526 Updated Oct 26, 2024

This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild" on table detection and table structure recognition.

Python 162 15 Updated Sep 15, 2021
Jupyter Notebook 59 17 Updated Sep 3, 2023
Jupyter Notebook 13 2 Updated Feb 7, 2024
Jupyter Notebook 256 61 Updated Sep 12, 2024

Example models using DeepSpeed

Python 6,058 1,030 Updated Sep 17, 2024

Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"

Assembly 535 44 Updated May 20, 2024

Build Text Rerankers with Deep Language Models

Python 250 23 Updated Feb 20, 2024

LLM (Large Language Model) FineTuning

Jupyter Notebook 458 110 Updated May 19, 2024

Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines

Python 194 13 Updated May 6, 2024

A simple toy demo of a local voice assistant with Whisper and a large language model.

Python 1,264 205 Updated Apr 30, 2024

Open Language Pre-trained Model Zoo

986 136 Updated Nov 18, 2021

This repository contains tutorials and examples for Triton Inference Server

Python 548 92 Updated Oct 26, 2024

《跟我一起深度学习》 (Learn Deep Learning with Me)

Python 177 25 Updated Jul 8, 2024

Netease Youdao's open-source embedding and reranker models for RAG products.

Python 1,450 96 Updated Sep 6, 2024