weijunjiang123/graphragflow

🚀 GraphRAGFlow

A graph-based Retrieval-Augmented Generation (GraphRAG) implementation that uses LLMs served through Ollama or the OpenAI API, with Neo4j as the graph database. The project processes documents, extracts entities and relationships with an LLM, and stores the resulting knowledge graph in Neo4j for advanced question answering.

[Image: demo of the resulting knowledge graph in Neo4j]

✨ Features

  • Document processing and chunking
  • LLM-based knowledge graph extraction
  • Fix for handling string target attributes during graph document preprocessing
  • Neo4j integration for graph storage
  • Vector embeddings for semantic search
  • Entity extraction capabilities
  • Progress tracking for long-running operations
  • Batch processing with progress indicators
  • Modern Next.js frontend for interactive visualization and management
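
The "string target attributes" fix listed above is not spelled out in this README. As a rough illustration only, the sketch below shows one way such preprocessing could work: when an LLM returns a relationship whose endpoint is a bare string instead of a node object, the string is wrapped in a node before storage. The `Node` and `Relationship` classes, the `as_node` helper, and the default `"Entity"` label are all hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a graph node; not the project's class."""
    id: str
    type: str = "Entity"  # assumed default label for untyped nodes


@dataclass(frozen=True)
class Relationship:
    """Hypothetical stand-in for an extracted relationship."""
    source: Node
    target: Node
    type: str


def as_node(value):
    """Return a Node unchanged; wrap a bare string in a Node."""
    if isinstance(value, Node):
        return value
    if isinstance(value, str):
        return Node(id=value.strip())
    raise TypeError(f"cannot convert {type(value).__name__} to Node")


def normalize_relationship(source, target, rel_type):
    """Build a Relationship whose endpoints are always Node objects."""
    return Relationship(as_node(source), as_node(target), rel_type)


# The LLM returned the target as a plain string with stray whitespace.
rel = normalize_relationship(Node("Neo4j", "Database"), " Cypher ", "USES")
print(rel.target)
```

Normalizing at a single point keeps later steps, such as writing to Neo4j, free of type checks.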

🚀 Quickstart

๐Ÿณ Neo4j Setup with Docker

If you don't have a Neo4j environment, you can easily set up your own using Docker:

  1. Create a docker-compose.yml file in the project root with the following content:
version: '3'
services:
  neo4j:
    image: neo4j:5.13.0
    container_name: neo4j-graphrag
    ports:
      - "7474:7474"  # HTTP
      - "7687:7687"  # Bolt
    volumes:
      - ./neo4j/data:/data
      - ./neo4j/logs:/logs
      - ./neo4j/import:/import
      - ./neo4j/plugins:/plugins
    environment:
      - NEO4J_AUTH=neo4j/your_password  # Change this password
      - NEO4J_dbms_memory_heap_initial__size=1G
      - NEO4J_dbms_memory_heap_max__size=2G
      - NEO4J_dbms_memory_pagecache_size=1G
      # Allow the GDS, APOC, and vectorize procedures to run
      - NEO4J_dbms_security_procedures_unrestricted=gds.*,apoc.*,vectorize.*
      - NEO4J_dbms_security_procedures_allowlist=gds.*,apoc.*,vectorize.*
      # Install Neo4j plugins (APOC, Graph Data Science, neosemantics/n10s)
      - NEO4J_PLUGINS=["apoc", "graph-data-science", "n10s"]
  2. Start the Neo4j container:
docker-compose up -d
  3. Open the Neo4j Browser at http://localhost:7474 to verify the installation.
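
If you prefer to check from a script rather than a browser, the following minimal sketch probes the HTTP port mapped in the compose file above. It uses only the Python standard library. The function name `neo4j_http_reachable` is made up for this example. It only confirms that something answers on the port, not that authentication or the Bolt port works.

```python
import urllib.error
import urllib.request


def neo4j_http_reachable(url="http://localhost:7474", timeout=3.0):
    """Return True if an HTTP server answers at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Neo4j HTTP endpoint reachable:", neo4j_http_reachable())
```

The container can take a little while to start, so a `False` result right after `docker-compose up -d` may simply mean Neo4j is still booting.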

📦 Requirements

  • Python 3.8+
  • Ollama with models:
    • qwen2.5 (default LLM model)
    • nomic-embed-text (for embeddings)
  • Alternatively, models available through the OpenAI API (requires an API key in .env)
  • Neo4j database instance
  • Required Python packages:
    • langchain and langchain_experimental
    • neo4j
    • pydantic
    • tqdm
    • fastapi
    • uvicorn
    • pypdf

๐Ÿ› ๏ธ Installation

Using uv for package management is recommended.

  1. Clone this repository:
git clone https://github.com/weijunjiang123/GraphRAG-with-Ollama.git
cd GraphRAG-with-Ollama
  2. Install the required packages with uv:
uv sync
  3. Set up a Neo4j database instance (local or cloud). If you don't have one yet, see the Neo4j Setup with Docker section above.

  4. Make sure Ollama is running with the required models:

ollama pull qwen2.5
ollama pull nomic-embed-text

Alternatively, configure an API key in .env to use the OpenAI API instead.

โš™๏ธ Configuration

Copy .env.example to .env:

cp .env.example .env

Modify the variables in .env to match your environment (see the linked configuration reference for details).
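
To make the role of .env concrete, here is a small standard-library sketch of reading such a file. It is not the project's actual configuration loader. The variable name `NEO4J_URI` and its default value are assumptions chosen for illustration, so check .env.example for the real names.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def get_setting(name, file_values, default=None):
    """Prefer a real environment variable, then the .env value, then default."""
    return os.environ.get(name, file_values.get(name, default))


settings = load_env_file()
# NEO4J_URI is a hypothetical variable name used only for this example.
neo4j_uri = get_setting("NEO4J_URI", settings, "bolt://localhost:7687")
print(neo4j_uri)
```

This sketch does not handle trailing comments on a value line or multi-line values. A dedicated package such as python-dotenv covers those cases.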

โ–ถ๏ธ Usage

Run the main script to process a document and build the knowledge graph:

uv run main.py

The process includes:

  1. Loading and processing documents
  2. Converting documents to graph format
  3. Saving extracted graph documents
  4. Initializing Neo4j graph
  5. Adding graph documents to Neo4j
  6. Creating vector and fulltext indices
  7. Setting up entity extraction

๐ŸŒ Frontend Deployment (Next.js)

The frontend is located in the web/ directory and built with Next.js. You can use it for interactive visualization and management of the knowledge graph.

💻 Local Development

  1. Enter the frontend directory:
cd web
  2. Install dependencies:
npm install
# or
yarn install
# or
pnpm install
# or
bun install
  3. Start the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
  4. Open your browser and visit http://localhost:3000

๐Ÿ—๏ธ Build & Production Deployment

  1. Build the frontend static files:
npm run build
  2. Start the production server:
npm start
  3. Or deploy the .next or out directory to Vercel, Netlify, or any static hosting service.

๐Ÿ–ผ๏ธ Screenshots

[Images: Frontend Example 1, Frontend Example 2]

🔧 How it Works

This project implements a GraphRAG approach:

  1. Document Processing: Text documents are loaded and split into manageable chunks.
  2. Knowledge Graph Extraction: An LLM identifies entities and relationships from text.
  3. Graph Storage: The extracted knowledge is stored in Neo4j as a graph.
  4. Vector Embeddings: Document chunks are embedded for semantic search.
  5. Retrieval: When querying, the system can use both graph traversal and vector similarity.
  6. Entity Extraction: A separate chain extracts entities from arbitrary text.

๐Ÿ“ Directory Structure

GraphRAG-with-Llama-3.1/
├── .env.example              # Example environment variables
├── .gitignore                # Specifies intentionally untracked files that Git should ignore
├── main.py                   # Main script to run the application
├── README.md                 # Documentation for the project
├── requirements.txt          # List of Python dependencies
├── src/                      # Source code directory
│   ├── config.py             # Configuration settings
│   ├── core/                 # Core logic and modules
│   │   ├── document_processor.py  # Handles document loading and chunking
│   │   ├── embeddings.py          # Manages embeddings creation and vector index
│   │   ├── entity_extraction.py   # Extracts entities from text
│   │   ├── graph_transformer.py   # Converts documents to graph format
│   │   ├── neo4j_manager.py       # Manages Neo4j database operations
│   │   └── ...                    # Other core modules
│   └── utils.py              # Utility functions
├── data/                     # Directory for storing data files
│   └── ...                   # Documents to be processed
├── results/                  # Directory for storing output files
│   └── ...                   # Extracted graph documents
├── web/                      # Next.js frontend project directory
│   ├── app/                  # Next.js pages and components
│   ├── public/               # Static assets
│   ├── package.json          # Frontend dependencies and scripts
│   └── ...                   # Other frontend-related files
└── ...                       # Other directories and files

๐Ÿค Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your changes.

🔗 Reference

https://github.com/Coding-Crashkurse/GraphRAG-with-Llama-3.1

๐Ÿ“ License

MIT
