🚀 GraphRAGFlow

A graph-based Retrieval-Augmented Generation (GraphRAG) implementation that uses LLMs served by Ollama or the OpenAI API, with Neo4j as the graph database. This project processes documents, extracts entities and relationships using LLMs, and stores the resulting knowledge graph in Neo4j for advanced question answering.

Here is a demo in Neo4j:

✨ Features

Document processing and chunking
LLM-based knowledge graph extraction
Fix for handling string target attributes during graph document preprocessing
Neo4j integration for graph storage
Vector embeddings for semantic search
Entity extraction capabilities
Progress tracking for long-running operations
Batch processing with progress indicators
Modern Next.js frontend for interactive visualization and management

🚀 Quickstart

🐳 Neo4j Setup with Docker

If you don't have a Neo4j environment, you can easily set up your own using Docker:

Create a docker-compose.yml file in the project root with the following content:

version: '3'
services:
  neo4j:
    image: neo4j:5.13.0
    container_name: neo4j-graphrag
    ports:
      - "7474:7474"  # HTTP
      - "7687:7687"  # Bolt
    volumes:
      - ./neo4j/data:/data
      - ./neo4j/logs:/logs
      - ./neo4j/import:/import
      - ./neo4j/plugins:/plugins
    environment:
      - NEO4J_AUTH=neo4j/your_password  # Change this password
      - NEO4J_dbms_memory_heap_initial__size=1G
      - NEO4J_dbms_memory_heap_max__size=2G
      - NEO4J_dbms_memory_pagecache_size=1G
      # Enable vector index support
      - NEO4J_dbms_security_procedures_unrestricted=gds.*,apoc.*,vectorize.*
      - NEO4J_dbms_security_procedures_allowlist=gds.*,apoc.*,vectorize.*
      # Install Neo4j plugins (APOC, Graph Data Science, neosemantics/n10s)
      - NEO4J_PLUGINS=["apoc", "graph-data-science", "n10s"]

Start the Neo4j container:

docker-compose up -d

Access the Neo4j Browser at http://localhost:7474 to verify the installation
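
If the browser page does not load, it can help to first check whether the two published ports are reachable at all. The following is a minimal sketch using only the Python standard library; the helper name port_is_open is made up for this example and is not part of the project. It only tests TCP reachability, not authentication.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the docker-compose.yml above: 7474 (HTTP) and 7687 (Bolt).
    for name, port in (("HTTP", 7474), ("Bolt", 7687)):
        status = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"Neo4j {name} port {port}: {status}")
```

A port reported as "not reachable" usually means the container is still starting or the port mapping differs from the compose file.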

📦 Requirements

Python 3.8+
Ollama with models:
- qwen2.5 (default LLM model)
- nomic-embed-text (for embeddings)
Or models available through the OpenAI API
Neo4j database instance
Required Python packages:
- langchain and langchain_experimental
- neo4j
- pydantic
- tqdm
- fastapi
- uvicorn
- pypdf
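
To check quickly whether the Python packages listed above are importable in the current environment, a sketch along these lines may be useful. It is not part of the project; the function name missing_modules is an assumption for illustration, and the list simply mirrors the package names above.

```python
from importlib.util import find_spec

# Import names for the packages listed in the requirements above.
REQUIRED_MODULES = [
    "langchain", "langchain_experimental", "neo4j", "pydantic",
    "tqdm", "fastapi", "uvicorn", "pypdf",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages found.")
```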

🛠️ Installation

We recommend using uv for package management.

Clone this repository:

git clone https://github.com/weijunjiang123/GraphRAG-with-Ollama.git
cd GraphRAG-with-Ollama

Install required packages with uv:

uv sync

Set up a Neo4j database instance (local or cloud). If you don't have one yet, see the Neo4j Setup with Docker section above.
Make sure Ollama is running with the required models:

ollama pull qwen2.5
ollama pull nomic-embed-text

Alternatively, you can configure an API key in .env.
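
When using Ollama, one way to confirm that both models are present is to ask the Ollama server which models it has locally. The sketch below assumes the server listens on its default address http://localhost:11434 and uses the /api/tags listing endpoint; the helper names are made up for this example and are not part of the project.

```python
import json
import urllib.request


def installed_model_names(tags_payload: dict) -> set:
    """Extract base model names (without the ':tag' suffix) from an /api/tags response."""
    return {m["name"].split(":", 1)[0] for m in tags_payload.get("models", [])}


def missing_models(tags_payload: dict, required=("qwen2.5", "nomic-embed-text")) -> list:
    """Return the required model names that the server does not list."""
    have = installed_model_names(tags_payload)
    return [name for name in required if name not in have]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Query a running Ollama server for its locally available models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With Ollama running, missing_models(fetch_tags()) returns an empty list once both pulls above have completed.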

⚙️ Configuration

Copy .env.example to .env:

cp .env.example .env

Modify the variables in .env to match your environment.

See .env.example for the available variables.
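
Before running the pipeline, it can be handy to verify that .env actually defines the values you expect. The following is a minimal standard-library sketch, not the project's own configuration loader. The variable names NEO4J_URI, NEO4J_USERNAME, and NEO4J_PASSWORD are hypothetical and used for illustration only; substitute the names found in .env.example.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict, required) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Hypothetical variable names, for illustration only.
sample = """
# Neo4j connection
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=
"""
config = parse_env(sample)
print(missing_keys(config, ["NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"]))
# prints ['NEO4J_PASSWORD']
```

To check a real file, pass the result of open(".env").read() to parse_env instead of the sample string.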

▶️ Usage

Run the main script to process a document and build the knowledge graph:

uv run main.py

The process includes:

Loading and processing documents
Converting documents to graph format
Saving extracted graph documents
Initializing Neo4j graph
Adding graph documents to Neo4j
Creating vector and fulltext indices
Setting up entity extraction
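
The first steps of this process (splitting a document into chunks, then working through the chunks in batches while reporting progress) can be sketched as follows. This is an illustration under simplifying assumptions, not the project's code: it splits on fixed character windows, reports progress with print rather than tqdm, and the handler argument is a hypothetical stand-in for the LLM-based graph extraction call.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 20) -> list:
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last window is not just a repeat of the overlap.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def batched(items: list, batch_size: int):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_in_batches(chunks: list, handler, batch_size: int = 4) -> list:
    """Apply handler to every chunk, printing progress after each batch."""
    results = []
    total = len(chunks)
    for batch in batched(chunks, batch_size):
        results.extend(handler(chunk) for chunk in batch)
        print(f"processed {len(results)}/{total} chunks")
    return results
```

For example, process_in_batches(split_into_chunks(document_text), handler) would call handler once per chunk and print a progress line after every batch of four.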

🌐 Frontend Deployment (Next.js)

The frontend is located in the web/ directory and built with Next.js. You can use it for interactive visualization and management of the knowledge graph.

💻 Local Development

Enter the frontend directory:

cd web

Install dependencies:

npm install
# or
yarn install
# or
pnpm install
# or
bun install

Start the development server:

npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev

Open your browser and visit http://localhost:3000

🏗️ Build & Production Deployment

Build the frontend static files:

npm run build

Start the production server:

npm start

Or deploy the .next or out directory to Vercel, Netlify, or any static hosting service.

🖼️ Screenshots

🔧 How it Works

This project implements a GraphRAG approach:

Document Processing: Text documents are loaded and split into manageable chunks.
Knowledge Graph Extraction: An LLM identifies entities and relationships from text.
Graph Storage: The extracted knowledge is stored in Neo4j as a graph.
Vector Embeddings: Document chunks are embedded for semantic search.
Retrieval: When querying, the system can use both graph traversal and vector similarity.
Entity Extraction: A separate chain extracts entities from arbitrary text.
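
The retrieval step above, which combines graph traversal with vector similarity, can be illustrated with a small in-memory sketch. Everything here is a toy stand-in: in the real project the vectors come from the embedding model and the relationships from Neo4j, and the function names, sample vectors, and sample triples are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec, chunk_vecs, k=2):
    """Return the ids of the k chunks most similar to the query vector."""
    ranked = sorted(chunk_vecs, key=lambda cid: cosine(query_vec, chunk_vecs[cid]), reverse=True)
    return ranked[:k]


def graph_neighbors(entities, edges):
    """Return (source, relation, target) triples that touch any of the given entities."""
    wanted = set(entities)
    return [t for t in edges if t[0] in wanted or t[2] in wanted]


def build_context(query_vec, query_entities, chunk_vecs, chunks, edges, k=2):
    """Merge graph facts and similar passages into one context string for the LLM."""
    facts = [f"{s} -{r}-> {t}" for s, r, t in graph_neighbors(query_entities, edges)]
    passages = [chunks[cid] for cid in vector_search(query_vec, chunk_vecs, k)]
    return "\n".join(["Graph facts:"] + facts + ["Passages:"] + passages)


# Toy data standing in for embedded chunks and extracted relationships.
chunks = {"c1": "Neo4j stores graphs.", "c2": "Ollama serves local LLMs.", "c3": "Next.js renders the UI."}
chunk_vecs = {"c1": [1.0, 0.0], "c2": [0.0, 1.0], "c3": [0.7, 0.7]}
edges = [("Neo4j", "STORES", "Graph"), ("Ollama", "SERVES", "LLM")]

print(build_context([0.9, 0.1], ["Neo4j"], chunk_vecs, chunks, edges))
```

The design point is that the two sources are complementary: the graph contributes explicit facts about entities named in the question, while vector search contributes passages that are semantically close even when no entity matches.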

📁 Directory Structure

GraphRAG-with-Ollama/
├── .env.example             # Example environment variables
├── .gitignore                # Specifies intentionally untracked files that Git should ignore
├── main.py                   # Main script to run the application
├── README.md                 # Documentation for the project
├── requirements.txt          # List of Python dependencies
├── src/                      # Source code directory
│   ├── config.py             # Configuration settings
│   ├── core/                 # Core logic and modules
│   │   ├── document_processor.py  # Handles document loading and chunking
│   │   ├── embeddings.py          # Manages embeddings creation and vector index
│   │   ├── entity_extraction.py   # Extracts entities from text
│   │   ├── graph_transformer.py   # Converts documents to graph format
│   │   ├── neo4j_manager.py       # Manages Neo4j database operations
│   │   └── ...                    # Other core modules
│   └── utils.py                   # Utility functions
├── data/                     # Directory for storing data files
│   └── ...                   # Documents to be processed
├── results/                  # Directory for storing output files
│   └── ...                   # Extracted graph documents
├── web/                      # Next.js frontend project directory
│   ├── app/                  # Next.js pages and components
│   ├── public/               # Static assets
│   ├── package.json          # Frontend dependencies and scripts
│   └── ...                   # Other frontend-related files
└── ...                       # Other directories and files

🤝 Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your changes.

🔗 Reference

https://github.com/Coding-Crashkurse/GraphRAG-with-Llama-3.1

📝 License

MIT
