Running LightRAG with Ollama on Colab - SEが最近起こったことを書くブログ

I got LightRAG running with Ollama on Google Colab, so here are my notes on the steps.

First, set up the Ollama environment.

!curl -fsSL https://ollama.com/install.sh | sh
!nohup ollama serve &
!ollama pull <LLM model to use>
!ollama pull nomic-embed-text
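Because `ollama serve` is started in the background, the `ollama pull` commands can run before the server is ready. The following is a minimal sketch of a readiness check, assuming Ollama listens on its default address `http://127.0.0.1:11434`; the helper names `is_ollama_up` and `wait_for_ollama` are hypothetical and not part of Ollama or LightRAG.

```python
import time
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default host and port.
OLLAMA_URL = "http://127.0.0.1:11434"


def is_ollama_up(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as "not ready".
        return False


def wait_for_ollama(base_url: str = OLLAMA_URL, retries: int = 10, delay: float = 1.0) -> bool:
    """Poll the server until it responds or the retries run out."""
    for _ in range(retries):
        if is_ollama_up(base_url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_ollama()` in a cell right after `!nohup ollama serve &` and running the pulls only when it returns `True` is one way to avoid a race between the two commands.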

If Ollama's context size is too small, the knowledge graph cannot be built, so use a Modelfile to increase it. First, export the model's existing Modelfile.

!ollama show --modelfile <LLM model to use> > Modelfile

Append the following line to the Modelfile:

PARAMETER num_ctx 32768
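On Colab it can be easier to make this edit from a Python cell than in an editor. Here is a minimal sketch, assuming the Modelfile sits in the current directory; `set_num_ctx` is a hypothetical helper, and it drops any existing `num_ctx` line so the parameter appears only once.

```python
from pathlib import Path


def set_num_ctx(modelfile: str = "Modelfile", num_ctx: int = 32768) -> str:
    """Set PARAMETER num_ctx in a Modelfile and return the new file content."""
    path = Path(modelfile)
    lines = path.read_text(encoding="utf-8").splitlines()
    # Remove any existing num_ctx setting (Modelfile keywords are not case sensitive).
    kept = [
        line for line in lines
        if [word.lower() for word in line.split()[:2]] != ["parameter", "num_ctx"]
    ]
    # Append the new context size as the last line.
    kept.append(f"PARAMETER num_ctx {num_ctx}")
    text = "\n".join(kept) + "\n"
    path.write_text(text, encoding="utf-8")
    return text
```

Running `set_num_ctx("Modelfile", 32768)` after the `ollama show` step should leave the file ready for `ollama create`.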

Create a model from the Modelfile with the following command:

!ollama create -f Modelfile <larger-context model name>

Next, install LightRAG.

!pip install lightrag-hku

Download the document to process.

!curl https://raw.githubusercontent.com/gusye1234/nano-graphrag/main/tests/mock_data.txt > ./book.txt

Build the knowledge graph.

from lightrag import LightRAG, QueryParam
from lightrag.llm import ollama_model_complete, ollama_embedding
from lightrag.utils import EmbeddingFunc
import nest_asyncio
nest_asyncio.apply()

import os

WORKING_DIR = "./dickens"

if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

# Initialize LightRAG with Ollama model
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=ollama_model_complete,  # Use Ollama model for text generation
    llm_model_name="<larger-context model name>",  # Your model name
    # Use Ollama embedding function
    embedding_func=EmbeddingFunc(
        embedding_dim=768,
        max_token_size=8192,
        func=lambda texts: ollama_embedding(
            texts,
            embed_model="nomic-embed-text"
        )
    ),
)

with open("./book.txt") as f:
    rag.insert(f.read())

Notes

Unless the context window was enlarged, almost no entities were extracted, so increasing the context size is necessary.
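One way to see whether entity extraction worked is to count the nodes and edges in the graph that LightRAG saved. This is a rough sketch only: it assumes the default graph storage writes a GraphML file named `graph_chunk_entity_relation.graphml` into the working directory, which may differ between LightRAG versions, and `count_graph_elements` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumption: LightRAG's default graph storage writes this file into WORKING_DIR.
GRAPH_FILE = "./dickens/graph_chunk_entity_relation.graphml"


def count_graph_elements(graphml_path: str) -> tuple[int, int]:
    """Return (node_count, edge_count) for a GraphML file."""
    root = ET.parse(graphml_path).getroot()
    nodes = 0
    edges = 0
    for elem in root.iter():
        # Strip the XML namespace, e.g. "{http://...}node" becomes "node".
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "node":
            nodes += 1
        elif tag == "edge":
            edges += 1
    return nodes, edges
```

After `rag.insert(...)` finishes, `count_graph_elements(GRAPH_FILE)` returning a very small node count would suggest the context size is still too small.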

Pages I referred to

github.com