SEが最近起こったことを書くブログ

ITエンジニアが試したこと、気になったことを書いていきます。

llama-cpp-agentを使ってPhi-3-miniでWebデータを収集するエージェントをColabで動かしてみた

LLM

llama-cpp-agentを使ってPhi-3-miniでWebデータを収集するエージェントを作ってみたので、実装メモ

ポイント

MessagesFormatterTypeを利用するLLMに合わせて設定する
- Phi-3の場合は、MessagesFormatterType.PHI_3
Web検索用のツールとして、WebSearchToolが用意されているので、それを使う
- LlmStructuredOutputSettings.from_functionで設定できる

詳細

Phi-3-mini-4k-instructのggufをダウンロードし、GPUを利用するための環境変数を設定し、llama-cpp-pythonをインストール Webデータを収集するエージェントを作成するために必要なライブラリをインストール

!wget https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-fp16.gguf
!CMAKE_ARGS="-DLLAMA_CUDA=on" pip install llama-cpp-python
!pip install llama-cpp-agent pypdf trafilatura bs4 readability-lxml httpx duckduckgo_search

# Import the Llama class of llama-cpp-python and the LlamaCppPythonProvider of llama-cpp-agent
from llama_cpp import Llama
from llama_cpp_agent.providers import LlamaCppPythonProvider

# Create an instance of the Llama class and load the model
llama_model = Llama(r"Phi-3-mini-4k-instruct-fp16.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40,n_ctx=4096)

# Create the provider by passing the Llama class instance to the LlamaCppPythonProvider class
provider = LlamaCppPythonProvider(llama_model)

from llama_cpp_agent.llm_output_settings import LlmStructuredOutputSettings
from enum import Enum
from typing import Union
import math
from llama_cpp_agent import MessagesFormatterType
from llama_cpp_agent.tools import WebSearchTool
from llama_cpp_agent.prompt_templates import web_search_system_prompt
from llama_cpp_agent.chat_history.messages import Roles

# Now let's create an instance of the LlmStructuredOutput class by calling the `from_functions` function of it and passing it a list of functions.

search_tool = WebSearchTool(provider, MessagesFormatterType.PHI_3, max_tokens_search_results=2000)

output_settings = LlmStructuredOutputSettings.from_functions([search_tool.get_tool()], allow_parallel_function_calling=True)

# Create a LlamaCppAgent instance as before, including a system message with information about the tools available for the LLM agent.
llama_cpp_agent = LlamaCppAgent(
    provider,
    debug_output=True,
    system_prompt=web_search_system_prompt,
    predefined_messages_formatter_type=MessagesFormatterType.PHI_3,
)

# Define some user input
user_input = "調査してほしいことを記入する"


settings = provider.get_provider_default_settings()

settings.temperature = 0.65
# settings.top_p = 0.85
# settings.top_k = 60
# settings.tfs_z = 0.95
settings.max_tokens = 2048

# Pass the user input together with output settings to `get_chat_response` method.
# This will print the result of the function the LLM will call, it is a list of dictionaries containing the result.
result = llama_cpp_agent.get_chat_response(
    user_input,llm_sampling_settings=settings, structured_output_settings=output_settings
)

while True:
    if result[0]["function"] == "write_message_to_user":
        break
    else:
        result = llama_cpp_agent.get_chat_response(result[0]["return_value"], role=Roles.tool,
                                            structured_output_settings=output_settings, llm_sampling_settings=settings)

result = llama_cpp_agent.get_chat_response(result[0]["return_value"], role=Roles.tool,
                                    llm_sampling_settings=settings)
print(result)

参考にしたコード