バージョンは v2.7.1 で OS は Windows 11
Tuesday, March 5 2024
GPT4All を使ってみた
By takagiwa on Tuesday, March 5 2024, 22:10
To content | To menu | To search
Tuesday, March 5 2024
By takagiwa on Tuesday, March 5 2024, 22:10
バージョンは v2.7.1 で OS は Windows 11
Thursday, February 29 2024
By takagiwa on Thursday, February 29 2024, 22:04
日本語を使うには一手間要りそう
Wednesday, February 28 2024
By takagiwa on Wednesday, February 28 2024, 22:04
Chat with RTX を自宅で動かしてみたら結構使えそう。
まだ会社の PC に Chat with RTX に対応できる GPU ボードもないし、もうちょっとなんとかなって欲しいところもあるので、同等のものを構築していくしかなさそうだ。
Windows での環境構築は LM Studio が易しそう。GPT4All も易しそう。ただいずれも Chat with RTX のような機能を最初から持っているわけではないらしい。
キーワードは「RAG (Retrieval-Augmented Generation)」らしい。
How to create a private ChatGPT that interacts with your local documents で案内されている PrivateGPT はそういうのに使えるらしい。
ひとまず Anaconda をインストール。
PrivateGPT の Installation に従ってインストール。
Anaconda Powershell Prompt を起動。
(base) PS E:\Projects>python --version Python 3.11.5 (base) PS E:\Projects>git clone https://github.com/imartinez/privateGPT Cloning into 'privateGPT'... remote: Enumerating objects: 1510, done. remote: Counting objects: 100% (23/23), done. remote: Compressing objects: 100% (22/22), done. remote: Total 1510 (delta 2), reused 8 (delta 0), pack-reused 1487 Receiving objects: 100% (1510/1510), 1.69 MiB | 791.00 KiB/s, done. Resolving deltas: 100% (819/819), done. (base) PS E:\Projects\privateGPT> (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python - Retrieving Poetry metadata # Welcome to Poetry! This will download and install the latest version of Poetry, a dependency and package manager for Python. It will add the `poetry` command to Poetry's bin directory, located at: C:\Users\xxxxxxxx\AppData\Roaming\Python\Scripts You can uninstall at any time by executing this script with the --uninstall option, and these changes will be reverted. Installing Poetry (1.8.1) Installing Poetry (1.8.1): Creating environment Installing Poetry (1.8.1): Installing Poetry Installing Poetry (1.8.1): Creating script Installing Poetry (1.8.1): Done Poetry (1.8.1) is installed now. Great! To get started you need Poetry's bin directory (C:\Users\xxxxxxxx\AppData\Roaming\Python\Scripts) in your `PATH` environment variable. Alternatively, you can call Poetry explicitly with `C:\Users\xxxxxxxx\AppData\Roaming\Python\Scripts\poetry`. You can test that everything is set up by executing: `poetry --version`
パスを通してあげてからバージョン確認。
(base) PS E:\Projects\privateGPT> $ENV:PATH += ";C:\Users\xxxxxxxx\AppData\Roaming\Python\Scripts" (base) PS E:\Projects\privateGPT> poetry --version Poetry (version 1.8.1)
MinGW のインストールでは「mingw32-gcc-g++」にチェックを入れて Installation → Apply した。こちらもパスが通っていなかったので追加してあげる。
Visual Studio 2022 はパスの設定が面倒だったので、あとでコマンドプロンプトに戻る。
Anaconda Prompt を開く。
(base) E:\Projects\privateGPT>"D:\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat" ********************************************************************** ** Visual Studio 2022 Developer Command Prompt v17.5.1 ** Copyright (c) 2022 Microsoft Corporation ********************************************************************** [vcvarsall.bat] Environment initialized for: 'x64' (base) E:\Projects\privateGPT>python --version Python 3.11.5 (base) E:\Projects\privateGPT>poetry --version Poetry (version 1.8.1) (base) E:\Projects\privateGPT>cl --version Microsoft(R) C/C++ Optimizing Compiler Version 19.35.32215 for x64 Copyright (C) Microsoft Corporation. All rights reserved. cl : コマンド ライン warning D9002 : 不明なオプション '--version' を無視します。 cl : コマンド ライン error D8003 : ソース ファイル名がありません (base) E:\Projects\privateGPT>cmake --version cmake version 3.25.1-msvc1 CMake suite maintained and supported by Kitware (kitware.com/cmake). (base) E:\Projects\privateGPT>gcc --version gcc (MinGW.org GCC-6.3.0-1) 6.3.0 Copyright (C) 2016 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
(base) E:\Projects\privateGPT>poetry install --with ui
Creating virtualenv private-gpt--sQCGbRe-py3.11 in C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs
Installing dependencies from lock file
Package operations: 156 installs, 1 update, 0 removals
...
Installing the current project: private-gpt (0.2.0)
(base) E:\Projects\privateGPT>poetry run python -m private_gpt
16:34:10.531 [INFO ] private_gpt.settings.settings_loader - Starting application with profiles=['default']
16:34:14.922 [INFO ] private_gpt.components.llm.llm_component - Initializing the LLM in mode=local
Traceback (most recent call last):
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 798, in get
return self._context[key]
~~~~~~~~~~~~~^^^^^
KeyError: <class 'private_gpt.ui.ui.PrivateGptUi'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 798, in get
return self._context[key]
~~~~~~~~~~~~~^^^^^
KeyError: <class 'private_gpt.server.ingest.ingest_service.IngestService'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 798, in get
return self._context[key]
~~~~~~~~~~~~~^^^^^
KeyError: <class 'private_gpt.components.llm.llm_component.LLMComponent'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\llama_index\llms\llama_cpp.py", line 102, in __init__
from llama_cpp import Llama
ModuleNotFoundError: No module named 'llama_cpp'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "E:\Projects\privateGPT\private_gpt\__main__.py", line 5, in <module>
from private_gpt.main import app
File "E:\Projects\privateGPT\private_gpt\main.py", line 11, in <module>
app = create_app(global_injector)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\launcher.py", line 50, in create_app
ui = root_injector.get(PrivateGptUi)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 91, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 974, in get
provider_instance = scope_instance.get(interface, binding.provider)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 91, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 800, in get
instance = self._get_instance(key, provider, self.injector)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 811, in _get_instance
return provider.get(injector)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 264, in get
return injector.create_object(self._cls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 998, in create_object
self.call_with_injection(init, self_=instance, kwargs=additional_kwargs)
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 1031, in call_with_injection
dependencies = self.args_to_inject(
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 91, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 1079, in args_to_inject
instance: Any = self.get(interface)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 91, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 974, in get
provider_instance = scope_instance.get(interface, binding.provider)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 91, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 800, in get
instance = self._get_instance(key, provider, self.injector)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 811, in _get_instance
return provider.get(injector)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 264, in get
return injector.create_object(self._cls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 998, in create_object
self.call_with_injection(init, self_=instance, kwargs=additional_kwargs)
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 1031, in call_with_injection
dependencies = self.args_to_inject(
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 91, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 1079, in args_to_inject
instance: Any = self.get(interface)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 91, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 974, in get
provider_instance = scope_instance.get(interface, binding.provider)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 91, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 800, in get
instance = self._get_instance(key, provider, self.injector)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 811, in _get_instance
return provider.get(injector)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 264, in get
return injector.create_object(self._cls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 998, in create_object
self.call_with_injection(init, self_=instance, kwargs=additional_kwargs)
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\injector\__init__.py", line 1040, in call_with_injection
return callable(*full_args, **dependencies)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\components\llm\llm_component.py", line 38, in __init__
self.llm = LlamaCPP(
^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\llama_index\llms\llama_cpp.py", line 104, in __init__
raise ImportError(
ImportError: Could not import llama_cpp library.Please install llama_cpp with `pip install llama-cpp-python`.See the full installation guide for GPU support at `https://github.com/abetlen/llama-cpp-python`
エラーになった。またこの環境ではポート 8001 が他で使用中だったので、Makefile、settings-sagemaker.yaml、settings.yaml の 8001 を 8002 に書き換えた。
エラー対策で続きを入れてみる。
(base) E:\Projects\privateGPT>poetry install --with local Installing dependencies from lock file Package operations: 7 installs, 0 updates, 0 removals - Installing scipy (1.11.4) - Installing threadpoolctl (3.2.0) - Installing diskcache (5.6.3) - Installing scikit-learn (1.3.2) - Installing torchvision (0.16.2) - Installing llama-cpp-python (0.2.23) - Installing sentence-transformers (2.2.2) Installing the current project: private-gpt (0.2.0) (base) E:\Projects\privateGPT>poetry run python scripts/setup
もう一度実行してみる。
(base) E:\Projects\privateGPT>poetry run python -m private_gpt
ファイルのダウンロードがされて、これで http://localhost:8002 を開いたら UI にアクセスできた。
ただ何か送ると「AttributeError: 'NoneType' object has no attribute 'split'」というエラーになる。
17:04:53.533 [INFO ] uvicorn.access - 127.0.0.1:52251 - "GET /queue/data?session_hash=nxrm6zcbks HTTP/1.1" 200
Traceback (most recent call last):
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\queueing.py", line 495, in call_prediction
output = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\route_utils.py", line 231, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\blocks.py", line 1594, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\blocks.py", line 1188, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\utils.py", line 513, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\utils.py", line 639, in asyncgen_wrapper
response = await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\chat_interface.py", line 487, in _stream_fn
first_response = await async_iteration(generator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\utils.py", line 513, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\utils.py", line 506, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\gradio\utils.py", line 489, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\ui\ui.py", line 127, in _chat
all_messages = [*build_history(), new_message]
^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\ui\ui.py", line 109, in build_history
*[
^
File "E:\Projects\privateGPT\private_gpt\ui\ui.py", line 114, in <listcomp>
content=interaction[1].split(SOURCES_SEPARATOR)[0],
^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'
17:04:53.589 [INFO ] uvicorn.access - 127.0.0.1:52251 - "POST /run/predict HTTP/1.1" 200
一旦 PrivateGPT を再起動し、Mode に LLM Chat を選択して入れ直したら返事をしてくれた。「hello」と入れたら
Hello! How may I assist you today? I'm here to help answer any questions you have to the best of my ability. Please keep in mind that I cannot provide speculative or made-up information, and must always follow instructions given to me. Let me know if you have a specific question or topic you'd like me to help with.
「こんにちは」と入れたら
Hello there! I see you've greeted me in Japanese. That's great! If you have any questions or need assistance with something, feel free to ask. I'll do my best to help you in a respectful and honest manner. Let me know if you need anything specific. Is there a particular topic or question you have in mind?
日本語ということはわかるらしい。
試しに日本語の電子書籍の PDF を読み込ませて、ざっくり問い合わせてみた。英語で回答してくれたけれど、Chat with RTX 程には分析してはくれなかったらしい。
多分モデル?とか、日本語対応のいろいろを試してみる必要がありそう。要調査。
で……これはどこかのフォルダを指定してあらかじめ学習、という形式ではないらしい。アップロードフォームがある。それは面倒。
Vector Stores として他のプログラムを使っているらしい。これに大量に流し込めればいいんだろうか。ただちょっと見た感じではファイルの流し込み方がわからなかった。
あとは応答がとても遅いけれど、CPU (Core i5 13500K) で処理しているので、GPU を入れれば改善するはず。
探してみたら、フォルダまるごと登録というものがあるらしい。 LINK
(base) E:\Projects\privateGPT>poetry run python scripts\ingest_folder.py D:\Books --watch --log-file ingestLog.txt
Traceback (most recent call last):
File "E:\Projects\privateGPT\scripts\ingest_folder.py", line 102, in <module>
worker.ingest_folder(root_path, args.ignored)
File "E:\Projects\privateGPT\scripts\ingest_folder.py", line 38, in ingest_folder
self._ingest_all(self._files_under_root_folder)
File "E:\Projects\privateGPT\scripts\ingest_folder.py", line 42, in _ingest_all
self.ingest_service.bulk_ingest([(str(p.name), p) for p in files_to_ingest])
File "E:\Projects\privateGPT\private_gpt\server\ingest\ingest_service.py", line 92, in bulk_ingest
documents = self.ingest_component.bulk_ingest(files)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\components\ingest\ingest_component.py", line 127, in bulk_ingest
documents = IngestionHelper.transform_file_into_documents(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\components\ingest\ingest_helper.py", line 30, in transform_file_into_documents
documents = IngestionHelper._load_file_to_documents(file_name, file_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\components\ingest\ingest_helper.py", line 51, in _load_file_to_documents
return reader_cls().load_data(file_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\llama_index\readers\file\docs_reader.py", line 30, in load_data
pdf = pypdf.PdfReader(fp)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\pypdf\_reader.py", line 352, in __init__
self._encryption.verify(pwd) == PasswordType.NOT_DECRYPTED
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\pypdf\_encryption.py", line 953, in verify
key, rc = self.verify_v4(pwd) if self.V <= 4 else self.verify_v5(pwd)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\pypdf\_encryption.py", line 990, in verify_v5
key = AlgV5.verify_owner_password(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\pypdf\_encryption.py", line 532, in verify_owner_password
AlgV5.calculate_hash(R, password, o_value[32:40], u_value[:48])
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\pypdf\_encryption.py", line 577, in calculate_hash
e = aes_cbc_encrypt(k[:16], k[16:32], k1 * 64)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxxxxxxx\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\pypdf\_crypt_providers\_fallback.py", line 89, in aes_cbc_encrypt
raise DependencyError(_DEPENDENCY_ERROR_STR)
pypdf.errors.DependencyError: cryptography>=3.1 is required for AES algorithm
エラーになった。cryptography だと Requirement already satisfied になる。LINK によると pycryptodome を入れるらしい?
(base) E:\Projects\privateGPT>pip install pycryptodome
同じエラーで止まった。
(base) E:\Projects\privateGPT>poetry add cryptography
というかよく考えたら日本語をちゃんと認識したし PDF の中身も英訳して答えていたから、どこかに出力言語の設定がある気がする。
あとは Chat with RTX で使ったモデル?と使われているベクトル化?のを合わせてあげると同じような精度の回答が得られるのでは。
2023/Feb/29 追記
PrivateGPT は出力言語設定ではなくモデルで変えるらしい。
そういえば Chat with RTX も日本語化けてた。日本語対応モデルを使ってもまだ化けるらしいので、ベクトル化?embeddings で日本語対応でないとだめかな?
そして一晩放っておいたらまたエラー。
Traceback (most recent call last):
File "C:\Users\takagiwa\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\llama_index\readers\file\epub_reader.py", line 21, in load_data
import ebooklib
ModuleNotFoundError: No module named 'ebooklib'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:\Projects\privateGPT\scripts\ingest_folder.py", line 102, in <module>
worker.ingest_folder(root_path, args.ignored)
File "E:\Projects\privateGPT\scripts\ingest_folder.py", line 38, in ingest_folder
self._ingest_all(self._files_under_root_folder)
File "E:\Projects\privateGPT\scripts\ingest_folder.py", line 42, in _ingest_all
self.ingest_service.bulk_ingest([(str(p.name), p) for p in files_to_ingest])
File "E:\Projects\privateGPT\private_gpt\server\ingest\ingest_service.py", line 92, in bulk_ingest
documents = self.ingest_component.bulk_ingest(files)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\components\ingest\ingest_component.py", line 127, in bulk_ingest
documents = IngestionHelper.transform_file_into_documents(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\components\ingest\ingest_helper.py", line 30, in transform_file_into_documents
documents = IngestionHelper._load_file_to_documents(file_name, file_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Projects\privateGPT\private_gpt\components\ingest\ingest_helper.py", line 51, in _load_file_to_documents
return reader_cls().load_data(file_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\takagiwa\AppData\Local\pypoetry\Cache\virtualenvs\private-gpt--sQCGbRe-py3.11\Lib\site-packages\llama_index\readers\file\epub_reader.py", line 25, in load_data
raise ImportError(
ImportError: Please install extra dependencies that are required for the EpubReader: `pip install EbookLib html2text`
同じように入れてみた。
(base) E:\Projects\privateGPT>poetry add EbookLib html2text
とはいえこのままでは1ファイルからの処理しかできないようなので、ひとまずここまで。違うシステムを探してみよう。