[langchain] Hands-on Notes: Langchain-Chatchat Win10 Local Deployment FAQ (Continuously Updated)

Langchain-Chatchat Win10 Local Deployment: Issue Collection

码农丁丁

码农丁丁 · Published 2024-01-26 15:54:00

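After I published "Langchain-Chatchat-win10 Local Installation and Deployment Success Notes (CPU)", many readers messaged me privately. Some installed successfully by following that document, but quite a few hit errors during installation. That article was written against V0.2.6, and most of the errors were caused by version upgrades. The latest version is currently V0.2.10. I have verified that with V0.2.9, a private deployment that follows either my original article or the official installation and deployment documentation runs into some problems, so I recommend using V0.2.10.

This article focuses on recording the problems that appear when installing and deploying the newer versions, and how to resolve them.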

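Release notes

V0.2.10

Important notes

Langchain-Chatchat v0.2.10 changes the configuration items in configs. All users are advised to regenerate the project's configuration files by following the relevant instructions in the development and deployment section of the Wiki. If you run into problems during development or deployment, search the Github Wiki / issues first.

This release updates many Python package dependencies and their versions; run pip install -r requirements.txt to update.

This is the last release in the v0.2.x series. v0.3.0, with all-new Agent features, is coming soon.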

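New features

Optimize OCR for PDF files by filtering out meaningless small images by @liunux4odoo #2525
Support the Gemini online model by @yhfgyyf #2630
Support the GLM4 online model by @zRzRzRzRzRzRzR
Update elasticsearch to use https connections by @xldistance #2390
Improve OCR recognition for PPT and DOC knowledge base files by @596192804 #2013
Update the Agent chat feature by @zRzRzRzRzRzRzR
Obtain a connection from the connection pool each time an object is created, instead of opening a new connection on every method call by @Lijia0 #2480
Make ChatOpenAI check whether the token count exceeds the model's context length by @glide-the
Update the database runtime error notes and the project milestones by @zRzRzRzRzRzRzR #2659
Update config files / docs / dependencies by @imClumsyPanda @zRzRzRzRzRzRzR
Add a Japanese readme by @eltociear #2787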

Bug fixes

ApiRequest.agent_chat returns dict instead of str by @liunux4odoo #2520
Fix the milvus_kwargs issue by @hzg0601 #2540
Correct the use of the chunk_* parameters in make_text_splitter by @liunux4odoo #2564
Filter out the ping packets returned by sse_starlette to avoid JSON Decoder error : ping -... by @liunux4odoo #2585
PGVector vector store connection error after the langchain update by @HALIndex #2591
Remove duplicate imports and correct spelling errors by @tiandiweizun #2599
Minimax's model worker error by @xyhshen
The ES store could not perform vector search; add mappings to create the vector index by @MSZheng20 #2688
Several spelling errors in KBService by @hzg0601 #2640
pytorch automatic device detection by @chatgpt-1, @Drincann, @zRzRzRzRzRzRzR #2514 #2570

V0.2.9

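Important notes

Langchain-Chatchat v0.2.9 changes the configuration items in configs. All users are advised to regenerate the project's configuration files by following the relevant instructions in the development and deployment section of the Wiki. If you run into problems during development or deployment, search the Github Wiki / issues first.

In addition, the database tables that hold knowledge base information changed in v0.2.9. If you keep using the configuration from an earlier version, you can run python init_database.py --create-tables to update only the database tables without rebuilding the knowledge base.

This release updates many Python package dependencies and their versions; run pip install -r requirements.txt to update.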

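New features

The file chat and knowledge base chat API endpoints are now fully asynchronous to prevent blocking by @liunux4odoo in #2256
Update the default model download links by @YQisme in #2259
OCR supports GPU acceleration (requires manually installing rapidocr_paddle[gpu]); the knowledge base supports MHTML and Evernote files. by @liunux4odoo in #2265
Use a Reranker model to re-rank the recalled passages by @hzg0601 in #2435
The knowledge base management UI supports viewing, editing, and deleting vector store documents by @liunux4odoo in #2471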

Bug fixes

Fix the duckduckgo dependency error by @zRzRzRzRzRzRzR @hzg0601 in #2251 and #2252
Fix the bug where Max token was not set for Azure by @zRzRzRzRzRzRzR in #2254
fix: prompt template name error in file_chat by @liunux4odoo in #2366
Optimize the EventSource response packets by @lookou in #1200
fix: documentation error by @jaluik in #2384
Fix the incorrect self.dims_length assignment by @xldistance in #2380
Fix the argument-passing error in knowledge_base_chat_iterator by @xldistance in #2386
fixed an iterator argument-passing error that made knowledge base Q&A fail with TypeError: unhashable type: 'list' by @Astlvk in #2383
fix: error when using an online embedding model: There is no current event loop in thread 'Any… by @Funkeke in #2393
fix Yi-34b model config error(close #2491) by @liunux4odoo in #2492
remove /chat/fastchat API endpoint by @liunux4odoo in #2506

V0.2.8

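Important notes

Langchain-Chatchat v0.2.8 changes the configuration items in configs. All users are advised to regenerate the project's configuration files by following the relevant instructions in the development and deployment section of the Wiki. If you run into problems during development or deployment, search the Github Wiki / issues first.

In addition, the database tables that hold knowledge base information changed in v0.2.8, and the default embedding model is now bge-large-zh. If you keep the default settings, rebuild the knowledge base as needed. If you keep using the configuration from an earlier version, you can run python init_database.py --create-tables to update only the database tables without rebuilding the knowledge base.

This release updates many Python package dependencies and their versions; run pip install -r requirements.txt to update.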

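New features

Add a file chat mode by @liunux4odoo in #2071
The knowledge base supports .jsonl, .epub, .xlsx, .xlsd, .ipynb, .odt, .py, .srt, .toml, .doc, and .ppt files by @liunux4odoo in #2079
Support the Tiangong (Skywork) large model from Kunlun Wanwei by @nathubs in #2166
Improve support for ChatGLM3-6B by @zRzRzRzRzRzRzR in #2021 and #2058
Store chat history in the database by @qiankunli in #2046
Improve support for online and local models in the WebUI model list by @liunux4odoo in #2060
Support knowledge bases that are symbolic links by @zRzRzRzRzRzRzR in #2167
Add a logger member variable to ApiModelWorker so that a meaningful error message is output when an API request fails. by @liunux4odoo in #2169
Provide document summary_chunk; support single-file summarization, using MapReduceDocumentsChain to generate the summary by @glide-the in #2175
Summarize a single knowledge base by doc_ids by @glide-the in #2176
Add custom commands for managing multiple sessions by @liunux4odoo in #2229
Optimize knowledge base document loading by @liunux4odoo in #2091
Unify the exception messages of online models and add detailed logging by @glide-the in #2130
Update the ChatGLM3-6B agent documentation and prompts by @zRzRzRzRzRzRzR in #2041
Update requirements by @liunux4odoo @hzg0601 @zRzRzRzRzRzRzR @imClumsyPanda in #2033 , #2170 , #2213 and #2246
Update README.md by @VignetteApril @weartist in #2034 and #2049
Update the config templates by @hzg0601 @zRzRzRzRzRzRzR @imClumsyPanda in #2110 and #2171
Change metadata["source"] of documents in the database and the vector store to a relative path, to ease vector store migration by @liunux4odoo in #2153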

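Bug fixes

Fix: Chinese text in knowledge base json files was converted to unicode escape codes, which prevented matching by @liunux4odoo in #2128
Send MiniMax and Qianfan online Embedding requests in batches of 10 texts to stay within the API's count limit by @liunux4odoo in #2161
Fix startup.py by @hzg0601 in #2162 and #2173
Bug fixes and prompt changes by @zRzRzRzRzRzRzR in #2230
Some minor updates by @zRzRzRzRzRzRzR in #2235
Fix: bug in batched requests to the MiniMax and Qianfan online embedding models by @alanlaye617 in #2208
Fix: the chat endpoint used memory by default to fetch 10 history messages, which broke the final assembled prompt by @liunux4odoo in #2247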

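Shared model files

Some readers may not have the corresponding model files, so I am sharing the models I use. I may keep this updated; if you cannot access the link, message me privately.

Link: https://pan.baidu.com/s/1phG6lDVoOzhyuROqyZInBA?pwd=dyqv
Extraction code: dyqv

Quick install commands

Assuming torch is already installed, the following commands install and deploy V0.2.10.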

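#1. Clone the code
git clone https://github.com/chatchat-space/Langchain-Chatchat.git
#2. Change to the root of the Langchain-Chatchat source tree
cd Langchain-Chatchat
#3. Install the dependencies
pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/


#4. Generate the config files automatically: in the configs directory, copy every example config file and drop "example" from its name
python copy_config_example.py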


#5. Edit model_config.py
# Name of the selected Embedding model (a built-in model)
#EMBEDDING_MODEL = "bge-large-zh-v1.5"
EMBEDDING_MODEL = "bge-large-zh"
# Device the Embedding model runs on. "auto" detects it automatically (with a warning); it can also be set manually to one of "cuda", "mps", "cpu", "xpu".
EMBEDDING_DEVICE = "cpu"


# Location of the built-in model
#"bge-large-zh": "BAAI/bge-large-zh",
"bge-large-zh": "E:\\llm_models\\bge-large-zh",


#LLM_MODELS = ["chatglm3-6b", "zhipu-api", "openai-api"]
# Models started by default; adjust to your own setup
LLM_MODELS = ["chatglm2-6b-int4", "openai-api"]
LLM_DEVICE = "cpu"


    "openai-api": {
        "model_name": "gpt-4",
        "api_base_url": "https://api.openai.com/v1",
        # Set the OpenAI API key; if you do not start openai-api, there is no need to add an api_key
        "api_key": "sk-xxxxxxxxxx",
        "openai_proxy": "",
    },
    "llm_model": {
        "chatglm2-6b": "THUDM/chatglm2-6b",
        "chatglm2-6b-32k": "THUDM/chatglm2-6b-32k",
        "chatglm3-6b": "THUDM/chatglm3-6b",
        "chatglm3-6b-32k": "THUDM/chatglm3-6b-32k",
        # Add an entry for chatglm2-6b-int4; if you use a different variant, change it accordingly
        "chatglm2-6b-int4": "E:\\llm_models\\chatglm2-6b-int4",
        # (remaining entries of the original file stay unchanged)
    },
        
#6. Re-initialize the database
python init_database.py --recreate-vs
# Note: do not use python3 init_database.py --recreate-vs


#7. Start
python startup.py -a
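
Because step 5 points several models at local Windows folders, it can help to confirm those folders exist before running steps 6 and 7. The following is only a minimal sketch: the helper `missing_model_dirs` and the `configured` dictionary are hypothetical names for illustration and are not part of Langchain-Chatchat. It assumes that an absolute path means a local folder, and that anything else (for example "THUDM/chatglm3-6b") is a Hugging Face repo id to be skipped.

```python
from pathlib import Path
from typing import Dict, List


def missing_model_dirs(model_paths: Dict[str, str]) -> List[str]:
    """Return the model names whose absolute local path is not an existing directory."""
    missing = []
    for name, location in model_paths.items():
        path = Path(location)
        # Only absolute paths are treated as local folders (an assumption);
        # values such as "BAAI/bge-large-zh" are taken to be repo ids and skipped.
        if path.is_absolute() and not path.is_dir():
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical values that mirror the edits made in step 5.
    configured = {
        "bge-large-zh": "E:\\llm_models\\bge-large-zh",
        "chatglm2-6b-int4": "E:\\llm_models\\chatglm2-6b-int4",
        "chatglm3-6b": "THUDM/chatglm3-6b",
    }
    for name in missing_model_dirs(configured):
        print(f"model directory not found for {name}: {configured[name]}")
```

Run it on the same Windows machine as the deployment; a drive-letter path is only recognized as absolute there.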

FAQ:

Problem 1: No module named 'langchain'

\L029\Langchain-Chatchat>python startup.py -a
Traceback (most recent call last):
  File "D:\opt\L029\Langchain-Chatchat\startup.py", line 21, in <module>
    from configs import (
  File "D:\opt\L029\Langchain-Chatchat\configs\__init__.py", line 1, in <module>
    from .basic_config import *
  File "D:\opt\L029\Langchain-Chatchat\configs\basic_config.py", line 3, in <module>
    import langchain
ModuleNotFoundError: No module named 'langchain'

A1:

pip3 install langchain

Problem 2: No module named 'fastapi' / 'httpx', etc.

>python startup.py -a
C:\Users\xxxxx\.conda\envs\l3\lib\site-packages\langchain\chat_models\__init__.py:31: LangChainDeprecationWarning: Importing chat models from langchain is deprecated. Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:


`from langchain_community.chat_models import ChatOpenAI`.


To install langchain-community run `pip install -U langchain-community`.
  warnings.warn(
C:\Users\xxxxx\.conda\envs\l3\lib\site-packages\langchain\llms\__init__.py:548: LangChainDeprecationWarning: Importing LLMs from langchain is deprecated. Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:


`from langchain_community.llms import OpenAI`.


To install langchain-community run `pip install -U langchain-community`.
  warnings.warn(
C:\Users\xxxxx\.conda\envs\l3\lib\site-packages\langchain\llms\__init__.py:548: LangChainDeprecationWarning: Importing LLMs from langchain is deprecated. Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:


`from langchain_community.llms import AzureOpenAI`.


To install langchain-community run `pip install -U langchain-community`.
  warnings.warn(
C:\Users\xxxxx\.conda\envs\l3\lib\site-packages\langchain\llms\__init__.py:548: LangChainDeprecationWarning: Importing LLMs from langchain is deprecated. Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:


`from langchain_community.llms import Anthropic`.


To install langchain-community run `pip install -U langchain-community`.
  warnings.warn(
Traceback (most recent call last):
  File "D:\opt\L029\Langchain-Chatchat\startup.py", line 35, in <module>
    from server.utils import (fschat_controller_address, fschat_model_worker_address,
  File "D:\opt\L029\Langchain-Chatchat\server\utils.py", line 14, in <module>
    import httpx
ModuleNotFoundError: No module named 'httpx'

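A2:

In this case you have most likely installed something else into the environment; reinstall the dependencies listed in requirements.txt.

In my environment the main cause was reinstalling the GPU build of torch, after which startup failed. Everything worked again once I reinstalled the dependencies.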

pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
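
Problems 1 and 2 both fail on the first import that happens to be missing, so a fix-and-retry loop can take several rounds. As a rough sketch (the helper name `find_missing_modules` and the `required` list are illustrative assumptions, not project code), the standard library can report every missing module in one pass before startup.py is run:

```python
import importlib.util
from typing import Iterable, List


def find_missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Modules named in the tracebacks of Problems 1 and 2; extend as needed.
    required = ["langchain", "fastapi", "httpx"]
    missing = find_missing_modules(required)
    if missing:
        print("missing modules:", ", ".join(missing))
        print("try: pip3 install -r requirements.txt")
    else:
        print("all listed modules are importable")
```

Run it with the same interpreter (the same conda environment) that will run startup.py, otherwise the result says nothing about the real environment.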

Problem 3: ERROR: Could not build wheels for jq, which is required to install pyproject.toml-based projects

Building wheels for collected packages: jq, pysrt, streamlit-modal, sgmllib3k
  Building wheel for jq (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for jq (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [5 lines of output]
      running bdist_wheel
      running build
      running build_ext
      Executing: ./configure CFLAGS=-fPIC --prefix=C:\Users\xxx\AppData\Local\Temp\pip-install-j7c7k4bx\jq_9d733042867f41deaf65d6ad1794be6c\_deps\build\onig-install-6.9.8
      error: [WinError 2] 系统找不到指定的文件。
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for jq
  Building wheel for pysrt (setup.py) ... done
  Created wheel for pysrt: filename=pysrt-1.1.2-py3-none-any.whl size=13464 sha256=fbff3ed0c568e650bfd383891cbf637cc8539440f77b7cae1dfc2d60213bdb72
  Stored in directory: c:\users\xxx\appdata\local\pip\cache\wheels\8c\b7\58\74aa4c25086b8543b17de4454dcac97a641f7750e8e654b2c9
  Building wheel for streamlit-modal (setup.py) ... done
  Created wheel for streamlit-modal: filename=streamlit_modal-0.1.0-py3-none-any.whl size=4220 sha256=21e9cf2461d531b5836abebd3de05a7794089b31631172c3df82b263c635a2e4
  Stored in directory: c:\users\xxx\appdata\local\pip\cache\wheels\2e\0c\e0\0654ef5121fbcf01422a557ef2ac4cd8aff9b94b24b93c4363
  Building wheel for sgmllib3k (setup.py) ... done
  Created wheel for sgmllib3k: filename=sgmllib3k-1.0.0-py3-none-any.whl size=6061 sha256=6d9c52f7965461b9271561e621e92e6154a0d617780a976b5f07bfc794085458
  Stored in directory: c:\users\xxx\appdata\local\pip\cache\wheels\50\20\4b\e95fc891917d652cb6ecbfea035cf3ce640259cf857aaa21a7
Successfully built pysrt streamlit-modal sgmllib3k
Failed to build jq
ERROR: Could not build wheels for jq, which is required to install pyproject.toml-based projects

A3:

conda install jq

The code that handles json also needs to be modified.

Open issues

Open issue 1:

2024-01-26 14:14:43,285 - utils.py[line:295] - INFO: RapidOCRLoader used for D:\opt\L029\Langchain-Chatchat\knowledge_base\samples\content\llm/img/大模型应用技术原理-幕布图片-580318-260070.jpg
None is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
None is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
None is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

This problem occurs while initializing the database; it still needs investigation.

Open issue 2:

Batches: 100%|████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:34<00:00, 34.01s/it]
('samples', 'test_files/langchain-ChatGLM_closed.jsonl', '从文件 samples/test_files/langchain-ChatGLM_closed.jsonl 加载文档时出错：jq package not found, please install it with `pip install jq`')

Some sources suggest running conda install jq, but in practice it has no effect. Judging from the issues, the source code has to be modified.
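
For reference, a .jsonl file can be read without the jq package at all, using only Python's standard json module. The sketch below is not the project's fix and does not show how Langchain-Chatchat loads documents; `iter_jsonl` is a hypothetical helper that only illustrates the kind of jq-free parsing a source-level change might fall back to.

```python
import json
from typing import Any, Iterator


def iter_jsonl(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                # Blank lines carry no record; skip them.
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the file and line so a bad record is easy to locate.
                raise ValueError(f"{path}:{line_number}: invalid JSON ({exc.msg})") from exc
```

Calling `list(iter_jsonl(some_path))` on the sample .jsonl file would return its records as Python objects; turning those records into knowledge base documents would still need project-specific code.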
