十分钟部署最新GLM4大模型（同参数最强）（CPU）- Community

[ Content contribution] 十分钟部署最新GLM4大模型（同参数最强）（CPU）

Technology Exchange 2191 views · 17 replies ·

来我的哔哩哔哩看看如何

deepin

2024-07-17 12:37

Author

首先在下列链接下载模型文件

FP16模型（推荐运行内存：24GB）

int8量化模型（推荐运行内存：16GB）

int4量化模型（推荐运行内存：12GB）

int2量化模型（推荐运行内存：8GB）

然后打开你的终端，运行以下命令

sudo apt install git cmake ccache #安装编译依赖
git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp #克隆远程仓库
cmake -B build
cmake --build build --config Release -j #开始编译
cd build/bin #进入编译结果文件夹

然后打开你的文件管理器，把前面下载下来的模型文件重命名为glm，并复制到主目录的llama.cpp/build/bin文件夹里面。

再次执行下列命令

./llama-server -m ./glm.gguf --port 8080

在浏览器中访问 http://localhost:8080/index-new.html

鼠标滑到底部，开始对话吧！

完美！

Reply Like 3 Favorite View the author

All Replies

瑞莎Radxa

deepin

2024-07-17 14:47

帖子里的图片崩了诶

可以编辑帖子重新上传一下

Reply Like 0 View the author

131******66

deepin

2024-07-17 15:00

为什么我上传图片就在转圈圈呢？

Reply Like 0 View the author

raspbian

deepin

2024-07-17 16:48

能用docker部署就是好文明！

但我还得自己排查怎么用

Reply Like 0 View the author

TangentSky85954

deepin

2024-07-17 19:05

知识严重错误

Reply Like 0 View the author

TangentSky85954

deepin

2024-07-17 19:21

想问下用不满CPU资源有办法吗

Reply Like 0 View the author

来我的哔哩哔哩看看如何

deepin

2024-07-17 21:36

TangentSky85954：

知识严重错误

9B小模型是这样的

Reply Like 0 View the author

来我的哔哩哔哩看看如何

deepin

2024-07-17 21:37

131******66：

为什么我上传图片就在转圈圈呢？

此模型不支持图片上传，这不是多模态模型

Reply Like 0 View the author

来我的哔哩哔哩看看如何

deepin

2024-07-17 21:38

TangentSky85954：

想问下用不满CPU资源有办法吗

能发一下CPU型号吗

Reply Like 0 View the author

TangentSky85954

deepin

2024-07-17 22:05

来我的哔哩哔哩看看如何：

能发一下CPU型号吗

i7-12700K

下载的是FP16模型

Reply Like 0 View the author

来我的哔哩哔哩看看如何

deepin

2024-07-17 22:18

#10

TangentSky85954：

i7-12700K

下载的是FP16模型

12代的正常情况intel大小核的问题

Reply Like 0 View the author

TangentSky85954

deepin

2024-07-17 22:20

#11

来我的哔哩哔哩看看如何：

12代的正常情况intel大小核的问题

那咋办，难道关小核吗

Reply Like 0 View the author

来我的哔哩哔哩看看如何

deepin

2024-07-17 22:32

#12

TangentSky85954：

i7-12700K

下载的是FP16模型

尝试在启动命令后面加一个

-t 物理内核数

的参数

Reply Like 0 View the author

来我的哔哩哔哩看看如何

deepin

2024-07-17 22:33

#13

TangentSky85954：

那咋办，难道关小核吗

尝试在启动命令后面加一个

-t 12

的参数

Reply Like 0 View the author

TangentSky85954

deepin

2024-07-17 22:56

#14

来我的哔哩哔哩看看如何：

尝试在启动命令后面加一个

-t 12

的参数

看了一下，利用率高了20%左右

Reply Like 0 View the author

来我的哔哩哔哩看看如何

deepin

2024-07-18 07:48

#15

TangentSky85954：

看了一下，利用率高了20%左右

这个参数是强制跑满12个核心。现在token输出有变快吗

Reply Like 0 View the author

TangentSky85954

deepin

2024-07-18 12:49

#16

来我的哔哩哔哩看看如何：

这个参数是强制跑满12个核心。现在token输出有变快吗

有点变快

Reply Like 0 View the author

蓝鲸

deepin

2024-07-18 16:41

#17

怎么看都是吃资源的大户呀，我的电脑还是落后啊

Reply Like 0 View the author

Featured Collection

Change

New Thread

Popular Ranking

Change

Popular Events