Llama.cpp pending work #33

@noemotiovon

Description

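  • When running in parallel, the -kvu mode has accuracy issues if the sequences do not share the same system prompt.
  • Optimize the aclTensor creation flow: wrap each tensor in a Wrapper so that a missed release can no longer cause a memory leak.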
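
One possible shape for the Wrapper in the second item is an RAII owner built on `std::unique_ptr` with a custom deleter, so the handle is released automatically when it goes out of scope. The sketch below is only an illustration under assumptions: `fake_acl_tensor`, `fake_create_tensor`, `fake_destroy_tensor`, `tensor_ptr` and `make_tensor` are hypothetical stand-ins, not the real CANN or llama.cpp API. In the actual backend the deleter would presumably call the CANN release function for aclTensor instead of the stub.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the opaque tensor handle; NOT the real aclTensor type.
struct fake_acl_tensor { int id; };

// Counts live handles so the sketch can show that nothing leaks.
static int g_live_tensors = 0;

// Hypothetical stubs standing in for the real create/destroy calls.
static fake_acl_tensor * fake_create_tensor(int id) {
    ++g_live_tensors;
    return new fake_acl_tensor{id};
}

static void fake_destroy_tensor(fake_acl_tensor * t) {
    --g_live_tensors;
    delete t;
}

// Deleter invoked by unique_ptr when the owning pointer is destroyed or reset.
struct tensor_deleter {
    void operator()(fake_acl_tensor * t) const { fake_destroy_tensor(t); }
};

// Owning handle: move-only, releases the tensor exactly once.
using tensor_ptr = std::unique_ptr<fake_acl_tensor, tensor_deleter>;

// Creates a tensor and immediately hands ownership to the wrapper.
static tensor_ptr make_tensor(int id) {
    tensor_ptr t(fake_create_tensor(id));
    assert(t != nullptr);
    return t;
}

// Returns the number of live tensors while two wrappers are in scope;
// both are released automatically when the function returns.
static int live_inside_scope() {
    tensor_ptr a = make_tensor(1);
    tensor_ptr b = make_tensor(2);
    return g_live_tensors;
}
```

With this pattern, call sites pass `t.get()` to operators that expect the raw handle and never call the destroy function by hand, so early returns and error paths cannot skip the release.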
