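In-Depth Analysis of the MLX GGUF File-Loading Vulnerability: Wild Pointer Dereference Causes a Segmentation Fault

Vulnerability Overview

CVE-2025-62609 is a medium-severity vulnerability in the MLX machine learning framework, located in the mlx::core::load_gguf() function. When loading a malicious GGUF (GPT-Generated Unified Format) file, the function dereferences an unvalidated pointer obtained from the external gguflib library, crashing the application with a segmentation fault.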

Environment

Operating system: Ubuntu 20.04.6 LTS
Compiler: Clang 19.1.7

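Vulnerability Details

Location: mlx/io/gguf.cpp

The vulnerability originates in the extract_tensor_data() function (lines 59-79), specifically the memcpy call at lines 64-67. This function is called from load_arrays() (line 177).

Vulnerable code: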

std::tuple<allocator::Buffer, Dtype> extract_tensor_data(gguf_tensor* tensor) {
  std::optional<Dtype> equivalent_dtype = gguf_type_to_dtype(tensor->type);
  if (equivalent_dtype.has_value()) {
    allocator::Buffer buffer = allocator::malloc(tensor->bsize);
    memcpy(
        buffer.raw_ptr(),
        tensor->weights_data,  // untrusted pointer from gguflib, not validated
        tensor->num_weights * equivalent_dtype.value().size());
    return {buffer, equivalent_dtype.value()};
  }
  // ...
}

Problem: the tensor->weights_data pointer comes directly from the external library and is passed to memcpy without a NULL check or any other validity check.
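To make "validity check" more concrete, here is a minimal Python sketch of a bounds test. It assumes the tensor data pointer is derived by adding a file-supplied byte offset to the start of the loaded file, which is an assumption about gguflib's internals that this write-up does not confirm. TensorInfo, its fields, and tensor_in_bounds are hypothetical names used only for illustration; they are not part of MLX or gguflib.

```python
from dataclasses import dataclass


@dataclass
class TensorInfo:
    """Hypothetical stand-in for the size-related fields of a tensor record."""
    offset: int       # byte offset of the tensor data within the file (untrusted)
    num_weights: int  # number of elements (untrusted)
    item_size: int    # bytes per element for the tensor's dtype


def tensor_in_bounds(t: TensorInfo, file_size: int) -> bool:
    """Return True only if [offset, offset + nbytes) lies inside the file."""
    if t.offset < 0 or t.num_weights < 0 or t.item_size <= 0:
        return False
    nbytes = t.num_weights * t.item_size
    # Compare against the remaining space rather than computing offset + nbytes;
    # in a fixed-width language this form avoids integer overflow in the sum.
    return t.offset <= file_size and nbytes <= file_size - t.offset


# A 16-byte file can hold four 4-byte weights at offset 0, but not at offset 8.
print(tensor_in_bounds(TensorInfo(offset=0, num_weights=4, item_size=4), 16))
print(tensor_in_bounds(TensorInfo(offset=8, num_weights=4, item_size=4), 16))
```

A check of this kind rejects a record whose declared data range falls outside the file before any copy is attempted, which a NULL test alone cannot do for a non-null but out-of-range pointer.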

Fix

The fix adds pointer validation before the memcpy call.

Fixed code:

std::tuple<allocator::Buffer, Dtype> extract_tensor_data(gguf_tensor* tensor) {
  std::optional<Dtype> equivalent_dtype = gguf_type_to_dtype(tensor->type);
  if (equivalent_dtype.has_value()) {
    // Fix: validate the pointer
    if (!tensor->weights_data) {
      throw std::runtime_error("[load_gguf] NULL tensor data pointer");
    }

    allocator::Buffer buffer = allocator::malloc(tensor->bsize);
    memcpy(
        buffer.raw_ptr(),
        tensor->weights_data,
        tensor->num_weights * equivalent_dtype.value().size());
    return {buffer, equivalent_dtype.value()};
  }
  // ...
}

Proof of Concept (PoC)

Install MLX:

pip install mlx

Run the exploit:

python3 -c "import mlx.core as mx; mx.load('exploit.gguf', format='gguf')"

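AddressSanitizer Output (Instrumented Build)

Running the PoC against an AddressSanitizer-instrumented build produces a detailed crash stack trace. It shows that the fault occurs when the memcpy call in extract_tensor_data reads from an invalid memory address. The key part of the output is: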

AddressSanitizer:DEADLYSIGNAL
=================================================================
==5855==ERROR: AddressSanitizer: SEGV on unknown address 0x7fc432f64bc0 (pc 0x7fc430841c12 bp 0x7ffc04847ab0 sp 0x7ffc04847268 T0)
==5855==The signal is caused by a READ memory access.
    #0 0x7fc430841c12  /build/glibc-B3wQXB/glibc-2.31/string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
    #1 0x55aac829756b in __asan_memcpy (...)
    #2 0x55aacaa6e8dc in mlx::core::extract_tensor_data(gguf_tensor*) /home/user1/mlx/mlx/io/gguf.cpp:64:5
    #3 0x55aacaa773fc in mlx::core::load_arrays[abi:cxx11](gguf_ctx*) /home/user1/mlx/mlx/io/gguf.cpp:226:35
    #4 0x55aacaa782a9 in mlx::core::load_gguf(...) /home/user1/mlx/mlx/io/gguf.cpp:250:17
    ...
SUMMARY: AddressSanitizer: SEGV /build/glibc-B3wQXB/glibc-2.31/string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
==5855==ABORTING

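Impact

Attack vector: a malicious GGUF file (typically a model weights file, possibly from an untrusted source).
Affected scope: MLX users on all platforms who call the vulnerable function with unvalidated input.
Consequence: a segmentation fault, which cannot be caught by an exception handler, crashes the application.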
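
The consequence listed above can be illustrated with a small Python sketch: a process killed by SIGSEGV never reaches a try/except block, so the only way for a caller to survive is to observe the crash from a separate process. The sketch assumes a POSIX system, where subprocess reports death by signal as a negative return code. run_isolated is a hypothetical helper name, and the crash is simulated by the child sending itself SIGSEGV rather than by loading a real GGUF file.

```python
import signal
import subprocess
import sys
from typing import Tuple


def run_isolated(code: str, timeout: float = 60.0) -> Tuple[bool, str]:
    """Run `code` in a child interpreter and report how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # POSIX: a negative return code means the child was killed by a signal.
        return False, "killed by " + signal.Signals(-proc.returncode).name
    if proc.returncode != 0:
        # For example, an uncaught Python exception in the child.
        return False, "exited with an error"
    return True, "ok"


# Stand-in for a crashing native call: the child sends itself SIGSEGV.
CRASH_SNIPPET = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
# Stand-in for a patched build that raises instead of crashing.
RAISE_SNIPPET = "raise RuntimeError('[load_gguf] NULL tensor data pointer')"

print(run_isolated(CRASH_SNIPPET))
print(run_isolated(RAISE_SNIPPET))
```

To try this against a real file, the snippet string could be swapped for the PoC one-liner shown earlier; the parent process then keeps running whether the child crashes or raises.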

Affected and Fixed Versions

Affected versions: MLX pip package versions <= 0.29.3
Fixed version: MLX pip package version 0.29.4
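
As a rough way to check an environment against the version boundary above, the following Python sketch compares the installed mlx distribution's version with 0.29.4. The helper names (parse_release, is_patched, installed_mlx_version) are hypothetical, and the parser is deliberately simple: it looks only at the leading numeric components and ignores pre-release or local suffixes.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

FIXED_RELEASE = (0, 29, 4)  # first fixed version, per the advisory text above


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the leading dotted numeric components, e.g. '0.29.3' -> (0, 29, 3)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognized version string: " + repr(version))
    return tuple(int(part) for part in match.group(0).split("."))


def is_patched(version: str) -> bool:
    """True if the numeric release is at or above the fixed release."""
    return parse_release(version) >= FIXED_RELEASE


def installed_mlx_version() -> Optional[str]:
    """Version of the installed 'mlx' distribution, or None if it is absent."""
    try:
        return metadata.version("mlx")
    except metadata.PackageNotFoundError:
        return None


version = installed_mlx_version()
if version is None:
    print("mlx is not installed")
else:
    print(version, "patched" if is_patched(version) else "affected")
```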

Credits

This vulnerability was discovered and reported by security researchers at ARIMLABS:

Markiyan Melnyk (reporter)
Mykyta Mudryi (finder)
Markiyan Chaklosh (finder)