
Stars in the Milky Way

Fallen to Earth

 
 
 

CUDA installation: nvidia.ko error

2010-04-27 22:51:48 | Category: High-Performance Computing

[root@gpu-node1 cuda_test]# ./a.out
FATAL: Error inserting nvidia (/lib/modules/2.6.18-164.el5PAE/kernel/drivers/video/nvidia.ko): Invalid module format
Checking /var/log/nvidia-installer.log shows the following message:
nvidia: disagrees about version of symbol struct_module.

[root@gpu-node1 install]# dmesg|grep gcc
Linux version 2.6.18-164.el5PAE (mockbuild@builder16.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Thu Sep 3 04:10:44 EDT 2009
[root@gpu-node1 install]# gcc --version
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)
Copyright (C) 2006 Free Software Foundation, I
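
The comparison above (compiler recorded in the kernel banner versus the installed gcc) can be scripted. This is a minimal sketch, not part of the original post: the helper name kernel_gcc_version is hypothetical, and it assumes the banner uses the "gcc version X.Y.Z" wording shown above, which other kernels may not.

```shell
# Hypothetical helper: read a kernel banner on stdin and print the gcc
# version it mentions. Assumes the "gcc version X.Y.Z" wording seen above.
kernel_gcc_version() {
    sed -n 's/.*gcc version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Banner copied from the dmesg output above.
banner='Linux version 2.6.18-164.el5PAE (mockbuild@builder16.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Thu Sep 3 04:10:44 EDT 2009'
built_with=$(printf '%s\n' "$banner" | kernel_gcc_version)

# Hard-coded here so the sketch runs anywhere; on a real system this could
# come from `gcc -dumpversion` (newer gcc releases may print only the major).
installed='4.1.2'

if [ "$built_with" = "$installed" ]; then
    echo "gcc matches: $built_with"
else
    echo "gcc mismatch: kernel built with $built_with, installed gcc is $installed"
fi
```

On the live machine the banner could be taken from /proc/version instead of being pasted in.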

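dmesg shows that the running kernel is Linux version 2.6.18-164.el5PAE, but the driver was installed with:
./devdriver_3.0_linux_32_195.36.15.run --kernel-source-path /usr/src/kernels/2.6.18-164.el5-i686
so the module was built against sources for a different kernel than the one actually running.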
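
A quick guard against this mismatch is to compare the running kernel release with the name of the source directory before launching the installer. The sketch below is illustrative only: kernel_src_matches is a hypothetical helper, and it assumes the source directory is named after the kernel release, optionally followed by "-<arch>", as the paths in this post are.

```shell
# Hypothetical helper: succeed if the source directory name is the kernel
# release itself or the release followed by "-<something>" (e.g. "-i686").
kernel_src_matches() {
    release="$1"                  # e.g. the output of `uname -r`
    src_base=$(basename "$2")     # e.g. a directory under /usr/src/kernels
    case "$src_base" in
        "$release"|"$release"-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Values from this post; on a real system use running=$(uname -r).
running='2.6.18-164.el5PAE'
src='/usr/src/kernels/2.6.18-164.el5-i686'

if kernel_src_matches "$running" "$src"; then
    echo "OK: $src matches running kernel $running"
else
    echo "MISMATCH: $src does not match running kernel $running"
fi
```

With the values from this post the check reports a mismatch, which is the situation that produced the nvidia.ko error.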
Looking at grub.conf next, there are two CentOS kernel entries:
default=1
timeout=5
splashimage=(hd0,7)/boot/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-164.el5)
        root (hd0,7)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=/
        initrd /boot/initrd-2.6.18-164.el5.img
title CentOS (2.6.18-164.el5PAE)
        root (hd0,7)
        kernel /boot/vmlinuz-2.6.18-164.el5PAE ro root=LABEL=/
        initrd /boot/initrd-2.6.18-164.el5PAE.img
title Other
        rootnoverify (hd0,1)
        chainloader +1
title centos64
        rootnoverify (hd0,8)
        chainloader +1
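The system boots CentOS (2.6.18-164.el5PAE) by default, because default=1 selects the second title (GRUB counts entries from 0). I removed that entry so the machine boots the first one, then installed again: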
./devdriver_3.0_linux_32_195.36.15.run --kernel-source-path /usr/src/kernels/2.6.18-164.el5-i686
This time the installation finally succeeded.
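
To see which entry the default= line selects without counting titles by hand, a small awk filter is enough. This is a sketch under assumptions: default_grub_title is a hypothetical helper, it only understands a numeric default= (not values such as "saved"), and it treats a missing default= line as 0.

```shell
# Hypothetical helper: print the title that "default=N" selects in a
# GRUB legacy config. Reads the named file, or stdin if none is given.
default_grub_title() {
    awk '
        BEGIN          { want = 0; n = 0 }
        /^default=/    { sub(/^default=/, ""); want = $0 + 0 }
        /^title[ \t]/  { titles[n++] = substr($0, 7) }   # text after "title "
        END            { if (want in titles) print titles[want] }
    ' "$@"
}

# Abbreviated copy of the grub.conf shown above.
default_grub_title <<'EOF'
default=1
timeout=5
title CentOS (2.6.18-164.el5)
title CentOS (2.6.18-164.el5PAE)
title Other
title centos64
EOF
# prints: CentOS (2.6.18-164.el5PAE)
```

On the real machine the function would be given the path to grub.conf as its argument.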

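Also, when I first ran yum install kernel kernel-headers kernel-devel, yum used the USTC mirror by default, so the kernel packages it downloaded were a different version from the installed kernel. That produced the error "nvidia: disagrees about version of symbol struct_module."
This error generally means that the running kernel and the kernel sources do not match. When an error occurs, read the output carefully and check the installation log under /var/log.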
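
A related check is whether an already built module was compiled for the running kernel. The kernel release is the first word of a module's vermagic string, which modinfo -F vermagic prints. The sketch below is an illustration, not from the original post: vermagic_matches is a hypothetical helper and the sample vermagic string is made up for the example.

```shell
# Hypothetical helper: succeed if the first word of a vermagic string
# equals the given kernel release.
vermagic_matches() {
    release="$1"
    built_for=${2%% *}     # strip everything after the first space
    [ "$built_for" = "$release" ]
}

# Illustrative values. On a real system they could come from:
#   running=$(uname -r)
#   magic=$(modinfo -F vermagic nvidia)
running='2.6.18-164.el5PAE'
magic='2.6.18-164.el5 SMP mod_unload 686'

if vermagic_matches "$running" "$magic"; then
    echo "module was built for the running kernel"
else
    echo "module was built for ${magic%% *}, but $running is running"
fi
```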

So make sure to run through the following checks:
Okay, having reconfigured your kernel, did you reinstall your kernel and reboot into the new one, and did you check that grub points to the new kernel image?

Also, when you installed, did you check that you had the /boot partition mounted? Because the Gentoo handbook says to set the /boot partition to noauto, a lot of new users forget to mount their boot partition before copying a new kernel to /boot and get confused.

You may also need to visit /lib/modules/{kernel-version}/kernel/drivers/ and poke around to find the nvidiafb module and remove it.

Something that can be helpful for checking that you got your kernel rebuild right is enabling the export of the currently running kernel config through /proc/config.gz; then you can gzcat /proc/config.gz | grep -i <searchterm> to check that you got it right and loaded the right kernel.



Sources quoted:

http://forums.gentoo.org/viewtopic-t-811924-start-0.html
http://www.linuxquestions.org/questions/linux-general-1/unable-to-install-nvidia-drivers-587637/


-> Kernel module compilation complete.
ERROR: Unable to load the kernel module 'nvidia.ko'.  This happens most
       frequently when this kernel module was built against the wrong or
       improperly configured kernel sources, with a version of gcc that differs
       from the one used to build the target kernel, or if a driver such as
       rivafb/nvidiafb is present and prevents the NVIDIA kernel module from
       obtaining ownership of the NVIDIA graphics device(s).

See the discussion here: http://www.nvnews.net/vbulletin/showthread.php?t=49951&page=3

Looks like I spoke too soon, the problem was solved by using the "-k $(uname -r)"  (thank you jong0357 and the all of you)

Change the command to:
./devdriver_3.0_linux_32_195.36.15.run --kernel-source-path /usr/src/kernels/2.6.18-164.15.1.el5-i686 -k $(uname -r)

