chenby
Found 202 results for "cby"
2021-12-30
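Server compromised: an abnormal process that cannot be killed and keeps changing its name
The story:

One day over dinner, a friend told me that an abnormal process on his server kept maxing out the CPU. After some polite modesty I agreed to log in and look at it when I had time.

It was May 1, Labor Day. I was at home with nothing to do, watching a variety show, when my phone buzzed with a WeChat message. He had sent me a few screenshots, and they caught my curiosity.

In those screenshots, the exe link in the process's /proc directory pointed to a file that had already been deleted, which made me suspect the process was being hidden. I asked my friend for the root account and password. After logging in and running top, I found a strange process running. I killed it with kill, waited about ten minutes, and it did not come back, so I told my friend it was gone. He asked whether I had simply killed it. I said yes. He added that the process comes back some time after being killed. I asked how long that takes, and he said he was not sure, but certainly within a day. That worried me: if it can take up to a day, I would not see it again until tomorrow, and there was nothing more to do for now. I went back to my show.

Not long after, I checked again and found that the process had restarted under a different name and was maxing out the CPU again. While examining the file the process was running, I found that it connects to a server in South Korea. Visiting that IP showed an ordinary website with nothing unusual.

Looking at the process's working directory, I found that the executable was gone and the working directory itself had been deleted. I was stuck for a moment, until I remembered the scheduled tasks. The scheduled script turned out to be base64-encoded. I decoded it with an online tool and recovered the full script.

Roughly, the script makes a temporary file executable, runs it, and deletes it once it finishes. That explains why the file in the execution directory was shown in red as missing.

The key part is the following. The script concatenates strings to build a URL and downloads the malware from it. Along the way it collects the local IP, the current user (whoami), the machine architecture, the hostname, and every IP address on the local network interfaces, then hashes the network information with md5sum. Finally it reads the scheduled tasks and turns them into a base64 string.

After that it downloads a script, executes it, and adds a scheduled task. Interestingly, the script dates from 2017 and is still in use today. In the end I removed all of its permissions, renamed it, and deleted the scheduled task. With that, the malware was cleaned up.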
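The two observations in this story, an exe link that points to a deleted file and a base64-encoded scheduled script, can both be checked from the shell. The following is only a minimal sketch under assumptions: the function names `exe_is_deleted`, `list_deleted_exe`, and `decode_payload` are hypothetical helpers that do not appear in the original post, and the sketch assumes a Linux /proc filesystem plus the coreutils `readlink` and `base64` commands.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not taken from the original post.

# exe_is_deleted PID
# Returns 0 if /proc/PID/exe points to a deleted file, 1 if it does not,
# and 2 if the link cannot be read (no such PID, or no permission).
# The body is in parentheses so it runs in a subshell and its variables stay local.
exe_is_deleted() (
  target=$(readlink "/proc/$1/exe" 2>/dev/null) || return 2
  case "$target" in
    *" (deleted)") return 0 ;;
    *) return 1 ;;
  esac
)

# list_deleted_exe
# Prints "PID target" for every readable process whose executable was deleted.
list_deleted_exe() (
  for dir in /proc/[0-9]*; do
    pid=${dir#/proc/}
    if exe_is_deleted "$pid"; then
      echo "$pid $(readlink "/proc/$pid/exe")"
    fi
  done
)

# decode_payload
# Decodes base64 text from standard input so it can be read, not executed.
decode_payload() {
  base64 -d
}
```

Running `list_deleted_exe` as root covers every process. A suspicious crontab string can be piped through `decode_payload` and inspected as text; it should never be piped into a shell.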
781 reads · 0 comments · 0 likes
2021-12-30
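Offline installation on the Huawei Atlas A800-9000 AI physical server: driver, CANN, MindSpore, and TensorFlow
Contents
Fully offline driver installation on the Huawei Atlas A800-9000 AI physical server, with CANN, MindSpore, and TensorFlow deployment
A800-9000 physical server driver installation
  Configure a local apt source from the ISO image
  Create a regular user and set its password
  Install the driver and firmware
  Verify the installation
CANN development environment deployment
  Install the environment and dependencies
  Check versions after installation
  Install Python 3.7.5
  Install pip dependencies in the Python 3.7.5 environment
  Install the development kit packages
CANN training environment deployment
  Notes
  Install the training software packages
Install MindSpore
  Install the whl packages
  Configure environment variables
  Test the installation
Install mindinsight
  Install the whl packages
  Configure environment variables
  Start and use
Install TensorFlow
  Build hdf5
  Configure environment variables and symlinks
  Install the whl packages
Install PyTorch

Fully offline driver installation on the Huawei Atlas A800-9000 AI physical server, with CANN, MindSpore, and TensorFlow deployment

Background

The Atlas 800 training server (model 9000) is an AI training server based on Huawei Kunpeng 920 and Ascend 910 processors. Huawei describes it as offering high compute density, high energy efficiency, and high-speed network bandwidth. It is widely used for deep learning model development and training in fields that need large amounts of compute, such as smart cities, smart healthcare, astronomy, and oil exploration.
Link: https://e.huawei.com/cn/products/cloud-computing-dc/atlas/atlas-800-training-9000

CANN (Compute Architecture for Neural Networks) is Huawei's heterogeneous computing architecture for AI scenarios. It provides multi-level programming interfaces that help users quickly build AI applications and services on the Ascend platform.
Link: https://e.huawei.com/cn/products/cloud-computing-dc/atlas/cann

MindSpore is an open-source AI computing framework. Its programming paradigm is meant to be easy for AI scientists and engineers to use. The framework targets device, edge, and cloud scenarios, aims to protect data privacy, and is open source.
Link: https://www.mindspore.cn/

TensorFlow was originally developed by the Google Brain team for Google's research and production use, and was released under the Apache 2.0 open-source license on November 9, 2015.
Link: https://www.tensorflow.org/

A800-9000 physical server driver installation

Configure a local apt source from the ISO image
root@ubuntu:/etc/apt# mkdir /media/cdrom
root@ubuntu:/etc/apt# mount /home/cby/ubuntu-18.04.5-server-arm64.iso /media/cdrom
mount: /media/cdrom: WARNING: device write-protected, mounted read-only.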
root@ubuntu:/etc/apt# apt-cdrom -m -d=/media/cdrom/ add
root@ubuntu:/etc/apt# cat /etc/apt/sources.list

Create a regular user and set its password
root@ubuntu:~# groupadd HwHiAiUser
root@ubuntu:~# useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser
root@ubuntu:~# passwd HwHiAiUser
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully

Install the driver and firmware
root@ubuntu:~# cd /home/cby/
root@ubuntu:/home/cby# ll
total 98324
drwxr-xr-x 4 cby cby 4096 Apr 21 21:41 ./
drwxr-xr-x 4 root root 4096 Apr 21 21:44 ../
-rw-r--r-- 1 cby cby 99728721 Apr 21 21:41 A800-9000-npu-driver_20.2.0_ubuntu18.04-aarch64.run
-rw-r--r-- 1 cby cby 912335 Apr 21 21:41 A800-9000-npu-firmware_1.76.22.3.220.run
root@ubuntu:/home/cby# chmod +x *.run
root@ubuntu:/home/cby# ll
total 98324
drwxr-xr-x 4 cby cby 4096 Apr 21 21:41 ./
drwxr-xr-x 4 root root 4096 Apr 21 21:44 ../
-rwxr-xr-x 1 cby cby 99728721 Apr 21 21:41 A800-9000-npu-driver_20.2.0_ubuntu18.04-aarch64.run*
-rwxr-xr-x 1 cby cby 912335 Apr 21 21:41 A800-9000-npu-firmware_1.76.22.3.220.run*
root@ubuntu:/home/cby# apt install gcc
root@ubuntu:/home/cby# apt install make
root@ubuntu:/home/cby# ./A800-9000-npu-driver_20.2.0_ubuntu18.04-aarch64.run --run
root@ubuntu:/home/cby# ./A800-9000-npu-firmware_1.76.22.3.220.run --run
Note: the server must be restarted after the installation completes.

Verify the installation
root@ubuntu:/home/cby# npu-smi info

CANN development environment deployment

Install the environment and dependencies
root@ubuntu:/home/cby# apt install g++
root@ubuntu:/home/cby# cd cmake/
root@ubuntu:/home/cby/cmake# ll
total 4356
drwxr-xr-x 2 root root 4096 Apr 21 23:48 ./
drwxr-xr-x 7 cby cby 4096 Apr 21 23:48 ../
-rw-r--r-- 1 cby cby 2971248 Apr 21 23:45 cmake_3.10.2-1ubuntu2.18.04.1_arm64.deb
-rw-r--r-- 1 cby cby 1331524 Apr 21 23:45 cmake-data_3.10.2-1ubuntu2.18.04.1_all.deb
-rw-r--r-- 1 cby cby 69166 Apr 21 23:47 libjsoncpp1_1.7.4-3_arm64.deb
-rw-r--r-- 1 cby cby 71788 Apr 21 23:48 librhash0_1.3.6-2_arm64.deb
root@ubuntu:/home/cby/cmake# apt install ./*
root@ubuntu:/home/cby/cmake# make --version
root@ubuntu:/home/cby# apt install ./zlib1g-dev_1%3a1.2.11.dfsg-0ubuntu2_arm64.deb
root@ubuntu:/home/cby# apt install ./libbz2-dev_1.0.6-8.1ubuntu0.2_arm64.deb
root@ubuntu:/home/cby# apt install ./libsqlite3-dev_3.22.0-1ubuntu0.4_arm64.deb
root@ubuntu:/home/cby# cd libssl-dev/
root@ubuntu:/home/cby/libssl-dev# apt install ./*
root@ubuntu:/home/cby# cd libxslt1-dev/
root@ubuntu:/home/cby/libxslt1-dev# ll
total 13596
drwxr-xr-x 2 root root 4096 Apr 22 00:37 ./
drwxr-xr-x 10 cby cby 4096 Apr 22 00:37 ../
-rw-r--r-- 1 cby cby 18528 Apr 22 00:30 gir1.2-harfbuzz-0.0_1.7.2-1ubuntu1_arm64.deb
-rw-r--r-- 1 cby cby 170204 Apr 22 00:27 icu-devtools_60.2-3ubuntu3.1_arm64.deb
-rw-r--r-- 1 cby cby 983364 Apr 22 00:37 libglib2.0-0_2.56.4-0ubuntu0.18.04.8_arm64.deb
-rw-r--r-- 1 cby cby 61832 Apr 22 00:33 libglib2.0-bin_2.56.4-0ubuntu0.18.04.8_arm64.deb
-rw-r--r-- 1 cby cby 1297600 Apr 22 00:31 libglib2.0-dev_2.56.4-0ubuntu0.18.04.8_arm64.deb
-rw-r--r-- 1 cby cby 99676 Apr 22 00:31 libglib2.0-dev-bin_2.56.4-0ubuntu0.18.04.8_arm64.deb
-rw-r--r-- 1 cby cby 14528 Apr 22 00:32 libgraphite2-dev_1.3.11-2_arm64.deb
-rw-r--r-- 1 cby cby 280584 Apr 22 00:28 libharfbuzz-dev_1.7.2-1ubuntu1_arm64.deb
-rw-r--r-- 1 cby cby 12556 Apr 22 00:30 libharfbuzz-gobject0_1.7.2-1ubuntu1_arm64.deb
-rw-r--r-- 1 cby cby 5348 Apr 22 00:29 libharfbuzz-icu0_1.7.2-1ubuntu1_arm64.deb
-rw-r--r-- 1 cby cby 8890124 Apr 22 00:26 libicu-dev_60.2-3ubuntu3.1_arm64.deb
-rw-r--r-- 1 cby cby 14412 Apr 22 00:28 libicu-le-hb0_1.0.3+git161113-4_arm64.deb
-rw-r--r-- 1 cby cby 29760 Apr 22 00:27 libicu-le-hb-dev_1.0.3+git161113-4_arm64.deb
-rw-r--r-- 1 cby cby 18756 Apr 22 00:26 libiculx60_60.2-3ubuntu3.1_arm64.deb
-rw-r--r-- 1 cby cby 120696 Apr 22 00:35 libpcre16-3_2%3a8.39-9_arm64.deb
-rw-r--r-- 1 cby cby 113240 Apr 22 00:35 libpcre32-3_2%3a8.39-9_arm64.deb
-rw-r--r-- 1 cby cby 459316 Apr 22 00:33 libpcre3-dev_2%3a8.39-9_arm64.deb
-rw-r--r-- 1 cby cby 15124 Apr 22 00:35 libpcrecpp0v5_2%3a8.39-9_arm64.deb
-rw-r--r-- 1 cby cby 673384 Apr 22 00:25 libxml2-dev_2.9.4+dfsg1-6.1ubuntu1.3_arm64.deb
-rw-r--r-- 1 cby cby 395564 Apr 22 00:24 libxslt1-dev_1.1.29-5ubuntu0.2_arm64.deb
-rw-r--r-- 1 cby cby 42802 Apr 22 00:33 pkg-config_0.29.1-0ubuntu2_arm64.deb
-rw-r--r-- 1 cby cby 144176 Apr 22 00:37 python3-distutils_3.6.9-1~18.04_all.deb
root@ubuntu:/home/cby/libxslt1-dev# apt install ./*
root@ubuntu:/home/cby# cd libffi-dev/
root@ubuntu:/home/cby/libffi-dev# ls
libffi-dev_3.2.1-8_arm64.deb
root@ubuntu:/home/cby/libffi-dev# apt install ./*
root@ubuntu:/home/cby# apt install unzip
root@ubuntu:/home/cby# apt install ./libblas-dev_3.7.1-4ubuntu1_arm64.deb
root@ubuntu:/home/cby# cd gfortran/
root@ubuntu:/home/cby/gfortran# ll
total 7844
drwxr-xr-x 2 root root 4096 Apr 22 00:50 ./
drwxr-xr-x 12 cby cby 4096 Apr 22 00:50 ../
-rw-r--r-- 1 cby cby 1344 Apr 22 00:48 gfortran_4%3a7.4.0-1ubuntu2.3_arm64.deb
-rw-r--r-- 1 cby cby 7464740 Apr 22 00:48 gfortran-7_7.5.0-3ubuntu1~18.04_arm64.deb
-rw-r--r-- 1 cby cby 248176 Apr 22 00:50 libgfortran4_7.5.0-3ubuntu1~18.04_arm64.deb
-rw-r--r-- 1 cby cby 300500 Apr 22 00:49 libgfortran-7-dev_7.5.0-3ubuntu1~18.04_arm64.deb
root@ubuntu:/home/cby/gfortran# apt install ./*
root@ubuntu:/home/cby# cd libblas3/
root@ubuntu:/home/cby/libblas3# apt install ./libblas3_3.7.1-4ubuntu1_arm64.deb
root@ubuntu:/home/cby# cd libopenblas-dev/
root@ubuntu:/home/cby/libopenblas-dev# ll
total 3412
drwxr-xr-x 2 root root 4096 Apr 22 00:56 ./
drwxr-xr-x 14 cby cby 4096 Apr 22 00:56 ../
-rw-r--r-- 1 cby cby 1813748 Apr 22 00:55 libopenblas-base_0.2.20+ds-4_arm64.deb
-rw-r--r-- 1 cby cby 1668126 Apr 22 00:54 libopenblas-dev_0.2.20+ds-4_arm64.deb
root@ubuntu:/home/cby/libopenblas-dev# apt install ./*

Check versions after installation
gcc --version
g++ --version
make --version
cmake --version
dpkg -l zlib1g | grep zlib1g | grep ii
dpkg -l zlib1g-dev | grep zlib1g-dev | grep ii
dpkg -l libbz2-dev | grep libbz2-dev | grep ii
dpkg -l libsqlite3-dev | grep libsqlite3-dev | grep ii
dpkg -l openssl | grep openssl | grep ii
dpkg -l libssl-dev | grep libssl-dev | grep ii
dpkg -l libxslt1-dev | grep libxslt1-dev | grep ii
dpkg -l libffi-dev | grep libffi-dev | grep ii
dpkg -l unzip | grep unzip | grep ii
dpkg -l pciutils | grep pciutils | grep ii
dpkg -l net-tools | grep net-tools | grep ii
dpkg -l libblas-dev | grep libblas-dev | grep ii
dpkg -l gfortran | grep gfortran | grep ii
dpkg -l libblas3 | grep libblas3 | grep ii
dpkg -l libopenblas-dev | grep libopenblas-dev | grep ii

Install Python 3.7.5
root@ubuntu:/home/cby/python# tar xvf Python3.7.5.tar
root@ubuntu:/home/cby/python# cd Python-3.7.5/
root@ubuntu:/home/cby/python/Python-3.7.5# ./configure --prefix=/usr/local/python3.7.5 --enable-loadable-sqlite-extensions --enable-shared
root@ubuntu:/home/cby/python/Python-3.7.5# make
root@ubuntu:/home/cby/python/Python-3.7.5# make install
root@ubuntu:/home/cby# sudo ln -s /usr/local/python3.7.5/bin/pip3 /usr/local/bin/pip3.7.5
root@ubuntu:/home/cby# sudo ln -s /usr/local/python3.7.5/bin/python3 /usr/local/bin/python3.7.5
root@ubuntu:/home/cby/cann_xunlian# sudo ln -s /usr/local/python3.7.5/bin/python3 /usr/local/bin/python3.7
root@ubuntu:/home/cby/cann_xunlian# sudo ln -s /usr/local/python3.7.5/bin/pip3 /usr/local/bin/pip3.7
root@ubuntu:/home/cby# vim ~/.bashrc
export LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH
root@ubuntu:/home/cby# python3.7.5 --version
Python 3.7.5
root@ubuntu:/home/cby# pip3.7.5 --version
pip 19.2.3 from /usr/local/python3.7.5/lib/python3.7/site-packages/pip (python 3.7)

Install pip dependencies in the Python 3.7.5 environment
root@ubuntu:/home/cby/pip-pack# tar xvf pip_pack.tar
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./attrs-20.3.0-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./numpy-1.17.2-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./decorator-5.0.6-py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./mpmath-1.2.1-py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./sympy-1.4-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./pycparser-2.20-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./cffi-1.12.3.tar.gz
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./PyYAML-5.3.1.tar.gz
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./six-1.15.0-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./pathlib2-2.3.5-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./psutil-5.8.0.tar.gz
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./protobuf-3.15.8-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./scipy-1.6.0-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./chardet-3.0.4-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./idna-2.10-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./urllib3-1.25.10-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./certifi-2020.6.20-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./requests-2.24.0-py2.py3-none-any.whl
root@ubuntu:/home/cby/pip-pack/pip_pack# pip3.7.5 install ./xlrd-1.2.0-py2.py3-none-any.whl
Note: the pip packages above must be installed one by one in exactly this order.

Install the development kit packages
root@ubuntu:/home/cby/cann# ./Ascend-cann-tfplugin_20.2.rc1_linux-aarch64.run --install
root@ubuntu:/home/cby/cann# ./Ascend-cann-toolkit_20.2.rc1_linux-aarch64.run --install
The installation succeeded when "install success" is displayed.

CANN training environment deployment

Notes
Python 3.7.5, the environment, and the dependencies for the training environment are installed the same way as for the development environment; see the "CANN development environment deployment" section. On a machine where the development environment is already set up, the training environment only needs the training software package and the toolbox package.

Install the training software packages
root@ubuntu:/home/cby/cann_xunlian# chmod +x ./*.run
root@ubuntu:/home/cby/cann_xunlian# ./Ascend-cann-nnae_20.2.rc1_linux-aarch64.run --install
root@ubuntu:/home/cby/cann_xunlian# ./Ascend-cann-toolbox_20.2.rc1_linux-aarch64.run --install
The installation succeeded when "install success" is displayed.

Install MindSpore

Install the whl packages
Install the whl packages provided by the Ascend 910 AI processor software package. They are released together with that package and must be reinstalled whenever it is upgraded.
root@ubuntu:/home/cby/mindspore_ascend# pip3.7.5 install /usr/local/Ascend/ascend-toolkit/latest/fwkacllib/lib64/hccl-0.1.0-py3-none-any.whl
root@ubuntu:/home/cby/mindspore_ascend# pip3.7.5 install /usr/local/Ascend/ascend-toolkit/latest/fwkacllib/lib64/te-0.4.0-py3-none-any.whl
root@ubuntu:/home/cby/mindspore_ascend# pip3.7.5 install /usr/local/Ascend/ascend-toolkit/latest/fwkacllib/lib64/topi-0.4.0-py3-none-any.whl
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install easydict-1.9.tar.gz
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install ./wheel-0.36.2-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install ./astunparse-1.6.3-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install ./Pillow-8.2.0-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install ./asttokens-2.0.4-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install ./cffi-1.14.5-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install ./pyparsing-2.4.7-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install ./packaging-20.9-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindspore_ascend/pip# pip3.7.5 install ../mindspore_ascend-1.1.1-cp37-cp37m-linux_aarch64.whl
Note: the packages must be installed in this order.

Configure environment variables
# control log level. 0-DEBUG, 1-INFO, 2-WARNING, 3-ERROR, default level is WARNING.
export GLOG_v=2
# Conda environmental options
LOCAL_ASCEND=/usr/local/Ascend # the root directory of run package
# lib libraries that the run package depends on
export LD_LIBRARY_PATH=${LOCAL_ASCEND}/add-ons/:${LOCAL_ASCEND}/ascend-toolkit/latest/fwkacllib/lib64:${LOCAL_ASCEND}/driver/lib64:${LOCAL_ASCEND}/opp/op_impl/built-in/ai_core/tbe/op_tiling:${LD_LIBRARY_PATH}
# Environment variables that must be configured
export TBE_IMPL_PATH=${LOCAL_ASCEND}/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe # TBE operator implementation tool path
export ASCEND_OPP_PATH=${LOCAL_ASCEND}/ascend-toolkit/latest/opp # OPP path
export PATH=${LOCAL_ASCEND}/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin/:${PATH} # TBE operator compilation tool path
export PYTHONPATH=${TBE_IMPL_PATH}:${PYTHONPATH} # Python library that TBE implementation depends on

Test the installation
Python code:
import numpy as np
from mindspore import Tensor
import mindspore.ops as ops
import mindspore.context as context
context.set_context(device_target="Ascend")
x = Tensor(np.ones([1,3,3,4]).astype(np.float32))
y = Tensor(np.ones([1,3,3,4]).astype(np.float32))
print(ops.tensor_add(x, y))
The deployment is complete when the following result is printed:
[[[[2. 2. 2. 2.]
   [2. 2. 2. 2.]
   [2. 2. 2. 2.]]
  [[2. 2. 2. 2.]
   [2. 2. 2. 2.]
   [2. 2. 2. 2.]]
  [[2. 2. 2. 2.]
   [2. 2. 2. 2.]
   [2. 2. 2. 2.]]]]

Install mindinsight

Install the whl packages
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./itsdangerous-1.1.0-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./Werkzeug-1.0.1-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./MarkupSafe-1.1.1-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./Jinja2-2.11.3-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./click-7.1.2-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./Flask-1.1.2-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./Flask_Cors-3.0.10-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./yapf-0.31.0-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./future-0.18.2.tar.gz
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./treelib-1.6.1.tar.gz
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./grpcio-1.37.0-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./google_pasta-0.2.0-py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./pytz-2021.1-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./python_dateutil-2.8.1-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./pandas-1.2.3-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./gunicorn-20.1.0.tar.gz
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./marshmallow-3.11.1-py2.py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./threadpoolctl-2.1.0-py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./joblib-1.0.1-py3-none-any.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./scikit_learn-0.24.1-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/mindinsight/Mindinsight# pip3.7.5 install ./mindinsight-1.1.1-cp37-cp37m-linux_aarch64.whl
Note: the packages must be installed in this order.

Configure environment variables
Add the following variable to the configuration file:
PATH=$PATH:/usr/local/python3.7.5/bin/
root@ubuntu:/home/cby# source /etc/profile

Start and use
root@ubuntu:/home/cby# mindinsight start
Workspace: /root/mindinsight
Web address: http://127.0.0.1:8080
service start state: success
This message means the visualization service has started. To reach it from another machine, put a reverse proxy in front of it that listens on 0.0.0.0; a tool such as frp can do this.
In the directory of the Python code whose training has finished, the following command starts mindinsight and shows the training data in that directory. The debugger option accepts False or True.
mindinsight start --summary-base-dir . --port 8080 --enable-debugger True --debugger-port 50051
The following command starts training:
root@ubuntu:/home/cby/lenet/lenet# python3.7.5 lenet.py --device_target=Ascend

Install TensorFlow

Build hdf5
root@ubuntu:/home/cby/Tensorflow/Tensorflow# cd hdf5-1.10.5/
root@ubuntu:/home/cby/Tensorflow/Tensorflow/hdf5-1.10.5# ./configure --prefix=/usr/include/hdf5
root@ubuntu:/home/cby/Tensorflow/Tensorflow/hdf5-1.10.5# make
root@ubuntu:/home/cby/Tensorflow/Tensorflow/hdf5-1.10.5# make install

Configure environment variables and symlinks
export CPATH="/usr/include/hdf5/include/:/usr/include/hdf5/lib/"
root@ubuntu:/home/cby/Tensorflow/Tensorflow/hdf5-1.10.5# ln -s /usr/include/hdf5/lib/libhdf5.so /usr/lib/libhdf5.so
root@ubuntu:/home/cby/Tensorflow/Tensorflow/hdf5-1.10.5# ln -s /usr/include/hdf5/lib/libhdf5_hl.so /usr/lib/libhdf5_hl.so

Install the whl packages
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./Cython-0.29.21-py2.py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./h5py-2.10.0-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./grpcio-1.30.0.tar.gz
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./gast-0.2.2.tar.gz
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./opt_einsum-3.3.0-py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./Keras_Applications-1.0.8-py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./Keras_Preprocessing-1.1.2-py2.py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./astor-0.8.1-py2.py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./typing_extensions-3.7.4.3-py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./zipp-3.4.1-py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./importlib_metadata-3.10.1-py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./Markdown-3.2.2-py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./tensorboard-1.15.0-py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./wrapt-1.12.1.tar.gz
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./tensorflow_estimator-1.15.1-py2.py3-none-any.whl
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./termcolor-1.1.0.tar.gz
root@ubuntu:/home/cby/Tensorflow/Tensorflow# pip3.7.5 install ./tensorflow-1.15.0-cp37-cp37m-linux_aarch64.whl
Note: the packages must be installed in this order.

Install PyTorch
root@ubuntu:/home/cby/pytorch/Pytorch# pip3.7.5 install ./apex-0.1+ascend-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/pytorch/Pytorch# pip3.7.5 install ./torch-1.5.0+ascend.post2-cp37-cp37m-linux_aarch64.whl
root@ubuntu:/home/cby/pytorch/Pytorch# pip3.7.5 install ./future-0.18.2.tar.gz

Software packages for this article
All of the software packages used here can be obtained by following the WeChat official account Linux运维交流社区 and replying "ai".
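The article stresses several times that the package files must be installed in a fixed order. One way to make that order explicit is to keep it in a text file and install line by line, stopping at the first failure. This is a sketch and not the author's method: the function name `install_in_order`, the list file name `order.txt`, and the overridable `PIP` variable are assumptions introduced here; only the `pip3.7.5` command name comes from the article.

```shell
#!/bin/bash
# Hypothetical helper for illustration; not part of the original article.

# The pip command to use. pip3.7.5 is the symlink created in the article;
# it can be overridden, for example PIP=echo for a dry run.
PIP="${PIP:-pip3.7.5}"

# install_in_order LIST_FILE
# Installs each package named in LIST_FILE, one per line, top to bottom.
# Blank lines and lines starting with '#' are skipped.
# Stops and returns 1 at the first package that fails to install.
# The list is read on file descriptor 3 so the pip command keeps its own stdin.
install_in_order() (
  list_file=$1
  while IFS= read -r pkg <&3 || [ -n "$pkg" ]; do
    case "$pkg" in
      ''|'#'*) continue ;;
    esac
    "$PIP" install "$pkg" || { echo "install failed: $pkg" >&2; return 1; }
  done 3< "$list_file"
)
```

For example, with an `order.txt` that lists `./attrs-20.3.0-py2.py3-none-any.whl`, `./numpy-1.17.2-cp37-cp37m-linux_aarch64.whl`, and so on in the order shown above, `install_in_order order.txt` replays the same sequence on another machine.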
621 reads · 0 comments · 0 likes
2021-12-30
Getting the list of China's IP addresses from APNIC
About APNIC

IANA (Internet Assigned Numbers Authority) allocates global IP address blocks to the regional Internet registries, which include:
ARIN (American Registry for Internet Numbers): North America and parts of the Caribbean.
RIPE (Reseaux IP Europeens): Europe, the Middle East, and parts of Central Asia.
APNIC (Asia Pacific Network Information Center): Asia and the Pacific region.
Latin America and Africa, which older descriptions assign to ARIN and RIPE, are now served by LACNIC and AFRINIC.

Getting the APNIC allocation table

APNIC publishes a daily updated table of IPv4, IPv6, and AS number allocations in the Asia-Pacific region:
http://ftp.apnic.net/apnic/stats/apnic/delegated-apnic-latest
The file format and contents are described in:
ftp://ftp.apnic.net/pub/apnic/stats/apnic/README.TXT
From this file we can work out how the IPv4 address space under APNIC is allocated.

Script to extract the IP addresses

#!/bin/bash
# Download the latest APNIC delegation table.
wget -c http://ftp.apnic.net/stats/apnic/delegated-apnic-latest
# Fields are: registry|country|type|start|value|date|status.
# For ipv4, value is the number of addresses, so the prefix length is 32 - log2(value).
# For ipv6, value is already the prefix length.
awk -F '|' '$2=="CN" && $3=="ipv4" {print $4 "/" 32-log($5)/log(2)}' delegated-apnic-latest > ipv4.txt
awk -F '|' '$2=="CN" && $3=="ipv6" {print $4 "/" $5}' delegated-apnic-latest > ipv6.txt
awk -F '|' '$2=="HK" && $3=="ipv4" {print $4 "/" 32-log($5)/log(2)}' delegated-apnic-latest > ipv4-hk.txt
awk -F '|' '$2=="HK" && $3=="ipv6" {print $4 "/" $5}' delegated-apnic-latest > ipv6-hk.txt

Run the script:
[root@cby cby]# ./ip.sh
--2021-04-29 12:17:13-- http://ftp.apnic.net/stats/apnic/delegated-apnic-latest
Resolving ftp.apnic.net (ftp.apnic.net)... 203.119.102.40, 2001:dd8:8:701::40
Connecting to ftp.apnic.net (ftp.apnic.net)|203.119.102.40|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3352151 (3.2M) [text/plain]
Saving to: ‘delegated-apnic-latest’
delegated-apnic-latest 100%[=============================================================>] 3.20M 61.3KB/s in 44s
2021-04-29 12:17:58 (74.0 KB/s) - ‘delegated-apnic-latest’ saved [3352151/3352151]
[root@cby cby]# ls
delegated-apnic-latest index.html ip.sh ipv4-hk.txt ipv4.txt ipv6-hk.txt ipv6.txt

The lists are synchronized every day shortly after midnight (00:10). If you need the IP addresses, they are available at:
http://aliyun.chenby.cn/

Scheduled tasks:
[root@cby cby]# crontab -l
10 0 * * * /www/server/cron/3ab48c27ec99cb9787749c362afae517 >> /www/server/cron/3ab48c27ec99cb9787749c362afae517.log 2>&1
10 0 * * * rm -rf /www/wwwroot/www.chenby.cn/cby/ipv4.txt /www/wwwroot/www.chenby.cn/cby/ipv4-hk.txt /www/wwwroot/www.chenby.cn/cby/ipv6.txt /www/wwwroot/www.chenby.cn/cby/ipv6-hk.txt /www/wwwroot/www.chenby.cn/cby/delegated-apnic-latest
11 0 * * * /www/wwwroot/www.chenby.cn/cby/ip.sh >> /home/ip.txt
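The conversion in the script rests on one detail of the file format: an IPv4 record stores the number of addresses, so the prefix length is 32 minus log2 of that count, while an IPv6 record already stores the prefix length. The sketch below performs the same conversion with integer arithmetic and rejects counts that are not a power of two, since those cannot be written as a single CIDR block. The function names `count_to_prefix` and `record_to_cidr` are hypothetical, and the record in the usage note is only illustrative of the pipe-separated layout.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of the original script.

# count_to_prefix COUNT
# Prints the IPv4 prefix length for a block of COUNT addresses.
# Returns 1 if COUNT is not a positive power of two.
count_to_prefix() (
  count=$1
  case "$count" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$count" -gt 0 ] || return 1
  bits=0
  while [ "$count" -gt 1 ]; do
    # An odd value above 1 means the count is not a power of two.
    [ $(( count % 2 )) -eq 0 ] || return 1
    count=$(( count / 2 ))
    bits=$(( bits + 1 ))
  done
  echo $(( 32 - bits ))
)

# record_to_cidr RECORD
# Converts one "registry|cc|type|start|value|date|status" record to CIDR.
# Runs in a subshell, so the IFS and globbing changes do not leak out.
record_to_cidr() (
  IFS='|'
  set -f        # no filename expansion while splitting
  set -- $1     # split the record into positional fields
  case "$3" in
    ipv4)
      prefix=$(count_to_prefix "$5") || return 1
      echo "$4/$prefix"
      ;;
    ipv6)
      echo "$4/$5"
      ;;
    *)
      return 1
      ;;
  esac
)
```

For instance, `record_to_cidr 'apnic|CN|ipv4|1.0.1.0|256|20110414|allocated'` prints `1.0.1.0/24`, because 256 addresses correspond to 8 host bits.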
418 reads · 0 comments · 0 likes
2021-12-30
Linux file system failure: Input/output error
Here is what happened. When starting an application, an "Input/output error" appeared and the disk and its directories became unusable. After a reboot everything worked again, but the problem came back some time later. After some searching on Google I suspected a disk problem, tried the fix other users suggested, and found that it works. The commands and their output follow.

Listing the directory with ls produces the error:
[root@webc ~]# ls /data/
ls: cannot access /data/: Input/output error
[root@webc ~]#

This is an xfs file system, so the repair uses xfs_repair:
[root@webc ~]# xfs_repair /dev/sdc1
xfs_repair: cannot open /dev/sdc1: Device or resource busy

Don't panic at this point. Unmount the disk first, then repair it:
[root@webc ~]# umount /dev/sdc1
[root@webc ~]# xfs_repair /dev/sdc1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

A second attempt printed the same error, so the repair was run with -L. As the message warns, -L destroys the log and may cause corruption, so try mounting the file system first whenever that is possible.
[root@webc ~]# xfs_repair /dev/sdc1 -L
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
agi unlinked bucket 31 is 7620063 in ag 5 (inode=10745038303)
sb_icount 533632, counted 533568
sb_ifree 617, counted 614
sb_fdblocks 2852137932, counted 2860186916
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
correcting bt key (was 91997, now 92001) in inode 10745038303
                data fork, btree block 1343129285
correcting bt key (was 226254, now 226257) in inode 10745038303
                data fork, btree block 1345535075
correcting bt key (was 241554, now 241557) in inode 10745038303
                data fork, btree block 1345535075
correcting bt key (was 795517, now 795515) in inode 10745038303
                data fork, btree block 1343659983
data fork in regular inode 10745038303 claims used block 1353137709
correcting nextents for inode 10745038303
bad data fork in inode 10745038303
cleared inode 10745038303
        - agno = 6
        - agno = 7
        - agno = 8
correcting nextents for inode 17197661037, was 870903 - counted 870911
        - agno = 9
        - agno = 10
correcting bt key (was 1923723, now 1923730) in inode 21481716216
                data fork, btree block 2687659655
correcting bt key (was 1997785, now 1997794) in inode 21481716216
                data fork, btree block 2687659655
correcting nextents for inode 21481716216, was 918874 - counted 918898
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 4
        - agno = 2
        - agno = 5
        - agno = 6
        - agno = 1
        - agno = 7
        - agno = 9
        - agno = 8
        - agno = 10
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (15:166217) is ahead of log (1:2).
Format log to cycle 18.
done
[root@webc ~]#

Once the repair is finished, mount the disk again and it is usable:
[root@webc ~]# mount /dev/sdc1 /data/

Check that the disk works normally:
[root@webc ~]# cd /data/vm/
[root@webc vm]# ls
CentOS7-Clone-1 CentOS7-Clone-3 CentOS7-Clone-4 CentOS7-Clone-5 CentOS8 Ubuntu

The file system is now repaired.

Note: other file systems are repaired with the fsck command. For example, for an ext4 file system:
fsck -t ext4 -y /dev/sda1
The command differs slightly between file systems, so adapt it as needed.
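The xfs_repair message itself recommends mounting the file system to replay the log before resorting to -L. A cautious sequence along those lines might look like the sketch below. The `run` wrapper, the `DRY_RUN` variable, and the function name `repair_xfs` are assumptions made for illustration and are not from the article; `xfs_repair -n` is the tool's no-modify mode, which only reports problems. The device and mount point in the usage note are the ones from the article.

```shell
#!/bin/bash
# Hypothetical helper for illustration; not part of the original article.

# With DRY_RUN=1 the commands are only printed, prefixed with "+".
DRY_RUN="${DRY_RUN:-0}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# repair_xfs DEVICE MOUNTPOINT
# 1. Unmount the device (a failure is ignored: it may not be mounted).
# 2. Mount and unmount once so the kernel replays the XFS log.
# 3. Run xfs_repair -n to report problems without changing anything.
# 4. Run the real repair, then mount the file system again.
# If the mount in step 2 fails, the function stops with status 1.
# Falling back to "xfs_repair -L" is left as a deliberate manual decision,
# because it destroys the log.
repair_xfs() (
  dev=$1
  mnt=$2
  run umount "$dev" || true
  if run mount "$dev" "$mnt"; then
    run umount "$mnt" || return 1
  else
    echo "mount failed: the log could not be replayed" >&2
    return 1
  fi
  run xfs_repair -n "$dev" || echo "xfs_repair -n reported problems" >&2
  run xfs_repair "$dev" || return 1
  run mount "$dev" "$mnt"
)
```

Calling `DRY_RUN=1 repair_xfs /dev/sdc1 /data` prints the six commands in order without touching the disk, which is a reasonable way to review the plan before running it for real as root.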
940 reads · 0 comments · 0 likes
2021-12-30
Offline installation of MindX DL on the Huawei A800-9000 server
MindX DL (the Ascend deep learning component) is a reference design of deep learning components for Atlas 800 training servers and Atlas 800 inference servers. It provides basic functions such as Ascend AI processor resource management and monitoring, optimized scheduling of Ascend AI processors, and generation of collective communication configurations for distributed training, so that partners can quickly build deep learning platforms.

The operating system is Ubuntu 18.04, and the CPU is Huawei's in-house ARM-architecture processor.

I. Preparation before installation

1. Configure the apt network sources
hello@ubuntu:/etc/apt$ sudo cp sources.list~ sources.list
hello@ubuntu:/etc/apt$ cat sources.list
#
# deb cdrom:[Ubuntu-Server 18.04.5 LTS _Bionic Beaver_ - Release arm64 (20200810)]/ bionic main restricted
#deb cdrom:[Ubuntu-Server 18.04.5 LTS _Bionic Beaver_ - Release arm64 (20200810)]/ bionic main restricted
# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
# newer versions of the distribution.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic main restricted
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic main restricted
## Major bug fix updates produced after the final release of the
## distribution.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates main restricted
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates main restricted
## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team. Also, please note that software in universe WILL NOT receive any
## review or updates from the Ubuntu security team.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic universe
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic universe
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates universe
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates universe
## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team, and may not be under a free licence. Please satisfy yourself as to
## your rights to use the software. Also, please note that software in
## multiverse WILL NOT receive any review or updates from the Ubuntu
## security team.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic multiverse
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic multiverse
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates multiverse
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates multiverse
## N.B. software from this repository may not have been tested as
## extensively as that contained in the main release, although it includes
## newer versions of some applications which may provide useful features.
## Also, please note that software in backports WILL NOT receive any review
## or updates from the Ubuntu security team.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-backports main restricted universe multiverse
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-backports main restricted universe multiverse
## Uncomment the following two lines to add software from Canonical's
## 'partner' repository.
## This software is not part of Ubuntu, but is offered by Canonical and the
## respective vendors as a service to Ubuntu users.
# deb http://archive.canonical.com/ubuntu bionic partner
# deb-src http://archive.canonical.com/ubuntu bionic partner
deb http://ports.ubuntu.com/ubuntu-ports bionic-security main restricted
# deb-src http://ports.ubuntu.com/ubuntu-ports bionic-security main restricted
deb http://ports.ubuntu.com/ubuntu-ports bionic-security universe
# deb-src http://ports.ubuntu.com/ubuntu-ports bionic-security universe
deb http://ports.ubuntu.com/ubuntu-ports bionic-security multiverse
# deb-src http://ports.ubuntu.com/ubuntu-ports bionic-security multiverse

2. Configure the kubernetes network source
root@ubuntu:~/123/offline-pkg-arm64# cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
> deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
> EOF

3. Create a directory and download the base packages
root@ubuntu:~/123# mkdir offline-pkg-arm64
root@ubuntu:~/123# cd offline-pkg-arm64/
root@ubuntu:~/123/offline-pkg-arm64# sudo apt update
root@ubuntu:~/123/offline-pkg-arm64# apt-get download conntrack cri-tools haveged keyutils libhavege1 libltdl7 libnfsidmap2 libtirpc-dev libtirpc1 nfs-common nfs-kernel-server rpcbind socat sshpass
root@ubuntu:~/123/offline-pkg-arm64# wget --no-check-certificate https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/arm64/docker-ce_18.06.3~ce~3-0~ubuntu_arm64.deb
root@ubuntu:~/123/offline-pkg-arm64# apt-get download kubelet=1.17.3-00 kubeadm=1.17.3-00 kubectl=1.17.3-00 kubernetes-cni=0.8.6-00

4. Pull the docker images and save them to files
root@ubuntu:~/123# mkdir docker_images
root@ubuntu:~/123# cd docker_images/
root@ubuntu:~/123/docker_images# docker pull calico/node:v3.11.3
root@ubuntu:~/123/docker_images# docker save -o calico-node_arm64.tar.gz calico/node:v3.11.3
root@ubuntu:~/123/docker_images# docker pull calico/pod2daemon-flexvol:v3.11.3
root@ubuntu:~/123/docker_images# docker save -o calico-pod2daemon-flexvol_arm64.tar.gz calico/pod2daemon-flexvol:v3.11.3
root@ubuntu:~/123/docker_images# docker pull calico/cni:v3.11.3
root@ubuntu:~/123/docker_images# docker save -o calico-cni_arm64.tar.gz calico/cni:v3.11.3
root@ubuntu:~/123/docker_images# docker pull calico/kube-controllers:v3.11.3
root@ubuntu:~/123/docker_images# docker save -o calico-kube-controllers_arm64.tar.gz calico/kube-controllers:v3.11.3
root@ubuntu:~/123/docker_images# docker pull coredns/coredns:1.6.5
root@ubuntu:~/123/docker_images# docker save -o coredns_arm64.tar.gz coredns/coredns:1.6.5
root@ubuntu:~/123/docker_images# docker pull cruse/etcd-arm64:3.4.3-0
root@ubuntu:~/123/docker_images# docker save -o etcd_arm64.tar.gz cruse/etcd-arm64:3.4.3-0
root@ubuntu:~/123/docker_images# docker pull cruse/kube-apiserver-arm64:v1.17.3
root@ubuntu:~/123/docker_images# docker save -o kube-apiserver_arm64.tar.gz cruse/kube-apiserver-arm64:v1.17.3
root@ubuntu:~/123/docker_images# docker pull cruse/kube-controller-manager-arm64:v1.17.3
root@ubuntu:~/123/docker_images# docker save -o kube-controller-manager_arm64.tar.gz cruse/kube-controller-manager-arm64:v1.17.3
root@ubuntu:~/123/docker_images# docker pull cruse/kube-proxy-arm64:v1.17.3-beta.0
root@ubuntu:~/123/docker_images# docker save -o kube-proxy_arm64.tar.gz cruse/kube-proxy-arm64:v1.17.3-beta.0
root@ubuntu:~/123/docker_images# docker pull cruse/kube-scheduler-arm64:v1.17.3-beta.0
root@ubuntu:~/123/docker_images# docker save -o kube-scheduler_arm64.tar.gz cruse/kube-scheduler-arm64:v1.17.3-beta.0
root@ubuntu:~/123/docker_images# docker pull cruse/pause-arm64:3.1
root@ubuntu:~/123/docker_images# docker save -o pause_arm64.tar.gz cruse/pause-arm64:3.1

root@ubuntu:~/123/docker_images# docker login -u xxxxxx -p xxxxxxxx ascendhub.huawei.com

root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/vc-controller-manager_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/vc-scheduler_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/vc-webhook-manager_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/vc-webhook-manager-base_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/hccl-controller_arm64:v20.2.0
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/ascend-k8sdeviceplugin_arm64:v20.2.0
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/cadvisor_arm64:v0.34.0-r40

root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/vc-controller-manager_arm64:v1.0.1-r40 volcanosh/vc-controller-manager:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/vc-scheduler_arm64:v1.0.1-r40 volcanosh/vc-scheduler:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/vc-webhook-manager_arm64:v1.0.1-r40 volcanosh/vc-webhook-manager:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/vc-webhook-manager-base_arm64:v1.0.1-r40 volcanosh/vc-webhook-manager-base:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/hccl-controller_arm64:v20.2.0 hccl-controller:v20.2.0
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/ascend-k8sdeviceplugin_arm64:v20.2.0 ascend-k8sdeviceplugin:v20.2.0
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/cadvisor_arm64:v0.34.0-r40 google/cadvisor:v0.34.0-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/vc-controller-manager_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/vc-scheduler_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/vc-webhook-manager_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/vc-webhook-manager-base_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/hccl-controller_arm64:v20.2.0
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/ascend-k8sdeviceplugin_arm64:v20.2.0
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/cadvisor_arm64:v0.34.0-r40

root@ubuntu:~/123/docker_images# docker save -o Ascend-K8sDevicePlugin-v20.2.0-arm64-Docker.tar.gz ascend-k8sdeviceplugin:v20.2.0
root@ubuntu:~/123/docker_images# docker save -o hccl-controller-v20.2.0-arm64.tar.gz hccl-controller:v20.2.0
root@ubuntu:~/123/docker_images# docker save -o huawei-cadvisor-v0.34.0-r40-arm64.tar.gz google/cadvisor:v0.34.0-r40
root@ubuntu:~/123/docker_images# docker save -o vc-controller-manager-v1.0.1-r40-arm64.tar.gz volcanosh/vc-controller-manager:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker save -o vc-scheduler-v1.0.1-r40-arm64.tar.gz volcanosh/vc-scheduler:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker save -o vc-webhook-manager-base-v1.0.1-r40-arm64.tar.gz volcanosh/vc-webhook-manager-base:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker save -o vc-webhook-manager-v1.0.1-r40-arm64.tar.gz volcanosh/vc-webhook-manager:v1.0.1-r40

Note: some of these images can only be downloaded after you have been granted access on the Huawei Ascend Hub:
https://support.huaweicloud.com/usermanual-mindxdl202/atlasmindx_03_0047.html

5. Directory layout after completion

root@ubuntu:~/123# tree
.
├── docker_images
│   ├── Ascend-K8sDevicePlugin-v20.2.0-arm64-Docker.tar.gz
│   ├── calico-cni_arm64.tar.gz
│   ├── calico-kube-controllers_arm64.tar.gz
│   ├── calico-node_arm64.tar.gz
│   ├── calico-pod2daemon-flexvol_arm64.tar.gz
│   ├── coredns_arm64.tar.gz
│   ├── etcd_arm64.tar.gz
│   ├── hccl-controller-v20.2.0-arm64.tar.gz
│   ├── huawei-cadvisor-v0.34.0-r40-arm64.tar.gz
│   ├── kube-apiserver_arm64.tar.gz
│   ├── kube-controller-manager_arm64.tar.gz
│   ├── kube-proxy_arm64.tar.gz
│   ├── kube-scheduler_arm64.tar.gz
│   ├── pause_arm64.tar.gz
│   ├── vc-controller-manager-v1.0.1-r40-arm64.tar.gz
│   ├── vc-scheduler-v1.0.1-r40-arm64.tar.gz
│   ├── vc-webhook-manager-base-v1.0.1-r40-arm64.tar.gz
│   └── vc-webhook-manager-v1.0.1-r40-arm64.tar.gz
├── offline-pkg-arm64
│   ├── conntrack_1%3a1.4.4+snapshot20161117-6ubuntu2_arm64.deb
│   ├── cri-tools_1.13.0-01_arm64.deb
│   ├── docker-ce_18.06.3~ce~3-0~ubuntu_arm64.deb
│   ├── haveged_1.9.1-6_arm64.deb
│   ├── keyutils_1.5.9-9.2ubuntu2_arm64.deb
│   ├── kubeadm_1.17.3-00_arm64.deb
│   ├── kubectl_1.17.3-00_arm64.deb
│   ├── kubelet_1.17.3-00_arm64.deb
│   ├── kubernetes-cni_0.8.6-00_arm64.deb
│   ├── libhavege1_1.9.1-6_arm64.deb
│   ├── libltdl7_2.4.6-2_arm64.deb
│   ├── libnfsidmap2_0.25-5.1_arm64.deb
│   ├── libtirpc1_0.2.5-1.2ubuntu0.1_arm64.deb
│   ├── libtirpc-dev_0.2.5-1.2ubuntu0.1_arm64.deb
│   ├── nfs-common_1%3a1.3.4-2.1ubuntu5.5_arm64.deb
│   ├── nfs-kernel-server_1%3a1.3.4-2.1ubuntu5.5_arm64.deb
│   ├── rpcbind_0.2.3-0.6ubuntu0.18.04.4_arm64.deb
│   ├── socat_1.7.3.2-2ubuntu2_arm64.deb
│   └── sshpass_1.06-1_arm64.deb
├── offline-pkg-arm64.zip
└── yamls
    ├── ascendplugin-310-v20.2.0.yaml
    ├── ascendplugin-volcano-v20.2.0.yaml
    ├── cadvisor-v0.34.0-r40.yaml
    ├── calico.yaml
    ├── hccl-controller-v20.2.0.yaml
    ├── npu-exporter-v20.2.0.yaml
    └── volcano-v1.0.1-r40.yaml

3 directories, 46 files
root@ubuntu:~/123#

Note: the files in the yamls directory can be downloaded from:
https://gitee.com/ascend/mindxdl-deploy/tree/20201230-V20.2.0/

6. Configure passwordless SSH login

root@ubuntu:~# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:07dTbsAycQqT2w7HdCwjIyJig5T20FQ/eHZGxWg7pbY root@ubuntu
The key's randomart image is:
+---[RSA 2048]----+
| .+... .+. |
|o+ . o .+ + |
|+o+ ...=BoO + |
|...o .o.+/ O |
| S @ + . |
| E + = |
| . o o |
| o |
| |
+----[SHA256]-----+
root@ubuntu:~#
root@ubuntu:~# ssh-copy-id -i 127.0.0.1

7. Install and configure Ansible

root@ubuntu:~# apt install ansible
root@ubuntu:~# vim /etc/ansible/hosts

# File contents:
[all:vars]
# default shared directory, you can change it as yours
nfs_shared_dir=/data/atlas_dls

# NFS service IP
nfs_service_ip=192.168.1.110

# Master IP
master_ip=192.168.1.110

# dls install package dir
dls_root_dir=/root/123

# set proxy
proxy=""

# Command for logging in to the Ascend hub
ascendhub_login_command="login_command"

# Generally, you do not need to change the value or delete it.
ascendhub_prefix="ascendhub.huawei.com/public-ascendhub"

# versions
deviceplugin_version="v20.2.0"
cadvisor_version="v0.34.0-r40"
volcano_version="v1.0.1-r40"
hccl_version="v20.2.0"

[nfs_server]
ubuntu ansible_host=192.168.1.110 ansible_ssh_user="root" ansible_ssh_pass="123123"

[localnode]
ubuntu ansible_host=192.168.1.110 ansible_ssh_user="root" ansible_ssh_pass="123123"

[training_node]
ubuntu ansible_host=192.168.1.110 ansible_ssh_user="root" ansible_ssh_pass="123123"

[inference_node]

[A300T_node]

[arm]
ubuntu ansible_host=192.168.1.110 ansible_ssh_user="root" ansible_ssh_pass="123123"

[x86]

[workers:children]
training_node
inference_node
A300T_node

root@ubuntu:~/mindxdl/deploy/offline/steps# vim /etc/ansible/ansible.cfg

log_path = /var/log/ansible.log
host_key_checking = False
deprecation_warnings = False

Note: parameter descriptions. Fill these in according to your environment:
nfs-host-ip: IP address of the NFS node server, that is, the server's IP address. If NFS is not being installed, set it to an empty string, for example "".
master-host-ip: IP address of the management node server, that is, the server's IP address.
install_dir: directory to which the base packages, image packages, and the yamls folder were uploaded.
proxy_address: proxy address; configure it as needed. If no proxy is required, set it to an empty string, for example "".
login_command: login command used to fetch images from Ascend Hub. It is only needed for online installation, for example "docker login -u xxxxxx@xxxxxx -p xxxxxxxx ascendhub.huawei.com". Do not omit the quotation marks around the command. To obtain it, see steps 1~2 of "获取MindX DL镜像" (Obtaining MindX DL Images) in the Huawei documentation. For offline installation it can be set to an empty string, for example "".
single-node-host-name: the host name of the single node, which you can check with the hostname command.
IP: IP address of the server.
username: user name for logging in to the server. The root user is recommended to avoid permission problems.
passwd: password of the user logging in to the server.

II. One-click installation

root@ubuntu:~/sshpass# apt install sshpass
root@ubuntu:~/mindxdl/deploy/offline/steps# dos2unix *
root@ubuntu:~/mindxdl/deploy/offline/steps# chmod 500 entry.sh
root@ubuntu:~/mindxdl/deploy/offline/steps# bash -x entry.sh

III. Verify the installation

1. Check the Docker information

root@ubuntu:~# docker info
Containers: 35
 Running: 30
 Paused: 0
 Stopped: 5
Images: 18
Server Version: 18.06.3-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: ascend runc
Default Runtime: ascend
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: a592beb5bc4c4092b1b1bac971afed27687340c5
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-112-generic
Operating System: Ubuntu 18.04.5 LTS
OSType: linux
Architecture: aarch64
CPUs: 192
Total Memory: 503.6GiB
Name: ubuntu
ID: MUTU:QOYU:2P6F:P2QB:4JKZ:QNKE:PPMQ:PQLL:3PDG:QEYU:LMDK:KNMF
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 docker.mirrors.ustc.edu.cn
 127.0.0.0/8
Registry Mirrors:
 https://dockerhub.azk8s.cn/
 https://docker.mirrors.ustc.edu.cn/
 http://hub-mirror.c.163.com/
Live Restore Enabled: false

WARNING: No swap limit support

2. Check the pods and the node with kubectl

root@ubuntu:~# kubectl get pod --all-namespaces
NAMESPACE        NAME                                       READY   STATUS      RESTARTS   AGE
cadvisor         cadvisor-nsn4r                             1/1     Running     0          5m23s
default          hccl-controller-645bb466f-5fqq6            1/1     Running     0          5m34s
kube-system      ascend-device-plugin-daemonset-vxj8s       1/1     Running     0          5m23s
kube-system      calico-kube-controllers-8464785d6b-bnjdn   1/1     Running     0          5m50s
kube-system      calico-node-blshl                          1/1     Running     0          5m51s
kube-system      coredns-6955765f44-5jr59                   1/1     Running     0          5m50s
kube-system      coredns-6955765f44-wbzvz                   1/1     Running     0          5m50s
kube-system      etcd-ubuntu                                1/1     Running     0          5m43s
kube-system      kube-apiserver-ubuntu                      1/1     Running     0          5m43s
kube-system      kube-controller-manager-ubuntu             1/1     Running     0          5m43s
kube-system      kube-proxy-b78fm                           1/1     Running     0          5m51s
kube-system      kube-scheduler-ubuntu                      1/1     Running     0          5m43s
volcano-system   volcano-admission-74776688c8-g9p9q         1/1     Running     0          5m31s
volcano-system   volcano-admission-init-sbktn               0/1     Completed   0          5m31s
volcano-system   volcano-controllers-6786db54f-vn797        1/1     Running     0          5m31s
volcano-system   volcano-scheduler-844f9b547b-xxjm7         1/1     Running     0          5m31s
root@ubuntu:~#
root@ubuntu:~# kubectl describe node ubuntu
Name:               ubuntu
Roles:              master,worker
Labels:             accelerator=huawei-Ascend910
                    beta.kubernetes.io/arch=arm64
                    beta.kubernetes.io/os=linux
                    host-arch=huawei-arm
                    kubernetes.io/arch=arm64
                    kubernetes.io/hostname=ubuntu
                    kubernetes.io/os=linux
                    masterselector=dls-master-node
                    node-role.kubernetes.io/master=
                    node-role.kubernetes.io/worker=worker
                    workerselector=dls-worker-node
Annotations:        huawei.com/Ascend910: Ascend910-1,Ascend910-2,Ascend910-3,Ascend910-0
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 192.168.1.110/24
                    projectcalico.org/IPv4IPIPTunnelAddr: 10.30.243.192
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 05 Aug 2021 16:34:33 +0800
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ubuntu
  AcquireTime:     <unset>
  RenewTime:       Thu, 05 Aug 2021 16:41:29 +0800
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Thu, 05 Aug 2021 16:35:06 +0800   Thu, 05 Aug 2021 16:35:06 +0800   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Thu, 05 Aug 2021 16:40:30 +0800   Thu, 05 Aug 2021 16:34:27 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Thu, 05 Aug 2021 16:40:30 +0800   Thu, 05 Aug 2021 16:34:27 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Thu, 05 Aug 2021 16:40:30 +0800   Thu, 05 Aug 2021 16:34:27 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Thu, 05 Aug 2021 16:40:30 +0800   Thu, 05 Aug 2021 16:35:19 +0800   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  192.168.1.110
  Hostname:    ubuntu
Capacity:
  cpu:                   192
  ephemeral-storage:     920422204Ki
  huawei.com/Ascend910:  4
  hugepages-2Mi:         0
  memory:                528101392Ki
  pods:                  110
Allocatable:
  cpu:                   192
  ephemeral-storage:     848261101802
  huawei.com/Ascend910:  4
  hugepages-2Mi:         0
  memory:                527998992Ki
  pods:                  110
System Info:
  Machine ID:                 3996e745414f461b9e0e990f6d0b597e
  System UUID:                CD56756C-607E-BD02-EB11-5292EAFB068C
  Boot ID:                    adb96127-7fdc-4d84-8867-a13005f9b535
  Kernel Version:             4.15.0-112-generic
  OS Image:                   Ubuntu 18.04.5 LTS
  Operating System:           linux
  Architecture:               arm64
  Container Runtime Version:  docker://18.6.3
  Kubelet Version:            v1.17.3
  Kube-Proxy Version:         v1.17.3
PodCIDR:                      10.30.0.0/24
PodCIDRs:                     10.30.0.0/24
Non-terminated Pods:          (15 in total)
  Namespace        Name                                       CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------        ----                                       ------------  ----------  ---------------  -------------  ---
  cadvisor         cadvisor-nsn4r                             500m (0%)     1 (0%)      300Mi (0%)       2000Mi (0%)    6m17s
  default          hccl-controller-645bb466f-5fqq6            500m (0%)     500m (0%)   300Mi (0%)       300Mi (0%)     6m28s
  kube-system      ascend-device-plugin-daemonset-vxj8s       500m (0%)     500m (0%)   500Mi (0%)       500Mi (0%)     6m17s
  kube-system      calico-kube-controllers-8464785d6b-bnjdn   0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m44s
  kube-system      calico-node-blshl                          250m (0%)     0 (0%)      0 (0%)           0 (0%)         6m45s
  kube-system      coredns-6955765f44-5jr59                   100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     6m44s
  kube-system      coredns-6955765f44-wbzvz                   100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     6m44s
  kube-system      etcd-ubuntu                                0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m37s
  kube-system      kube-apiserver-ubuntu                      250m (0%)     0 (0%)      0 (0%)           0 (0%)         6m37s
  kube-system      kube-controller-manager-ubuntu             200m (0%)     0 (0%)      0 (0%)           0 (0%)         6m37s
  kube-system      kube-proxy-b78fm                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m45s
  kube-system      kube-scheduler-ubuntu                      100m (0%)     0 (0%)      0 (0%)           0 (0%)         6m37s
  volcano-system   volcano-admission-74776688c8-g9p9q         500m (0%)     500m (0%)   300Mi (0%)       300Mi (0%)     6m25s
  volcano-system   volcano-controllers-6786db54f-vn797        500m (0%)     500m (0%)   300Mi (0%)       300Mi (0%)     6m25s
  volcano-system   volcano-scheduler-844f9b547b-xxjm7         500m (0%)     500m (0%)   300Mi (0%)       300Mi (0%)     6m25s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource              Requests     Limits
  --------              --------     ------
  cpu                   4 (2%)       3500m (1%)
  memory                2140Mi (0%)  4040Mi (0%)
  ephemeral-storage     0 (0%)       0 (0%)
  huawei.com/Ascend910  0            0
Events:
  Type    Reason                   Age                    From                Message
  ----    ------                   ----                   ----                -------
  Normal  NodeHasSufficientMemory  7m10s (x8 over 7m11s)  kubelet, ubuntu     Node ubuntu status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    7m10s (x7 over 7m11s)  kubelet, ubuntu     Node ubuntu status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     7m10s (x6 over 7m11s)  kubelet, ubuntu     Node ubuntu status is now: NodeHasSufficientPID
  Normal  Starting                 6m37s                  kubelet, ubuntu     Starting kubelet.
  Normal  NodeHasSufficientMemory  6m37s                  kubelet, ubuntu     Node ubuntu status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    6m37s                  kubelet, ubuntu     Node ubuntu status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     6m37s                  kubelet, ubuntu     Node ubuntu status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  6m37s                  kubelet, ubuntu     Updated Node Allocatable limit across pods
  Normal  Starting                 6m33s                  kube-proxy, ubuntu  Starting kube-proxy.
  Normal  NodeReady                6m17s                  kubelet, ubuntu     Node ubuntu status is now: NodeReady
root@ubuntu:~#

Note: the CPU and accelerator card information appears in this part of the output:

Capacity:
  cpu:                   192
  ephemeral-storage:     920422204Ki
  huawei.com/Ascend910:  4
  hugepages-2Mi:         0
  memory:                528101392Ki
  pods:                  110
Allocatable:
  cpu:                   192
  ephemeral-storage:     848261101802
  huawei.com/Ascend910:  4
  hugepages-2Mi:         0
  memory:                527998992Ki
  pods:                  110

For details, see the official Huawei documentation: https://support.huaweicloud.com/mindxdl201/
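The accelerator count is read off the `kubectl describe node` output by eye in the verification step. As a minimal sketch, the same value could be extracted with a small shell helper. The function name `node_resource` is hypothetical (it is not part of kubectl or MindX DL), and the sketch assumes the usual describe layout, where section names such as `Capacity:` start at column 0 and their entries are indented.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a kubectl feature): reads
# `kubectl describe node` output on stdin and prints the value of one
# resource listed under the given top-level section.
node_resource() {
    section="$1"    # e.g. Capacity or Allocatable
    resource="$2"   # e.g. huawei.com/Ascend910
    awk -v section="${section}:" -v resource="${resource}:" '
        # A line that starts at column 0 opens a new top-level section.
        /^[^ \t]/ { in_section = ($1 == section) }
        # Inside the wanted section, print the value of the matching key.
        in_section && $1 == resource { print $2; exit }
    '
}

# Usage on a live cluster (node name "ubuntu" as in this walkthrough):
#   kubectl describe node ubuntu | node_resource Capacity huawei.com/Ascend910
```

On the node shown in this article, both the Capacity and Allocatable sections list 4 for `huawei.com/Ascend910`, so a value lower than expected would suggest the device plugin has not registered every card.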
December 30, 2021