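Before installing HPL, the system must already have a compiler, an MPI parallel environment, and either the Basic Linear Algebra Subprograms (BLAS) or the Vector Signal Image Processing Library (VSIPL).
The compiler must support C and Fortran 77. MPICH is the usual choice of MPI, although other implementations such as LAM-MPI also work. HPL needs either a BLAS or a VSIPL library to run, and the performance of that library has a direct effect on the measured Linpack result. Commonly used BLAS libraries include GOTO, Atlas, ACML, ESSL, and MKL.
For the MPI environment I installed Infi-MPI, and for BLAS I chose GotoBLAS. The HPL package hpl.tar.gz is downloaded from www.netlib.org/benchmark/hpl; the latest HPL version at the time of writing is 2.0.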
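Before starting, it can help to confirm that the prerequisites listed above are actually on the PATH. The following is a minimal sketch, not part of HPL or GotoBLAS: the helper name missing_tools is hypothetical, and the tool names gcc, g77, and mpicc are assumptions about a typical GNU plus MPI setup (substitute the compilers you really use, for example the Intel ones).

```shell
#!/bin/sh
# Sketch only: missing_tools is a hypothetical helper, and the tool names
# passed to it below are assumptions about a typical GNU + MPI setup.

# Print each named command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check a C compiler, a Fortran 77 compiler, and the MPI compiler wrapper.
missing=$(missing_tools gcc g77 mpicc)
if [ -n "$missing" ]; then
    echo "not found on PATH:" $missing
else
    echo "compiler and MPI wrappers found"
fi
```

The check only looks for the commands; it does not verify that a BLAS or VSIPL library is installed, which still has to be confirmed by hand.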
Use the root account.
The steps are as follows:
I. Installing GotoBLAS (GOTOBLAS), 2007-07-07 18:29
Download GotoBLAS-1.15.tar.gz
1. cp GotoBLAS-1.15.tar.gz /usr/local/share/
cd /usr/local/share/
tar xzvf GotoBLAS-1.15.tar.gz
cd GotoBLAS
2. If the machine is 32-bit, run
./quickbuild.32bit
If it is 64-bit, run ./quickbuild.64bit
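3. Edit Makefile.rule (see the attachment for details), and change the architecture in getarch.c to match your own machine, i.e. select the configuration that corresponds to your hardware.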
Makefile.rule
#
# Beginning of user configuration
#
# This library's version
REVISION = -r1.26
# Which C compiler do you prefer? Default is gcc.
C_COMPILER = GNU
# C_COMPILER = INTEL
# C_COMPILER = PGI
# Now you don't need Fortran compiler to build library.
# If you don't specify a Fortran compiler, a GNU g77 compatible
# interface will be used.
# F_COMPILER = G77
# F_COMPILER = G95
# F_COMPILER = GFORTRAN
F_COMPILER = INTEL
# F_COMPILER = PGI
# F_COMPILER = PATHSCALE
# F_COMPILER = IBM
# F_COMPILER = COMPAQ
# F_COMPILER = SUN
# F_COMPILER = F2C
# If you need 64bit binary; some architecture can accept both 32bit and
# 64bit binary(X86_64, SPARC, Power/PowerPC or WINDOWS).
#BINARY64 = 1
# If you want to build threaded BLAS
SMP = 1
# You can define maximum number of threads. Basically it should be
# less than actual number of cores. If you don't specify one, it's
# automatically detected by script.
MAX_THREADS = 16
# If you want to use legacy threaded Level 3 implementation.
# Some architecture prefer this algorithm, but it's rare.
# USE_SIMPLE_THREADED_LEVEL3 = 1
# If you want to use GotoBLAS with an accelerator like Cell or GPGPU
# This is experimental and currently won't work well.
# USE_ACCERELATOR = 1
# Define accelerator type (won't work)
# USE_CELL_SPU = 1
# Threads keep working for a while after finishing a BLAS operation
# to reduce thread activate/deactivate overhead. You can determine
# time out to improve performance. This number should be from 4 to 30
# which corresponds to (1 << n) cycles. For example, if you set to 26,
# thread will be running for (1 << 26) cycles(about 25ms on 3.0GHz
# system). Also you can control this number by GOTO_THREAD_TIMEOUT
# CCOMMON_OPT += -DTHREAD_TIMEOUT=26
# If you need cross compiling
# (you have to set architecture manually in getarch.c!)
# Example : HOST ... G5 OSX, TARGET = CORE2 OSX
# CROSS_SUFFIX = i686-apple-darwin8-
# CROSS_VERSION = -4.0.1
# CROSS_BINUTILS =
# If you need Special memory management;
# Using HugeTLB file system(Linux / AIX / Solaris)
# HUGETLB_ALLOCATION = 1
# Using bigphysarea memory instead of normal allocation to get
# physically contiguous memory.
# BIGPHYSAREA_ALLOCATION = 1
# To get maximum performance with minimum impact to the system,
# mixing memory allocation may be worth to try. In this case,
# you have to define one of ALLOC_HUGETLB or BIGPHYSAREA_ALLOCATION.
# Another allocation will be done by mmap or static allocation.
# (Not implemented yet)
# MIXED_MEMORY_ALLOCATION = 1
# Using static allocation instead of dynamic allocation
# You can't use it with ALLOC_HUGETLB
STATIC_ALLOCATION = 1
# If you want to use CPU affinity
# CCOMMON_OPT += -DUSE_CPU_AFFINITY
# If you want to use memory affinity (NUMA)
# You can't use it with ALLOC_STATIC
# NUMA_AFFINITY = 1
# If you want to use interleaved memory allocation.
# Default is local allocation(it only works with NUMA_AFFINITY).
# CCOMMON_OPT += -DINTERLEAVED_MAPPING
# If you want to drive whole 64bit region by BLAS. Not all Fortran
# compiler supports this. It's safe to keep comment it out if you
# are not sure.
# INTERFACE64 = 1
# If you have special compiler to run script to determine architecture.
GETARCH_CC +=
GETARCH_FLAGS +=
#
# End of user configuration
#
ifdef BINARY32
BINARY64 =
endif
ifndef GOTOBLAS_MAKEFILE
export GOTOBLAS_MAKEFILE = 1
MACHINE =
OSNAME =
PGCPATH =
ARCH =
SUBARCH =
ARCHSUBDIR =
CONFIG =
FU =
LIBSUBARCH =
CORE =
endif
ifndef MACHINE
MACHINE := $(shell uname -m | sed -e s/i.86/i386/)
endif
ifndef OSNAME
OSNAME := $(shell uname -s | sed -e s/\-.*//)
endif
ifneq ($(OSNAME), Darwin)
ifneq ($(OSNAME), CYGWIN_NT)
ifeq ($(MACHINE), i386)
BINARY64 =
NATIVEARCH = YES
endif
endif
endif
ifeq ($(MACHINE), ia64)
BINARY64 = YES
NATIVEARCH = YES
endif
ifeq ($(MACHINE), alpha)
BINARY64 = YES
NATIVEARCH = YES
endif
ifeq ($(OSNAME), AIX)
NATIVEARCH = YES
GETARCH_FLAGS += -maix64
endif
ifeq ($(OSNAME), Darwin)
ifndef BINARY64
NATIVEARCH = YES
endif
EXTRALIB += -lSystemStubs
endif
# If you need to access over 4GB chunk on 64bit system.
ifdef BINARY64
CCOMMON_OPT += -D__64BIT__
GETARCH_FLAGS += -D__64BIT__
ifdef INTERFACE64
CCOMMON_OPT += -DUSE64BITINT
endif
endif
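Step 2 and the MACHINE detection in Makefile.rule both depend on the machine type reported by uname -m. As a rough sketch, the choice of quickbuild script could be automated as shown here. The function names normalize_machine and pick_quickbuild are hypothetical, and treating x86_64 as 64-bit is an assumption; the listing itself only forces BINARY64 for ia64 and alpha.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of GotoBLAS.

# Collapse i486/i586/i686 to i386, the same sed rule Makefile.rule applies to MACHINE.
normalize_machine() {
    printf '%s\n' "$1" | sed -e 's/i.86/i386/'
}

# Print the quickbuild script for a given `uname -m` string.
# Assumption: x86_64 is built as 64-bit; ia64 and alpha follow Makefile.rule.
pick_quickbuild() {
    case "$(normalize_machine "$1")" in
        i386)              echo "./quickbuild.32bit" ;;
        x86_64|ia64|alpha) echo "./quickbuild.64bit" ;;
        *)                 echo "unrecognized machine type: $1" >&2
                           return 1 ;;
    esac
}

# Report the suggestion for this host without running anything.
if script=$(pick_quickbuild "$(uname -m)"); then
    echo "suggested build script: $script"
fi
```

The script only prints a suggestion. Machine types it does not recognize are reported on stderr, so the choice can still be made by hand.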