落在人间
2010-03-26 05:48:02 | Category: High-Performance Computing
It is probably easier to define what a "homogeneous" cluster is -- a "heterogeneous" cluster is anything that is not a "homogeneous" cluster.
A homogeneous cluster is one where all the nodes have the same:
The first requirement -- same architecture -- has a bit of leeway. For example, two Pentium III machines with different amounts of RAM or a different CPU speed would still be considered homogeneous. In general, homogeneity is determined by whether the software compiled on one machine can run natively on another. In the case of the same CPU but different amounts of RAM or a different CPU speed, this is most likely true. This is not necessarily true between a Pentium II and a Pentium III, for example.
For example, the following cluster is considered homogeneous:
Note that it wasn't necessary to list the Linux kernel version and/or glibc version because they're all the same by virtue of being the same Linux distribution and version.
The following are some example clusters that are not homogeneous -- they are heterogeneous:
Yes -- that's one of the reasons that LAM/MPI exists.
LAM/MPI will work between just about any flavor of POSIX (with a few restrictions). That is, you can have two completely different machines (e.g., a Sun machine and an Intel-based machine), and LAM will run on both of them. More importantly, you can run a single parallel job that spans both of them.
LAM will transparently do any data conversion necessary.
An important restriction is that LAM does not currently support systems whose datatypes have different sizes. For example, if an integer is 64 bits on one machine and 32 bits on another, LAM's behavior is undefined. LAM also requires that floating point formats be the same: endianness can differ, but all participating machines must use the same general format. For example, older Alpha machines do not adhere to the IEEE floating point standard by default. Such machines can be used in parallel jobs with other similar machines, but using them in a heterogeneous job requires adherence to the IEEE floating point standard so that all nodes in the parallel job understand the same floating point format.
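Before booting a job across unlike machines, it can help to compare the basic data layout of each node. The following is a minimal sketch and not part of LAM: node_layout and sizes_compatible are hypothetical helper names, and the width of long reported by getconf LONG_BIT is used only as a rough proxy for datatype sizes in general.

```shell
# Hypothetical helper: print a one-line description of this node's data layout.
node_layout() {
    long_bits=$(getconf LONG_BIT)
    # Write bytes 01 00 and read them back as one native 16-bit word:
    # 0001 means little-endian, 0100 means big-endian.
    word=$(printf '\001\000' | od -An -tx2 | tr -d ' \n')
    case "$word" in
        0001) order=little ;;
        0100) order=big ;;
        *)    order=unknown ;;
    esac
    echo "long=${long_bits} endian=${order}"
}

# Hypothetical helper: two layouts are usable together only if the size field
# (the first word) matches. Endianness is allowed to differ.
sizes_compatible() {
    [ "${1%% *}" = "${2%% *}" ]
}

node_layout
```

Running node_layout on each kind of node and comparing the results with sizes_compatible gives a quick warning sign. It does not check floating point formats.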
Indeed, what is the Right Thing for an MPI to do in these kinds of situations, anyway? There really is no good answer -- having MPI truncate when 64 bit integers are sent to 32 bit integers is not desirable, nor is having the MPI translate from one floating point format to another (for similar loss of precision reasons).
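To see why truncation is unattractive, consider what a naive 64-to-32-bit conversion does to a large value. This sketch is only an illustration in shell arithmetic, not what LAM does, and it assumes a shell whose arithmetic is at least 64 bits wide.

```shell
# A value that needs more than 32 bits: 2^32 + 5.
value=4294967301

# Keep only the low 32 bits, as a naive 64-to-32-bit conversion would.
truncated=$(( value & 4294967295 ))

echo "sent: $value  received after truncation: $truncated"
```

The receiver ends up with a small, plausible-looking number and no indication that anything was lost.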
Strictly speaking, yes.
BUT different versions of LAM will not work together. In order to successfully lamboot and mpirun, you must use the same version of LAM/MPI on all nodes, regardless of their operating system, architecture, etc.
So a better answer is really: Yes, but don't ever, ever do this.
In general, LAM must be compiled and installed separately for each different kind of node in a heterogeneous cluster.
Other questions in this FAQ discuss how to install LAM across a [homogeneous] cluster -- there are two general schemes:
Both of these methods are possible for heterogeneous clusters as well. Physically installing LAM on each node in the cluster is the safest, least complicated way to do this. However, it is potentially the most labor intensive, and most difficult to maintain over time.
In most cases, there will be multiple nodes of each kind in a heterogeneous cluster. As such, it may be useful to consider a heterogeneous cluster to be a group of homogeneous clusters. So although local policies and requirements may vary, the LAM Team recommends that LAM be installed on a networked filesystem in each homogeneous cluster.
NOTE: There are some scalability issues with using networked filesystems on large clusters. As such, it may not be sufficient or desirable to use the common filesystem model at your site, depending on the size of your cluster and your choice of networked filesystem. YMMV.
For example, consider a cluster of 16 Pentium II nodes running Red Hat 7.0 and a second group of 16 Pentium III nodes running Red Hat 7.1. Both the architecture difference and operating system difference make these sub-clusters heterogeneous.
In the common filesystem model, LAM will need to be installed twice for the heterogeneous cluster described above -- once for the PII/RH7.0 machines, and once for the PIII/RH7.1 machines. Each machine in the cluster will need to mount the appropriate LAM installation, and/or user paths will need to be set on each node to point to the appropriate LAM installation.
The same holds true for more obviously-heterogeneous clusters, such as a group of UltraSparc machines running Solaris and a group of Pentium III machines running some flavor of Linux.
In addition to the normal requirements for lamboot, the following additional requirements must be satisfied:

    /home/lam/sparc-sun-solaris2.8
    /home/lam/linux-redhat7.1
    /home/lam/linux-suse7.2

If /home/lam is NFS mounted on all nodes in the cluster, the user's $PATH must be set to use one of those three trees as appropriate for the kind of node that they are logged in to. This is typically set in the user's dot files (e.g., $HOME/.profile, $HOME/.cshrc, etc.), or in a system-wide default dot file (these vary between different operating systems).
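A dot file can pick the tree automatically. The following is a minimal sketch for a Bourne-style $HOME/.profile, under several assumptions: lam_tree_for is a hypothetical helper name, the Linux distribution is guessed from the presence of /etc/redhat-release or /etc/SuSE-release, and the LAM executables are assumed to live in a bin subdirectory of each tree.

```shell
# Hypothetical helper: map (OS name, distro hint) to one of the three trees.
lam_tree_for() {
    case "$1" in
        SunOS)
            echo /home/lam/sparc-sun-solaris2.8 ;;
        Linux)
            case "$2" in
                redhat) echo /home/lam/linux-redhat7.1 ;;
                suse)   echo /home/lam/linux-suse7.2 ;;
                *)      return 1 ;;
            esac ;;
        *)
            return 1 ;;
    esac
}

# Guess the distribution from its release file (an assumption about the nodes).
distro=unknown
if [ -f /etc/redhat-release ]; then distro=redhat; fi
if [ -f /etc/SuSE-release ]; then distro=suse; fi

# Prepend the matching tree, assuming executables are under <tree>/bin.
# Nodes that match no tree are left with their PATH unchanged.
if tree=$(lam_tree_for "$(uname -s)" "$distro"); then
    PATH="$tree/bin:$PATH"
    export PATH
fi
```

Keeping the mapping in one function means a single dot file can be shared over NFS by every kind of node. csh users would need an equivalent in $HOME/.cshrc.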
There are three cases:

    shell$ mpiexec -arch linux my_mpi_program.linux : \
        -arch solaris my_mpi_program.solaris

LAM will look for Linux architecture nodes in the current universe and launch the executable my_mpi_program.linux. Similarly, LAM will launch the executable my_mpi_program.solaris on all Solaris nodes in the universe. The string after the -arch switch specifies a text string to match from the output of the GNU config.guess script (i.e., the output from laminfo in the architecture line).
The -arch switch to mpiexec can be used in other cases (e.g., absolute path names); this is just one example. See the mpiexec(1) manual page for more details.
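The matching idea can be sketched in a few lines of shell. This is an illustration only, not LAM's implementation: arch_matches is a hypothetical helper, the match is assumed to be a plain substring test, and the two architecture strings are merely typical of config.guess output.

```shell
# Hypothetical helper: succeed if the -arch string ($1) occurs anywhere in the
# node's architecture string ($2).
arch_matches() {
    case "$2" in
        *"$1"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Example architecture strings (assumed values, typical of config.guess).
for node_arch in i686-pc-linux-gnu sparc-sun-solaris2.8; do
    if arch_matches linux "$node_arch"; then
        echo "$node_arch: would run my_mpi_program.linux"
    elif arch_matches solaris "$node_arch"; then
        echo "$node_arch: would run my_mpi_program.solaris"
    fi
done
```

A short string such as linux matches every Linux node regardless of distribution, so a more specific string is needed when distributions require different binaries.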

    sun1
    sun2
    hp1
    hp2
    redhat1
    redhat2
    suse1
    suse2

we can use the following app schema file to launch the "right" copy of foo for each architecture:

    n0-1 /home/jshmo/mpi/sun-sparc-solaris2.6/foo
    n2-3 /home/jshmo/mpi/hppa2.0w-hp-hpux11.00/foo
    n4-5 /home/jshmo/mpi/linux-redhat7.1/foo
    n6-7 /home/jshmo/mpi/linux-suse7.2/foo

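When the per-architecture binaries follow a regular directory layout like the one above, the app schema can be generated rather than typed by hand. A minimal sketch, assuming that layout; make_app_schema is a hypothetical helper name, not a LAM tool.

```shell
# Hypothetical helper: print one app schema line per "node-range arch-dir" pair.
# Usage: make_app_schema BASE_DIR PROGRAM RANGE ARCH_DIR [RANGE ARCH_DIR ...]
make_app_schema() {
    base="$1"
    prog="$2"
    shift 2
    while [ "$#" -ge 2 ]; do
        echo "$1 $base/$2/$prog"
        shift 2
    done
}

make_app_schema /home/jshmo/mpi foo \
    n0-1 sun-sparc-solaris2.6 \
    n2-3 hppa2.0w-hp-hpux11.00 \
    n4-5 linux-redhat7.1 \
    n6-7 linux-suse7.2
```

Redirecting the output to a file gives an app schema with one line per group of nodes.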
Remember, it may be necessary to have different versions of the MPI binary for each OS version as well as each machine architecture. For example, you may need to have separate versions for Solaris 2.5 and 2.6. This is also true when running between different Linux distributions -- as shown in the example above where Red Hat and SuSE are considered different operating systems and therefore have their own copy of foo.
By definition, a mixture of 32 and 64 bit machines is a heterogeneous cluster.
LAM/MPI allows two possibilities for mixing 32 and 64 bit machines in a single parallel job:
This solution works well and avoids many complicated situations which arise out of mixing 32 and 64 bit executables and are outside the scope of MPI (see discussion below).
There is, unfortunately, no good answer to this. Obvious choices include invoking an error or truncating the data, neither of which is attractive. Debugging such applications is non-trivial and therefore this is not the preferred solution.
NOTE: LAM/MPI has not been tested in this kind of configuration! It may work (if the user application stays away from messages with mismatched data sizes), but it may not... Consider yourself warned.