Cluster Computing
Overview
The traditional method of frequency domain electromagnetic (EM) analysis on a single computer requires each frequency to be simulated one after another in series. With Sonnet Cluster, the high frequency designer can leverage computing resources composed of multiple computers operating in parallel to shorten the overall simulation time. An N-fold speed increase is possible with a computing environment composed of N computers, each with similar computing speed.
Sonnet Cluster maps the Sonnet EM simulation to a parallel computing environment by intelligently splitting the user-defined frequency sweeps into sub-jobs, which are simulated in parallel on Worker computers. Each sub-job includes only a portion of the overall number of frequencies to be simulated. Sonnet Cluster manages the process, automatically scheduling and assigning the sub-jobs to the available computer resources. As the sub-jobs are completed, Sonnet Cluster gathers the data from each sub-job and re-combines it into the main Sonnet project.
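As a rough illustration of the N-fold figure quoted above, the following sketch estimates the ideal speed-up. It rests on simplifying assumptions that are not part of the Sonnet documentation: every frequency takes the same time to simulate, all Workers are equally fast, and splitting, data transfer, and re-combination cost nothing. The function name `ideal_speedup` is hypothetical.

```python
import math


def ideal_speedup(num_frequencies: int, num_workers: int) -> float:
    """Estimate the ideal speed-up of a frequency sweep spread over Workers.

    Assumption: equal simulation time per frequency and zero overhead.
    """
    if num_frequencies < 1 or num_workers < 1:
        raise ValueError("need at least one frequency and one Worker")
    # The Worker with the most frequencies determines the wall-clock time.
    busiest_worker_load = math.ceil(num_frequencies / num_workers)
    return num_frequencies / busiest_worker_load


print(ideal_speedup(10, 10))  # one frequency per Worker
print(ideal_speedup(10, 3))   # the busiest Worker simulates four frequencies
```

Under these assumptions, the full N-fold gain appears only when the frequencies divide evenly among the Workers; ten frequencies on three Workers give a speed-up of 2.5 rather than 3.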
Requirements
Licensing
Sonnet Cluster requires the following licenses for operation:
- One Sonnet Software Cluster Computing (emcluster) license. This license allows for the splitting, combining, and overall management of sub-jobs in the parallel computing environment. To check whether you have an emcluster license, select Admin > Setup > [Cluster Computing] and click the Test License button.
- One EM solver license per Worker computer. A separate instance of the EM solver runs on each Worker computer, and each instance requires one copy of an EM solver license to run.
Any type of EM solver license may be used for each Worker computer. Any limitations of the license still apply.
Hardware and Operating System
Sonnet Cluster is available on both Windows and Linux operating systems. Complete hardware and operating system requirements may be found at the following Sonnet Software web link:
https://www.sonnetsoftware.com/requirements
Functionality
The Sonnet Cluster system is made up of the following software components:
- Client Application
- Manager Program
- Worker Program
Each of these components runs on one or more computers. A sample Sonnet Cluster system is shown below, with arrows showing the flow of data between each component.
Client Application
The Client Application is run on the end-user's desktop or laptop computer. When a user submits a job to the cluster from a Job Queues tab, the Client Application is automatically launched in the background. The Client Application performs the following functions:
- It initiates communication with the Manager Program.
- It packs the Sonnet project and sends the packed project to the Manager Program.
- It waits until the Manager Program initiates the EM simulation on the cluster.
- It receives simulation data and status messages from the Manager Program as the simulation progresses.
- It supports disconnect and reconnect from/to the Manager Program.
Manager Program
The Manager Program may be run on a Worker computer, or it may be run on an independent computer. This program performs the following functions:
- It receives packed projects submitted by one or multiple users through their Client Applications.
- It manages the cluster's job queue. Jobs are simulated in the order they are received by the Manager Program.
- It monitors the status of all Worker Programs assigned to the Manager Program.
- When a Worker Program becomes idle, the Manager Program initiates simulation of the next job in the queue on the idle Worker Program.
- It splits the next job into sub-jobs, with each sub-job being run on one idle Worker Program. See Frequency Splitting below for details on how jobs are split into sub-jobs.
- It receives simulation data and status messages from Worker Programs running the sub-jobs and forwards these to the appropriate Client Application.
- It combines simulation data from Worker Programs and then launches a final simulation on one Worker Program to complete the simulation of an ABS sweep if necessary.
- It maintains simulation data for each job if the user chooses to disconnect the Client Application from the Manager Program. It sends simulation data to the Client Application when the user subsequently chooses to reconnect.
Worker Program
A separate Worker Program runs an EM solver on each Worker computer. The Worker Program performs the following functions:
- It reports its status to the Manager Program, indicating whether the Worker computer is idle, running a job, or down.
- It receives new sub-jobs from the Manager Program which need to be simulated.
- It launches EM solver simulations on sub-jobs.
- It receives simulation data and status messages from the EM solver while a sub-job is simulated and forwards these to the Manager Program.
Installation and Configuration
Sonnet Installation
A full Sonnet installation is required on each Client, Manager, and Worker computer. There are no restrictions on the combinations of hardware or operating systems that may be used in the Sonnet Cluster system. For example, the Client computers could run on a Windows operating system while the Manager and Worker computers run on a Linux operating system. Furthermore, while it is recommended that all Worker computers use identical hardware and operating systems for maximum efficiency, this is not a requirement.
Sonnet utilizes Reprise licensing. All Client, Manager, and Worker computers must have an open socket communication channel to the Reprise license manager. Please refer to Licensing for details.
Sonnet Cluster Configuration
Once Sonnet has been installed on the Worker, Manager, and Client computers, some configuration is required on each computer to utilize Sonnet Cluster. We recommend configuring the computers in the order given below.
Configure Worker Computers
To configure a Worker computer for Sonnet Cluster, follow these steps:
Open Sonnet on the Worker computer.
Select Admin > Setup > [Cluster Computing].
Enable the Use this computer as a Worker checkbox.
Click the Worker tab.
In the My Manager sub-tab, enter the hostname or IP address of the Manager computer.
Optionally, click the Advanced sub-tab.
The Advanced sub-tab allows you to change the port number and/or the data folder.
Port Number: This is the port number for communication between the Manager and Worker. The default port number is 56400. You may specify an alternate port number if you wish.
Remember this port number because it will be used later when setting up the Manager computer.
Data Folder: This folder is the temporary location where the Sonnet simulation of a sub-job will occur on the Worker. You may specify an alternative folder if you wish.
Click the Status tab.
To start the Worker Program, click the Start button.
The status of the Worker Program is displayed in the Status tab.
An open socket communication between the Manager and Worker must be allowed on the specified port number. This may require configuration of your firewall to allow communication on this port number.
Configure Manager Computer
The required steps for configuring the Manager are as follows:
Open Sonnet on the Manager computer.
Select Admin > Setup > [Cluster Computing].
Enable the Use this computer as a Manager checkbox.
Click the Manager tab.
Optionally, click on the Advanced sub-tab.
The Advanced sub-tab allows you to change the port number and/or the data folder.
Port Number: This is the port number for communication between the Manager and the Client Application. It must differ from the port number used when configuring your Worker computers. The default port number is 56300. You may specify an alternate port number if you wish.
Remember this port number because it will be used later when setting up the Client computers.
Data Folder: This folder is the temporary location where the Sonnet simulation results from all Workers will be combined by the Manager. You may specify an alternative folder if you wish.
Click the My Workers sub-tab.
Here you will assign pre-configured Workers to the Manager (see Configure Worker Computers for instructions for configuring your Worker Computers).
Click the Add Worker button.
A new Worker entry will be created.
Enter the hostname or IP address of a Worker computer in the Hostname/IP column.
In the Port column, enter the port number.
This is the port number used for communication between the Manager and Worker. The default port number of 56400 is pre-filled in this new entry. If you used a different port number when configuring your Worker Computer, enter that port number.
Click the Refresh button.
The present status of the Worker computer will be displayed.
Continue to assign the remaining Workers to the Manager using the procedure in the previous steps.
Click the Jobs sub-tab.
Click the Start button.
This starts the Manager Program. The status of the Manager Program will be displayed at the bottom of the Jobs sub-tab.
Open socket communications between the Manager and Client Application and between the Manager and Worker must be allowed on the specified port numbers. This may require configuration of your firewall to allow communication on these port numbers.
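If a connection fails, a quick way to see whether a port is reachable from another computer is a plain socket test. The sketch below is not a Sonnet tool; it assumes the cluster ports accept ordinary TCP connections, and the host addresses are placeholders to be replaced with your own Manager and Worker computers. Only the default port numbers 56300 and 56400 come from the text above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder addresses (assumption): substitute your own computers.
    checks = [
        ("127.0.0.1", 56300),  # Manager port used by Client Applications
        ("127.0.0.1", 56400),  # Worker port used by the Manager
    ]
    for host, port in checks:
        state = "reachable" if port_is_open(host, port) else "blocked or not listening"
        print(f"{host}:{port} is {state}")
```

A "blocked or not listening" result does not distinguish between a firewall rule and a program that has not been started, so check the Status tab on the target computer as well.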
Configure Client Application
The required steps for configuring a Client Application are as follows:
Open Sonnet on the Client computer.
Select Admin > Setup > [Em Server List].
Click the Add Server button.
In the Server column, enter the hostname or IP address of the Manager computer.
In the Port column, enter the port number.
This is the port number used for communication between the Client Application and Manager Program. The default port number of 56300 is pre-filled in this new entry. If you used a different port number when configuring your Manager Computer, enter that port number.
Press the Refresh Statuses button to view the status of the Manager Program.
Sonnet Cluster Management
You may wish to stop or start the Worker and/or Manager Programs for various reasons, such as performing maintenance on that computer. The following procedures may be used to stop and start the Worker and/or Manager Programs.
Stopping the Manager Program
The procedure for stopping the Manager Program is as follows:
Launch Sonnet from the Manager computer.
Select Admin > Setup > [Cluster Computing].
Click the Manager sub-tab.
Optionally, click the Block button and wait for all jobs to complete.
To facilitate the maintenance process, Sonnet provides the capability to block new, incoming jobs from being added to the cluster queue, while still allowing the jobs which are already in the queue to be completed. With this feature, an administrator can block new jobs from being added to the queue, wait for the jobs in the queue to be completed, take down the cluster system, and then perform the necessary maintenance.
Press the Stop button.
After a few moments, the Manager Program stops.
Stopping a Worker Program
The procedure for stopping a Worker Program is as follows:
Launch Sonnet from the Worker computer.
Select Admin > Setup > [Cluster Computing].
Click the Worker sub-tab.
Press the Stop button.
After a few moments, the Worker Program stops.
Starting a Manager or Worker Program
The procedure for starting the Manager and Worker Programs is as follows:
Launch Sonnet from the Manager or Worker computer.
Select Admin > Setup > [Cluster Computing].
Click the Manager or Worker sub-tab as appropriate.
Press the Start button to start the process.
For the Manager Program, optionally click the Unblock button.
If you previously used the Block button to block new incoming jobs, you should click the Unblock button to allow new incoming jobs.
Reboot and Logout Issues
By default, if you reboot your computer, the Manager or Worker process will not restart automatically. This is a common scenario with Windows computers when automatic updates cause the computer to reboot. In addition, if you log out, the process could stop. Therefore, you may wish to have the process automatically start when your computer boots. The procedure for doing this is dependent on your operating system and is explained in the topic Running a Sonnet Program at Boot Up.
Frequency Splitting
Generally, there are two classes of frequency sweeps which the user may define within a Sonnet project: discrete frequency sweeps and Adaptive frequency sweeps. Each class of frequency sweep is described below along with a corresponding algorithm which Sonnet Cluster employs to split the sweep into sub-jobs.
Discrete Frequency Sweeps
Discrete frequency sweeps are those sweeps for which the EM solver performs a full electromagnetic simulation at every frequency included in the sweep. An example of a discrete sweep is a Linear sweep from 1.0 GHz to 10.0 GHz with a step size of 1.0 GHz. In this example, an electromagnetic analysis is performed at ten different frequencies. Sonnet Cluster will employ the following algorithm to simulate these ten frequencies on the Worker computers:
- Determine the number of Worker computers which are available. To be available, the Worker Program must be running and must also be in an Idle state.
- Divide the discrete frequencies between the available Workers. For the example of ten discrete frequencies, if there are ten or more available Workers, then the Manager will choose ten Workers and assign one discrete frequency to each of those Workers. If there are fewer than ten available Workers, the Manager will distribute the discrete frequencies as evenly as possible among the available Workers. In this example, if there are three available Workers, one Worker will get four discrete frequencies and the other two Workers will get three frequencies each.
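The even distribution described above can be sketched as follows. This is an illustration only, not Sonnet's implementation: the function name `split_frequencies` is hypothetical, and the choice to hand each Worker a contiguous block of frequencies is an assumption, since the text specifies only how many frequencies each Worker receives.

```python
def split_frequencies(frequencies, num_workers):
    """Split frequencies into at most num_workers groups of near-equal size."""
    if num_workers < 1:
        raise ValueError("at least one available Worker is required")
    # Never use more Workers than there are frequencies.
    groups = min(num_workers, len(frequencies))
    if groups == 0:
        return []
    base, extra = divmod(len(frequencies), groups)
    sub_jobs, start = [], 0
    for i in range(groups):
        # The first `extra` Workers each take one additional frequency.
        size = base + (1 if i < extra else 0)
        sub_jobs.append(frequencies[start:start + size])
        start += size
    return sub_jobs


sweep_ghz = [float(f) for f in range(1, 11)]  # 1.0 to 10.0 GHz, 1.0 GHz step
for worker, sub_job in enumerate(split_frequencies(sweep_ghz, 3), start=1):
    print(f"Worker {worker}: {sub_job}")
```

With three Workers the sub-jobs contain four, three, and three frequencies, matching the example in the text; with ten or more Workers each sub-job contains a single frequency.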
Adaptive Frequency Sweeps
Adaptive frequency sweeps are those sweeps for which the EM solver performs a full electromagnetic simulation at selected frequencies, then fits those selected frequencies with a rational polynomial function to obtain interpolated results at the remaining frequencies. An example of an Adaptive sweep is an ABS sweep from 10.0 GHz to 70.0 GHz, with a resolution of 0.20 GHz. Sonnet Cluster will employ the following algorithm to simulate this ABS sweep:
- Split the ABS band into uniformly distributed frequencies. By default, the ABS sweeps are split into seven uniformly distributed frequencies. For the above example, these seven frequencies would be 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, and 70.0 GHz.
- Determine the number of Worker computers which are available. To be available, the Worker Program must be running and must also be in an Idle state.
- Divide the seven uniform frequencies between the available Workers. For this example, if there are seven or more available Workers, then the Manager will choose seven Workers and assign one frequency to each of those Workers. If there are fewer than seven available Workers, the Manager will distribute the frequencies as evenly as possible among the available Workers. In this example, if there are three available Workers, one Worker will get three frequencies and the other two Workers will get two frequencies each.
- When all seven uniformly distributed frequencies have been completed, combine the results back into a single project.
- Launch an Adaptive simulation on the project with the combined results. Sonnet will automatically incorporate the results from the seven initial frequencies into the Adaptive sweep when it computes the rational polynomial fit. In most cases, the seven uniformly distributed frequencies are sufficient to obtain convergence. However, if the solution has not converged, the EM solver will continue with the Adaptive sweep, selectively choosing one or more additional frequencies for a full electromagnetic simulation, until the required level of accuracy is obtained.
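The first steps of the algorithm above (choosing the uniformly distributed frequencies and dividing them among Workers) can be sketched as follows. The function names are hypothetical, the contiguous grouping is an assumption, and the final Adaptive fit performed by the EM solver is not modeled.

```python
def uniform_seed_frequencies(start_ghz, stop_ghz, count=7):
    """Return `count` evenly spaced frequencies across the band, endpoints included."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (stop_ghz - start_ghz) / (count - 1)
    return [start_ghz + i * step for i in range(count)]


def split_evenly(items, num_workers):
    """Split items into at most num_workers contiguous groups of near-equal size."""
    groups = min(num_workers, len(items))
    if groups < 1:
        return []
    base, extra = divmod(len(items), groups)
    # Group boundaries: the first `extra` groups hold one additional item.
    bounds = [i * base + min(i, extra) for i in range(groups + 1)]
    return [items[bounds[i]:bounds[i + 1]] for i in range(groups)]


seeds = uniform_seed_frequencies(10.0, 70.0)  # ABS band from the example
print(seeds)
for worker, sub_job in enumerate(split_evenly(seeds, 3), start=1):
    print(f"Worker {worker}: {sub_job}")
```

For the 10.0 GHz to 70.0 GHz example this yields the seven frequencies listed in the text, and three Workers receive three, two, and two frequencies respectively.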
FAQ
How do I submit a job to the cluster?
If you have not already done so, configure the Client Application on your computer. Then, submit your job to the cluster queue in the same way you would submit a job to any queue (see Analyzing Projects).
Both the Worker and Manager Programs store their intermediate results in temporary folders. The location of these folders may be displayed by using the following procedures:
Worker: On the Worker computer, select Admin > Setup > [Cluster Computing], click the Worker tab and then click the Advanced sub-tab. The temporary folder is shown in the Data Folder section.
Manager: On the Manager computer, select Admin > Setup > [Cluster Computing], click the Manager tab and then click the Advanced sub-tab. The temporary folder is shown in the Data Folder section.
Do I have a Sonnet Cluster license?
See Licensing.
No. All Workers must be assigned to the Manager prior to starting the Manager. If the Manager is already running, stop it, assign new Workers, and then restart the Manager.
Why did the Test Connection on my Worker fail?
The Manager Program is not running, or the communication port between the Manager and Worker is blocked by a firewall.
Yes. However, please be aware that EM simulations running on the Worker may require substantial system resources in terms of CPU and RAM. This can impact the performance of the Manager. Our recommendation is to dedicate an independent computer for the Manager. Since EM simulations will not be run on a dedicated Manager computer, the system requirements for that computer are much lower.
Yes, but the Manager and Worker computers should be stable, reliable systems that have maximum up-time to ensure that the cluster is accessible.
Yes, it is perfectly fine to run both managers on the same computer. However, it is not a requirement.