MATLAB: Why am I unable to validate the CCS/HPC Server 2008 configuration in the Parallel Computing Toolbox?

MATLAB Parallel Server

I have MATLAB Parallel Server set up on a Windows cluster running CCS/HPC Server 2008. When I attempt to validate the cluster configuration it fails. How can I resolve this issue?

Best Answer

Several issues can prevent validation of the cluster. Run the tests below to make sure that your configuration is set up properly. If at any point you receive an error message, you can submit a request to Installation support using the link at the bottom of the page. When submitting a request, be sure to include the following:
- Your license number
- The release of MATLAB on the client and the cluster
- The output of your validation (click details to get the full information)
- The results of the tests below
Also when submitting a request please reference Solution 1-BJRNU9.
1) Test the licensing of MATLAB Parallel Server
The first step is to ensure that the licensing for MATLAB Parallel Server works on your cluster. This test also shows whether MATLAB is crashing on startup on your cluster. Go to one of the cluster nodes and open a Windows Command Prompt: click the Start Menu, go to All Programs, then Accessories, and click Command Prompt. In the command prompt, run the following commands:
cd $MATLAB\bin (where $MATLAB is the installation folder for MATLAB on the cluster)
matlab.exe -dmlworker -nodisplay -logfile C:\output.txt -r "ver;exit"
This generates an output.txt file in C:\ that contains the ver output from the cluster node. If the log file contains a network license manager error, licensing is the issue. In that case, look up the license manager error number on the support site and resolve the license error before proceeding.
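The log check in this step can be sketched in MATLAB as follows. This is a minimal sketch, assuming the license failure appears in the log as text of the form "License Manager Error -15"; the helper name findLicenseErrors is hypothetical and not part of any toolbox.

```matlab
% Hypothetical helper: extract license manager error numbers from log text.
% Assumes the error is reported as "License Manager Error <number>".
findLicenseErrors = @(txt) regexp(txt, 'License Manager Error (-?\d+)', 'tokens');

logFile = 'C:\output.txt';   % log file written by the command above

if exist(logFile, 'file')
    hits = findLicenseErrors(fileread(logFile));
    if isempty(hits)
        disp('No license manager error found in the log.');
    else
        % Each hit is a one-element cell holding the error number as text.
        codes = cellfun(@(c) c{1}, hits, 'UniformOutput', false);
        fprintf('License manager error(s): %s\n', sprintf('%s ', codes{:}));
    end
else
    disp('Log file not found; MATLAB may not have started on this node.');
end
```

If an error number is reported, look it up on the support site as described above before moving on to the next test.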
2) Check the releases of MATLAB on the cluster and the client where you validated
If the log file contains the output of the "ver" command, check the release of each product in the list. All products should be at the same release, and that release should match the release installed on the client where you ran the validation. To check the release on the client, run the ver command in the MATLAB Command Window. If the release of MATLAB and Parallel Computing Toolbox on the client does not match the release of MATLAB and MATLAB Parallel Server on the cluster, you will not be able to use this configuration until the installations are at the same release.
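As a rough illustration of the release comparison, the sketch below pulls the release tags (for example R2009b) out of the cluster log and compares them with the client release. It assumes the cluster's output.txt is readable from the client at the path shown, and that each product line of the ver output carries a tag of the form (R2009b); the helper name extractReleases is hypothetical.

```matlab
% Hypothetical helper: list the distinct release tags found in ver output.
extractReleases = @(txt) unique(regexp(txt, 'R\d{4}[ab]', 'match'));

% Release of the client MATLAB, for example 'R2009b'.
clientRelease = ['R' version('-release')];

% Assumption: the cluster log has been copied to, or is reachable at, this path.
logFile = 'C:\output.txt';

if exist(logFile, 'file')
    clusterReleases = extractReleases(fileread(logFile));
    if numel(clusterReleases) == 1 && strcmp(clusterReleases{1}, clientRelease)
        fprintf('Client and cluster are both at %s.\n', clientRelease);
    else
        fprintf('Client release: %s\n', clientRelease);
        fprintf('Release(s) in cluster log: %s\n', sprintf('%s ', clusterReleases{:}));
    end
else
    disp('Cluster log file not found at the assumed path.');
end
```

More than one release tag in the cluster log means the products on the cluster are themselves at mixed releases.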
3) Check to make sure that your configuration meets the scheduler requirements
To use MATLAB Parallel Server with CCS/HPC Server 2008, there are some additional setup requirements. See the scheduler requirements page for more details.
Additionally, this configuration requires the following:
- If your client machine is not on the cluster, you will need to install the Microsoft Compute Cluster Pack on the client.
- This configuration requires that the data for the jobs be stored in a file space shared by the clients and the cluster nodes. When creating the configuration, set the "DataLocation" variable to a path that is accessible to all computers. Example: \\server\share\user\data
- If the cluster nodes have a local installation of MATLAB Parallel Server and it is installed in a path with spaces, such as C:\Program Files, you will need to modify the client configuration's "ClusterMatlabRoot" variable to use the 8.3 short-name format for the path. For example, if MATLAB is installed in C:\Program Files\MATLAB\R2009b, the ClusterMatlabRoot must be set to C:\PROGRA~1\MATLAB\R2009b. If the installation of MATLAB is on a shared server space, this is not an issue.
- In order to use the configuration, each node must have the "CCP_SCHEDULER" environment variable set to point to the head node of the cluster. This is also true for the clients running MATLAB if they are not located in the cluster.
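The requirements in this list can be spot-checked from MATLAB on a client or node. The following is a minimal sketch, not an official validation routine: the two path values are the examples from the text above, the helper name hasSpaces is hypothetical, and the short-name lookup assumes Windows with 8.3 name generation enabled and the folder present on the machine where it runs.

```matlab
% Example values taken from the text above; replace them with your own.
dataLocation      = '\\server\share\user\data';
clusterMatlabRoot = 'C:\Program Files\MATLAB\R2009b';

% Hypothetical helper: true when a path contains whitespace.
hasSpaces = @(p) any(isspace(p));

% 1. DataLocation must be reachable from this machine.
if exist(dataLocation, 'dir') ~= 7
    fprintf('DataLocation is not reachable: %s\n', dataLocation);
end

% 2. A local ClusterMatlabRoot with spaces needs the 8.3 short form.
if hasSpaces(clusterMatlabRoot)
    fprintf('ClusterMatlabRoot contains spaces: %s\n', clusterMatlabRoot);
    if ispc
        % Ask cmd.exe for the short name of the folder (%~sI modifier).
        cmd = ['for %I in ("' clusterMatlabRoot '") do @echo %~sI'];
        [status, shortName] = system(cmd);
        if status == 0
            fprintf('Possible short form: %s\n', strtrim(shortName));
        end
    end
end

% 3. CCP_SCHEDULER must point to the head node.
headNode = getenv('CCP_SCHEDULER');
if isempty(headNode)
    disp('CCP_SCHEDULER is not set on this machine.');
else
    fprintf('CCP_SCHEDULER = %s\n', headNode);
end
```

Run the same checks on a cluster node as well as on the client, since the shared path and the environment variable must be valid on every machine.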
4) Check to ensure you have correctly configured the client configuration
In your client MATLAB, open the Parallel menu and select Manage Configurations. Right-click your ccs/hpcserver configuration and select Properties. You must set the appropriate values for ClusterMatlabRoot (the directory where MATLAB is installed on the cluster) and DataLocation (where the data will be stored; NOTE: this must be accessible at the same path from all computers). You may want to set SchedulerHostname to your head node as well.
For R2009b and higher, make sure you set ClusterVersion to the appropriate version for your cluster as well.
If you have confirmed all of the settings above, do all stages fail during validation, or just the parallel job and MATLAB pool stages? If you are able to pass the Distributed Job phase, the validation may be reporting false errors. To confirm, you can manually validate your cluster as follows:
1. Distributed job:
To run a simple distributed job, run the following:
ccs = findResource('scheduler','configuration','<ConfigurationName>')
Where "ConfigurationName" is the name of the configuration you created
job = createJob(ccs);
createTask(job, @sum, 1, {[1 1]});
createTask(job, @sum, 1, {[2 2]});
createTask(job, @sum, 1, {[3 3]});
submit(job)
waitForState(job, 'finished', 60)
To confirm the job completed, run the following:
results = getAllOutputArguments(job)
If you get the following output, your cluster is configured and operating correctly.
results =
[2]
[4]
[6]
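If the output differs, or waitForState returns before the job finishes, the job and its tasks can be inspected for more detail. The sketch below assumes the same pre-R2012a job API used in the commands above and that the variable job from those commands still exists in the workspace; the helper name resultsMatch is hypothetical.

```matlab
% Hypothetical helper: true when the task outputs equal the expected values.
resultsMatch = @(results, expected) isequal(results(:), expected(:));

% Only inspect the job if it was created by the commands above.
if exist('job', 'var')
    fprintf('Job state: %s\n', get(job, 'State'));

    % Print any error message recorded by the individual tasks.
    tasks = get(job, 'Tasks');
    for k = 1:numel(tasks)
        msg = get(tasks(k), 'ErrorMessage');
        if ~isempty(msg)
            fprintf('Task %d error: %s\n', k, msg);
        end
    end

    % Compare the outputs with the values expected from the three sum tasks.
    if resultsMatch(getAllOutputArguments(job), {2; 4; 6})
        disp('Distributed job returned the expected results.');
    else
        disp('Distributed job results differ from the expected values.');
    end
end
```

A job that stays in the 'queued' or 'pending' state usually points to the scheduler side rather than to MATLAB, so check the job in the cluster's own job manager as well.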
2. Parallel job:
After completing the distributed job, run the following:
pj = createParallelJob(ccs);
createTask(pj, @labindex, 1, {});
set(pj, 'MaximumNumberOfWorkers', 3);
set(pj, 'MinimumNumberOfWorkers', 3);
submit(pj)
waitForState(pj, 'finished', 60)
To confirm the job completed, run the following:
results = getAllOutputArguments(pj)
If you get the following output, your cluster is configured and operating correctly.
results =
[1]
[2]
[3]
3. MATLAB pool job:
To test MATLAB pool or pmode, run the command:
matlabpool open <ConfigName> <#ofLabs>
where "ConfigName" is the name of the configuration and "#ofLabs" is the number of labs (workers) to start on the cluster.
If your prompt is returned, your configuration is working. To close the MATLAB pool, run "matlabpool close".
If the MATLAB pool did not start and you did not receive an error message, try running:
setSchedulerMessageHandler(@disp)
and then try the MATLAB pool commands above. This should capture the error messages and forward them to the MATLAB command window.
If the manual tests passed, your configuration is working and you should be able to submit jobs.
If you are still having an issue, contact Installation support here:
NOTE: Starting in R2019a, the following name changes occurred:
  • MATLAB Distributed Computing Server was renamed to MATLAB Parallel Server 
  • mdce_def was renamed to mjs_def
  • mdce binary was renamed to mjs