MATLAB: Access HDFS from Matlab

access, hadoop, hdfs

Hi
We have installed Hadoop on two Linux (Ubuntu) machines (2 DataNodes / 1 NameNode). Now we want to access the data from a third computer, a Windows machine on which MATLAB R2014b is installed.
We have two questions:
1. How should we specify the environment variable (HADOOP_PREFIX) on our Windows machine?
2. Do we need to install Hadoop on our Windows machine?
Thanks for your support.

Best Answer

Hi Kalsi,
Thanks for your feedback. Ludwig and I are currently working on this project. The problem was that the NameNode address (fs.default.name) in core-site.xml still referred to the local IP address, whereas MATLAB needs an address it can actually reach in order to locate the HDFS file system. After changing fs.default.name to the public IP address, MATLAB is now able to connect to the HDFS storage.
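For illustration, the changed entry in core-site.xml might look roughly like the fragment below. This is only a sketch: the address 203.0.113.10 is a placeholder standing in for the public IP, and port 9000 is an assumed value, so both need to be replaced with whatever your NameNode really uses. On Hadoop 2.x and later the same setting is named fs.defaultFS, and fs.default.name is kept as a deprecated alias.

```xml
<configuration>
  <property>
    <!-- NameNode address that clients such as MATLAB connect to -->
    <name>fs.default.name</name>
    <!-- Placeholder public IP and assumed port; substitute your own values -->
    <value>hdfs://203.0.113.10:9000</value>
  </property>
</configuration>
```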
The MATLAB-Hadoop configuration we are developing now consists of three computers connected to each other over a private network. All three computers run Ubuntu with Hadoop installed. One of them has two network cards, one for the local connection and one for the public connection.
The public network card is used by the client computer to access the Hadoop cluster. The problem is that when we change fs.default.name (the NameNode address) to the public IP address, Hadoop cannot start the other two DataNodes, since they still refer to the NameNode's local IP address. I know this is not a MATLAB-related problem, but do you know how to configure it correctly?
Thanks in advance,