You'd find out how this works by running your code, but let's run through the rules anyway.
`array[a][b]` means `(array[a])[b]`, i.e. `array` lists the individual `array[a]`s. So depending on the value of $a$, `array[a]` is either $\{0,\,2\}$, $\{1,\,3\}$ or undefined, whence e.g. `array[1][0]` means `{1, 3}[0]`, i.e. $1$.
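As a quick check of that rule, here is a minimal sketch in Python, where a nested list stands in for the array of arrays. The values are the ones from the example above; the variable names `row` and `value` are only for illustration.

```python
# A nested list standing in for a 2-D array; values follow the example above.
array = [[0, 2], [1, 3]]

row = array[1]        # the inner list at position 1
value = array[1][0]   # parsed as (array[1])[0]

print(row)    # [1, 3]
print(value)  # 1
```

Indexing is applied left to right, so the first index always picks the inner list and the second picks an element within it.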
My guess is that any implementation of $x_i^{(j)}$ you'll use from a library, or are expected to write with this book's guidance, treats $x$ as a list of the $x^{(j)}$s, so you'd need `x[j][i]`. But check with an example when you get there. Similarly, $x_{i,\,j}^{(k)}$ would be `x[k][i][j]`.
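To make that guess concrete, here is a small sketch with made-up data. It assumes the convention described above (the superscript picks the outer item, the subscripts pick within it), which you should still verify against whatever library or book code you end up using. The names `x`, `x3`, `component` and `entry` are hypothetical.

```python
# Hypothetical data: three vectors x^(0), x^(1), x^(2), each with two components.
x = [
    [0.5, 1.5],   # x^(0)
    [2.0, 3.0],   # x^(1)
    [4.5, 5.5],   # x^(2)
]

j, i = 2, 1
component = x[j][i]   # x_i^(j): component i of item j
print(component)      # 5.5

# Three indices work the same way: x_{i,j}^(k) is read as x3[k][i][j].
x3 = [
    [[1, 2], [3, 4]],   # k = 0
    [[5, 6], [7, 8]],   # k = 1
]
entry = x3[1][0][1]
print(entry)          # 6
```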
Having said all that, arrays may not be the right approach anyway, if you want very efficient calculations. I'm not an expert on the Java implications (but see here), so I'll talk about more general issues.
In practice, machine learning often relies on a data type other than standard arrays, so we can do calculations faster. The language used for machine learning may therefore be determined by the availability of suitable types. Python is slower than Java ceteris paribus due to being an interpreted language, but is popular in machine learning because of "numpy arrays", which are the basis of scipy, scikit, tensorflow etc. Not that you need Python to take advantage of such techniques: Java has equivalents.
If you ever make use of such software, there are indexing complications. You'll be allowed to rewrite `x[j][i]` as `x[j, i]` and `x[k][i][j]` as `x[k][i, j]`, and I expect you'd be allowed to use `x[k, i, j]` too. But more importantly, the most efficient way to do operations such as matrix multiplication wouldn't be the usual sum-over-a-loop syntax.
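For NumPy specifically, both points can be checked with a short sketch. The data and the helper `matmul_loops` are made up for illustration; `@` is NumPy's built-in matrix product, which delegates the summation to optimised library code rather than a Python loop.

```python
import numpy as np

# Hypothetical 3-D array with shape (2, 3, 4), filled with 0..23.
x = np.arange(24).reshape(2, 3, 4)
k, i, j = 1, 2, 3

chained = x[k][i][j]   # one index at a time
mixed = x[k][i, j]     # pick x[k], then index it with a pair
single = x[k, i, j]    # all three indices at once
print(chained, mixed, single)  # 23 23 23


def matmul_loops(A, B):
    """Matrix product written as the explicit sum over a loop."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must agree"
    C = np.zeros((n, p))
    for r in range(n):
        for c in range(p):
            for s in range(m):
                C[r, c] += A[r, s] * B[s, c]
    return C


A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

product = A @ B  # same result as the loops, computed by the library
print(np.allclose(matmul_loops(A, B), product))  # True
```

On matrices this small the difference is invisible, but on large ones the built-in product is typically far faster than the triple loop.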
Response to a comment, which is too long to be a comment
As for the case $x_{l,\,u}^{(j)}$, see Chapter 6 for enlightenment on that.
In Sec. 6.1's inline equation $a_{l,\,u}=\mathbf{w}_{l,\,u}\mathbf{z}+b_{l,\,u}$, we can rewrite the first term as $\sum_kw_{l,\,u,\,k}z_k$ or $\sum_kw_{k,\,l,\,u}z_k$, so we expect index lists containing `l, u` to place them together. But do we sum over a leftmost or rightmost $k$ index? Comparing two inline expressions in Sec. 6.2.2 answers that. The $t$th item in $\mathbf{X}$, with indexing starting at $1$ instead of $0$, is denoted $\mathbf{x}^t$, and later we see $h_{l,\,u}^t$. It seems we want to place `l, u` last, as per your first guess and my answer.
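The two ways of writing the sum differ only in where the $k$ axis is stored. The following NumPy sketch uses made-up weights to show both layouts giving the same $a_{l,\,u}$; the names `w_last`, `w_first`, `z` and `b` and the sizes are assumptions for illustration, not anything taken from the book.

```python
import numpy as np

# Hypothetical sizes: 2 layers, 3 units per layer, 2 inputs per unit.
L, U, K = 2, 3, 2

w_last = np.arange(L * U * K, dtype=float).reshape(L, U, K)  # indexed [l, u, k]
w_first = np.moveaxis(w_last, -1, 0)                         # same weights, indexed [k, l, u]
z = np.array([1.0, 2.0])                                     # z_k
b = np.zeros((L, U))                                         # b_{l,u}

# Sum over a rightmost k: sum_k w[l, u, k] * z[k]
a_last = w_last @ z + b

# Sum over a leftmost k: sum_k w[k, l, u] * z[k]
a_first = np.tensordot(z, w_first, axes=1) + b

print(a_last.shape)                   # (2, 3)
print(np.allclose(a_last, a_first))   # True
```

Either layout works mathematically; the question in the text is only which one the book's notation corresponds to.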
In the notation $x_{l,\,u}^{(j)}$, this is the usual Western down-and-right reading order. To relate this to the phrase "input feature $j$ of unit $u$ in layer $l$", I can only recommend watching the prepositions: in layer $l$ there is a unit $u$ which has a feature $j$, so `l, u, j` is a more natural ordering than your alternative suggestion of `l, j, u`, but for efficient dot-product calculations it's been changed by a cyclic permutation to `j, l, u`.
Best Answer
I checked the file with my pdf-viewer and got a message that this font cannot be displayed and is replaced. Then $10\%$ looked like $10\bullet\bullet$. It might be that your viewer displays it as $10 H$.