We will count $n\!\times\!3$ grids containing no $2\!\times\!2$ block of $1$s ("three-row square-free grids"):
If we encode the columns by integers using the entries as the digits of a binary number, then for three rows, we have three classes of columns: $A=\{0,1,2,4,5\}$, $B=\{3,6\}$ and $C=\{7\}$, with transfer matrix $M=\Big(\begin{smallmatrix}5&2&1\\5&1&0\\5&0&0\end{smallmatrix}\Big)$.
The number of three-row square-free grids is thus the sum of the entries in the vector $$(5,2,1)\,M^{n-1}.$$
The closed form is complex, involving roots of cubic polynomials.
The first few terms are $8, 57, 417, 3032, 22077, 160697, 1169792, 8515337, 61986457, 451223152$. This is A181246 in OEIS.
Alternatively, if we let $a(z)$, $b(z)$ and $c(z)$ be the (ordinary) generating functions for $n\!\times\!3$ square-free grids where the final column is in classes $A$, $B$ and $C$ respectively, and we let $g_3(z)$ be the OGF for all three-row square-free grids, then solving the equations
$$\begin{array}{rcl}a(z) &=& 5 z + z\:\! (5 a(z) + 5 b(z) + 5 c(z))\\ b(z) &=& 2 z + z\:\! (2 a(z) + b(z))\\c(z) &=& z + z\:\! a(z)\\g_3(z)&=&a(z)+b(z)+c(z)\end{array}$$
gives
$$g_3(z)\;=\;\frac{z\:\! (8 + 9 z - 5 z^2)}{1 - 6 z - 10 z^2 + 5 z^3}.$$
For four-row grids, five classes are required, with transfer matrix $\left(
\begin{smallmatrix}
8 & 4 & 1 & 2 & 1 \\
8 & 2 & 1 & 1 & 0 \\
8 & 4 & 0 & 0 & 0 \\
8 & 2 & 0 & 0 & 0 \\
8 & 0 & 0 & 0 & 0
\end{smallmatrix}
\right)$, giving generating function $\frac{8 z (2+7 z+z^2-8 z^3)}{1-10 z-54 z^2-16 z^3+64 z^4}$.
There are many numerical results in OEIS: three rows, four rows, five rows, six rows, seven rows, eight rows, and nine rows.
Let us call $w$ and $h$ the width and height of the rectangle. If we start by considering perfectly square cells of size $s \times s$, the maximal area that can be covered is given by the product between the maximal number of packable cells $\displaystyle \lfloor{\frac{w}{s}}\rfloor \cdot \lfloor{ \frac{h}{s}}\rfloor$ and the area $s^2$ of each cell. It is not difficult to show that the remaining uncovered area R is given by
$$ R =h \cdot [w\pmod s]+ w \cdot [h \pmod s] - [w \pmod s] \cdot [h \pmod s] $$
For example, covering a $200 \times 50$ rectangle with area $A=10000$ using $7 \times 7$ cells, the maximal number of cells is $\displaystyle \lfloor{\frac{200}{7}}\rfloor \cdot \lfloor{ \frac{50}{7}}\rfloor=28 \cdot 7=196$, and the maximal covered area is $196 \cdot 7^2=9604$. The remaining uncovered area, which in this example is $10000-9604=396$, corresponds to
$$R= 50 \cdot [200\pmod 7]+ 200 \cdot [50 \pmod 7] - [200 \pmod 7] \cdot [50 \pmod 7]\\ =50 \cdot 4+ 200 \cdot 1- 4 \cdot 1=396$$
The formula for $R$ showed above, which for $s$ equal to max(cellMinWidth, cellMinHeight) also reflects the residual uncovered area obtained by applying the solution proposed in the OP, explains why this approach works well when $w$ and $h$ are similar, and fails when the difference between $w$ and $h$ is big. In fact, noting that both mod terms can range between $0$ and $s$, we get that the value of $R$ ranges between $0$ and $s(w+h-s)$. So, for a rectangle with given area $A=w \cdot h$ to be covered with cells of size $s \times s$, the condition that minimizes the sum $w+h$ (and that therefore minimizes the upper bound of $R$) clearly corresponds to the case $w=h$ (this is easily checked by setting $h=A/w$ and noting that $w+h=w+A/w$ has a minimum in $w=\sqrt{A}$). Accordingly, if we vary $w$ and $h$ by keeping their product constant (i.e., if we consider rectangles with constant area and variable proportions), the sum $w+h$ and the upper bound of $R$ progressively increases as the rectangle proportion departs from $1$ and as the difference between $w$ and $h$ becomes larger. This is also visible starting from the same example of a $200 \times 50$ rectangle as above: covering it with $s \times s$ cells, the value of $R$ ranges between $0$ and $s(250-s)$, but the upper bound of $R$ decreases to $s(200-s)$ if we consider a $100 \times 100$ rectangle, and raises to $s(1010-s)$ if we consider a $1000 \times 10$ rectangle (in both cases, despite no changes in the rectangle area). This explains why this method leads to a high probability of having a large uncovered area when the difference between $w$ and $h$ is large.
These considerations suggest that, when choosing the value of cell side $s$ to divide a $w \times h$ rectangle in the "best" way, a good idea could be to test all possible values of $s$, starting from the minimal values allowed, and to choose the value that minimizes $R$ according to the formula above. This analysis can be easily and rapidly obtained by a simple algorithm on a PC. Once the value of $s$ that minimizes $R$ has been found, we can also make a minimal variation in the cell size to achieve a $100\%$ covering, by adding to the cell width and height (until now considered equal in our analysis) a small quantity obtained by dividing the residual uncovered edges among all cells. The formulas for these small final adjustments are $\displaystyle \frac{w \pmod s}{\lfloor {w/s} \rfloor}$ and $\displaystyle \frac{h \pmod s}{\lfloor {h/s} \rfloor}$. For instance, if we find that in a $90 \times 34$ rectangle the best value for $s$ is $11$, we can initially cover the rectangle with a $8 \times 3$ grid of square cells of size $11$. Then, we can take the remaining edges $90 \pmod {11}=2$ and $34 \pmod {11}=1$ and distribute them among all cells, obtaining a $100\%$ covering using cells with width $11 +\frac{2}{8}=11.25$ and height $11+\frac{1}{3}\approx 11.33$. This keeps a nearly square shape for the cells and allows a complete covering.
Lastly, if we want to find a more rapid method to choose a good value of $s$ that avoids the need of testing all its possible values, it could be more convenient to privilege minimization of the mod term that, in the formula for $R$, is multiplied by the longest dimension of the rectangle. Taking again the same example of a $200 \times 50$ rectangle, to divide it in cells of size $s$ it is convenient to choose a value of $s$ that makes $50\pmod s$ as small as possible (privileging this, rather than minimization of $200\pmod s$), since then this mod term has to be multiplied by $200$. Although this method does not guarantee that the best value of $s$ is found, it might be useful to create simplified, faster algorithms to find "good" values of $s$.
Best Answer
Notation
It depends on the context. In general I'd agree, but since the counts you quote doesn't include a separate handling of the $3\times2$ case, in this situation I'd rather assume that $2\times3$ is meant to cover both orientations, both two rows and three columns and two columns and three rows.
Total count
If all you are interested in is the total number of non-square rectangles, then it doesn't matter how you write down each individual count. In fact, I'd not iterate over all possible shapes, but instead use a different approach.
For a grid of $m\times n$ tiles, there are $(m+1)\times(n+1)$ tile corners. If you consider each of them as a possible corner of a rectangle, and define each rectangle by two opposite corners, then you have a total of
$$\bigl((m+1)(n+1)\bigr)^2$$
possible combinations of two points. Obviously, that number is way too large, because we counted some things we shouldn't, and counted others more than once. So let's address those issues.
You don't want either dimension of such a rectangle to be zero. So if you picked the first point, then there are $m+n+1$ possible positions where that second point would be in the same row and/or the same column as the first. Subtract that and you are at
$$(m+1)(n+1)\bigl((m+1)(n+1)-(m+n+1)\bigr)=(m+1)(n+1)mn\;.$$
As Arthur pointed out in a comment, you can think of this as picking the second point from the $m\times n$ possible corner positions which you obtain by removing the “forbidden” row and column. At this point you have two points in distinct rows and columns, but nothing is sorted yet. So every possible rectangle is counted four times: any of its four corners might be the first point, with the opposite corner as the second point. So the number of rectangles is
$$\tfrac14(m+1)(n+1)mn\;.$$
Now you still have to get rid of those squares. How many are there? Let's look for a pattern. There are $m\times n$ squares of size $1$. There are $(m-1)\times(n-1)$ squares of size $2$, and so on. The maximal size such a square can have is the smaller of the dimensions of your tile grid. So there are
$$\sum_{i=1}^{\min(m,n)}(m+1-i)(n+1-i)$$
squares in total. Assuming $m\le n$ this simplifies to
$$\sum_{i=1}^{m}(m+1-i)(n+1-i)=\tfrac16 m(m+1)(3n-m+1)\;.$$
Now take the number of all rectangles, subtract all squares, and you are done. If you expand and then factor the result, you get a total count of
$$\tfrac1{12}m(m+1)(3n^2 + 2m - 3n - 2)= \tfrac1{12}m(m+1)\bigl(3n(n-1) + 2(m-1)\bigr)\;.$$
For $m=n=3$ this indeed yields a count of $22$. For $m=5,n=7$ (remember to ensure $m\le n$) this gives a count of $335$ non-square rectangles. You can verify that count with a brute-force computation.
Bonus question
If you want to, it might be interesting to think about why that formula always results in an integer if $m,n$ are integers.