I think this reflects the nature of your data: you can get increasing accuracy at the expense of loss, which is the more informative measure.
By way of explanation, imagine that your data is the stock market and you want to classify up or down days for the purpose of investing. In this scenario not all days are of equal value - a relatively few days will make or break your investing career - and increasing the classification accuracy of the days that have little or no movement is irrelevant to your desired outcome of making a profit. It would be much better to have pretty poor accuracy in aggregate but be highly accurate in predicting the few days when the market makes monster moves.
Other answers explain well how accuracy and loss are not necessarily exactly (inversely) correlated, as loss measures a difference between raw prediction (float) and class (0 or 1), while accuracy measures the difference between thresholded prediction (0 or 1) and class. So if raw predictions change, loss changes but accuracy is more "resilient" as predictions need to go over/under a threshold to actually change accuracy.
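This "resilience" is easy to demonstrate numerically. Below is a minimal NumPy sketch (the prediction values are made up for illustration) where all raw predictions drift toward the decision boundary without crossing it: the loss rises while accuracy does not move.

```python
import numpy as np

def bce(y_true, y_pred):
    """Mean binary cross-entropy between 0/1 labels and raw sigmoid outputs."""
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1 - y_true) * np.log(1 - y_pred))))

def acc(y_true, y_pred, threshold=0.5):
    """Accuracy after thresholding the raw predictions at 0.5."""
    return float(np.mean((y_pred >= threshold).astype(int) == y_true))

y = np.array([1, 1, 1, 1])                   # four positive examples
before = np.array([0.60, 0.70, 0.80, 0.90])  # raw outputs at epoch t
after = np.array([0.55, 0.65, 0.75, 0.85])   # all drift down, none cross 0.5

# Loss increases because every raw prediction got worse, but accuracy
# stays at 1.0 because no prediction crossed the 0.5 threshold.
print(bce(y, before), bce(y, after))
print(acc(y, before), acc(y, after))
```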
However, accuracy and loss intuitively seem to be somewhat (inversely) correlated, as better predictions should lead to lower loss and higher accuracy, and the case of higher loss and higher accuracy shown by OP is surprising. I have myself encountered this case several times, and I present here my conclusions based on the analysis I had conducted at the time. There may be other reasons for OP's case.
Let's consider the case of binary classification, where the task is to predict whether an image is a cat or a horse, and the output of the network is a sigmoid (outputting a float between 0 and 1), where we train the network to output 1 if the image is a cat and 0 otherwise. I believe that in this case, two phenomena are happening at the same time.
Some images with borderline predictions get predicted better and so their output class changes (eg a cat image whose prediction was 0.4 becomes 0.6). This is the classic "loss decreases while accuracy increases" behavior that we expect.
Some images with very bad predictions keep getting worse (eg a cat image whose prediction was 0.2 becomes 0.1). This leads to the less classic "loss increases while accuracy stays the same". Note that when one uses cross-entropy loss for classification, as is usually done, bad predictions are penalized much more strongly than good predictions are rewarded. For a cat image (target 1), the loss is $-\log(prediction)$, so even if many cat images are correctly predicted (low loss), a single badly misclassified cat image will have a high loss, hence "blowing up" your mean loss. See this answer for further illustration of this phenomenon. (Increasing loss with stable accuracy could also be caused by good predictions becoming slightly less confident, but I find that less likely because of this loss "asymmetry".)
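The asymmetry can be checked directly. In the NumPy sketch below (the 0.95 and 0.01 values are illustrative), a single badly misclassified image multiplies the mean loss roughly tenfold while accuracy only drops from 10/10 to 9/10.

```python
import numpy as np

def bce(y_true, y_pred):
    """Mean binary cross-entropy; clip to avoid log(0)."""
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1 - y_true) * np.log(1 - y_pred))))

y = np.ones(10)                   # ten cat images (label 1)
all_good = np.full(10, 0.95)      # every image predicted confidently right
one_bad = all_good.copy()
one_bad[0] = 0.01                 # a single image predicted very wrong

# -log(0.95) is about 0.05 per well-predicted image, but -log(0.01) is
# about 4.6 for the bad one, so one outlier dominates the mean loss.
print(bce(y, all_good))
print(bce(y, one_bad))
```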
So I think that when both accuracy and loss are increasing, the network is starting to overfit, and both phenomena are happening at the same time. The network is starting to learn patterns relevant only to the training set and not useful for generalization, leading to phenomenon 2: some images from the validation set get predicted really wrong, with the effect amplified by the loss "asymmetry". However, at the same time it is still learning some patterns which are useful for generalization (phenomenon 1, "good learning"), as more and more images are being correctly classified.
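The two phenomena can be combined into a single toy epoch-to-epoch transition. In this NumPy sketch (all prediction values invented for illustration), two borderline images cross the threshold while one already-bad image gets much worse, and the result is exactly the surprising pattern: accuracy goes up and loss goes up.

```python
import numpy as np

def bce(y_true, y_pred):
    """Mean binary cross-entropy; clip to avoid log(0)."""
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1 - y_true) * np.log(1 - y_pred))))

def acc(y_true, y_pred):
    """Accuracy after thresholding the raw predictions at 0.5."""
    return float(np.mean((y_pred >= 0.5).astype(int) == y_true))

y = np.array([1, 1, 1, 1, 1])  # five cat images
epoch_t = np.array([0.45, 0.45, 0.20, 0.90, 0.90])
# phenomenon 1: two borderline images cross the threshold (0.45 -> 0.60)
# phenomenon 2: one already-bad image gets worse (0.20 -> 0.05)
epoch_t1 = np.array([0.60, 0.60, 0.05, 0.92, 0.92])

print(acc(y, epoch_t), acc(y, epoch_t1))  # accuracy: 0.4 -> 0.8
print(bce(y, epoch_t), bce(y, epoch_t1))  # loss increases as well
```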
I sadly have no answer for whether or not this "overfitting" is a bad thing in this case: should we stop the learning once the network is starting to learn spurious patterns, even though it's continuing to learn useful ones along the way?
Finally, I think this effect can be further obscured in the case of multi-class classification, where the network at a given epoch might be severely overfit on some classes but still learning on others.
Best Answer
Thanks for all the comments. At first I treated this problem as overfitting and spent a lot of time on methods to address it, such as regularization and augmentation. After trying different methods without improving the validation accuracy, I went through the data. I found a bug in my data preparation that was generating similar tensors under different labels. Once I generated the correct data, the problem was solved to some extent (the validation accuracy increased to around 60%). I then improved the validation accuracy to 90% using the technique @Jonathan mentioned in his comment: adding more "conv2d + maxpool" layers.