I think the key is the "unexpected" qualifier in your graph. In order to detect the unexpected you need to have an idea of what's expected.
I would start with a simple time series model such as AR(p) or ARMA(p,q). Fit it to the data and add seasonality as appropriate. For instance, your SAR(1)(24) model could be $y_{t}=c+\phi y_{t-1}+\Phi_{24}y_{t-24}+\Phi_{25}y_{t-25}+\varepsilon_t$, where $t$ is time in hours (in the strictly multiplicative form, $\Phi_{25}=-\phi\Phi_{24}$). So, you'd be predicting the graph for the next hour. Whenever the prediction error $e_t=y_t-\hat y_t$ is "too big" you throw an alert.
When you estimate the model you'll get the standard deviation $\sigma_\varepsilon$ of the error $\varepsilon_t$. Depending on your distributional assumptions, such as normality, you can set the threshold based on probability: under normality $|e_t|<3\sigma_\varepsilon$ holds about 99.7% of the time, so you would alert when $|e_t|>3\sigma_\varepsilon$, or, for a one-sided test, when $e_t>3\sigma_\varepsilon$.
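As a rough sketch of the forecast-and-threshold idea above: the code below fits the lag-1/24/25 equation by ordinary least squares (a simplification, not a full SARIMA estimator), takes $\sigma_\varepsilon$ from the residuals, and applies the one-sided $3\sigma_\varepsilon$ rule. The data are simulated, and names such as `fit_seasonal_ar`, `check_new_value` and `expected_visits` are made up for illustration.

```python
import math
import random

LAGS = (1, 24, 25)  # lags of the seasonal AR equation above (hourly data)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def design_row(y, t):
    """Regressors for time t: intercept plus the lagged values."""
    return [1.0] + [y[t - k] for k in LAGS]

def fit_seasonal_ar(y):
    """Least-squares fit; returns (coefficients, residual standard deviation)."""
    start = max(LAGS)
    rows = [design_row(y, t) for t in range(start, len(y))]
    targets = y[start:]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * v for r, v in zip(rows, targets)) for i in range(p)]
    beta = solve_linear(XtX, Xty)
    resid = [v - sum(b * x for b, x in zip(beta, r)) for r, v in zip(rows, targets)]
    sigma = math.sqrt(sum(e * e for e in resid) / (len(resid) - p))
    return beta, sigma

def check_new_value(history, y_new, beta, sigma, k=3.0):
    """One-sided rule: alert when e_t = y_t - forecast exceeds k * sigma."""
    series = list(history) + [y_new]
    forecast = sum(b * x for b, x in zip(beta, design_row(series, len(series) - 1)))
    return (y_new - forecast) > k * sigma, forecast

def expected_visits(t):
    """Hypothetical daily cycle used only to simulate data."""
    return 100 + 50 * math.sin(2 * math.pi * t / 24)

# Simulated data: four weeks of hourly visitor counts with Gaussian noise.
rng = random.Random(0)
y = [expected_visits(t) + rng.gauss(0, 5) for t in range(24 * 28)]
beta, sigma = fit_seasonal_ar(y)
alert, forecast = check_new_value(y, expected_visits(len(y)) + 60, beta, sigma)
print(alert, round(forecast, 1), round(sigma, 1))
```

In practice you would refit periodically and probably use a proper ARMA/SARIMA routine from a statistics package; the point here is only the structure: forecast one step ahead, compare with the actual value, and scale the error by $\sigma_\varepsilon$.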
The number of visitors is probably quite persistent, but super seasonal. It might work better to try seasonal dummies instead of the multiplicative seasonality; then you'd have an ARMAX model, where X stands for exogenous variables, which could be anything like a holiday dummy, hour dummies, weekend dummies, etc.
A few years ago my team implemented an impulse detection algorithm in a Holt-Winters (HW) context, this time with strong seasonality and no trend.
The main idea was to look for an unusual difference between prediction at time $t$ and real value: an outlier that goes several times beyond the std. deviation of the noise (the std. deviation being estimated from the past errors).
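To make that idea concrete, here is a minimal sketch, and not the algorithm my team actually built: additive Holt-Winters smoothing with a level and a seasonal term (no trend), where a point is flagged when its one-step-ahead error exceeds `k` times a running standard deviation of past errors. The function name `detect_impulses`, the parameter values and the simulated data are all assumptions. Flagged points are simply skipped in the update, and the sketch makes no attempt to correct for an impulse that has partly leaked into the smoothed state before being flagged.

```python
import math
import random

def detect_impulses(y, period=24, alpha=0.2, gamma=0.2, k=5.0,
                    warmup_periods=7, var_decay=0.05):
    """Return indices t where |y[t] - forecast| > k * (std of past errors)."""
    n_init = period * warmup_periods
    init = y[:n_init]
    level = sum(init) / n_init
    # Seasonal term per phase: mean of that phase minus the overall mean.
    season = [sum(init[p::period]) / warmup_periods - level for p in range(period)]
    # Starting error variance taken from the warm-up residuals.
    var = sum((v - level - season[i % period]) ** 2
              for i, v in enumerate(init)) / n_init
    flagged = []
    for t in range(n_init, len(y)):
        p = t % period
        err = y[t] - (level + season[p])   # one-step-ahead prediction error
        if abs(err) > k * math.sqrt(var):
            flagged.append(t)
            continue                       # keep the outlier out of the state
        new_level = alpha * (y[t] - season[p]) + (1 - alpha) * level
        season[p] = gamma * (y[t] - new_level) + (1 - gamma) * season[p]
        level = new_level
        # Exponentially weighted estimate of the error variance.
        var = (1 - var_decay) * var + var_decay * err ** 2
    return flagged

# Simulated data: three weeks of hourly values with one injected impulse.
rng = random.Random(1)
y = [100 + 50 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 5)
     for t in range(24 * 21)]
spike_at = 24 * 14 + 5
y[spike_at] += 80
print(detect_impulses(y))
```

Skipping the update on flagged points is the crudest possible way to protect the state; it only works when the impulse crosses the threshold on its very first point.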
This article was our starting point: http://www.jmlr.org/papers/volume9/li08a/li08a.pdf. It is worth reading. But soon we realized their precise idea did not and could not work (page 2222 point 3) even if the global outlier idea was OK.
There were many difficult points. One of them is that once the impulse has started but has not yet reached the threshold of "it's an impulse", HW is already influenced. We used a sort of geometric sequence to compensate for the fact that it has already been influenced. This worked, but it was not easy and required a bit of work.
We also needed to work on repeated impulses and implement a rewind because sometimes it's not possible to process things online and you have to recompute things from the past, after eliminating the past impulses.
And this was just for impulses. Ramp is something else.
I don't believe ARIMA would be very helpful for this specific problem. It is more sophisticated but most often not better than HW. One problem: less robust, which is a problem especially with anomalies.
I would recommend getting your hands dirty and trying something step by step until it works in most cases, fixing problems one by one. At least, I don't know of any mature method that solves this generally.
Best Answer
Another way to perform anomaly detection in an unsupervised manner, taking seasonality into account, is by means of neural networks such as LSTMs or autoencoders (see this recent paper).
Basically, the idea is to measure abnormality by means of the reconstruction error. Your network is trained on a sequence of (supposedly) legitimate events and, for a new datapoint, tries to reconstruct it. Because the function your network has learned is tailored to the normal datapoints, a large reconstruction error could denote an anomaly.
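A small sketch of the reconstruction-error principle, with one big caveat: instead of an LSTM or a neural autoencoder it uses a linear stand-in (PCA with a 2-dimensional bottleneck, which is what a linear autoencoder learns), so that it runs with the standard library only. The window length, the threshold rule (mean plus four standard deviations of the training errors) and the simulated data are assumptions made for illustration.

```python
import math
import random

WINDOW = 24  # one day of hourly points per window (assumed)

def fit_linear_autoencoder(windows, k=2, iters=100, seed=0):
    """PCA by power iteration; returns (center, orthonormal components)."""
    n, d = len(windows), len(windows[0])
    center = [sum(w[j] for w in windows) / n for j in range(d)]
    centered = [[w[j] - center[j] for j in range(d)] for w in windows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            v = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            for u in comps:  # stay orthogonal to components already found
                dot = sum(x * z for x, z in zip(v, u))
                v = [x - dot * z for x, z in zip(v, u)]
            norm = math.sqrt(sum(x * x for x in v))
            v = [x / norm for x in v]
        comps.append(v)
    return center, comps

def reconstruction_error(window, center, comps):
    """RMS difference between a window and its encode/decode reconstruction."""
    c = [x - m for x, m in zip(window, center)]
    recon = [0.0] * len(c)
    for v in comps:
        code = sum(a * b for a, b in zip(c, v))           # "encoder"
        recon = [r + code * b for r, b in zip(recon, v)]  # "decoder"
    return math.sqrt(sum((a - r) ** 2 for a, r in zip(c, recon)) / len(c))

# Simulated "normal" traffic: two weeks of hourly values with a daily cycle.
rng = random.Random(2)
y = [100 + 50 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 5)
     for t in range(24 * 14)]
windows = [y[i:i + WINDOW] for i in range(len(y) - WINDOW + 1)]
center, comps = fit_linear_autoencoder(windows)

# Threshold from the training errors (assumed rule: mean + 4 std).
train_err = [reconstruction_error(w, center, comps) for w in windows]
mu = sum(train_err) / len(train_err)
sd = math.sqrt(sum((e - mu) ** 2 for e in train_err) / len(train_err))
threshold = mu + 4 * sd

spiky = list(windows[-1])
spiky[12] += 80  # inject an anomaly into an otherwise normal window
print(reconstruction_error(spiky, center, comps) > threshold)
```

A real LSTM or autoencoder replaces the two projection steps with learned nonlinear encode/decode functions, but the scoring logic (reconstruct, measure the error, compare with what was seen on normal data) stays the same.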
Recurrent networks such as LSTMs are designed so that they do not have to overwrite their internal state for every input going through the network, thus retaining the impact of older data points for a longer period of time.