You have an irregular intraday timetable (sentiments) and a daily timetable (returns). It sounds like you want to turn the sentiment data into a daily series? I'm gonna guess that you want, for each trading day, a count of negative, neutral, and positive posts, and that you want to ignore posts made on weekends.
Let's say you have this:
posts =
  10×1 timetable
            time            sentiment
    ____________________    _________
    27-Apr-2020 02:28:13    Negative
    27-Apr-2020 06:01:41    Neutral
    27-Apr-2020 09:56:51    Neutral
    27-Apr-2020 13:57:48    Negative
    27-Apr-2020 21:09:31    Positive
    28-Apr-2020 00:31:11    Neutral
    28-Apr-2020 02:26:17    Negative
    28-Apr-2020 09:59:27    Neutral
    28-Apr-2020 18:51:41    Negative
    28-Apr-2020 19:01:19    Neutral
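If you want to follow along, something like this should rebuild that example timetable. It is only a sketch: the variable names timeStr and labels are my own, and I'm assuming sentiment is stored as a categorical with the row times named time.

```matlab
% Sketch: rebuild the example timetable shown above.
% timeStr and labels are hypothetical helper names, not from the question.
timeStr = ["27-Apr-2020 02:28:13"; "27-Apr-2020 06:01:41"; ...
           "27-Apr-2020 09:56:51"; "27-Apr-2020 13:57:48"; ...
           "27-Apr-2020 21:09:31"; "28-Apr-2020 00:31:11"; ...
           "28-Apr-2020 02:26:17"; "28-Apr-2020 09:59:27"; ...
           "28-Apr-2020 18:51:41"; "28-Apr-2020 19:01:19"];
labels  = ["Negative"; "Neutral"; "Neutral"; "Negative"; "Positive"; ...
           "Neutral"; "Negative"; "Neutral"; "Negative"; "Neutral"];

% Parse the timestamps; the locale is fixed so "Apr" is read as English.
time = datetime(timeStr, 'InputFormat', 'dd-MMM-yyyy HH:mm:ss', ...
                'Locale', 'en_US');

% Assumption: sentiment is categorical with these three categories.
sentiment = categorical(labels, ["Negative" "Neutral" "Positive"]);

posts = timetable(time, sentiment);
posts.Properties.DimensionNames{1} = 'time';   % make the row-times name explicit
```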
There are a bunch of ways to get the sentiment counts as separate daily count variables; here I'll show groupcounts to first get counts. In older releases of MATLAB, you can use groupsummary or varfun instead. groupcounts has a nice way to compute daily counts, but here you want daily-by-sentiment counts, so discretize the times to dates before calling groupcounts.
>> posts.time = dateshift(posts.time,'start','day');
>> posts2 = groupcounts(posts,["time" "sentiment"])
posts2 =
  5×3 table
       time        sentiment    GroupCount
    ___________    _________    __________
    27-Apr-2020    Negative         2
    27-Apr-2020    Neutral          2
    27-Apr-2020    Positive         1
    28-Apr-2020    Negative         2
    28-Apr-2020    Neutral          3
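If you are on a release without groupcounts, the groupsummary route mentioned above might look roughly like this. It is a sketch, not tested against every release: I convert to a plain table first to sidestep any question about grouping on row times, and the names day, sentiment, T, and counts are just for illustration.

```matlab
% Sketch of the groupsummary alternative; names here are illustrative only.
% Dates already shifted to the start of the day, as in the dateshift step.
day = datetime(2020, 4, [27 27 27 27 27 28 28 28 28 28])';
sentiment = categorical(["Negative"; "Neutral"; "Neutral"; "Negative"; ...
                         "Positive"; "Neutral"; "Negative"; "Neutral"; ...
                         "Negative"; "Neutral"]);
T = table(day, sentiment);

% With no method specified, groupsummary returns just the group counts
% in a variable named GroupCount, one row per (day, sentiment) pair present.
counts = groupsummary(T, {'day', 'sentiment'});
```

As with groupcounts, combinations that never occur (here, Positive on 28-Apr) simply have no row, which is why the unstack step ends up with a missing value.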
Now you need to make separate variables for each sentiment; that's unstack.
>> posts3 = unstack(posts2,'GroupCount','sentiment')
posts3 =
  2×4 table
       time        Negative    Neutral    Positive
    ___________    ________    _______    ________
    27-Apr-2020       2           2          1
    28-Apr-2020       2           3         NaN
The NaN is a bit annoying; there is no Positive row for 28-Apr, and unstack, which uses sum by default for aggregation, fills that empty cell with NaN. In the R2020a version of MATLAB that's just out, you can work around that by specifying @numel, but it's also easy to use fillmissing.
>> posts4 = fillmissing(posts3,'constant',0,'DataVariables',["Negative" "Neutral" "Positive"])
posts4 =
  2×4 table
       time        Negative    Neutral    Positive
    ___________    ________    _______    ________
    27-Apr-2020       2           2          1
    28-Apr-2020       2           3          0
Now you are in business. Figure out the weekdays during the period you care about, create a daily datetime vector, and synchronize your posts and your returns timetables to that time vector.
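Here is one way that last step could go. Treat it as a sketch under assumptions: the returns timetable, its variable ret, and its values are placeholders I made up so the example runs, the date range is arbitrary, and isweekend only drops Saturdays and Sundays, so market holidays would need separate handling.

```matlab
% Daily sentiment counts from above, rebuilt here so the sketch runs on its own.
time = datetime(2020, 4, [27 28])';
posts4 = table(time, [2; 2], [2; 3], [1; 0], ...
    'VariableNames', {'time', 'Negative', 'Neutral', 'Positive'});

% Placeholder returns timetable; the name ret and the values are invented.
retTimes = datetime(2020, 4, [27 28 29])';
returns  = timetable(retTimes, [0.01; -0.02; 0.005], 'VariableNames', {'ret'});

% posts4 is a table, so turn it into a timetable keyed on the date.
postsTT = table2timetable(posts4, 'RowTimes', 'time');

% Weekdays in the period of interest (25-Apr-2020 is a Saturday).
days = (datetime(2020, 4, 25):caldays(1):datetime(2020, 4, 29))';
tradingDays = days(~isweekend(days));

% Put the counts on the trading-day vector; days with no posts get 0,
% and any weekend rows are dropped because they are not in tradingDays.
dailyPosts = retime(postsTT, tradingDays, 'fillwithconstant', 'Constant', 0);

% Line up counts and returns on the same time vector.
combined = synchronize(dailyPosts, returns, tradingDays);
```

Filling with 0 makes sense for the counts; for the returns, synchronize's default leaves missing values on days with no data, which is probably what you want.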