The formula for the weighted Pearson correlation can easily be found on the web, StackOverflow, and Wikipedia, and is implemented in several R packages (e.g. psych or weights) and in Python's statsmodels package. It is computed like the regular correlation, but using weighted means,
$$ m_X = \frac{\sum_i w_i x_i}{\sum_i w_i}, ~~~~ m_Y = \frac{\sum_i w_i y_i}{\sum_i w_i} $$
weighted variances,
$$ s_X = \frac{\sum_i w_i (x_i - m_X)^2}{ \sum_i w_i}, ~~~~
s_Y = \frac{\sum_i w_i (y_i - m_Y)^2}{ \sum_i w_i} $$
and weighted covariance
$$ s_{XY} = \frac{\sum_i w_i (x_i - m_X)(y_i - m_Y)}{ \sum_i w_i} $$
Having all of this, you can easily compute the weighted correlation
$$ \rho_{XY} = \frac{s_{XY}}{\sqrt{s_X s_Y}} $$
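These formulas translate directly into a few lines of NumPy; the function name `weighted_corr` below is just for illustration:

```python
import numpy as np

def weighted_corr(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w."""
    x, y, w = map(np.asarray, (x, y, w))
    m_x = np.sum(w * x) / np.sum(w)                        # weighted mean of x
    m_y = np.sum(w * y) / np.sum(w)                        # weighted mean of y
    s_x = np.sum(w * (x - m_x) ** 2) / np.sum(w)           # weighted variance of x
    s_y = np.sum(w * (y - m_y) ** 2) / np.sum(w)           # weighted variance of y
    s_xy = np.sum(w * (x - m_x) * (y - m_y)) / np.sum(w)   # weighted covariance
    return s_xy / np.sqrt(s_x * s_y)
```

With all weights equal this reduces to the ordinary Pearson correlation, which is an easy sanity check.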
As for your second question, as I understand it, you have data on the correlations between political orientation and preference for each of the twenty artists, plus each user's binary answers about his/her preferences, and you want some kind of aggregate measure of it.
Let's start with averaging correlations. There are multiple methods for averaging probabilities, but there don't seem to be so many approaches to averaging correlations. One thing that could be done is to use Fisher's $z$-transformation as described on MathOverflow, i.e.
$$ \bar\rho = \tanh \left(\frac{\sum_{j=1}^K \tanh^{-1}(\rho_j)}{K} \right) $$
It reduces the skewness of the distribution and makes it closer to normal. This procedure was also described by Bushman and Wang (1995) and Corey, Dunlap, and Burke (1998).
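In code, this averaging is just a hyperbolic-tangent round trip (`average_corr` is an illustrative name; $\tanh^{-1}$ is `arctanh` in NumPy):

```python
import numpy as np

def average_corr(rhos):
    """Average correlations via Fisher's z-transformation:
    back-transform the mean of arctanh(rho_j)."""
    z = np.arctanh(np.asarray(rhos, dtype=float))  # Fisher z-transform
    return float(np.tanh(z.mean()))
```

Note that this requires each $\rho_j$ to lie strictly inside $(-1, 1)$, since $\tanh^{-1}(\pm 1)$ is infinite.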
Next, you have to notice that if $r = \mathrm{cor}(X,Y)$, then $-r = \mathrm{cor}(-X,Y) = \mathrm{cor}(X,-Y)$, so positive correlation of musical preference with some political orientation is the same as negative correlation of musical dislike to such political orientation, and the other way around.
Now, let's define $r_j$ as the correlation of preference for the $j$-th artist with some political orientation, and $x_{ij}$ as the $i$-th user's preference for the $j$-th artist, where $x_{ij} = 1$ for preference and $x_{ij} = -1$ for dislike. You can define your final estimate as
$$ \bar r_i = \tanh \left(\frac{\sum_{j=1}^K \tanh^{-1}(r_j x_{ij})}{K} \right) $$
i.e. compute the average correlation, inverting the signs of the correlations for disliked artists. By applying this procedure you end up with an average "correlation" between a user's preferences and political orientation that, like a regular correlation, ranges from $-1$ to $1$.
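The per-user estimate is then a one-liner on top of the Fisher-z averaging (again, `user_score` is just an illustrative name):

```python
import numpy as np

def user_score(r, x_i):
    """Average correlation for one user: flip the sign of r_j for
    disliked artists (x_ij = -1) before Fisher-z averaging."""
    r, x_i = np.asarray(r, dtype=float), np.asarray(x_i, dtype=float)
    return float(np.tanh(np.mean(np.arctanh(r * x_i))))
```

Flipping all of a user's answers flips the sign of the score, as the symmetry $-r = \mathrm{cor}(-X, Y)$ suggests it should.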
But...
Don't you think that all of this is overkill for something that is basically a multiple regression problem? Instead of all the weighting and averaging, you could simply use weighted multiple regression (linear or logistic, depending on whether you predict a binary preference or a degree of preference in either direction), with weights based on the sizes of the subsamples. You would use the preference for each artist as a predictor, and in the end use a user's preferences to make predictions. This approach is simpler and more statistically elegant. It also applies relative weights to the artists, while averaging the correlations doesn't correct for their relative "impact" on the final score. Moreover, regression takes the base rate (the default political orientation) into account, while averaging correlations does not: imagine that the vast majority of the population prefers party $A$; this should make you less eager to predict $B$'s, and regression accounts for that via the intercept. The only remaining problem is multicollinearity, but averaging correlations ignores it rather than dealing with it.
Bushman, B.J., & Wang, M.C. (1995). A procedure for combining sample correlation coefficients and vote counts to obtain an estimate and a confidence interval for the population correlation coefficient. Psychological Bulletin, 117(3), 530.
Corey, D.M., Dunlap, W.P., & Burke, M.J. (1998). Averaging correlations: Expected values and bias in combined Pearson rs and Fisher's z transformations. The Journal of General Psychology, 125(3), 245-261.
Best Answer
Here is a surprisingly vast array of answer-copying indices, though with little discussion of their merits: http://www.bjournal.co.uk/paper/BJASS_01_01_06.pdf.
There's a field of (educational) psychology called item response theory (IRT) that provides the statistical background for questions like these. If you are American and took the SAT, ACT, or GRE, you dealt with a test developed with IRT in mind. The basic postulate of IRT is that each student $i$ is characterized by their ability $a_i$; each question is characterized by its difficulty $b_j$; and the probability of answering a question correctly is $$ \pi(a_i,b_j;c) = {\rm Prob}[\mbox{student $i$ answers question $j$ correctly}] = \Phi( c(a_i-b_j) ) $$ where $\Phi(z)$ is the cdf of the standard normal, and $c$ is an additional sensitivity/discrimination parameter (sometimes it is made question-specific, $c_j$, if there's enough information, i.e., enough test takers, to identify the differences). A hidden assumption here is that, given a student's ability $a_i$, answers to different questions are independent. This assumption is violated if you have a battery of questions about, say, the same paragraph of text, but let's abstract from that for a minute.
For "Yes/No" questions, this may be the end of the story. For questions with more than two categories, we can make the additional assumption that all wrong choices are equally likely; for a question $j$ with $k_j$ choices, the probability of each wrong choice is $\pi'(a_i,b_j;c) = [1-\pi(a_i,b_j;c)]/(k_j-1)$.
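These two building blocks are one line each; a minimal sketch (the function names `p_correct` and `p_each_wrong` are illustrative):

```python
from math import erf, sqrt

def Phi(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def p_correct(a, b, c=1.0):
    """Probability that a student of ability a answers a question
    of difficulty b correctly: Phi(c * (a - b))."""
    return Phi(c * (a - b))

def p_each_wrong(a, b, k, c=1.0):
    """Probability of each of the k-1 equally likely wrong choices."""
    return (1 - p_correct(a, b, c)) / (k - 1)
```

Note that $\pi + (k_j - 1)\pi' = 1$ by construction, so the choice probabilities always sum to one.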
For students of abilities $a_i$ and $a_k$, the probability that they match on their answers for a question with difficulty $b_j$ is $$ \psi(a_i,a_k;b_j,c) = \pi(a_i,b_j;c)\pi(a_k,b_j;c) + (k_j-1)\pi'(a_i,b_j;c)\pi'(a_k,b_j;c) $$ If you like, you can break this into the probability of matching on the correct answer, $\psi_c(a_i,a_k;b_j,c) = \pi(a_i,b_j;c)\pi(a_k,b_j;c)$, and the probability of matching on an incorrect answer, $\psi_i(a_i,a_k;b_j,c) = (k_j-1)\pi'(a_i,b_j;c)\pi'(a_k,b_j;c)$, although from the conceptual framework of IRT this distinction is hardly material.
Now, you can compute the probability of matching, but it will probably be combinatorially minuscule. A better measure may be the ratio of the information in the pairwise pattern of responses, $$ I(i,k) = \sum_j 1\{ \mbox{match}_j \} \ln \psi(a_i,a_k;b_j,c) + 1\{ \mbox{non-match}_j \} \ln [1- \psi(a_i,a_k;b_j,c) ] $$ and relate it to the entropy $$ E(i,k) = {\rm E}[ I(i,k) ] = \sum_j \psi(a_i,a_k;b_j,c) \ln \psi(a_i,a_k;b_j,c) + (1- \psi(a_i,a_k;b_j,c) ) \ln [1- \psi(a_i,a_k;b_j,c) ] $$ You can do this for all pairs of students, plot them or rank them, and investigate the greatest ratios of information to entropy.
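Both quantities can be sketched as follows (assumed function names; `matches` is the observed pattern of 1/0 match indicators across questions):

```python
from math import erf, sqrt, log

def Phi(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def match_prob(a_i, a_k, b, k, c=1.0):
    """psi: probability two students match on a k-choice question --
    both correct, or both on the same (equally likely) wrong choice."""
    pi_i, pi_k = Phi(c * (a_i - b)), Phi(c * (a_k - b))
    wrong_i, wrong_k = (1 - pi_i) / (k - 1), (1 - pi_k) / (k - 1)
    return pi_i * pi_k + (k - 1) * wrong_i * wrong_k

def info_and_entropy(matches, a_i, a_k, bs, ks, c=1.0):
    """Observed log-likelihood I(i,k) of the match pattern and its
    expectation E(i,k), summed over questions."""
    I = E = 0.0
    for m, b, k in zip(matches, bs, ks):
        psi = match_prob(a_i, a_k, b, k, c)
        I += log(psi) if m else log(1 - psi)
        E += psi * log(psi) + (1 - psi) * log(1 - psi)
    return I, E
```

Pairs whose observed information $I$ is large relative to its expectation $E$ (both are negative log-quantities) are the ones matching more often than their abilities predict.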
The parameters of the test $\{c, b_j, j=1, 2, \ldots\}$ and the student abilities $\{a_i\}$ won't fall out of the blue sky, but they are easily estimable in modern software, such as R's lme4 or similar mixed-model packages, e.g. as a probit model with crossed random effects for students and questions, or something very close to this.