Solved – How to standardise cases within groups to perform ranking across groups and over time

Tags: ranking, sample, sample-size, standard deviation

I am trying to create a method to rank ships according to their fuel burn rates. There are two different classes of ships, with different burn rates, so I am trying to create a mean and standard deviation for each class, then rank the ships according to how many standard deviations they are below the mean of their class. I will need to do this each month and each quarter.
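A minimal sketch of the approach described above, assuming each ship has a single burn-rate figure per period and that lower burn is better. The function names, class labels, and numbers are made up for illustration; they are not from any real fleet data.

```python
from statistics import mean, stdev


def class_baselines(reference):
    """Map each ship class to (mean, sample standard deviation) of its burn rates.

    reference: dict of class name -> list of burn rates used to set the standard.
    """
    return {cls: (mean(rates), stdev(rates)) for cls, rates in reference.items()}


def standardised_scores(ships, baselines):
    """Return {ship: z}, where z is the number of standard deviations the ship's
    burn rate lies above (+) or below (-) the mean of its own class."""
    scores = {}
    for name, cls, rate in ships:
        mu, sd = baselines[cls]
        scores[name] = (rate - mu) / sd
    return scores


def rank_by_score(scores):
    """Order ships from furthest below their class mean to furthest above."""
    return sorted(scores, key=scores.get)


# Made-up numbers, for illustration only.
reference = {"class_a": [10.0, 12.0, 14.0], "class_b": [20.0, 30.0, 40.0]}
ships = [
    ("ship_1", "class_a", 11.0),
    ("ship_2", "class_b", 15.0),
    ("ship_3", "class_a", 14.0),
]

baselines = class_baselines(reference)
scores = standardised_scores(ships, baselines)
print(rank_by_score(scores))
```

Because each ship is scaled by its own class's mean and standard deviation, the scores are comparable across the two classes even though the raw burn rates are not.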

There are three ships in one category and 7 in the other. In the category with three ships, there are also 9 ships in another area that I do not wish to rank, but have data for. In the category with 7 ships, there are 23 other ships I have data for, but do not wish to include in the ranking.

Questions:

  1. Is this the correct way to rank ships in two different classes? Any suggestions?
  2. Should the mean and standard deviation remain constant, or change every month / quarter?
  3. Should the standard deviation and mean calculations include the other ships that I will not be ranking?
  4. I have data that lists the ships' average burn rates over a three year period, but these averages are not all based on the same amount of data (some ships were underway all year while some were underway only for a month).
    Can I use this data to create the standard deviation and mean, or do I need the monthly / quarterly data?

Best Answer

  1. There's no single correct answer here; it depends on the purpose of your ranking. However, if there are no other variables of interest, then this doesn't seem like a bad way to go: basically, you're ranking ships relative to their expected performance.

  2. If you're comparing real ships to an "ideal ship", or standard, then the mean and standard deviation should remain constant (and it might also be more sensible to give the ships a score, rather than a ranking, as that provides more information). Having the standard remain constant also means you can track changes in individual ships, or the whole fleet. To figure out how the "ideal ship" performs, you should use all the data that's reasonable to use. That might mean using all the data, or, considering that the ships probably degrade with time, it might mean using the data for every ship, but only for the first few months/years, so that you can see.

  3. Unless there is something indicating that the extra ships have a different performance, you should use them, as it will give you better estimates.

  4. You can use the data given if and only if you can assume that performance is the same regardless of month. For example, if you only have the average of summer data for one ship, and you know that ships perform better in summer, then that data will bias your mean estimate. Also, make sure to account for the different time spans in your average-of-averages.
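
One way to account for the different time spans in point 4 is to weight each ship's average by how long it was underway. This is only a sketch: weighting by days underway is an assumption (fuel burned or distance covered might suit your data better), and the function name and figures are invented for illustration.

```python
def weighted_mean(averages):
    """Combine per-ship averages into one class mean, weighting by time underway.

    averages: list of (average_burn_rate, days_underway) pairs.
    """
    total_days = sum(days for _, days in averages)
    if total_days <= 0:
        raise ValueError("total days underway must be positive")
    return sum(rate * days for rate, days in averages) / total_days


# Made-up numbers: one ship underway most of the year, one for about three months.
averages = [(10.0, 300), (20.0, 100)]

print(weighted_mean(averages))                      # weighted by days underway
print(sum(rate for rate, _ in averages) / len(averages))  # plain average-of-averages
```

The plain average-of-averages treats both ships equally, so the ship with little data pulls the class mean as hard as the one with a full year of data; the weighted version avoids that.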
