on voting

pubby
Posts: 583
Joined: Thu Mar 31, 2016 11:15 am

on voting

Post by pubby »

The current contest grading scale is:

Code: Select all

Art and Style              5 points
Sound                      5 points
Polish and Completeness   10 points
Originality               15 points
Overall                   15 points
Two categories stick out as being arbitrary and over-valued: "Polish and Completeness", and "Originality". Together these categories make up half of the points.

Both of these categories have some oddball results from last year's contest. For example, "Nothing Good Can Come of This" was ranked last in originality by non-entrants, but first in originality by entrants. I get the impression that nobody's certain on how to judge these two categories, and that their results largely overlap with the other, more concrete categories.

A Proposal

Let's scrap the grading scale and have just a single vote for the overall score. Rate every game out of 10. That's it. The top 5 games get prizes. Really simple.

In addition, let's have a bunch of smaller categories run independently:

Code: Select all

Art and Style
Sound
Game Play
Originality
Humor
Programming
Multiplayer
Rate each out of 10. The winner of each category wins bragging rights (and perhaps one of M-Tee's ribbons on the a53 menu). These scores are completely independent from the overall score; nothing is summed together.
gauauu
Posts: 779
Joined: Sat Jan 09, 2016 9:21 pm
Location: Central Illinois, USA

Re: on voting

Post by gauauu »

That would make some aspects of judging easier. I remember staring at the "originality" field when judging your F-FF and not knowing what to put. On one hand, it was just F-Zero, and not original at all. On the other hand, you did something really technically new and different on the NES, so that would make it incredibly original.

I think I just shrugged and made up a number based on how much I liked it instead.
M_Tee
Posts: 430
Joined: Sat Mar 30, 2013 12:24 am

Re: on voting

Post by M_Tee »

I like having different, independent categories as suggested.
I think a ranking-based voting system might be better than assigning scores.

I haven't read in-depth about it, but https://civs.cs.cornell.edu/ might be worth checking out, with a different poll per each category.
pubby
Posts: 583
Joined: Thu Mar 31, 2016 11:15 am

Re: on voting

Post by pubby »

M_Tee wrote:I haven't read in-depth about it, but https://civs.cs.cornell.edu/ might be worth checking out, with
I did some quick research.

If using Condorcet, the best version would be Ranked Pairs, which assigns 2nd, 3rd, 4th, etc. places rather than just 1st. The website you posted can be used for automating this.

Still, I'm not really convinced this is the way to go. Preferential voting is less precise than numeric ratings because there's less data. The results are harder to check (and near impossible if using a 3rd party), and the system is really complicated to describe. So it's cool and all, but I think numeric is better for our purposes because it's simple while still being accurate.

Outside of that, one other topic to talk about is "insincere" voting: the idea that people can vote based on the outcome they want, rather than the actual quality of the games. Because of how few voters we have, a single low vote is enough to drop an entry several places and push up the voter's own. People can win better prizes by being jerks, essentially.

I don't believe this has happened yet (we're all swell people), but it's something to consider. Ignoring the best and worst vote of each entry could filter out some of this should the need arise. Or we could just use the median.
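A minimal sketch of the two filters mentioned here (dropping the best and worst vote, or taking the median). The votes are made up for illustration and the function name `trimmed_mean` is hypothetical; nothing here reflects how the contest actually tallies scores.

```python
from statistics import mean, median

def trimmed_mean(votes):
    """Mean after dropping one highest and one lowest vote.

    With fewer than three votes there is nothing sensible to drop,
    so this falls back to the plain mean.
    """
    if len(votes) < 3:
        return mean(votes)
    return mean(sorted(votes)[1:-1])

# Hypothetical votes for one entry; the 1 stands in for a strategic low vote.
votes = [7, 8, 7, 6, 8, 1]

plain = mean(votes)            # pulled down by the single low vote
trimmed = trimmed_mean(votes)  # ignores the 1 and one of the 8s
middle = median(votes)         # middle of the sorted votes

print(plain, trimmed, middle)
```

With so few voters, both filters throw away honest information too, so this is a trade-off rather than a free fix.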
NESHomebrew
Formerly WhatULive4
Posts: 418
Joined: Fri Oct 30, 2009 4:43 am

Re: on voting

Post by NESHomebrew »

Ahhh voting... I wish it could be as simple as rating each game out of 10. But do you remember how close all of the games were? And that was with multiple categories! I think there would definitely need to be categories to keep some type of granularity.

I'm down for adjusted grading categories, and tweaking the scale. Of course this would have to be agreed upon by everyone since it wasn't announced before the beginning of the competition. Maybe make everything 10's? Things have changed since the beginning of the competition. There weren't a lot of collaborations with artists and musicians, but that is changing. Perhaps it brings more value to the competition and art/sound probably deserve higher value. Originality was there pretty much to dissuade people from making simple rip-offs of official releases, but honestly that really isn't important as long as it isn't copyright infringing. The overall category was kind of like a "Ok, the art was nice, it sounded good, but was it a good game?".
M_Tee
Posts: 430
Joined: Sat Mar 30, 2013 12:24 am

Re: on voting

Post by M_Tee »

As for numeric scores, a voting system where a number could be typed in (thereby accepting non-whole number input) would be very helpful, considering the number of collaborations present. For instance, last time around, Lukasz and I both graded each game separately and then I had planned to submit our average for our judging submission, but being restricted to whole numbers, we had to determine a fair way to handle rounding so that no game received an uneven boost due to multiple categories needing to be rounded up.
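To illustrate the rounding problem, here is a small sketch that averages two judges' category scores and compares the exact total against the total after rounding each category up at .5. The scores are invented, the category maximums follow the current scale, and `round_half_up` is a hypothetical helper (Python's built-in `round` rounds halves to the nearest even number, so it is not used here).

```python
import math

def average_scores(a, b):
    """Per-category average of two judges' scores (may be fractional)."""
    return {category: (a[category] + b[category]) / 2 for category in a}

def round_half_up(x):
    """Round to the nearest whole number, with .5 always going up."""
    return math.floor(x + 0.5)

# Hypothetical scores from two collaborating judges for one game.
judge_a = {"Art and Style": 4, "Sound": 3, "Polish and Completeness": 8,
           "Originality": 11, "Overall": 12}
judge_b = {"Art and Style": 5, "Sound": 4, "Polish and Completeness": 7,
           "Originality": 12, "Overall": 12}

exact = average_scores(judge_a, judge_b)
exact_total = sum(exact.values())
rounded_total = sum(round_half_up(v) for v in exact.values())

print(exact_total, rounded_total)
```

In this made-up case four categories land on .5, so rounding each one up inflates the game's total by two points, which is the uneven boost that fractional input would avoid.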

Another preference, if possible, would be to assess by game and not by category. For instance, I would like to input scores for all the categories for Game A before moving on to Game B, instead of ranking every game in Category A before ranking every game in Category B.

Google Forms could handle this. Each game could be a section (page). Once one section is made, it could be duplicated and edited for the rest.

Here's a mockup with the first two games from '17 in it as an example: https://goo.gl/forms/yIDdJVv1kaF9uJ843
pubby
Posts: 583
Joined: Thu Mar 31, 2016 11:15 am

Re: on voting

Post by pubby »

NESHomebrew wrote:Ahhh voting... I wish it could be as simple as rating each game out of 10. But do you remember how close all of the games were? And that was with multiple categories! I think there would definitely need to be categories to keep some type of granularity.
That's a very good point, but here's how the games from last year rank if only using "Overall" score:

Code: Select all

27.76 project blue
25.36 grunio
24.4 wolfling
23.09 alphonzo game
22.36 f-ff
21.94 jamin honey
20.6 miedow
20.24 robo ninja
18.42 nothing good
18.22 star evil
18 alphonzo melee
16.82 inherent smile
15.42 lightshields
The results are actually more spread out than the multiple categories combined scores!
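One rough way to put numbers on "spread out" is the range between first and last place and the gaps between neighbouring ranks. The sketch below uses only the "Overall" figures listed above; the combined-category totals are not in this thread, so the comparison itself is not reproduced here.

```python
from statistics import pstdev

# "Overall" scores as listed above.
overall = {
    "project blue": 27.76, "grunio": 25.36, "wolfling": 24.4,
    "alphonzo game": 23.09, "f-ff": 22.36, "jamin honey": 21.94,
    "miedow": 20.6, "robo ninja": 20.24, "nothing good": 18.42,
    "star evil": 18.22, "alphonzo melee": 18, "inherent smile": 16.82,
    "lightshields": 15.42,
}

scores = sorted(overall.values(), reverse=True)

# Distance between first and last place.
score_range = scores[0] - scores[-1]

# Gap between each entry and the one ranked directly below it.
gaps = [round(a - b, 2) for a, b in zip(scores, scores[1:])]
smallest_gap = min(gaps)

# Population standard deviation as a single-number summary of spread.
spread = pstdev(scores)

print(round(score_range, 2), smallest_gap, round(spread, 2))
```

Running the same calculation on the summed category totals would let the two spreads be compared directly.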

@M_Tee that Google form looks really good! But it kinda makes it hard to go back and check how you rated previous entries.
M_Tee
Posts: 430
Joined: Sat Mar 30, 2013 12:24 am

Re: on voting

Post by M_Tee »

Yeah, honestly, the way I'd like to judge the games would be just typing them into a spreadsheet file, with categories as rows and games as columns, or vice versa.

EDIT:
NESHomebrew wrote:Originality was there pretty much to dissuade people from making simple rip-offs of official releases, but honestly that really isn't important as long as it isn't copyright infringing. The overall category was kind of like a "Ok, the art was nice, it sounded good, but was it a good game?".
I actually like the heavy weight that originality has, or at least its presence. It adds a little motivation not just to do something well, but to do something new.

The large size of overall provided an opportunity to assess aspects that I found important that weren't necessarily assessed elsewhere, and as I look back over them, most were gameplay based (difficulty, controls, replay value, social value), with just 3 points actually going to a literal "overall", so it seems to have done that job, at least in my case.
pubby wrote: I get the impression that nobody's certain on how to judge these two categories, and that their results largely overlap with the other, more concrete categories.
As an art teacher, my career is basically built around attempting to assess works in a subjective field in the most objective manner possible. Not an easy thing, haha. Definitely worth doing though.
Last edited by M_Tee on Wed Dec 21, 2022 12:07 am, edited 1 time in total.
NESHomebrew
Formerly WhatULive4
Posts: 418
Joined: Fri Oct 30, 2009 4:43 am

Re: on voting

Post by NESHomebrew »

I did play around with Google Forms for a while, but I found that SurveyMonkey had a few advantages. I think you can return and change answers after the fact with Google, but I don't think it worked as well as the SurveyMonkey one.

I'll have to look at add-ons for Google Forms. If we could get something working well with Google Forms I'd be more than happy to use it. The nice thing as well would be a place to put some anonymous feedback for the entrants. I'm pretty sure I could do that with SurveyMonkey as well.

I like the spreadsheet idea, where you can play around with your values and make sure you are judging them all equally.
M_Tee
Posts: 430
Joined: Sat Mar 30, 2013 12:24 am

Re: on voting

Post by M_Tee »

A spreadsheet seems like it could be the easiest to set up and to judge with, no forms to construct or navigate.

A little tricky getting all the data together at the end, but from what I've searched, it seems the following might be feasible:

Using Google Sheets, a basic judging spreadsheet could be made and copied for each judge, shared with only that judge and the person in charge. Once the judging deadline's over, each judge could have their collaborator access removed to prevent further changes. A separate sheet could then reference each judge's sheet dynamically via the IMPORTRANGE function.
Games as rows, categories as columns; the last column could be very wide and set up for comments, so anonymous feedback could be collected that way. A little conditional formatting could even be used on the judging sheet to highlight cells if the score input is out of range.
pubby
Posts: 583
Joined: Thu Mar 31, 2016 11:15 am

Re: on voting

Post by pubby »

I've created a few polls to gauge how the community feels on voting. Please vote!



FIRST POLL - How should the rubric work?

Option 1 is what past competitions have used. Participants assign a score to various categories (art, sound, etc) and then sum up these categories to reach a final score.

Option 2 is similar in that voters still assign scores to various categories, but the final score is independent of this. Voters pick whatever final score they want for each game.

In both cases, the final score is what determines prizes.

Vote here: https://www.strawpoll.me/17291856



SECOND POLL - What voting method?

With option 1, voters assign each category a score, presumably from 1 to 10.

With option 2, voters rank each game by preference. https://en.wikipedia.org/wiki/Borda_count

Vote here: https://www.strawpoll.me/17291860



THIRD POLL - What categories should we vote on?

Select all categories you want to see.

Vote here: https://www.strawpoll.me/17291910
pubby
Posts: 583
Joined: Thu Mar 31, 2016 11:15 am

Re: on voting

Post by pubby »

Alright, so it seems like we want to keep the same format as last year, but the polls show a desire to add a "Gameplay" category.

Something like this then?

Code: Select all

Art       10
Sound     10
Gameplay  10
Polish    10
Freshness 10
Overall   20
(freshness can be renamed originality; I'm just stubborn and thought calling it "freshness" would make the category less ambiguous to judge)
M_Tee
Posts: 430
Joined: Sat Mar 30, 2013 12:24 am

Re: on voting

Post by M_Tee »

Voting Method
I hadn't voted on method, waiting to read up on Borda method, and I just cast my vote for it. I think it would simplify the act of judging, as it's easier to rank entries against each other, as opposed to assigning a numeric score. The primary benefit is eliminating judges' relative interpretation of scale, reducing the effect from a single judge whose assessments are overall far stricter or looser.
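A small sketch of why ranking removes the scale problem. It uses one common Borda variant (last place gets 0 points, first place gets one point per game beaten); other variants differ only in the offset. The game names, the scores, and the helper names are all hypothetical, and ties are not handled.

```python
from collections import defaultdict

def ranking_from_scores(scores):
    """Turn a judge's numeric scores into a best-to-worst ranking."""
    return sorted(scores, key=scores.get, reverse=True)

def borda(ballots):
    """Sum Borda points over ballots; each ballot lists games best to worst."""
    totals = defaultdict(int)
    for ranking in ballots:
        n = len(ranking)
        for place, game in enumerate(ranking):
            totals[game] += n - 1 - place  # first gets n-1, last gets 0
    return dict(totals)

# Two hypothetical judges with the same preferences but very different scales.
strict_judge = {"Game A": 4, "Game B": 2, "Game C": 3}
lenient_judge = {"Game A": 10, "Game B": 8, "Game C": 9}

ballots = [ranking_from_scores(strict_judge), ranking_from_scores(lenient_judge)]
totals = borda(ballots)

print(totals)
```

Both judges contribute identical ballots here even though their raw numbers are far apart, so neither one dominates the result. The flip side is that a ranking no longer records how far apart a judge thinks two games are.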

Category Weights
I like the suggested heavier weights of art and sound. (10 instead of 5 each) Regardless of the individual weights of categories, I feel that the final scores as announced should be converted mathematically to an out-of-fifty scale in order to retain comparability to previous years' results.
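The conversion is a straight proportional rescale. A minimal sketch, assuming the weights proposed earlier in the thread (five categories at 10 plus Overall at 20, for a maximum of 70); the function name is hypothetical.

```python
# Category maximums from the proposed scale earlier in the thread.
WEIGHTS = {"Art": 10, "Sound": 10, "Gameplay": 10,
           "Polish": 10, "Freshness": 10, "Overall": 20}

MAX_TOTAL = sum(WEIGHTS.values())  # highest possible raw total

def to_out_of_fifty(raw_total):
    """Rescale a raw total on the proposed scale to the old 50-point scale."""
    return raw_total * 50 / MAX_TOTAL

# A hypothetical game that scored half of the available points.
print(to_out_of_fifty(35))
```

Rescaling keeps the totals on a familiar range, but category weights still differ from earlier years, so year-to-year comparisons stay approximate.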

Terminology
I'm not a big fan of the term freshness. It adds no clarity in terms of definition and, although I know it is not intended, it has a feel of corporate marketing hipness (see Rotten Tomatoes' usage of the term) that leaves a poor taste, but that could very well be my own disposition.

Regardless, I still feel that the ambiguity in both the originality and overall categories is beneficial because it provides room for each judge to express their own priorities in judging.
NESHomebrew
Formerly WhatULive4
Posts: 418
Joined: Fri Oct 30, 2009 4:43 am

Re: on voting

Post by NESHomebrew »

pubby wrote:Alright, so it seems like we want to keep the same format as last year, but the polls show a desire to add a "Gameplay" category.

Something like this then?

Code: Select all

Art       10
Sound     10
Gameplay  10
Polish    10
Freshness 10
Overall   20
(freshness can be renamed originality; I'm just stubborn and thought calling it "freshness" would make the category less ambiguous to judge)
For polish and completeness it was kind of meant to include Gameplay. Honestly, changing the categories this close to the competition deadline probably isn't the best idea (or am I wrong?). I do like the conversation, and if changes are warranted it would be nice to start next year's competition with any judging changes already ironed out.
pubby
Posts: 583
Joined: Thu Mar 31, 2016 11:15 am

Re: on voting

Post by pubby »

I'm cool with postponing this until next year. I agree that it could be unethical to change the categories so late.

In reality, I don't think the voting system makes a huge difference. The best games always win, the worst games don't. The system can be improved of course, but there's no dire need to do it now.

While this thread is still going, here are two more polls regarding the categories. The first is for this year (a change which likely won't happen), and the second is for next year. This pastebin explains the choices: https://pastebin.com/raw/CSPCca7P

This year: https://www.strawpoll.me/17303975

Next year: https://www.strawpoll.me/17303984

Of course, these polls aren't binding or official or anything, but they might help a little in planning.