Author Topic: Consensus while judging?  (Read 3269 times)

Offline hoser

  • Brewmaster
  • *****
  • Posts: 771
Re: Consensus while judging?
« Reply #15 on: March 31, 2014, 12:00:00 PM »
An important thing is to make sure that beers that might be good enough to push get their chance to shine in another forum. When you have large contests with multiple flights, there is a greater chance that other palates will have the opportunity to judge it. The beers in question just need the opportunity!

This is really the most important thing. Ensuring that a medal-worthy beer gets a chance at medaling is more important than worrying about scores.

Usually, if the guy I'm judging with cannot see eye-to-eye with me, I will come down on score with the caveat that the beer will be pushed to mini-BOS. I explain to them that the worst case for that situation is that if the beer is really as bad as they say it is (e.g. a 29 for 'subdued hop aroma' in an APA), then the beer will be kicked immediately. No harm, no foul. But if the beer was as good as I think it is, it should place in mini-BOS. I'm usually correct in pushing the "questionable" beer to mini-BOS (in that they usually medal at that point), but I've been wrong before and seen a beer I fought for get kicked quickly. Better to err on the safe side though!

Agree completely with Martin and Amanda! I tend to err on the side of caution if I am not sure.  Had a somewhat similar situation this weekend.  Thought we had a beer that was pretty good so I figured I would give it a chance in mini-BOS so that a few other palates could taste the beer and decide its fate.  It placed in the top 3 and will advance to Nationals.

Offline chumley

  • Brewmaster
  • *****
  • Posts: 585
Re: Consensus while judging?
« Reply #16 on: March 31, 2014, 12:43:46 PM »
Here's the situation.  The delta between the scores was almost twenty points.  It was one of those beers that people either loved or hated (50% of the non-flight judges who tasted the beer loved it whereas the other fifty percent thought that it should be dumped).  I was going to give the beer a courtesy score of 13 before I saw the other judge's score sheet.  Our comments were so different that it made me believe that we must have tasted different beers.  I bumped my score up to 29, but there was no way that I was going to give a seriously flawed beer a forty.  The other judge would not budge.  The head judge was clearly uncomfortable judging the category.  He did not have an opinion one way or the other, so he adjusted his score up to move the beer on.   I finally reached the point where I told the head judge to throw out my score because there was no way that I was going to give the beer a score anywhere near forty.

If you gave it a 13 and the other guy gave it a 40, then you thought it was flawed and the other guy didn't.  What flaw did you find?

I often have to force myself to be objective when judging IPAs as I really don't like CTZ hops, but a lot of people do, so I have to try to be objective even when I feel like dumping the beer down the drain.

Offline AmandaK

  • Senior Brewmaster
  • ******
  • Posts: 1470
  • Redbird Brewhouse
Re: Consensus while judging?
« Reply #17 on: March 31, 2014, 01:04:54 PM »
Here's the situation.  The delta between the scores was almost twenty points.  It was one of those beers that people either loved or hated (50% of the non-flight judges who tasted the beer loved it whereas the other fifty percent thought that it should be dumped).  I was going to give the beer a courtesy score of 13 before I saw the other judge's score sheet.  Our comments were so different that it made me believe that we must have tasted different beers.  I bumped my score up to 29, but there was no way that I was going to give a seriously flawed beer a forty.  The other judge would not budge.  The head judge was clearly uncomfortable judging the category.  He did not have an opinion one way or the other, so he adjusted his score up to move the beer on.   I finally reached the point where I told the head judge to throw out my score because there was no way that I was going to give the beer a score anywhere near forty.

If you gave it a 13 and the other guy gave it a 40, then you thought it was flawed and the other guy didn't.  What flaw did you find?

Also curious. 13 to 40 is a pretty large gap, but I've seen stranger things. :D
Amanda Burkemper
Kansas City Bier Meister; BJCP Master
Redbird Brewhouse - There's Always a Project
Our Homebrewed Wedding, AHA Article

Offline dmtaylor

  • Senior Brewmaster
  • ******
  • Posts: 1484
  • Two Rivers, WI
Re: Consensus while judging?
« Reply #18 on: March 31, 2014, 01:06:24 PM »
On my BJCP tasting exam, there was a Belgian dubbel.  Very phenolic, tasted exactly like friggin Carmex.  I believe I even used the term "Carmex" on the tasting sheet.  As such, I scored it relatively low, in the 20s.  It was an otherwise okay dubbel, with the dark fruit flavors, etc., but I just couldn't get past the Carmex.  Meanwhile the Master level proctors all loved it, scored it in the 40s, probably claiming that they loved the rich complex phenols.  = Carmex.  Yuck.  Of course as a result of this disagreement, my exam score was severely impacted, and I remain convinced that I was in the right and they were in the wrong.  I might only be Certified but I don't care what level they were.  I don't want friggin Carmex in any beer that I drink, thank you very much.  No way I would have changed my score upwards for that beer.  After the exam, I also came to find out that many of the other test-takers agreed with me.  If only we could have negotiated with those Master judges, perhaps we could have brought them down.  I wonder how many other takers got screwed that day.

I don't know what the point of all this is, except perhaps to say, taste is subjective, and we should all be entitled to our own opinions.  I have very deep feelings against trying to force anyone to do otherwise.  We can and should compare notes, listen to reason, and adjust scores when appropriate.  However, we should also respect those who refuse to budge if they feel very strongly one way or the other.  I think in those cases, we should just let the scoresheets ride as is, and yes, assume that the higher score is the correct one, in fairness to the entrant.
Dave

"This is grain, which any fool can eat, but for which the Lord intended a more divine means of consumption. Let us give praise to our Maker, and glory to His bounty, by learning about... BEER!" - Friar Tuck (Robin Hood - Prince of Thieves)

Offline S. cerevisiae

  • Senior Brewmaster
  • ******
  • Posts: 1697
  • deus ex machina
Re: Consensus while judging?
« Reply #19 on: March 31, 2014, 01:14:50 PM »
I'm seeing a lot of "loved it" and not "thought it fit the style really well"
I would like to believe that is not a problem BJCP judges have, but I'd settle for finding out it is rare.

perhaps you could give us a bit more on the style in question and what was so poor about it that made you consider a 13, and eventually 29?

The beer was basically a science experiment that was entered as a specialty beer.   With no claimed "like" beer and no category guidelines to use in judging the beer, I judged the beer based on the ingredients, process, and bugs claimed on the entry form.  The beer had a really harsh middle of the tongue flavor that made it darn near undrinkable for me, which is why I contemplated giving it a 13 (my first score was actually in the low twenties).  I brought my score up because I wanted to reach a consensus. 

As I brew mainly to study the behavior of brewing cultures (I have maintained a culture collection for most of the time that I have brewed), I am familiar with the flavors produced by the bugs claimed in the fermentation.  The harsh off-flavor was not a flavor that is produced by any of the bugs claimed under normal circumstances.   The flavor was definitely produced by wild non-brewing microflora pickup, which is a flaw that would prevent any beer from scoring in the forties.
 
« Last Edit: March 31, 2014, 01:16:36 PM by S. cerevisiae »
Mark

Just say "no" to yeast rinsing
https://www.homebrewersassociation.org/forum/index.php?topic=19850.msg252492#msg252492

Friends don't let friends use Star San as their primary sanitizer

"Acid-anionic sanitizers are broad spectrum against bacteria and viruses, but not very effective against yeasts and molds."

Offline AmandaK

  • Senior Brewmaster
  • ******
  • Posts: 1470
  • Redbird Brewhouse
Re: Consensus while judging?
« Reply #20 on: March 31, 2014, 01:15:56 PM »
Very phenolic, tasted exactly like friggin Carmex.  I believe I even used the term "Carmex" on the tasting sheet.  As such, I scored it relatively low, in the 20s.  It was an otherwise okay dubbel, with the dark fruit flavors, etc., but I just couldn't get past the Carmex.  Meanwhile the Master level proctors all loved it, scored it in the 40s, probably claiming that they loved the rich complex phenols.  = Carmex.  Yuck. 


Yes, taste is subjective, but being able to recognize your limitations and biases is usually the difference between Master exams and the lower scoring exams. (Alongside independent thought, completeness, and thorough descriptive ability, of course.)

For instance, I dislike Fuggles. They are dirt. I could be utterly convinced of dirt=flaw, much like you say "Carmex=flaw". However, I also recognize that Fuggles are totally acceptable in certain categories. So I cannot, and will not, be biased against them in a competition setting.
Amanda Burkemper

Offline Jimmy K

  • Official Poobah of No Life.
  • *
  • Posts: 3630
  • Delaware
Re: Consensus while judging?
« Reply #21 on: March 31, 2014, 01:34:03 PM »
The beer was basically a science experiment that was entered as a specialty beer.   With no claimed "like" beer and no category guidelines to use in judging the beer, I judged the beer based on the ingredients, process, and bugs claimed on the entry form.  The beer had a really harsh middle of the tongue flavor that made it darn near undrinkable for me, which is why I contemplated giving it a 13 (my first score was actually in the low twenties).  I brought my score up because I wanted to reach a consensus. 
Specialty is tough and if there is no reference beer or style listed by the contestant, you're not left with much to judge on besides 'Do I like this'. I imagine this situation comes up most often in the specialty categories. Also, it is up to contestants to adequately prepare judges with specialty ingredients and base styles. I've seen judges who hesitate to give a higher score because they can't adequately decide if the beer was what the brewer intended.
Delmarva United Homebrewers - President by inverse coup - former president ousted himself.
AHA Member since 2006
BJCP Certified: B0958

Offline dmtaylor

  • Senior Brewmaster
  • ******
  • Posts: 1484
  • Two Rivers, WI
Re: Consensus while judging?
« Reply #22 on: March 31, 2014, 03:50:44 PM »
taste is subjective, but being able to recognize your limitations and biases is usually the difference between Master exams and the lower scoring exams.

Hmm... good point.  You've given me something to ponder the next time I taste a delicious Belgian ale... that hopefully doesn't taste of Carmex... or Fuggles for that matter!
Dave

Offline udubdawg

  • Brewmaster
  • *****
  • Posts: 838
Re: Consensus while judging?
« Reply #23 on: March 31, 2014, 04:16:51 PM »

The beer was basically a science experiment that was entered as a specialty beer.   With no claimed "like" beer and no category guidelines to use in judging the beer, I judged the beer based on the ingredients, process, and bugs claimed on the entry form.  The beer had a really harsh middle of the tongue flavor that made it darn near undrinkable for me, which is why I contemplated giving it a 13 (my first score was actually in the low twenties).  I brought my score up because I wanted to reach a consensus. 

As I brew mainly to study the behavior of brewing cultures (I have maintained a culture collection for most of the time that I have brewed), I am familiar with the flavors produced by the bugs claimed in the fermentation.  The harsh off-flavor was not a flavor that is produced by any of the bugs claimed under normal circumstances.   The flavor was definitely produced by wild non-brewing microflora pickup, which is a flaw that would prevent any beer from scoring in the forties.

Ah, fricking category 23...
it's making more sense now.   Yeah you ended up in the most subjective cat.  Science experiments indeed.
I do believe competitions should have very experienced judges in that category, more than just about anywhere else, but perhaps they did.

I don't avoid 23 anymore, but I don't relish it.  I'm hoping the new guidelines will reduce some of the variety; American Wild and Specialty IPA and whatever. 
I am capable of telling why I am/am not impressed with a certain beer's blend of base and specialty information, and from what we've seen I expect you are too.  Curious why the other judge loved it so much, but I guess you really did encounter a fairly rare event early on.  I've never encountered even half of the score differential you indicate.  I actually think you did well with a 29, provided it also included feedback on why it didn't work for you.

cheers--
--Michael
 

Offline tschmidlin

  • I must live here
  • **********
  • Posts: 8197
  • Redmond, WA
Re: Consensus while judging?
« Reply #24 on: March 31, 2014, 04:19:44 PM »
The beer was basically a science experiment that was entered as a specialty beer.   With no claimed "like" beer and no category guidelines to use in judging the beer, I judged the beer based on the ingredients, process, and bugs claimed on the entry form.  The beer had a really harsh middle of the tongue flavor that made it darn near undrinkable for me, which is why I contemplated giving it a 13 (my first score was actually in the low twenties).  I brought my score up because I wanted to reach a consensus. 
Specialty is tough and if there is no reference beer or style listed by the contestant, you're not left with much to judge on besides 'Do I like this'. I imagine this situation comes up most often in the specialty categories. Also, it is up to contestants to adequately prepare judges with specialty ingredients and base styles. I've seen judges who hesitate to give a higher score because they can't adequately decide if the beer was what the brewer intended.
Specialty and experimental are especially tough if you give too much credence to what the brewer intended.  As a judge in this category, I think it is part of your job to decide if what the brewer intended is actually a good idea.  A perfectly executed tomato weizen is still a terrible beer.  If someone brews a cat vomit beer and it tastes like cat vomit, is that a 50 point beer because that is what the brewer intended?  Or is it a 13 because it tastes like cat vomit?
Tom Schmidlin

Offline klickitat jim

  • Official Poobah of No Life.
  • *
  • Posts: 5251
Re: Consensus while judging?
« Reply #25 on: March 31, 2014, 04:28:54 PM »
Are you a bottom up judge, or do you go piece by piece and tally afterwards?

The phenolic flaw, assuming that's what it was, seems to sit on a sliding scale in the scoring guide. If it's very major, like undrinkable (i.e., gag reflex), then 13, but if it's very minor, maybe all the way up to 44. I guess this is why there needs to be some flexibility in reaching consensus.

I hope to be the kind of judge that says "this is what I think, but I can be wrong, just tell me where I'm wrong so I can learn"

Offline garc_mall

  • Brewmaster
  • *****
  • Posts: 852
  • [1892.9, 294.9deg] AR Lynnwood, WA
Re: Consensus while judging?
« Reply #26 on: March 31, 2014, 04:35:56 PM »
The beer was basically a science experiment that was entered as a specialty beer.   With no claimed "like" beer and no category guidelines to use in judging the beer, I judged the beer based on the ingredients, process, and bugs claimed on the entry form.  The beer had a really harsh middle of the tongue flavor that made it darn near undrinkable for me, which is why I contemplated giving it a 13 (my first score was actually in the low twenties).  I brought my score up because I wanted to reach a consensus. 
Specialty is tough and if there is no reference beer or style listed by the contestant, you're not left with much to judge on besides 'Do I like this'. I imagine this situation comes up most often in the specialty categories. Also, it is up to contestants to adequately prepare judges with specialty ingredients and base styles. I've seen judges who hesitate to give a higher score because they can't adequately decide if the beer was what the brewer intended.
Specialty and experimental are especially tough if you give too much credence to what the brewer intended.  As a judge in this category, I think it is part of your job to decide if what the brewer intended is actually a good idea.  A perfectly executed tomato weizen is still a terrible beer.  If someone brews a cat vomit beer and it tastes like cat vomit, is that a 50 point beer because that is what the brewer intended?  Or is it a 13 because it tastes like cat vomit?

The tomatoweizen makes its return!
In a Keg: Flanders Red Ale, Rye Altbier, Cascade/Topaz Pale
Fermenting: Flanders Red, Saison

Offline tschmidlin

  • I must live here
  • **********
  • Posts: 8197
  • Redmond, WA
Re: Consensus while judging?
« Reply #27 on: March 31, 2014, 05:08:47 PM »
Are you a bottom up judge, or do you go piece by piece and tally afterwards?
I generally go piece by piece and see how it adds up, then decide if that score makes sense for that beer.  The exception is a terrible beer: then I try to figure out how to make it add up to 13.  I never just write a 13 at the bottom and leave it at that.

The tomatoweizen makes its return!
I hated that beer so very very much. :)
Tom Schmidlin

Offline S. cerevisiae

  • Senior Brewmaster
  • ******
  • Posts: 1697
  • deus ex machina
Re: Consensus while judging?
« Reply #28 on: March 31, 2014, 05:17:33 PM »
If someone brews a cat vomit beer and it tastes like cat vomit, is that a 50 point beer because that is what the brewer intended?  Or is it a 13 because it tastes like cat vomit?

I have encountered bottles of lambic that had above threshold levels of butyric acid, which gave the beer human vomit notes.  However, I have never encountered cat vomit beer.  ;D
Mark

Offline S. cerevisiae

  • Senior Brewmaster
  • ******
  • Posts: 1697
  • deus ex machina
Re: Consensus while judging?
« Reply #29 on: March 31, 2014, 05:23:35 PM »
I hated that beer so very very much. :)

Are you serious?  I thought that you were kidding.  Who would brew a tomatoweizen?  Better yet, who would enter it in a contest?
Mark