Author Topic: NHC SQLite Dataset  (Read 931 times)

Offline thcipriani

  • Assistant Brewer
  • ***
  • Posts: 182
    • View Profile
NHC SQLite Dataset
« on: May 02, 2015, 10:11:54 PM »
Seems like as good of a place as any to post this info.

I've collected the NHC gold medal winners dataset into a sqlite database to allow some rudimentary data science + easy, open, access.

Now answering weird one-off questions about NHC winners over the past decade is simple:

How many winning recipes in the past decade used SafAle yeast?

Code: [Select]
$ sqlite3 nhc.db 'select count(*) from recipes where ingredients like "%safale%"'
  2

What is the most common batch size?

Code: [Select]
sqlite3 nhc.db 'select vol from recipes' | cut -d'(' -f1 | sort | uniq -c | sort -rn | head -5
     77 5 Gallons
     44 6 Gallons
     35 10 Gallons
     24 5.5 Gallons
     14 12 Gallons

You can even use the data to create interesting plots that can answer questions.

Has the average gravity of a batch changed over the years?



All the code and info is available at: https://github.com/thcipriani/nhc-homebrew-data

Enjoy everyone! Cheers!
Tyler Cipriani
Longmont, CO
http://gangsta.party/

Offline brulosopher

  • Brewmaster
  • *****
  • Posts: 505
  • They who drink beer will think beer
    • View Profile
    • Brülosophy
Re: NHC SQLite Dataset
« Reply #1 on: May 02, 2015, 10:22:32 PM »
That was pretty interesting. Only 2 past winners have use SafAle yeast? That seems surprising to me.

Offline homoeccentricus

  • Brewmaster General
  • *******
  • Posts: 2008
  • A twerp from Antwerp
    • View Profile
Re: NHC SQLite Dataset
« Reply #2 on: May 02, 2015, 10:52:23 PM »
Cool, want to have a look. Downloaded the code, but where is the data with the recipes etc.?
Frank P.

Staggering on the shoulders of giant dwarfs.

Offline thcipriani

  • Assistant Brewer
  • ***
  • Posts: 182
    • View Profile
Re: NHC SQLite Dataset
« Reply #3 on: May 02, 2015, 11:38:40 PM »
Cool, want to have a look. Downloaded the code, but where is the data with the recipes etc.?

It's all inside the
Code: [Select]
nhc.db file. You'll need sqlite to read the data: https://www.sqlite.org/download.html

SQLite is a super lightweight database that is way way way easier than something like mysql or postgresql, also the code is Public Domain (https://www.sqlite.org/copyright.html) which is neat :)

Once you have sqlite installed you can do something like:

Code: [Select]
sqlite3 nhc.db
From then onward you should be able to use simple sql syntax to query the data.
Tyler Cipriani
Longmont, CO
http://gangsta.party/

Offline homoeccentricus

  • Brewmaster General
  • *******
  • Posts: 2008
  • A twerp from Antwerp
    • View Profile
Re: NHC SQLite Dataset
« Reply #4 on: May 03, 2015, 09:34:49 AM »
Ah yes, sorry, I was looking for a larger file.  ::)
I can reproduce the queries, so nice!

The only problem I see is that the recipe field is simply a text field so it would be difficult to get a drill-down of, for instance, all the different types of yeast that have been used.  But I understand it would be quite a bit of (I assume mostly manual) work to get all that information into separate fields.
Frank P.

Staggering on the shoulders of giant dwarfs.

Offline morticaixavier

  • I must live here
  • **********
  • Posts: 7782
  • Underhill VT
    • View Profile
    • The Best Artist in the WORLD!!!!!
Re: NHC SQLite Dataset
« Reply #5 on: May 04, 2015, 02:13:42 PM »
That was pretty interesting. Only 2 past winners have use SafAle yeast? That seems surprising to me.
More likely only 2 past winkers calmed or SafAle specifically. Look for S-05 and you'll get a lot more hits I suspect.
"Creativity is the residue of wasted time"
-A Einstein

"errors are [...] the portals of discovery"
- J Joyce

Offline homoeccentricus

  • Brewmaster General
  • *******
  • Posts: 2008
  • A twerp from Antwerp
    • View Profile
Re: NHC SQLite Dataset
« Reply #6 on: May 04, 2015, 03:13:57 PM »
Only one, for a Piny The Elder clone (I count 280 recipes in total).
Frank P.

Staggering on the shoulders of giant dwarfs.

Offline hopfenundmalz

  • Global Moderator
  • I must live here
  • *****
  • Posts: 9008
  • Milford, MI
    • View Profile
Re: NHC SQLite Dataset
« Reply #7 on: May 04, 2015, 03:34:50 PM »
That was pretty interesting. Only 2 past winners have use SafAle yeast? That seems surprising to me.
More likely only 2 past winkers calmed or SafAle specifically. Look for S-05 and you'll get a lot more hits I suspect.
Winkers or wankers? I kid.
Jeff Rankert
Ann Arbor Brewers Guild
AHA Governing Committee
BJCP National
Home-brewing, not just a hobby, it is a lifestyle!

Offline morticaixavier

  • I must live here
  • **********
  • Posts: 7782
  • Underhill VT
    • View Profile
    • The Best Artist in the WORLD!!!!!
Re: NHC SQLite Dataset
« Reply #8 on: May 04, 2015, 09:34:22 PM »
That was pretty interesting. Only 2 past winners have use SafAle yeast? That seems surprising to me.
More likely only 2 past winkers calmed or SafAle specifically. Look for S-05 and you'll get a lot more hits I suspect.
Winkers or wankers? I kid.
I have got to start paying attention to what autocorrect days I said. Hopefully everyone got the point.
"Creativity is the residue of wasted time"
-A Einstein

"errors are [...] the portals of discovery"
- J Joyce

Offline hopfenundmalz

  • Global Moderator
  • I must live here
  • *****
  • Posts: 9008
  • Milford, MI
    • View Profile
Re: NHC SQLite Dataset
« Reply #9 on: May 04, 2015, 10:07:53 PM »
That was pretty interesting. Only 2 past winners have use SafAle yeast? That seems surprising to me.
More likely only 2 past winkers calmed or SafAle specifically. Look for S-05 and you'll get a lot more hits I suspect.
Winkers or wankers? I kid.
I have got to start paying attention to what autocorrect days I said. Hopefully everyone got the point.

You have to have noticed some of mine by now, right?  ;)
Jeff Rankert
Ann Arbor Brewers Guild
AHA Governing Committee
BJCP National
Home-brewing, not just a hobby, it is a lifestyle!