To me the only real information gained here is that two different systems can make detectably different beer from the same recipe.
"Can," indeed, not necessarily "will."
This reminds me of the mash temperature experiment: two similar beers with different gravities. In the mash temperature experiment the final gravities were 0.09 apart and 9 out of 20 people could detect a difference (not significant). In the fly sparge experiment the OGs are 0.04 apart and 9 out of 16 people could detect a difference (significant).
Both results hover at the edge of significance - one negative, one positive. The danger is that people start interpreting these results as gospel: you can taste OG differences but not FG differences. I think some caution is needed before any firm conclusions can be drawn. Until they are replicated, the results really aren't clear-cut, as there's a large margin of error in both cases.
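For anyone who wants to check the two tallies themselves, here is a minimal sketch of the arithmetic. It assumes each taster was doing a triangle test, so a pure guess is right 1 time in 3, and that significance was judged with a one-sided exact binomial test at p < 0.05; neither assumption is stated above, and the function name is just for illustration.

```python
from math import comb


def p_value_at_least(correct: int, tasters: int, chance: float = 1 / 3) -> float:
    """One-sided exact binomial p-value: P(X >= correct) if every taster guesses.

    `chance` is the assumed probability of a lucky guess (1/3 for a triangle test).
    """
    return sum(
        comb(tasters, k) * chance**k * (1 - chance) ** (tasters - k)
        for k in range(correct, tasters + 1)
    )


# Tallies quoted in the comment above.
mash_temp_p = p_value_at_least(9, 20)   # mash temperature: 9 of 20 correct
fly_sparge_p = p_value_at_least(9, 16)  # fly sparge: 9 of 16 correct

print(f"mash temperature: 9 of 20, p = {mash_temp_p:.3f}")
print(f"fly sparge:       9 of 16, p = {fly_sparge_p:.3f}")
```

Under those assumptions the mash temperature tally comes out around p = 0.19 and the fly sparge tally lands just under 0.05, so one taster answering differently would flip the fly sparge verdict.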
I encourage people to interpret every xBmt with caution, regardless of p-value. I'm also a big fan of people trying stuff out for themselves, especially a variable like mash method that likely won't have a detrimental impact. Fun stuff!
Also note the unwanted variable of a longer mash time in the fly sparge batch, discussed in the comments. It could easily have been removed by timing the batch sparge to match.
Removing one variable by adding another is certainly something we'll be doing, but I intentionally wanted to establish a simple baseline with this xBmt.