Comments on: Contest: Do You Know Blue?

By: Dan Meyer

Dan Meyer — Wed, 05 Mar 2014 16:38:05 +0000

Dave Major is the only one who has access to the code AFAIK and it’s been tough to get ahold of him. Until that happens, the site would have to be rebuilt.

By: Stephen Thomas

Stephen Thomas — Wed, 05 Mar 2014 03:03:41 +0000

Is there any way we can get the code used for Do You Know Blue so we can try and put up another site? Or would we need to build our own from scratch?

Mr. Steve

By: Dan Meyer

Dan Meyer — Tue, 04 Mar 2014 23:38:56 +0000

The site is down. At this point, we have no way of restoring the site, which is kind of a drag for all of us. Please accept our apologies.

By: ely klenzak

ely klenzak — Tue, 04 Mar 2014 23:10:33 +0000

Did you take the site down? If I try to pull it up, it's just a white screen.

By: boulis

boulis — Wed, 05 Jun 2013 06:46:21 +0000

@Dave I believe there might be inaccuracies in the way you calculate the rules, possibly due to rounding errors as I wrote above. I tried more tests with the database of 4304 colours. Using either 140 points or the full set of points (4304), I calculated separating surfaces of various degrees (0th to 6th) and then tested their accuracy (i.e. how well they separated the blues from the non-blues) on all 4304 points. The results I got should have matched the results from the standings page, but they do not. I only entered a few rules in the website, but all except one (140 points, 3rd degree) perform worse than my expected results. The greater the degree, the worse they perform. The one exception scored 93.17% instead of the 92.77% that I calculated.
In my rules I enter numbers with up to 14 decimal places of accuracy. Maybe your parsing truncates them to less accurate representations (?) I tried producing rules where the r, g, b values are divided by 100 first (so that the terms of a rule are larger and do not have to employ the e notation, e.g. 1.092344e-06). The results in the standings page are even worse (whereas of course my own accuracy calculations did not change). Anyway, just reporting; it is probably not important for the rules school kids are playing with.
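To make the truncation suspicion concrete, here is a minimal sketch. It is purely hypothetical: nothing in this thread confirms how the site parses numbers, and the `truncate` helper and the cut-off of 6 decimal places are assumptions chosen only for illustration. It shows how much a single cubic term can shift if a small coefficient loses its trailing digits.

```python
import math


def truncate(value, places):
    """Drop everything past `places` decimal digits (no rounding).

    Hypothetical stand-in for a lossy parser; not the site's actual code.
    """
    factor = 10 ** places
    return math.trunc(value * factor) / factor


coefficient = 1.092344e-06  # the example value quoted above
b = 255                     # largest 8-bit channel value

term_full = coefficient * b ** 3
term_truncated = truncate(coefficient, 6) * b ** 3

print(truncate(coefficient, 6))    # 1e-06
print(truncate(1e-09, 6))          # 0.0, the term vanishes entirely
print(term_full - term_truncated)  # roughly 1.53
```

Under these assumptions a coefficient of order 10^-9 would drop out completely, and even the larger one moves the term by about 1.5 at b = 255, which could be enough to flip colours that sit close to the separating surface.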

By: Thanassis

Thanassis — Fri, 31 May 2013 05:31:02 +0000

I found a mistake in my code that explains the poor performance of the 4th-degree plane. After the correction, I should have seen 93.75%. Still, the webpage gives me 91.36%. My calculations are based on all 4304 colours currently in the database. Maybe there are rounding errors, as many terms in my rule are very small (10^-9) and one is 10^-11.

By: Thanassis

Thanassis — Fri, 31 May 2013 04:56:47 +0000

Steve, I think yes, it is by design. At least in terms of blue, you get almost the same colours each time. It makes sense to give 50:50 samples for the few colours the page shows. When you pick colours randomly, the ratio is 20:80 (that seems to be the case with the xkcd study too).

By: Mr. Steve

Mr. Steve — Fri, 31 May 2013 04:47:23 +0000

So there is still a problem: when I enter b>0 and b<0 I should get 80%, but instead I get 20%. It seems to act more like an OR than an AND. Also, OR does not seem to work at all.
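A minimal sketch of why those two numbers point at AND being treated as OR. The ten colours below are made up for illustration (2 blue, 8 not blue, to mirror the roughly 20:80 split), and `accuracy`, `and_rule` and `or_rule` are hypothetical names, not anything from the site.

```python
# Illustrative data only: (r, g, b, is_blue) tuples, 2 blue and 8 not blue.
COLOURS = [
    (20, 30, 220, True),
    (40, 60, 250, True),
    (200, 30, 40, False),
    (220, 210, 30, False),
    (30, 180, 60, False),
    (240, 120, 20, False),
    (128, 128, 128, False),
    (250, 250, 250, False),
    (90, 40, 10, False),
    (200, 100, 150, False),
]


def accuracy(rule, colours):
    """Fraction of colours where rule(r, g, b) matches the is_blue label."""
    hits = sum(rule(r, g, b) == is_blue for r, g, b, is_blue in colours)
    return hits / len(colours)


def and_rule(r, g, b):
    # Can never be true, so every colour is classified as not blue.
    return b > 0 and b < 0


def or_rule(r, g, b):
    # True for every colour whose b is non-zero.
    return b > 0 or b < 0


print(accuracy(and_rule, COLOURS))  # 0.8
print(accuracy(or_rule, COLOURS))   # 0.2
```

With a real AND the rule is always false and is right on every non-blue colour; with OR it calls nearly everything blue and is right only on the blues, which matches the 20% reported here.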

Also, given that the population is ~80% NonBlue and ~20% Blue, why do I seem to get samples that are at best 50/50, and in the last case were 17 Blue and 13 NonBlue?
Is this by design? If so, what is the rationale?
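The mix matters because the same rule scores differently on a balanced sample than on the full population. A small worked sketch, where `always_false_accuracy` is a hypothetical helper and the 20/80 and 15/15 counts are assumed stand-ins for "about 20:80" and "50/50" (only the 17/13 split comes from my last sample):

```python
def always_false_accuracy(n_blue, n_not_blue):
    """Score of a rule that labels every colour 'not blue'."""
    return n_not_blue / (n_blue + n_not_blue)


print(always_false_accuracy(20, 80))            # 0.8
print(always_false_accuracy(15, 15))            # 0.5
print(round(always_false_accuracy(17, 13), 3))  # 0.433
```

So a do-nothing rule looks respectable on the population but poor on the sample, which may be one reason to show a more balanced set.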

I did something similar to Thanassis (scraping the data set from the Website) so I could try a number of different algorithms locally. May give the data set to some of my students as well and see if they can write some programs to try and optimize certain constants in their formulas.

Is this code going to be open sourced or made available somewhere?

By: Thanassis

Thanassis — Fri, 31 May 2013 04:21:36 +0000

When the site opened again after the first contest I decided to play a bit more with it. I saw that the original points/colours were kept. So I entered my best rule (a cubic plane) and it scored slightly better than before. Then I thought I’d try something major. You have a page (/blueis) that shows all the colours in the database in little boxes, separated into two regions, blue and not blue. I saved these regions as images and then wrote a program to parse them and get the r, g, b values for all the different colours. So now I have the entire database (4090 points at the time). Now I could run the algorithm on the entire database and see the best results I could get. This is not really machine learning, it is more like fitting :)
Curiously enough the cubic regression on the 4090 points works slightly worse than the cubic regression on the 140 points I initially had!
Then I noticed some things that made me scratch my head even more.
I tried a 4th-degree plane to separate the blue from the non-blue, as I was expecting it would do better than cubic with 4090 points. It did. By my calculations it was giving me an accuracy of 93.4%. But when I applied the rule on your website I got 80%. I assumed 80% is what you get if your rule always computes false. So maybe my rule was not parsed/calculated correctly (it is a long rule after all). I checked this assumption by entering an always-false rule (r=300) and noticed that although it was close, it was not identical.
Moreover, I started noticing that accuracy scores were changing in the standings *without* the number of colours changing. The number of responses is changing, but I assume that the colours in the database are already fixed, i.e., when a colour enters the database, it does not appear in the test and it does not change its blueness value.
Dave, if you have thoughts on the last points I’d love to hear them.
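For a sense of how long these rules get, here is a rough sketch that lists the terms of a general polynomial in r, g, b up to a given degree and evaluates it at one colour. The names `monomials` and `evaluate` are hypothetical, and this is not my actual fitting code or the site's parser; how the value is then compared against a threshold is left out.

```python
from itertools import combinations_with_replacement


def monomials(degree):
    """Exponent triples (i, j, k) with i + j + k <= degree, for r^i g^j b^k."""
    terms = []
    for total in range(degree + 1):
        for combo in combinations_with_replacement("rgb", total):
            terms.append((combo.count("r"), combo.count("g"), combo.count("b")))
    return terms


def evaluate(coefficients, terms, r, g, b):
    """Value of sum(c * r^i * g^j * b^k) at one colour."""
    return sum(c * r**i * g**j * b**k for c, (i, j, k) in zip(coefficients, terms))


for degree in (3, 4, 6):
    print(degree, len(monomials(degree)))  # 20, 35 and 84 terms
```

A full cubic already has 20 coefficients and a 4th-degree rule has 35, so there are plenty of places for a parsing or precision problem to creep in.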

By: Mr. Steve

Mr. Steve — Mon, 27 May 2013 04:46:45 +0000

OR is not working, and I have kids working on it this weekend sending me emails. Any way we can get a slight extension, at least for a short time after the OR is fixed? I sent an email to Dave with an example of the problem, but just try to enter any expression with an OR.