The New Guy on Numer.ai – what is Numerai?

So about ten days ago I got interested in this project, which describes itself as the ‘hardest data science tournament on the planet’. This would be numer.ai – and I’ve decided to write a series about my experiences getting to know this platform, what it is about, what the people are like etc. So, let’s get down to business with my experience discovering numer.ai and why I think it is interesting. This is a very high level look at the Numerai platform and also a high level look at data science in general.

What is Numerai?

I found out about Numerai when I came across Lex Fridman’s interview with Richard Craib, founder of Numerai. I quickly came to understand that Numerai was a platform for aggregating as many opinions about stock market movements as possible. But how? And why? Let’s start with my understanding of the why, and why I think this is interesting.

Why Numerai?

Numerai is trying to do something very interesting. It is basically a Hedge Fund that is utilizing a large and open group of data scientists to make investment decisions for them in a tournament style of play. The larger the better because the Hedge Fund is making decisions based on ‘The Wisdom Of the Crowd’ – the more opinions given, the more opinions they can average out, which typically leads to a more ‘correct’ answer or in this case prediction. But Numerai want two things to be true about these predictions. Firstly, they should be informed opinions, based on data. This is why people with data science skills and interests are needed – another reason will be discussed in the ‘How Numerai’ part below. Secondly, to make sure opinions have a suitable motivation behind them, participants can ‘stake’ something of value behind their predictions in a tournament, meaning they have ‘skin in the game’ – again I will discuss this more in the ‘How Numerai’ part. If they are not confident they will not stake – this is important when trusting the wisdom of a crowd. While wisdom is hard to define, we can use the desire to not lose money as a proxy for it.

This is a neat experiment in getting many opinions (also called models) of predictions of stock performance under the condition that those giving their opinion actually care about the outcome, and then creating what Numerai call the ‘MetaModel’ – a compilation of everyone staking something of value in a tournament. How this tournament is scored is something I will discuss in another post (when I understand it better). So the power of this approach is Numerai harness over 4000 opinions, weekly, on how stocks will perform. Not only that, because these opinions come from individuals who don’t technically work together, those models/opinions have no reason to be correlated with each other. The models are about as diverse as can be expected. The same is most likely not true with Hedge Funds with rows of quants who all work closely with each other. Having said that, the Numerai community is open, friendly and collaborative so there is an element of group think going on as well, no doubt.
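To convince myself that averaging lots of independent opinions really helps, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the ‘true’ value, the noise level and the plain unweighted average are all made up, and this is not a description of how Numerai actually builds the MetaModel.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

TRUE_VALUE = 0.5   # the quantity every model is trying to predict (made up)
N_MODELS = 4000    # roughly the number of opinions mentioned above
NOISE = 0.2        # spread of each model's own error (made up)

# Each model's prediction is the truth plus its own independent error.
predictions = [TRUE_VALUE + random.gauss(0, NOISE) for _ in range(N_MODELS)]

# How wrong is a typical single model?
individual_errors = [abs(p - TRUE_VALUE) for p in predictions]
mean_individual_error = sum(individual_errors) / N_MODELS

# How wrong is the simple average of all the models?
crowd_prediction = sum(predictions) / N_MODELS
crowd_error = abs(crowd_prediction - TRUE_VALUE)

print(f"typical single-model error:  {mean_individual_error:.4f}")
print(f"error of the averaged crowd: {crowd_error:.4f}")
```

The independent errors largely cancel when averaged, so the crowd ends up much closer to the truth than a typical single model. If every model shared the same error, averaging would not help at all, which is why uncorrelated opinions matter so much here.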

How Numerai

To participate in a Numerai tournament, you take data given by the Numerai people and try to build a model that matches a bunch of features in the data to targets in the data. To machine learning experts, what features and targets are is obvious. To the novice, a little explanation is needed so we can understand how Numerai works.

A little Data Science aside – skip if you know what features and targets are.

Features are individual elements of data (each one a datum) that have a numerical value. An example might be ‘temperature in Honolulu today’. I just checked and today it was 26 C (79 F). That’s below average given the range over the entire year (it is winter in the Northern Hemisphere as I write this). So let’s say I am trying to predict how much ice cream will be sold down Waikiki. This feature (temperature) is likely to have some influence on a target we want to know, namely, volume of ice cream sold. Another feature might be hotel occupancy down Waikiki. Maybe that number is only 300 rooms out of 5000 (it’s also 2021 and we are still in the middle of the Covid-19 pandemic). Volume of ice cream (the target we are trying to predict) is perhaps 100 scoops – not a lot for a tourist destination.

One important thing that is done with data like this is that it is normalized. It can be hard to relate numerical values for temperature, hotel occupancy and scoops of ice cream. None of these share the same units for example. And it turns out a lot of machine learning algorithms hate having to deal with things in different units and certainly with features that have very different numerical scales (degrees Celsius and number of tourists could be several orders of magnitude different). So the numbers are normalized, usually between 0 and 1. For Honolulu temperature we might put 15 C as 0.0 and 35 C as 1.0 and scale everything else in between. For hotel occupancy, maybe we make this number the fraction of rooms in Waikiki occupied – also a number between 0.0 and 1.0 (100%). As for scoops of ice cream, maybe some domain knowledge tells us Waikiki cannot sell more than 10000 scoops a day, so scoops sold would be divided by 10000, giving a number between 0.0 and 1.0. So in this brief example we would have two features, temperature ((26-15)/(35-15) = 0.55) and occupancy (300/5000 = 0.06), and a target of 100/10000 = 0.01.

And then we do this over many days, each day collecting another row of feature values that give a target value. See table below:

Date         Feature 1       Feature 2           Target
             (Temperature)   (Hotel Occupancy)   (Scoops Sold)
Today        0.55            0.06                0.01
Yesterday    0.60            0.06                0.02
Day Before   ….              ….                  ….
….           ….              ….                  ….
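The scaling arithmetic from the ice cream example can be sketched in a few lines of Python. The function name `min_max_scale` is my own, and the lower and upper bounds are the assumed ones from the example (15 C to 35 C, 5000 rooms, 10000 scoops). Note that the lower bound is subtracted before dividing by the range.

```python
def min_max_scale(value, low, high):
    """Map value from the range [low, high] onto [0.0, 1.0]."""
    return (value - low) / (high - low)

# Bounds are the assumed ones from the Waikiki example.
temperature = min_max_scale(26, 15, 35)   # 15 C -> 0.0, 35 C -> 1.0
occupancy = min_max_scale(300, 0, 5000)   # fraction of rooms occupied
scoops = min_max_scale(100, 0, 10000)     # fraction of the assumed daily maximum

print(round(temperature, 2), round(occupancy, 2), round(scoops, 2))
# prints: 0.55 0.06 0.01
```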

This table could get quite large, with many many rows. Our job is to apply a machine learning algorithm to this table and build a model that can take the features and predict the target. So maybe we have a row with tomorrow’s temperature and hotel occupancy (these things can be found out or guessed) and we make a prediction on the target – scoops of ice cream to be sold.
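As a rough sketch of what ‘build a model’ can mean, here is about the simplest thing I can think of: a nearest-neighbour predictor that averages the targets of the most similar past days. Only the first two rows come from the table above; the other rows, the value for tomorrow, the function name `predict` and the choice of k = 2 are all made up for illustration. A real model would use far more rows and most likely a different algorithm.

```python
import math

# Rows of (temperature, occupancy) -> scoops, all already scaled to 0..1.
# Only the first two rows come from the table above; the rest are
# hypothetical values invented for this sketch.
rows = [
    ((0.55, 0.06), 0.01),  # today
    ((0.60, 0.06), 0.02),  # yesterday
    ((0.70, 0.40), 0.20),  # hypothetical
    ((0.80, 0.75), 0.55),  # hypothetical
    ((0.90, 0.90), 0.80),  # hypothetical
]

def predict(features, training_rows, k=2):
    """Average the targets of the k training rows closest to `features`."""
    by_distance = sorted(training_rows, key=lambda row: math.dist(row[0], features))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical forecast for tomorrow: slightly warmer, slightly busier.
tomorrow = (0.58, 0.07)
print(predict(tomorrow, rows))  # about 0.015, i.e. roughly 150 scoops
```

The two closest rows are today and yesterday, so the prediction is just the average of their targets. Multiplying by the assumed maximum of 10000 scoops turns the scaled number back into something you can order ice cream against.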

Still with me? OK, good. So Numerai do the same thing. But their features are things that Numerai think will affect stock performance and the target is a measure of whether a stock position is favorable or not favorable. I will discuss the contents of Numerai’s data set in another post – but for now, this is the how of the tournament. Numerai publish data and a community of data scientists build models to make predictions.

But before I end this post, I need to explain two more things about how Numerai works.

Firstly, the details of the features that Numerai use are largely unknown to the data scientists. That is, they are essentially given as Feature A, Feature B etc. This may seem odd, but it is a very smart thing to do. Without knowing what Numerai’s data is derived from, data scientists cannot second guess what the features mean. We are completely blind to their meaning, so our models do not become biased by what our human brains think is important. This is key in my opinion. The models that are built are only based on numerical values between zero and one along with the notion that they belong to a given feature.

Secondly, the Numerai data is broken up into ‘Eras’ (time periods) but the aim is not to predict time series data. Going back to our lovely Waikiki example, we don’t want to predict how many scoops will be sold based on yesterday, the day before and last week, as if one day directly influences the next. We are focused here on only using the features to predict the target in that row. I described this concept in a post on the forums over at numer.ai but rather than link to it, I will copy-pasta it below:

Say I want to predict what kind of vehicle is coming down my road next. It is far away so I can only determine some coarse grain properties. I can see its color, I can see how fast it is going, I can see if its exhaust is clear or sooty. Yellow, slow, sooty features suggest the next vehicle will be a school bus (in North America anyway). If those features were red, fast and clear exhaust that would suggest a Ferrari. Now there is no way this data will tell me what the vehicle after the Ferrari is, so no point in trying to model yellow->red and slow->fast and soot->clean.
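The vehicle example can be sketched as a small Python function. The function name and the hard-coded rules are mine and only cover the two vehicles in the example; a real model would learn the mapping from data. The point is that each guess depends only on the features seen in that one sighting, never on the sighting before it.

```python
def guess_vehicle(color, speed, exhaust):
    """Guess the vehicle from what can be seen of it right now."""
    if (color, speed, exhaust) == ("yellow", "slow", "sooty"):
        return "school bus"
    if (color, speed, exhaust) == ("red", "fast", "clear"):
        return "Ferrari"
    return "unknown"

sightings = [
    ("red", "fast", "clear"),
    ("yellow", "slow", "sooty"),
]

# Each sighting is judged on its own, so the order they arrive in
# makes no difference to the individual guesses.
in_order = [guess_vehicle(*s) for s in sightings]
reversed_order = [guess_vehicle(*s) for s in reversed(sightings)]

print(in_order)
print(sorted(in_order) == sorted(reversed_order))
```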

Oh and one last thing – Numerai does not reward participants who stake on the tournament with a kickback from the Hedge Fund, or with a real fiat currency at all (they used to). They have their own cryptocurrency, which can be exchanged for fiat currency. You do not participate in the Hedge Fund directly at all. This is a common misconception. Mmmmm, maybe my next article will consist of a list of common misconceptions about Numerai.

To end, for me this is an exciting platform. I doubt I will make money, but I have no doubt I will learn a lot. It combines the ‘wisdom of the crowd’ theory with ‘skin in the game’ motivation to build a model of the market to essentially guide a hedge fund. Participants get clean feature/target data so they do not have to spend a lot of time cleaning it up themselves. I think this is a smart idea. Payouts in cryptocurrency sour me a little – but honestly not too much. It will be exciting to see where this goes.

Robbo

Chicken of the woods

Mushroom season is well under way and this weekend in the Middlesex Fells I came across the largest patch of ‘Chicken of the Woods’ mushroom (Laetiporus sulphureus). See below:

This lovely mushroom is apparently edible but I didn’t take any to try. Regretting that now. It should be cooked and even then there are some adverse reactions reported. Mostly stomach upset. It is not dangerous or deadly. It should also be eaten when young. This specimen was very fresh. Here is a closeup:

It was still very thick and wet and had not been attacked by bugs or deer yet (both of which will happily eat it – and I have seen deer in the area, although infrequently).

All in all it was a lovely walk that morning. Fresh cool summer air – and I was out before the flying bugs got annoying. When leaving the area I spotted another patch on the back side of the tree:

These two patches to the right were younger and probably still developing. All in all a great day out there. I’ve developed an interest in getting there early in the mornings when there are fewer people and everything is just fresher before the heat of the day disturbs everything.

Naughty Radio Beacon / Pirate

HF (High Frequency or Shortwave) Radio never ceases to amaze me. Tonight while tuning around the 43 meter band (around 6780 kHz) I came across this interesting Morse code (CW) signal. It’s slow so I had no problem decoding it in my head:

For the non-Morse-code people out there, it is saying ‘FUCK TRUMP’ – over and over.

This signal can be heard right now from North Carolina to New Hampshire.

UPDATE: The station played some music and is now sending a slow-scan TV signal. I’m recording but I can’t decode it right now. Perhaps I will later and update.

OK – the rest of the broadcast is below:

Another Polypore Mushroom (Daedaleopsis confragosa)

I know I’ve been coming thick and fast with the mushroom updates but this weekend I solved a mystery that has been bothering me for a while. There seem to be two major, large, polypore mushrooms that I see attached to tree trunks in my area – the first was identified as the birch polypore (Fomitopsis betulina) in an earlier blog post. But I would see another type, more flat and with rings of colour and generally larger and encircling the trunk… see below.

This is about 20 cm across (8 or so inches). The underside also doesn’t have simple pores; it is more maze-like. See below:

My books and an exhaustive web search failed to really identify it. Eventually I think I found it. It is the thin walled maze polypore or blushing bracket (Daedaleopsis confragosa) and seems very common in this area.

Here is a large community of them:

It doesn’t seem to have any very interesting properties. Apparently the flesh is tough and not very edible, but not poisonous, so it’s considered ‘inedible’. It’s just surprising that what is a very common fungus in the Middlesex Fells didn’t rate a mention in many books or forums.

In Search Of Turkey Tail Mushrooms – Failure – but found something else!

Wild mushroom literature is obsessed with a few types of mushroom species – usually for their flavour or medicinal value. When it comes to medicinal, none seems to be more prized than the Turkey Tail Mushroom (Trametes versicolor). This polypore mushroom (think pores, not gills, on the underside) is saprotrophic (grows on dead wood) and extracts have been shown to be anti-inflammatory and anti-cancer agents.

I haven’t looked at those studies myself and I’m pretty skeptical about claims like this. The excitement for me is not in making a tea and treating some inflammatory issue I might have. The mushrooms themselves apparently aren’t very tasty either. No, I just like the thrill of the chase and so this fairly warm weekend I decided to go in-search-of the Turkey Tail mushroom. Long story short, I failed. It’s the dead of winter and maybe not a great time. Instead I came across some lookalikes and went about trying to identify them.

Exhibit A is below:

To start, this mushroom does not really have the coloring expected of turkey tail – it’s a little too dull. A sample of it dried out pretty quickly and turned pretty much grey. The cap was, however, velvety, which apparently is correct. The underside (not shown, sorry 😒) was definitely not gilled, but the texture was more like fibers or teeth than pores. This is just not adding up for Turkey Tail but it does seem to match Trichaptum biforme. This fungus has similar ecology to the Turkey Tail, but its underside is apparently more tooth-like. T. biforme is also noted to have a purple edge to it but alas this does fade so it’s not surprising it was absent. Another look about elsewhere uncovered this patch below, looking very old and wet and possibly discolored by green algae.

Exhibit B: Its underside structure was similarly tooth-like. This specimen was also just looking very old and abused by winter.

The purple thing did trigger a memory so I looked through some old photographs I had and lo and behold in September or October last year I uncovered a mass of these little guys, complete with subtle purple edging. Look below.


So it seems that Trichaptum biforme is common in the Middlesex Fells and is acting as a major decomposer of dead wood in the area. And it survives into the winter quite well. I’ll have to keep my eyes peeled for Turkey Tail once its season comes in though.

Mushrooms in the Middlesex Fells – Identifications!

We recently had some warm weather after extremely frigid conditions between Christmas 2017 and the first week and a half of 2018. Well, it warmed up to about 13 °C / 55 °F on Saturday so I hit up the Middlesex Fells with the dog.

Over Christmas I’ve been reading more and more about fungus and mushrooms and I really wanted to go to a few locations where I had seen mushrooms during the summer and autumn season last year. So the first place I went was a silver birch tree very close to the east side of Bellevue Pond on South Border Rd, Medford, MA, USA. I had seen these really curious white balls on this live tree in early September and at the time I really didn’t know what they were. They certainly looked fungal/mushroomy but I was expecting to see a more typical mushroom shape and was surprised that, if this was a mushroom, it could push through the bark. See the images below:

Fast forward to now and after significant snow melt, the same tree looks like this:

Cool, they did turn into a more typical ‘mushroom’ shape. After some web searches and reference mushroom books I identified this mushroom as the birch polypore (Fomitopsis betulina). One of the lobes/caps had fallen off and was on the ground nearby so I picked it up and flipped it over.

You can see the underside doesn’t have gills but pores. The margin (edge) of the cap rolls over and under the underside, exactly matching the birch polypore description. (I took this sample home!) Below is a closeup on the cap that formed at the bottom of the tree – it was still attached.

It turns out this mushroom is edible but doesn’t taste very good. I didn’t eat this one. It contains a number of compounds that are supposed to be good at killing some intestinal worms and has anti-bacterial and anti-inflammatory properties. Well, research continues.

Another fungus I encountered was an odd looking jelly-like mushroom. See below:

I wasn’t even sure this thing was a fungus. Some research showed that, yep, it is! And a well known fungus that comes out this time of year (deep winter). It’s the amber jelly roll or willow brain (Exidia recisa). Apparently it is edible but does not have an interesting taste, nor is it foul or bitter.

I have many more photos from last Summer and I may get into some more identifications of those as well. It is amazing how many different kinds of mushrooms are out there, even in the dead of winter.