TransWikia.com

What should the output be of a neural network that needs to classify XOR data in an unsupervised fashion?

Artificial Intelligence Asked on November 20, 2021

XOR data, without labels:

[[0,0],[0,1],[1,0],[1,1]]

I’m using this network for auto-classifying XOR data:

H1  <-- Dense(units=2, activation=relu)    #any activation here
Z   <-- Dense(units=2, activation=softmax) #softmax for 2 classes of XOR result
Out <-- Dense(units=2, activation=sigmoid) #sigmoid to return 2 values in (0,1)

There’s a logical problem in the network: Z represents 2 classes,
but those 2 classes can’t be decoded back to the 4 samples of XOR data.

How can I fix the network above to auto-classify XOR data in an unsupervised manner?

One Answer

How can I fix the network above to auto-classify XOR data in an unsupervised manner?

This cannot be done, except accidentally.

Unsupervised learning cannot replace or emulate supervised learning.

As a thought experiment, consider why you would expect the network to discover XOR. If you simply round the outputs to binary, it could equally find AND, OR, NAND, NOR, or any other of the 16 possible functions that map these four inputs to a binary output. All of these maps are equally valid functions, and nothing in the unlabelled data gives the network a reason to prefer any one of them.
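To make the count of 16 concrete, here is a small Python sketch that enumerates every way of assigning a 0 or 1 to the four inputs. The `NAMED` dictionary and the function name are only illustrative labels chosen for this example; they are not part of the question or of any library.

```python
from itertools import product

# The four unlabelled inputs from the question.
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# A few familiar truth tables, written as the outputs for INPUTS in order.
# (Illustrative labels only; most of the 16 tables are not listed here.)
NAMED = {
    (0, 0, 0, 1): "AND",
    (0, 1, 1, 1): "OR",
    (0, 1, 1, 0): "XOR",
    (1, 1, 1, 0): "NAND",
    (1, 0, 0, 0): "NOR",
    (1, 0, 0, 1): "XNOR",
}


def all_binary_functions():
    """Return every assignment of a 0/1 output to each of the four inputs."""
    return list(product((0, 1), repeat=len(INPUTS)))


tables = all_binary_functions()
for table in tables:
    print(table, NAMED.get(table, "(not named here)"))
```

XOR is just one row among these tables, and without labels nothing in the four input points singles it out.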

Unsupervised learning approaches typically find patterns that optimise some measure across the dataset without using labelled data. Clustering is a classic example. Auto-encoding is sometimes considered unsupervised because there is no separate label, although the term self-supervised is also used, because there is still technically a label used in training; it just happens to equal the input.

You cannot use auto-encoding approaches here anyway, because XOR needs to map $\{0,1\} \times \{0,1\} \rightarrow \{0,1\}$.

You could potentially use a loss function based on how close to a 0 or 1 any output is. That should cause the network to converge to one of the 16 possible binary functions, based on random initialisation. For example, you could use $y(1-y)$ as the loss.
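As a rough illustration of that idea, the sketch below trains a tiny network on the four unlabelled inputs using the mean of $y(1-y)$ as the only loss. Everything about the setup is an assumption made for the example: a single sigmoid output, three tanh hidden units, plain full-batch gradient descent, and the specific step count and learning rate. It uses only the Python standard library, and is not the Keras network from the question.

```python
import math
import random

# The four unlabelled inputs from the question.
INPUTS = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binarising_loss(y):
    """y * (1 - y): zero at y = 0 and y = 1, largest (0.25) at y = 0.5."""
    return y * (1.0 - y)


def forward(params, x):
    """Return (hidden activations, output) for one input pair."""
    w1, b1, w2, b2 = params
    h = [math.tanh(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(w1, b1)]
    y = sigmoid(sum(wj * hj for wj, hj in zip(w2, h)) + b2)
    return h, y


def train(seed, hidden=3, steps=3000, lr=1.0):
    """Full-batch gradient descent on the mean of y * (1 - y); no labels used."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = rng.uniform(-1, 1)
    n = len(INPUTS)
    for _ in range(steps):
        gw1 = [[0.0, 0.0] for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gw2 = [0.0] * hidden
        gb2 = 0.0
        for x in INPUTS:
            h, y = forward((w1, b1, w2, b2), x)
            # Chain rule: dL/dz = dL/dy * dy/dz = (1 - 2y) * y * (1 - y).
            dz = (1.0 - 2.0 * y) * y * (1.0 - y) / n
            gb2 += dz
            for j in range(hidden):
                gw2[j] += dz * h[j]
                dh = dz * w2[j] * (1.0 - h[j] ** 2)  # back through tanh
                gb1[j] += dh
                gw1[j][0] += dh * x[0]
                gw1[j][1] += dh * x[1]
        # Apply the accumulated gradients after the whole batch.
        b2 -= lr * gb2
        for j in range(hidden):
            w2[j] -= lr * gw2[j]
            b1[j] -= lr * gb1[j]
            w1[j][0] -= lr * gw1[j][0]
            w1[j][1] -= lr * gw1[j][1]
    return [forward((w1, b1, w2, b2), x)[1] for x in INPUTS]


# Different seeds may settle on different truth tables.
for seed in range(5):
    outputs = train(seed)
    table = tuple(round(y) for y in outputs)
    mean_loss = sum(binarising_loss(y) for y in outputs) / len(outputs)
    print(seed, table, round(mean_loss, 4))
```

Which truth table each seed ends up with depends entirely on the random initialisation, so any XOR-shaped result here would be a coincidence rather than something the loss encourages.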

Answered by Neil Slater on November 20, 2021
