# What should the output of a neural network that needs to classify in an unsupervised fashion XOR data be?

Artificial Intelligence Asked on November 20, 2021

XOR data, without labels:

[[0,0],[0,1],[1,0],[1,1]]


I’m using this network for auto-classifying XOR data:

H1  <-- Dense(units=2, activation=relu)    #any activation here
Z   <-- Dense(units=2, activation=softmax) #softmax for 2 classes of XOR result
Out <-- Dense(units=2, activation=sigmoid) #sigmoid to return 2 values in (0,1)


There’s a logical problem in the network: Z represents only 2 classes,
but 2 classes can’t be decoded back into the 4 samples of XOR data.

How can I fix the network above to auto-classify XOR data in an unsupervised manner?

> How can I fix the network above to auto-classify XOR data in an unsupervised manner?

This cannot be done, except accidentally.

Unsupervised learning cannot replace or emulate supervised learning.

As a thought experiment, consider why you would expect the network to discover XOR. If you simply round the outputs to binary, it could equally well find AND, OR, NAND, NOR or any other of the 16 possible functions that map two binary inputs to one binary output. All of these maps are equally valid functions, and without labels there is no reason for a discovered mapping to become any one of them by preference.
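To make the count of 16 concrete, here is a small sketch that enumerates every truth table over the four XOR inputs. The names `INPUTS`, `NAMED` and `all_truth_tables` are made up for this illustration; each table lists the outputs for the inputs in the order `(0,0), (0,1), (1,0), (1,1)`.

```python
from itertools import product

# The four unlabelled inputs, in a fixed order.
INPUTS = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)

# A few familiar functions, keyed by their outputs over INPUTS.
NAMED = {
    (0, 0, 0, 1): "AND",
    (0, 1, 1, 1): "OR",
    (1, 1, 1, 0): "NAND",
    (1, 0, 0, 0): "NOR",
    (0, 1, 1, 0): "XOR",
    (1, 0, 0, 1): "XNOR",
}


def all_truth_tables():
    """Return every possible assignment of one output bit per input."""
    return list(product((0, 1), repeat=len(INPUTS)))


if __name__ == "__main__":
    for table in all_truth_tables():
        print(table, NAMED.get(table, ""))
```

XOR is one row among the 16, and nothing in the unlabelled inputs marks it out from the other 15.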

Unsupervised learning approaches typically find patterns that optimise some measure across the dataset without using labelled data. Clustering is a classic example, and auto-encoding is sometimes considered unsupervised because there is no separate label (although the term self-supervised is also used, because there is still technically a label used in training; it just happens to equal the input).
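As an illustration of "optimising some measure", the sketch below scores every way of splitting the four points into two clusters. It assumes a k-means style objective (within-cluster sum of squared distances to the centroid), which is my choice for the example and not something named in the question. The helper names are hypothetical.

```python
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def within_cluster_ss(cluster):
    """Sum of squared distances from each point to the cluster centroid."""
    n = len(cluster)
    cx = sum(p[0] for p in cluster) / n
    cy = sum(p[1] for p in cluster) / n
    return sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in cluster)


def two_way_partitions(points):
    """Yield each split into two non-empty clusters exactly once."""
    first, rest = points[0], points[1:]
    for mask in range(2 ** len(rest)):
        a = [first] + [p for i, p in enumerate(rest) if (mask >> i) & 1]
        b = [p for i, p in enumerate(rest) if not (mask >> i) & 1]
        if b:  # skip the split that leaves the second cluster empty
            yield a, b


def cost(a, b):
    return within_cluster_ss(a) + within_cluster_ss(b)


# Lowest cost first; the last entry is the split this objective likes least.
ranked = sorted(two_way_partitions(POINTS), key=lambda ab: cost(*ab))

if __name__ == "__main__":
    for a, b in ranked:
        print(round(cost(a, b), 3), a, b)
```

Under this assumed objective, the grouping that corresponds to XOR, `{(0,0), (1,1)}` versus `{(0,1), (1,0)}`, gets the highest cost of all, because both clusters share the centroid `(0.5, 0.5)`. A distance-based clustering would pick an axis-aligned split instead.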

You cannot use auto-encoding approaches here anyway, because XOR needs to map $$\{0,1\} \times \{0,1\} \rightarrow \{0,1\}$$, whereas an auto-encoder is trained to reproduce its own two-valued input.

You could potentially use a loss function based on how close to a 0 or 1 any output is. That should cause the network to converge to one of the 16 possible binary functions, based on random initialisation. For example, you could use $$y(1-y)$$ as the loss, which is zero at $$y=0$$ and $$y=1$$ and largest at $$y=0.5$$.
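A minimal sketch of that idea, in plain Python so it runs without any framework. Everything about the architecture here is an assumption made for illustration: a 2-3-1 network with `tanh` hidden units and a single sigmoid output, trained by full-batch gradient descent on the mean of `y * (1 - y)`. The names `HIDDEN`, `forward`, `mean_loss` and `train` are hypothetical.

```python
import math
import random

X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
HIDDEN = 3  # assumed hidden width, not taken from the question


def forward(p, x):
    """Return hidden activations and the sigmoid output for one input."""
    h = [math.tanh(p["W1"][j][0] * x[0] + p["W1"][j][1] * x[1] + p["b1"][j])
         for j in range(HIDDEN)]
    z = sum(p["w2"][j] * h[j] for j in range(HIDDEN)) + p["b2"]
    return h, 1.0 / (1.0 + math.exp(-z))


def mean_loss(p):
    """Mean of y(1-y) over the four inputs; no labels are involved."""
    ys = [forward(p, x)[1] for x in X]
    return sum(y * (1.0 - y) for y in ys) / len(X)


def train(seed, steps=5000, lr=0.5):
    rng = random.Random(seed)
    p = {
        "W1": [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(HIDDEN)],
        "b1": [rng.gauss(0, 1) for _ in range(HIDDEN)],
        "w2": [rng.gauss(0, 1) for _ in range(HIDDEN)],
        "b2": rng.gauss(0, 1),
    }
    initial = mean_loss(p)
    for _ in range(steps):
        gW1 = [[0.0, 0.0] for _ in range(HIDDEN)]
        gb1 = [0.0] * HIDDEN
        gw2 = [0.0] * HIDDEN
        gb2 = 0.0
        for x in X:
            h, y = forward(p, x)
            # d/dz of y(1-y) with y = sigmoid(z) is (1 - 2y) * y * (1 - y)
            dz = (1.0 - 2.0 * y) * y * (1.0 - y) / len(X)
            gb2 += dz
            for j in range(HIDDEN):
                gw2[j] += dz * h[j]
                dh = dz * p["w2"][j] * (1.0 - h[j] ** 2)
                gb1[j] += dh
                gW1[j][0] += dh * x[0]
                gW1[j][1] += dh * x[1]
        for j in range(HIDDEN):
            p["W1"][j][0] -= lr * gW1[j][0]
            p["W1"][j][1] -= lr * gW1[j][1]
            p["b1"][j] -= lr * gb1[j]
            p["w2"][j] -= lr * gw2[j]
        p["b2"] -= lr * gb2
    # Round each output to read off which binary function was reached.
    table = tuple(round(forward(p, x)[1]) for x in X)
    return table, initial, mean_loss(p)


if __name__ == "__main__":
    for seed in range(5):
        table, initial, final = train(seed)
        print(seed, table, round(initial, 4), round(final, 4))
```

Each run prints the rounded outputs for `(0,0), (0,1), (1,0), (1,1)`. Which truth table appears depends only on the seed, so getting XOR this way would be an accident of initialisation, not something the loss asks for.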

Answered by Neil Slater on November 20, 2021
