One-Hot Encoded Matrix Input/Output for Autoencoder

Data Science Asked by aslconwnb on April 29, 2021

I am trying to write an autoencoder to reduce the dimensionality of my genomic data. Currently, my data is in the form of a $273278 \times 1$ vector. Each element of the vector indicates whether a position has no mutations (0), one mutation (1), or two mutations (2). As such, the input and output of my autoencoder look like this:

$$\begin{bmatrix}
0 \\
1 \\
0 \\
2 \\
\vdots
\end{bmatrix}$$

This uses label encoding to represent the categorical data. It works, but the autoencoder isn’t very accurate, since label encoding puts 0, 1, and 2 on a numeric scale even though I am treating them as unrelated categories.

I am considering using one-hot encoding to create a $273278 \times 3$ matrix where each column corresponds to 0, 1, or 2. As such, the above vector would turn into this:

$$\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1 \\
\vdots & \vdots & \vdots
\end{bmatrix}$$
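
For concreteness, the encoding above can be sketched in a few lines. This is only a minimal illustration, assuming NumPy is available; the array `genotypes` is a hypothetical four-element stand-in for the full 273278-element vector.

```python
import numpy as np

# Hypothetical toy genotype vector: 0, 1, or 2 mutations per position.
genotypes = np.array([0, 1, 0, 2])

# Indexing a 3x3 identity matrix by the labels picks one row per position,
# which yields an (n_positions, 3) one-hot matrix.
one_hot = np.eye(3, dtype=np.float32)[genotypes]

print(one_hot.shape)
print(one_hot)
```

Keras also ships `keras.utils.to_categorical`, which should produce the same kind of matrix from integer labels; the identity-matrix indexing is used here only to keep the sketch free of extra dependencies.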

However, I am unsure of how to feed this matrix into a (Keras) neural network. Is there a function to do this? Would flattening this matrix be mathematically appropriate? Is there another method to do this?
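
On the flattening question, one thing that can be checked directly is that flattening is a lossless reshape: the matrix and the original labels can both be recovered from the flat vector. The sketch below shows only that, assuming NumPy and a hypothetical four-position example. It does not settle whether a flattened input is the best choice for a given model.

```python
import numpy as np

# Hypothetical one-hot matrix for the labels [0, 1, 0, 2].
one_hot = np.array(
    [[1, 0, 0],
     [0, 1, 0],
     [1, 0, 0],
     [0, 0, 1]],
    dtype=np.float32,
)

# Row-major flattening: each position occupies 3 consecutive entries.
flat = one_hot.reshape(-1)

# Reshaping back to 3 columns restores the matrix exactly.
restored = flat.reshape(-1, 3)

# The index of the 1 in each row is the original label.
labels = restored.argmax(axis=1)

print(flat.shape, restored.shape, labels)
```

Keras provides `Flatten` and `Reshape` layers that perform the same kind of reshaping inside a model. Note that once the input is flattened, a dense layer treats all entries as independent features, so if the grouping into triples matters on the output side, one option (an assumption here, not something the question specifies) is to reshape the output back to three columns and apply a softmax per position.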
