Building a Keras text embedding model with cosine proximity

Data Science Asked by Viktor1903 on December 4, 2020

I am trying to build a word-embedding Keras model: the input is a text converted to its corresponding input ids and masks (the same inputs an ALBERT model takes), and the output is a 768-dimensional vector. My plan is to use an ALBERT layer followed by an LSTM layer and a dense layer that produces the vector. The target variables are also 768-dimensional vectors, and I want to use something like a cosine proximity loss to learn the weights of the model. My general architecture is shown below. However, after training the model for a few epochs, I get almost the same output vector for every test input. Is there something I need to change to make the model work?

        from tensorflow.keras.layers import Input, LSTM, Dense, RepeatVector, Flatten
        from tensorflow.keras.models import Model

        max_seq_length = 400
        in_id = Input(shape=(max_seq_length,), name="input_ids")
        in_mask = Input(shape=(max_seq_length,), name="input_masks")
        in_segment = Input(shape=(max_seq_length,), name="segment_ids")

        # AlbertLayer is my custom layer wrapping a pre-trained ALBERT model;
        # pooling="first" returns the pooled [CLS] representation.
        albert_inputs = [in_id, in_mask, in_segment]
        albert_output = AlbertLayer(n_fine_tune_layers=3, pooling="first")(albert_inputs)

        # Add a time dimension of length 1 so the pooled vector can be fed to the LSTM.
        x = RepeatVector(1)(albert_output)
        x = LSTM(units=512, return_sequences=False,
                 recurrent_dropout=0.3, dropout=0.3)(x)
        x = Flatten()(x)
        embedding_output = Dense(768)(x)

        model = Model(inputs=albert_inputs, outputs=embedding_output)
        # cosine_proximity: negative cosine similarity loss (defined separately)
        model.compile(loss=cosine_proximity, optimizer='adam')

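By cosine proximity I mean the negative cosine similarity between the target vector and the predicted vector. A rough sketch of the loss I have in mind is below; as far as I know this matches the old Keras cosine_proximity, but the exact function body is my own sketch rather than code I have verified against the Keras source:

        from tensorflow.keras import backend as K

        def cosine_proximity(y_true, y_pred):
            # Negative cosine similarity between target and predicted embeddings;
            # minimizing this pushes the prediction toward the target's direction.
            y_true = K.l2_normalize(y_true, axis=-1)
            y_pred = K.l2_normalize(y_pred, axis=-1)
            return -K.sum(y_true * y_pred, axis=-1)

        # In TF 2.x the built-in equivalent would be:
        # loss = tf.keras.losses.CosineSimilarity(axis=-1)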
For the target variables, I have a corresponding vector for each training instance. For one input vector, it's possible that there are multiple target vectors; in that case I use a separate training instance for each input-target pair.

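To make the setup concrete, this is roughly how I feed the data; the array names, shapes, batch size and epoch count below are just placeholders, not my actual data:

        import numpy as np

        n = 1000  # illustrative number of training instances

        # Token ids, attention masks and segment ids, one row per training instance.
        # An input text paired with several target vectors simply appears in several
        # rows, each row carrying a different 768-dimensional target.
        train_ids = np.zeros((n, max_seq_length), dtype="int32")
        train_masks = np.zeros((n, max_seq_length), dtype="int32")
        train_segments = np.zeros((n, max_seq_length), dtype="int32")
        train_targets = np.zeros((n, 768), dtype="float32")

        model.fit([train_ids, train_masks, train_segments],
                  train_targets,
                  batch_size=16,
                  epochs=5)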