Top new questions this week:
|
I’m doing some experiments with word embeddings to try to capture context-aware similarity, so that, for example, the word pair apple – hardware is very dissimilar in the context of a fruit store, but …
|
I already have two datasets: one for training and one for testing. Both datasets are imbalanced (with similar percentages), with around 90% of samples labeled 1. Will it be useful to balance the data if …
|
I’m dealing with text classification using a pre-trained BERT model on an imbalanced multiclass dataset. When we use the default classification threshold of 0.5, we obtain an F1 score of around 0.7. But we …
|
I have recently used a package to perform Aspect-Based Sentiment Analysis (ABSA) with a BERT model. Briefly, the model takes two inputs: the words that constitute the aspects, and a sentence on which we …
|
Greatest hits from previous weeks:
|
I have built my model. Now I want to draw the network architecture diagram for my research paper. An example is shown below:
|
What is the difference between Gradient Descent and Stochastic Gradient Descent? I am not very familiar with these; can you describe the difference with a short example?
|
I am trying to build a regression model, and I am looking for a way to check whether there’s any correlation between the features and the target variables. This is my …
|
The key difference between a GRU and an LSTM is that a GRU has two gates (reset and update gates) whereas an LSTM has three gates (namely input, output and forget gates). Why do we make use of GRU …
|
Using tensorflow-gpu 2.0.0rc0. I want to choose whether it uses the GPU or the CPU.
|
So, I have not been able to find any literature on this subject but it seems like something worth giving a thought: What are the best practices in model training and optimization if new observations …
|
I’m currently working with Python and scikit-learn for classification purposes, and after doing some reading around GridSearch, I thought this was a great way of optimising my estimator parameters to get …
|
Can you answer these questions?
|
In Wasserstein GAN, it’s explained that maximizing a certain formula over a set of K-Lipschitz functions approximates the 1-Wasserstein distance and they model the functions as NNs. That much I …
|
Is there an end-to-end trained transformer like Rebel for French data? Rebel can extract entities and relations from text, yet as far as I know, it works only with English texts. Is there any other …
|