Top new questions this week:
|
David Silver argues, in his Reinforcement Learning course, that policy-based reinforcement learning (RL) is more effective than value-based RL in high-dimensional action spaces. He points out that the …
|
ChatGPT occasionally generates responses to prompts that refer to itself as a “bot” or “language model.” For instance, when given a certain input (the first paragraph of this …
|
I am reading the book Natural Language Processing with Transformers. It has the following paragraph: Although head_dim does not have to be smaller than the number of embedding dimensions of the …
|
The standard method is to normalize the entire dataset (the training part) and then send it to the model to train on. However, I’ve noticed that in this manner the model doesn’t really work well when …
|
In the Actor-Critic example, provided by PyTorch, it seems that the update rule only occurs when the episode ends (like in a Monte-Carlo process). Specifically, in their …
|
In the original DQN paper, gradients during training are derived as follows: $\nabla_{\theta_i} L_i\left(\theta_i\right)=\mathbb{E}_{s, a \sim \rho(\cdot) ; s^{\prime} \sim \mathcal{E}}\left[\left(r+\…
|
One of the main criticisms against the use of ChatGPT on Stack Exchange is that it doesn’t attribute the main knowledge/sources used to generate a given output. How can a language model keep track of …
|
Greatest hits from previous weeks:
|
I have been messing around in TensorFlow Playground. One of the input data sets is a spiral. No matter what input parameters I choose, no matter how wide and deep the neural network I make, I cannot …
|
These two terms seem to be related, especially in their application in computer science and software engineering. Is one a subset of another? Is one a tool used to build a system for the other? What …
|
Local search algorithms are useful for solving pure optimization problems, in which the aim is to find the best state according to an objective function. My question is what is the objective function?
|
What is the difference between simple reflex and model-based agents? What is the role of the internal state in the case of model-based agents?
|
Is it possible to feed a neural network the output from a random number generator and expect it to learn the hashing (or generator) function, so that it can predict what will be the next generated pseudo-…
|
I think that the advantage of using Leaky ReLU instead of ReLU is that in this way we cannot have vanishing gradient. Parametric ReLU has the same advantage with the only difference that the slope of …
|
I have been spending a few days trying to wrap my head around how and why neural networks are used to play chess. Although I know very little about how the game of chess works, I can understand the …
|
Can you answer this question?
|
AFAIK an AI is first trained using a data set of input and output values. After the training process you can give the AI input and it will produce output. For example when you write a sentence to an …
|